Two New Python C Extensions

Tuesday, March 08, 2011.

Today I’m releasing two new Python C extensions. They’ve been useful in fast text record processing, but could be used for plenty of other things. YMMV.

percentcoding – a Python C extension for percent encoding and decoding. URL encoding is a specific instance of percent encoding, with a set of reserved characters defined by RFC 3986. The percentcoding library can be used as a 10x faster drop-in replacement for the urllib.quote and urllib.unquote functions included with Python. I use it for escaping whitespace and non-printable characters in Unix text record formats.
flattery – a Python C extension for converting hierarchical data to and from flat key/value pairs. This comes up in web form processing when you’ve got many different input elements in a single form (perhaps even tabular data that can be edited) and you want to map them onto a nested data structure. I use it together with percentcoding to process hierarchical record data stored in Unix text formats. That makes those records interchangeable with records in JSON or protocol buffer format, except that they’re friendly to sort(1), cut(1), join(1), etc.
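
To make the combination concrete, here is a rough pure-Python sketch of the idea: flatten a nested record into dotted keys, percent-encode each key and value, and join the pairs into one tab-separated line. This is not the API of either extension. The function names (flatten, unflatten, to_line, from_line), the "." path separator, and the key=value field layout are all assumptions made up for illustration. It uses Python 3's urllib.parse.quote and unquote, the standard library counterparts of the urllib.quote and urllib.unquote mentioned above.

```python
from urllib.parse import quote, unquote


def flatten(nested, prefix=""):
    """Turn nested dicts into a flat dict with dotted keys.

    Assumption for this sketch: keys never contain "." and all leaves
    are strings. Empty dicts produce no keys, so they do not round-trip.
    """
    flat = {}
    for key, value in nested.items():
        path = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def unflatten(flat):
    """Rebuild nested dicts from a flat dict with dotted keys."""
    nested = {}
    for path, value in flat.items():
        parts = path.split(".")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


def to_line(nested):
    """Encode a record as one line of tab-separated key=value fields.

    safe="" makes quote() escape everything except letters, digits and
    "_.-~", so tabs, newlines and "=" inside the data cannot collide
    with the field and key/value separators.
    """
    pairs = sorted(flatten(nested).items())
    return "\t".join(
        quote(key, safe="") + "=" + quote(value, safe="")
        for key, value in pairs
    )


def from_line(line):
    """Decode a line produced by to_line() back into nested dicts."""
    flat = {}
    for field in line.split("\t"):
        key, _, value = field.partition("=")
        flat[unquote(key)] = unquote(value)
    return unflatten(flat)


record = {
    "id": "42",
    "user": {"name": "Ada Lovelace", "note": "tabs\tand\nnewlines"},
}
line = to_line(record)
print(line)
```

Because every tab and newline inside the data is escaped, each record stays on a single line with predictable fields, which is the property that lets tools like sort(1) and cut(1) work on it.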

I’ve had pure Python implementations of these kicking around for a while. They were slow, but it didn’t matter until recently. See also a day in the life of a back-end developer:

1. Find bottleneck.
2. Remove bottleneck.
3. Repeat.
4. Every once in a while, make a bold move to throw something out that can no longer work that way and replace it with something more scalable. But while this is important, it comes up less often than you might think.

Posted by Alan on Tuesday, March 08, 2011.
