Inconsistent split Behavior in Python
Here’s a futile but cathartic bug report I filed against Python recently.
In Python, string.split and re.split both take an optional argument that limits the number of splits that are done. This is unlike Perl’s split builtin, which limits the number of pieces. But it makes sense I guess, and consistency between the two languages is not something I’d necessarily expect.
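To make "limits the number of splits" concrete, here is a minimal sketch. The sample string and variable names are made up for illustration, and it assumes Python 3, where string.split is spelled as the str.split method.

```python
import re

text = "a,b,c,d"  # illustrative sample input

# maxsplit=2 means "split at most twice", so you get at most three pieces.
by_str = text.split(",", 2)
by_re = re.split(",", text, maxsplit=2)

print(by_str)  # ['a', 'b', 'c,d']
print(by_re)   # ['a', 'b', 'c,d']
```

A Perl-style limit of 2 would instead mean "at most two pieces", so the same number gives a different result in the two languages.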
However, consistency within a language…a reasonable expectation, no?
The inconsistency lies in how string.split and re.split handle the edge cases of “do an unlimited number of splits” and “don’t do any splits.” The two agree that “unlimited splits” is the default. They don’t agree on which explicit maxsplit value means which.
function | maxsplit=0 | maxsplit=-1 |
string.split | no splits | unlimited splits |
re.split | unlimited splits | no splits |
I think string.split is doing the sensible thing here.
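If you want to check the table yourself, something like the following should do it. This is a sketch that assumes Python 3 and CPython's re module; the sample string and variable names are illustrative, and the behavior for negative values in re.split is what the table reports rather than a documented guarantee.

```python
import re

text = "a,b,c"  # illustrative sample input

# str.split: 0 means "no splits", -1 (the default) means "unlimited".
str_zero = text.split(",", 0)
str_neg = text.split(",", -1)

# re.split: 0 (the default) means "unlimited".
re_zero = re.split(",", text, maxsplit=0)
# Per the table above, a negative value ends up meaning "no splits".
re_neg = re.split(",", text, maxsplit=-1)

print("str.split maxsplit=0 :", str_zero)
print("str.split maxsplit=-1:", str_neg)
print("re.split  maxsplit=0 :", re_zero)
print("re.split  maxsplit=-1:", re_neg)
```

The practical upshot: only pass a positive maxsplit to both, and leave the argument off entirely when you want unlimited splits, since that default is the one thing the two functions agree on.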
Of course, the “bug” has zero chance of being fixed at this point. I pretty much just filed it to create a search result for others similarly bitten, annoyed, or both.