Ron Garret
I'm trying to split a CamelCase string into its constituent components.
This kind of works:
>>> re.split('[a-z][A-Z]', 'fooBarBaz')
['fo', 'a', 'az']
but it consumes the boundary characters. To fix this I tried using
lookahead and lookbehind patterns instead, but it doesn't work:
>>> re.split('((?<=[a-z])(?=[A-Z]))', 'fooBarBaz')
['fooBarBaz']
However, it does seem to work with findall:
>>> re.findall('(?<=[a-z])(?=[A-Z])', 'fooBarBaz')
['', '']
So the regular expression seems to be doing the Right Thing. Is this a
bug in re.split, or am I missing something?
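For illustration, the intended result can be produced by hand from the positions of those zero-width matches. This is only a minimal sketch: the helper name split_camel is hypothetical (it is not part of the re module), and it assumes the boundaries of interest are exactly the lowercase-to-uppercase transitions matched by the lookaround pattern above.

```python
import re

# Hypothetical helper, not part of the re module: cut the string at every
# zero-width match between a lowercase letter and an uppercase letter.
def split_camel(s):
    boundary = re.compile(r'(?<=[a-z])(?=[A-Z])')
    parts = []
    start = 0
    for m in boundary.finditer(s):
        # m.start() == m.end() here, because the match is empty.
        parts.append(s[start:m.start()])
        start = m.start()
    parts.append(s[start:])  # whatever follows the last boundary
    return parts

print(split_camel('fooBarBaz'))  # ['foo', 'Bar', 'Baz']
```

Because the slicing is driven by finditer rather than re.split, the sketch does not depend on how re.split treats empty matches.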
(BTW, I tried looking at the source code for the re module, but I could
not find the relevant code. re.split calls sre_compile.compile().split,
but the string 'split' does not appear in sre_compile.py. So where does
this method come from?)
I'm using Python 2.5.
Thanks,
rg