Fred
Hi,
While parsing a bunch of HTML pages using the latest
ActivePython, I ran into something odd with the re module. I
extracted the part that generates the errors (I'm just trying to
substitute one item for another in a string):
--------------------------------
import re
#NOK : doesn't like a single, ending backslash
#stuff = "\colortbl\red0\green0\"
# => SyntaxError: EOL while scanning single-quoted string
#NOK : doesn't like gn0?
stuff="\colortbl\red0\gn0"
# => traceback (most recent call last):
# File "C:\test.py", line 10, in ?
# template = re.sub('BLA', stuff, template)
# File "G:\Python23\lib\sre.py", line 143, in sub
# return _compile(pattern, 0).sub(repl, string, count)
# File "G:\Python23\lib\sre.py", line 257, in _subx
# template = _compile_repl(template, pattern)
# File "G:\Python23\lib\sre.py", line 244, in _compile_repl
# raise error, v # invalid expression
#sre_constants.error: bad group name
#OK....
stuff="\colortbl\red0\n0"
template = "BLA"
template = re.sub('BLA', stuff, template)
--------------------------------
=> It appears that the re module isn't very friendly with backslashes,
at least on the Windows platform. Does anyone know why, and what I
could do about it? I can't rewrite the source HTML documents that
contain the backslashes.
Thank you
Fred.
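A possible workaround, offered as a sketch rather than a definitive answer. Two layers of backslash handling seem to be involved, and neither is specific to Windows: the Python string literal itself (where "\r" and "\n" become control characters), and the replacement argument of re.sub, which is parsed as a template (where "\g" must be followed by a group name in angle brackets). The example below assumes the replacement text should be inserted literally; the names `result` and `escaped` are made up for illustration.

```python
import re

# Raw string literal: Python keeps every backslash as written.
# (Even a raw string cannot end in a single backslash.)
stuff = r"\colortbl\red0\gn0"

template = "BLA"

# Passing a function as the replacement makes re.sub insert its
# return value verbatim, so "\g" is not parsed as a group reference.
result = re.sub("BLA", lambda match: stuff, template)

# Alternative: double the backslashes so the template parser
# turns each pair back into a single backslash.
escaped = re.sub("BLA", stuff.replace("\\", "\\\\"), template)
```

When the text comes from a file rather than from a literal in the script, only the second layer applies, so the function replacement (or the doubled backslashes) should be enough on its own.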