C
conan
This regexp
'<widget class=".*" id=".*">'
works well with 'grep' for matching lines of the kind
<widget class="GtkWindow" id="window1">
in an XML .glade file.
However, that's not true for the re module in Python, since it
treats the regexp as if it were specified this way:
'^<widget class=".*" id=".*">'
For some reason a regexp in Python, at least when applied with match(),
decides to match from the start of the line, whether or not you used
the caret symbol '^'.
I had a hard time figuring out why this regexp wasn't working:
regexp = re.compile(r'<widget class=".*" id="(.*)">')
The solution was to account for the leading and trailing whitespace:
regexp = re.compile(r'\s*<widget class=".*" id="(.*)">\s*')
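A minimal sketch of why the second pattern helps, assuming the widget lines in the file are indented (the sample line below is made up for illustration, not taken from a real .glade file):

```python
import re

# Hypothetical sample line, indented the way nested XML usually is.
line = '  <widget class="GtkWindow" id="window1">\n'

bad_regexp = re.compile(r'<widget class=".*" id="(.*)">')
good_regexp = re.compile(r'\s*<widget class=".*" id="(.*)">\s*')

# match() tries the pattern at position 0 only, where the line has a space.
bad_result = bad_regexp.match(line)

# The leading \s* consumes the indentation, so the rest can match.
good_result = good_regexp.match(line)
widget_id = good_result.group(1)

print(bad_result, widget_id)
```

So the extra \s* is only there to swallow the indentation before '<widget'.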
To reproduce the behaviour, just take a .glade file and this Python script:
<code>
import re

glade_file_name = 'some.glade'
bad_regexp = re.compile(r'<widget class=".*" id="(.*)">')
good_regexp = re.compile(r'\s*<widget class=".*" id="(.*)">\s*')

with open(glade_file_name) as glade_file:
    for line in glade_file:
        # match() only succeeds if the pattern matches at the start of the line
        if bad_regexp.match(line):
            print('bad:', line.strip())
        if good_regexp.match(line):
            print('good:', line.strip())
</code>
The thing is, I would have expected to have to put the caret in explicitly
to tell the regexp to match at the start of the line, something like:
r'^<widget class=".*" id="(.*)">'
However, the Python regexp is taking care of that for me. This is not the
behaviour I would expect from what I know about regexps, but maybe I'm
missing something.
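For comparison, the re module also has search(), which scans the whole string for the first place the pattern fits, while match() only tries the very beginning of the string. A small sketch, again using a made-up sample line rather than a real .glade file, suggests that search() with the original pattern behaves the way grep does here:

```python
import re

# Hypothetical sample line with leading indentation.
line = '  <widget class="GtkWindow" id="window1">\n'

regexp = re.compile(r'<widget class=".*" id="(.*)">')

# match() is anchored at the start of the string, so the indentation defeats it.
matched = regexp.match(line)

# search() looks for the pattern anywhere in the string, like grep on a line.
found = regexp.search(line)
found_id = found.group(1) if found else None

print(matched, found_id)
```

With search(), the caret has to be written explicitly to anchor the pattern at the start of the line.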