Hi all,
I would like to perform regular expression replace (e.g. removing
everything from within tags in a XML file) with multiple-line pattern.
How can I do this?
where = open("filename").read()
multilinePattern = "^<tag> .... <\/tag>$"
re.search(multilinePattern, where, re.MULTILINE)
If it helps, I have the following function:
8<-----------------------------------------------------------
def update_xml(infile, outfile, mapping, deep=False):
from xml.etree import cElementTree as ET
from utils.elementfilter import ElementFilter
doc = ET.parse(infile)
efilter = ElementFilter(doc.getroot())
changes = 0
for key, val in mapping.iteritems():
pattern, repl = val
efilter.filter = key
changes += efilter.sub(pattern, repl, deep=deep)
doc.write(outfile, encoding='UTF-8')
return changes
mapping = {
'/portal/content-node[@type=="page"]/@action': ('.*', 'ZZZZ'),
'/portal/web-app/portlet-app/portlet/localedata/title':
('Portal', 'Gateway'),
}
changes = update_xml('c:\\working\\tmp\\test.xml', 'c:\\working\\tmp\
\test2.xml', mapping, True)
print 'There were %s changes' % changes
8<-----------------------------------------------------------
where utils.elementfilter is this module:
http://gflanagan.net/site/python/elementfilter/elementfilter.py
It doesn't support `re` flags, but you could change the sub method of
elementfilter.ElementFilter to do so, eg.(UNTESTED!):
def sub(self, pattern, repl, count=0, deep=False, flags=None):
changes = 0
if flags:
pattern = re.compile(pattern, flags)
for elem in self.filtered:
...
[rest of method unchanged]
...
Gerard