multiline regular expression (replace)

Zdenek Maxa

Hi all,

I would like to perform a regular-expression replace (e.g. removing
everything between the tags in an XML file) with a pattern that spans
multiple lines. How can I do this?

where = open("filename").read()
multilinePattern = "^<tag> .... <\/tag>$"
re.search(multilinePattern, where, re.MULTILINE)

Thanks greatly,
Zdenek
 
half.italian

Why not use an XML package for working with XML files? I'm sure
it will handle your multi-line tags.

http://effbot.org/zone/element-index.htm
http://codespeak.net/lxml/
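
As a rough sketch of that suggestion (not from the thread), the standard library's xml.etree.ElementTree can empty an element without any regular expression. The sample document, the element names and the helper name empty_elements are made up for illustration, and the code assumes Python 3:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
SAMPLE = """<root>
  <tag>
    first line
    <child>nested</child>
    second line
  </tag>
  <other>keep me</other>
</root>"""


def empty_elements(xml_text, tag_name):
    """Return xml_text with the content of every <tag_name> element removed."""
    root = ET.fromstring(xml_text)
    # Take a snapshot of the matches first, so the tree is not modified
    # while it is still being iterated.
    for elem in list(root.iter(tag_name)):
        tail = elem.tail   # clear() also resets the tail, so save it
        elem.clear()       # drops text, children and attributes
        elem.tail = tail
    return ET.tostring(root, encoding="unicode")


result = empty_elements(SAMPLE, "tag")
print(result)
```

Note that clear() also removes the element's attributes; if those must survive, they would have to be saved and restored in the same way as the tail.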

~Sean
 
Steve Holden

Zdenek said:
Hi,

that was merely an example of what I would like to achieve. However, in
general, is there a way to handle multi-line regular expressions in
Python, preferably using only modules from the standard distribution,
such as re?

Thanks,
Zdenek

So you mean you don't know how to *create* multiline patterns?

One way is to use """ ... """ or ''' ... ''' quoting, which allows you
to include newlines as part of your strings. Another is to use \n in
your strings to represent newlines.
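
A minimal sketch of both ways of writing such a pattern (the sample text and variable names are invented for illustration):

```python
import re

text = "<tag>\nline one\nline two\n</tag>\n<other/>\n"

# 1. Newlines written as \n inside the pattern string.
pattern_escaped = re.compile(r"<tag>\n(.*\n)*?</tag>")

# 2. A triple-quoted pattern; its literal line breaks are part of the pattern.
pattern_triple = re.compile("""<tag>
line one
line two
</tag>""")

match_escaped = pattern_escaped.search(text)
match_triple = pattern_triple.search(text)

# Both patterns match the same four-line span of text.
print(match_escaped.group(0) == match_triple.group(0))
```

One caveat: if the pattern is compiled with re.VERBOSE, whitespace in the pattern (including literal line breaks) is ignored, so the triple-quoted form would no longer match newlines in the text.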

regards
Steve
 
Gerard Flanagan

If it helps, I have the following function:

8<-----------------------------------------------------------
def update_xml(infile, outfile, mapping, deep=False):
    from xml.etree import cElementTree as ET
    from utils.elementfilter import ElementFilter
    doc = ET.parse(infile)
    efilter = ElementFilter(doc.getroot())
    changes = 0
    for key, val in mapping.iteritems():
        pattern, repl = val
        efilter.filter = key
        changes += efilter.sub(pattern, repl, deep=deep)
    doc.write(outfile, encoding='UTF-8')
    return changes

mapping = {
    '/portal/content-node[@type=="page"]/@action': ('.*', 'ZZZZ'),
    '/portal/web-app/portlet-app/portlet/localedata/title':
        ('Portal', 'Gateway'),
}

changes = update_xml('c:\\working\\tmp\\test.xml',
                     'c:\\working\\tmp\\test2.xml', mapping, True)

print 'There were %s changes' % changes
8<-----------------------------------------------------------

where utils.elementfilter is this module:

http://gflanagan.net/site/python/elementfilter/elementfilter.py

It doesn't support `re` flags, but you could change the sub method of
elementfilter.ElementFilter to do so, e.g. (UNTESTED!):

def sub(self, pattern, repl, count=0, deep=False, flags=None):
    changes = 0
    if flags:
        pattern = re.compile(pattern, flags)
    for elem in self.filtered:
        ...
        [rest of method unchanged]
        ...
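
The compile-with-flags step can be tried on its own, without ElementFilter. The helper below is a hypothetical stand-alone sketch (sub_with_flags is not part of the elementfilter module); it relies only on the fact that re.subn accepts either a pattern string or a compiled pattern object:

```python
import re


def sub_with_flags(pattern, repl, text, count=0, flags=0):
    """Hypothetical helper: compile the pattern with flags, then substitute.

    Returns a (new_text, number_of_changes) tuple, as re.subn does.
    """
    if flags:
        pattern = re.compile(pattern, flags)
    # A compiled pattern can be passed wherever a pattern string is accepted.
    return re.subn(pattern, repl, text, count=count)


new_text, changes = sub_with_flags(
    r"<title>.*?</title>",
    "<title>Gateway</title>",
    "<title>\nPortal\n</title>",
    flags=re.DOTALL,
)
print(new_text)
print(changes)
```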

Gerard
 
Holger Berger

Hi,

yes:

import re

a = """
I Am
Multiline
but short anyhow"""

# raw string, so that \s and \S reach the re module unchanged
b = r"(I[\s\S]*line)"

print re.search(b, a, re.MULTILINE).group(1)


gives

I Am
Multiline

Be aware that . does NOT match newlines (unless the re.DOTALL flag is
given)! Note also that re.MULTILINE only affects ^ and $.
Maybe this caused your problem?
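
To illustrate the difference between the two flags, here is a small sketch (the sample text is made up, and the tag name is only a placeholder) that empties a tag spanning several lines with re.sub:

```python
import re

text = "before\n<tag>\nfirst line\nsecond line\n</tag>\nafter\n"

# Without re.DOTALL, "." stops at every newline, so nothing matches here.
no_flag = re.search(r"<tag>.*?</tag>", text)

# With re.DOTALL, "." also matches "\n"; the lazy ".*?" stops at the
# first closing </tag> instead of the last one.
emptied = re.sub(r"<tag>.*?</tag>", "<tag></tag>", text, flags=re.DOTALL)

# re.MULTILINE only changes "^" and "$" so that they match at the start
# and end of each line, not just of the whole string.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

print(no_flag)
print(emptied)
print(line_starts)
```

This works for simple, non-nested tags; for nested or attribute-bearing tags an XML parser is the safer choice.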

regards
Holger
 
Zdenek Maxa

Hi,

Thanks a lot to all of you who replied to my question for the useful
hints. I can now easily do what I wanted.

Cheers,
Zdenek


 
