RE Perl Pattern matching

Deepan Perl XML Parser · Apr 2, 2008

Hi,
I have a string, say $str, whose value is as
below:

<responseStatus>HTTP/1.1 200 OK</responseStatus>

<cookies>

<cookie name="ASPSESSIONIDSQDCBDBA" path="/" domain="www-int.juniper.net">DOCFGJEAKNOMBLHCGEMOIMBA</cookie>

</cookies>

<headers>

<header name="Cache-control">private</header>

<header name="Content-Encoding">deflate</header>

<header name="Content-Type">text/html</header>

<header name="Date">Wed, 26 Mar 2008 04:48:16 GMT</header>

<header name="Server">Concealed by Juniper Networks Redline EX</header>

<header name="Set-Cookie">ASPSESSIONIDSQDCBDBA=DOCFGJEAKNOMBLHCGEMOIMBA; path=/</header>

<header name="Transfer-Encoding">chunked</header>

<header name="Vary">Accept-Encoding, User-Agent</header>

<header name="Via">1.1 sac-p-green-dx2 (Juniper Networks Application Acceleration Platform - DX 5.1.8 0)</header>

<header name="Warning">214 www-int.juniper.net "Juniper Networks DX Active"</header>

<header name="X-Powered-By">ASP.NET</header>

</headers>

<content>

<contentLength>27887</contentLength>

<compression>71.3</compression>

<encodingScheme>deflate</encodingScheme>

<text><![CDATA[
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"..."http://
www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">..<html>..<head>....<title>
Intranet Home Page</title>..<script language="JavaScript" type="text/
javascript">..function clicker()..{..document.seek2.qt.value =

document.seek1.qt.value;..return true;..}
..</div>....</body>..</html>..

]]></text>

<mimeType>text/html</mimeType>

</content>

----------------

Now I want to get everything between "<text><![CDATA[" and "]]></text>"
(i.e. I need to capture the CDATA section), and I am using the
code below:

if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
{
print $1;
}

But I am not getting anything. Can anyone find the fault in it?

Ben Bullock · Apr 2, 2008

Deepan said:
if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
{
print $1;
}

But not getting anything. Can anyone find out the fault in it?

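You need an "s" modifier at the end. Without it, "." does not match
a newline, so (.*) cannot span the line breaks inside the CDATA section: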

if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#s )

See http://perldoc.perl.org/perlre.html#Modifiers
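
To illustrate the difference, here is a minimal sketch. It assumes a shortened stand-in for the real $str; the sample string and the names $without_s, $with_s and $body are made up for the example.

```perl
use strict;
use warnings;

# Shortened, hypothetical stand-in for the real $str:
# the CDATA body spans several lines.
my $str = "<text><![CDATA[\n<html>\n<body>hello</body>\n</html>\n]]></text>";

# Without /s, "." does not match "\n", so the pattern cannot
# get past the first line break and the match fails.
my $without_s = $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># ? 1 : 0;

# With /s, "." matches newlines too, so the whole body is captured.
my ($body) = $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#s;
my $with_s = defined $body ? 1 : 0;

print "without /s matched: $without_s\n";
print "with /s matched:    $with_s\n";
print $body if $with_s;
```

Taking the capture in list context, as in `my ($body) = ...`, avoids reading a stale $1 left over from an earlier successful match.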

Deepan Perl XML Parser · Apr 2, 2008

Thank You Ben!

Mirco Wahab · Apr 2, 2008

Deepan said:
Now i want to get everything between "<text><![CDATA[" and "]]></
text>" [ie i need to capture the CDATA section]and i am using the
below code

if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
{
print $1;
}

Your expression is (apart from the missing /s modifier) perfectly valid,
but I'd like to make an additional remark. You could strip
the newline characters (if any) and extract more than one
CDATA section, something like:

my $reg = qr{
    <text>                  # find section <text>
    <!\[CDATA\[ [\r\n]?     # which contains a CDATA section
    (.+?)                   # capture the CDATA lines, non-greedily
    [\r\n]?\]\]>            # until CDATA terminator
    </text>                 # maybe even the <text> is closed properly
}sx;

print $1 while $str =~ /$reg/g; # extract each CDATA section
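
The non-greedy (.+?) is what lets this work with several sections. Under /s, a greedy (.*) would run from the first "<text><![CDATA[" to the last "]]></text>" and return one large match. A small sketch, assuming a made-up input with two sections (the sample string and the names @lazy and @greedy are for illustration only):

```perl
use strict;
use warnings;

# Hypothetical input containing two CDATA sections.
my $str = "<text><![CDATA[\nfirst\n]]></text>\n"
        . "<text><![CDATA[\nsecond\n]]></text>\n";

my $reg = qr{
    <text>
    <!\[CDATA\[ [\r\n]?     # opening marker, optional line break
    (.+?)                   # non-greedy: stop at the nearest terminator
    [\r\n]?\]\]>            # optional line break, closing marker
    </text>
}sx;

# In list context, m//g returns all captures in one go.
my @lazy = $str =~ /$reg/g;

# Greedy variant for comparison: swallows everything up to the last terminator.
my @greedy = $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#sg;

printf "non-greedy: %d section(s), greedy: %d section(s)\n",
    scalar @lazy, scalar @greedy;
print "[$_]\n" for @lazy;
```

Note that [\r\n]? removes at most one character, so with "\r\n" line endings a stray "\r" or "\n" may remain in the capture.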

Regards

M.

Ben Bullock · Apr 2, 2008

Chris said:
You're trying to parse XML with regular expressions. Don't do that.
Perl has a large selection of excellent modules for processing XML. Use
them.

Chris, do you talk like that to people in real life, or is it just the
internet?

Charlton Wilbur · Apr 3, 2008

BB> Chris, do you talk like that to people in real life, or is it
BB> just the internet?

When you've said the same thing over and over to people who aren't
getting it, there is a clear temptation to speak slowly, with short
sentences and short words.

Charlton

Martijn Lievaart · Apr 9, 2008

Ben Bullock said:
Chris, do you talk like that to people in real life, or is it just the
internet?

I do. Even (especially?) if someone is new around here and is making a
mistake thousands have made before.

M4
