tushar.saxena
Hi,
I have a set of XML files from which I need to extract some data. The
format of each file is as follows:
<tag1>
<tag3>DATA1</tag3>
</tag1>
<tag2>
<tag3>DATA2</tag3>
</tag2>
I need to extract the DATA part of the XML structure.
Note: tag3 can appear inside either tag1 or tag2, but I need to
extract data only from tag1, i.e. DATA1 should be extracted, but not
DATA2.
If I want to get both DATA1 and DATA2, I can use a simple regex like:
if (($_ =~ /<tag3>(\w+)<\/tag3>/g))
{
print $1
}
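For reference, here is a minimal self-contained sketch of that first case. It assumes the whole sample is held in a single string (the variable names $xml and @all are only for illustration; how the file is actually read is not shown here) and uses the match in list context so that /g returns every capture rather than just the first:

```perl
use strict;
use warnings;

# Sample input copied from the format above, held in one string.
# Assumption: the real files would be slurped the same way.
my $xml = <<'END';
<tag1>
<tag3>DATA1</tag3>
</tag1>
<tag2>
<tag3>DATA2</tag3>
</tag2>
END

# In list context, a /g match returns all captured groups at once.
my @all = $xml =~ /<tag3>(\w+)<\/tag3>/g;

print "$_\n" for @all;
```

With the sample above this collects both DATA1 and DATA2, which is the behaviour I do not want for the tag1-only case.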
But when I try to get only DATA1 (embedded within tag1), I use
something like this, and I am unable to get it to work:
if (($_ =~ /<tag1>[\n\s\S\w\W]*<tag2>(\w+)<\/tag2>[\n\s\S\w\W]*<\/tag1>/g))
{
print $1
}
In this second case, the match itself fails.
Any help would be appreciated !
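One possible direction, sketched under assumptions rather than offered as a definitive fix: if $_ holds only one line at a time, a pattern spanning <tag1> ... </tag1> can never match, and the attempt above also looks for <tag2> inside tag1 where the sample has <tag3>. The sketch below assumes the whole file is in one string (the names $xml, $body and @found are hypothetical) and works in two steps: isolate each tag1 body, then look for tag3 inside it.

```perl
use strict;
use warnings;

# Sample input from the post, held in one string (assumption:
# the real file is slurped rather than read line by line).
my $xml = <<'END';
<tag1>
<tag3>DATA1</tag3>
</tag1>
<tag2>
<tag3>DATA2</tag3>
</tag2>
END

my @found;

# Step 1: capture the body of each <tag1>...</tag1> block.
# /s lets . match newlines; .*? is non-greedy so it stops at the
# first closing tag instead of the last one in the file.
while ($xml =~ /<tag1>(.*?)<\/tag1>/sg) {
    my $body = $1;

    # Step 2: collect every <tag3> value found inside that body only.
    push @found, $body =~ /<tag3>(\w+)<\/tag3>/g;
}

print "$_\n" for @found;
```

On the sample this keeps DATA1 and skips DATA2. It is still a regex over XML, so it assumes tag1 is never nested and carries no attributes; for anything less regular, a real parser such as XML::LibXML is likely the safer route.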