My Regexp XML Parser -> Structured Perl Data, Cut & Paste Version, No Modules (Vol I)

Jürgen Exner

Matt said:
It would
be nice if you could drop the script-kiddie talk and write proper
English sentences in the future, though.

Well, actually I think it was just a typo and he meant to write "luff". It's
a common mistake for non-native English speakers to write vowels the way
they are pronounced.

Of course the semantics of that sentence in the context of Perl are still a
mystery.

jue

PS: Am I glad that my kill file is working fine.
 
Matt Garrish

Jürgen Exner said:
Well, actually I think it was just a typo and he meant to write "luff".
It's a common mistake for non-native English speakers to write vowels the
way they are pronounced.

But he's a proud American, you forget. It's still a long way from the
*laugh* I'm getting from him... : )

Matt
 
robic0

On Tue, 20 Dec 2005 23:59:06 -0800, robic0 wrote:

Update on the code. Lots of changes:

v.902
- Fixed white space issues surrounding attributes
- Fixed \"\' issue delimiting attribute content
- Fixed root container issues
- Fixed CDATA in comments
- Fixed comments in CDATA
- Added more warnings related to root level

Will be working on the usage regarding the
remove white spaces flag, i.e. whether it is applied
to content only or not.

To be done -
Still haven't incorporated ":" logic in attributes.
Still no special xml character conversions (easy though)
Will incorporate simple callbacks for content.
No doctype or others as of yet, will look into this.
Will make the parsing a function with error trapping
capability for the caller (down the road).
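
A minimal sketch of the ":" item on the list above, assuming attribute names may carry one optional prefix such as `xmlns:xsl`. The helper name `parse_qualified_attrs` and its regex are illustrative assumptions, not code from the posted parser. It also insists that a value is closed by the same quote character that opened it.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the posted parser.
# Accepts names such as "date" or "xmlns:xsl" and returns a hash ref.
sub parse_qualified_attrs {
    my ($attrstr) = @_;
    my %attrs;
    # name (optional "prefix:"), "=", then a value closed by the SAME quote
    while ($attrstr =~ s/^\s*((?:[A-Za-z_][\w.-]*:)?[A-Za-z_][\w.-]*)\s*=\s*(["'])(.*?)\2//s) {
        $attrs{$1} = $3;
    }
    # anything other than whitespace left over is a malformed assignment
    die "Invalid token in attribute assignment: $attrstr\n" if $attrstr =~ /\S/;
    return \%attrs;
}

my $attrs = parse_qualified_attrs(q{ xmlns:xsl="http://www.w3.org/1999/XSL/Transform" order = 'ascending' });
print "$_ => $attrs->{$_}\n" for sort keys %$attrs;
```

Given a mixed-quote attribute string like the one in `$gabage8` (`vtype = "truck'`), this helper dies instead of returning a hash, so a caller would have to trap that.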

The framework is working out pretty well.
Let me know if you have any questions or comments.

Thanks



print <<EOM;

# -----------------------
# XML Regex Parser
# Version .902 - 12/28/05
# Copyright 2005,
# by robic0-At-yahoo.com
# -----------------------
EOM

use strict;
use warnings;
use Data::Dumper;

#open DATA, "sumfile.xml" or die "can't open datafile...";
#my $gabage1 = join ('', <DATA>);
#close DATA;

my $gabage4 = '
<big name="asdf" date="33" >
asdf
<in1>
<!-- howdy f*%$olks -->
<in2>jjjj</in2>
<small biz="wefwf" ueue = "second" />
<!-- and still more -->
<bar><inside>asgfasdf<insF>2</insF>sdfb</inside></bar>
</in1>
<in2>some in3 content</in2>
asdfb
</big>
';

my $gabage5 = '
<root>

<!--
wasdfvgasvbg <![CDATA[ not really a CDATA ]]>
<tag>at tag in a real comment</tag>
<![CDATA[ not a CDATA ]]>
-->
<!-- This is a real comment -->
this is some content
<stag><br>some br stuff</br>after<t>some t

</t>

</stag>

<![CDATA[ <!-- imbed comment --> some text <!-- imbed as well -->]]>
<![CDATA[ <!-- imbed comment --> some text <!-- imbed as well -->]]>

</root>
';

my $gabage6 = "
<!-- This is a real comment -->

<node1
name = 'Barney'
date = \"1/1/05\"
/>

";

my $gabage7 = "
<node1
tire = 'Michelan'
size = \"235 x 16\">
Recalled by factory</node1>

";

my $gabage8 = "
<node1
color = 'green'
vtype = \"truck'
/>

";

my $gabage9 = '
<node>this is a node</node>
<node>this is a different</node>
';

my $gabage10 = '
<Node>this is a node</node>
<node>this is a different</Node>
';


my @xml_strings = ($gabage4, $gabage5, $gabage6, $gabage7, $gabage9,
$gabage10);

my $VERSION = .902;
my $debug = 0;
my $rmv_white_space = 0;
my $ForceArray = 0;
my $KeepRoot = 0;
my $KeepComments = 0;

## -- XML, start & end regexp substitution delimiter chars --
## match side , substitution side
## -------------------------/-------------------------------
my (@S_dlim, @E_dlim);
if ($debug) {
@S_dlim = ('\[' , '['); # use these for debug
@E_dlim = ('\]' , ']');
} else {
@S_dlim = (chr(140) , chr(140)); # use these for production
@E_dlim = (chr(141) , chr(141));
}

## -- Process xml data --
##
for (@xml_strings)
{
print "\n",'*'x30,"\nXML
string:\n",'-'x15,"$_\n\nOutput:\n",'-'x15,"\n\n";

my $ROOT = {}; # container
my %cdata_elements = ();
my ($last_cnt, $cnt, $i, $attr_error) = (-1, 1, 0, 0);

## Comment/CDATA block ==================================
#### To be done first -
# -- Questionable Comments --
while (s/(<!--(.*?)-->)/$S_dlim[1]$cnt$E_dlim[1]/s) {
#print "$cnt = Questionable comment: $1\n" if
($debug);
$ROOT->{$cnt} = $1;
$cnt++;
}
#### To be done second -
# -- Real CDATA --
while (s/<!\[CDATA\[(.*?)\]\]>/$S_dlim[1]$cnt$E_dlim[1]/s)
{
# reconstitute cdata contents
my $cdata_contents = $1;
my $str = '';
while ($cdata_contents =~
s/([^$S_dlim[0]$E_dlim[0]]+)|$S_dlim[0]([\d]+)$E_dlim[0]//) {
if (defined $1) {
$str .= $1;
} elsif (defined $2 && exists $ROOT->{$2}) {
$str .= $ROOT->{$2};
delete $ROOT->{$2};
} else {} # shouldn't get here
}
print "$cnt CDATA = $str\n" if ($debug);
$ROOT->{$cnt} = $str;
$cdata_elements{$cnt} = '';
$cnt++;
}
#### To be done third -
# -- Real Comments are left --
foreach my $key (sort {$a <=> $b} keys %{$ROOT}) {
if (!exists $cdata_elements{$key}) {
$ROOT->{$key} =~ s/^<!--(.*?)-->$/$1/s;
print "$key Comment = $1\n" if ($debug);
if ($KeepComments) {
$ROOT->{$key} = { comment => $1 };
} else {delete $ROOT->{$key};}
}
}
## End Comment/CDATA block ==============================

#### Non-tag markup go here -
# -- Versioning -- <?XML-Version ?> , have to check the format
of '<?'
while (s/<\?([^<>]*)\?>//) {} # void xml versioning for now
# while (s/<\?([^<>]*)\?>/$S_dlim[1]$cnt$E_dlim[1]/)
# { print "$cnt <$1> = \n" if ($debug); $cnt++}

#### White space removal before tags ? .. TBD -
if ($rmv_white_space) {
s/>[\s]+</></g;
s/[\s]+</</g;
s/>[\s]+/>/g;
}

#### Tags here - should only need 2 iterations max
my $finished = 0;
while ($cnt != $last_cnt && $i < 20)
{
$last_cnt = $cnt;

## <Tag/> , no content
while (s/<([0-9a-zA-Z]+)\/>/$S_dlim[1]$cnt$E_dlim[1]/)
{
print "$cnt <$1> = \n" if ($debug);
$ROOT->{$cnt} = { $1 => '' };
$cnt++;
}
## <Tag Attributes/> , no content
while
(s/<([0-9a-zA-Z]+)([\s]+[0-9a-zA-Z]+[\s]*=[\s]*["'][^<]*['"])+[\s]*\/>/$S_dlim[1]$cnt$E_dlim[1]/)
{
print "$cnt <$1> = attr: $2\n" if ($debug);
my $hattrib = getAttrHash($2);
if (ref($hattrib) ne "HASH") {
print "Invalid token in attribute
assignment:\n$hattrib\n"; $attr_error = 1; last;
}
$ROOT->{$cnt} = { $1 => $hattrib };
$cnt++;
}
## <Tag> Content </Tag>
while
(s/<([0-9a-zA-Z]+)>([^<]*)<\/\1>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <$1> = $2\n" if ($debug);
my $unknown = '';
if (length($2) > 0) {
my $hcontent = getContentHash($2,
$ROOT, \%cdata_elements);
$unknown = $hcontent;
if (keys (%{$hcontent}) > 1) {
if (!$ForceArray) {
adjustForSingleItemArrays ($hcontent); }
} else {
if (exists
$hcontent->{'content'} && scalar(@{$hcontent->{'content'}}) == 1) {
if (!$ForceArray ) {
$unknown =
${$hcontent->{'content'}}[0];
} else {$unknown =
$hcontent->{'content'}; }
}
if (!$ForceArray) {
adjustForSingleItemArrays ($hcontent); }
}
}
$ROOT->{$cnt} = { $1 => $unknown };
$cnt++;
}
last if ($attr_error);
## <Tag Attributes> Content </Tag>
while
(s/<([0-9a-zA-Z]+)([\s]+[0-9a-zA-Z]+[\s]*=[\s]*["'][^<]*['"])+[\s]*>([^<]*)<\/\1>/$S_dlim[1]$cnt$E_dlim[1]/)
{
print "$cnt <$1> = attr: $2, content: $3\n" if
($debug);
my $hattrib = getAttrHash($2);
if (ref($hattrib) ne "HASH") {
print "Invalid token in attribute
assignment:\n$hattrib\n"; $attr_error = 1; last;
}
if (length($3) > 0) {
my $hcontent = getContentHash($3,
$ROOT, \%cdata_elements);
if (!$ForceArray) {
adjustForSingleItemArrays ($hcontent); }
while (my ($key,$val) = each
(%{$hcontent})) {
$hattrib->{$key} = $val;
}
}
$ROOT->{$cnt} = { $1 => $hattrib };
$cnt++;
}
last if ($attr_error);
if ($last_cnt != $cnt) {
$i++; print "** End pass $i\n" if ($debug);
} else {
last if ($finished);
## Encapsulate the xml with a "root"
$_ = "<root>$_</root>";
$last_cnt--;
$finished = 1;
}
}
last if ($attr_error);
if (/<|>/) {
print "($i) XML problem: malformed, syntax or tag
closure:\n$_";
} else {
print "\n** Itterations = $i\n** ForceArray =
$ForceArray\n** KeepRoot = $KeepRoot\n** KeepComments =
$KeepComments\n\n";
#print Dumper($ROOT);
print "The remaining string is:\n$_\n\n" if ($debug);

## Strip off the outer element (our root) to
## examine the contents for errors.
## ---------------------------------------
my $outer_element = $cnt-1;
if (exists $ROOT->{$outer_element}) {

my $hroot = $ROOT->{$outer_element};
my ($key,$val) = each (%{$hroot});
my $htodump = $val;

# check for errors in root
if (ref($htodump) ne "HASH" || exists
$htodump->{'content'}) {
print "Error, bare content at root
level ..\n";
} else {
my $dmp_keys = keys (%{$htodump});

if ($dmp_keys > 1) {
print "Warning, multiple
elements at root level ..\n";
} else {
($key,$val) = each
(%{$htodump});
my $dmp_type = ref($val);

if ($dmp_keys == 0 || (exists
$htodump->{'comment'})) {
print "Warning, no
elements at root level ..\n";
}
if ($dmp_keys == 1) {
if ($dmp_type eq
"HASH") {
$htodump =
$val if (!$KeepRoot);
}
elsif ($dmp_type eq
"ARRAY") {
if
(!$ForceArray || scalar(@{$val}) > 1) {
print
"Warning, multiple elements at root level ..\n";
}
}
}
}
}
print "\n";
my $tmp = {};
%{$tmp} = %{$htodump};
print Dumper($tmp);
} else {
print "nothing to output!\n";
}
}
}
##
sub adjustForSingleItemArrays
{
my $href = shift;
## if $val is an array ref and has one element
## set $href->{$key} equal to the element
while (my ($key,$val) = each (%{$href})) {
if (ref($val) eq "ARRAY") {
if (scalar(@{$val}) == 1) {
$href->{$key} = $val->[0];
}
}
}
}
##
sub getAttrHash
{
my $attrstr = shift;
my $ahref = {};
return $ahref unless (defined $attrstr);
while ($attrstr =~
s/[\s]*([0-9a-zA-Z]+)[\s]*=[\s]*("|')([^=]*)\2[\s]*//i) {
$ahref->{$1} = $3;
}
if ($attrstr=~/=/) {
$attrstr =~ s/^\s+//s;
$attrstr =~ s/\s+$//s;
return $attrstr
}
return $ahref;
}
##
sub getContentHash
{
my ($contstr,$hStore,$hcdata_elements) = @_;
my $ahref = {};
return $ahref unless (defined $contstr && defined $hStore &&
defined $hcdata_elements);
my @ary = ();
my $append_flag = 0;

while ($contstr =~
s/^([^<$S_dlim[0]$E_dlim[0]]+)|$S_dlim[0]([\d]+)$E_dlim[0]//s) {
if (defined $1) {
my $tmp1 = $1;
# if flagged, append it to $ary[last]
if ($append_flag && scalar(@ary) > 0) {
my $size = scalar(@ary);
$ary[$size-1] .= $tmp1;
} else {
push (@ary, $1);
}
$append_flag = 0;
}
elsif (defined $2) {
# if it doesn't exist (Comments stripped?)
# turn on append flag.
if (!exists $hStore->{$2}) {
$append_flag = 1;
next;
}
# if it's a CDATA, append it to $ary[last],
# turn on append flag.
if (exists $hcdata_elements->{$2}) {
my $size = scalar(@ary);
if ($size > 0) {
$ary[$size-1] .=
$hStore->{$2};
} else {push (@ary, $hStore->{$2});}
$append_flag = 1;
next;
}
my ($key,$val) = each (%{$hStore->{$2}});
if (exists $ahref->{$key}) {
push (@{$ahref->{$key}}, $val);
} else {
$ahref->{$key} = [$val];
}
$append_flag = 0;
}
else {} # shouldn't get here
}

# store contents, strip out
# pure whitespace text elements
my $hary = [];
for (@ary) {
next if (/^\s+$/s);
push (@{$hary}, $_);
}
if (scalar(@{$hary}) > 0) {
$ahref->{'content'} = $hary;
}
## if $val is an array ref and has one element and it
## is a hash ref, set {$key} equal to hash ref
if (!$ForceArray) {
while (my ($key,$val) = each (%{$ahref})) {
if (ref($val) eq "ARRAY") {
if (scalar(@{$val}) == 1 &&
ref($val->[0]) eq "HASH") {
$ahref->{$key} = $val->[0];
}
}
}
}
return $ahref;
}

__END__



# -----------------------
# XML Regex Parser
# Version .902 - 12/28/05
# Copyright 2005,
# by robic0-At-yahoo.com
# -----------------------

******************************
XML string:
---------------
<big name="asdf" date="33" >
asdf
<in1>
<!-- howdy f*%$olks -->
<in2>jjjj</in2>
<small biz="wefwf" ueue = "second" />
<!-- and still more -->
<bar><inside>asgfasdf<insF>2</insF>sdfb</inside></bar>
</in1>
<in2>some in3 content</in2>
asdfb
</big>


Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0


$VAR1 = {
'in2' => 'some in3 content',
'date' => '33',
'name' => 'asdf',
'content' => [
'
asdf
',
'
asdfb
'
],
'in1' => {
'small' => {
'ueue' => 'second',
'biz' => 'wefwf'
},
'bar' => {
'inside' => {
'insF' => '2',
'content' => [

'asgfasdf',
'sdfb'
]
}
},
'in2' => 'jjjj'
}
};

******************************
XML string:
---------------
<root>

<!--
wasdfvgasvbg <![CDATA[ not really a CDATA ]]>
<tag>at tag in a real comment</tag>
<![CDATA[ not a CDATA ]]>
-->
<!-- This is a real comment -->
this is some content
<stag><br>some br stuff</br>after<t>some t

</t>

</stag>

<![CDATA[ <!-- imbed comment --> some text <!-- imbed as well -->]]>
<![CDATA[ <!-- imbed comment --> some text <!-- imbed as well -->]]>

</root>


Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0


$VAR1 = {
'content' => [
'



this is some content
',
'

<!-- imbed comment --> some text <!-- imbed as well -->
<!-- imbed comment --> some text <!-- imbed as well -->

'
],
'stag' => {
'br' => 'some br stuff',
'content' => 'after',
't' => 'some t

'
}
};

******************************
XML string:
---------------
<!-- This is a real comment -->

<node1
name = 'Barney'
date = "1/1/05"
/>



Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0


$VAR1 = {
'date' => '1/1/05',
'name' => 'Barney'
};

******************************
XML string:
---------------
<node1
tire = 'Michelan'
size = "235 x 16">
Recalled by factory</node1>



Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0


$VAR1 = {
'content' => '
Recalled by factory',
'tire' => 'Michelan',
'size' => '235 x 16'
};

******************************
XML string:
---------------
<node>this is a node</node>
<node>this is a different</node>


Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0

Warning, multiple elements at root level ..

$VAR1 = {
'node' => [
'this is a node',
'this is a different'
]
};

******************************
XML string:
---------------
<Node>this is a node</node>
<node>this is a different</Node>


Output:
---------------

(0) XML problem: malformed, syntax or tag closure:
<root>
<Node>this is a node</node>
<node>this is a different</Node>
</root>
 
robic0

On Tue, 20 Dec 2005 23:59:06 -0800, robic0 wrote:

Update on the code. Lots of changes:

v.902
- Fixed white space issues surrounding attributes
- Fixed \"\' issue delimiting attribute content
- Fixed root container issues
- Fixed CDATA in comments
- Fixed comments in CDATA
- Added more warnings related to root level
Oh and
- Fixed case sensitivity of tag names

-t
 
robic0

robic0 said:
I'm back on the job.
I'm going to post some new code this week that
complies with XML spec.

There is more than meets the eye.

An XML file may be well-formed, but invalid if it doesn't comply with
its DTD. Would your program complain about that ?

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ELEMENT root ((mytag|mytag2),myothertag+,notrequiredtag?)>
<!ELEMENT mytag (#PCDATA)>
<!ELEMENT myothertag (#PCDATA)>
]>
<root>
<mytag>content 1</mytag>
<myothertag>content 2</myothertag>
</root>

What about the declaration of entities ?

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY my_entity "this content was set by !ENTITY">
]>
<root>
<mytag>&my_entity;</mytag>
<myothertag>content 2</myothertag>
</root>

What about an ATTLIST ?

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ATTLIST mytag
att1 CDATA #REQUIRED
att2 CDATA #IMPLIED>
<!ATTLIST myothertag att3 CDATA #FIXED
"this content was set by !ATTLIST">
]>
<root>
<mytag att1="attvalue1" att2="attvalue2">content 1</mytag>
<myothertag>content 2</myothertag>
</root>

What you gonna do with specific XSL tags ?

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<root>
<xsl:sort select="@ID" order="ascending" />
<mytag>
<xsl:attribute name='{name()}'>
<xsl:value-of select="." />
</xsl:attribute>
</mytag>
</root>
</xsl:stylesheet>

What about the rules from an XML schema ?

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="root">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="mytag" maxOccurs="unbounded" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>

It would be a good idea to decode numeric character references:

<?xml version="1.0" encoding="UTF-8"?>
<root>
<mytag>&#105;</mytag>
</root>

Same for the non-numeric ones:

<?xml version="1.0" encoding="UTF-8"?>
<root>
<mytag>&amp;</mytag>
</root>
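
For the two kinds of reference shown above, a hand-rolled decoder might look like the following sketch. It assumes only the five entities predefined by XML plus decimal and hexadecimal character references. `decode_xml_refs` is a hypothetical name, and entities declared in a DTD (such as `&my_entity;`) are deliberately left untouched.

```perl
use strict;
use warnings;

# The five entities predefined by the XML specification.
my %predefined = (amp => '&', lt => '<', gt => '>', quot => '"', apos => "'");

# Hypothetical helper: decode everything in ONE pass, so that "&amp;lt;"
# becomes "&lt;" and is not decoded a second time into "<".
# Order of the alternatives: hexadecimal reference, decimal reference,
# then a named entity (unknown names are put back unchanged).
sub decode_xml_refs {
    my ($text) = @_;
    $text =~ s{&(?:#x([0-9A-Fa-f]+)|#([0-9]+)|(\w+));}{
        defined $1             ? chr(hex $1)
      : defined $2             ? chr($2)
      : exists $predefined{$3} ? $predefined{$3}
      :                          "&$3;"
    }ge;
    return $text;
}

print decode_xml_refs('&#105; &#x69; &amp; &lt;tag&gt; &my_entity;'), "\n";
```

The single substitution is a design choice: decoding `&amp;` first and the other entities afterwards would turn `&amp;lt;` into `<`, which is not what the document says.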

I would recommend "Perl & XML - XML Processing with Perl" by Erik T.
Ray & Jason McIntosh (published by O'Reilly). Very good book. See
http://www.oreilly.com/catalog/perlxml/.

You need to learn more about XML:

http://www.w3.org/XML/
http://www.xml.com/
http://www.w3schools.com/xml/default.asp (tip!)

I'll read when I get to XSL and schema tags.
This is very easy to do. Haven't got there yet.
Version .902 framework is a big jump as far as decoding.
Numeric and special xml characters are very simple to do. I'll
probably do that next to get it out of the way.
 
robic0

Well, at least you're getting as much out of this as I am. It would be nice
if you could drop the script-kiddie talk and write proper English sentences
in the future, though.


That's exactly my point. What is this thing supposed to do? The (very
simple) point of an XML parser is to verify the integrity of the document
(validation: either well-formedness or compliance to a dtd or schema) and/or
allow you to access the content.

Your parser has no appreciation of nesting beyond the very trivial, so there
is no way that it can check well-formedness. It (you) also doesn't
understand dtds or schemas, and don't realize how nearly impossible it's
going to be for your parser to validate against one.
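
The nesting point can be made concrete with a stack: push each opening tag, pop on each closing tag, and complain when the names differ. The sketch below assumes that approach. `nesting_error` is a hypothetical helper that checks tag nesting only; it knows nothing about comments, CDATA, DTDs or schemas, so it covers just one part of well-formedness.

```perl
use strict;
use warnings;

# Hypothetical helper: returns undef when tags nest properly,
# otherwise a short description of the first problem found.
sub nesting_error {
    my ($xml) = @_;
    my @open;    # stack of currently open tag names
    while ($xml =~ m{<(/?)([A-Za-z_][\w.:-]*)[^<>]*?(/?)>}g) {
        my ($closing, $name, $empty) = ($1, $2, $3);
        next if $empty;                          # <tag/> opens and closes itself
        if (!$closing) { push @open, $name; next; }
        my $expected = pop @open;
        return "unexpected </$name>" unless defined $expected;
        return "</$name> closes <$expected>" if $expected ne $name;
    }
    return @open ? "unclosed <$open[-1]>" : undef;
}

for my $doc ('<a><b/>text<c>1</c></a>', '<Node>this is a node</node>') {
    my $err = nesting_error($doc);
    print defined $err ? "not well-formed: $err\n" : "nesting ok\n";
}
```

Tag names are compared case-sensitively, so the second document is reported as a mismatch between `<Node>` and `</node>`.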
In the past I've used XML::Xerces module to interface Apache's
Xerces-C Version 2.3.0 for Windows. What do you use? How do you like
its documentation?
To get back to my original point, however, your parser does not build a
tree, so that makes it useless for half the applications of a parser. It
also doesn't handle events like a SAX parser, which makes it useless for the
other half. I'm honestly curious what real world application you think this
is going to have?

Parsers just parse and without thought extract monolithic chunks of
data, serially. Ever see a SAX parser do anything else?
Ever see a SAX parser build a bridge or maybe erect a pyramid in
Egypt? Perhaps you think that all the XML:: modules you call parsers
that do any number of different things are all parsing. Unless you
call a PARSER directly, you aren't really parsing are you? So maybe
you think "parsing" is somehow reserved, a thing one shouldn't even
attempt. A thing so complicated to you that it makes your pecker
droop just when you think about it....
 
Bart Van der Donck

robic0 said:
I'll read when I get to XSL and schema tags.
This is very easy to do.

Utter nonsense.
Numeric and special xml characters are very simple to do.

Sure:

use HTML::Entities();
my $decoded = HTML::Entities::decode($encoded);

Or do you want to code out that part by hand as well ?

Oh, and... most (if not all) readers here won't run your code because
the line ends of your post are messed up. You still don't seem to
understand this. That happens when your IQ is 17,0.

Cut & paste the code from your last post and try to run that yourself.
You'll see what I mean.
 
Matt Garrish

Parsers just parse and without thought extract monolithic chunks of
data, serially.

That ranks as one of the stupidest things I've seen written in 2005. Lucky
you squeaked that in before the end of the week.

So your "parser" doesn't do anything. It doesn't validate the data. It
doesn't allow it to be handled in any meaningful way. All it does is
randomly chunk up data for no use to no one. Pretty much what I figured you
were doing.

Matt
 
robic0

Well, at least you're getting as much out of this as I am. It would be nice
if you could drop the script-kiddie talk and write proper English sentences
in the future, though.


That's exactly my point. What is this thing supposed to do? The (very
simple) point of an XML parser is to verify the integrity of the document
(validation: either well-formedness or compliance to a dtd or schema) and/or
allow you to access the content.

Your parser has no appreciation of nesting beyond the very trivial, so there
is no way that it can check well-formedness. It (you) also doesn't
understand dtds or schemas, and don't realize how nearly impossible it's
going to be for your parser to validate against one.
Whether or not I can use it to write a schema checker is something I
will consider when I feel like it.
You have some misconception about the ability of schema to do 100%
validation. It can't on every level, period! In fact between level
sets, all it can do is validate a range of parents vs. a range of
children. It can't validate a relationship between a single parent
and possibly several children. So schema is woefully inadequate alone.
To propagate all the possible permutations would make schema 100%
valid. It doesn't have that capability and never will. If all you
use is schema to validate your xml, you don't know xml..
To get back to my original point, however, your parser does not build a
tree,
Doesn't build a tree? Wtf are you drinking?
so that makes it useless for half the applications of a parser. It
also doesn't handle events like a SAX parser, which makes it useless for the
other half. I'm honestly curious what real world application you think this
is going to have?

To start, it blows the doors off all parsers out there. It uses a
substitution method that exponentially gets quicker. It starts from
the inner xml blocks and works out. It takes data off right away
and builds a tree without waiting for parent closure. It's 100%
accurate because the logic is flawless. It works on a micro as opposed
to macro idiom. It's out of order with discrete cells. This is the
fastest possible method to parse xml. I'm surprised no one ever did
this before. It's on the level of a node model but it can easily
hone in on patterns and discard what's not necessary and filter.
All while using an examination method that exponentially gets
quicker as the search progresses.
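
The inside-out substitution described above can be sketched in a few lines. This is a reduced illustration under stated assumptions, not the posted code: it handles attribute-free tags only, overwrites repeated sibling tags, joins the text around child elements into one string, and uses the hypothetical name `parse_inside_out`. Whether the approach is faster than a single-pass parser is not something this sketch measures.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical, reduced version of the idea: attribute-free tags only.
sub parse_inside_out {
    my ($xml) = @_;
    my %store;        # placeholder number => [tag, value]
    my $next = 1;
    # An element whose content holds no "<" is innermost: swap it for a
    # numbered placeholder, which lets its parent match on a later try.
    while ($xml =~ s{<(\w+)>([^<]*)</\1>}{\x01$next\x02}) {
        my ($tag, $content) = ($1, $2);
        my %node;
        # Pull finished children out of the content by placeholder number.
        while ($content =~ s{\x01(\d+)\x02}{}) {
            my ($child_tag, $child_val) = @{ delete $store{$1} };
            $node{$child_tag} = $child_val;
        }
        $node{content} = $content if $content =~ /\S/;
        # An element holding nothing but text collapses to a plain string.
        my $text_only = exists $node{content} && scalar(keys %node) == 1;
        $store{$next++} = [ $tag, $text_only ? $content : \%node ];
    }
    die "XML problem: malformed, syntax or tag closure:\n$xml\n" if $xml =~ /[<>]/;
    # Whatever is still in the store was never claimed by a parent.
    my %top = map { @$_ } values %store;
    return \%top;
}

$Data::Dumper::Sortkeys = 1;
print Dumper(parse_inside_out('<bar><inside>asgfasdf<insF>2</insF>sdfb</inside></bar>'));
```

Each successful substitution restarts the search from the beginning of the string, so the number of passes grows with the number of elements even though the string itself keeps shrinking.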

I want your promise that when this idea takes off that you won't
have any part of it and continue to bury your head in the sand.
 
robic0

Utter nonsense.


Sure:

use HTML::Entities();
my $decoded = HTML::Entities::decode($encoded);

Or do you want to code out that part by hand as well ?

Oh, and... most (if not all) readers here won't run your code because
the line ends of your post are messed up. You still don't seem to
understand this. That happens when your IQ is 17,0.

Cut & paste the code from your last post and try to run that yourself.
You'll see what I mean.

Ok, it's my news reader; I never had to fine-tune it so much. I have my
post width set to 100 characters (it was 70). This should fix it.
I'm using Forte Agent btw.. I'll repost the code.
 
robic0

On Tue, 20 Dec 2005 23:59:06 -0800, robic0 wrote:

For easier reading and after an adjustment to Agent, I have expanded the width of my posts to 200 chrs/line.
This is still version .902. I will modularize this and add exception processing on the next go round.
-robic0-
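
One possible shape for the exception processing mentioned above, assuming plain `die` and `eval` are enough. `parse_or_die` is a hypothetical stand-in that only strips attribute-free elements from the inside out; the real tag passes would take its place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real tag passes: strip attribute-free
# elements from the inside out, then die if any markup is left over.
sub parse_or_die {
    my ($xml) = @_;
    1 while $xml =~ s{<(\w+)>[^<]*</\1>}{};
    die "XML problem: malformed, syntax or tag closure:\n$xml\n" if $xml =~ /[<>]/;
    return 1;
}

# The caller traps the exception with eval and inspects $@.
for my $doc ('<node>this is a node</node>', '<Node>this is a node</node>') {
    if (eval { parse_or_die($doc); 1 }) {
        print "parsed ok\n";
    } else {
        print "caught: $@";
    }
}
```

With this shape the parsing function never prints on its own; the caller decides whether a malformed document is fatal, logged or skipped.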


print <<EOM;

# -----------------------
# XML Regex Parser
# Version .902 - 12/28/05
# Copyright 2005,
# by robic0-At-yahoo.com
# -----------------------
EOM

use strict;
use warnings;
use Data::Dumper;

#open DATA, "sumfile.xml" or die "can't open datafile...";
#my $gabage1 = join ('', <DATA>);
#close DATA;

my $gabage4 = '
<big name="asdf" date="33" >
asdf
<in1>
<!-- howdy f*%$olks -->
<in2>jjjj</in2>
<small biz="wefwf" ueue = "second" />
<!-- and still more -->
<bar><inside>asgfasdf<insF>2</insF>sdfb</inside></bar>
</in1>
<in2>some in3 content</in2>
asdfb
</big>
';

my $gabage5 = '
<root>

<!--
wasdfvgasvbg <![CDATA[ not really a CDATA ]]>
<tag>at tag in a real comment</tag>
<![CDATA[ not a CDATA ]]>
-->
<!-- This is a real comment -->
this is some content
<stag><br>some br stuff</br>after<t>some t

</t>

</stag>

<![CDATA[ <!-- imbed comment --> some text <!-- imbed as well -->]]>
<![CDATA[ <!-- imbed comment --> some text <!-- imbed as well -->]]>

</root>
';

my $gabage6 = "
<!-- This is a real comment -->

<node1
name = 'Barney'
date = \"1/1/05\"
/>

";

my $gabage7 = "
<node1
tire = 'Michelan'
size = \"235 x 16\">
Recalled by factory</node1>

";

my $gabage8 = "
<node1
color = 'green'
vtype = \"truck'
/>

";

my $gabage9 = '
<node>this is a node</node>
<node>this is a different</node>
';

my $gabage10 = '
<Node>this is a node</node>
<node>this is a different</Node>
';


my @xml_strings = ($gabage4, $gabage5, $gabage6, $gabage7, $gabage9, $gabage10);

my $VERSION = .902;
my $debug = 0;
my $rmv_white_space = 0;
my $ForceArray = 0;
my $KeepRoot = 0;
my $KeepComments = 0;

## -- XML, start & end regexp substitution delimiter chars --
## match side , substitution side
## -------------------------/-------------------------------
my (@S_dlim, @E_dlim);
if ($debug) {
@S_dlim = ('\[' , '['); # use these for debug
@E_dlim = ('\]' , ']');
} else {
@S_dlim = (chr(140) , chr(140)); # use these for production
@E_dlim = (chr(141) , chr(141));
}

## -- Process xml data --
##
for (@xml_strings)
{
print "\n",'*'x30,"\nXML string:\n",'-'x15,"$_\n\nOutput:\n",'-'x15,"\n\n";

my $ROOT = {}; # container
my %cdata_elements = ();
my ($last_cnt, $cnt, $i, $attr_error) = (-1, 1, 0, 0);

## Comment/CDATA block ==================================
#### To be done first -
# -- Questionable Comments --
while (s/(<!--(.*?)-->)/$S_dlim[1]$cnt$E_dlim[1]/s) {
#print "$cnt = Questionable comment: $1\n" if ($debug);
$ROOT->{$cnt} = $1;
$cnt++;
}
#### To be done second -
# -- Real CDATA --
while (s/<!\[CDATA\[(.*?)\]\]>/$S_dlim[1]$cnt$E_dlim[1]/s)
{
# reconstitute cdata contents
my $cdata_contents = $1;
my $str = '';
while ($cdata_contents =~ s/([^$S_dlim[0]$E_dlim[0]]+)|$S_dlim[0]([\d]+)$E_dlim[0]//) {
if (defined $1) {
$str .= $1;
} elsif (defined $2 && exists $ROOT->{$2}) {
$str .= $ROOT->{$2};
delete $ROOT->{$2};
} else {} # shouldn't get here
}
print "$cnt CDATA = $str\n" if ($debug);
$ROOT->{$cnt} = $str;
$cdata_elements{$cnt} = '';
$cnt++;
}
#### To be done third -
# -- Real Comments are left --
foreach my $key (sort {$a <=> $b} keys %{$ROOT}) {
if (!exists $cdata_elements{$key}) {
$ROOT->{$key} =~ s/^<!--(.*?)-->$/$1/s;
print "$key Comment = $1\n" if ($debug);
if ($KeepComments) {
$ROOT->{$key} = { comment => $1 };
} else {delete $ROOT->{$key};}
}
}
## End Comment/CDATA block ==============================

#### Non-tag markup go here -
# -- Versioning -- <?XML-Version ?> , have to check the format of '<?'
while (s/<\?([^<>]*)\?>//) {} # void xml versioning for now
# while (s/<\?([^<>]*)\?>/$S_dlim[1]$cnt$E_dlim[1]/)
# { print "$cnt <$1> = \n" if ($debug); $cnt++}

#### White space removal before tags ? .. TBD -
if ($rmv_white_space) {
s/>[\s]+</></g;
s/[\s]+</</g;
s/>[\s]+/>/g;
}

#### Tags here - should only need 2 iterations max
my $finished = 0;
while ($cnt != $last_cnt && $i < 20)
{
$last_cnt = $cnt;

## <Tag/> , no content
while (s/<([0-9a-zA-Z]+)\/>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <$1> = \n" if ($debug);
$ROOT->{$cnt} = { $1 => '' };
$cnt++;
}
## <Tag Attributes/> , no content
while (s/<([0-9a-zA-Z]+)([\s]+[0-9a-zA-Z]+[\s]*=[\s]*["'][^<]*['"])+[\s]*\/>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <$1> = attr: $2\n" if ($debug);
my $hattrib = getAttrHash($2);
if (ref($hattrib) ne "HASH") {
print "Invalid token in attribute assignment:\n$hattrib\n"; $attr_error = 1; last;
}
$ROOT->{$cnt} = { $1 => $hattrib };
$cnt++;
}
## <Tag> Content </Tag>
while (s/<([0-9a-zA-Z]+)>([^<]*)<\/\1>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <$1> = $2\n" if ($debug);
my $unknown = '';
if (length($2) > 0) {
my $hcontent = getContentHash($2, $ROOT, \%cdata_elements);
$unknown = $hcontent;
if (keys (%{$hcontent}) > 1) {
if (!$ForceArray) { adjustForSingleItemArrays ($hcontent); }
} else {
if (exists $hcontent->{'content'} && scalar(@{$hcontent->{'content'}}) == 1) {
if (!$ForceArray ) {
$unknown = ${$hcontent->{'content'}}[0];
} else {$unknown = $hcontent->{'content'}; }
}
if (!$ForceArray) { adjustForSingleItemArrays ($hcontent); }
}
}
$ROOT->{$cnt} = { $1 => $unknown };
$cnt++;
}
last if ($attr_error);
## <Tag Attributes> Content </Tag>
while (s/<([0-9a-zA-Z]+)([\s]+[0-9a-zA-Z]+[\s]*=[\s]*["'][^<]*['"])+[\s]*>([^<]*)<\/\1>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <$1> = attr: $2, content: $3\n" if ($debug);
my $hattrib = getAttrHash($2);
if (ref($hattrib) ne "HASH") {
print "Invalid token in attribute assignment:\n$hattrib\n"; $attr_error = 1; last;
}
if (length($3) > 0) {
my $hcontent = getContentHash($3, $ROOT, \%cdata_elements);
if (!$ForceArray) { adjustForSingleItemArrays ($hcontent); }
while (my ($key,$val) = each (%{$hcontent})) {
$hattrib->{$key} = $val;
}
}
$ROOT->{$cnt} = { $1 => $hattrib };
$cnt++;
}
last if ($attr_error);
if ($last_cnt != $cnt) {
$i++; print "** End pass $i\n" if ($debug);
} else {
last if ($finished);
## Encapsulate the xml with a "root"
$_ = "<root>$_</root>";
$last_cnt--;
$finished = 1;
}
}
last if ($attr_error);
if (/<|>/) {
print "($i) XML problem: malformed, syntax or tag closure:\n$_";
} else {
print "\n** Itterations = $i\n** ForceArray = $ForceArray\n** KeepRoot = $KeepRoot\n** KeepComments = $KeepComments\n\n";
#print Dumper($ROOT);
print "The remaining string is:\n$_\n\n" if ($debug);

## Strip off the outer element (our root) to
## examine the contents for errors.
## ---------------------------------------
my $outer_element = $cnt-1;
if (exists $ROOT->{$outer_element}) {

my $hroot = $ROOT->{$outer_element};
my ($key,$val) = each (%{$hroot});
my $htodump = $val;

# check for errors in root
if (ref($htodump) ne "HASH" || exists $htodump->{'content'}) {
print "Error, bare content at root level ..\n";
} else {
my $dmp_keys = keys (%{$htodump});

if ($dmp_keys > 1) {
print "Warning, multiple elements at root level ..\n";
} else {
($key,$val) = each (%{$htodump});
my $dmp_type = ref($val);

if ($dmp_keys == 0 || (exists $htodump->{'comment'})) {
print "Warning, no elements at root level ..\n";
}
if ($dmp_keys == 1) {
if ($dmp_type eq "HASH") {
$htodump = $val if (!$KeepRoot);
}
elsif ($dmp_type eq "ARRAY") {
if (!$ForceArray || scalar(@{$val}) > 1) {
print "Warning, multiple elements at root level ..\n";
}
}
}
}
}
print "\n";
my $tmp = {};
%{$tmp} = %{$htodump};
print Dumper($tmp);
} else {
print "nothing to output!\n";
}
}
}
##
sub adjustForSingleItemArrays
{
my $href = shift;
## if $val is an array ref and has one element
## set $href->{$key} equal to the element
while (my ($key,$val) = each (%{$href})) {
if (ref($val) eq "ARRAY") {
if (scalar(@{$val}) == 1) {
$href->{$key} = $val->[0];
}
}
}
}
##
sub getAttrHash
{
my $attrstr = shift;
my $ahref = {};
return $ahref unless (defined $attrstr);
while ($attrstr =~ s/[\s]*([0-9a-zA-Z]+)[\s]*=[\s]*("|')([^=]*)\2[\s]*//i) {
$ahref->{$1} = $3;
}
if ($attrstr=~/=/) {
$attrstr =~ s/^\s+//s;
$attrstr =~ s/\s+$//s;
return $attrstr
}
return $ahref;
}
##
sub getContentHash
{
my ($contstr,$hStore,$hcdata_elements) = @_;
my $ahref = {};
return $ahref unless (defined $contstr && defined $hStore && defined $hcdata_elements);
my @ary = ();
my $append_flag = 0;

while ($contstr =~ s/^([^<$S_dlim[0]$E_dlim[0]]+)|$S_dlim[0]([\d]+)$E_dlim[0]//s) {
if (defined $1) {
my $tmp1 = $1;
# if flagged, append it to $ary[last]
if ($append_flag && scalar(@ary) > 0) {
my $size = scalar(@ary);
$ary[$size-1] .= $tmp1;
} else {
push (@ary, $1);
}
$append_flag = 0;
}
elsif (defined $2) {
# if it doesn't exist (Comments stripped?)
# turn on append flag.
if (!exists $hStore->{$2}) {
$append_flag = 1;
next;
}
# if it's a CDATA, append it to $ary[last],
# turn on append flag.
if (exists $hcdata_elements->{$2}) {
my $size = scalar(@ary);
if ($size > 0) {
$ary[$size-1] .= $hStore->{$2};
} else {push (@ary, $hStore->{$2});}
$append_flag = 1;
next;
}
my ($key,$val) = each (%{$hStore->{$2}});
if (exists $ahref->{$key}) {
push (@{$ahref->{$key}}, $val);
} else {
$ahref->{$key} = [$val];
}
$append_flag = 0;
}
else {} # shouldn't get here
}

# store contents, strip out
# pure whitespace text elements
my $hary = [];
for (@ary) {
next if (/^\s+$/s);
push (@{$hary}, $_);
}
if (scalar(@{$hary}) > 0) {
$ahref->{'content'} = $hary;
}
## if $val is an array ref and has one element and it
## is a hash ref, set {$key} equal to hash ref
if (!$ForceArray) {
while (my ($key,$val) = each (%{$ahref})) {
if (ref($val) eq "ARRAY") {
if (scalar(@{$val}) == 1 && ref($val->[0]) eq "HASH") {
$ahref->{$key} = $val->[0];
}
}
}
}
return $ahref;
}

__END__

# -----------------------
# XML Regex Parser
# Version .902 - 12/28/05
# Copyright 2005,
# by robic0-At-yahoo.com
# -----------------------

******************************
XML string:
---------------
<big name="asdf" date="33" >
asdf
<in1>
<!-- howdy f*%$olks -->
<in2>jjjj</in2>
<small biz="wefwf" ueue = "second" />
<!-- and still more -->
<bar><inside>asgfasdf<insF>2</insF>sdfb</inside></bar>
</in1>
<in2>some in3 content</in2>
asdfb
</big>


Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0


$VAR1 = {
'in2' => 'some in3 content',
'date' => '33',
'name' => 'asdf',
'content' => [
'
asdf
',
'
asdfb
'
],
'in1' => {
'small' => {
'ueue' => 'second',
'biz' => 'wefwf'
},
'bar' => {
'inside' => {
'insF' => '2',
'content' => [
'asgfasdf',
'sdfb'
]
}
},
'in2' => 'jjjj'
}
};

******************************
XML string:
---------------
<root>

<!--
wasdfvgasvbg <![CDATA[ not really a CDATA ]]>
<tag>at tag in a real comment</tag>
<![CDATA[ not a CDATA ]]>
-->
<!-- This is a real comment -->
this is some content
<stag><br>some br stuff</br>after<t>some t

</t>

</stag>

<![CDATA[ <!-- imbed comment --> some text <!-- imbed as well -->]]>
<![CDATA[ <!-- imbed comment --> some text <!-- imbed as well -->]]>

</root>


Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0


$VAR1 = {
'content' => [
'



this is some content
',
'

<!-- imbed comment --> some text <!-- imbed as well -->
<!-- imbed comment --> some text <!-- imbed as well -->

'
],
'stag' => {
'br' => 'some br stuff',
'content' => 'after',
't' => 'some t

'
}
};

******************************
XML string:
---------------
<!-- This is a real comment -->

<node1
name = 'Barney'
date = "1/1/05"
/>



Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0


$VAR1 = {
'date' => '1/1/05',
'name' => 'Barney'
};

******************************
XML string:
---------------
<node1
tire = 'Michelan'
size = "235 x 16">
Recalled by factory</node1>



Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0


$VAR1 = {
'content' => '
Recalled by factory',
'tire' => 'Michelan',
'size' => '235 x 16'
};

******************************
XML string:
---------------
<node>this is a node</node>
<node>this is a different</node>


Output:
---------------


** Itterations = 2
** ForceArray = 0
** KeepRoot = 0
** KeepComments = 0

Warning, multiple elements at root level ..

$VAR1 = {
'node' => [
'this is a node',
'this is a different'
]
};

******************************
XML string:
---------------
<Node>this is a node</node>
<node>this is a different</Node>


Output:
---------------

(0) XML problem: malformed, syntax or tag closure:
<root>
<Node>this is a node</node>
<node>this is a different</Node>
</root>
 
M

Matt Garrish

Whether or not I can use it to write a schema checker is something I
will consider when I feel like it.
You have some misconception about the ability of schema to do 100%
validation.

I have no misconceptions about what Schemas can do.
I also have no misconceptions about the ability of a DTD to specify
a document's structure.

I do, however, have no faith in your ability to validate against either.
To start, it blows the doors off all parsers out there.

No, it doesn't. Quick-and-dirty regular expression parsers have been around
for a long time. Google it if you really think you're doing something new.

You are still missing fundamental concepts of XML, namely order. Using your
code (what is "gabage"?):

my $gabage4 = <<TEST;
<problem>
<elem><i>i</i> see a problem <b>here</b> with inline elements</elem>
</problem>
TEST

Nets me the following wonderful output:

$VAR1 = {
'elem' => {
'b' => 'here',
'content' => [
' see a problem ',
' with inline elements'
],
'i' => 'i'
}
};

Anyway, this is really growing tiresome, so when you release your module to
CPAN be sure to make an announcement so we can all be awed.

Matt
 
J

John Bokma

robic0 said:
To start, it blows the doors off all parsers out there. It uses a
substitution method that exponentially gets quicker. It starts from
the inner xml blocks and works out. It takes data off right away
and builds a tree without waiting for parent closure. It's 100%
accurate because the logic is flawless. It works on a micro as opposed
to macro idiom. It's out of order with discrete cells. This is the
fastest possible method to parse xml. I'm surprised no one ever did
this before. It's on the level of a node model but it can easily
hone in on patterns and discard what's not necessary and filter.
All while using an examination method that exponentially gets
quicker as the search progresses.

Does anyone have a free position in the marketing department?
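
For reference, the inner-blocks-first substitution described in the quoted post can be illustrated with a toy example. This is only a sketch under simplifying assumptions (no attributes, comments, or CDATA, and bracket placeholders like the script's debug delimiters); it is not the posted parser, and the variable names are made up for illustration:

```perl
use strict;
use warnings;

# Toy illustration of inside-out substitution (assumed simplification):
# repeatedly replace an element that contains no further tags with a
# numbered placeholder, and remember what the placeholder stood for.
my $xml = '<a><b>one</b><c>two</c></a>';
my %store;                          # placeholder number => [tag, raw content]
my $cnt = 1;
my ($open, $close) = ('[', ']');    # assumed placeholder delimiters

# [^<]* ensures the matched content holds no remaining tags, so leaf
# elements are consumed first; a parent only matches after all of its
# children have been turned into placeholders.
while ($xml =~ s{<(\w+)>([^<]*)</\1>}{$open$cnt$close}) {
    $store{$cnt} = [$1, $2];
    $cnt++;
}

print "remaining: $xml\n";
for my $n (sort { $a <=> $b } keys %store) {
    print "$n: <$store{$n}[0]> = $store{$n}[1]\n";
}
```

Here the two leaves are stored first and the parent last, with its content recorded as the placeholder string `[1][2]`. Turning those placeholders back into a tree is a separate step.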
 
R

robic0

Matt said:
Whether or not I can use it to write a schema checker is something I
will consider when I feel like it.
You have some misconception about the ability of schema to do 100%
validation.

I have no misconceptions about what Schemas can do.
I also have no misconceptions about the ability of a DTD to specify
a document's structure.

I do, however, have no faith in your ability to validate against either.
To start, it blows the doors off all parsers out there.

No, it doesn't. Quick-and-dirty regular expression parsers have been around
for a long time. Google it if you really think you're doing something new.

You are still missing fundamental concepts of XML, namely order. Using your
code (what is "gabage"?):

my $gabage4 = <<TEST;
<problem>
<elem><i>i</i> see a problem <b>here</b> with inline elements</elem>
</problem>
TEST

Nets me the following wonderful output:

$VAR1 = {
'elem' => {
'b' => 'here',
'content' => [
' see a problem ',
' with inline elements'
],
'i' => 'i'
}
};

Anyway, this is really growing tiresome, so when you release your module to
CPAN be sure to make an announcement so we can all be awed.

Matt

OK, inline mixed content is an issue that will produce a different form
IF it is taken into account. The default (above) is where inline
non-nested tags are guaranteed available at the same level.

I will add a flag to keep the ordering of inline mixed content. Ie:

$KeepInlineOrder = 1;

would change the output (html?) to something like this:

$VAR1 = {
'elem' => {
'content' => [
{'i' => 'i'},
' see a problem ',
{'b' => 'here'},
' with inline elements'
]
}
};

or, with extended root stripping:

$VAR1 = [{'i' => 'i'},' see a problem ',{'b' => 'here'},' with inline elements'];
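
A structure in that ordered form can be read back in document order with a short recursive walk. The following is a minimal sketch, assuming each item is either a plain string or a single-key hash ref whose value is a string or a nested array; `flatten_inline` is a hypothetical helper name, not something in the posted script:

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild the text of mixed content in document order.
# Assumes items are plain strings, array refs of items, or single-key
# hash refs of the form { tag => content }.
sub flatten_inline {
    my ($node) = @_;
    my $type = ref $node;
    return $node unless $type;      # plain text: return it unchanged
    if ($type eq 'ARRAY') {
        return join '', map { flatten_inline($_) } @$node;
    }
    if ($type eq 'HASH') {
        # one key is expected; sorting just keeps the walk deterministic
        return join '', map { flatten_inline($node->{$_}) } sort keys %$node;
    }
    return '';                      # anything else is ignored in this sketch
}

my $ordered = [ {'i' => 'i'}, ' see a problem ', {'b' => 'here'}, ' with inline elements' ];
print flatten_inline($ordered), "\n";
```

With the array form the sentence comes back as "i see a problem here with inline elements", which is not recoverable from the default hash layout.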
 
T

Tad McClellan

Matt Garrish said:
<robic0> wrote in message
[snip]

You are still missing fundamental concepts of XML, namely order. Using your
code (what is "gabage"?):

my $gabage4 = <<TEST;
<problem>
<elem><i>i</i> see a problem <b>here</b> with inline elements</elem>
</problem>
TEST

Nets me the following wonderful output:

$VAR1 = {
'elem' => {
'b' => 'here',
'content' => [
' see a problem ',
' with inline elements'
],
'i' => 'i'
}
};


I think you have discovered a use for this wizard's code.

It is a Yoda-speak generator!


-------------------------
#!/usr/bin/perl
use warnings;
use strict;

my $VAR1 = {
'elem' => {
'b' => 'here',
'content' => [
' see a problem ',
' with inline elements'
],
'i' => 'i'
}
};

speak_yoda($VAR1);

sub speak_yoda {
foreach ( @_ ) {
if ( ref $_ eq 'HASH' ) { speak_yoda( values %$_ ) }
elsif ( ref $_ eq 'ARRAY' ) { speak_yoda( @$_ ) }
else { print "$_\n" }
}
}
-------------------------

Nets me the following wonderful output:

here
see a problem
with inline elements
i


:)
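
A side note on the order of those lines: it comes from Perl's internal hash ordering, which is not insertion order and may differ between Perl versions or runs. A small variation, sketched on the assumption that a repeatable (though still not document) order is wanted, visits keys in sorted order; `speak_sorted` is a hypothetical name:

```perl
use strict;
use warnings;

# Hypothetical variant of speak_yoda: collect the leaf strings, visiting
# hash keys in sorted order so the result is the same on every run.
sub speak_sorted {
    my @out;
    foreach my $item (@_) {
        if (ref $item eq 'HASH') {
            push @out, speak_sorted(map { $item->{$_} } sort keys %$item);
        }
        elsif (ref $item eq 'ARRAY') {
            push @out, speak_sorted(@$item);
        }
        else {
            push @out, $item;
        }
    }
    return @out;
}

my $data = {
    'elem' => {
        'b'       => 'here',
        'content' => [ ' see a problem ', ' with inline elements' ],
        'i'       => 'i'
    }
};
print "$_\n" for speak_sorted($data);
```

Sorting makes the output stable, but it still cannot restore the original element order, because that information is gone once the children are stored as hash keys.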
 
M

Matt Garrish

Tad McClellan said:
Matt Garrish said:
<robic0> wrote in message

[snip]

You are still missing fundamental concepts of XML, namely order. Using
your code (what is "gabage"?):

my $gabage4 = <<TEST;
<problem>
<elem><i>i</i> see a problem <b>here</b> with inline elements</elem>
</problem>
TEST

Nets me the following wonderful output:

$VAR1 = {
'elem' => {
'b' => 'here',
'content' => [
' see a problem ',
' with inline elements'
],
'i' => 'i'
}
};


I think you have discovered a use for this wizard's code.

It is a Yoda-speak generator!

I wondered how long until someone spotted that! It wasn't intentional
on my part, but that was my first thought too when I reread the
output after posting!

Maybe Lucas will buy it and make him rich. Damn! : )

Matt
 
R

robic0

On Tue, 20 Dec 2005 23:59:06 -0800, robic0 wrote:

Version .903
Added a new flag:

- KeepContentOrder

Attributes are not kept in order. I could
easily do it if it's really useful; however,
it would extend the containing tag out another
array level.

The output is truncated to save space; the input is
ActiveState's Perl 5.6 release HTML file.

Let me know if you have any questions.
-robic0-


print <<EOM;

# -----------------------
# XML Regex Parser
# Version .903 - 12/31/05
# Copyright 2005,
# by robic0-At-yahoo.com
# -----------------------
EOM

use strict;
use warnings;
use Data::Dumper;

# parenthesized open so that "or die" actually applies to the open call
open(my $fh, '<', "CHANGES56.html") or die "can't open CHANGES56.html: $!";
my $gabage1 = join ('', <$fh>);
close $fh;

my @xml_strings = ($gabage1);

my $alt_debug = 0;
my $VERSION = .903;
my $debug = 0;
my $rmv_white_space = 0;
my $ForceArray = 0;
my $KeepRoot = 1;
my $KeepComments = 1;
my $KeepContentOrder = 1;

## -- XML, start & end regexp substitution delimiter chars --
## match side , substitution side
## -------------------------/-------------------------------
my (@S_dlim, @E_dlim);
if ($debug) {
@S_dlim = ('\[' , '['); # use these for debug
@E_dlim = ('\]' , ']');
} else {
@S_dlim = (chr(140) , chr(140)); # use these for production
@E_dlim = (chr(141) , chr(141));
}

## -- Process xml data --
##
for (@xml_strings)
{
print "\n",'*'x30,"\nXML string:\n",'-'x15,"\n$_\n\nOutput:\n",'-'x15,"\n\n";
if ($alt_debug) {
ProcessAltDebugInfo ($_) ;
print "\n";
}
my $ROOT = {}; # container
my %cdata_elements = ();
my ($last_cnt, $cnt, $i, $attr_error) = (-1, 1, 0, 0);

## Comment/CDATA block ==================================
#### To be done first -
# -- Questionable Comments --
while (s/(<!--(.*?)-->)/$S_dlim[1]$cnt$E_dlim[1]/s) {
#print "$cnt = Questionable comment: $1\n" if ($debug);
$ROOT->{$cnt} = $1;
$cnt++;
}
#### To be done second -
# -- Real CDATA --
while (s/<!\[CDATA\[(.*?)\]\]>/$S_dlim[1]$cnt$E_dlim[1]/s)
{
# reconstitute cdata contents
my $cdata_contents = $1;
my $str = '';
while ($cdata_contents =~ s/([^$S_dlim[0]$E_dlim[0]]+)|$S_dlim[0]([\d]+)$E_dlim[0]//) {
if (defined $1) {
$str .= $1;
} elsif (defined $2 && exists $ROOT->{$2}) {
$str .= $ROOT->{$2};
delete $ROOT->{$2};
} else {} # shouldn't get here
}
print "$cnt CDATA = $str\n" if ($debug);
$ROOT->{$cnt} = $str;
$cdata_elements{$cnt} = '';
$cnt++;
}
#### To be done third -
# -- Real Comments are left --
foreach my $key (sort {$a <=> $b} keys %{$ROOT}) {
if (!exists $cdata_elements{$key}) {
$ROOT->{$key} =~ s/^<!--(.*?)-->$/$1/s;
print "$key Comment = $1\n" if ($debug);
if ($KeepComments) {
$ROOT->{$key} = { comment => $1 };
} else {delete $ROOT->{$key};}
}
}
## End Comment/CDATA block ==============================

#### Non-tag markups go here -
####

# -- Versioning -- <?XML-Version ?> - Placeholder, voided
while (s/<\?([^<>]*)\?>//) {
#while (s/<\?([^<>]*)\?>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <? ?> = $1\n" if ($debug);
$ROOT->{$cnt} = { 'XMLV' => $1 };
$cnt++;
}
# -- DOCTYPE -- <!DOCTYPE info> - Placeholder, voided
while (s/<!DOCTYPE([^<>]*)>//) {
#while (s/<!DOCTYPE([^<>]*)>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt DOCTYPE = $1\n" if ($debug);
$ROOT->{$cnt} = { 'DOCTYPE' => $1 };
$cnt++;
}
#### White space removal before tags ? .. TBD -
if ($rmv_white_space) {
s/>[\s]+</></g;
s/[\s]+</</g;
s/>[\s]+/>/g;
}

#### Tags here - should only need 2 iterations max
my $finished = 0;
while ($cnt != $last_cnt && $i < 20)
{
$last_cnt = $cnt;

## <Tag/> , no content
while (s/<([0-9a-zA-Z]+)\/>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <$1> = \n" if ($debug);
$ROOT->{$cnt} = { $1 => '' };
$cnt++;
}
## <Tag Attributes/> , no content
while (s/<([0-9a-zA-Z]+)([\s]+[0-9a-zA-Z]+[\s]*=[\s]*["'][^<]*['"])+[\s]*\/>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <$1> = attr: $2\n" if ($debug);
my $hattrib = getAttrHash($2);
if (ref($hattrib) ne "HASH") {
print "Invalid token in attribute assignment:\n$hattrib\n"; $attr_error = 1; last;
}
$ROOT->{$cnt} = { $1 => $hattrib };
$cnt++;
}
## <Tag> Content </Tag>
while (s/<([0-9a-zA-Z]+)>([^<]*)<\/\1>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <$1> = $2\n" if ($debug);
my $unknown = '';
if (length($2) > 0) {
my $hcontent = getContentHash($2, $ROOT, \%cdata_elements);
$unknown = $hcontent;
if (keys (%{$hcontent}) > 1) {
if (!$ForceArray) { adjustForSingleItemArrays ($hcontent); }
} else {
if (exists $hcontent->{'content'})
{
my ($key);
if (!$ForceArray ) {
if (ref($hcontent->{'content'}) eq "ARRAY" && scalar(@{$hcontent->{'content'}}) == 1) {
$unknown = ${$hcontent->{'content'}}[0];
}
else {$unknown = $hcontent->{'content'}; }
}
}
if (!$ForceArray) { adjustForSingleItemArrays ($hcontent); }
}
}
$ROOT->{$cnt} = { $1 => $unknown };
$cnt++;
}
last if ($attr_error);
## <Tag Attributes> Content </Tag>
while (s/<([0-9a-zA-Z]+)([\s]+[0-9a-zA-Z]+[\s]*=[\s]*["'][^<]*['"])+[\s]*>([^<]*)<\/\1>/$S_dlim[1]$cnt$E_dlim[1]/) {
print "$cnt <$1> = attr: $2, content: $3\n" if ($debug);
my $hattrib = getAttrHash($2);
if (ref($hattrib) ne "HASH") {
print "Invalid token in attribute assignment:\n$hattrib\n"; $attr_error = 1; last;
}
if (length($3) > 0) {
my $hcontent = getContentHash($3, $ROOT, \%cdata_elements);
if (!$ForceArray) { adjustForSingleItemArrays ($hcontent); }
while (my ($key,$val) = each (%{$hcontent})) {
$hattrib->{$key} = $val;
}
}
$ROOT->{$cnt} = { $1 => $hattrib };
$cnt++;
}
last if ($attr_error);
if ($last_cnt != $cnt) {
$i++; print "** End pass $i\n" if ($debug);
} else {
last if ($finished);
## Encapsulate the xml with a "root"
$_ = "<root>$_</root>";
$last_cnt--;
$finished = 1;
}
}
next if ($attr_error);
if (/<|>/) {
print "($i) XML problem: malformed, syntax or tag closure:\n$_";
} else {
print "** Itterations = $i\n".
"** Debug = $debug\n".
"** Rmv white space = $rmv_white_space\n".
"** ForceArray = $ForceArray\n".
"** KeepRoot = $KeepRoot\n".
"** KeepComments = $KeepComments\n".
"** KeepContentOrder = $KeepContentOrder\n";
#print Dumper($ROOT);
print "The remaining string is:\n$_\n\n" if ($debug);

## Strip off the outer element (our root) to
## examine the contents for errors.
## ---------------------------------------
my $outer_element = $cnt-1;
if (exists $ROOT->{$outer_element})
{
my $hroot = $ROOT->{$outer_element};
my ($key,$val) = each (%{$hroot});
my $htodump = $val;

# check for errors in root
if (ref($htodump) ne "HASH" || (!$KeepContentOrder && exists $htodump->{'content'})) {
my $msg = 'Error';
$msg = 'Warning' if ($KeepContentOrder);
print "$msg, bare content at root level ..\n";
} else {
my $dmp_keys = keys (%{$htodump});

if ($dmp_keys > 1) {
print "Warning, multiple elements at root level ..\n";
} else {
($key,$val) = each (%{$htodump});
my $val_type = ref($val);

if ($dmp_keys == 0 || (exists $htodump->{'comment'})) {
print "Warning, no elements at root level ..\n";
}
if ($dmp_keys == 1) {
if ($val_type eq "HASH") {
$htodump = $val if (!$KeepRoot);
}
elsif ($val_type eq "ARRAY") {
$htodump = $val if (!$KeepRoot && $KeepContentOrder);
if (!$ForceArray || scalar(@{$val}) > 1) {
print "Warning, multiple elements at root level ..\n";
}
}
}
}
}
print "\n";
my $tmp = undef;
if (ref($htodump) eq "HASH") {
$tmp = {};
%{$tmp} = %{$htodump};
} elsif (ref($htodump) eq "ARRAY") {
$tmp = [];
@{$tmp} = @{$htodump};
} else {
print "Not a hash or array!\n";
}
print Dumper($tmp) if (defined $tmp);
} else {
print "nothing to output!\n";
}
}
}
##
sub adjustForSingleItemArrays
{
my $href = shift;
## if $val is an array ref and has one element
## set $href->{$key} equal to the element
while (my ($key,$val) = each (%{$href})) {
if (ref($val) eq "ARRAY") {
if (scalar(@{$val}) == 1) {
$href->{$key} = $val->[0];
}
}
}
}
##
sub getAttrHash
{
my $attrstr = shift;
my $ahref = {};
return $ahref unless (defined $attrstr);
while ($attrstr =~ s/[\s]*([0-9a-zA-Z]+)[\s]*=[\s]*("|')([^=]*)\2[\s]*//i) {
$ahref->{$1} = $3;
}
if ($attrstr=~/=/) {
$attrstr =~ s/^\s+//s;
$attrstr =~ s/\s+$//s;
return $attrstr;
}
return $ahref;
}
##
sub getContentHash
{
my ($contstr,$hStore,$hcdata_elements) = @_;
my $ahref = {};
return $ahref unless (defined $contstr && defined $hStore && defined $hcdata_elements);
my @ary = ();
my $append_flag = 0;

while ($contstr =~ s/^([^<$S_dlim[0]$E_dlim[0]]+)|$S_dlim[0]([\d]+)$E_dlim[0]//s)
{
## -- $1 is text contents --
if (defined $1) {
my $tmp1 = $1;
# if flagged, append it to $ary[last]
if ($append_flag && scalar(@ary) > 0) {
my $size = scalar(@ary);
$ary[$size-1] .= $tmp1;
} else {
push (@ary, $1);
}
$append_flag = 0;
}
## -- $2 is substitution index --
elsif (defined $2) {
## Exist check (Comments stripped?),
# turn on append flag.
# -----------------------------------
if (!exists $hStore->{$2}) {
$append_flag = 1;
next;
}
## CDATA check, append it to $ary[last]
# and turn on append flag.
# ---------------------------------------
if (exists $hcdata_elements->{$2}) {
my $size = scalar(@ary);
if ($size > 0) {
$ary[$size-1] .= $hStore->{$2};
} else {push (@ary, $hStore->{$2});}
$append_flag = 1;
next;
}
$append_flag = 0;

## Substitution of in-line content,
# push it to @ary
# ----------------------------------
if ($KeepContentOrder) {
push (@ary, $hStore->{$2});
next;
}
## Substitution of same level here (normal),
# just store it to $ahref
# -----------------------------------------
my ($key,$val) = each (%{$hStore->{$2}});
if (exists $ahref->{$key}) {
push (@{$ahref->{$key}}, $val);
} else {
$ahref->{$key} = [$val];
}
}
else {} # shouldn't get here
}
# Store contents, strip out
# pure whitespace text elements
my $hary = [];
for (@ary) {
next if (/^\s+$/s);
push (@{$hary}, $_);
}
if (scalar(@{$hary}) > 0) {
$ahref->{'content'} = $hary;
}
## if $val is an array ref and has one element and it
## is a hash ref, set {$key} equal to hash ref
if (!$ForceArray) {
while (my ($key,$val) = each (%{$ahref})) {
if (ref($val) eq "ARRAY") {
if (scalar(@{$val}) == 1 && ref($val->[0]) eq "HASH") {
$ahref->{$key} = $val->[0];
}
}
}
}
return $ahref;
}

sub ProcessAltDebugInfo
{
}

__END__





# -----------------------
# XML Regex Parser
# Version .903 - 12/31/05
# Copyright 2005,
# by robic0-At-yahoo.com
# -----------------------

******************************
XML string:
---------------
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>ActivePerl 5.6 Change Log</title>
<link rel="stylesheet" href="Active.css" type="text/css" />
<link rev="made" href="mailto:" />
</head>

<body>

<p><a name="__index__"></a></p>
<!-- INDEX BEGIN -->
<!--

<ul>

<li><a href="#activeperl_5_6_change_log">ActivePerl 5.6 Change Log</a></li>
<ul>

<li><a href="#build_638_thursday__apr_15__2004">Build 638 Thursday, Apr 15, 2004</a></li>
<li><a href="#build_635_thursday__feb_6__2003">Build 635 Thursday, Feb 6, 2003</a></li>
<li><a href="#build_633_monday__june_17__2002">Build 633 Monday, June 17, 2002</a></li>
<li><a href="#build_632_monday__june_3__2002">Build 632 Monday, June 3, 2002</a></li>
<li><a href="#build_631_monday__december_31__2001">Build 631 Monday, December 31, 2001</a></li>
<li><a href="#build_630_wednesday__october_30__2001">Build 630 Wednesday, October 30, 2001</a></li>
<li><a href="#build_629_thursday__august_23__2001">Build 629 Thursday, August 23, 2001</a></li>
<li><a href="#build_628_thursday__july_5__2001">Build 628 Thursday, July 5, 2001</a></li>
<li><a href="#build_626_thursday__may_1__2001">Build 626 Thursday, May 1, 2001</a></li>
<li><a href="#build_623_sunday__december_12__2000">Build 623 Sunday, December 12, 2000</a></li>
<li><a href="#build_622_sunday__november_5__2000">Build 622 Sunday, November 5, 2000</a></li>
<li><a href="#build_620_sunday__october_29__2000">Build 620 Sunday, October 29, 2000</a></li>
<li><a href="#build_618_tuesday__september_12__2000">Build 618 Tuesday, September 12, 2000</a></li>
<li><a href="#build_617_thursday__august_31__2000">Build 617 Thursday, August 31, 2000</a></li>
<li><a href="#build_616_friday__july_14__2000">Build 616 Friday, July 14, 2000</a></li>
<li><a href="#build_615_thursday__june_29__2000">Build 615 Thursday, June 29, 2000</a></li>
<li><a href="#build_613_thursday__march_23__2000">Build 613 Thursday, March 23, 2000</a></li>
<li><a href="#build_612_wednesday__march_22__2000">Build 612 Wednesday, March 22, 2000</a></li>
<li><a href="#build_611_wednesday__march_15__2000">Build 611 Wednesday, March 15, 2000</a></li>
<li><a href="#build_609_wednesday__march_1__2000">Build 609 Wednesday, March 1, 2000</a></li>
<li><a href="#build_607_friday__february_11__2000">Build 607 Friday, February 11, 2000</a></li>
<li><a href="#build_606_friday__february_4__2000">Build 606 Friday, February 4, 2000</a></li>
<li><a href="#build_604_friday__november_26__1999">Build 604 Friday, November 26, 1999</a></li>
<li><a href="#build_603_tuesday__november_23__1999">Build 603 Tuesday, November 23, 1999</a></li>
<li><a href="#build_602_thursday__august_5__1999">Build 602 Thursday, August 5, 1999</a></li>
<li><a href="#build_601_tuesday__july_13__1999">Build 601 Tuesday, July 13, 1999</a></li>
<li><a href="#what_s_new_in_the_600_series">What's new in the 600 Series</a></li>
</ul>

</ul>
-->
<!-- INDEX END -->

<p>
</p>
<h1><a name="activeperl_5_6_change_log">ActivePerl 5.6 Change Log</a></h1>
<p>For the latest information on ActivePerl, please see:</p>
<pre>
<a href="http://www.ActiveState.com/ActivePerl/">http://www.ActiveState.com/ActivePerl/</a></pre>
<p>
</p>
<h2><a name="build_638_thursday__apr_15__2004">Build 638 Thursday, Apr 15, 2004</a></h2>
<p><em>PPM2 and PPM3</em></p>
<p>PPM3 has <strong>not</strong> been updated to the latest version PPM 3.1 as shipped
with the ActivePerl 5.8 series. PPM 3.1 assumes that PPM 2.x is no
longer installed and doesn't synchronize package information with it.
Since PPM2 is the default PPM version in ActivePerl 5.6, PPM3 has been
kept at version 3.0.</p>
<p><em>Bug Fixes and Changes</em></p>
<ul>
<li></li>
On Windows, a potential buffer overrun in the <code>stat()</code> function has been
fixed.
<p></p>
<li></li>
On Windows, a handle leak in <code>kill()</code> has been fixed.
<p></p>
<li></li>
On Windows, a memory leak in <code>fork()</code> has been fixed.
<p></p>
<li></li>
On Windows NT and later, subprocesses are now started via ``cmd /x/d/c''
instead of ``cmd /x/c''. This disables execution of AutoRun command
specified in the registry.
<p></p>
<li></li>
On Windows, the four-argument form of <code>select()</code> did not report the
$! (errno) value properly after errors. This has been corrected.
<p></p>
<li></li>
Win32::GetOSVersion() returns additional information about the system
(when available, Windows NT SP6 and later).
<p></p>
<li></li>
Perl for ISAPI would sometimes close a filehandle twice. This leads
to a race condition where another thread could have reused the
filehandle before the second close would be executed. This usually
happens in high load scenarios. Typical symptoms include error
messages that Perl could not load standard modules, even though they
are installed on the server.
<p>Perl for ISAPI no longer closes filehandles implicitly and relies now
on the application to properly clean up file and socket handle
resources.</p>
<p></p>
<li></li>
Perl for ISAPI now avoids closing the special handles STDIN, STDOUT
and STDERR, even if the script asked for that explicitly.
<p></p>
<li></li>
The following bundled modules have been updated to their latest
versions:
<pre>
Archive-Tar
Compress-Zlib
Digest
Digest-MD2
Digest-MD5
Digest-SHA1
File-CounterFile
HTML-Parser
HTML-Tree
libnet
libwin32
libwww-perl
MD5
MIME-Base64
Storable
Test-Harness
URI</pre>
<p>The following modules have been added to ActivePerl:</p>
<pre>
Data-Dump
IO-Zlib
Test-Simple</pre>
<p></p>
<li></li>
Other minor bug fixes and documentation updates.
<p></p></ul>
<p>
</p>
<h2><a name="build_635_thursday__feb_6__2003">Build 635 Thursday, Feb 6, 2003</a></h2>
<p><em>Fixes for Security Issues</em></p>
<ul>
<li></li>
On Linux, the <code>crypt()</code> builtin did not return consistent results.
This has been corrected.
<p></p>

***** cut off here ******


Output:
---------------

** Itterations = 3
** Debug = 0
** Rmv white space = 0
** ForceArray = 0
** KeepRoot = 1
** KeepComments = 1
** KeepContentOrder = 1

$VAR1 = {
'html' => {
'xmlns' => 'http://www.w3.org/1999/xhtml',
'content' => [
{
'head' => [
{
'title' => 'ActivePerl 5.6 Change Log'
},
{
'link' => {
'rel' => 'stylesheet',
'href' => 'Active.css',
'type' => 'text/css'
}
},
{
'link' => {
'href' => 'mailto:',
'rev' => 'made'
}
}
]
},
{
'body' => [
{
'p' => {
'a' => {
'name' => '__index__'
}
}
},
{
'comment' => ' INDEX BEGIN '
},
{
'comment' => '

<ul>

<li><a href="#activeperl_5_6_change_log">ActivePerl 5.6 Change Log</a></li>
<ul>

<li><a href="#build_638_thursday__apr_15__2004">Build 638 Thursday, Apr 15, 2004</a></li>
<li><a href="#build_635_thursday__feb_6__2003">Build 635 Thursday, Feb 6, 2003</a></li>
<li><a href="#build_633_monday__june_17__2002">Build 633 Monday, June 17, 2002</a></li>
<li><a href="#build_632_monday__june_3__2002">Build 632 Monday, June 3, 2002</a></li>
<li><a href="#build_631_monday__december_31__2001">Build 631 Monday, December 31, 2001</a></li>
<li><a href="#build_630_wednesday__october_30__2001">Build 630 Wednesday, October 30, 2001</a></li>
<li><a href="#build_629_thursday__august_23__2001">Build 629 Thursday, August 23, 2001</a></li>
<li><a href="#build_628_thursday__july_5__2001">Build 628 Thursday, July 5, 2001</a></li>
<li><a href="#build_626_thursday__may_1__2001">Build 626 Thursday, May 1, 2001</a></li>
<li><a href="#build_623_sunday__december_12__2000">Build 623 Sunday, December 12, 2000</a></li>
<li><a href="#build_622_sunday__november_5__2000">Build 622 Sunday, November 5, 2000</a></li>
<li><a href="#build_620_sunday__october_29__2000">Build 620 Sunday, October 29, 2000</a></li>
<li><a href="#build_618_tuesday__september_12__2000">Build 618 Tuesday, September 12, 2000</a></li>
<li><a href="#build_617_thursday__august_31__2000">Build 617 Thursday, August 31, 2000</a></li>
<li><a href="#build_616_friday__july_14__2000">Build 616 Friday, July 14, 2000</a></li>
<li><a href="#build_615_thursday__june_29__2000">Build 615 Thursday, June 29, 2000</a></li>
<li><a href="#build_613_thursday__march_23__2000">Build 613 Thursday, March 23, 2000</a></li>
<li><a href="#build_612_wednesday__march_22__2000">Build 612 Wednesday, March 22, 2000</a></li>
<li><a href="#build_611_wednesday__march_15__2000">Build 611 Wednesday, March 15, 2000</a></li>
<li><a href="#build_609_wednesday__march_1__2000">Build 609 Wednesday, March 1, 2000</a></li>
<li><a href="#build_607_friday__february_11__2000">Build 607 Friday, February 11, 2000</a></li>
<li><a href="#build_606_friday__february_4__2000">Build 606 Friday, February 4, 2000</a></li>
<li><a href="#build_604_friday__november_26__1999">Build 604 Friday, November 26, 1999</a></li>
<li><a href="#build_603_tuesday__november_23__1999">Build 603 Tuesday, November 23, 1999</a></li>
<li><a href="#build_602_thursday__august_5__1999">Build 602 Thursday, August 5, 1999</a></li>
<li><a href="#build_601_tuesday__july_13__1999">Build 601 Tuesday, July 13, 1999</a></li>
<li><a href="#what_s_new_in_the_600_series">What\'s new in the 600 Series</a></li>
</ul>

</ul>
'
},
{
'comment' => ' INDEX END '
},
{
'p' => {}
},
{
'h1' => {
'a' => {
'content' => 'ActivePerl 5.6 Change Log',
'name' => 'activeperl_5_6_change_log'
}
}
},
{
'p' => 'For the latest information on ActivePerl, please see:'
},
{
'pre' => {
'a' => {
'href' => 'http://www.ActiveState.com/ActivePerl/',
'content' => 'http://www.ActiveState.com/ActivePerl/'
}
}
},
{
'p' => {}
},
{
'h2' => {
'a' => {
'content' => 'Build 638 Thursday, Apr 15, 2004',
'name' => 'build_638_thursday__apr_15__2004'
}
}
},
{
'p' => {
'em' => 'PPM2 and PPM3'
}
},
{
'p' => [
'PPM3 has ',
{
'strong' => 'not'
},
' been updated to the latest version PPM 3.1 as shipped
with the ActivePerl 5.8 series. PPM 3.1 assumes that PPM 2.x is no
longer installed and doesn\'t synchronize package information with it.
Since PPM2 is the default PPM version in ActivePerl 5.6, PPM3 has been
kept at version 3.0.'
]
},
{
'p' => {
'em' => 'Bug Fixes and Changes'
}
},
{
'ul' => [
{
'li' => ''
},
'
On Windows, a potential buffer overrun in the ',
{
'code' => 'stat()'
},
' function has been
fixed.
',
{
'p' => ''
},
{
'li' => ''
},
'
On Windows, a handle leak in ',
{
'code' => 'kill()'
},
' has been fixed.
',
{
'p' => ''
},
{
'li' => ''
},
'
On Windows, a memory leak in ',
{
'code' => 'fork()'
},
' has been fixed.
',
{
'p' => ''
},
{
'li' => ''
},
'
On Windows NT and later, subprocesses are now started via ``cmd /x/d/c\'\'
instead of ``cmd /x/c\'\'. This disables execution of AutoRun command
specified in the registry.
',
{
'p' => ''
},
{
'li' => ''
},
'
On Windows, the four-argument form of ',
{
'code' => 'select()'
},
' did not report the
$! (errno) value properly after errors. This has been corrected.
',
{
'p' => ''
},
{
'li' => ''
},
'
Win32::GetOSVersion() returns additional information about the system
(when available, Windows NT SP6 and later).
',
{
'p' => ''
},
{
'li' => ''
},
'
Perl for ISAPI would sometimes close a filehandle twice. This leads
to a race condition where another thread could have reused the
filehandle before the second close would be executed. This usually
happens in high load scenarios. Typical symptoms include error
messages that Perl could not load standard modules, even though they
are installed on the server.
',
{
'p' => 'Perl for ISAPI no longer closes filehandles implicitly and relies now
on the application to properly clean up file and socket handle
resources.'
},
{
'p' => ''
},
{
'li' => ''
},
'
Perl for ISAPI now avoids closing the special handles STDIN, STDOUT
and STDERR, even if the script asked for that explicitly.
',
{
'p' => ''
},
{
'li' => ''
},
'
The following bundled modules have been updated to their latest
versions:
',
{
'pre' => '
Archive-Tar
Compress-Zlib
Digest
Digest-MD2
Digest-MD5
Digest-SHA1
File-CounterFile
HTML-Parser
HTML-Tree
libnet
libwin32
libwww-perl
MD5
MIME-Base64
Storable
Test-Harness
URI'
},
{
'p' => 'The following modules have been added to ActivePerl:'
},
{
'pre' => '
Data-Dump
IO-Zlib
Test-Simple'
},
{
'p' => ''
},
{
'li' => ''
},
'
Other minor bug fixes and documentation updates.
',
{
'p' => ''
}
]
},

********** cut off here ************
 
