Confused about Schwartz idiom utilizing map & split

weston

In an article on Stonehenge.com on using libxml2 to strip html from a
document, I came across a part of the listing that I'm having trouble
understanding. Randall apparently creates a hash of approved tags and
their attributes with these lines:

=9= my %PERMITTED =
=10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
=11= split /\n/, <<'END';
=12= a href name target class title
=13= b
=14= big
=15= blockquote class
....
=49= END

(See http://www.stonehenge.com/merlyn/PerlJournal/col02.html )

I keep trying to parse line 10 in my head and am not getting a lot of
mental traction in really understanding how this works. Anybody want to
help?
 

Dr.Ruud

weston wrote:
In an article on Stonehenge.com on using libxml2 to strip html from a
document, I came across a part of the listing that I'm having trouble
understanding. Randall apparently creates a hash of approved tags and
their attributes with these lines:

=9= my %PERMITTED =
=10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
=11= split /\n/, <<'END';
=12= a href name target class title
=13= b
=14= big
=15= blockquote class
....
=49= END

(See http://www.stonehenge.com/merlyn/PerlJournal/col02.html )

I keep trying to parse line 10 in my head and am not getting a lot of
mental traction in really understanding how this works. Anybody want
to help?

Maybe this helps:

#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper;

my %PERMITTED =
map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
split /\n/, <<'END';
a href name target class title
b
big
blockquote class
....
END

print Data::Dumper->Dump( [\%PERMITTED]
                        , [qw(*PERMITTED)]
                        ), "\n";
 

Randal L. Schwartz

weston> In an article on Stonehenge.com on using libxml2 to strip html from a
weston> document, I came across a part of the listing that I'm having trouble
weston> understanding. Randall apparently creates a hash of approved tags and
weston> their attributes with these lines:

weston> =9= my %PERMITTED =
weston> =10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
weston> =11= split /\n/, <<'END';
weston> =12= a href name target class title
weston> =13= b
weston> =14= big
weston> =15= blockquote class
weston> ....
weston> =49= END

weston> (See http://www.stonehenge.com/merlyn/PerlJournal/col02.html )

weston> I keep trying to parse line 10 in my head and am not getting a lot of
weston> mental traction in really understanding how this works. Anybody want to
weston> help?

Heh.

The split on line 11 creates elements like:

"a href name target class title",
"b",
"big",
"blockquote class",

etc. The map at the beginning of line 10 sets $_ to each of those in turn,
and expects a list-valued return from the block.
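
To see that first step in isolation, here is a minimal sketch that runs only the line-11 split on the four data lines visible in the post (the elided lines are simply left out) and prints what map will later see in $_. The variable name @lines is an assumption for illustration and is not in the article.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Only the four data lines shown in the post; the elided ones are omitted.
my @lines = split /\n/, <<'END';
a href name target class title
b
big
blockquote class
END

# Each element is one whole line; map would set $_ to each in turn.
print scalar(@lines), " lines\n";
print "[$_]\n" for @lines;
```

The trailing newline of the here-document does not produce an extra empty element, because split drops trailing empty fields by default.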

The split in the middle of line 10 breaks each of those elements listed above
into a list, and assigns the first to $k, and any remaining ones to @v.
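
A small sketch of just that middle split may help, using two of the lines from the data. The @seen array is an assumption added only so the result can be inspected; the article's code does not have it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @seen;    # collected only so the result can be inspected afterwards
for ("a href name target class title", "b") {
    # A bare split is split ' ', $_: it breaks $_ on runs of whitespace.
    my ($k, @v) = split;
    push @seen, [ $k, @v ];
    print "k=$k v=(@v)\n";
}
```

For the line "b" there is nothing left after the first word, so @v ends up empty.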

The second map on line 10 converts @v to a list of elements of @v alternating
with the value "1", and then turns that into a hashref, so that the elements
of @v become keys, each with the value 1. That hashref is then returned along
with $k, giving the two values (a key and its value) that each line
contributes to %PERMITTED.
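
The inner map can be taken apart in the same way. This is only a sketch with a hand-written @v; the names @flat, $attrs and $same are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @v = qw(href name target);

# Step 1: each element of @v is followed by a 1.
my @flat = map { $_, 1 } @v;
print "@flat\n";    # href 1 name 1 target 1

# Step 2: the outer braces turn that flat list into an anonymous hash.
my $attrs = { @flat };
print join(",", sort keys %$attrs), "\n";    # href,name,target

# Both steps in one expression, as written on line 10.
my $same = { map { $_, 1 } @v };
```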

But didn't I say all this in the article? :)

print "Just another Perl hacker,"; # the original
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
 

Tad McClellan

weston said:
In an article on Stonehenge.com on using libxml2 to strip html from a
document, I came across a part of the listing that I'm having trouble
understanding. Randall apparently creates a hash of approved tags and
their attributes with these lines:

=10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }

I keep trying to parse line 10 in my head and am not getting a lot of
mental traction in really understanding how this works. Anybody want to
help?


Does this help?

------------------------------
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my %PERMITTED =
    map { my($k, @v) = split;    # 1st space-sep'd field is tag, rest are its attrs
          ($k, {map {$_, 1} @v}) # a 2-element list. 1st is tag,
                                 # 2nd is a hash-ref with keys as attr names,
                                 # and values set to one
        }
    split /\n/, <<'END';
a href name target class title
b
big
blockquote class
END

print Dumper \%PERMITTED;
------------------------------


Or maybe it would help to "unroll" the maps into foreachs:

------------------------------
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my %PERMITTED;

foreach (split /\n/, <<'END')
a href name target class title
b
big
blockquote class
END
{
    my($k, @v) = split;
    my %h;
    foreach ( @v ) { # "unroll" {map {$_, 1} @v}
        $h{$_} = 1;
    }
    $PERMITTED{$k} = \%h;
}

print Dumper \%PERMITTED;
------------------------------
 

Anno Siegel

weston said:
weston> In an article on Stonehenge.com on using libxml2 to strip html from a
weston> document, I came across a part of the listing that I'm having trouble
weston> understanding. Randall apparently creates a hash of approved tags and

Who is this Randall you speak of?

weston> their attributes with these lines:

Randal's code constructs a hash of hashes. The first word in each data
line is a primary key. The rest of the words in each line (if any)
become the keys of an inner hash, all with the value 1. Presumably
the inner hash represents a set of whatever, associated with the primary
key.
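
As a sketch of how such a hash of hashes can be used as a set per key, the helper below checks whether an attribute is listed for a tag. The name attr_ok is hypothetical and does not come from the article, and only the four visible data lines are used.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %PERMITTED =
    map { my ($k, @v) = split; ($k, { map { $_, 1 } @v }) }
    split /\n/, <<'END';
a href name target class title
b
big
blockquote class
END

# attr_ok is a hypothetical helper, not part of the original listing.
sub attr_ok {
    my ($tag, $attr) = @_;
    # Check the outer key first so an unknown tag is not autovivified.
    return 0 unless exists $PERMITTED{$tag};
    return $PERMITTED{$tag}{$attr} ? 1 : 0;
}

print attr_ok('a', 'href'), "\n";        # 1: href is listed for a
print attr_ok('b', 'href'), "\n";        # 0: b has no attributes listed
print attr_ok('script', 'src'), "\n";    # 0: script is not a key at all
```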

weston> =9= my %PERMITTED =
weston> =10= map { my($k, @v) = split; ($k, {map {$_, 1} @v}) }
weston> =11= split /\n/, <<'END';
weston> =12= a href name target class title
weston> =13= b
weston> =14= big
weston> =15= blockquote class
weston> ....
weston> =49= END

How does it do that? Rewriting the code with fewer map's and more
variable names may help. (untested)

my @lines = split /\n/, <<'END';
a href name target class title
b
big
blockquote class
END

my %PERMITTED;

for my $line ( @lines ) {
    # split the named line, not $_; ($k, @v) in the original code
    my ($primary_key, @words) = split ' ', $line;
    # build wordlist
    my @wordlist; # alternating one word and one 1 (for hash initialization)
    for my $word ( @words ) {
        push @wordlist, ( $word => 1);
    }
    # build a hash out of @wordlist and assign it to its place
    $PERMITTED{ $primary_key} = { @wordlist};
}

weston> I keep trying to parse line 10 in my head and am not getting a lot of
weston> mental traction in really understanding how this works. Anybody want to
weston> help?

Line 10 does basically what the (outer) for-loop does in my code. The
inner for-loop does the job of the nested map.

Randal's code is that of a fluent speaker of Perl. Its parts (the two map's)
are two well-known idioms for hash-building. Applied together, they may
look like a mess, but once you recognize the pattern of each their
interaction becomes clear too.
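
The two idioms can be tried separately before combining them. The following is a sketch only; %is_attr and %rest_of are made-up names, and the second idiom stores a plain array reference where the article's code stores a hash reference.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Idiom 1: turn a list into a set-like hash (every element maps to 1).
my %is_attr = map { $_ => 1 } qw(href name target);
print "name is in the set\n" if $is_attr{name};

# Idiom 2: return a (key, value) pair per input item to build a hash.
my %rest_of = map { my ($k, @v) = split; ($k, [@v]) } "a href name", "b";
print "a: @{ $rest_of{a} }\n";    # a: href name
```

Replacing [@v] in the second idiom with the first idiom applied to @v gives the shape of line 10.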

Anno
 
