question about sorting syntax

Adam Sandler

Hello,

On a few pages on the net I've seen this snip of code:

my @sorted=map { substr($_,4) }
sort
map { pack("LA*",tr/eE/eE/,$_) } @words;

In the few pages I've seen, which all have the same snip of code(word
for word in fact), none of the pages explain the parameters passed to
pack(). My question is, and perldoc -pack and Googling the "LA" don't
help, what does "LA*" mean?

Thanks!
 
J. Gleixner

Adam Sandler wrote:

Actually it's a question about pack, not sorting.

[...]
for word in fact), none of the pages explain the parameters passed to
pack(). My question is, and perldoc -pack and Googling the "LA" don't
help, what does "LA*" mean?

Didn't perldoc -pack give you an error?

perldoc -f pack
 
Paul Lalli

Subject: question about sorting syntax

You do not have a question about sort. You have a question about
pack(). Paring your problem down to get at the heart of your question
will greatly help get you good responses in technical newsgroups like
this one.

On a few pages on the net I've seen this snip of code:

my @sorted=map { substr($_,4) }
sort
map { pack("LA*",tr/eE/eE/,$_) } @words;

In the few pages I've seen, which all have the same snip of code(word
for word in fact), none of the pages explain the parameters passed to
pack().

None? Not even the documentation for pack()?

My question is, and perldoc -pack and Googling the "LA" don't
help, what does "LA*" mean?

Presumably, you meant `perldoc -f pack`, right? I beg to differ with
the assessment that the parameters are not explained in that document:

perldoc -f pack

    pack TEMPLATE,LIST
        Takes a LIST of values and converts it into a string
        using the rules given by the TEMPLATE.
        ...
        The TEMPLATE is a sequence of characters that give
        the order and type of values, as follows:
        ...
        A   A text (ASCII) string, will be space padded.
        ...
        L   An unsigned long value.
            (This 'long' is _exactly_ 32 bits)
        ...
        *   Each letter may optionally be followed by a
            number giving a repeat count. ...
            the pack function
            will gobble up that many values from the
            LIST. A "*" for the repeat count means to
            use however many items are left

Okay. So now we know what the format is doing. It's going to look at
the list of values passed to pack(), and create from it a string where
the first four bytes (32 bits) will be the first value converted to an
unsigned long, and the remainder of the bytes will be the next value as
an ASCII string (for A, the "*" means use the string's full length).

So now, what are the remaining values passed to pack? Well, first we
have the expression 'tr/eE/eE/'. You can read the docs yourself in
perldoc perlop, but basically this counts the number of "e" and "E"
characters in $_. So if $_ contained "End of the line", then this
expression would return 3. The final argument is $_ itself.
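
The counting behaviour can be checked on its own. This is a minimal sketch, not part of the original snippet; the variable names $text, $count and $short are made up for illustration.

```perl
use strict;
use warnings;

my $text = "End of the line";          # sample string from the post

# tr/// returns the number of characters it matched. Replacing each
# character with itself leaves the string unchanged, so the expression
# acts purely as a counter.
my $count = ($text =~ tr/eE/eE/);

# An empty replacement list replicates the search list, so this
# counts in exactly the same way.
my $short = ($text =~ tr/eE//);

print "$count $short $text\n";         # prints "3 3 End of the line"
```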

This tells us that in this example, the first value for pack will be
3, and the remaining value will be "End of the line". The L converts
the 3 to a four-byte representation of the number 3, and the A* simply
treats "End of the line" as a string, which it already is.

So let's see it in action:
$ perl -MData::Dumper -le'
$Data::Dumper::Useqq = 1;
$str = pack("LA*", 3, "End of the line");
print Dumper($str);
'
$VAR1 = "\0\0\0\3End of the line";

The L converted the number 3 to the four-byte string "\0\0\0\3", and
the A* kept "End of the line" as "End of the line".
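
The layout of the packed string can also be confirmed without looking at the raw bytes. This is a sketch under the assumption that the same template is used for both pack and unpack; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $packed = pack("LA*", 3, "End of the line");

# Four bytes for the count plus the 15-byte text.
print length($packed), "\n";              # prints 19

# unpack with the same template recovers both values, whatever
# byte order "L" uses on this machine.
my ($count, $text) = unpack("LA*", $packed);
print "$count|$text\n";                   # prints "3|End of the line"

# Skipping the first four bytes leaves just the original string.
print substr($packed, 4), "\n";           # prints "End of the line"
```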

This string is then sorted amongst all the other likewise conversions
of the strings in @words, and then the first four bytes - the
"\0\0\0\3" part is returned into @sorted.

I would be very curious to see where this code originated. I'm almost
of the belief that it would have to be an obfuscated Perl contest...

Hope this helps,
Paul Lalli
 
John W. Krahn

Paul said:
This tells us that in this example, the first value for pack will be
3, and the remaining value will be "End of the line". The L converts
the 3 to a four-byte representation of the number 3, and the A* simply
treats "End of the line" as a string, which it already is.

So let's see it in action:
$ perl -MData::Dumper -le'
$Data::Dumper::Useqq = 1;
$str = pack("LA*", 3, "End of the line");
print Dumper($str);
'
$VAR1 = "\0\0\0\3End of the line";

The L converted the number 3 to the four-byte string "\0\0\0\3", and
the A* kept "End of the line" as "End of the line"

You must have a big-endian CPU, because on my little-endian CPU (Intel) the
"L" format converts the number 3 to "\3\0\0\0", which means that any counts
above 255 will not be sorted correctly:

$ perl -le'print unpack "L", $_ for sort map pack( "L", $_ ), qw/ 65537 65536
65535 65534 256 256 255 5 4 3 2 1 /'
65536
256
256
1
65537
2
3
4
5
65534
255
65535


However, if you use the "N" format it should work on any platform:

$ perl -le'print unpack "N", $_ for sort map pack( "N", $_ ), qw/ 65537 65536
65535 65534 256 256 255 5 4 3 2 1 /'
1
2
3
4
5
255
256
256
65534
65535
65536
65537
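
The difference can be reproduced on any machine by naming the byte order explicitly. This is a sketch, assuming the "V" format (always little-endian 32-bit) as a stand-in for what "L" does on an Intel CPU; the sample counts are a subset of the ones above.

```perl
use strict;
use warnings;

my @counts = (65537, 256, 255, 3, 1);

# "V" always packs little-endian and "N" always packs big-endian,
# regardless of the host CPU, so both results are reproducible.
my @little = map { unpack("V", $_) } sort map { pack("V", $_) } @counts;
my @big    = map { unpack("N", $_) } sort map { pack("N", $_) } @counts;

print "V: @little\n";    # prints "V: 256 1 65537 3 255"
print "N: @big\n";       # prints "N: 1 3 255 256 65537"
```

Only the big-endian keys compare in numeric order under a plain string sort, because their most significant byte comes first.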



John
 
Minimiscience

Paul said:
This string is then sorted amongst all the other likewise conversions
of the strings in @words, and then the first four bytes - the
"\0\0\0\3" part is returned into @sorted.

Actually, everything after the first four bytes ("End of the line") is returned
by the substr().

-- MiSc
 
Paul Lalli

Actually, everything after the first four bytes ("End of the
line") is returned by the substr().

Ah, yes, you're correct. That makes much more sense, but it's still a
very odd method of (presumably) sorting strings by the number of E's
they have...
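
With that reading, the whole snippet can be tried end to end. This is a sketch only: the word list is invented for illustration, and "N" is used in place of "L" so that the packed keys sort the same way on any CPU.

```perl
use strict;
use warnings;

# Hypothetical sample data; not from the original thread.
my @words = ("tree", "cat", "Eerie eel", "bed");

my @sorted =
    map  { substr($_, 4) }                       # 3. drop the 4-byte count prefix
    sort                                         # 2. plain string sort on the packed keys
    map  { pack("NA*", tr/eE/eE/, $_) } @words;  # 1. prefix each word with its e/E count

print join(", ", @sorted), "\n";   # prints "cat, bed, tree, Eerie eel"
```

Words with the same count end up ordered by the text that follows the prefix, since the string itself is part of the sort key.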

Thanks for the correction,
Paul Lalli
 
Uri Guttman

AS> I found the same exact code snippet on a handful of sites... including
AS> sites by authors who also post here. Here is the URL of the page
AS> which I bookmarked:

AS> http://www.perlmonks.org/?node=145659

AS> Thanks to everyone for their responses

and check out Sort::Maker on cpan which can do this sort for you very
easily and it can be among the fastest ways to sort in perl. you might
also learn a lot from the docs on the various sorting styles including
the ST (Schwartzian Transform) and GRT (Guttman-Rosler Transform).
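
For comparison, the same ordering can be written as a Schwartzian Transform. This is a hand-written sketch in plain Perl, not Sort::Maker's API or its generated code, and the word list is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; not from the original thread.
my @words = ("tree", "cat", "Eerie eel", "bed");

# Schwartzian Transform: pair each word with its key, sort on the key
# with an explicit comparison block, then extract the word again.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }
    map  { [ tr/eE//, $_ ] } @words;

print join(", ", @sorted), "\n";   # prints "cat, bed, tree, Eerie eel"
```

The packed-string version discussed in this thread is the GRT style: it folds the key into the string so that sort can run without a comparison block.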

uri
 
