Map or Regex and Sorting

Herr Hardy · May 24, 2004

Hi,

I have some working code, but I think, there's more than one better
way to solve the task:

I have some given Data that should be sorted first on appearance of
string OPEN (on top) and second after the values of the first
occurence of digits, here e.g. 31 to 40.
How can I 'regex' or 'map' this to sort them in one loop already?

Thanks
Hardy

#!/usr/bin/perl

use warnings;
use strict;

my @digitlist;
my @openlist;
my @clsdlist;

while(<DATA>){
/(\d{1,3})/;
push (@digitlist,$1.$_);
}

@digitlist = sort @digitlist;

for(@digitlist){
if (/OPEN/){
push(@openlist,$_);
}else{
push(@clsdlist,$_);
}
}

for(@openlist){
print;
}
for(@clsdlist){
print;
}
1;
# EOF

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN

Herr Hardy · May 24, 2004

I have some given Data that should be sorted first on appearance of
string OPEN (on top) and second after the values of the first
occurence of digits, here e.g. 31 to 40.
How can I 'regex' or 'map' this to sort them in one loop already?

Shorter sorting of the @digitlist, but howto combine it with the
String search, hm?

#!/usr/bin/perl

use warnings;
use strict;

my($x,$y);
my @digitlist = sort {($x)=($a=~/(\d{1,3})/);($y)=($b=~/(\d{1,3})/);$x
<=> $y} (<DATA>);

for(@digitlist){
print;
}

1;
# EOF

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN

Anno Siegel · May 24, 2004

Herr Hardy said:
Shorter sorting of the @digitlist, but howto combine it with the
String search, hm?

#!/usr/bin/perl

use warnings;
use strict;

my($x,$y);
my @digitlist = sort {($x)=($a=~/(\d{1,3})/);($y)=($b=~/(\d{1,3})/);$x
<=> $y} (<DATA>);

for(@digitlist){
print;
}

1;
# EOF

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN

That can be done using a Schwartz transform. Assign numeric sort keys
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

print for # print result
map $_->[ 0], # throw keys away
sort { $a->[ 1] <=> $b->[ 1] } # sort by key
map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
<DATA>;

Anno

Herr Hardy · May 24, 2004

Well, I'd build a HoH (Hash of Hashes) to store
the OPEN-offset and the first digit group, then
use a normal sort (you can combine sort patterns
with ||).

Aha, pretty good idea (not mine, sniff) and looks very clear.

In my example I've assumed that neither the offset
reaches 1000 nor does the digit value. Of course you
should adapt this value to your needs.

OK, YYYYMMDDhhmmss

$val{$_}->{'DIGT'} = ( /(\d{14})/ and $1);

HTH & happy rail route calculating
-Christian

Thanks a lot... but it's project data from regional water supply, at
least 200 sets of DATA but never more than 321 at once, so I had no
reason to care for performance and worked well with my loopy-loop
skript...

Yours tastes better!

Ciao Hardy

Herr Hardy · May 24, 2004

That can be done using a Schwartz transform. Assign numeric sort keys
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

ooh great, Anno, I'd already seen the Schwartzian transform while
googling, but I'm pretty unfamiliar with that mighty map function and
wouldn't get that assignment

map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys

to work.

So, thank you

Matija Papec · May 24, 2004

X-Ftn-To: Herr Hardy

Herr Hardy said:
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

Click to expand...

ooh great, Anno, I'd already seen the Schwartzian transform while
googling, but I'm pretty unfamiliar with that mighty map function and
wouldn't get that assignment

map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys

Click to expand...

to work.

If you're familiar with references, learning map is very usefull

use Data:

umper;
print Dumper
# map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
map {
my ($value_for_sorting) = /(\d+)/;
$value_for_sorting = -1 if /OPEN/;
[ $_, $value_for_sorting ];
}
<DATA>;

Herr Hardy · May 24, 2004

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN

Click to expand...

That can be done using a Schwartz transform. Assign numeric sort keys
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

print for # print result
map $_->[ 0], # throw keys away
sort { $a->[ 1] <=> $b->[ 1] } # sort by key
map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
<DATA>;

.... buuuuut the result ist

RBruhrII33PER11OPEN
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN
SBsorpe31PER19IN18last
SBruhrtal32PER09CLSD
RBruhrI34PER14CLSD

3 OPEN on top, rest follows, but the OPEN-values should then be
ordered after their (\d{1,3})'s and here's 33 40 34,
should be 33 34 40

So, is there a way to alter the keys by 2nd mapping?

map [ $_ , /OPEN/?$_->[0]+=1000

0) [ 0] ], # alter keys on 1st level

but that's not referencing, I see.

print for # print result
map $_->[ 0], # throw keys away
sort { $a->[ 1] <=> $b->[ 1] } # sort by altered-key
#do something, maybe add 1000 if(/OPEN/) # alter keys on 1st level
map [ $_ , (/(\d{1,3})/) [ 0] ], # assign 2nd-level ordinate
<DATA>;

John W. Krahn · May 25, 2004

Herr said:
I have some working code, but I think, there's more than one better
way to solve the task:

I have some given Data that should be sorted first on appearance of
string OPEN (on top) and second after the values of the first
occurence of digits, here e.g. 31 to 40.
How can I 'regex' or 'map' this to sort them in one loop already?

Thanks
Hardy

#!/usr/bin/perl

use warnings;
use strict;

my @digitlist;
my @openlist;
my @clsdlist;

while(<DATA>){
/(\d{1,3})/;
push (@digitlist,$1.$_);
}

@digitlist = sort @digitlist;

for(@digitlist){
if (/OPEN/){
push(@openlist,$_);
}else{
push(@clsdlist,$_);
}
}

for(@openlist){
print;
}
for(@clsdlist){
print;
}
1;
# EOF

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN

#!/usr/bin/perl
use warnings;
use strict;

my ( @openlist, @clsdlist );

push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
map substr( $_, 3 ),
sort
map sprintf( '%03d%s', /(\d{1,3})/, $_ ),
<DATA>;

print @openlist, @clsdlist;

John

Uri Guttman · May 25, 2004

JWK> #!/usr/bin/perl
JWK> use warnings;
JWK> use strict;

JWK> my ( @openlist, @clsdlist );

JWK> push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
JWK> map substr( $_, 3 ),
JWK> sort
JWK> map sprintf( '%03d%s', /(\d{1,3})/, $_ ),
JWK> <DATA>;

JWK> print @openlist, @clsdlist;

i like it, an GRT written on the fly. but you now owe me royalties! for
payment, i expect you to test Sort::Maker before its public release.

uri

Herr Hardy · May 25, 2004

Let me comment to understand, from bottom to top

push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for

# ^ 4. OPEN 2 openlist, !OPEN 2 clsdlist

map substr( $_, 3 ),

# ^ 3. delete leading 'mark'-digits

sort

# ^ 2. simple Perl sort on the 'marks'

map sprintf( '%03d%s', /(\d{1,3})/, $_ ),

# ^ 1. 'marking' each line with \d{1,3} as leading digits
# or 0nn or 00n if less than 3 digits in \d{1,3}

<DATA>;

print @openlist, @clsdlist;

That's pretty, John.

Herr Hardy · May 25, 2004

Taking your funny linewithout mapping

push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
sort {
my($x) = ($a=~/(\d{1,3})/);
my($y) = ($b=~/(\d{1,3})/);
$x <=> $y;
}
<DATA>;

Uri Guttman · May 25, 2004

HH> On Tue, 25 May 2004 08:31:02 +0200, Herr Hardy <[email protected]>
HH> wrote:

HH> Taking your funny lineHH> without mapping

HH> push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
HH> sort {
HH> my($x) = ($a=~/(\d{1,3})/);
HH> my($y) = ($b=~/(\d{1,3})/);
HH> $x <=> $y;
HH> }
HH> <DATA>;

read http://sysarch.com/perl/sort_paper.html to find out why the
map/sort/map is usually much faster.

uri

Anno Siegel · May 25, 2004

Herr Hardy said:
__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN

Click to expand...

That can be done using a Schwartz transform. Assign numeric sort keys
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

print for # print result
map $_->[ 0], # throw keys away
sort { $a->[ 1] <=> $b->[ 1] } # sort by key
map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
<DATA>;

Click to expand...

... buuuuut the result ist

RBruhrII33PER11OPEN
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN
SBsorpe31PER19IN18last
SBruhrtal32PER09CLSD
RBruhrI34PER14CLSD

3 OPEN on top, rest follows, but the OPEN-values should then be
ordered after their (\d{1,3})'s and here's 33 40 34,
should be 33 34 40

You weren't too clear about the sort criteria, I'm not the only one who
misunderstood.

So, is there a way to alter the keys by 2nd mapping?

map [ $_ , /OPEN/?$_->[0]+=10000) [ 0] ], # alter keys on 1st level

but that's not referencing, I see.

Hmm...

You want to subtract, not add 1000 to bring the lines up.

Otherwise, your code makes little sense. Since you give no context,
I must guess where you want this to appear in the map/sort pipeline.
If "2nd mapping" means what I think it means, at this point $_ is
the two-element array(ref) created in the first step. You don't
want to match against that, nor put it in another two-element array.
You could modify the keys in a second step, I guess, but it ain't pretty.
Do it right in the first step:

map [ $_, ( /(\d+)/)[ 0] - 1000 * /OPEN/ ], <DATA>;

....though that's a little exotic (and untested).

The whole approach is rather fragile, depending as it does on the
particular range 0 .. 999 for the numbers. (Using -1 for /OPEN/ is
in the same category.) Both try to squeeze two sort criteria
in a single key. A two-key sort is more robust:

print for
map $_->[ 0],
sort { $b->[ 1] <=> $a->[ 1] or $a->[ 2] <=> $b->[ 2] }
map [ $_ , scalar /OPEN/, /(\d+)/ ],
<DATA>;

Note the reverse sort on the /OPEN/ key. Also, if other numeric
fields enter the equation some day, just add /g to the regex:
/(\d+)/g, and they will be available for the sort.

Anno

Herr Hardy · May 25, 2004

(e-mail address removed)-berlin.de (Anno Siegel) wrote on 25 May 2004
16:31:17 GMT:

You weren't too clear about the sort criteria, I'm not the only one who
misunderstood.

Sorry, Anno, I wouldn't like to waste your time.

map [ $_ , /OPEN/?$_->[0]+=10000) [ 0] ], # alter keys on 1st level

Click to expand...

You want to subtract, not add 1000 to bring the lines up. Yes.

Otherwise, your code makes little sense. Since you give no context,
I must guess where you want this to appear in the map/sort pipeline.
If "2nd mapping" means what I think it means, at this point $_ is
the two-element array(ref) created in the first step. You don't
want to match against that, nor put it in another two-element array.
You could modify the keys in a second step, I guess, but it ain't pretty.
Do it right in the first step:

Well, I would ask, any idea?

map [ $_, ( /(\d+)/)[ 0] - 1000 * /OPEN/ ], <DATA>;

That's it, exactly. I meant: somehow alter the '\d+'-value depending
on the OPEN-String, so the key is one for both,

...though that's a little exotic (and untested).

it works.

print $_->[1]."\n" for # print key
sort { $a->[ 1] <=> $b->[ 1] } # sort
map [ $_, ( /(\d+)/)[ 0] - 1000 * /OPEN/ ], # assign key
<DATA>;

__DATA__
SBruhrtal31PER09CLSD
SBsorpe33PER19IN18last
RBruhrII32PER11OPEN
RBruhrII34PER11OPEN
RBruhrI35PER14CLSD
RBharkortI36PER09OPEN

PRINTS
-968
-966
-964
31
33
35

The whole approach is rather fragile, depending as it does on the
particular range 0 .. 999 for the numbers. (Using -1 for /OPEN/ is
in the same category.) Both try to squeeze two sort criteria
in a single key. A two-key sort is more robust:

print for
map $_->[ 0],
sort { $b->[ 1] <=> $a->[ 1] or $a->[ 2] <=> $b->[ 2] }
map [ $_ , scalar /OPEN/, /(\d+)/ ],
<DATA>;

Note the reverse sort on the /OPEN/ key. Also, if other numeric
fields enter the equation some day, just add /g to the regex:
/(\d+)/g, and they will be available for the sort.

print $_->[1]." - ".$_->[2]."\n" for
sort { $b->[ 1] <=> $a->[ 1] or $a->[ 2] <=> $b->[ 2] }
map [ $_ , scalar /OPEN/, /(\d+)/ ],
<DATA>;

PRINTS
1 - 32
1 - 34
1 - 36
- 31
- 33
- 35
.... and I understand what's happening.

Thank you,
Hardy

John W. Krahn · May 26, 2004

Uri said:
JWK> my ( @openlist, @clsdlist );

JWK> push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
JWK> map substr( $_, 3 ),
JWK> sort
JWK> map sprintf( '%03d%s', /(\d{1,3})/, $_ ),
JWK> <DATA>;

JWK> print @openlist, @clsdlist;

i like it, an GRT written on the fly. but you now owe me royalties!

Then you will have to hire me so that I can afford to pay you.

for payment, i expect you to test Sort::Maker before its public release.

I couldn't find it on CPAN or your web site.

John

Uri Guttman · May 26, 2004

JWK> I couldn't find it on CPAN or your web site.

i posted the location in another recent thread on sorting. i would
expect someone who knows how to write a GRT to follow all sort threads!

anyhow, it is at stemsystems.com/sort/Sort-Maker-0.01.tar.gz

i am coding up some late changes and improvements now and i will
announce the next beta as soon as that is done. my goal is to release it
before yapc::na since i am giving a talk about it there.

uri

John W. Krahn · May 27, 2004

Uri said:
JWK> I couldn't find it on CPAN or your web site.

i posted the location in another recent thread on sorting. i would
expect someone who knows how to write a GRT to follow all sort threads!

I follow, I just don't remember.

anyhow, it is at stemsystems.com/sort/Sort-Maker-0.01.tar.gz

Got it.

i am coding up some late changes and improvements now and i will
announce the next beta as soon as that is done. my goal is to release it
before yapc::na since i am giving a talk about it there.

If you want it to run on version 5.6 or earlier you will have to remove
the =head3 tags from the POD which were introduced in version 5.8.

John

Uri Guttman · May 27, 2004

JWK> Got it.

JWK> If you want it to run on version 5.6 or earlier you will have to remove
JWK> the =head3 tags from the POD which were introduced in version 5.8.

i liked using =head3 as the pod needed it with all the nested attribute
stuff. if you can figure out a good way to eliminate head3 i will do
it. i find pod easy to use but frustrating sometimes when you want
better nesting.

sorting map value based	6	Nov 2, 2010
Regex, spaces in pattern stored in variable.	3	Jan 6, 2010
Sorting an STL map	1	Oct 19, 2013
Sorting hash of hashes	3	Nov 21, 2011
READFILE sorting coding problem	3	Oct 25, 2013
Paralle sorting algorithms...	0	Sep 7, 2012
Regex insertion	1	Jul 18, 2007
Merge files	1	Aug 7, 2013

Map or Regex and Sorting

Herr Hardy

Herr Hardy

Anno Siegel

Herr Hardy

Herr Hardy

Matija Papec

Herr Hardy

John W. Krahn

Uri Guttman

Herr Hardy

Herr Hardy

Uri Guttman

Anno Siegel

Herr Hardy

John W. Krahn

Uri Guttman

John W. Krahn

Uri Guttman

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads