Map or Regex and Sorting

H

Herr Hardy

Hi,

I have some working code, but I think, there's more than one better
way to solve the task:

I have some given Data that should be sorted first on appearance of
string OPEN (on top) and second after the values of the first
occurence of digits, here e.g. 31 to 40.
How can I 'regex' or 'map' this to sort them in one loop already?

Thanks
Hardy

#!/usr/bin/perl

use warnings;
use strict;

my @digitlist;
my @openlist;
my @clsdlist;

while(<DATA>){
/(\d{1,3})/;
push (@digitlist,$1.$_);
}

@digitlist = sort @digitlist;

for(@digitlist){
if (/OPEN/){
push(@openlist,$_);
}else{
push(@clsdlist,$_);
}
}

for(@openlist){
print;
}
for(@clsdlist){
print;
}
1;
# EOF

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN
 
H

Herr Hardy

I have some given Data that should be sorted first on appearance of
string OPEN (on top) and second after the values of the first
occurence of digits, here e.g. 31 to 40.
How can I 'regex' or 'map' this to sort them in one loop already?

Shorter sorting of the @digitlist, but howto combine it with the
String search, hm?

#!/usr/bin/perl

use warnings;
use strict;

my($x,$y);
my @digitlist = sort {($x)=($a=~/(\d{1,3})/);($y)=($b=~/(\d{1,3})/);$x
<=> $y} (<DATA>);

for(@digitlist){
print;
}


1;
# EOF

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN
 
A

Anno Siegel

Herr Hardy said:
Shorter sorting of the @digitlist, but howto combine it with the
String search, hm?

#!/usr/bin/perl

use warnings;
use strict;

my($x,$y);
my @digitlist = sort {($x)=($a=~/(\d{1,3})/);($y)=($b=~/(\d{1,3})/);$x
<=> $y} (<DATA>);

for(@digitlist){
print;
}


1;
# EOF

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN

That can be done using a Schwartz transform. Assign numeric sort keys
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

print for # print result
map $_->[ 0], # throw keys away
sort { $a->[ 1] <=> $b->[ 1] } # sort by key
map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
<DATA>;

Anno
 
H

Herr Hardy

Well, I'd build a HoH (Hash of Hashes) to store
the OPEN-offset and the first digit group, then
use a normal sort (you can combine sort patterns
with ||).
Aha, pretty good idea (not mine, sniff) and looks very clear.
In my example I've assumed that neither the offset
reaches 1000 nor does the digit value. Of course you
should adapt this value to your needs.
OK, YYYYMMDDhhmmss

$val{$_}->{'DIGT'} = ( /(\d{14})/ and $1);
HTH & happy rail route calculating
-Christian

Thanks a lot... but it's project data from regional water supply, at
least 200 sets of DATA but never more than 321 at once, so I had no
reason to care for performance and worked well with my loopy-loop
skript...

Yours tastes better!

Ciao Hardy
 
H

Herr Hardy

That can be done using a Schwartz transform. Assign numeric sort keys
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

ooh great, Anno, I'd already seen the Schwartzian transform while
googling, but I'm pretty unfamiliar with that mighty map function and
wouldn't get that assignment
map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
to work.

So, thank you
 
M

Matija Papec

X-Ftn-To: Herr Hardy

Herr Hardy said:
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

ooh great, Anno, I'd already seen the Schwartzian transform while
googling, but I'm pretty unfamiliar with that mighty map function and
wouldn't get that assignment
map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
to work.

If you're familiar with references, learning map is very usefull

use Data::Dumper;
print Dumper
# map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
map {
my ($value_for_sorting) = /(\d+)/;
$value_for_sorting = -1 if /OPEN/;
[ $_, $value_for_sorting ];
}
<DATA>;
 
H

Herr Hardy

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN

That can be done using a Schwartz transform. Assign numeric sort keys
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

print for # print result
map $_->[ 0], # throw keys away
sort { $a->[ 1] <=> $b->[ 1] } # sort by key
map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
<DATA>;

.... buuuuut the result ist

RBruhrII33PER11OPEN
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN
SBsorpe31PER19IN18last
SBruhrtal32PER09CLSD
RBruhrI34PER14CLSD

3 OPEN on top, rest follows, but the OPEN-values should then be
ordered after their (\d{1,3})'s and here's 33 40 34,
should be 33 34 40

So, is there a way to alter the keys by 2nd mapping?

map [ $_ , /OPEN/?$_->[0]+=1000:(0) [ 0] ], # alter keys on 1st level

but that's not referencing, I see.


print for # print result
map $_->[ 0], # throw keys away
sort { $a->[ 1] <=> $b->[ 1] } # sort by altered-key
#do something, maybe add 1000 if(/OPEN/) # alter keys on 1st level
map [ $_ , (/(\d{1,3})/) [ 0] ], # assign 2nd-level ordinate
<DATA>;
 
J

John W. Krahn

Herr said:
I have some working code, but I think, there's more than one better
way to solve the task:

I have some given Data that should be sorted first on appearance of
string OPEN (on top) and second after the values of the first
occurence of digits, here e.g. 31 to 40.
How can I 'regex' or 'map' this to sort them in one loop already?

Thanks
Hardy

#!/usr/bin/perl

use warnings;
use strict;

my @digitlist;
my @openlist;
my @clsdlist;

while(<DATA>){
/(\d{1,3})/;
push (@digitlist,$1.$_);
}

@digitlist = sort @digitlist;

for(@digitlist){
if (/OPEN/){
push(@openlist,$_);
}else{
push(@clsdlist,$_);
}
}

for(@openlist){
print;
}
for(@clsdlist){
print;
}
1;
# EOF

__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN


#!/usr/bin/perl
use warnings;
use strict;

my ( @openlist, @clsdlist );

push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
map substr( $_, 3 ),
sort
map sprintf( '%03d%s', /(\d{1,3})/, $_ ),
<DATA>;

print @openlist, @clsdlist;




John
 
U

Uri Guttman

JWK> #!/usr/bin/perl
JWK> use warnings;
JWK> use strict;

JWK> my ( @openlist, @clsdlist );

JWK> push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
JWK> map substr( $_, 3 ),
JWK> sort
JWK> map sprintf( '%03d%s', /(\d{1,3})/, $_ ),
JWK> <DATA>;

JWK> print @openlist, @clsdlist;

i like it, an GRT written on the fly. but you now owe me royalties! for
payment, i expect you to test Sort::Maker before its public release.

uri
 
H

Herr Hardy

Let me comment to understand, from bottom to top
push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
# ^ 4. OPEN 2 openlist, !OPEN 2 clsdlist
map substr( $_, 3 ),
# ^ 3. delete leading 'mark'-digits
# ^ 2. simple Perl sort on the 'marks'
map sprintf( '%03d%s', /(\d{1,3})/, $_ ),
# ^ 1. 'marking' each line with \d{1,3} as leading digits
# or 0nn or 00n if less than 3 digits in \d{1,3}
<DATA>;

print @openlist, @clsdlist;
That's pretty, John.
 
H

Herr Hardy

Taking your funny linewithout mapping

push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
sort {
my($x) = ($a=~/(\d{1,3})/);
my($y) = ($b=~/(\d{1,3})/);
$x <=> $y;
}
<DATA>;
 
U

Uri Guttman

HH> On Tue, 25 May 2004 08:31:02 +0200, Herr Hardy <[email protected]>
HH> wrote:

HH> Taking your funny lineHH> without mapping

HH> push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
HH> sort {
HH> my($x) = ($a=~/(\d{1,3})/);
HH> my($y) = ($b=~/(\d{1,3})/);
HH> $x <=> $y;
HH> }
HH> <DATA>;

read http://sysarch.com/perl/sort_paper.html to find out why the
map/sort/map is usually much faster.

uri
 
A

Anno Siegel

Herr Hardy said:
__DATA__
SBruhrtal32PER09CLSD
SBsorpe31PER19IN18last
RBruhrII33PER11OPEN
RBruhrI34PER14CLSD
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN

That can be done using a Schwartz transform. Assign numeric sort keys
to the lines. Make the key -1 if the line contains "OPEN", otherwise
use the first group of digits. Sort according to keys, throw the
keys away.

print for # print result
map $_->[ 0], # throw keys away
sort { $a->[ 1] <=> $b->[ 1] } # sort by key
map [ $_ , /OPEN/ ? -1 : ( /(\d+)/ )[ 0] ], # assign keys
<DATA>;

... buuuuut the result ist

RBruhrII33PER11OPEN
RBharkortI40PER09OPEN
RBruhrI34PER14OPEN
SBsorpe31PER19IN18last
SBruhrtal32PER09CLSD
RBruhrI34PER14CLSD

3 OPEN on top, rest follows, but the OPEN-values should then be
ordered after their (\d{1,3})'s and here's 33 40 34,
should be 33 34 40

You weren't too clear about the sort criteria, I'm not the only one who
misunderstood.
So, is there a way to alter the keys by 2nd mapping?

map [ $_ , /OPEN/?$_->[0]+=1000:(0) [ 0] ], # alter keys on 1st level

but that's not referencing, I see.

Hmm...

You want to subtract, not add 1000 to bring the lines up.

Otherwise, your code makes little sense. Since you give no context,
I must guess where you want this to appear in the map/sort pipeline.
If "2nd mapping" means what I think it means, at this point $_ is
the two-element array(ref) created in the first step. You don't
want to match against that, nor put it in another two-element array.
You could modify the keys in a second step, I guess, but it ain't pretty.
Do it right in the first step:

map [ $_, ( /(\d+)/)[ 0] - 1000 * /OPEN/ ], <DATA>;

....though that's a little exotic (and untested).

The whole approach is rather fragile, depending as it does on the
particular range 0 .. 999 for the numbers. (Using -1 for /OPEN/ is
in the same category.) Both try to squeeze two sort criteria
in a single key. A two-key sort is more robust:

print for
map $_->[ 0],
sort { $b->[ 1] <=> $a->[ 1] or $a->[ 2] <=> $b->[ 2] }
map [ $_ , scalar /OPEN/, /(\d+)/ ],
<DATA>;

Note the reverse sort on the /OPEN/ key. Also, if other numeric
fields enter the equation some day, just add /g to the regex:
/(\d+)/g, and they will be available for the sort.

Anno
 
H

Herr Hardy

(e-mail address removed)-berlin.de (Anno Siegel) wrote on 25 May 2004
16:31:17 GMT:
You weren't too clear about the sort criteria, I'm not the only one who
misunderstood.
Sorry, Anno, I wouldn't like to waste your time.

map [ $_ , /OPEN/?$_->[0]+=1000:(0) [ 0] ], # alter keys on 1st level
You want to subtract, not add 1000 to bring the lines up. Yes.

Otherwise, your code makes little sense. Since you give no context,
I must guess where you want this to appear in the map/sort pipeline.
If "2nd mapping" means what I think it means, at this point $_ is
the two-element array(ref) created in the first step. You don't
want to match against that, nor put it in another two-element array.
You could modify the keys in a second step, I guess, but it ain't pretty.
Do it right in the first step:
Well, I would ask, any idea?
map [ $_, ( /(\d+)/)[ 0] - 1000 * /OPEN/ ], <DATA>;
That's it, exactly. I meant: somehow alter the '\d+'-value depending
on the OPEN-String, so the key is one for both,
...though that's a little exotic (and untested).
it works.

print $_->[1]."\n" for # print key
sort { $a->[ 1] <=> $b->[ 1] } # sort
map [ $_, ( /(\d+)/)[ 0] - 1000 * /OPEN/ ], # assign key
<DATA>;

__DATA__
SBruhrtal31PER09CLSD
SBsorpe33PER19IN18last
RBruhrII32PER11OPEN
RBruhrII34PER11OPEN
RBruhrI35PER14CLSD
RBharkortI36PER09OPEN

PRINTS
-968
-966
-964
31
33
35
The whole approach is rather fragile, depending as it does on the
particular range 0 .. 999 for the numbers. (Using -1 for /OPEN/ is
in the same category.) Both try to squeeze two sort criteria
in a single key. A two-key sort is more robust:

print for
map $_->[ 0],
sort { $b->[ 1] <=> $a->[ 1] or $a->[ 2] <=> $b->[ 2] }
map [ $_ , scalar /OPEN/, /(\d+)/ ],
<DATA>;

Note the reverse sort on the /OPEN/ key. Also, if other numeric
fields enter the equation some day, just add /g to the regex:
/(\d+)/g, and they will be available for the sort.

print $_->[1]." - ".$_->[2]."\n" for
sort { $b->[ 1] <=> $a->[ 1] or $a->[ 2] <=> $b->[ 2] }
map [ $_ , scalar /OPEN/, /(\d+)/ ],
<DATA>;

PRINTS
1 - 32
1 - 34
1 - 36
- 31
- 33
- 35
.... and I understand what's happening.

Thank you,
Hardy
 
J

John W. Krahn

Uri said:
JWK> my ( @openlist, @clsdlist );

JWK> push @{ /OPEN/ ? \@openlist : \@clsdlist }, $_ for
JWK> map substr( $_, 3 ),
JWK> sort
JWK> map sprintf( '%03d%s', /(\d{1,3})/, $_ ),
JWK> <DATA>;

JWK> print @openlist, @clsdlist;

i like it, an GRT written on the fly. but you now owe me royalties!

Then you will have to hire me so that I can afford to pay you. :)
for payment, i expect you to test Sort::Maker before its public release.

I couldn't find it on CPAN or your web site.


John
 
U

Uri Guttman

JWK> I couldn't find it on CPAN or your web site.

i posted the location in another recent thread on sorting. i would
expect someone who knows how to write a GRT to follow all sort threads!

anyhow, it is at stemsystems.com/sort/Sort-Maker-0.01.tar.gz

i am coding up some late changes and improvements now and i will
announce the next beta as soon as that is done. my goal is to release it
before yapc::na since i am giving a talk about it there.

uri
 
J

John W. Krahn

Uri said:
JWK> I couldn't find it on CPAN or your web site.

i posted the location in another recent thread on sorting. i would
expect someone who knows how to write a GRT to follow all sort threads!

I follow, I just don't remember. :)
anyhow, it is at stemsystems.com/sort/Sort-Maker-0.01.tar.gz

Got it.
i am coding up some late changes and improvements now and i will
announce the next beta as soon as that is done. my goal is to release it
before yapc::na since i am giving a talk about it there.

If you want it to run on version 5.6 or earlier you will have to remove
the =head3 tags from the POD which were introduced in version 5.8.


John
 
U

Uri Guttman

JWK> Got it.

JWK> If you want it to run on version 5.6 or earlier you will have to remove
JWK> the =head3 tags from the POD which were introduced in version 5.8.

i liked using =head3 as the pod needed it with all the nested attribute
stuff. if you can figure out a good way to eliminate head3 i will do
it. i find pod easy to use but frustrating sometimes when you want
better nesting.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Members online

No members online now.

Forum statistics

Threads
474,156
Messages
2,570,878
Members
47,404
Latest member
PerryRutt

Latest Threads

Top