sort question

N

Nick Wedd

Here is my program:


use strict;

sub by_incsub1 { $$a[1] <=> $$b[1]; }
sub by_incsub2 { $$a[2] <=> $$b[2]; }

my $i;
my @a = ( ['a',3,1],['b',2,2],['c',1,1],['d',1,2],['e',2,1],['f',3,2] );

my @result = sort by_incsub1 @a;
foreach $i ( 0..5 )
{ print "$result[$i][0]$result[$i][1]$result[$i][2] "; }
print "\n";
@result = sort by_incsub2 @result;
foreach $i ( 0..5 )
{ print "$result[$i][0]$result[$i][1]$result[$i][2] "; }
print "\n";
@result = sort by_incsub1 @result;
foreach $i ( 0..5 )
{ print "$result[$i][0]$result[$i][1]$result[$i][2] "; }


and here is its output:


c11 d12 b22 e21 a31 f32
c11 e21 a31 d12 b22 f32
c11 d12 e21 b22 a31 f32


This is exactly what I hoped for. In fact it is better (more useful to
me) than I feel I have any right to expect. When it sorts on one
criterion, it leaves the items that tie under that criterion in the
order they were in before.

Now this is exactly what I want it to do. But no documentation that I
can recall promises that it will do that. Output like
c11 d12 e21 b22 a31 f32
a31 c11 e21 b22 d12 f32
d12 c11 e21 b22 f32 a31
would still meet the specification of "sort".

Can I rely on Perl's sort to continue to do what I want, or is it
implementation-dependent?

Nick
 
A

A. Sinan Unur

This is exactly what I hoped for. In fact it is better (more useful
to me) than I feel I have any right to expect. When it sorts on one
criterion, it leaves the items that tie under that criterion in the
order they were in before.

Now this is exactly what I want it to do. But no documentation that I
can recall promises that it will do that.

Which version of Perl are you using?

http://perldoc.perl.org/functions/sort.html

Perl 5.6 and earlier used a quicksort algorithm to implement sort. That
algorithm was not stable, and could go quadratic. (A stable sort
preserves the input order of elements that compare equal. Although
quicksort's run time is O(NlogN) when averaged over all arrays of length
N, the time can be O(N**2), quadratic behavior, for some inputs.) In
5.7, the quicksort implementation was replaced with a stable mergesort
algorithm whose worst-case behavior is O(NlogN). But benchmarks
indicated that for some inputs, on some platforms, the original
quicksort was faster. 5.8 has a sort pragma for limited control of the
sort. Its rather blunt control of the underlying algorithm may not
persist into future Perls, but the ability to characterize the input or
output in implementation independent ways quite probably will. See sort.

http://perldoc.perl.org/sort.html

use sort 'stable'; # guarantee stability


-- Sinan


--
A. Sinan Unur <[email protected]>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
 
N

Nick Wedd

A. Sinan Unur said:
Which version of Perl are you using?

Version 5.6.1

So what I am looking for is a "stable" sort. Knowing what it is called
will make further investigations easier for me.

That page tells me that the sort in 5.6 is not stable; the one in 5.7
is stable; and in 5.8 I can use a pragma to ensure that it uses a
stable sort. So I have been lucky so far, maybe because my arrays have
never had more than seven elements.
Perl 5.6 and earlier used a quicksort algorithm to implement sort. That
algorithm was not stable, and could go quadratic. (A stable sort
preserves the input order of elements that compare equal. Although
quicksort's run time is O(NlogN) when averaged over all arrays of length
N, the time can be O(N**2), quadratic behavior, for some inputs.) In
5.7, the quicksort implementation was replaced with a stable mergesort
algorithm whose worst-case behavior is O(NlogN). But benchmarks
indicated that for some inputs, on some platforms, the original
quicksort was faster. 5.8 has a sort pragma for limited control of the
sort. Its rather blunt control of the underlying algorithm may not
persist into future Perls, but the ability to characterize the input or
output in implementation independent ways quite probably will. See sort.

http://perldoc.perl.org/sort.html

use sort 'stable'; # guarantee stability


-- Sinan

Thank you.

Nick
 
S

sln

Here is my program:


use strict;

sub by_incsub1 { $$a[1] <=> $$b[1]; }
sub by_incsub2 { $$a[2] <=> $$b[2]; }

my $i;
my @a = ( ['a',3,1],['b',2,2],['c',1,1],['d',1,2],['e',2,1],['f',3,2] );

my @result = sort by_incsub1 @a;
foreach $i ( 0..5 )
{ print "$result[$i][0]$result[$i][1]$result[$i][2] "; }
print "\n";
@result = sort by_incsub2 @result;
foreach $i ( 0..5 )
{ print "$result[$i][0]$result[$i][1]$result[$i][2] "; }
print "\n";
@result = sort by_incsub1 @result;
foreach $i ( 0..5 )
{ print "$result[$i][0]$result[$i][1]$result[$i][2] "; }


and here is its output:


c11 d12 b22 e21 a31 f32
c11 e21 a31 d12 b22 f32
c11 d12 e21 b22 a31 f32


This is exactly what I hoped for. In fact it is better (more useful to
me) than I feel I have any right to expect. When it sorts on one
criterion, it leaves the items that tie under that criterion in the
order they were in before.

Now this is exactly what I want it to do. But no documentation that I
can recall promises that it will do that. Output like
c11 d12 e21 b22 a31 f32
a31 c11 e21 b22 d12 f32
d12 c11 e21 b22 f32 a31
would still meet the specification of "sort".

Can I rely on Perl's sort to continue to do what I want, or is it
implementation-dependent?

Nick

There is no difference in style when sorting primary, secondary, tertiary
fields as it crosses languages, the same result will be arived at no matter
what. So regardless of what sort method is used, the layout of how you interpret
relationships is the same. Its either less than, greater than or equal in the
comparison.

That said, there is no need to re-sort on countless fields when it could be,
at least done in one pass. This is no guarantee of speed benefit, but in general,
the below framework is how it is done in one pass. Its up to you to customize
as your requirements dictate.

Obviously doing a sort in one pass is quicker. However, this requires custom, user
supplied comparison functions.

Below is a sample of whats possible. Minimal error checking, it is asumed that you
would know what to do.

There is a prototype "key" function at the top that is just there to explain the
logic in sorting. Once you understand that, you can write any custom bomb you want.
And you should. Don't lay all the responsibility on Perl, its up to you to craft
a sort protocol.

Good luck!
-sln

-------------------------------------------------------------------------------
## iii.pl
## More sort junk
## -sln

use warnings;
use strict;

sub Sort_Template_Protype_aka_By_Both
{
if ( $$a[1] < $$b[1] ) {return -1}
if ( $$a[1] > $$b[1] ) {return 1}
if ( $$a[1] == $$b[1]) {
if (($$a[2] < $$b[2])) {return -1}
if (($$a[2] > $$b[2])) {return 1}
# if element 2's are equal {
# .. check element 3, etc..
return 0
}
}

sub By_Both
{
my $element_compare = $$a[1] <=> $$b[1];
$element_compare == 0 ? ($$a[2] <=> $$b[2]) : $element_compare;
}

sub By_Field_Range
{
my ($start,$end) = @_;
return $$a[$start] <=> $$b[$start] if (!defined $end || $end <= $start);
for ($start..$end)
{
my $element_compare = $$a[$_] <=> $$b[$_];
next if ($element_compare == 0);
return $element_compare;
}
$$a[$_] <=> $$b[$_];
}

sub By_Field_Array
{
return 0 if (!@_);
return $$a[$_[0]] <=> $$b[$_[0]] if (scalar(@_) == 1);
for (@_)
{
my $element_compare = $$a[$_] <=> $$b[$_];
next if ($element_compare == 0);
return $element_compare;
}
$$a[$_] <=> $$b[$_];
}

## --------------------------------------------------------
my @result;
my @a = ( ['a',3,1],['b',2,2],['c',1,1],['d',1,2],['e',2,1],['f',3,2] );

sub Print_Results
{
foreach my $i ( 0..5 ) {
print "$result[$i][0]$result[$i][1]$result[$i][2] ";
}
print "\n";
}

## Stuff to play with..
## ---------------------

@result = sort { By_Both } @a; Print_Results();
@result = sort { By_Field_Range ( 1 ) } @a; Print_Results(); #need params
@result = sort { By_Field_Range (1,1) } @a; Print_Results(); #need params
@result = sort { By_Field_Range (1,3) } @a; Print_Results(); #need params
@result = sort { By_Field_Array ( 1 ) } @a; Print_Results(); #need params
@result = sort { By_Field_Array (1,2) } @a; Print_Results(); #need params
@result = sort { By_Field_Array ( 2 ) } @a; Print_Results(); #need params


__END__

Output:

c11 d12 e21 b22 a31 f32
c11 d12 b22 e21 a31 f32
c11 d12 b22 e21 a31 f32
c11 d12 e21 b22 a31 f32
c11 d12 b22 e21 a31 f32
c11 d12 e21 b22 a31 f32
a31 c11 e21 b22 d12 f32
 
S

sln

[snip for code correction]

-------------------------------------------------------------------------------
## iii.pl
## More sort junk
## -sln

use warnings;
use strict;

sub Sort_Template_Protype_aka_By_Both
{
if ( $$a[1] < $$b[1] ) {return -1}
if ( $$a[1] > $$b[1] ) {return 1}
if ( $$a[1] == $$b[1]) {
if (($$a[2] < $$b[2])) {return -1}
if (($$a[2] > $$b[2])) {return 1}
# if element 2's are equal {
# .. check element 3, etc..
return 0
}
}

sub By_Both
{
my $element_compare = $$a[1] <=> $$b[1];
$element_compare == 0 ? ($$a[2] <=> $$b[2]) : $element_compare;
}

sub By_Field_Range
{
my ($start,$end) = @_;
return $$a[$start] <=> $$b[$start] if (!defined $end || $end <= $start);
for ($start..$end)
{
my $element_compare = $$a[$_] <=> $$b[$_];
next if ($element_compare == 0);
return $element_compare;
}
fix> $$a[$_] <=> $$b[$_];
^
0
for sure this should be zero not only because $element_compare equals 0 here
but because $_ could be undefined (my own template says that).
}

sub By_Field_Array
{
return 0 if (!@_);
return $$a[$_[0]] <=> $$b[$_[0]] if (scalar(@_) == 1);
for (@_)
{
my $element_compare = $$a[$_] <=> $$b[$_];
next if ($element_compare == 0);
return $element_compare;
}
fix> $$a[$_] <=> $$b[$_];
^
0
for sure this should be zero not only because $element_compare equals 0 here
but because $_ could be undefined (my own template says that).
}

## --------------------------------------------------------
my @result;
my @a = ( ['a',3,1],['b',2,2],['c',1,1],['d',1,2],['e',2,1],['f',3,2] );

sub Print_Results
{
foreach my $i ( 0..5 ) {
print "$result[$i][0]$result[$i][1]$result[$i][2] ";
}
print "\n";
}

## Stuff to play with..
## ---------------------

@result = sort { By_Both } @a; Print_Results();
@result = sort { By_Field_Range ( 1 ) } @a; Print_Results(); #need params
@result = sort { By_Field_Range (1,1) } @a; Print_Results(); #need params
fix>@result = sort { By_Field_Range (1,3) } @a; Print_Results(); #need params
^
is out of range given @a, and not checked in sort function, set it to 2.
its up to the user to check parameters.
@result = sort { By_Field_Array ( 1 ) } @a; Print_Results(); #need params
@result = sort { By_Field_Array (1,2) } @a; Print_Results(); #need params
@result = sort { By_Field_Array ( 2 ) } @a; Print_Results(); #need params


__END__

-sln
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,185
Members
46,736
Latest member
AdolphBig6

Latest Threads

Top