Sort filename array the same way as Windows Explorer

panofish

I've searched and searched. I've seen references to Sort::Naturally
and others, but I don't have enough experience to write a piece of
code that sorts an array the same way Windows Explorer sorts
its file list.

Here is the list I want sorted:

AsM-00 (EPC) 20070125 173425.wmf
AsM-01 (EPC) 20070125 173425.wmf
AsM-01-01 (EPC) 20070125 173425.wmf
AsM-01-02 (EPC) 20070125 173425.wmf
AsM-01-03 (EPC) 20070125 173425.wmf
AsM-01-04 (EPC) 20070125 173425.wmf
AsM-01-05 (EPC) 20070125 173425.wmf
AsM-01-06 (EPC) 20070125 173425.wmf
AsM-02 (EPC) 20070125 173425.wmf
CpM-00 (EPC) 20070125 173425.wmf
CpM-01 (EPC) 20070125 173425.wmf
CpM-01-01 (EPC) 20070125 173425.wmf
CpM-01-02 (EPC) 20070125 173425.wmf

The desired sorted result should look the same as it would in Windows
Explorer, like this:

AsM-00 (EPC) 20070125 173425.wmf
AsM-01-01 (EPC) 20070125 173425.wmf
AsM-01-02 (EPC) 20070125 173425.wmf
AsM-01-03 (EPC) 20070125 173425.wmf
AsM-01-04 (EPC) 20070125 173425.wmf
AsM-01-05 (EPC) 20070125 173425.wmf
AsM-01-06 (EPC) 20070125 173425.wmf
AsM-01 (EPC) 20070125 173425.wmf
AsM-02 (EPC) 20070125 173425.wmf
CpM-00 (EPC) 20070125 173425.wmf
CpM-01-01 (EPC) 20070125 173425.wmf
CpM-01-02 (EPC) 20070125 173425.wmf
CpM-01 (EPC) 20070125 173425.wmf

The difference is subtle.
I hope someone can provide a robust and elegant piece of source code
that can do this sort.
That would be very much appreciated.
 
Mirco Wahab

panofish said:
I've searched and searched. I've seen references to Sort::Naturally
and others, but I don't have enough experience to write a piece of
code that sorts an array the same way Windows Explorer sorts
its file list.

AsM-00 (EPC) 20070125 173425.wmf
AsM-01 (EPC) 20070125 173425.wmf
AsM-01-01 (EPC) 20070125 173425.wmf
...
The desired sorted result should look the same as it would in Windows
Explorer, like this:

AsM-00 (EPC) 20070125 173425.wmf
AsM-01-01 (EPC) 20070125 173425.wmf
AsM-01-02 (EPC) 20070125 173425.wmf
...

It looks like Explorer converts spaces to character 255 (\xff)
before sorting strings (this is a guess).

That would imply something like:

...
sub newkey { (my $s=shift) =~ s/\x20+/\xff/; $s }
...
sub explorersort { sort {newkey($a) cmp newkey($b)} @_ }

print explorersort <DATA>

__DATA__
AsM-00 (EPC) 20070125 173425.wmf
AsM-01 (EPC) 20070125 173425.wmf
AsM-01-01 (EPC) 20070125 173425.wmf
AsM-01-02 (EPC) 20070125 173425.wmf
...

If that's too slow (large lists), use something like:

...
sub newkey { (my $s=shift) =~ s/\x20+/\xff/; $s }
...
print map $_->[0],
sort {$a->[1] cmp $b->[1]}
map [$_, newkey($_)],
<DATA>

__DATA__
AsM-00 (EPC) 20070125 173425.wmf
AsM-01 (EPC) 20070125 173425.wmf
...

Regards

M.
 
Paul Lalli

I don't know for a fact that this is true, but it *appears* that
Windows is sorting the filenames as they would be sorted if the non-
alphanumeric characters were not present. If that's true, the
following does the job:

@sorted = map  { $_->[0] }
          sort { $a->[1] cmp $b->[1] }
          map  { [ $_, do { $t = $_; $t =~ s/[^a-zA-Z0-9]//g; $t } ] } @files

We're using a Schwartzian Transform to generate a list of all the
files that have the non-alphanumerics removed, and have them
correspond to their originals. Then we sort on the modified version,
and return back the originals.
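
As a rough illustration of the transform described above, here is a minimal self-contained sketch run against a few of the names from the question. The helper name alnum_key and the shortened file list are assumptions made for this example, not part of the original answer, and the sketch only shows what the "strip non-alphanumerics" guess produces; it does not prove that Explorer works this way.

```perl
use strict;
use warnings;

# A few names from the question, in plain ASCII order.
my @files = (
    'AsM-00 (EPC) 20070125 173425.wmf',
    'AsM-01 (EPC) 20070125 173425.wmf',
    'AsM-01-01 (EPC) 20070125 173425.wmf',
    'AsM-01-02 (EPC) 20070125 173425.wmf',
    'AsM-02 (EPC) 20070125 173425.wmf',
);

# Hypothetical helper: the sort key is the name with every
# non-alphanumeric character removed.
sub alnum_key {
    my ($name) = @_;
    (my $key = $name) =~ s/[^a-zA-Z0-9]//g;
    return $key;
}

# Schwartzian Transform, read from the bottom up.
my @sorted =
    map  { $_->[0] }                  # 3. keep only the original name
    sort { $a->[1] cmp $b->[1] }      # 2. compare the precomputed keys
    map  { [ $_, alnum_key($_) ] }    # 1. pair each name with its key
    @files;

print "$_\n" for @sorted;
```

With these five names, "AsM-01-01" and "AsM-01-02" come out ahead of "AsM-01", because the key "AsM0101EPC..." compares lower than "AsM01EPC..." ('0' sorts before 'E').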

For more information:
perldoc -f map
perldoc -f sort

Hope this helps,
Paul Lalli
 
panofish

Well Mr. Lalli ... Mr. Lilly says THANK YOU! That worked! It was just
missing the ending semicolon.
VERY MUCH APPRECIATED.
 
anno4000

Mirco Wahab said:
sub newkey { (my $s=shift) =~ s/\x20+/\xff/; $s }

Only the first occurrence, or all of them? In the latter case, you'd
need /g on the replacement, but the standard tool for that is tr///:

tr/ /\xff/s
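
To make the difference concrete, here is a small sketch that applies the four variants under discussion to one sample string. The sample string and the variable names are assumptions made for this example only.

```perl
use strict;
use warnings;

my $name = "a  b c";   # two spaces, then one space

(my $first_run = $name) =~ s/\x20+/\xff/;     # first run of spaces only
(my $all_runs  = $name) =~ s/\x20+/\xff/g;    # every run; each run becomes one \xff
(my $squeezed  = $name) =~ tr/\x20/\xff/s;    # same result as s///g above
(my $each      = $name) =~ tr/\x20/\xff/;     # every single space becomes its own \xff

# Print each result as hex so the \xff bytes are visible.
printf "%-10s %s\n", $_->[0], unpack('H*', $_->[1])
    for [ first_run => $first_run ], [ all_runs => $all_runs ],
        [ squeezed  => $squeezed  ], [ each     => $each     ];
```

Without /g only the first run of spaces changes, tr///s and s/\x20+/\xff/g agree with each other, and plain tr/// keeps one \xff per original space.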

Anno
 
Mirco Wahab

anno4000 said:
Only the first occurrence, or all of them? In the latter case, you'd
need /g on the replacement, but the standard tool for that is tr///:

tr/ /\xff/s

The idea I had was to coalesce subsequent spaces
into one 0xff and sort the resulting string,
therefore your proposed /s would be correct.

But I re-checked the Windows sort order
with some generated files:

AsM-01-00\ \ \ 0(EPC)\ 20070125\ 173425.wmf
AsM-01-00\ \ 0(EPC)\ 20070125\ 173425.wmf
AsM-01-00\ 0(EPC)\ 20070125\ 173425.wmf

AsM-01-00\ \ \ 1(EPC)\ 20070125\ 173425.wmf
AsM-01-00\ \ 1(EPC)\ 20070125\ 173425.wmf
AsM-01-00\ 1(EPC)\ 20070125\ 173425.wmf

and the Windows sort order turns out to be:

AsM-01-00\ 0(EPC)\ 20070125\ 173425.wmf
AsM-01-00\ 1(EPC)\ 20070125\ 173425.wmf
AsM-01-00\ \ 0(EPC)\ 20070125\ 173425.wmf
AsM-01-00\ \ 1(EPC)\ 20070125\ 173425.wmf
AsM-01-00\ \ \ 0(EPC)\ 20070125\ 173425.wmf
AsM-01-00\ \ \ 1(EPC)\ 20070125\ 173425.wmf

So it's clear from the example that the
magic Windows sorter has to be something like:

...
sub hardenspaces { (my $s=shift) =~ tr/\x20/\xff/; $s }
sub explorersort { sort { hardenspaces($a) cmp hardenspaces($b) } @_ }
...
...
my @sorted_fnames = explorersort @fnames_from_dirread;
...

as you said above.
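
For reference, a minimal runnable sketch of this idea, combined with the precomputed-key sort suggested earlier for large lists. The name explorersort_cached and the shortened test names are assumptions made for this example, and the result only reflects the "every space becomes \xff" guess; it has not been checked against Explorer itself.

```perl
use strict;
use warnings;

# Key function from the post: every space becomes \xff (no squeezing).
sub hardenspaces { (my $s = shift) =~ tr/\x20/\xff/; return $s }

# Same idea as explorersort, but each key is computed once per name.
sub explorersort_cached {
    return map  { $_->[0] }
           sort { $a->[1] cmp $b->[1] }
           map  { [ $_, hardenspaces($_) ] } @_;
}

# Hypothetical test names with three, two and one space before the digit.
my @names;
for my $digit (0, 1) {
    for my $spaces (3, 2, 1) {
        push @names, 'AsM-01-00' . (' ' x $spaces) . "$digit(EPC).wmf";
    }
}

my @sorted_names = explorersort_cached(@names);
print "[$_]\n" for @sorted_names;
```

Under this key, names with fewer spaces sort first, and within the same number of spaces the digit decides.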

Thanks & Grüße

Mirco
 
Mirco Wahab

Mirco said:
The idea I had was to coalesce subsequent spaces
into one 0xff and sort the resulting string,
therefore your proposed /s would be correct.

(WTF)

must read: "your proposed /g would be correct"

Sorry,

M.
 
