A script to separate out file names from the path?

Rich Grise · Dec 11, 2006

I have a collection of about 6000 files that need to be reorganized.
These have been strewn all over the place, from CDs to various partitions
and subdirectories on different workstations, to a pile of various
subdirectories from our Samba server, and what-not.

They're all on different depths of subdir, and I'm almost certain that
there's a lot of redundancy - I've got a list that looks something like
this example:

/Collection/a/b/c/d/file1
/Collection/a/b/c/d/file2
/Collection/a/b/c/d/file3
/Collection/a/b/c/d/file4
/Collection/a/b/c/e/file4
/Collection/a/b/c/e/file5
/Collection/e/f/g/file4
/Collection/e/f/g/file5
/Collection/e/f/g/file6
/Collection/e/f/g/file7

and so on; as you can see, they're at different subdir depths;
what I want to do, if possible, is to take this array, split out
only the last component (after some unknown number of '/', but
the last one in the string), put it in the front of a new
string, then concatenate the original line;

The ultimate goal is to sort these by filename - I could kill
a lot of redundancy pretty easy that way.

But it turns out, what I've been trying to do is use
for (<>) {
my @line = split(/\//,$_);
my $count = @line;
print (@line[$count-1], " : ", $_);
}

doesn't seem to accomplish what I think it should. Here's the
script I've got so far:

#!/usr/bin/perl

while (<>) {
$input = chop($_);
@line = split(/\//,$input);
$count = @line;
print ("count = ", $count, "\n");

# foreach $item(@line) {
# print (" item = ", $item);
# }
# print ("count = ", $count, " ");

# for ($i = 0; $i < $count; $i++) {
# print (" item ", $i, " = ", @line[$i], " ");
# }

# $myitem = @line[$count-1];

# print (@line[$count-1]);

# print ": ";
# print $input;
# print "\n";
}

As you can see, I've tried variations on this, and nothing I've
tried yet has done what I want.

Here's the input (example):

/Collection/a/b/c/d/file1
/Collection/a/b/c/d/file2
/Collection/a/b/c/d/file3
/Collection/a/b/c/d/file4
/Collection/a/b/c/e/file4
/Collection/a/b/c/e/file5
/Collection/e/f/g/file4
/Collection/e/f/g/file5
/Collection/e/f/g/file6
/Collection/e/f/g/file7

And here's what I want the output to look like:

file1 : /Collection/a/b/c/d/file1
file2 : /Collection/a/b/c/d/file2
file3 : /Collection/a/b/c/d/file3
file4 : /Collection/a/b/c/d/file4
file4 : /Collection/a/b/c/e/file4
file5 : /Collection/a/b/c/e/file5
file4 : /Collection/e/f/g/file4
file5 : /Collection/e/f/g/file5
file6 : /Collection/e/f/g/file6
file7 : /Collection/e/f/g/file7

Which I could sort, and track down the duplicates.

But I'm stuck on rearranging the strings. )-;

Would anyone wish to be so kind as to volunteer to do my homework for me?

Thanks,
Rich

usenet · Dec 11, 2006

Rich said:
A script to separate out file names from the path?

The module File::Basename is part of your standard Perl distribution.
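
For reference, a minimal sketch of how basename() from File::Basename could produce the "name : path" lines asked for. The helper name label_path and the sample list are assumptions made for illustration; only basename() comes from the module.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# label_path is a hypothetical helper name; only basename() comes from the module.
sub label_path {
    my ($path) = @_;
    return basename($path) . " : $path";
}

# Sample paths taken from the question.
my @paths = qw(
    /Collection/a/b/c/d/file4
    /Collection/e/f/g/file4
);
print label_path($_), "\n" for @paths;
```

To process a real list, the sample array could be replaced by a while (<>) loop that chomps each line before calling the helper.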

J. Gleixner · Dec 11, 2006

Rich Grise wrote:
[...]

The ultimate goal is to sort these by filename - I could kill
a lot of redundancy pretty easy that way.

But it turns out, what I've been trying to do is use
for (<>) {
my @line = split(/\//,$_);
my $count = @line;
print (@line[$count-1], " : ", $_);
}

You can use a negative index.

my @arr = qw(a b c d e);
print $arr[-1];

Will print: e

Note: It's $line[] not @line[].

And since split returns a list, you could get the last item:

my $last_item = ( split /\// ) [-1];
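
Putting those two points together, a small sketch (the helper name last_component is made up for illustration) that takes the last field of a path both through an array element and through a slice of the list returned by split:

```perl
use strict;
use warnings;

# last_component is a hypothetical helper name used for illustration.
sub last_component {
    my ($path) = @_;
    # Index the list returned by split directly; -1 selects the final field.
    return ( split m{/}, $path )[-1];
}

my $path  = '/Collection/a/b/c/e/file5';
my @parts = split m{/}, $path;

print "$parts[-1]\n";                 # scalar element: $parts[-1], not @parts[-1]
print last_component($path), "\n";    # same field without the temporary array
```

Note that split drops trailing empty fields, so a path ending in '/' would yield the last directory name here rather than an empty string.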

Would anyone wish to be so kind as to volunteer to do my homework for me?

No, however most people will help you learn the language so you can do
it yourself.

Lew Pitcher · Dec 11, 2006

Rich said:
I have a collection of about 6000 files that need to be reorganized.
These have been strewn all over the place, from CDs to various partitions
and subdirectories on different workstations, to a pile of various
subdirectories from our Samba server, and what-not.

They're all on different depths of subdir, and I'm almost certain that
there's a lot of redundancy - I've got a list that looks something like
this example:

/Collection/a/b/c/d/file1
/Collection/a/b/c/d/file2
/Collection/a/b/c/d/file3
/Collection/a/b/c/d/file4
/Collection/a/b/c/e/file4
/Collection/a/b/c/e/file5
/Collection/e/f/g/file4
/Collection/e/f/g/file5
/Collection/e/f/g/file6
/Collection/e/f/g/file7

and so on; as you can see, they're at different subdir depths;
what I want to do, if possible, is to take this array, split out
only the last component (after some unknown number of '/', but
the last one in the string), put it in the front of a new
string, then concatenate the original line;

The ultimate goal is to sort these by filename - I could kill
a lot of redundancy pretty easy that way.

But it turns out, what I've been trying to do is use
for (<>) {
my @line = split(/\//,$_);
my $count = @line;
print (@line[$count-1], " : ", $_);
}

doesn't seem to accomplish what I think it should. Here's the
script I've got so far:

[snip]

I say, why use complex tools when simple tools will suffice?

Have you looked at the basename(1) and dirname(1) utilities?

lpitcher@merlin:~$ basename /Collection/a/b/c/d/file1.a
file1.a
lpitcher@merlin:~$ basename /Collection/a/b/c/d/file1
file1

lpitcher@merlin:~$ dirname /Collection/a/b/c/d/file1.a
/Collection/a/b/c/d
lpitcher@merlin:~$ dirname /Collection/a/b/c/d/file1
/Collection/a/b/c/d

Something as simple as

#!/bin/bash
echo `basename $1`: $1

might do the trick

HTH

John W. Krahn · Dec 11, 2006

Rich said:
I have a collection of about 6000 files that need to be reorganized.
These have been strewn all over the place, from CDs to various partitions
and subdirectories on different workstations, to a pile of various
subdirectories from our Samba server, and what-not.

They're all on different depths of subdir, and I'm almost certain that
there's a lot of redundancy - I've got a list that looks something like
this example:

/Collection/a/b/c/d/file1
/Collection/a/b/c/d/file2
/Collection/a/b/c/d/file3
/Collection/a/b/c/d/file4
/Collection/a/b/c/e/file4
/Collection/a/b/c/e/file5
/Collection/e/f/g/file4
/Collection/e/f/g/file5
/Collection/e/f/g/file6
/Collection/e/f/g/file7

and so on; as you can see, they're at different subdir depths;
what I want to do, if possible, is to take this array, split out
only the last component (after some unknown number of '/', but
the last one in the string), put it in the front of a new
string, then concatenate the original line;

The ultimate goal is to sort these by filename - I could kill
a lot of redundancy pretty easy that way.

But it turns out, what I've been trying to do is use
for (<>) {
my @line = split(/\//,$_);
my $count = @line;
print (@line[$count-1], " : ", $_);

You are using an array slice when you should be using a scalar:

Found in /usr/lib/perl5/5.8.6/pod/perlfaq4.pod
What is the difference between $array[1] and @array[1]?

And you can use negative numbers to index from the end of the array:

print "$line[-1] : $_";

}

doesn't seem to accomplish what I think it should. Here's the
script I've got so far:

#!/usr/bin/perl

use warnings;
use strict;

while (<>) {
$input = chop($_);

You should use chomp instead of chop.

@line = split(/\//,$input);
$count = @line;
print ("count = ", $count, "\n");

# foreach $item(@line) {
# print (" item = ", $item);
# }
# print ("count = ", $count, " ");

# for ($i = 0; $i < $count; $i++) {
# print (" item ", $i, " = ", @line[$i], " ");
# }

# $myitem = @line[$count-1];

# print (@line[$count-1]);

# print ": ";
# print $input;
# print "\n";
}

#!/usr/bin/perl
use warnings;
use strict;

use File::Basename;

print map  /\0(.+)/s,
      sort
      map  basename( $_ ) . "\0$_",
      <>;

__END__
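
Since the stated goal is tracking down duplicates, here is a hedged sketch of a different approach from the sort above: group the paths by basename in a hash and print only the names that occur more than once. The helper name group_by_basename and the sample list are assumptions, and a shared name does not by itself prove the files have the same content.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# group_by_basename is a hypothetical helper: it maps each file name
# to an array reference holding every path that ends in that name.
sub group_by_basename {
    my @paths = @_;
    my %paths_for;
    push @{ $paths_for{ basename($_) } }, $_ for @paths;
    return \%paths_for;
}

# Sample paths from the question; a real run would read them from a file.
my @paths = qw(
    /Collection/a/b/c/d/file4
    /Collection/a/b/c/e/file4
    /Collection/a/b/c/e/file5
    /Collection/e/f/g/file6
);

my $paths_for = group_by_basename(@paths);
for my $name ( sort keys %$paths_for ) {
    my @found = @{ $paths_for->{$name} };
    next if @found < 2;    # only report names seen more than once
    print "$name:\n";
    print "    $_\n" for @found;
}
```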

John

Paul Lalli · Dec 11, 2006

Rich said:
for (<>) {
my @line = split(/\//,$_);
my $count = @line;
print (@line[$count-1], " : ", $_);
}

doesn't seem to accomplish what I think it should.

No, that would have worked perfectly well. It's just not at all what
you did.

Here's the
script I've got so far:

#!/usr/bin/perl

while (<>) {
$input = chop($_);

perldoc -f chop
chop VARIABLE
chop( LIST )
chop Chops off the last character of a string and returns
the character chopped.

Did you bother printing $input to see what it was? It's not the line
minus the trailing newline. It's the trailing newline.

You should be using chomp anyway.

while (my $input = <>) {
chomp $input;
#etc
}
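
A short sketch of the difference, using a sample line from the question (the variable names are only for illustration): chop hands back the character it removed, while chomp edits the variable in place and returns a count.

```perl
use strict;
use warnings;

my $line = "/Collection/a/b/c/d/file1\n";

# chop removes the last character and returns that character.
my $copy    = $line;
my $chopped = chop $copy;
print "chop returned a newline\n" if $chopped eq "\n";

# chomp removes a trailing newline (strictly, the value of $/) in place
# and returns the number of characters removed.
my $input   = $line;
my $removed = chomp $input;
print "chomp removed $removed character(s), leaving: $input\n";
```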

Regardless, use File::Basename, as another responder suggested. This
wheel has already been written.

Paul Lalli

Uri Guttman · Dec 11, 2006

LP> I say why use complex tools when simple tools will suffice

LP> Have you looked at the basename(1) and dirname(1) utilities?

i say why use external shell commands when File::Basename is a core
module?

uri

Rich Grise · Dec 11, 2006

The module File::Basename is part of your standard Perl distribution.

Sorry for the bother - I just did it the old way in C, which I know is
heresy for the perl group. =:-O

/* relist.c */
/* reformats strings. */

#include <stdio.h>

char buffer[512];
char * bufp;

int main() {
while (bufp = gets(buffer)) {
bufp = strrchr(buffer, '/');
printf ("item ID = %s, data = %s\n", bufp + 1, buffer);
}
}

Thanks!
Rich

Dr.Ruud · Dec 11, 2006

Rich Grise wrote:

#include <stdio.h>

char buffer[512];
char * bufp;

int main() {
while (bufp = gets(buffer)) {
bufp = strrchr(buffer, '/');
printf ("item ID = %s, data = %s\n", bufp + 1, buffer);
}
}

Perl version:

while ( <> =~ m~(.+/(.+))~ ) {
printf "item ID = %s, data = %s\n", $2, $1 ;
}

Tad McClellan · Dec 12, 2006

["Followup-To:" header set to comp.lang.perl.misc.]

Here's the input (example):

/Collection/a/b/c/d/file1
/Collection/a/b/c/d/file2
/Collection/a/b/c/d/file3
/Collection/a/b/c/d/file4
/Collection/a/b/c/e/file4
/Collection/a/b/c/e/file5
/Collection/e/f/g/file4
/Collection/e/f/g/file5
/Collection/e/f/g/file6
/Collection/e/f/g/file7

And here's what I want the output to look like:

file1 : /Collection/a/b/c/d/file1
file2 : /Collection/a/b/c/d/file2
file3 : /Collection/a/b/c/d/file3
file4 : /Collection/a/b/c/d/file4
file4 : /Collection/a/b/c/e/file4
file5 : /Collection/a/b/c/e/file5
file4 : /Collection/e/f/g/file4
file5 : /Collection/e/f/g/file5
file6 : /Collection/e/f/g/file6
file7 : /Collection/e/f/g/file7

perl -pe 's/(.*\/(.*))/$2 : $1/' input.file
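
The same substitution spelled out as a script, as a sketch for anyone wondering why it picks the last slash. The helper name relabel is an assumption made for illustration.

```perl
use strict;
use warnings;

# relabel is a hypothetical helper wrapping the same substitution.
sub relabel {
    my ($line) = @_;
    # The first .* is greedy, so the slash it stops at is the last one;
    # $2 is what follows that slash and $1 is the whole matched path.
    $line =~ s{(.*/(.*))}{$2 : $1};
    return $line;
}

print relabel("/Collection/e/f/g/file7"), "\n";
```

Because . does not match a newline by default, a trailing newline stays at the end of the line, and a line with no slash passes through unchanged.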

Ted Zlatanov · Dec 12, 2006

The module File::Basename is part of your standard Perl distribution.

Sorry for the bother - I just did it the old way in C, which I know is
heresy for the perl group. =:-O

/* relist.c */
/* reformats strings. */

#include <stdio.h>

char buffer[512];
char * bufp;

int main() {
while (bufp = gets(buffer)) {
bufp = strrchr(buffer, '/');
printf ("item ID = %s, data = %s\n", bufp + 1, buffer);
}
}

It's not heresy, just not interesting--most of us have written C and
much prefer Perl. Also, you shouldn't use gets(). Ever. Henry
Spencer explains it better than I could:

http://isthe.com/chongo/tech/comp/c/10com.html

Ted

Ted Zlatanov · Dec 12, 2006

I say why use complex tools when simple tools will suffice

Excellent point. But also, you have to know the complex ways in which
simple tools can fail.

Something as simple as

#!/bin/bash
echo `basename $1`: $1

# touch 'a b'

# cat b.sh
#!/bin/bash
echo `basename $1`: $1

# ./b.sh 'a b'
a: a b

You need the second line to be

echo `basename "$1"`: $1

and even that may have trouble on systems like Windows that don't have
a `basename' program available by default.

Ted

Rich Grise · Dec 20, 2006

I have a collection of about 6000 files that need to be reorganized.

After reading the wealth of responses, I decided to go ahead and break
form and reply to myself, because I want to say thank you to each and
every one of you, but I'm too lazy to type this six times. ;-)

Thanks!
Rich
