To extract file name only from a file

Rider · Jul 9, 2009

Hi experts,

I have this file, inut.txt (listed below). Each line in the file has
more than 10 fields, but I am only listing a sample format here.

I need to print out only the file names that end with .txt.

The output should be:
===============
unixFile1.txt
unixFile2.txt
unixFile3.txt
....
...
===============

I am looking for a short regexp that extracts only the file names
into the output. I write basic Perl only occasionally, and I don't
know the right regexp for this.

Thanks in advance,
J

Here is the input file.
===================
1, bob, usr/tst/unixFile1.txt, boston, text1, text2, text3
2, bob, usr/tst/unixFile2.txt, boston, text1, text2, text3
3, bob, usr/tst/unixFile3.txt, boston, text1, text2, text3
......
.....
===================

Josef Moellers · Jul 9, 2009

$f1 = (split(/,\s+/, $line))[2];
print "$f1\n" if $f1 =~ /\.txt$/;

Josef Moellers · Jul 9, 2009

$f1 = (split(/\//, (split(/,\s+/, $line))[2]))[-1];
print "$f1\n" if $f1 =~ /\.txt$/;
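
To make the two-step split above easier to try, here is a minimal, self-contained sketch that wraps it in a loop over the sample lines. The helper name txt_name_from_line and the inline @sample array are made up for illustration; the thread itself only shows the two statements above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the .txt file name from the third field,
# or nothing if the field is missing or does not end in .txt.
sub txt_name_from_line {
    my ($line) = @_;
    chomp $line;

    # Third comma-separated field (index 2) holds the path.
    my $path = (split(/,\s+/, $line))[2];
    return unless defined $path;

    # Last slash-separated component is the file name.
    my $name = (split(/\//, $path))[-1];
    return unless defined $name && $name =~ /\.txt$/;
    return $name;
}

# Sample lines copied from the original question.
my @sample = (
    "1, bob, usr/tst/unixFile1.txt, boston, text1, text2, text3",
    "2, bob, usr/tst/unixFile2.txt, boston, text1, text2, text3",
    "3, bob, usr/tst/unixFile3.txt, boston, text1, text2, text3",
);

for my $line (@sample) {
    my $name = txt_name_from_line($line);
    print "$name\n" if defined $name;
}
```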

Jürgen Exner · Jul 9, 2009

This looks like a standard CSV format, and you want the third column. So
I would use Text::CSV and grab the third element from each row.

If you insist on reinventing the wheel, then at least for the sample
data you have shown you can grab the third element after split()ing
each line at the commas.

jue

Rider · Jul 9, 2009

Thanks Josef,

But I am looking for a one-liner that grabs just the file name
ending in .txt from each line, without using the split function.
I am sure that I have seen that kind of regular expression before,
but I cannot recall it now.

Jürgen Exner · Jul 9, 2009

Unless this is some academic exercise, why do you want to do it the
hard way?
The easy and most robust way is to use Text::CSV, grab the third item,
and then use File::Basename to extract the file name.

Or actually in your case you could also use
substr($line, 16, 13) #might be off by one somewhere
because the filename starts at character 16 and is 13 characters long.

Oh, you mean that's just your sample data and the actual data might vary
in length? Well, too bad, because your actual data may also vary in such
a way as to make a regexp fail. That is exactly why Text::CSV and
File::Basename are more robust: they spare you from patching your
hand-rolled code over and over again whenever you encounter some
unforeseen data.

jue
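
For illustration, here is a minimal sketch of the File::Basename half of this advice. It is not jue's code: the helper name txt_basename is hypothetical, and a plain split() stands in for Text::CSV so that the example runs with core modules only, which assumes that no field contains a quoted comma.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: take the third comma-separated field and return
# its base name if that name ends in .txt; otherwise return nothing.
sub txt_basename {
    my ($line) = @_;
    chomp $line;

    # Assumption: no quoted commas inside fields, so split() is enough
    # here. Real CSV data would call for a proper CSV parser instead.
    my @fields = split /\s*,\s*/, $line;
    return unless defined $fields[2];

    # basename() strips the directory part of the path.
    my $name = basename($fields[2]);
    return unless $name =~ /\.txt\z/;
    return $name;
}

for my $line (
    "1, bob, usr/tst/unixFile1.txt, boston, text1, text2, text3",
    "2, bob, usr/tst/user input.txt, boston, text1, text2, text3",
) {
    my $name = txt_basename($line);
    print "$name\n" if defined $name;
}
```

Because the field is isolated before the name is tested, a ".txt" elsewhere on the line cannot leak into the result.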

Rider · Jul 9, 2009

It is not a CSV file. It is a PHP file with a lot of comments in the
middle as well.
So I am looking for a regexp that gets me only the file name ending
with .txt (the file name might have a space in the middle, for
example: user input.txt instead of userinput.txt).

Rider · Jul 9, 2009

Len said:
perl -nle 'print $1 if (/.+\/(.+\.txt)/)' rider.txt

where rider.txt is your input file.

D:\Perl\source\1>perl -nle "print $1 if (/.+\/(.+\.txt)/)" rider.txt
unixFile1.txt
unixFile2.txt
unixFile3.txt

Awesome, Len.

Thanks a bunch, this serves my purpose. Although I have not run it
yet, I can see that it would work.
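
One caveat with the one-liner above: both .+ are greedy, so if a later field on the same line also ends in .txt, the capture would likely run on through that field. A hedged sketch of a tighter pattern follows, assuming file names contain neither "/" nor "," (spaces are allowed, as in user input.txt). The helper name first_txt_name and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first *.txt file name found in a line,
# or nothing if there is none.
sub first_txt_name {
    my ($line) = @_;

    # A name starts with a character that is not a slash, comma, or
    # whitespace, continues with anything except slash or comma, ends
    # in ".txt", and must be followed by a comma or the end of the line.
    return $1 if $line =~ m{([^/,\s][^/,]*\.txt)\s*(?:,|$)};
    return;
}

for my $line (
    "1, bob, usr/tst/unixFile1.txt, boston, text1, text2, text3",
    "2, bob, usr/tst/user input.txt, boston, notes.txt",
    "3, bob, no file name here, boston",
) {
    my $name = first_txt_name($line);
    print defined $name ? "$name\n" : "(no .txt name)\n";
}
```

Excluding "/" and "," from the character class is what keeps the match inside a single field and a single path component, so no split is needed.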

sln · Jul 11, 2009

This might help. It's a construct-your-own recipe; how you use it
is up to you. It is certainly not a one-liner (or short), but neither
is real file name parsing. There might be a module you could invoke.
Or you could use something like:

/(?:(\/\s*[.-]+.*?)|([a-z0-9_][a-z0-9_ .-]*\.txt))[\s,]+/i and defined $2

-sln

----------------------------
## parse_fname_unix.pl
## (some rudimentary regex construction)
##
use strict;
use warnings;

use constant debug => 1;

my $start_char = "a-z0-9_";
my $body_chars = "$start_char .-";
my $field_seps = "\\s,";
my $fname = "[$start_char][$body_chars]*";
my $ext = "txt";
my $bad_fname = "\/\\s*[.-]+.*?";

# Group 1 captures a rejected ("bad") name, group 2 an accepted one.
my $qualified_name = qr/(?:($bad_fname)|($fname\.$ext))[$field_seps]+/i;

print "\n$qualified_name\n";

while (<DATA>)
{
    next if (/^\s*$/);

    if (debug) {
        print "\n$_";
        while (/$qualified_name/g)
        {
            print "\tBAD: $1\n" if defined $1;
            print "\tOK: $2\n" if defined $2;
        }
    } else {
        # Keep scanning past bad matches instead of stopping at the first one.
        while (/$qualified_name/g) {
            print "$2\n" if defined $2;
        }
    }
}

__DATA__

-4, bob, unix/ .txt/File_-4.txt, boston, text1, unix/tst.txt/File_-4a.txt
-3, bob, unix .txt/File_-3.txt, boston, text1, text2, text3
-2, bob, unix .txt/.-File_-2.txt, boston, text1, text2, text3
-1, bob, unix .txt.-File_-1.txt, boston, text1, text2, text3
0, bob, unixFile0.txt, boston, text1, text2, text3
1, bob, usr/tst/unixFile1.txt, boston, text1, text2, text3
2, bob, usr/tst/unixFile2.txt, boston, text1, text2, text3
3, bob, usr/tst/unix.some.txt.File3.txt, boston, text1, text2, text3
4, bob, usr/tst.txt/unixFile4.Txt, boston, text1, text2, text3

--------------------
output:

(?i-xsm:(?:(/\s*[.-]+.*?)|([a-z0-9_][a-z0-9_ .-]*\.txt))[\s,]+)

-4, bob, unix/ .txt/File_-4.txt, boston, text1, unix/tst.txt/File_-4a.txt
BAD: / .txt/File_-4.txt
OK: File_-4a.txt

-3, bob, unix .txt/File_-3.txt, boston, text1, text2, text3
OK: File_-3.txt

-2, bob, unix .txt/.-File_-2.txt, boston, text1, text2, text3
BAD: /.-File_-2.txt

-1, bob, unix .txt.-File_-1.txt, boston, text1, text2, text3
OK: unix .txt.-File_-1.txt

0, bob, unixFile0.txt, boston, text1, text2, text3
OK: unixFile0.txt

1, bob, usr/tst/unixFile1.txt, boston, text1, text2, text3
OK: unixFile1.txt

2, bob, usr/tst/unixFile2.txt, boston, text1, text2, text3
OK: unixFile2.txt

3, bob, usr/tst/unix.some.txt.File3.txt, boston, text1, text2, text3
OK: unix.some.txt.File3.txt

4, bob, usr/tst.txt/unixFile4.Txt, boston, text1, text2, text3
OK: unixFile4.Txt
