Parsing a file name and assigning the extension to a variable

Alexander Heimann

Hi guys. I am new to Perl (four days). I am having a blast playing with
it. There seem to be a hundred ways to solve each problem I am faced
with.
Currently I am working on migrating data to a new database. I need to
read the contents of a few thousand files and then insert the contents
into a database. The trick is that the files are named like desc.121655, with
121655 being the record number, or primary key, in the database. So I
need to parse the filename and save the extension to a variable to
use later in my SQL statement. The steps I think I need to take are
below. Any comments would be great.

1. open directory
2. go file by file
3. assign extension of file to a variable @recordNum
4. assign contents of file to a variable @content
5. then insert content with SQL statement where PK = @recordNum
6. then do next file until end of directory

Sorry if that sounds confusing; I am a bit new at this.

Alex
 
gnari

[snip problem without actual question]
1. open directory..
2. go file by file
3 assign extension of file to a variable @recordNum

$recordNum ?
4 assign contents of file to a variable @content

$content ?
5 then insert content with SQL statement where PK = @recordNum
ditto

6 then do next file until end of directory

sounds good. go for it and let us know how it goes.

gnari
 
Sam Holden

Hi guys. I am new to Perl(four days). I am having a blast playing with
it. There seems to be a hundred ways to solve each problem I am faced
with.
Currently I am working on migrating data to a new database. I need to
read the contents of a few thousand files and then insert the contents
into a database. The trick is the files are named desc.121655 with
121655 being the record number or primary key in the database. So I
need to parse the filename and save the extension to a variable to
later use in my SQL statement. The steps I think i need to take are
below any comments would be great

1. open directory..

perldoc -f opendir
2. go file by file

perldoc -f readdir
perldoc perlsyn [look for for, foreach, while]
3 assign extension of file to a variable @recordNum

perldoc File::Basename

You certainly don't want to use an array for a single extension
(and you don't seem to need to keep all the data at once)
4 assign contents of file to a variable @content

perldoc -f open
perldoc -f readline
perldoc -f read
perldoc -f close
5 then insert content with SQL statement where PK = @recordNum

perldoc DBI
6 then do next file until end of directory

'}'


perldoc is a command on most Perl installs for reading the documentation. You
may instead have the documentation available as HTML or in some other format, in which case
"perldoc -f foo" means the entry for the foo function in the perlfunc
documentation, "perldoc File::Basename" means the documentation for
the File::Basename module, and "perldoc perlsyn" means the perlsyn
documentation.
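
As a minimal sketch of step 3, here is one way File::Basename could be used to pull the record number out of a name like desc.121655. The helper name record_number() is made up for this example, and it assumes the record number is everything after the first dot.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# record_number() is a hypothetical helper name used only for illustration.
# It returns the part of a file name after the first dot.
sub record_number {
    my ($file) = @_;
    # The suffix pattern matches from the first dot to the end of the name.
    my ($name, $dir, $suffix) = fileparse($file, qr/\..*/);
    $suffix =~ s/^\.//;    # drop the leading dot
    return $suffix;
}

print record_number('desc.121655'), "\n";    # prints 121655
```

A single scalar is enough here, since each file has exactly one extension. A plain regex match on the file name would also work if pulling in a module feels like overkill.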
 
John Bokma

Sam said:
Hi guys. I am new to Perl(four days). I am having a blast playing with
it. There seems to be a hundred ways to solve each problem I am faced
with.
Currently I am working on migrating data to a new database. I need to
read the contents of a few thousand files and then insert the contents
into a database. The trick is the files are named desc.121655 with
121655 being the record number or primary key in the database. So I
need to parse the filename and save the extension to a variable to
later use in my SQL statement. The steps I think i need to take are
below any comments would be great

1. open directory..

perldoc -f opendir
2. go file by file


perldoc -f readdir
perldoc perlsyn [look for for, foreach, while]

Or File::Find
perldoc File::Basename

You certainly don't want to use an array for a single extension
(and you don't seem to need to keep all the data at once)


perldoc -f open
perldoc -f readline
perldoc -f read
perldoc -f close

File::Slurp (you probably have to install that one, see ppm)
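
If installing File::Slurp is not an option, reading a whole file into one scalar can also be sketched with core Perl only. The helper name slurp() below is made up for this example and is not part of any module.

```perl
use strict;
use warnings;

# slurp() is a hypothetical helper name used only for illustration.
# It reads the whole file at $path into a single scalar.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Couldn't open '$path': $!\n";
    local $/;                # with $/ undefined, one read returns the whole file
    my $content = <$fh>;
    close($fh);
    return $content;
}
```

Using local keeps the change to $/ confined to the sub, so other reads in the program still go line by line.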
 
Alexander Heimann

gnari said:
[snip problem without actual question]
1. open directory..
2. go file by file
3 assign extension of file to a variable @recordNum

$recordNum ?
4 assign contents of file to a variable @content

$content ?
5 then insert content with SQL statement where PK = @recordNum
ditto

6 then do next file until end of directory

sounds good. go for it and let us know how it goes.

gnari
Thanks everyone for all your help. I will let you guys know how it goes.

Have an awesome weekend...
 
Alexander Heimann

Maybe someone can tell me why I am unable to read the file. When I did
each step individually it was working, but I am having trouble putting
it all together.


use File::Basename;
fileparse_set_fstype("MSDOS");

opendir (DIR, "D:/D2") or die "couldn't open directory\n";
while (defined($file = readdir(DIR))) {
    ($name, $dir, $ext) = fileparse($file, '\..*');
    $ext =~s/^\.//;
    print " dir is $dir, name is $name, extension is $ext\n";

    my $input;
    open($input, "<", "$file")
    #or die "Couldn't open file :!\n";
    while(<$input>){
        undef $/;
        $content = <INPUT>;
        print if /of/;

        print $content;
    }
    close($input);
}

closedir DIR;
 
Paul Lalli

maybe someone can tell me why I am unable to read the file when i do
each step individually it was working but i am having trouble putting
it all together..

You've left off at least two vital pieces of information, necessary for
anyone to effectively help you:

1) What is your desired goal and/or output?
2) What is the result / output of the code you tried? (this includes all
errors and warnings that may be printed).

Paul Lalli
 
Alexander Heimann

Paul Lalli said:
You've left off at least two vital pieces of information, necessary for
anyone to effectively help you:

1) What is your desired goal and/or output?
2) What is the result / output of the code you tried? (this includes all
errors and warnings that may be printed).

Paul Lalli
Paul, my apologies.
1) My desired goal and output:
1. open directory
2. go file by file
3. assign extension of file to a variable @ext
4. assign contents of file to a variable $content
5. then insert content with SQL statement where PK
6. then do next file until end of directory
2) I am getting the die error output when trying to read the file. The
code parses the filename fine when I comment out the open file portion.
 
gnari

Alexander Heimann said:
1) My desired goal and output
1. open directory..
2. go file by file
3 assign extension of file to a variable @ext
4 assign contents of file to a variable $content
5 then insert content with SQL statement where PK
6 then do next file until end of directory
2) I am getting the die error output when trying to read the file. The
code parses the filename fine when i comment out the open file portion

you obviously are forgetting the directory part of the
filename when opening it

gnari
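
To illustrate the point about the directory part: readdir returns bare entry names, not paths, so each name has to be joined with the directory before open can find the file. A minimal sketch, where entry_paths() is a made-up helper name:

```perl
use strict;
use warnings;
use File::Spec;

# entry_paths() is a hypothetical helper name used only for illustration.
# It returns every entry of $dir with the directory part prepended.
sub entry_paths {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Couldn't open directory '$dir': $!\n";
    my @paths;
    while (defined(my $file = readdir($dh))) {
        # $file is only the name (e.g. "desc.121655"), relative to $dir,
        # so open($fh, '<', $file) would look in the current directory instead.
        push @paths, File::Spec->catfile($dir, $file);
    }
    closedir($dh);
    return @paths;
}
```

Plain string interpolation such as "$dir/$file" does the same job; File::Spec just picks the separator for the current platform.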
 
Alexander Heimann

gnari said:
you obviously are forgetting the directory part of the
filename when opening it

gnari
gnari, in the code below
I am using $file as the filename to open. If I comment out the open file
and read content portion, I am able to parse the file name.
Is there a reason I can't use $file again?



use File::Basename;
fileparse_set_fstype("MSDOS");

opendir (DIR, "D:/D2") or die "couldn't open directory\n";
while (defined($file = readdir(DIR))) {
    ($name, $dir, $ext) = fileparse($file, '\..*');
    $ext =~s/^\.//;
    print " dir is $dir, name is $name, extension is $ext\n";

    my $input;
    open($input, "<", "$file")
    or die "Couldn't open file :!\n";
    while(<$input>){
        undef $/;
        $content = <INPUT>;
        print if /of/;

        print $content;
    }
    close($input);
 
gnari

Alexander Heimann said:
gnari in the code below
I am using $file as the filename to open, if i comment out the open file
and read content portion i am able to parse the the file
is there a reason i can't use $file again

[snip code where OP is forgetting the directory part]
opendir (DIR, "D:/D2") or die "couldn't open directory\n";

here 'D:/D2' is the directory, you are reading. call this
the 'directory part'
print " dir is $dir, name is $name, extension is $ext\n";

here is your problem. your stupid debugging. why print
a bunch of variables that have nothing to do with the problem?
they are not used in the open
open($input, "<", "$file")
or die "Couldn't open file :!\n";

always include the filename in the die()
or die "Couldn't open file '$file': $!\n";

if you had done this you would have seen no
directory part ('D:/D2')

gnari
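
A small sketch of what a more informative failure message could look like. Note that ":!" inside a string is printed literally; the system error lives in the variable $!. The helper name open_error() and the path used below are made up for this example.

```perl
use strict;
use warnings;

# open_error() is a hypothetical helper name used only for illustration.
# It returns an empty string if $path can be opened for reading,
# otherwise a message naming the file and the system error from $!.
sub open_error {
    my ($path) = @_;
    if (open(my $fh, '<', $path)) {
        close($fh);
        return '';
    }
    return "Couldn't open file '$path': $!";
}

# The path here is an invented example of a file that does not exist.
print open_error('no-such-dir/desc.121655'), "\n";
```

With the file name in the message, a missing directory part shows up immediately in the output.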
 
Alexander Heimann

gnari,
I added a variable for the directory part. When I took the die() out
of the open file it worked OK, but when the die was in there it
didn't. For some reason the open file was trying to read two files with no
filenames in the directory. I don't see those files, and I am not really
sure why that is happening. So the open file wouldn't work because
there was no name.

Anyway, it is working without the die, and I think I will be able to
use this now and use the extension and the content of the file
variable in my SQL insert statement. The reason I am printing things out
(stupid error checking) is to make sure the variables are holding the
correct values to later use in a SQL statement.



use File::Basename;

fileparse_set_fstype("MSDOS");

$mydir = "D:/D2";
opendir (DIR, $mydir) || die "couldn't opendir $mydir: $!\n";
while ($file = readdir(DIR)) {
    ($name, $dir, $ext) = fileparse($file, '\..*');
    $ext =~s/^\.//;
    print "extension is $ext\n";

    open($input, "<$mydir/$file "); #|| die "couldn't open $mydir/$file for reading :!\n";
    while(<$input>){
        undef $/;
        $content = <INPUT>;
        print if /of/;
        print $content;
    }
    close($input);
}

closedir (DIR);
 
gnari

[using readdir and open]

some of the entries returned by readdir may not be readable files,
for example directories such as '.' and '..'

gnari
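
Those two "files with no filenames" are the '.' and '..' entries that readdir returns along with everything else. A minimal sketch of filtering them out, where real_entries() is a made-up helper name:

```perl
use strict;
use warnings;

# real_entries() is a hypothetical helper name used only for illustration.
# It returns the names in $dir, leaving out the '.' and '..' entries.
sub real_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Couldn't open directory '$dir': $!\n";
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return @names;
}
```

This only removes the two special entries; any other subdirectory would still be in the list.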
 
Alexander Heimann

If anyone is interested, I am pasting the code that worked for the
above problem. The main problem I was having was that when I was
reading the directory I forgot to add
next if $file =~/^\.\.?$/;
after the while (defined($file = readdir(DIR))) to skip over the . and .. entries.

After that I stopped getting crazy errors.

Thanks for all your help, guys. I will try to contribute as much as I
can; I have only been playing with Perl for a week now.

# modules used
use DBI;
use File::Basename;
use File::Slurp;

# create connection to database
$dbh = DBI->connect('dbi:mysql:$dbname:localhost:3306',
    '$username','$password8',
    { RaiseError => 1, AutoCommit => 1});
fileparse_set_fstype("MSDOS");

# open directory, loop while there is still a file
$mydir = "D:/desc";
opendir (DIR, $mydir) || die "couldn't opendir $mydir: $!\n";
while (defined($file = readdir(DIR))) {
    next if $file =~/^\.\.?$/;

    # read file and assign its content to variable $content
    my $content = read_file("$mydir/$file");

    # parse file name and assign extension to $ext variable
    ($name, $dir, $ext) = fileparse($file, '\..*');
    $ext =~s/^\.//;

    # SQL insert statement to insert $ext and $content into DB
    # prepare then execute
    $sth = $dbh->prepare("INSERT INTO `desc` VALUES (?,?)");
    $sth->execute ( $ext, $content );
}
# close directory
closedir(DIR);
# disconnect from database
$dbh->disconnect();
 
Joe Smith

Alexander said:
If anyone is interested. I am pasting the code that worked for the
above problem. The main problem i was having was that when I was
reading the directory i forgot to add
next if $file =~/^\.\.?$/; after the while (defined($file =
readdir(DIR))) to skip over the .

But what if someone creates a subdirectory in D:/D2?
The check on /^\.\.?$/ is for the cases where files and subdirectories
will both be processed.

In your case, it is more robust to use
next unless -f "$mydir/$file";
to skip anything that is not a plain file (which will skip '.' and '..').

-Joe
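
A minimal sketch of the loop with the -f test in place, using only core modules and leaving the database out. The helper name collect_records() is made up for this example, and it assumes the record number is everything after the first dot of the file name.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# collect_records() is a hypothetical helper name used only for illustration.
# It returns a list of [record number, content] pairs, one per plain file.
sub collect_records {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Couldn't open directory '$dir': $!\n";
    my @records;
    while (defined(my $file = readdir($dh))) {
        my $path = "$dir/$file";
        next unless -f $path;    # skips '.', '..' and any subdirectory

        # Record number: everything after the first dot of the file name.
        my (undef, undef, $ext) = fileparse($file, qr/\..*/);
        $ext =~ s/^\.//;

        # Read the whole file into one scalar.
        open(my $fh, '<', $path) or die "Couldn't open '$path': $!\n";
        my $content = do { local $/; <$fh> };
        close($fh);

        push @records, [ $ext, $content ];
    }
    closedir($dh);
    return @records;
}
```

Each pair could then be handed to $sth->execute($ext, $content), with the statement prepared once before the loop rather than once per file.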
 
