Rename File Using String Found in File?

  • Thread starter He Who Greets With Fire
  • Start date

He Who Greets With Fire

I am trying to write a little script to access many files in folder,
parse each file and then if a certain string is found, rename the file
using a substring of that found string.

OK, I have posted here before many many years ago (around 2001), back
when I did some perl programming. I even wrote a program as my senior
project to parse financial news stories and assign values to the
stories based on whether there were negative or positive words in the
news stories.

Some people here helped me with that program, and when I finished that
project I posted the code to the web.

Now I need some more help. :)

I have not programmed in a long time. I know perl has changed
somewhat. I have downloaded the latest activestate win32 perl and
installed it.

I have a file directory named E:/personalinjury. In the file directory
are 821 files named from 1.htm to 821.htm

I want to access each file in turn, and use a regex to parse the file
contents to see if a string similar to this one is found in it:
Citation: 20-333 Dorsaneo, Texas Litigation Guide § 333.103
Some files will not have a string similar to the above string. I am
not interested in renaming those files.

If the string above is found, the numbers 20-333 and 333.103 will be
the ones that vary from file to file. All the words in the string
above and the section symbol will remain the same from file to file.
So another string I might find might be:
20-332 Dorsaneo, Texas Litigation Guide § 332.107

I am interested in that string of numbers at the end; in the examples
above, it is 333.103 and 332.107, but there are many other variations.

So, I want to rename that file to 333.103 from whatever it was before
(e.g., so I would rename the file from 1.htm or 5.htm or 200.htm etc
to 333.103.htm or 333.105.htm or 332.203.htm or whatever).

So, my script should strip off that string of digits at the end,
including the decimal point, and rename the file using that string of
digits.
Anyone got any ideas?

thx
 

Tad J McClellan

He Who Greets With Fire said:
Oops! it should be a BACKslash: E:\personalinjury


Slashes that lean either way will work fine if there is no
"shell" involved.
 

Tad J McClellan

He Who Greets With Fire said:
I have a file directory named E:/personalinjury. In the file directory
are 821 files named from 1.htm to 821.htm

I want to access each file in turn,


foreach my $file ( glob 'E:/personalinjury/*.htm' ) { # untested

and use a regex to parse the file
contents to see if a string similar to this one is found in it:
Citation: 20-333 Dorsaneo, Texas Litigation Guide § 333.103

open my $PI, '<', $file or die "could not open '$file' $!";
while ( <$PI> ) {
next unless /Citation: [\d-]+.*([\d.]+)/;
my $newfile = $1;

So, I want to rename that file to 333.103 from whatever it was before


rename $file, "$newfile.htm" or die "could not mv '$file' $!";
last;
}
close $PI;
 

He Who Greets With Fire

OK, thanks, but the script does not seem to rename the files.
I added some troubleshooting code, most of which I commented out. I
also moved a copy of the personalinjury folder and all its files
inside the C:\Perl directory so it can access it directly.


See below for my additional comments.


#!/bin/perl


#sleep 2;
print "here I am! \n";
#sleep 2;
my $counter =1;

foreach my $file ( glob 'personalinjury/*.htm' ) {

# print "here I am A \n";
# sleep 1;

open my $PI, '<', $file or die "could not open '$file' $!";

# print "here I am! B \n";
# sleep 1;

print $counter;
print "\n";
while ( <$PI> ) {
# print "\n inside whileloop";

I AM getting to this point here.

next unless /Citation: [\d-]+.*([\d.]+)/;

but I never get to this point here--apparently the regex never sees a
match for the "Citation:" etc string.

Here is a screen shot of the typical file, with a red arrow pointing
to the string in this particular file that I want to match.
I do not know why the regex does not see a match, because it looks
like it matches it???

See here:
http://img225.imageshack.us/img225/91/citationue2.jpg

my $newfile = $1;
rename $file, "$newfile.htm" or die "could not mv '$file' $!";
print "\n renamed a file";
sleep 1;
last;
}#end while

$counter++;
print "\n count is ";
print $counter;
print "\n";
#sleep 1;

close $PI;
} #end foreach



I think the script would work ok except that it never sees a match for
the regex pattern inside the file. I am seeing the script go through
each substring of all 821 files, but it never sees a match.
 

He Who Greets With Fire

next unless /Citation: [\d-]+.*([\d.]+)/;

I think it has to be something to do with the colon or the white
spaces between the colon and the first of the digits. Is the colon a
special character in perl? One white space is in the regex, but there
appears to be two white spaces in the screen shot of the file I linked
to above....
 

Josef Moellers

He said:
next unless /Citation: [\d-]+.*([\d.]+)/;

I think it has to be something to do with the colon or the white
spaces between the colon and the first of the digits. Is the colon a
special character in perl? One white space is in the regex, but there
appears to be two white spaces in the screen shot of the file I linked
to above....

I usually replace any white space to be matched by "\s+". That catches
TABs *and* blanks, so maybe
next unless /Citation:\s+[\d-]+.*([\d.]+)/;
will do?
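
One caveat with this pattern, separate from the whitespace question: `.*` is greedy, so the capture group at the end is left with only the final character. A minimal sketch, using the sample line from the first post with the section sign left out to avoid encoding questions. The anchored pattern with `\d+\.\d+` is a suggestion made here, not something proposed in the thread:

```perl
use strict;
use warnings;

# Sample line from the first post (section sign omitted for simplicity).
my $line = "Citation:  20-333 Dorsaneo, Texas Litigation Guide 333.103";

# Greedy .* swallows the rest of the line, then backs off one character,
# so the capture group only receives the final digit.
my ($greedy) = $line =~ /Citation:\s+[\d-]+.*([\d.]+)/;

# Requiring "digits, dot, digits" at the end of the line captures the
# whole section number instead.
my ($anchored) = $line =~ /Citation:\s+[\d-]+\s.*?(\d+\.\d+)\s*$/;

print "greedy:   $greedy\n";    # prints "greedy:   3"
print "anchored: $anchored\n";  # prints "anchored: 333.103"
```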
 

Ben Morrow

Quoth He Who Greets With Fire said:
OK, thanks, but the script does not seem to rename the files.
I added some troubleshooting code, most of which I commented out. I
also moved a copy of the personalinjury folder and all its files
inside the C:\Perl directory so it can access it directly.

Don't do that. You can set the working directory from within your Perl
script using the chdir function. In any case, the working directory may
not be what you expect under Win32.
See below for my additional comments.

#!/bin/perl

Perl is *never* installed as /bin/perl.
#sleep 2;
print "here I am! \n";

Diagnostics like this are better given with warn, which will (a) print
them to STDERR, where they ought to be, and (b) tell you where you are in
the script.
#sleep 2;
my $counter =1;

foreach my $file ( glob 'personalinjury/*.htm' ) {

# print "here I am A \n";
# sleep 1;

open my $PI, '<', $file or die "could not open '$file' $!";

# print "here I am! B \n";
# sleep 1;

print $counter;
print "\n";
while ( <$PI> ) {
# print "\n inside whileloop";

I AM getting to this point here.

next unless /Citation: [\d-]+.*([\d.]+)/;

but I never get to this point here--apparently the regex never sees a
match for the "Citation:" etc string.

Here is a screen shot of the typical file, with a red arrow pointing
to the string in this particular file that I want to match.
I do not know why the regex does not see a match, because it looks
like it matches it???

See here:
http://img225.imageshack.us/img225/91/citationue2.jpg

*DON'T* do that. Had you done the right thing, and copy-pasted a small
section of the relevant file into your message, you would have found
that the file doesn't in fact contain the string 'Citation: whatever' at
all. It's an HTML file, so there is markup in there as well, and the
string may well be spread across several lines. Get into the habit of
looking at files in a text editor before you try parsing them with Perl.
my $newfile = $1;
rename $file, "$newfile.htm" or die "could not mv '$file' $!";
print "\n renamed a file";
sleep 1;
last;
}#end while

If you had used proper indentation, you would be able to see that
comments like this are completely useless.

Ben
 

He Who Greets With Fire

Don't do that. You can set the working directory from within your Perl
script using the chdir function. In any case, the working directory may
not be what you expect under Win32.

the directory/folder location is not a problem. Like I said, the
script is indeed able to access the files in the folder and open them
and increment through them. So, everything seems to be OK on that
front.


Perl is *never* installed as /bin/perl.

but it already works in that regard--the script executes


Diagnostics like this are better given with warn, which will (a) print
them to STDERR, where they ought to be, and (b) tell you where you are in
the script.

Well, I'm not actually a programmer, just someone trying to do some
organization of my files. So, that issue is not a concern right now.


#sleep 2;
my $counter =1;

foreach my $file ( glob 'personalinjury/*.htm' ) {

# print "here I am A \n";
# sleep 1;

open my $PI, '<', $file or die "could not open '$file' $!";

# print "here I am! B \n";
# sleep 1;

print $counter;
print "\n";
while ( <$PI> ) {
# print "\n inside whileloop";

I AM getting to this point here.

next unless /Citation: [\d-]+.*([\d.]+)/;

but I never get to this point here--apparently the regex never sees a
match for the "Citation:" etc string.

Here is a screen shot of the typical file, with a red arrow pointing
to the string in this particular file that I want to match.
I do not know why the regex does not see a match, because it looks
like it matches it???

See here:
http://img225.imageshack.us/img225/91/citationue2.jpg

*DON'T* do that.

Don't do what?
Had you done the right thing, and copy-pasted a small
section of the relevant file into your message, you would have found
that the file doesn't in fact contain the string 'Citation: whatever' at
all. It's an HTML file, so there is markup in there as well, and the
string may well be spread across several lines. Get into the habit of
looking at files in a text editor before you try parsing them with Perl.

That is a good point. When I wrote my financial news project that
parsed news stories for negative and positive words, I passed over all
words that were surrounded by html brackets.
Here are two excerpts from the source html for a typical file in that
folder:


here is the html source snippet that the script is looking for:
<td class="toolbar" align=right valign=top width="1%"
nowrap>Citation:&nbsp;&nbsp;</td>
<td class="toolbar" valign=top width="99%"><b>21-340 Dorsaneo, Texas
Litigation Guide § 340.02</b></td>


Yes, you are correct: the HTML code is throwing off the script.

Here is another snippet that looks much more promising. The TITLE of
the html page. This is not the instance of "citation....etc" that I
was looking for, but now that I see it, it looks like a good candidate
for use as a filename:

<title>Get a Document - by Citation - 21-340 Dorsaneo, Texas
Litigation Guide § 340.02</title>

Are the angle brackets special characters in perl so that they have to
be backslashed inside the regex?

I wonder if this regex would work?
next unless /\<title\>Get a Document - by Citation -
[\d-]+.*([\d.]+)\<\/title\>/;






If you had used proper indentation, you would be able to see that
comments like this are completely useless.

Not sure what you mean?
 

He Who Greets With Fire

Here is another snippet that looks much more promising. The TITLE of
the html page. This is not the instance of "citation....etc" that I
was looking for, but now that I see it, it looks like a good candidate
for use as a filename:

<title>Get a Document - by Citation - 21-340 Dorsaneo, Texas
Litigation Guide § 340.02</title>

Are the angle brackets special characters in perl so that they have to
be backslashed inside the regex?

I wonder if this regex would work?
next unless /\<title\>Get a Document - by Citation -
[\d-]+.*([\d.]+)\<\/title\>/;



well, I modified it by adding backslashes in front of the dashes like
so:
next unless /\<title\>Get a Document \- by Citation \-
[\d-]+.*([\d.]+)\<\/title\>/;

But it still does not work. Again, it does seem to cycle through all
the files, but nothing matches.
 

He Who Greets With Fire

well, I changed the program quite a bit so as to be able to target a
match with the title string shown above. And I am able to find the
title string and extract the needed numbers, and I have been able to
place those numbers in a string variable.

BUT the problem is that the program crashes whenever I try to rename
the file using the string that I extracted.

Here is the program.



#!/bin/perl


#sleep 2;
print "here I am! \n";
sleep 2;
my $counter =1;
foreach my $file ( glob 'personalinjury/*.htm' ) {
# print "here I am A \n";
sleep 1;

open my $PI, '<', $file or die "could not open '$file' $!";
# print "here I am! B \n";
# sleep 1;

print $counter;
print "\n";
while ( <$PI> ) {
print "\n inside whileloop";
sleep 1;
#<title>Get a Document - by Citation - 21-340 Dorsaneo, Texas
#Litigation Guide § 340.02</title>

warn;
next unless /\<title\>.+Guide\s+§\s+(\d+\.\d+).?\<\/title\>/;
my $newfile = $+;






this rename line below is what causes it to crash, so i commented it
out:
#rename $file, "$newfile.htm" or die "could not mv '$file' $!";

But I cannot read what the error message says because the dos window
just closes. Where can I read what was in the window before it
crashed? And how can I rename the file? What went wrong with the
renaming?

The $newfile variable DOES contain the accurate and desired
information at this point, as shown by the print statement below.
print "\n renamed file to ";
print $newfile, "\n";
sleep 1;
last;
}#end while

$counter++;
print "\n count is ";
print $counter;
print "\n";
#sleep 1;

close $PI;
} #end foreach
sleep 5;
 

Tad J McClellan

He Who Greets With Fire said:
^^^^^^^
^^^^^^^
Eh?


the directory/folder location is not a problem.


Exactly so. That is why you should not do that.

Your "current working directory" and the directory that your perl
executable are in are not the same thing.

The directory where your perl binary lives has no connection
whatsoever to accessing files so moving files under there
will not solve file accessing problems.

Your cwd is what matters with regard to filesystem access.

but it already works in that regard--the script executes


The fact that your script executes does not prove that perl
is installed as /bin/perl.

(Windows programs use some other mechanism for associating files).

You should either use a place where perl is usually installed, eg:

#!/usr/bin/perl

or simply

#!perl

if you want to use command line switches, or even

(nothing)

leave that line out completely.



Why do you think that calling sleep() will help with debugging?

Well, I'm not actually a programmer,


You will need to become a bit of a programmer if you hope to
write a bit of a program.

just someone trying to do some
organization of my files.


If you need some programming done, and you want to do it yourself,
then you are going to have to learn some programming.

So, that issue is not a concern right now.


so the issue of how to debug programs should be of ultimate concern
right now, since now is when you have a program that you need to debug!

Error and warning messages should go on STDERR, not STDOUT.



The text of the debugging message should tell you where it is in
the program rather than requiring you to search in the program
to find where it is. It also gives you a chance to examine the
data that you are operating on:

warn "processing '$file' inside the foreach loop\n";



warn "succeeded in opening $file\n";



warn "processing '$_' inside the while loop\n";

I AM getting to this point here.

next unless /Citation: [\d-]+.*([\d.]+)/;

but I never get to this point here--apparently the regex never sees a
match for the "Citation:" etc string.


Then you should modify the regex so that it sees a match for
the "Citation:" etc string.

To do that, you need to know *exactly* what the data looks like,
and you probably need to read some of the standard documentation
that covers regexes.

Don't do what?


Don't post a screenshot. Your program is processing text, not graphics.

Don't post a URL and expect people to go follow it to find out what
you are talking about.



Do post a copy-pasted section of the data into your message.

That is a good point.


Well duh.

When crafting a regular expression, it is *essential* to know *exactly*
what the data you are trying to match looks like.

Here are two excerpts from the source html for a typical file in that
folder:


<td class="toolbar" align=right valign=top width="1%"
nowrap>Citation:&nbsp;&nbsp;</td>
<td class="toolbar" valign=top width="99%"><b>21-340 Dorsaneo, Texas
Litigation Guide § 340.02</b></td>


If the data you want is in an HTML table, then you should use
a module that will process an HTML table for you, such
as HTML::TableExtract.

Are the angle brackets special characters in perl so that they have to
be backslashed inside the regex?


Yes, angle brackets are special characters in Perl, they mean
"less than" and "greater than" and whatnot.

No, angle brackets are not special characters in a regular
expression, so they do not need to be backslashed.

The Perl Language and the Regular Expression Language are different
languages, so the funny characters mean different things depending
on which language you are in.
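
A tiny check of that point, as a sketch: the angle brackets go into the pattern unescaped, and the only character that needs attention is the `/` in the closing tag, because it is also the default pattern delimiter. The sample string is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the real files.
my $html = '<title>Get a Document - 340.02</title>';

# Angle brackets and hyphens are literal in a regex. Using m{...} as the
# delimiter means the "/" in </title> needs no backslash either.
my ($number) = $html =~ m{<title>.*?(\d+\.\d+)</title>};

print "found: $number\n";   # prints "found: 340.02"
```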

I wonder if this regex would work?


The way to answer that is to write a teeny tiny program
and *see* for yourself if it works or not.

Not sure what you mean?


Me either.
 

Ben Morrow

[please trim your quotations]


The fact you don't consider yourself a programmer is irrelevant. You are
writing a program, and it will make your life easier if you do it
properly.

Don't take screenshots of HTML files; post a sample of the text instead.


Is the <title> element spread across two lines in the HTML file? If so, then reading
the file line-by-line with while (<>) will never give you a string that
matches your regex. You would be better off reading the whole file with
File::Slurp.
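
A sketch of that idea using only core Perl rather than File::Slurp: setting `$/` to undef makes a single read return the whole file, so a title that wraps across lines is still one string. The helper names `slurp` and `section_from_html` are made up here, and `\S+` stands in for the section sign so the pattern does not depend on the file's encoding:

```perl
use strict;
use warnings;

# Read an entire file into one string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "could not open '$path': $!";
    local $/;                 # undef: read to end of file in one go
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Return the section number from the <title> element, or undef if absent.
# \s+ matches line breaks too, so a wrapped title still matches.
sub section_from_html {
    my ($html) = @_;
    return $html =~ m{<title>.*?Guide\s+\S+\s+(\d+\.\d+)\s*</title>}s
        ? $1
        : undef;
}

# Example: my $section = section_from_html( slurp('personalinjury/1.htm') );
```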

No, angle brackets are not special in a regex. See perldoc perlreref: it lists all the special characters.


If you write your code like this

while (<>) {
#lots of code
#lots of code
#lots of code
#lots of code
}

then there is no need for any '#end while' comments: you can see it's
the end of the while from the indentation. Any half-decent editor will
find matching braces for you, as well. The comment just becomes noise
that obscures the important bits of the code.
well, I changed the program quite a bit so as to be able to target a
match with the title string shown above. And I am able to find the
title string and extract the needed numbers, and I have been able to
place those numbers in a string variable.

BUT the problem is that the program crashes whenever I try to rename
the file using the string that I extracted.

I seriously doubt it 'crashes'. That would be a serious bug in perl.
More likely, the rename fails for some reason and the program exits with
an error.
this rename line below is what causes it to crash, so i commented it
out:
#rename $file, "$newfile.htm" or die "could not mv '$file' $!";

But I cannot read what the error message says because the dos window
just closes. Where can I read what was in the window before it
crashed?

Open a cmd window yourself (Start / Run / cmd), cd into the appropriate
directory and run the script yourself with 'perl script.pl'. Then the
window won't go away.
And how can I rename the file? What went wrong with the
renaming?

No one can tell that from here until you can see what the error message
said.

Ben
 

Martijn Lievaart

If you write your code like this

while (<>) {
#lots of code
#lots of code
#lots of code
#lots of code
}

then there is no need for any '#end while' comments: you can see it's
the end of the while from the indentation. Any half-decent editor will
find matching braces for you, as well. The comment just becomes noise
that obscures the important bits of the code.

Also, if it really is "lots of code" you should put that code in subs.
The while loop instantly becomes much more readable:

while (<>) {
do_this();
do_that(param, param);
if (check_something(param)) {
log_error();
last;
}
remainder_of_processing();
}

HTH,
M4
 

ccc31807

I am trying to write a little script to access many files in folder,
parse each file and then if a certain string is found, rename the file
using a substring of that found string.

It's important that you follow a methodology that corrects itself at
each step of the way. If you could post a sample of a file that you
want to look at, it would be easier to see what you want to do. Also,
the format of the file is important. I assume that you want to read
ASCII files.

The first step would be as follows. I would recommend a very small
subset of files in the beginning, one having the string you want and
one not.

1. begin your file examination loop that iterates through all files
2. open each file (in turn)
3. print each line
4. close each file (in turn)
5. end the loop.

When you run this, you can redirect the output to a text file for your
convenience. This will show you EXACTLY what Perl sees and will match
to your regular expression. It will also form the logic for your
program. Once you get this working to your satisfaction, you can start
to match your regular expression.
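
The five steps above can be sketched roughly as follows. The sub name `dump_lines` and the file-name and line-number prefix are additions for illustration, and the directory is passed on the command line rather than hard-coded:

```perl
use strict;
use warnings;

# Print every line of every .htm file in $dir to the handle $out,
# prefixed with the file name and line number.
sub dump_lines {
    my ($dir, $out) = @_;
    foreach my $file ( glob "$dir/*.htm" ) {                 # 1. begin loop
        open my $fh, '<', $file                              # 2. open file
            or die "could not open '$file': $!";
        while ( my $line = <$fh> ) {
            print {$out} "$file:$.: $line";                  # 3. print line
        }
        close $fh;                                           # 4. close file
    }                                                        # 5. end loop
}

# Runs only when a directory is given on the command line.
dump_lines($ARGV[0], \*STDOUT) if @ARGV;
```

Saved as, say, dump.pl (a hypothetical name), it could be run as `perl dump.pl personalinjury > lines.txt` to capture the output in a text file.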
I have a file directory named E:/personalinjury. In the file directory
are 821 files named from 1.htm to 821.htm

You want to run your script from this directory.
I want to access each file in turn, and use a regex to parse the file
contents to see if a string similar to this one is found in it:
Citation: 20-333 Dorsaneo, Texas Litigation Guide § 333.103
Some files will not have a string similar to the above string. I am
not interested in renaming those files.

What I would do (for starters, anyway) is this. Create a $counter.
Search each line for the string '333.nnn '. That is, a literal of two
3s followed by a digit followed by a literal period followed by three
digits and a space. If that string is found, rename the file like
this: 'TLG__33n_nnn_${counter}.txt' Obviously, if this string can be
found over multiple lines, you will have to fine tune your regex, but
I doubt that you will ever have a line break dividing a section
number, and I also doubt that you will have many false positives.
Anyone got any ideas?

Yeah, do it a piece at a time and make sure the prior part works
perfectly before you take the next step. You don't want to do this
project in one fell swoop unless you have plenty of practice.

Also, post a piece of your source file so we can see what it looks
like.

CC
 

He Who Greets With Fire

He Who Greets With Fire said:
I have a file directory named E:/personalinjury. In the file directory
are 821 files named from 1.htm to 821.htm

I want to access each file in turn,


foreach my $file ( glob 'E:/personalinjury/*.htm' ) { # untested

and use a regex to parse the file
contents to see if a string similar to this one is found in it:
Citation: 20-333 Dorsaneo, Texas Litigation Guide § 333.103

open my $PI, '<', $file or die "could not open '$file' $!";
while ( <$PI> ) {
next unless /Citation: [\d-]+.*([\d.]+)/;
my $newfile = $1;

So, I want to rename that file to 333.103 from whatever it was before


rename $file, "$newfile.htm" or die "could not mv '$file' $!";
last;
}
close $PI;


I have solved all the problems and have created a working script to
accomplish the task I needed to do. There are however some problems
with the code you posted above here. For one, the rename() function
takes string values as arguments, not file handles. A file handle is a
pointer, and as such, its value is a numerical value representing an
address in RAM memory, not a string value. The $file variable you used
above as the first argument to rename() is a file handle, not a string
value. Second, you cannot rename a file that is presently open for
reading. Above, you closed the file later after you tried to rename
it. You should have closed it before you tried to rename it.




--He Who Greets With Fire
 

John W. Krahn

He said:
He Who Greets With Fire said:
I have a file directory named E:/personalinjury. In the file directory
are 821 files named from 1.htm to 821.htm

I want to access each file in turn,

foreach my $file ( glob 'E:/personalinjury/*.htm' ) { # untested

and use a regex to parse the file
contents to see if a string similar to this one is found in it:
Citation: 20-333 Dorsaneo, Texas Litigation Guide § 333.103
open my $PI, '<', $file or die "could not open '$file' $!";
while ( <$PI> ) {
next unless /Citation: [\d-]+.*([\d.]+)/;
my $newfile = $1;

So, I want to rename that file to 333.103 from whatever it was before

rename $file, "$newfile.htm" or die "could not mv '$file' $!";
last;
}
close $PI;

I have solved all the problems and have created a working script to
accomplish the task I needed to do. There are however some problems
with the code you posted above here. For one, the rename() function
takes string values as arguments, not file handles.

Yes, it renames files using (surprise) the names of files.
A file handle is a pointer,

Not in Perl, Perl doesn't have pointers.
and as such, its value is a numerical value representing an
address in RAM memory, not a string value. The $file variable you used
above as the first argument to rename() is a file handle, not a string
value.

The only filehandle in the code above is $PI. $file is a file name
obtained from "glob 'E:/personalinjury/*.htm'".
Second, you cannot rename a file that is presently open for
reading. Above, you closed the file later after you tried to rename
it. You should have closed it before you tried to rename it.

Only on Windows. Other operating systems allow a file to be renamed
whether or not it is opened.
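
One way to sidestep the Windows restriction, as a sketch: read the file, close the handle, and only then rename. This version also builds the new name in the file's own directory, since a bare "$newfile.htm" would land in the current working directory. The sub name `rename_by_title` and the title pattern are assumptions based on the <title> snippet posted earlier in the thread, and `\S+` stands in for the section sign:

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;

# Rename $file to "<section>.htm" in the same directory, where <section>
# comes from the <title> element. Returns the new path, or undef when the
# file has no matching title or the target name already exists.
sub rename_by_title {
    my ($file) = @_;

    open my $fh, '<', $file or die "could not open '$file': $!";
    my $html = do { local $/; <$fh> };   # read the whole file at once
    close $fh;                           # close first: required on Windows

    return unless defined $html
        and $html =~ m{<title>.*?Guide\s+\S+\s+(\d+\.\d+)\s*</title>}s;

    my $target = File::Spec->catfile( dirname($file), "$1.htm" );
    return if -e $target;                # do not overwrite another file
    rename $file, $target
        or die "could not rename '$file' to '$target': $!";
    return $target;
}

# Example: rename_by_title($_) for glob 'E:/personalinjury/*.htm';
```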


John
 

Tad J McClellan

He Who Greets With Fire said:
He Who Greets With Fire said:
I have a file directory named E:/personalinjury. In the file directory
are 821 files named from 1.htm to 821.htm

I want to access each file in turn,


foreach my $file ( glob 'E:/personalinjury/*.htm' ) { # untested

and use a regex to parse the file
contents to see if a string similar to this one is found in it:
Citation: 20-333 Dorsaneo, Texas Litigation Guide § 333.103

open my $PI, '<', $file or die "could not open '$file' $!";
while ( <$PI> ) {
next unless /Citation: [\d-]+.*([\d.]+)/;
my $newfile = $1;

So, I want to rename that file to 333.103 from whatever it was before


rename $file, "$newfile.htm" or die "could not mv '$file' $!";
last;
}
close $PI;


I have solved all the problems and have created a working script to
accomplish the task I needed to do. There are however some problems
with the code you posted above here. For one, the rename() function
takes string values as arguments, not file handles.

Right.


A file handle is a
pointer,

Wrong.


and as such, its value is a numerical value representing an
address in RAM memory, not a string value. The $file variable you used
above as the first argument to rename() is a file handle, not a string


No, $file is a string, not a filehandle.

value. Second, you cannot rename a file that is presently open for
reading.


Yes I can.

Above, you closed the file later after you tried to rename
it.


Works fine on most sensible filesystems.

You should have closed it before you tried to rename it.


Not necessary on most sensible filesystems.
 

Tad J McClellan

Tad J McClellan said:


Oh. I think I see what happened there.

A filename glob (perldoc -f glob) is not the same as a "typeglob"
("Typeglobs and Filehandles" section in perldoc perldata).
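
The difference can be seen directly, as a small sketch: glob hands back plain strings, while a lexical filehandle from open is a reference to a typeglob. Opening the null device here is just a convenient way to get a handle without depending on any particular file:

```perl
use strict;
use warnings;
use File::Spec;

# glob returns ordinary strings (file names); take the first entry, if any,
# from the current directory.
my ($name) = glob '*';
print "glob gave a plain string\n" if defined $name and not ref $name;

# A lexical filehandle from open is a reference to a typeglob.
my $null = File::Spec->devnull;
open my $fh, '<', $null or die "could not open '$null': $!";
print "filehandle ref type: ", ref($fh), "\n";   # prints "filehandle ref type: GLOB"
close $fh;
```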
 

Martijn Lievaart

Ben Morrow wrote on VCCXCIX September MCMXCIII:
Perl is *never* installed as /bin/perl.


Bullocks.

Even beside the fact Perl will install itself pretty much everywhere
where the person running Configure tells it to (barring existence of the
directory and permission), there's a major operating system where /bin
and /usr/bin are identical.

Which reminds me of the time I wanted to move /usr. Should be a static
filesystem, so create new slice, copy contents, mv /usr /usr.old and then
just ...... boot from CD to fix the mess.

M4
 
