search and replace in Perl


Dominic Philsby

Hi, I'm using Perl to do simple text search & replace within a text
file. The Perl version, sample file, and command-line syntax I am using
are shown below.


C:\test>type file.txt
the quick brown cow jumps over the lazy horse

C:\test>perl -p -e "s/cow/fox/g;s/horse/dog/g" file.txt
the quick brown fox jumps over the lazy dog

C:\test>perl -v

This is perl, v5.6.1 built for MSWin32-x86

Copyright 1987-2001, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using `man perl' or `perldoc perl'. If you have access to the
Internet, point your browser at http://www.perl.com/, the Perl Home Page.


My file.txt contains only one line, but the real files are several
hundred thousand lines long. The words I am changing are not just "cow"
and "horse" but hundreds of words.

I am using Windows.

My question: rather than specifying "s/cow/fox/g;s/horse/dog/g" on the
command line, I want to reference a file containing it. In other words,
I want my command line to reference a text file, let's call it
regexReplace.txt, containing the following:

s/cow/fox/g;
s/horse/dog/g;

Can someone help me out with the syntax or how to do this?
Thank you

Dominic
 

Jochen Lehmeier

Dominic said:
Can someone help me out with the syntax or how to do this?

Pseudo code, untested and possibly incomplete, just to get you started:


open(REGS,"<regs.txt") or die "regs.txt: $!";
my @regs=grep { /\S/ } <REGS>;   # skip blank lines
close(REGS);

open(LINES,"<lines.txt") or die "lines.txt: $!";
while (<LINES>)
{
    my $line=$_;
    foreach my $reg (@regs)
    {
        # HUGE SECURITY RISK IF YOU ARE NOT ABSOLUTELY SURE ABOUT
        # THE CONTENTS OF regs.txt
        eval "\$line =~ $reg";
        warn "regs.txt: bad rule $reg: $@" if $@;
    }
    print $line;
}
close(LINES);


You can implement a better (more elegant, more efficient) solution if,
instead of storing "s/.../.../g" strings in your file and evaluating them
directly, you store only the search and replacement text, parse it, and
precompile regular expression objects outside of the loop
(hint: perldoc perlop, look for "qr").
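
A minimal sketch of that idea follows. The rule format (one "search replacement" pair per line, separated by whitespace) and the sub names compile_rules and apply_rules are assumptions for illustration, not something from this thread. The search text is treated literally via \Q...\E, and the rules are passed in as a list of lines so the sketch runs without any files.

```perl
use strict;
use warnings;

# Hypothetical rule format (an assumption): "search replacement" per line.
# Each search string is compiled once with qr//, outside the main loop.
sub compile_rules {
    my @rules;
    for my $rule_line (@_) {
        my ($search, $replace) = split ' ', $rule_line;
        next unless defined $replace;    # skip blank or malformed lines
        push @rules, [ qr/\Q$search\E/, $replace ];
    }
    return @rules;
}

# Apply every precompiled rule to one line of text and return the result.
sub apply_rules {
    my ($line, @rules) = @_;
    for my $rule (@rules) {
        my ($pattern, $replace) = @$rule;
        $line =~ s/$pattern/$replace/g;
    }
    return $line;
}

my @rules = compile_rules("cow fox\n", "horse dog\n");
print apply_rules("the quick brown cow jumps over the lazy horse\n", @rules);
```

In real use the rule lines would come from a file handle (for example my @rules = compile_rules(<$regs>);) and apply_rules would be called inside the while loop over the data file. Because nothing is passed to eval, a stray line in the rule file cannot run arbitrary code.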

Have fun!
 

John Bokma

Jochen Lehmeier said:
open(REGS,"<regs.txt") or die "regs.txt: $!";

You might want to write:

my $filename = 'regs.txt';
open my $regs, '<', $filename
    or die "Can't open '$filename' for reading: $!";

Jochen Lehmeier said:
while (<LINES>)
{
my $line=$_;

Why not:

while ( my $line = <$fh> ) {

(assuming you did open my $fh ... )
 

Dr.Ruud

Tad said:
I once lost about 3 hours because I did it that way, so let me help
others avoid such a fate...

If you have a pattern that is a prefix of some other pattern, say "cow"
and "cows", then you better do something to ensure that the longer
one is leftmost in your regex's alternation.

I usually just do a sort in descending order:

my $regex = join('|', sort {$b cmp $a} keys %regmap);

Probably should have a 'map +("\\b\Q$_\E\\b"),' inserted as well (double quotes, so that \Q...\E and $_ are interpolated).
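
Putting the two suggestions above together, a single-pass version might look like the sketch below. The contents of %regmap and the sub name replace_words are made up for illustration; only the sort and the quoting idea come from the posts above. Here the word boundaries are placed once around the whole group rather than around each alternative, which should match the same words.

```perl
use strict;
use warnings;

# Hypothetical word map; in practice it would be loaded from a file.
my %regmap = (cow => 'fox', cows => 'foxes', horse => 'dog');

# Descending sort puts "cows" before "cow"; quotemeta makes each key literal.
my $alternation = join '|',
    map { quotemeta($_) }
    sort { $b cmp $a }
    keys %regmap;

# Compile once; \b on both sides keeps "cow" from matching inside "cowboy".
my $regex = qr/\b($alternation)\b/;

# Replace every mapped word in one pass over the line.
sub replace_words {
    my ($line) = @_;
    $line =~ s/$regex/$regmap{$1}/g;
    return $line;
}

print replace_words("two cows and a cow chase the horse\n");
```

With hundreds of words this scans each line once instead of once per word, and a replacement is never fed back into a later rule.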
 

Wanna-Be Sys Admin

Dominic said:
Can someone help me out with the syntax or how to do this?
Thank you

Dominic

So, step through the file line by line (or read it into an array, but
that's usually not preferred if the file is very large) and do what you
want with each line. In outline: open the file, loop over it with while,
and apply s/// to each line.

# Open the data file (file.txt), not the file holding the rules.
open(my $textfile, '<', 'file.txt') or die $!;
while (<$textfile>) {
    s/cow/fox/g;
    s/horse/dog/g;
    print;
}
close($textfile) or warn $!;


Are you looking to modify the file, or just replace the values on output?
The above only changes what is printed; the file itself is untouched.
Also, if the file matters and could be changed by another process while
you're reading or changing it, you will likely want to use file locking.
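
If the goal is to modify the file itself, one possible approach is sketched below: keep the original as a .bak copy and write the changed lines back under the original name, which is roughly what perl's -i.bak switch does. The sub name replace_in_file is hypothetical, the two substitutions are the ones from the question, and no locking is attempted.

```perl
use strict;
use warnings;

# Rewrite $path with the substitutions applied, keeping the
# original contents in "$path.bak". No file locking is done here.
sub replace_in_file {
    my ($path) = @_;
    my $backup = "$path.bak";
    rename $path, $backup
        or die "Can't rename '$path' to '$backup': $!";
    open my $in, '<', $backup
        or die "Can't open '$backup' for reading: $!";
    open my $out, '>', $path
        or die "Can't open '$path' for writing: $!";
    while (my $line = <$in>) {
        $line =~ s/cow/fox/g;
        $line =~ s/horse/dog/g;
        print $out $line;
    }
    close $in;
    close $out or die "Can't close '$path': $!";
}

# Process every file named on the command line.
replace_in_file($_) for @ARGV;
```

Saved as, say, replace.pl (a made-up name), it would be run as "perl replace.pl file.txt", leaving the original text in file.txt.bak.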
 
