How to write a program to ...

  • Thread starter Bart Van der Donck

Bart Van der Donck

gjena1 said:
I'm new to perl. Would someone please give me an example
of a perl program to read from a fileA which has several
datas, and search fileA for string1 and change it to string2
in fileA. string1 is located in various places in each
line. Find the string1 that is located 10 spaces from
the beginning of each line in fileA.
Any suggestions would be greatly greatly appreciated.

This could be an example. Your input file must be readable ($fileA)
and your output file must be writable ($fileB). The script changes
$string1 only where it follows exactly 10 spaces at the beginning of a
line. If you're on Windows then you will probably need to change
#!/usr/bin/perl to Perl's path on your machine.

-------------------------------------
PROGRAM START
-------------------------------------

#!/usr/bin/perl
use strict;
use warnings;

# init
my $fileA = "initialfile.txt";
my $fileB = "alteredfile.txt";
my $string1 = "XXXX";
my $string2 = "YYYY";
my $tenspaces;
my @initialfile;
my @alteredfile;
my $alteredline;
for (1..10) {$tenspaces.=' '}

# read $fileA and do regexp in each line
open (R, "$fileA")||die"$!";
# following code line doesn't always work on Win OS
# I 'ld suggest to uncomment when on unix:
# flock(W,1) || die "Cant get LOCK_SH on file: $!";
while (<R>)
{
push @initialfile, $_;
$alteredline=$_;
$alteredline=~s/^ {10}$string1/$tenspaces$string2/;
push @alteredfile, $alteredline;
}
close R;

# write new content to $fileB
open (W, ">$fileB")||die "$!";
# following code line doesn't always work on Win OS
# I 'ld suggest to uncomment when on unix:
# flock(W,1) || die "Cant get LOCK_SH on file: $!";
print W @alteredfile;
close W;

# small report utility to screen
print "\n------------------\n";
print "Initial file:";
print "\n------------------\n";
print @initialfile;
print "\n------------------\n";
print "Altered file:";
print "\n------------------\n";
print @alteredfile;
 

Uri Guttman

BVdD> # init
BVdD> my $fileA = "initialfile.txt";
BVdD> my $fileB = "alteredfile.txt";
BVdD> my $string1 = "XXXX";
BVdD> my $string2 = "YYYY";
BVdD> my $tenspaces;
BVdD> my @initialfile;
BVdD> my @alteredfile;
BVdD> my $alteredline;

don't declare these before they are needed.

BVdD> for (1..10) {$tenspaces.=' '}

belch.

my $spaces = ' ' x 10 ;


BVdD> # read $fileA and do regexp in each line
BVdD> open (R, "$fileA")||die"$!";

useless use of quotes around $fileA. and put the filename in the error
message (which is one reason to put file names in vars and not hardwire
them in open calls).

BVdD> # following code line doesn't always work on Win OS
BVdD> # I 'ld suggest to uncomment when on unix:
BVdD> # flock(W,1) || die "Cant get LOCK_SH on file: $!";

again, put the file name in the error message

BVdD> while (<R>)
BVdD> {
BVdD> push @initialfile, $_;

why the push of all lines? if you need them, then just slurp them in
first and loop over them later.

BVdD> $alteredline=$_;

why do you copy the line? just operate on $_ if you want.

BVdD> $alteredline=~s/^ {10}$string1/$tenspaces$string2/;

why the $tenspaces? just grab the whitespace and replace it with itself

$alteredline =~ s/^( {10})$string1/$1$string2/;

and put some white space around your operators.


BVdD> push @alteredfile, $alteredline;
BVdD> }
BVdD> close R;

BVdD> # write new content to $fileB
BVdD> open (W, ">$fileB")||die "$!";

this needs quotes but you can use the newer 3 arg open to not need
them. and again, file name should be in the error message

uri
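
Putting these suggestions together, one possible cleaned-up version might look like the sketch below. The sub names replace_at_column_11 and rewrite_file are made up for illustration, and the \Q...\E quoting is an extra precaution that the comments above do not ask for. Treat it as a starting point under those assumptions, not as the definitive rewrite.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace $from with $to only where $from starts
# right after ten spaces at the beginning of the line.
sub replace_at_column_11 {
    my ($line, $from, $to) = @_;
    # Capture the ten spaces and put them back via $1.
    # \Q...\E makes $from match literally, even if it contains regex
    # metacharacters.
    $line =~ s/^( {10})\Q$from\E/$1$to/;
    return $line;
}

# Hypothetical driver: copy $in_name to $out_name, rewriting each line.
sub rewrite_file {
    my ($in_name, $out_name, $from, $to) = @_;

    # Lexical filehandles, 3 arg open, file name in every error message.
    open my $in, '<', $in_name
        or die "Can't open '$in_name' for reading: $!";
    open my $out, '>', $out_name
        or die "Can't open '$out_name' for writing: $!";

    while (my $line = <$in>) {
        print {$out} replace_at_column_11($line, $from, $to);
    }

    close $in;
    close $out or die "Can't close '$out_name': $!";
    return;
}

# Run only when called with exactly two file names.
rewrite_file(@ARGV[0, 1], 'XXXX', 'YYYY') if @ARGV == 2;
```

Saved under a name of your choice, it would be run as: perl yourscript.pl initialfile.txt alteredfile.txt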
 

Michele Dondi

> This could be an example. Your input file must be readable ($fileA) [snip]

Incidentally the shebang line is much like a "hello" and the __END__
token is much like a "bye" (unless you "have DATA"). So there's hardly
any need for a dedicated marker. And if you indent the whole code a
few spaces it will be even more visually distinguished...

> #!/usr/bin/perl
> use strict;
> use warnings;

Good! (And good practice to show this to the OP!!)

IMHO hardly any need for unnecessary comments. Good code should be
mostly self-commenting.

> my $fileA = "initialfile.txt";
> my $fileB = "alteredfile.txt";

Why hardcode them, then? Even if this is only an example,
length('shift')==5, i.e. reading them from the command line costs five
keystrokes. But then of course one should take care of error
checking @ARGV, etc. so I'll take your point...

> my $string1 = "XXXX";
> my $string2 = "YYYY";

Also

my ($string1, $string2) = qw/XXXX YYYY/;

But then both strings may contain spaces, so I'll take your point
again. But it's worth noting that one can declare and initialize
several variables at once...

> my $tenspaces;
> my @initialfile;
> my @alteredfile;
> my $alteredline;

No need to declare all of the required variables at one point (in
Perl, that is).

> for (1..10) {$tenspaces.=' '}

Yuk!

$tenspaces = ' ' x 10;

> # read $fileA and do regexp in each line

Ditto as above wrt comments.

> open (R, "$fileA")||die"$!";

Better to use lexical filehandles nowadays. Better to use the three-arg
form of open() anyway. No need to quote all variables (see

perldoc -q quoting

for more info).

Better to use the low-precedence logical operators (or, and) for flow
control (as a bonus you can avoid parentheses).
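
To illustrate the precedence point: the quoted open (...)||die only works because of its parentheses. Without them, || binds to the last argument of open() rather than to open() itself, so the die can never fire for a non-empty file name. A small sketch, assuming a hypothetical file name that does not exist (the sub names are invented for this demonstration):

```perl
use strict;
use warnings;

# Hypothetical file name, assumed not to exist in the current directory.
my $missing = 'no-such-file.txt';

# With ||, this parses as open(my $fh, '<', ($name || die ...)).
# $name is a true value, so die is never reached and the failed open
# goes unnoticed.
sub open_with_high_precedence {
    my ($name) = @_;
    my $ok = eval {
        open my $fh, '<', $name || die "never reached for a true name\n";
        1;
    };
    return $ok ? 'no error raised' : "died: $@";
}

# With or, this parses as (open(my $fh, '<', $name)) or die ...,
# so a failed open is reported.
sub open_with_low_precedence {
    my ($name) = @_;
    my $ok = eval {
        open my $fh, '<', $name or die "Can't open '$name': $!\n";
        1;
    };
    return $ok ? 'no error raised' : "died: $@";
}

print open_with_high_precedence($missing), "\n";
print open_with_low_precedence($missing), "\n";
```

The first call reports no error even though the file is missing; the second one dies with the file name and the system error in the message.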

> # following code line doesn't always work on Win OS
> # I 'ld suggest to uncomment when on unix:
> # flock(W,1) || die "Cant get LOCK_SH on file: $!";

Would this really be required, especially in a minimal example like
this?

> while (<R>)
> {

Quite a weird indenting style, isn't it? Well it's a matter of
personal taste anyway...

> push @initialfile, $_;

Why on earth should one do so? If you really need to read all of a
file's contents into an array then

@initialfile = <R>;

would be the most common (and better!) way to do it. But in this kind
of situation one would hardly ever want to do so. Perl's typical (and
efficient!) idiom for iterating over all lines of a file is

while (<$file>) {
# ...
}

It's not good to spread awkward, clumsy snippets of code like the one
quoted above, instead.
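
For comparison, a minimal sketch of the two reading styles. It uses in-memory filehandles so that it runs without any input file; the sub names and the sample text are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Slurp: the readline operator in list context returns every line at
# once, so the whole file ends up in memory.
sub slurp_lines {
    my ($content) = @_;
    open my $fh, '<', \$content
        or die "Can't open in-memory file: $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# Stream: read, transform and print one line at a time to the handle
# passed in as $out. Returns the number of lines processed.
sub stream_upcase {
    my ($content, $out) = @_;
    open my $fh, '<', \$content
        or die "Can't open in-memory file: $!";
    my $count = 0;
    while (my $line = <$fh>) {
        print {$out} uc $line;
        $count++;
    }
    close $fh;
    return $count;
}

my $text = "first\nsecond\nthird\n";
my @all  = slurp_lines($text);
print scalar(@all), " lines slurped\n";
stream_upcase($text, \*STDOUT);
```

The loop uses a named variable instead of $_ only so that the sub does not clobber the caller's $_; in a short script, while (<$fh>) does the same job.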

> $alteredline=$_;
> $alteredline=~s/^ {10}$string1/$tenspaces$string2/;
> push @alteredfile, $alteredline;

Not such a great strategy to work on $alteredline, since s/// acts by
default on $_.

More importantly, it's not required in this case (XXXX contains no
regex metacharacters), but I'd use

s/^ {10}\Q$string1/$tenspaces$string2/;

in any case.

Also, I'd really use

s/(?<=^ {10})\Q$string1/$string2/;

instead.
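
A small sketch of how these variants behave, assuming a search string that contains a regex metacharacter. The sub names and sample lines are made up for illustration; the lookbehind form is the one used in the script and one-liner further down.

```perl
use strict;
use warnings;

# Capture the ten spaces and re-insert them through $1.
sub with_capture {
    my ($line, $from, $to) = @_;
    $line =~ s/^( {10})\Q$from\E/$1$to/;
    return $line;
}

# Lookbehind: assert the ten spaces without consuming them, so they do
# not have to be put back in the replacement.
sub with_lookbehind {
    my ($line, $from, $to) = @_;
    $line =~ s/(?<=^ {10})\Q$from\E/$to/;
    return $line;
}

# Like with_capture, but without \Q: $from is interpreted as a regex,
# so the dot in 'a.c' matches any character.
sub without_quotemeta {
    my ($line, $from, $to) = @_;
    $line =~ s/^( {10})$from/$1$to/;
    return $line;
}

my $pad = ' ' x 10;
for my $line ("${pad}a.c rest\n", "${pad}abc rest\n") {
    print with_capture($line, 'a.c', 'ZZZ');
    print with_lookbehind($line, 'a.c', 'ZZZ');
    print without_quotemeta($line, 'a.c', 'ZZZ');
}
```

All three rewrite the line containing the literal a.c, but only the version without \Q also rewrites the abc line.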

> # write new content to $fileB

Please note that the OP never wrote that he wants to write to $fileB.

> # small report utility to screen
> print "\n------------------\n";
> print "Initial file:";
> print "\n------------------\n";
> print @initialfile;
> print "\n------------------\n";
> print "Altered file:";
> print "\n------------------\n";
> print @alteredfile;

If I really were to adopt such a "report utility" (which I wouldn't!),
I'd use a here-doc:

$"='';
print <<"EOF";
------------------
Initial file:
------------------
@initialfile
------------------
Altered file:
------------------
@alteredfile
EOF

An IMHO better example script along the lines of yours may be:


#!/usr/bin/perl

use strict;
use warnings;

my ($str1, $str2)=qw/XXXX YYYY/;

die "Usage: $0 <infile> <outfile>\n"
  unless @ARGV==2;

my $out=pop;
open my $fh, '>', $out or
  die "Can't open `$out': $!\n";
select $fh;

s/(?<=^ {10})\Q$str1/$str2/, print while <>;

__END__


But then I guess that the best answer for the OP may be in terms of
the simple one-liner:

perl -pe 's/(?<=^ {10})XXXX/YYYY/' fileA >fileB


Michele
 
