It is easy to get confused (from perldoc perlre):
\l lowercase next char (think vi)
That is, \l is not linefeed.
)) nice demonstration of the problem. But this
s/(?:\r\n?|\n)/
should work correctly? (Apart from the fact that one should use the numeric
codes, such as \015 and \012.)
In any case, these escapes could potentially mean different things on
different systems. Why not be very specific in what you really are
looking for?
Hmmmm, I assumed that I should rather use what Perl thinks is a linefeed
than the ASCII code I think it is. But this is really a minor issue.
OK. However, I was not looking for a solution based on string substitution;
as you have seen (demonstrated by my faulty code snippet), I came up with
that one myself. I was rather thinking along the following lines: isn't
there a general way to tell Perl, "Hey, treat all text files alike, wherever
they come from: DOS, Mac or Unix"?
The point is: (i) I have written a handful of scripts, some of them quite
large. All of them work on text files. Recently I have discovered problems
caused by the fact that some of the files I now work on come from the DOS
world. Now, I'd rather insert _one_ command or variable assignment somewhere
at the beginning of the script that would change the behaviour of chomp than
go through all that code and replace each chomp with a substitution. (ii) A
substitution takes more time, by roughly an order of magnitude in this test:
:~ $ head -100000 /db/prodom/prodom.mul | (time perl -p -e 'chomp ;' > /dev/null ; )
real 0m0.157s
user 0m0.123s
sys 0m0.034s
:~ $ head -100000 /db/prodom/prodom.mul | (time perl -p -e 's{ \012 | (?: \015\012? ) }{\n}x ;' > /dev/null ; )
real 0m2.012s
user 0m1.990s
sys 0m0.024s
And, surprise, the files can be quite large:
:~ $ wc -l /db/prodom/prodom.mul
7900570 /db/prodom/prodom.mul
I simply thought there might be a better solution than to use
substitutions, like assigning $/ in a special way or using a module that
adds a layer to the file open() or redefines chomp. What do I know. I
thought that the problem was common enough to be addressed in a better way.
I think that I will find some way to determine the file type (possibly by
looking at the ending of the first line), redefine $/ and continue reading.
Some untested code follows:
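#!/usr/bin/perl -w
use strict;
use warnings;

my $DFNTNTF = myopen('<', 'test.mul');
die "Cannot open file: $!\n" unless $DFNTNTF;
while (<$DFNTNTF>) {
    chomp;
    print "line $.:$_\n";
}
close $DFNTNTF;
exit 0;

# Open a file and set the input record separator ($/) to whatever line
# ending is found on the first line. Note that $/ is global, so this
# also affects every other filehandle read afterwards.
sub myopen {
    my ($mode, $file) = @_;
    open(my $definitelynotif, $mode, $file) or return;
    binmode $definitelynotif;    # keep \015 visible where CRLF is translated
    my $line = <$definitelynotif>;
    if (defined $line && $line =~ m/(\015\012|\012|\015)/) {
        $/ = $1;
    }
    seek($definitelynotif, 0, 0);
    $. = 0;    # seek does not reset the line counter
    return $definitelynotif;
}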
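
A possible variation on the snippet above, as a rough sketch rather than a
tested solution: probe a bounded block with read() instead of reading a whole
"line" (with the default $/, a \015-only file would be slurped in one go), and
hand the separator back so the caller can localize $/ instead of changing it
globally. The name detect_eol and the 4096-byte probe size are assumptions. On
platforms that translate line endings the handle would need binmode first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: guess the record separator from the first $probe
# bytes of an open, seekable handle, then rewind to the start.
# Falls back to "\012" if no line ending is seen. A \015\012 pair split
# exactly at the probe boundary would be misread as a lone \015.
sub detect_eol {
    my ($fh, $probe) = @_;
    $probe ||= 4096;
    my $got = read($fh, my $buf, $probe);
    die "read failed: $!\n" unless defined $got;
    seek($fh, 0, 0) or die "seek failed: $!\n";
    return $buf =~ /(\015\012|\012|\015)/ ? $1 : "\012";
}

# Demo on in-memory files with DOS, Unix and classic Mac endings.
my @samples = ("a\015\012b\015\012", "a\012b\012", "a\015b\015");
for my $sample (@samples) {
    my $data = $sample;
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!\n";
    local $/ = detect_eol($fh);    # restored at the end of each iteration
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        chomp $line;               # strips whatever $/ currently is
        $count++;
    }
    close $fh;
    printf "%d lines, separator is %d byte(s)\n", $count, length($/);
}
```

Because $/ is localized, other filehandles in the same script keep the default
separator once the block is left.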
In any case, I should probably have put a smiley there, because I had not
intended it to come across that harshly.
No offence taken.
Cheers,
January
--