perl out of memory

xlue897

Hi,

I need to find the maximum number in a large file. When I use the
perl code below, it fails with the error: Out of memory! Bus
error.

Perl Code:
perl -e '
for(<>)
{if ($_>$max){$max=$_;}}
print $max;'
<large_size_file

Also, can command line perl with -n run like awk -
'BEGIN{code}
{code}
END{code}
'

Thanks

Steven
 
Jürgen Exner

Hi,

I need to find the maximum number in a large file. When I use the
perl code below, it fails with the error: Out of memory! Bus
error.

Perl Code:
perl -e '
for(<>)

Replace 'for' with 'while'.
The magic of reading a line at a time works for 'while(<>)' only.

jue
 
John W. Krahn

I need to find the maximum number in a large file. When I use the
perl code below, it fails with the error: Out of memory! Bus
error.

Perl Code:
perl -e '
for(<>)

You are using a for loop so perl has to read the entire file first into a list
in memory. Use a while loop instead.

{if ($_>$max){$max=$_;}}
print $max;'
<large_size_file

perl -lne'$max = $_ if $_ > $max; END { print $max }' large_size_file

Also, can command line perl with -n run like awk -
'BEGIN{code}
{code}
END{code}
'

Yes.



John
 
xlue897

The magic of reading a line at a time works for 'while(<>)' only.

ATM! :)

Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,

Thanks everyone for your help. The while loop works. However, the
perl code seems much slower than the awk code. For the same file of
around 5M records, awk takes only 1 min to find the max value, while
perl takes around 20 mins. Is perl slower than awk?


Thanks.

Steven
 
Michele Dondi

What's with quoting the .sig? (If not discussing it, that is. But this
is generally the case with Abigail's!)
Thanks everyone for your help. The while loop works. However, the
perl code seems much slower than the awk code. For the same file of
around 5M records, awk takes only 1 min to find the max value, while
perl takes around 20 mins. Is perl slower than awk?

Hard to say, without seeing any code. Find it hard to believe, though:

cognac:~ [21:23:58]$ perl -le 'print rand for 1..5_000_000' > test
cognac:~ [21:24:19]$ time perl -ne '$m=$_>$m?$_:$m;END{print $m}' test
0.999999995290754

real 0m8.604s
user 0m7.160s
sys 0m1.368s


Michele
 
xlue897

What's with quoting the .sig? (If not discussing it, that is. But this
is generally the case with Abigail's!)
Thanks everyone for your help. The while loop works. However, the
perl code seems much slower than the awk code. For the same file of
around 5M records, awk takes only 1 min to find the max value, while
perl takes around 20 mins. Is perl slower than awk?

Hard to say, without seeing any code. Find it hard to believe, though:

cognac:~ [21:23:58]$ perl -le 'print rand for 1..5_000_000' > test
cognac:~ [21:24:19]$ time perl -ne '$m=$_>$m?$_:$m;END{print $m}' test
0.999999995290754

real 0m8.604s
user 0m7.160s
sys 0m1.368s

Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,

Here is the test code with results. The test file was generated by
(perl -le 'print rand for 1..5_000_000' > test.txt).
$ time awk -F'.' '{ if($2 > max) {max = $2;} } END{print max;}' <test.txt
999969482421875

real 0m18.16s
user 0m17.38s
sys 0m0.18s

$ time perl -a -F'\.' -n -e '{ if($F[1] >$max) {$max=$F[1];} } END{print $max;}' test.txt
999969482421875

real 0m41.57s
user 0m41.14s
sys 0m0.16s


BTW, why doesn't the code below work?
perl -a -F/\./ -n -e '{print $F[1], "\n";} ' test.txt


Thanks,
Steven
 
Michele Dondi

[snip]
Here is the test code with results. The test file was generated by
(perl -le 'print rand for 1..5_000_000' > test.txt).
$ time awk -F'.' '{ if($2 > max) {max = $2;} } END{print max;}' <test.txt
999969482421875

real 0m18.16s
user 0m17.38s
sys 0m0.18s

$ time perl -a -F'\.' -n -e '{ if($F[1] >$max) {$max=$F[1];} } END{print $max;}' test.txt
999969482421875

real 0m41.57s
user 0m41.14s
sys 0m0.16s

Well, awk does appear to be faster, but not by the factor you
suggested earlier. This *does* surprise me, though not too much:
afaik awk is a specialized tool, whereas Perl is a full-fledged
language (although one that is supposed to excel in the same areas).
BTW, why doesn't the code below work?
perl -a -F/\./ -n -e '{print $F[1], "\n";} ' test.txt

That should be -F'/\./'. Without the quotes, the shell consumes the
backslash, so perl sees the pattern /./ (any character), which is *not*
what you want.


Michele
 
xhoster

....

Here is the test code with results. The test file was generated by
(perl -le 'print rand for 1..5_000_000' > test.txt).
$ time awk -F'.' '{ if($2 > max) {max = $2;} } END{print max;}' <test.txt
999969482421875

real 0m18.16s
user 0m17.38s
sys 0m0.18s

$ time perl -a -F'\.' -n -e '{ if($F[1] >$max) {$max=$F[1];} } END{print $max;}' test.txt
999969482421875

real 0m41.57s
user 0m41.14s
sys 0m0.16s

So the difference here is less than a factor of 3, rather than the factor
of 20 you originally said. A factor of 3 is easy to believe. Different
languages have different strengths.
BTW, why doesn't the code below work?
perl -a -F/\./ -n -e '{print $F[1], "\n";} ' test.txt

The shell eats the backslash, so Perl never sees it and treats . as the
special character rather than as a literal. It often helps to use echo
to tell you exactly what Perl is seeing once the shell is done:


$ echo F/\./
F/./

$ echo 'F/\./'
F/\./

Xho
 
