Cosmic Cruizer
I've been able to reduce my dataset by 75%, but that still leaves me with a
47 GB file. I'm trying to find the frequency of each line using:
open(TEMP, "< $tempfile") || die "cannot open file $tempfile: $!";
foreach (<TEMP>) {
    $seen{$_}++;
}
close(TEMP) || die "cannot close file $tempfile: $!";
My program keeps aborting after a few minutes because the computer runs out
of memory. I have 4 GB of RAM and the paging file totals 10 megs, but Perl
does not appear to be using it.
How can I find the frequency of each line in such a large dataset? I tried
using two output files, moving the data back and forth between them each
time I grabbed the next line from TEMP instead of using $seen{$_}++, but I
did not have much success.
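(Editor's note: the foreach (<TEMP>) loop reads the entire file into a list before iterating, which is why memory runs out so quickly; a while (<TEMP>) loop would read one line at a time, though the %seen hash would still grow with every distinct line. For a file this size, a common workaround is to let an external, disk-backed sort do the grouping. A minimal sketch, assuming GNU coreutils and hypothetical filenames bigfile.txt and line_counts.txt:)

```shell
# Disk-backed frequency count: sort groups identical lines together,
# then uniq -c prefixes each distinct line with how many times it ran.
# LC_ALL=C sorts by raw bytes, which is much faster than locale-aware
# collation; -T points sort's temporary spill files at a directory
# with enough free space for the intermediate runs.
LC_ALL=C sort -T /tmp bigfile.txt | uniq -c | sort -rn > line_counts.txt
```

Because sort spills to temporary files on disk rather than holding its input in memory, this works on a 47 GB file regardless of available RAM; the final sort -rn puts the most frequent lines first.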