FMAS
I am comparing 2 lists of words and want to output the words in list 1
that are not in list 2. The two lists have different numbers of
entries.
Basically the script below works, but if the lists are large it takes
ages. I tested it on 2 lists of approx 15,000 entries each and after
25 min I had processed only 316 entries!
Here is the script:
open(WORDLIST1,'C:\temp\a.txt') || die("cannot open file1!\n");
@list1 = <WORDLIST1>;
open(WORDLIST2,'C:\temp\b.txt') || die("cannot open file2!\n");
@list2 = <WORDLIST2>;
# loop in loop
foreach $list1 (@list1) {
    foreach $list2 (@list2) {
        chomp $list1;
        chomp $list2;
        last if ($list1 =~ m/$list2/i); # if match found look for next $list1
        # in order to print entry only once when no match found
        $lastentry = $list2[$#list2];
        if ($list2 =~ m/$lastentry/i) {
            print "$list1\n";
        }
    }
}
Any suggestions?
I have heard that it would be much faster to build a search tree, but
I don't know how to do it (probably too complicated for a newbie like
me).
Thanks for your help!
Francois
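
One common way to avoid the nested loop is a hash lookup rather than a
search tree. The sketch below assumes the goal is whole-word,
case-insensitive equality, which is not quite what the regex match in the
script above does (that one also matches substrings and treats list 2
entries as patterns). The name words_not_in and the sample words are made
up for illustration; in practice the two arrays would be read from the
files as before.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns the entries
# of @$list1 that do not occur in @$list2, comparing whole words
# case-insensitively.
sub words_not_in {
    my ($list1, $list2) = @_;

    # Build the lookup table once: one hash key per lowercased word of list 2.
    my %seen;
    for my $word (@$list2) {
        (my $key = lc $word) =~ s/\s+\z//;    # strip trailing newline/whitespace
        $seen{$key} = 1;
    }

    # One hash lookup per word of list 1 instead of a scan over all of list 2.
    my @missing;
    for my $word (@$list1) {
        (my $clean = $word) =~ s/\s+\z//;
        push @missing, $clean unless exists $seen{ lc $clean };
    }
    return @missing;
}

# Small in-memory example standing in for the two word files.
my @list1 = ("apple\n", "Banana\n", "cherry\n");
my @list2 = ("banana\n", "date\n");
print "$_\n" for words_not_in(\@list1, \@list2);
```

With the sample data this prints apple and cherry. Each word costs one hash
lookup, so the work grows with the sum of the two list sizes instead of
their product.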