finding common words

Tore Aursand

Thanks to all.

No problem, but please consider not top-posting. It's a bad thing.
Can I ask another newbie question please? The solutions provided are
working fine but sometimes depending on the lengths of words, I don't
get aligned results, e.g.

apple 2.2
boy 9.0
definite 2.5
eel 5.0

Is there a way of aligning these comments?

Yes, and it's very simple. I won't tell you how, however, 'cause I
think it's best if you try it yourself first and then, if needed, come
back here with the code you're having problems with.

Please don't call me rude; I think it's a nice exercise for you to try
to solve this on your own.

If you don't want to roll this one on your own, please consider using the
Text::Table module (it's excellent).
 
Tad McClellan

viv2k said:
apple 2.2
boy 9.0
definite 2.5
eel 5.0

Is there a way of aligning these comments?


Those are not comments. They appear to be output.

perldoc -f printf
perldoc -f sprintf
 
Hunter Johnson

Uri Guttman said:
HJ> If the docs say there is no list generated, how is map going to tell
HJ> the reader that the docs are wrong?

HJ> If a reader reads something extra into what's written (like "here
HJ> comes a list" when none is coming), how is that more the writer's
HJ> problem than the reader's? Either the writer can write it
HJ> differently (using 'for' in this case) or the reader can read it
HJ> differently. I don't see how the former is the only right answer.

it is a semantic communication to the reader of the code. it is your
(the coder's) responsibility to convey as much accurate information to
the reader as possible. map in a void context is not as accurate as a
for modifier even with the optimization.

Yes, it is because, as you say:
sure the docs will say it won't generate a list

Of course, you then qualify that with:
but its history has always been that way.

Historically, map wasn't even in perl, but people use it with its
current implementation anyway. That's all I'm suggesting will happen
now -- no list in void context means that programmers can (and should,
as they like) use it in void context, and readers will have to learn
that the language has changed, which it has.

Hunter
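
For readers following the argument, here is a minimal sketch of the two
spellings being compared. The variable names and the sample words are
made up for illustration; both statements are run only for their side
effect on a hash, and both leave the same counts behind.

```perl
use strict;
use warnings;

my @words = qw(apple boy definite eel apple);

# map in void context: the list map would return is thrown away,
# only the side effect on %seen_map matters.
my %seen_map;
map { $seen_map{$_}++ } @words;

# for as a statement modifier: same side effect, and the syntax
# itself tells the reader that no list is being built.
my %seen_for;
$seen_for{$_}++ for @words;

print "$_: $seen_map{$_} / $seen_for{$_}\n" for sort keys %seen_for;
```

Which of the two reads better is exactly what the posts above disagree
about; the sketch only shows that the results are the same.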
 
Gunnar Hjalmarsson

[ Please learn how to reply properly. You should respond *below* the
quoted text. ]
sorry for those 'unsophisticated' questions Gunnar. But it's first
time I'm playing with Perl and geez..

Note that it was my solution I called unsophisticated, seeing all the
other solutions that came up in this thread. :)
I'll definitely learn Perl in more detail in future

Why not now, so that you can accomplish this task the way you want it
to be?
The solutions provided are working fine but sometimes depending on
the lengths of words, I don't get aligned results, e.g.

apple 2.2
boy 9.0
definite 2.5
eel 5.0

Is there a way of aligning these comments?

Yes, and I agree with Tore that it's high time that you start learning
how to figure it out with the help of the documentation and the available
modules. Tore mentioned a module, and Tad mentioned a couple of
functions.

If you encounter problems, please feel welcome to ask for help here,
but now we want to see code that you wrote.

Good luck!
 
Tore Aursand

Those are not comments. They appear to be output.

perldoc -f printf
perldoc -f sprintf

Hmm. Isn't it also appropriate to mention 'perldoc -f length'? I can't
think of a way to solve this with just (s)printf...?
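
One possible way to combine the two, as a minimal sketch rather than the
definitive answer: use length to find the widest word, then pass that
width to sprintf through the "*" field width. The hash name %score and
the one-decimal format are assumptions made for this example; the data
is the sample from the question.

```perl
use strict;
use warnings;

# Sample data from the question; %score is a hypothetical name.
my %score = ( apple => 2.2, boy => 9.0, definite => 2.5, eel => 5.0 );

# Find the longest word so every line can be padded to the same width.
my $width = 0;
for my $word ( keys %score ) {
    $width = length($word) if length($word) > $width;
}

# "%-*s" left-justifies the word in a field whose width is taken from
# the argument list; "%.1f" prints one digit after the decimal point.
my @lines = map { sprintf "%-*s %.1f", $width, $_, $score{$_} }
            sort keys %score;

print "$_\n" for @lines;
```

With a fixed, generous width (say "%-20s") the length step can be
dropped, at the cost of misalignment if a longer word ever shows up.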
 
Jean-Pierre Vidal

Thanks to all. I know many of u here are experts in Perl.. sorry for
those 'unsophisticated' questions Gunnar. But it's first time I'm
playing with Perl and geez.. it seems to be very powerful in text
manipulation...I'll definitely learn Perl in more detail in future

Can I ask another newbie question please? The solutions provided are
working fine but sometimes depending on the lengths of words, I don't
get aligned results, e.g.

apple 2.2
boy 9.0
definite 2.5
eel 5.0

Is there a way of aligning these comments?


perldoc perlform ?

Jean-Pierre
 
viv2k

Gunnar Hjalmarsson said:
If you encounter problems, please feel welcome to ask for help here,
but now we want to see code that you wrote.

Ok guys, I've got a slightly different requirement now. The number of
lists I have to compare has increased to 3, and I need to find the words
that appear in at least two of the lists. However, if a word appears in
only 2 lists, I need to include it in the third list as well by giving
it a default value, like '0'.

Example:
ListA
Apple 4, Boy 3, Cat 5
ListB
Apple 1.0, Baby 2.1, Cat 3.3
ListC
Apple 99, Beef 100, Cow 101

Should give me as result:
ListA     ListB      ListC
Apple 4   Apple 1.0  Apple 99
Cat 5     Cat 3.3    Cat 0

Right now, my code looks like below but I don't know how to give this
default value. Any help is much appreciated.

my %ListA;
open my $fh, '< ListA.txt' or die "Couldn't open ListA.txt $!";
while (<$fh>) {
    for (split /,\s*/) {
        my ($key, $value) = split;
        $ListA{$key} = $value;
    }
}
close $fh;

my %ListB;
open my $fh, '< ListB.txt' or die "Couldn't open ListB.txt $!";
while (<$fh>) {
    for (split /,\s*/) {
        my ($key, $value) = split;
        $ListB{$key} = $value;
    }
}
close $fh;

my %ListC;
open my $fh, '< ListC.txt' or die "Couldn't open ListC.txt $!";
while (<$fh>) {
    for (split /,\s*/) {
        my ($key, $value) = split;
        $ListC{$key} = $value;
    }
}
close $fh;


# Keep a word only if it also exists in at least one other list.
# exists is tested on both hashes so that a value of 0 still counts.
for (keys %ListA) {
    delete $ListA{$_} unless exists $ListB{$_} || exists $ListC{$_};
}
for (keys %ListB) {
    delete $ListB{$_} unless exists $ListA{$_} || exists $ListC{$_};
}
for (keys %ListC) {
    delete $ListC{$_} unless exists $ListA{$_} || exists $ListB{$_};
}

print "ListA \n";
print "$_\t$ListA{$_}\n" for sort keys %ListA;
print "\n";
print "ListB \n";
print "$_\t$ListB{$_}\n" for sort keys %ListB;
print "\n";
print "ListC \n";
print "$_\t$ListC{$_}\n" for sort keys %ListC;
 
Gunnar Hjalmarsson

viv2k said:
Ok guys, I've got a slightly different requirement now. The number of
lists I have to compare has increased to 3, and I need to find the words
that appear in at least two of the lists. However, if a word appears in
only 2 lists, I need to include it in the third list as well by giving
it a default value, like '0'.

Example:
ListA
Apple 4, Boy 3, Cat 5
ListB
Apple 1.0, Baby 2.1, Cat 3.3
ListC
Apple 99, Beef 100, Cow 101

Should give me as result:
ListA     ListB      ListC
Apple 4   Apple 1.0  Apple 99
Cat 5     Cat 3.3    Cat 0

Right now, my code looks like below but I don't know how to give
this default value. Any help is much appreciated.

Well, you need to somehow count the number of occurrences of each
word, right? Below find a suggestion that does just that.
my %ListA;
open my $fh, '< ListA.txt' or die "Couldn't open ListA.txt $!";
while (<$fh>) {
    for (split /,\s*/) {
        my ($key, $value) = split;
        $ListA{$key} = $value;
    }
}
close $fh;

my %ListB;
open my $fh, '< ListB.txt' or die "Couldn't open ListB.txt $!";
while (<$fh>) {
    for (split /,\s*/) {
        my ($key, $value) = split;
        $ListB{$key} = $value;
    }
}
close $fh;

my %ListC;
open my $fh, '< ListC.txt' or die "Couldn't open ListC.txt $!";
while (<$fh>) {
    for (split /,\s*/) {
        my ($key, $value) = split;
        $ListC{$key} = $value;
    }
}
close $fh;


# Keep a word only if it also exists in at least one other list.
# exists is tested on both hashes so that a value of 0 still counts.
for (keys %ListA) {
    delete $ListA{$_} unless exists $ListB{$_} || exists $ListC{$_};
}
for (keys %ListB) {
    delete $ListB{$_} unless exists $ListA{$_} || exists $ListC{$_};
}
for (keys %ListC) {
    delete $ListC{$_} unless exists $ListA{$_} || exists $ListB{$_};
}

my %count;
$count{$_}++ for keys %ListA, keys %ListB, keys %ListC;

for (keys %count) {
    if ($count{$_} > 1) {
        $ListA{$_} ||= 0;
        $ListB{$_} ||= 0;
        $ListC{$_} ||= 0;
    }
}
print "ListA \n";
print "$_\t$ListA{$_}\n" for sort keys %ListA;
print "\n";
print "ListB \n";
print "$_\t$ListB{$_}\n" for sort keys %ListB;
print "\n";
print "ListC \n";
print "$_\t$ListC{$_}\n" for sort keys %ListC;

Even if this solution gets the job done, there are likely more
efficient ways to do it.
 
Tore Aursand

Ok guys, I've got a slightly different requirement now. The number of
lists I have to compare has increased to 3, and I need to find the words
that appear in at least two of the lists. However, if a word appears in
only 2 lists, I need to include it in the third list as well by giving
it a default value, like '0'.

The easiest solution is to keep track of how many lists each data entry
appears in, i.e. by doing something like this while iterating through
each list:
open my $fh, '< ListA.txt' or die "Couldn't open ListA.txt $!";
while (<$fh>) {
    for (split /,\s*/) {
        my ($key, $value) = split;
        $ListA{$key} = $value;
        $all{$key}++;
    }
}
close $fh;

This way, you'll always know how many lists each '$key' is in (by looking
at its value).
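
To make the counting idea concrete, here is a self-contained sketch. It
is only one way to do it and rests on a few assumptions: the three files
are replaced by in-memory strings holding the sample data from the
thread, and the names %text, %list and @common are made up for the
example. A key is counted once per list, even if it is repeated within
that list.

```perl
use strict;
use warnings;

# In-memory stand-ins for ListA.txt, ListB.txt and ListC.txt.
my %text = (
    ListA => "Apple 4, Boy 3, Cat 5",
    ListB => "Apple 1.0, Baby 2.1, Cat 3.3",
    ListC => "Apple 99, Beef 100, Cow 101",
);

my ( %list, %all );
for my $name ( sort keys %text ) {
    for my $pair ( split /,\s*/, $text{$name} ) {
        my ( $key, $value ) = split ' ', $pair;

        # Count each list at most once per key.
        $all{$key}++ unless exists $list{$name}{$key};
        $list{$name}{$key} = $value;
    }
}

# Words that are present in at least two of the lists.
my @common = sort grep { $all{$_} >= 2 } keys %all;
print "@common\n";    # prints "Apple Cat"
```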
 
Anno Siegel

Gunnar Hjalmarsson said:
viv2k wrote:
[...]
Example:
ListA
Apple 4, Boy 3, Cat 5
ListB
Apple 1.0, Baby 2.1, Cat 3.3
ListC
Apple 99, Beef 100, Cow 101

Should give me as result:
ListA     ListB      ListC
Apple 4   Apple 1.0  Apple 99
Cat 5     Cat 3.3    Cat 0

Right now, my code looks like below but I don't know how to give
this default value. Any help is much appreciated.

Well, you need to somehow count the number of occurrences of each
word, right? Below find a suggestion that does just that.

[setup of %ListA, %ListB and %ListC]
my %count;
$count{$_}++ for keys %ListA, keys %ListB, keys %ListC;

for (keys %count) {
    if ($count{$_} > 1) {
        $ListA{$_} ||= 0;
        $ListB{$_} ||= 0;
        $ListC{$_} ||= 0;
    }
}


Even if this solution gets the job done, there are likely more
efficient ways to do it.

I don't know about efficiency, it doesn't seem to be an issue here.

I'd make the individual %ListA .. %ListC an array of hashes, and write
the code so that it works for an arbitrary number of them. At the same
time, the code gets more compact (if that is an advantage). Also,
avoid tabs like the plague. The meaning of tabs is too ill-defined for
them to be of practical use. Use (s)printf for formatting, or use
Text::Table. Thirdly, Perl's default value for numbers *is* 0. The
idiom " ... || 0" (soon " ... // 0", yay) has its place, but isn't
needed here.

Assume "@lists = \ ( %ListA, %ListB, %ListC)" to connect to previous
code. One would read the data directly into @lists.

my %count;
$count{ $_} ++ for map keys %$_, @lists;

for my $item ( sort grep $count{ $_} >= 2, keys %count ) {
    # interleave copies of $item with prices
    my @vals = map { $item, $_->{ $item} } @lists;
    no warnings 'uninitialized'; # take care of default
    printf '%6s %5.2f ' x @lists . "\n", @vals;
}

Anno
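
As a sketch of what "reading the data directly into @lists" could look
like: the loop below builds one hash per source and pushes a reference
to it onto @lists. To keep the example self-contained it opens read
handles on strings instead of on ListA.txt and friends; @sources and its
contents are assumptions taken from the sample data in the thread.

```perl
use strict;
use warnings;

# Stand-ins for the contents of the three list files.
my @sources = (
    "Apple 4, Boy 3, Cat 5\n",
    "Apple 1.0, Baby 2.1, Cat 3.3\n",
    "Apple 99, Beef 100, Cow 101\n",
);

my @lists;
for my $source (@sources) {
    # A handle opened on a scalar reference reads from the string;
    # a real program would open a file name here instead.
    open my $fh, '<', \$source
        or die "Couldn't open in-memory handle: $!";

    my %list;
    while ( my $line = <$fh> ) {
        for my $pair ( split /,\s*/, $line ) {
            my ( $key, $value ) = split ' ', $pair;
            $list{$key} = $value;
        }
    }
    close $fh;

    push @lists, \%list;    # one hash reference per list
}

printf "%d lists read\n", scalar @lists;
```

Adding a fourth list then only means adding a fourth source; the
counting and printing code that works on @lists stays the same.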
 
Gunnar Hjalmarsson

Anno said:
I'd make the individual %ListA .. %ListC an array of hashes, and
write the code so that it works for an arbitrary number of them.

I thought of that, but skipped it since OP isn't likely ready for
complex data structures.
Also, avoid tabs like the plague. The meaning of tabs is too
ill-defined for them to be of practical use. Use (s)printf for
formatting, or use Text::Table.

That aspect was discussed previously in the thread.
Thirdly, Perl's default value for numbers *is* 0. The idiom " ...
|| 0" (soon " ... // 0", yay) has its place, but isn't needed here.

.... provided (s)printf() and disabling the uninitialized warning ...
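
A small sketch of the two options being weighed here, using a made-up
%ListC that lacks the key "Cat". An explicit default with "//" (which
needs Perl 5.10 or later) never hands undef to sprintf; relying on
Perl's own default formats undef as 0 too, but stays quiet only while
the 'uninitialized' warning is switched off.

```perl
use strict;
use warnings;

my %ListC = ( Apple => 99, Beef => 100, Cow => 101 );

# Explicit default: undef is replaced by 0 before sprintf sees it.
my $explicit = sprintf '%5.2f', $ListC{Cat} // 0;

# Implicit default: undef is treated as 0 by the numeric format, and
# the warning is disabled only inside this block.
my $implicit = do {
    no warnings 'uninitialized';
    sprintf '%5.2f', $ListC{Cat};
};

print "[$explicit] [$implicit]\n";
```

Neither form stores anything in %ListC, unlike "||= 0", which creates
the missing entry.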
 
