comparing a 2D array

Rose · Mar 13, 2008

For the following 2D array comparison codes modified from perllol google
search, I wonder why the output generated is not what I expected. Could
anybody tell me what i should modify in order to have an exact match of all
the attributes of a row from an individual file? Thanks a lot~

#!/usr/bin/perl

use warnings;

$usage='prog t1 t2 cmpname';
die "Usage: $usage\n" if $#ARGV < 1;

$outname = $ARGV[2];
$file1 = $ARGV[0];
$file2 = $ARGV[1];
$cmpcout = $outname . ".cmpofcmp.xls";

open(FP1, $file1);
open(FP2, $file2);
open(CMP, ">$cmpcout");

$line = <FP1>; #get header
while ($line ne "") {
$line = <FP1>;
@attr1 = split(/[\t ]+/, $line);
push @row1, [ @attr1 ];
}

$line = <FP2>; #get header
while ($line ne "") {
$line = <FP2>;
@attr2 = split(/[\t ]+/, $line);
push @row2, [ @attr2 ];
}

for $aref1 (@row1) {
for $aref2 (@row2) {
if (@$aref1 == @$aref2) {
print "@$aref1\n";
}
}
}

========================
File t1:
a1 a2 a3
1 AAAA 99
0 TT 88
99 what -888
File t2:
a1 a2 a3
1 AAAAA 99
0 TCCT 88
-10 TT 88
99 what 888

==========================
Output:
1 AAAA 99

1 AAAA 99

1 AAAA 99

1 AAAA 99

0 TT 88

0 TT 88

0 TT 88

0 TT 88

99 what -888

99 what -888

99 what -888

99 what -888

J. Gleixner · Mar 13, 2008

Rose said:
For the following 2D array comparison codes modified from perllol google
search, I wonder why the output generated is not what I expected. Could
anybody tell me what i should modify in order to have an exact match of all
the attributes of a row from an individual file? Thanks a lot~

#!/usr/bin/perl

use warnings;

Forgot:

use strict;

$usage='prog t1 t2 cmpname';
die "Usage: $usage\n" if $#ARGV < 1;

$outname = $ARGV[2];
$file1 = $ARGV[0];
$file2 = $ARGV[1];
$cmpcout = $outname . ".cmpofcmp.xls";

open(FP1, $file1);
open(FP2, $file2);
open(CMP, ">$cmpcout");

Check that they were successful.

$line = <FP1>; #get header
while ($line ne "") {
$line = <FP1>;
@attr1 = split(/[\t ]+/, $line);

No need for @attr1, just put the split in the [ ], below.

push @row1, [ @attr1 ];
}

$line = <FP2>; #get header
while ($line ne "") {
$line = <FP2>;
@attr2 = split(/[\t ]+/, $line);
push @row2, [ @attr2 ];
}

for $aref1 (@row1) {
for $aref2 (@row2) {
if (@$aref1 == @$aref2) {

perldoc -q "How do I test whether two arrays or hashes are equal"

print "@$aref1\n";
}
}
}

[...]

Rose · Mar 13, 2008

$line = <FP1>; #get header
while ($line ne "") {
$line = <FP1>;
@attr1 = split(/[\t ]+/, $line);

Click to expand...

No need for @attr1, just put the split in the [ ], below.

push @row1, [ @attr1 ];
}

Click to expand...

put the split in the []? I don't quite get what you mean....

After modifying the codes in perlfaq as:

for ($i = 0; $i < @$aref1; $i++) {
if ($aref1->[$i] ne $aref2->[$i]){
$match = 0;
last;
}
}

if ($match == 1) {
print "@$aref1\n";
}
}

this time it successfully finds the matches, but it prints out extra new
lines, why does that happen?

Rose · Mar 13, 2008

Untested code:

LOOP:
my $equal = 1;
foreach my $row1 (@row1) {
foreach my $row2 (@row2) {
unless (@$row1 == @$row2) {
$equal = 0;
last LOOP;
}
for (my $i = 0; $i < @$row1; $i ++) {
my $elem1 = $$row1 [$i];
my $elem2 = $$row2 [$i];
unless ($elem1 eq $elem2) {
$equal = 0;
last LOOP;
}
}
}
}

say "Arrays are equal" if $equal;

Abigail

Thanks, Abigail. But it seems that your untested code is testing the whole
array of array, but not individual rows, maybe the last code should enter
after the last but second curly bracket.

John W. Krahn · Mar 13, 2008

Rose said:
$line = <FP1>; #get header
while ($line ne "") {
$line = <FP1>;
@attr1 = split(/[\t ]+/, $line);

Click to expand...

No need for @attr1, just put the split in the [ ], below.

push @row1, [ @attr1 ];
}

Click to expand...

Click to expand...

put the split in the []? I don't quite get what you mean....

push @row1, [ split /[\t ]+/, $line ];

John

J. Gleixner · Mar 13, 2008

Rose said:
[...]
this time it successfully finds the matches, but it prints out extra new
lines, why does that happen?

Because you're not removing it from your input.

perldoc -f chomp

Eithe chomp() or remove the "\n" from your print.

Rose · Mar 14, 2008

After taking your advice, the codes to compare 2 files have been modified
(the followings), the problem is that I am unable to attach a tag to a
particular row. I once try using

for $aref2 (@row2) {
$f1m[$aref2] = 0;
}

instead of $f1m[$i++] = 0; but this does not help because I find have to
label rows from both files in order to print out two lists: one for the same
and the other for the difference. Is there any good way to achieve that?

#!/usr/bin/perl

#use warnings;

$usage='prog outname1.xls outname2.xls';
die "Usage: $usage\n" if $#ARGV < 1;

$outname = $ARGV[2];
$file1 = $ARGV[0];
$file2 = $ARGV[1];
$cmpsout = $outname . ".cmpofcmpsame.xls";
$cmpdout = $outname . ".cmpofcmpdiff.xls";

open(FP1, $file1);
open(FP2, $file2);
open(CMPS, ">$cmpsout");
open(CMPD, ">$cmpdout");

$line = <FP1>; #get header
while ($line ne "") {
$line = <FP1>;
push @row1, [ split(/[\t ]+/, $line) ];
}

$line = <FP2>; #get header
while ($line ne "") {
$line = <FP2>;
push @row2, [ split(/[\t ]+/, $line) ];
}

$i=0;
for $aref2 (@row2) {
$f1m[$i++] = 0;
}

print CMPD "$file1 statistics\n";
for $aref1 (@row1) {
for $aref2 (@row2) {

$match = 1;

for ($i = 0; $i < @$aref1; $i++) {
if ($aref1->[$i] ne $aref2->[$i]) {
$match = 0;
last;
}
}

if ($match == 1) {
print CMPS "@$aref1";
$flm[$aref2]=1;
} else {
print CMPD "@$aref1";
}
}
}

print CMPD "\n$file2 statistics\n";
$i=0;
for $aref2 (@row2) {
if ($f1m[$i] == 0) {
print CMPD "@$aref2";
}
$i++;
}

#########
File 1:

a1 a2 a3
1 AAAAA 99
0 TCCT 88
99 what -888

File 2:
a1 a2 a3
1 AAAAA 99
0 TCCT 88
-10 TT 88
99 what 888

Tad J McClellan · Mar 14, 2008

Rose said:
After taking your advice, the codes to compare 2 files have been modified
(the followings), the problem is that I am unable to attach a tag to a
particular row. I once try using

for $aref2 (@row2) {
$f1m[$aref2] = 0;
}

instead of $f1m[$i++] = 0; but this does not help because I find have to
label rows from both files in order to print out two lists: one for the same
and the other for the difference. Is there any good way to achieve that?

#!/usr/bin/perl

#use warnings;

You lose the benfits of warnings when you comment out the
use warnings pragma.

You should also be using the

use strict;

pragma as well.

$usage='prog outname1.xls outname2.xls';
die "Usage: $usage\n" if $#ARGV < 1;

If you want to require exactly two command line arguments, then

die "Usage: $usage\n" unless @ARGV == 2;

says it much more clearly.

$outname = $ARGV[2];
$file1 = $ARGV[0];
$file2 = $ARGV[1];

Err, your Usage message says that it should be called with 2 arguments,
so why are you expecting 3 arguments?

If your usage message is wrong, and you really do want those 3
variables loaded from the command line arguments, then

my($file1, $file2, $outname) = @ARGV;

will do that.

$cmpsout = $outname . ".cmpofcmpsame.xls";
$cmpdout = $outname . ".cmpofcmpdiff.xls";

Using variable names that differ by a single interior character is
a very bad idea.

Interpolation is nearly always easier to read and understand
than explicit concatenations.

my $same_out = "$outname.cmpofcmpsame.xls";
my $diff_out = "$outname.cmpofcmpdiff.xls";

open(FP1, $file1);

You should always, yes *always*, check the return value from open():

open(FP1, $file1) or die "could not open '$file1' $!";

Even better, use a lexical filehandle and the 3-arg form of open():

open( my $FP1, '<', $file1) or die "could not open '$file1' $!";

open(CMPS, ">$cmpsout");

open my $SAME, '>', $same_out or die "could not open '$same_out' $!";

$line = <FP1>; #get header

If you plan to discard the line anyway, then there is not much
point in saving it in a variable.

while ($line ne "") {
$line = <FP1>;

That is a truly horrid way of reading lines from a file.

while ( $line = <FP1> ) }

even better, with a lexical filehandle as above:

for ($i = 0; $i < @$aref1; $i++) {

foreach my $i ( 0 .. $#$aref1 ) { # less error-prone, does the same thing

Rose · Mar 14, 2008

Tad J McClellan said:
Rose said:

After taking your advice, the codes to compare 2 files have been modified
(the followings), the problem is that I am unable to attach a tag to a
particular row. I once try using

for $aref2 (@row2) {
$f1m[$aref2] = 0;
}

instead of $f1m[$i++] = 0; but this does not help because I find have to
label rows from both files in order to print out two lists: one for the
same
and the other for the difference. Is there any good way to achieve that?

#!/usr/bin/perl

#use warnings;

Click to expand...

You lose the benfits of warnings when you comment out the
use warnings pragma.

You should also be using the

use strict;

pragma as well.

$usage='prog outname1.xls outname2.xls';
die "Usage: $usage\n" if $#ARGV < 1;

Click to expand...

If you want to require exactly two command line arguments, then

die "Usage: $usage\n" unless @ARGV == 2;

says it much more clearly.

$outname = $ARGV[2];
$file1 = $ARGV[0];
$file2 = $ARGV[1];

Click to expand...

Err, your Usage message says that it should be called with 2 arguments,
so why are you expecting 3 arguments?

If your usage message is wrong, and you really do want those 3
variables loaded from the command line arguments, then

my($file1, $file2, $outname) = @ARGV;

will do that.

$cmpsout = $outname . ".cmpofcmpsame.xls";
$cmpdout = $outname . ".cmpofcmpdiff.xls";

Click to expand...

Using variable names that differ by a single interior character is
a very bad idea.

Interpolation is nearly always easier to read and understand
than explicit concatenations.

my $same_out = "$outname.cmpofcmpsame.xls";
my $diff_out = "$outname.cmpofcmpdiff.xls";

open(FP1, $file1);

Click to expand...

You should always, yes *always*, check the return value from open():

open(FP1, $file1) or die "could not open '$file1' $!";

Even better, use a lexical filehandle and the 3-arg form of open():

open( my $FP1, '<', $file1) or die "could not open '$file1' $!";

open(CMPS, ">$cmpsout");

Click to expand...

open my $SAME, '>', $same_out or die "could not open '$same_out' $!";

$line = <FP1>; #get header

Click to expand...

If you plan to discard the line anyway, then there is not much
point in saving it in a variable.

while ($line ne "") {
$line = <FP1>;

Click to expand...

That is a truly horrid way of reading lines from a file.

while ( $line = <FP1> ) }

even better, with a lexical filehandle as above:

for ($i = 0; $i < @$aref1; $i++) {

Click to expand...

foreach my $i ( 0 .. $#$aref1 ) { # less error-prone, does the same
thing

Dear Mr. Tad McClellan,

Thanks a lot for pointing out all the problems I have!

szr · Mar 15, 2008

Abigail said:
_
Rose ([email protected]) wrote on VCCCVIII September MCMXCIII in
<URL:?? For the following 2D array comparison codes modified from perllol
google ?? search, I wonder why the output generated is not what I
expected. Could ?? anybody tell me what i should modify in order to
have an exact match of all ?? the attributes of a row from an
individual file? Thanks a lot~ ??
??
??
?? #!/usr/bin/perl
??
?? use warnings;
??
?? $usage='prog t1 t2 cmpname';
?? die "Usage: $usage\n" if $#ARGV < 1;
??
?? $outname = $ARGV[2];
?? $file1 = $ARGV[0];
?? $file2 = $ARGV[1];
?? $cmpcout = $outname . ".cmpofcmp.xls";
??
?? open(FP1, $file1);
?? open(FP2, $file2);
?? open(CMP, ">$cmpcout");

You are blindly assuming that the open will succeed. What if an input
file doesn't exist, or if you don't have permission to open the output
file?

??
?? $line = <FP1>; #get header
?? while ($line ne "") {

Eh, did you by any chance follow a Perl course organized by HP
education?
They have the only books I've encountered that don't use the
canonical way
of iterating over a file:

while ($line = <FP1>) {
...
}

But I wouldn't use bare file handles any more. With any Perl from the
current century, you could write:

open my $fh, '<', $file or die "open: $!";
while (my $line = <$fh>) {
# Do something with $line
}
close $fh or die "close: $!";

?? $line = <FP1>;
?? @attr1 = split(/[\t ]+/, $line);
?? push @row1, [ @attr1 ];
?? }

??
?? $line = <FP2>; #get header
?? while ($line ne "") {
?? $line = <FP2>;
?? @attr2 = split(/[\t ]+/, $line);
?? push @row2, [ @attr2 ];
?? }
??
?? for $aref1 (@row1) {
?? for $aref2 (@row2) {
?? if (@$aref1 == @$aref2) {
?? print "@$aref1\n";

This just compares whether $aref1 and $aref2 point to arrays of the
same length.

You should compare each element individually.

?? }
?? }
?? }

Untested code:

LOOP:
my $equal = 1;
foreach my $row1 (@row1) {
foreach my $row2 (@row2) {
unless (@$row1 == @$row2) {
$equal = 0;
last LOOP;
}
for (my $i = 0; $i < @$row1; $i ++) {
my $elem1 = $$row1 [$i];
my $elem2 = $$row2 [$i];
unless ($elem1 eq $elem2) {
$equal = 0;
last LOOP;
}
}
}
}

say "Arrays are equal" if $equal;

Abigail

How about this for comparign a "2D" array?

use strict;

my @a1 = ( [qw/A B C/], [qw/D E F/], [qw/G H I/] );
my @a2 = ( [qw/A B C/], [qw/D E F/], [qw/G H I/] );

my $matches =
@a1 == grep { my $i = $_;
@{$a1[$i]} == grep {
$a1[$i][$_] eq $a2[$i][$_];
} (0..$#{$a1[$i]})
} (0..$#a1);

print $matches ? 'Yep' : 'Nope';

$match is true if ALL the elements match, false if not.

This can easily be expanded to handle more "dimentions" as well:

use strict;

my @a1 = ( [ [qw/A B C/], [qw/D E F/] ], [ [qw/G H I/], [qw/J K
L/] ] );
my @a2 = ( [ [qw/A B C/], [qw/D E F/] ], [ [qw/G H I/], [qw/J K
L/] ] );

my $matches =
@a1 == grep { my $i = $_;
@{$a1[$i]} == grep { my $j = $_;
@{$a1[$i][$j]} == grep {
$a1[$i][$j][$_] eq $a2[$i][$j][$_];
} (0..$#{$a1[$i][$j]})
} (0..$#{$a1[$i]})
} (0..$#a1);

print $matches ? 'Yep' : 'Nope';

And so on, for as many levels as you need. Best of all, it's technically
a one liner, though it is obviously more readable written as above

Dr.Ruud · Mar 16, 2008

Rose schreef:

@attr1 = split(/[\t ]+/, $line);

You could write /[\t ]+/ as /[[:blank:]]+/

but most often this is what you want:

@attr1 = split " ", $line;

See perldoc -f split.

Comparing Two Files line by line and word by word	8	Jan 11, 2006
Reading lines from a file into an array	2	May 1, 2010
Wrap rev 2.	8	Aug 25, 2004
script help - stripping trailing spaces in exisitng script	1	Jun 6, 2005
Line/word wrap program.	11	Aug 19, 2004
This is a mess...	3	Jul 16, 2009
Text processing	29	Sep 26, 2011
newbie question	6	Aug 14, 2004

comparing a 2D array

Rose

J. Gleixner

Rose

Rose

John W. Krahn

J. Gleixner

Rose

Tad J McClellan

Rose

szr

Dr.Ruud

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads