Preventing lines from printing

Diamond, Mark

Hello (I posted this in alt.comp.lang.perl but I think it should be here)
....

I am processing multiple text files. There is a lot of pre-processing before
the file-reading commences, then each file gets read (but I don't want to
print any blank line) and then there is a lot of post-processing of each
file. In AWK I had a BEGIN block, no print lines in the middle, and a very
long END block. In Perl, I have

pre-code
while (<>) {
lots of processing of file lines with only one wanted "print"
}
post-processing

but I get a blank line output on any line that I don't actually do an
explicit print for. I thought "perl -n" would give me what I need but it
doesn't (of course) because of the wrapped "while(<>) { }" .

Please ... what should I be doing?

Mark
 
Alexander Bartolich

In Perl, I have

pre-code
while (<>) {
lots of processing of file lines with only one wanted "print"
}
post-processing

but I get a blank line output on any line that I don't actually do an
explicit print for. I thought "perl -n" would give me what I need but it
doesn't (of course) because of the wrapped "while(<>) { }" .

Unless you start perl with option -p the output can only come from
statements in your script. Since you neither posted the exact command
line nor the exact script nobody can help you.
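
As an illustration of the point above, here is a minimal sketch (not taken from Mark's script; the helper name `nonblank_lines` and the sample text are made up). In an explicit loop nothing is output unless the code prints it, so blank lines are skipped simply by not printing them.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of a filehandle that contain
# at least one non-whitespace character.
sub nonblank_lines {
    my ($fh) = @_;
    my @kept;
    while (my $line = <$fh>) {
        next unless $line =~ /\S/;   # blank line: do nothing, so nothing is output
        push @kept, $line;
    }
    return @kept;
}

# Read from an in-memory "file" so the sketch is self-contained.
my $text = "first\n\n   \nsecond\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
my @kept = nonblank_lines($fh);
close $fh;
print @kept;    # the only print in the program
```

If a script like this still emits blank lines, the likely place to look is a print statement that runs more often than intended.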
 
Diamond, Mark

Thank you Alexander. I thought I was saving unnecessary space by NOT posting
the code. As it turns out, your comment has led me to the source of the
problem.

Instead of asking myself "how can I suppress printing that I don't want?",
your comment made me look at the problem differently, and ask "how is it
that I am accidentally printing something I didn't intend?" ... and now I
have found my mistake!

Thank you again.

Mark
 
Jürgen Exner

Diamond said:
pre-code
while (<>) {
lots of processing of file lines with only one wanted "print"
}
post-processing

but I get a blank line output on any line that I don't actually do an
explicit print for.[...]

So you are saying Perl is executing a print() without a print() being in
your code? I find that hard to believe.
Could you please provide a minimal(!) sample program that exhibits this
problem and that we can run?

jue
 
Tim McDaniel

In AWK I had a BEGIN block, no print lines in the middle, and a very
long END block.

Just to pick a nit: you *could* do the same thing in Perl.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

075.awk:

#! /usr/bin/awk -f
BEGIN { sum = 0; count = 0; }
{ sum += $2 * 100; ++count; }
END { print "The average of column 2 is this percent: ", sum / count;
}

075.in:

chase .0225
citi .0110
uhcu .03

075.awk < 075.in:

The average of column 2 is this percent: 2.11667

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

075.pl:

#! /usr/bin/perl -wan
BEGIN { $sum = 0; $count = 0; }
{ $sum += $F[1] * 100; ++$count; }
END { print "The average of column 2 is this percent: ", ($sum / $count), "\n"; }

075.pl < 075.in:

The average of column 2 is this percent: 2.11666666666667

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mind you, I don't recommend this. I always do "use strict;", which
requires me (in general) to declare variables with "my".

But the scope of "my $sum" and "my $count" in any of those blocks ends
with its block, and I don't know how to "my" a variable across a BEGIN
block boundary.

So I think you're right to rewrite a large complicated piece of code
to be more Perly and not rely on "-n -a", and for short pieces of
code, you can just use awk as before.
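
A minimal sketch of what such a rewrite might look like under "use strict", assuming the same two-column input as 075.in. The helper name `average_percent` is hypothetical, and the data is read from an in-memory filehandle only to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: average of column 2, expressed as a percentage.
sub average_percent {
    my ($fh) = @_;
    my ($sum, $count) = (0, 0);
    while (my $line = <$fh>) {
        my @fields = split ' ', $line;   # the same splitting that -a performs
        next unless @fields >= 2;         # skip blank or short lines
        $sum += $fields[1] * 100;
        ++$count;
    }
    return $count ? $sum / $count : undef;
}

my $data = "chase .0225\nciti .0110\nuhcu .03\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $average = average_percent($fh);
close $fh;
print "The average of column 2 is this percent: $average\n";
```

Because the loop is explicit, `$sum` and `$count` are ordinary lexicals declared once outside it, and no BEGIN or END block is needed.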
 
Tim McDaniel

You can have a file-scoped lexical declared before the BEGIN block:

my($sum, $count);
BEGIN { $sum = 0; $count = 0; }

Note this excerpt from "Private Variables via my()" in perlsub.pod:

A C<my> has both a compile-time and a run-time effect. At compile
time, the compiler takes notice of it. The principal usefulness
of this is to quiet C<use strict 'vars'>

But it interacts non-intuitively with "perl -n". My test program,
transmogrified:

#! /usr/bin/perl -wan
use strict;
my ($sum, $count);
BEGIN { $sum = 0; my $count = 0; }
$sum += $F[1] * 100; ++$count; print "After $count lines, sum is $sum\n";
END { print "The average of column 2 is this percent: $sum/$count=", ($sum / $count), "\n"; }

Data:

chase .0225
citi .0110
uhcu .03

Output:

After 1 lines, sum is 2.25
After 1 lines, sum is 1.1
After 1 lines, sum is 3
The average of column 2 is this percent: 2.25/1=2.25

$count staying 1 on each iteration is easy to see. As "man perlrun"
says, the program is effectively

LINE:
while (<>) {
use strict;
my ($sum, $count);
BEGIN { $sum = 0; my $count = 0; }
$sum += $F[1] * 100; ++$count; print "After $count lines, sum is $sum\n";
END { print "The average of column 2 is this percent: $sum/$count=", ($sum / $count), "\n"; }

}

that is, $sum and $count are redeclared and zorched on every loop
iteration.

But I don't know why $sum keeps its value from the FIRST iteration
.... maybe $sum being used in BEGIN somehow squirrels away a reference
to that iteration's value of $sum, and that's somehow what END sees,
instead of the last iteration's value of $sum, or $::sum, or sum such?
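
One plausible explanation, offered as an assumption rather than something the thread confirms: BEGIN and END blocks are compiled like subroutines, so they stay bound to the first instance of `$sum`, while the `my` inside the implicit loop yields a fresh variable on later passes. The sketch below reproduces that effect with an ordinary closure; the name `first_sum_seen` is hypothetical.

```perl
use strict;
use warnings;

sub first_sum_seen {
    my $reader;                       # stands in for the END block
    for my $value (2.25, 1.1, 3) {
        my $sum;                      # re-declared on every pass, as under -n
        $reader ||= sub { $sum };     # closure made once; bound to the first $sum
        $sum += $value;
    }
    return $reader->();               # the first pass's $sum, not the last one's
}

print first_sum_seen(), "\n";
```

Declaring the variables once outside any loop, or using package variables with "our", avoids the split between the variable the loop updates and the one the closure sees.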
 
Uri Guttman

TM> #! /usr/bin/perl -wan
TM> BEGIN { $sum = 0; $count = 0; }

no need to initialize those to 0 as += won't warn when adding to
undef. same is true for ++ and .= .

now you can declare those vars without the BEGIN

TM> { $sum += $F[1] * 100; ++$count; }
TM> END { print "The average of column 2 is this percent: ", ($sum / $count), "\n"; }
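
A small sketch of the claim above, assuming warnings are enabled: `+=`, `++` and `.=` applied to an undefined scalar stay silent, while a plain addition with an undefined operand warns. The helper name `warnings_from_undef_ops` is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: run a few operations on undefined scalars and
# return the warnings they produced.
sub warnings_from_undef_ops {
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    my ($sum, $count, $text);   # all three start out undef
    $sum  += 2.25;              # exempt from the "uninitialized" warning
    ++$count;                   # exempt
    $text .= 'abc';             # exempt

    my $unset;
    my $total = $unset + 1;     # plain addition does warn

    return @warnings;
}

my @warnings = warnings_from_undef_ops();
print scalar(@warnings), " warning(s)\n", @warnings;
```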

uri
 
sln

BEGIN, END, labels and gotos are crutches that should not be part of
scoped languages with if/then/else blocks.
And now somebody is implementing a := operator a la Pascal.
Perl 6 is going to do a // operator...
It's just ridiculous and totally out of control.
If Perl could have done ANSI, they would have. It's too much.
-sln
 
Tim McDaniel

TM> #! /usr/bin/perl -wan
TM> BEGIN { $sum = 0; $count = 0; }

no need to initialize those to 0 as += won't warn when adding to
undef. same is true for ++ and .= .

I like explicit initialization to the correct value, even if the
default works the same way. I don't like touching undef (except with
defined, which doesn't really touch it), even in contexts where it
works.
 
Uri Guttman

TM> #! /usr/bin/perl -wan
TM> BEGIN { $sum = 0; $count = 0; }
TM> I like explicit initialization to the correct value, even if the
TM> default works the same way. I don't like touching undef (except with
TM> defined, which doesn't really touch it), even in contexts where it
TM> works.

do you also assign hash/array refs before you add things to them? the
above is the same thing, autovivification. this saves lots of dumb
boring newbie code that does it like other langs, check if initialized
to something and if not initialize, then mung it.

uri
 
Diamond, Mark

Many thanks to all who responded. I learned a lot of Perl in the process. In
responding to Alexander Bartolich I mentioned that his early post had
pointed me in the direction of my error. I had incorrectly assumed that
the default must be to print something for each input line and that I needed to
suppress printing. Silly assumption. I realised after reading AB's answer
that I was printing something accidentally when I did not intend to. ...
Leading me to a different post today ...
Mark
 
Tim McDaniel

TM> #! /usr/bin/perl -wan
TM> BEGIN { $sum = 0; $count = 0; }

TM> I like explicit initialization to the correct value, even if
TM> the default works the same way. I don't like touching undef
TM> (except with defined, which doesn't really touch it), even in
TM> contexts where it works.

do you also assign hash/array refs before you add things to them? the
above is the same thing, autovivification. this saves lots of dumb
boring newbie code that does it like other langs, check if
initialized to something and if not initialize, then mung it.

If I omit initialization, it might be because I'm happy with the
default, but alternately, it may be that I've forgotten a necessary
assignment, and it's all too easy to just forget something entirely.
Doing an initialization answers the question (leaving open the
question of whether it's *correct*, of course).

It adds hardly any extra semantic or visual clutter to have
initializations like
my $running_total = 0;
my @bug_numbers = ();
especially because it's already standard to have other assignments
doing things like
my $use_bash = 1;
my @final_states = ('RESOLVED', 'VERIFIED', 'CLOSED');


I had seen the term "autovivification", but never really thought about
it. Now that you mention it, something like
if (! exists $dependencies{$_}) {
$dependencies{$_} = [];
}
push @{$dependencies{$_}}, $bug_number;
*does* add semantic and visual clutter, and a non-trivial chance of
buggy initialization. I think I'll use autovivification in the
future -- though maybe with a comment that I'm using it?

Thank you for the prompting. I write Perl code for me and don't
generally see other people's Perl code, so I don't have the chance to
pick up idioms by osmosis.
 
Uri Guttman

TM> #! /usr/bin/perl -wan
TM> BEGIN { $sum = 0; $count = 0; }

TM> I like explicit initialization to the correct value, even if
TM> the default works the same way. I don't like touching undef
TM> (except with defined, which doesn't really touch it), even in
TM> contexts where it works.

too much explicit stuff leads to noisy code. knowing that my vars will
be undef by default and will work nicely (under warnings) with some key
ops is intermediate level perl and should be assumed by any decent perl
hacker. it is no different when you use a counter in a hash element
and ++ it without first setting it to 0.

TM> It adds hardly any extra semantic or visual clutter to have
TM> initializations like
TM> my $running_total = 0;
TM> my @bug_numbers = ();

but it does if you do this:

$count{foo} = 0 unless exists $count{foo} ;
$count{foo}++
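
For comparison, a minimal sketch of the same counter without the "exists" test. The helper name `count_words` and the sample words are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each word occurs.
sub count_words {
    my @words = @_;
    my %count;
    $count{$_}++ for @words;   # the first ++ on a missing key turns undef into 1
    return %count;
}

my %count = count_words(qw(foo bar foo));
print "$_: $count{$_}\n" for sort keys %count;
```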

TM> especially because it's already standard to have other assignments
TM> doing things like
TM> my $use_bash = 1;
TM> my @final_states = ('RESOLVED', 'VERIFIED', 'CLOSED');

i am only talking about coercion of undef to 0 or '' in cases where it
works. as i said this is true for scalars OR hash/array
elements. consistent style is important too.


TM> I had seen the term "autovivification", but never really thought about
TM> it. Now that you mention it, something like
TM> if (! exists $dependencies{$_}) {
TM> $dependencies{$_} = [];

no need for the exists test as the array ref is always true. this means
you could simplify this to:

$dependencies{$_} ||= [];

but as i said, even that is not needed with autovivification.
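
A minimal sketch of the autovivifying form, assuming the data arrives as pairs of dependency and bug number. The helper name `group_by_dependency` and the sample numbers are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: group bug numbers by the bug they depend on.
# Each argument is a pair: [dependency, bug_number].
sub group_by_dependency {
    my @pairs = @_;
    my %dependencies;
    for my $pair (@pairs) {
        my ($dependency, $bug_number) = @$pair;
        # No "exists" test and no "||= []": dereferencing the missing
        # element as an array creates the array reference on first use.
        push @{ $dependencies{$dependency} }, $bug_number;
    }
    return \%dependencies;
}

my $dependencies = group_by_dependency([101, 7], [101, 9], [202, 7]);
print "$_: @{ $dependencies->{$_} }\n" for sort keys %$dependencies;
```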

TM> }
TM> push @{$dependencies{$_}}, $bug_number;
TM> *does* add semantic and visual clutter, and a non-trivial chance of
TM> buggy initialization. I think I'll use autovivification in the
TM> future -- though maybe with a comment that I'm using it?

see my article on autoviv:

http://sysarch.com/Perl/autoviv.txt

no comment should be used. if you know about autoviv your code readers
should know about it too.

TM> Thank you for the prompting. I write Perl code for me and don't
TM> generally see other people's Perl code, so I don't have the chance to
TM> pick up idioms by osmosis.

you IS other people. one thing i always teach is that code is for
people, not for computers.

uri
 
Tim McDaniel

if you know about autoviv your code readers should know about it too.

There's no way you can make that generalization. Example: you and me!
You knew about it perfectly well, and I didn't.

For another example, the cow-orker who is most likely to have to pick
up my work doesn't like Perl, with the usual complaints that it's line
noise, it's opaque, et cetera. It is dangerous for me to depend on
subtle features.
 
Uri Guttman

TM> There's no way you can make that generalization. Example: you and me!
TM> You knew about it perfectly well, and I didn't.

TM> For another example, the cow-orker who is most likely to have to pick
TM> up my work doesn't like Perl, with the usual complaints that it's line
TM> noise, it's opaque, et cetera. It is dangerous for me to depend on
TM> subtle features.

then you will never improve your perl from kiddie levels. yours is a
common and fallacious argument to never use 'advanced' features in any
language. those features are there for a reason. yes, you can enforce a
level of features to code but setting that at the right level is also a
skill to be learned. map/grep confound newbies all the time but are
considered standard ops for mid-level hackers. same for
autovivification. i don't use some bleeding edge weird things myself and
even some simple ones like format (which is never used by experts it
seems and only newbies! there are much better cpan modules for that
anyhow). so you do have to decide who your audience is and code to that
level. i just say mid to high-level is a good target as long as the code
is clean and clear. eschewing common ops from your target is limiting
your coding ability and looking down at your readers.

uri
 
Tim McDaniel

TM> There's no way you can make that generalization. Example: you
TM> and me! You knew about it perfectly well, and I didn't.

TM> For another example, the cow-orker who is most likely to have
TM> to pick up my work doesn't like Perl, with the usual complaints
TM> that it's line noise, it's opaque, et cetera. It is dangerous
TM> for me to depend on subtle features.

then you will never improve your perl from kiddie levels. yours is a
common and fallacious argument to never use 'advanced' features in any
language.

If you want to come by my office and try to train my technical lead in
Perl advanced features that are not obvious, please try. I suspect
he'll stop wanting to listen fairly quickly regardless of what you say
about Perl. It's not a fallacious argument in my environment -- it is
a constraint that I should worry about.
 
Uri Guttman

TM> If you want to come by my office and try to train my technical lead in
TM> Perl advanced features that are not obvious, please try. I suspect
TM> he'll stop wanting to listen fairly quickly regardless of what you say
TM> about Perl. It's not a fallacious argument in my environment -- it is
TM> a constraint that I should worry about.

i am willing to do so if he is willing to pay for it. :)

if you really have such a low level audience then you can code for it. i
did say that but doing so is bad for you and them. you two won't be the
only people ever reading this code. and remember, code is for people,
not computers!

uri
 
Bart Lateur

Tim said:
There's no way you can make that generalization. Example: you and me!
You knew about it perfectly well, and I didn't.

For another example, the cow-orker who is most likely to have to pick
up my work doesn't like Perl, with the usual complaints that it's line
noise, it's opaque, et cetera. It is dangerous for me to depend on
subtle features.

Yet it's those subtle features that make Perl what it is. It's those
features that make Perl popular with the people who want to come to
YAPC for a hobby.

If you insist on only using features that are also in other languages,
like Java, then maybe you're better off using Java. If you strip down
Perl to the level of Java, then Perl will make a poor Java.

Autovivification is one of those things that make Perl so practical.
 
sln

I wouldn't worry about autovivification. When you need to initialize
an array or hash reference, just do it. Don't worry about whether they're
defined or exist. Most people re-use them, so they are cleared (re-initialized)
at different places throughout, depending on the scope.

Autovivification on arrays takes place when you read/write from/to an element.
For hashes, it's the keys.
The existence of hash keys is a check, but not usually used for initialization.
Clearing/initialization is just done at points where you want it to happen.
Garbage collection takes care of allocation, based on reference counts.

Example:
# Clear, then add elements
@{$dependencies{$_}} = ();
push @{$dependencies{$_}}, $bug_number;

The only thing you have to worry about is if you are dereferencing an uninitialized value
without context or an initialized value with the wrong context.
It does seem kind of weird though that a bunch of scalars are initialized and a few aren't
in a block far away from where they are used. But since scalars can be reused, they can
contain different context, so it's ambiguous to initialize them at that location.
Variable names suck, so that's no help.

Don't take all this too seriously. Larger programs have many different needs, and there
are many ways to accommodate them. In reality, you are writing the program for others
to read. Mostly it's you, but you want to make it easy on yourself later.

-sln

{
no strict;
use warnings;

my $aref; # autov on element Read
print !$aref->[0] ? "aref\) '@{$aref}'\n" : "won't see this\n";

my $bref;
print "bref) '@{$bref}'\n"; # no autov, error undef value in array deref (read)

my $cref;
$cref->[0] = 'hello'; # autov on element Write
print "cref) '@{$cref}'\n";

print "\n";

my %hash; # autov hash keys on Read or Write
print "hash> '@{[%hash]}'\n" if !$hash{'key'};

# autov on element Read
print !$hash{'Akey'}->[0] ? "Akey\) '@{$hash{'Akey'}}'\n" : "won't see this\n";

print "Bkey) '@{$hash{'Bkey'}}'\n"; # no autov, undef value in array deref (read)

$hash{'Ckey'}->[0] = 'hello'; # autov on element Write
print "Ckey) '@{$hash{'Ckey'}}'\n";

my @array = @{$hash{'Dkey'}}; # no autov, error undef value in array deref (read)


@{$hash{'Ekey'}} = (); # same as $hash{'Ekey'} = []
print "Ekey) '@{$hash{'Ekey'}}'\n";
print "$hash{'Ekey'} = ".\@{$hash{'Ekey'}}."\n";
}
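
A smaller sketch of the read case under "use strict", added for illustration with made-up key names: a plain read of a missing key leaves the hash alone, while reading through a missing key creates the intermediate container.

```perl
use strict;
use warnings;

my %hash;

# A plain read of a missing key does not create it.
my $plain = $hash{plain};
my $plain_created = exists $hash{plain} ? 1 : 0;

# Reading *through* a missing key creates the intermediate hash,
# even though nothing is assigned.
my $nested = $hash{outer}{inner};
my $outer_created = exists $hash{outer} ? 1 : 0;
my $inner_created = exists $hash{outer}{inner} ? 1 : 0;

print "plain key created: $plain_created\n";
print "outer key created: $outer_created\n";
print "inner key created: $inner_created\n";
```

This is why an "exists" check on a nested element can itself add the outer key as a side effect.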
 
