Constructing a value beforehand

Xze · Feb 23, 2012

Hi everyone,

I'm trying to implement the 'verbose' mode and got stuck.
Here is the requirement:
if verbose mode is on then: $just_hash -> {$key} = $val1.$val2.$val3
if verbose mode is off then: $just_hash -> {$key} = $val3

Now, how do I avoid the useless 'is verbose on or off' checks in the loop,
given that this is known before entering the loop (please see the
code)?

This is what I currently have:

my %hOpt = qw{-v 1 --verbose 1};
my $bVerbose = 0;
for (@ARGV) {
    $hOpt{$_} ? $bVerbose = 1 : usage ("Unknown option: $_");
}

while(<FH>){
    chomp;
    my @aflds = split/,/;
    #Verbose on: $just_hash -> {$key} = $val1.$val2.$val3;
    #Verbose off: $just_hash -> {$key} = $val3;
    $bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) :
                $just_hash -> {$key} = ($val3);
}

but this code checks $bVerbose at every iteration, which is ugly. I
feel that there should be a Perl-ish way to accomplish this.
Is it possible to construct a pattern of the value beforehand, based
on $bVerbose?

Please advise,
Thanks

Rainer Weikusat · Feb 23, 2012

Xze said:
I'm trying to implement the 'verbose' mode and got stuck.
Here is the requirement:
if verbose mode is on then: $just_hash -> {$key} = $val1.$val2.$val3
if verbose mode is off then: $just_hash -> {$key} = $val3

Now, how do I avoid useless checks 'is verbose on or off' in the loop,
[...]

my %hOpt = qw{-v 1 --verbose 1};
my $bVerbose = 0;

Do you have other kinds of 'verbosity' beyond b-verbosity? If not,
consider dropping the b.

for (@ARGV) {
    $hOpt{$_} ? $bVerbose = 1 : usage ("Unknown option: $_");
}

while(<FH>){
    chomp;
    my @aflds = split/,/;
    #Verbose on: $just_hash -> {$key} = $val1.$val2.$val3;
    #Verbose off: $just_hash -> {$key} = $val3;
    $bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) :
                $just_hash -> {$key} = ($val3);
}

but this code checks $bVerbose at every iteration, which is ugly.

There are a variety of options, but the most straightforward one (and
the only one without a performance penalty) is to move the verbose check
out of the loop and use two loops, one which does the 'verbose' stuff
and one which doesn't. An aesthetically nicer one would be to assign a
reference to a subroutine, roughly like this:

$value = $verbose ? sub { return $val1.$val2.$val3; } : sub { return $val3; };
while (<FH>) {
    .
    .
    .
    $just_hash->{$key} = $value->();
}

Tim McDaniel · Feb 23, 2012

$bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) : $just_hash -> {$key} = ($val3);

Just a tactical thing.

The ?: operator works on values just as well. The left-hand sides are
identical, and I believe that, where it's at all convenient, identical
code should be factored out and kept in one place.

So I'd write it as

$just_hash->{$key} = ($bVerbose ? $val1.$val2.$val3 : $val3);

According to "man perlop", I believe that the parentheses are not
needed. However, for operators that I use rarely and where I'm not
sure of the precedence off the top of my head, I tend to
(over)parenthesize just to be certain.
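
Factoring the assignment out is more than a matter of taste here. perlop documents that the conditional operator binds tighter than assignment, so the two-assignment form parses as ($bVerbose ? ($h{k} = A) : $h{k}) = B, and the trailing assignment wins. A minimal sketch with made-up sample values (the hash names are only for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample values standing in for the fields read from the file.
my ($val1, $val2, $val3) = ('a', 'b', 'c');
my $bVerbose = 1;
my (%two_assignments, %factored);

# Parses as: ($bVerbose ? ($h{k} = 'abc') : $h{k}) = $val3;
# so the final assignment overwrites what the verbose branch stored.
$bVerbose ? $two_assignments{k} = ($val1.$val2.$val3) : $two_assignments{k} = ($val3);

# One assignment; the conditional only chooses the value.
$factored{k} = ($bVerbose ? $val1.$val2.$val3 : $val3);

print "two assignments: $two_assignments{k}\n";   # c
print "factored:        $factored{k}\n";          # abc
```

So with verbose mode on, the two-assignment form still stores only $val3, while the factored form stores the concatenation.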

Tim McDaniel · Feb 23, 2012

We disagree, Xze: I think you're being overly fastidious. I don't see
a clean alternative and the inefficiency is trivial.

There are a variety of options, but the most straightforward one (and
the only one without a performance penalty) is to move the verbose check
out of the loop and use two loops, one which does the 'verbose' stuff
and one which doesn't.

I think it's very much more important to factor it so that common code
usually occurs in only one place. When maintaining it, you'd have to
read both loops to see that they're close to identical. If you
need to change it you'd have to remember to change it in both places,
and if you don't, Perl won't catch it and you'll likely have a
hard-to-find bug.

An aesthetically nicer one would be to assign a reference to a
subroutine, roughly like this:

$value = $verbose ? sub { return $val1.$val2.$val3; } : sub { return $val3; };
while (<FH>) {
    .
    .
    .
    $just_hash->{$key} = $value->();
}

It is clever; I hadn't thought of that possibility.

Then you have to have $val1, $val2, and $val3 defined in the outer
scope, or at least create an extra {...} pair to enclose their
declarations and this code.

More seriously,

$just_hash->{$key} = $value->();

By looking at this line, you can't tell what variables it's using --
it's splitting a lot of important information between two widely
separated parts. You could make it

$just_hash->{$key} = $value->($val1, $val2, $val3);

(which solves the scope problem I mentioned). But it's still not
evident from looking at the line what it does with them -- it's still
splitting information. It's also not that efficient.

I think all these alternatives are much uglier than just checking one
variable in a loop.

Rainer Weikusat · Feb 23, 2012

Just a tactical thing.

The ?: operator works on values just as well. The left-hand sides are
identical, and I believe that, where it's at all convenient, identical
code should be factored out and kept in one place.

So I'd write it as

$just_hash->{$key} = ($bVerbose ? $val1.$val2.$val3 : $val3);

This lends itself to another variant:

my $verbose; # Perl

alternatively,

my $verbose = ''; # Chicken-Little-Perl

and then use

$just_hash->{$key} = ($verbose && $val1.$val2).$val3;
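
The initial value matters for this variant, because && returns its left operand whenever that operand is false. A small sketch, with made-up field values and a hypothetical helper named build, comparing an empty-string flag with the 0 used in the original post:

```perl
use strict;
use warnings;

# Hypothetical field values.
my ($val1, $val2, $val3) = ('a', 'b', 'c');

# Builds the stored value the way the && variant does.
sub build {
    my ($verbose) = @_;
    return ($verbose && $val1.$val2) . $val3;
}

print build(1),  "\n";   # abc
print build(''), "\n";   # c
print build(0),  "\n";   # 0c, because (0 && ...) evaluates to 0
```

An undefined flag also produces c, but with warnings enabled the concatenation reports an uninitialized value, which is presumably why the empty-string initialisation is offered as the cautious alternative.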

Rainer Weikusat · Feb 23, 2012

We disagree, Xze: I think you're being overly fastidious. I don't see
a clean alternative and the inefficiency is trivial.

The amount of additional work this does is proportional to the number
of lines in the input file. And this means it can become arbitrarily
large.

[...]

It is clever; I hadn't thought of that possibility.

Then you have to have $val1, $val2, and $val3 defined in the outer
scope, or at least create an extra {...} pair to enclose their
declarations and this code.

Since it is completely unknown what these $valn are etc because the
original example code didn't contain them anywhere, it is somewhat
pointless to make speculative arguments like the one above: I was
using the same dysfunctional example code that was originally posted
to mean 'whatever is supposed to be done in this case', as it was in
the original.

More seriously,
By looking at this line, you can't tell what variables it's using --
it's splitting a lot of important information between two widely
separated parts.

Indeed: By introducing an abstraction which encapsulates some
irrelevant detail of the original code, this irrelevant detail is
moved out of sight and what remains is the more general loop

while (<FH>) {

Xze · Feb 23, 2012

Is this Hungarian notation? Perl variables already have sigils, they
don't need more prefixes.

The 'b' stands for boolean, a convention used in this particular code
to denote the variable type/value

Also, there's no need for that '= 0'. Newly-declared variables are
initialised to undef, which is a perfectly good false value.

You might want to consider using one of the Getopt modules instead of
rolling your own.

None of the modules will allow using the combination of short and long
forms (e.g. -v and --verbose), AFAIK

while(<FH>){

Don't use global bareword filehandles, use filehandles in lexical
variables. If you'd shown us how you opened the file I could show you
what I mean; something like

open my $FH, "<", "file" or die ...;
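
A self-contained sketch of the lexical-filehandle style, since the original open call was not shown. The in-memory file, the sample records, and the assumption that the key is the first field are all made up for illustration:

```perl
use strict;
use warnings;

# Assumed sample data; an in-memory file stands in for the real input file.
my $data = "k1,a,b,c\nk2,d,e,f\n";

# Three-argument open with a lexical handle instead of the bareword FH.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my %seen;
while (my $line = <$fh>) {
    chomp $line;
    my ($key, @fields) = split /,/, $line;
    $seen{$key} = join '', @fields;
}
close $fh;

print "$_ => $seen{$_}\n" for sort keys %seen;   # k1 => abc, then k2 => def
```

The handle goes out of scope with the enclosing block, so it cannot clash with another FH elsewhere in the program.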

    chomp;
    my @aflds = split/,/;
    #Verbose on: $just_hash -> {$key} = $val1.$val2.$val3;
    #Verbose off: $just_hash -> {$key} = $val3;
    $bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) :
                $just_hash -> {$key} = ($val3);

Please post complete code. Where do $val1 &c. come from? Do you just
mean @aflds[0..3], or are they supposed to come from outside the loop?

}

but this code checks $bVerbose at every iteration, which is ugly.

It's not bad, actually. If your code really is this simple it may well
be the best option: all the other choices have overhead, and a simple ?:
is neither expensive nor unclear.

I feel that there should be a perl-ish way to accomplish this.
Is it possible to construct a pattern of the value beforehand, based
on $bVerbose?

There are basically two options: duplicate the loop, or use a subref.
The first is more verbose than what you have:

if ($bVerbose) {
    while (<FH>) {
        ...;
        $just_hash->{$key} = $val1.$val2.$val3;
    }
}
else {
    while (<FH>) {
        ...;
        $just_hash->{$key} = $val3;
    }
}

I'm trying to avoid code duplication as it becomes a headache to
maintain it later, so i'll opt for having only one loop with the
checks inside or without the checks, if a solution can be found

and probably isn't even faster. The second looks something like this:

my $get_val = $bVerbose
    ? sub { $val1.$val2.$val3 }
    : sub { $val3 };

while (<FH>) {
    ...;
    $just_hash->{$key} = $get_val->();
}

unless $valN actually live inside the loop. In that case you have to
pass them in to the subref, so you end up with

my $get_val = $bVerbose
    ? sub { join "", @_ }
    : sub { $_[2] };

while (<FH>) {
    ...;
    $just_hash->{$key} = $get_val->($val1, $val2, $val3);
}

It's nice, but the logic didn't change. You're still performing the
check of $bVerbose at every iteration.

Ben

Below is the full code:

sub populate_hash {
    my $just_file = shift;
    my $just_hash = shift;
    open(FH,$just_file) or die "Error reading file $just_file: $!";
    while(<FH>){
        chomp;
        my @aflds = split/,/;
        ($val1, $val2, $val3) = @aflds[5,6,9];
        #Verbose on: $just_hash -> {$key} = $val1.$val2.$val3;
        #Verbose off: $just_hash -> {$key} = $val3;
        #$bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) :
        #            $just_hash -> {$key} = ($val3);
        $just_hash -> {$key} = ($val1.$val2) x!! $bVerbose . $val3;
    }
    close (FH);
}

The ternary operator could be replaced by the following, which is what
i did:
$just_hash -> {$key} = ($val1.$val2) x!! $bVerbose . $val3;
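
For readers puzzling over that expression: per the perlop precedence table, unary ! binds tighter than x, which in turn binds tighter than the concatenation dot. So the prefix is repeated zero times or once, and $val3 is appended afterwards. A short sketch with made-up field values (the %result hash is only there to collect the outcomes):

```perl
use strict;
use warnings;

# Hypothetical field values.
my ($val1, $val2, $val3) = ('a', 'b', 'c');
my %result;

for my $bVerbose (0, 1) {
    # !! normalises the flag to a canonical false/true value, x repeats
    # the prefix that many times (0 or 1), and . appends $val3 last
    # because it binds more loosely than x.
    my $value = ($val1.$val2) x!! $bVerbose . $val3;
    $result{$bVerbose} = $value;
    print "verbose=$bVerbose: $value\n";
}
# verbose=0: c
# verbose=1: abc
```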

In fact, i don't have any big issue with the code, it works fine.
It's just my personal curiosity..

Maybe it is possible to eval in a tricky way like this:

# define the pattern before loop
my $val = ($val1.$val2) x!! $bVerbose . $val3;
# and later in the loop:
....
$just_hash -> {$key} = eval $val; #?? <make the assignment happen
                                  # again, but with the desired pattern>

Thanks a lot to everyone for spending your time!!
Xze

Tim McDaniel · Feb 23, 2012

use of undef in a boolean context doesn't provoke any warnings.
However, personally I dislike implicit initialization and avoid undef
(except when I need a value distinguishable from all valid values).
So I tend to initialize to an appropriate value, like 0.

None of the modules will allow using the combination of short and long
forms (e.g. -v and --verbose), AFAIK

http://perldoc.perl.org/Getopt/Long.html

It allows alternatives, like
GetOptions ('length|height=f' => \$length);
and by default it allows automatic abbreviation to a unique prefix,
which may be one letter. I haven't tried it, but I'd try
noauto_abbrev to get rid of that automatic shortening, and list
explicit one-character synonyms as needed.
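
A minimal sketch of the alias syntax applied to the verbose flag. The helper parse_verbose is hypothetical, and GetOptionsFromArray is used instead of GetOptions only so the example can be run without touching @ARGV:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parses one argument list and returns the verbose flag (0 or 1).
# 'verbose|v' declares --verbose with -v as an explicit alias.
sub parse_verbose {
    my @args = @_;
    my $verbose = 0;
    GetOptionsFromArray(\@args, 'verbose|v' => \$verbose)
        or die "Unknown option\n";
    return $verbose;
}

print parse_verbose('-v'),        "\n";   # 1
print parse_verbose('--verbose'), "\n";   # 1
print parse_verbose(),            "\n";   # 0
```

An unrecognised option makes GetOptionsFromArray return false, which takes the place of the hand-written usage() check.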

The ternary operator could be replaced by the following, which is what
i did:
$just_hash -> {$key} = ($val1.$val2) x!! $bVerbose . $val3;

*blink* *blink*
Oooookay, that works, but I'd suggest ($verbose && ($val1.$val2)) as
someone else suggested.

May be it is possible to eval in a tricky way like this:

# define the pattern before loop
my $val = ($val1.$val2) x!! $bVerbose . $val3;
# and later in the loop:
...
$just_hash -> {$key} = eval $val; #?? <make the assignment happen
                                  # again, but with the desired pattern>

You'd need to have
my $val = 'some code here';
to set it up for that eval. It is ridiculously inefficient to call
the Perl parser on each iteration.

If you want to generate code on the fly, I think it's standard to have
the code be a sub, like
my $code = 'sub { the code you want }';
my $code_ref = eval $code;
loop {
    ...
    ... $code_ref->(args)
}
That way the overhead for the eval is paid only once.
But there's no need for an eval in the situation you posit.
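
A runnable version of that outline, as a sketch only. The flag value and the records are made up, and the two source strings simply mirror the anonymous subs shown elsewhere in this thread:

```perl
use strict;
use warnings;

my $bVerbose = 1;   # assumed to be set from the command line

# Choose the source text once, then compile it once.
my $code = $bVerbose
    ? 'sub { join "", @_ }'
    : 'sub { $_[2] }';
my $code_ref = eval $code or die "Compilation failed: $@";

# Hypothetical records standing in for the parsed input lines.
my @records = (['a', 'b', 'c'], ['d', 'e', 'f']);
my @values  = map { $code_ref->(@$_) } @records;

print "@values\n";   # abc def
```

The parser runs once, before the loop; each iteration is then an ordinary sub call.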

Rainer Weikusat · Feb 23, 2012

[...]

my $get_val = $bVerbose
    ? sub { $val1.$val2.$val3 }
    : sub { $val3 };

while (<FH>) {
    ...;
    $just_hash->{$key} = $get_val->();
}

unless $valN actually live inside the loop. In that case you have to
pass them in to the subref, so you end up with

my $get_val = $bVerbose
    ? sub { join "", @_ }
    : sub { $_[2] };

while (<FH>) {
    ...;
    $just_hash->{$key} = $get_val->($val1, $val2, $val3);
}

It's nice, but the logic didn't change. You're still performing the
check of $bVerbose at every iteration.

This is wrong: The value of $get_val will be a code reference to a
subroutine which performs either the verbose or the non-verbose
operation. Which one it will be is determined at the time of the
assignment. Later on, the code in the loop-body just invokes whatever
subroutine reference was assigned to $get_val.
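
One way to convince yourself of this is to change the flag after the subroutine has been chosen. A small sketch with made-up records:

```perl
use strict;
use warnings;

my $bVerbose = 1;

# The flag is read here, once; $get_val then holds one of the two subs.
my $get_val = $bVerbose
    ? sub { join "", @_ }
    : sub { $_[2] };

# Flipping the flag afterwards has no effect on the sub already chosen,
# which shows that the loop body never consults $bVerbose.
$bVerbose = 0;

my @out;
for my $rec (['a', 'b', 'c'], ['d', 'e', 'f']) {   # made-up records
    push @out, $get_val->(@$rec);
}
print "@out\n";   # abc def
```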

This is actually a generally useful technique for eliminating
comparisons whose outcome is already known at some 'point of call'
(I refuse to use the term 'pattern' for that):
Instead of using a switching statement à la

if (...) do_w();
elsif (...) do_x();
elsif (...) do_y();
else do_z();

a variable $operation is introduced and the code in the loop just does
$operation->(). The code dealing with the state changes sets the
value of $operation to something representing the code which is to be
executed in the current state, as opposed to setting some state
variable and selecting one of several code alternatives based on the
current value of that whenever 'the operation associated with the
current state' needs to be performed.
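
A minimal sketch of that idea in Perl. The two states, their names, and the rule that a blank line ends the header are assumptions made purely for illustration:

```perl
use strict;
use warnings;

my (@header, @body);

# One sub per state. $operation always refers to the sub for the
# current state.
my $operation;
my $in_body   = sub { push @body, $_[0] };
my $in_header = sub {
    my ($line) = @_;
    if ($line eq '') { $operation = $in_body }   # state change
    else             { push @header, $line }
};
$operation = $in_header;

for my $line ('To: x', 'From: y', '', 'hello', 'world') {
    $operation->($line);   # no test of the current state here
}

print scalar(@header), " header lines, ", scalar(@body), " body lines\n";
# 2 header lines, 2 body lines
```

The only place that knows about the transition is the sub that performs it; the loop itself stays the same however many states are added.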

Martijn Lievaart · Feb 23, 2012

(I have always quite wanted an 'else' which would turn 'and' and 'or'
into a ternary construction...)

If you like'm ugly and unreadable:

(this() and (that(),1)) or so();

M4

Xze · Feb 24, 2012

The 'b' stands for boolean, a convention used in this particular code
to denote the variable type/value

Yes. That's called 'Hungarian notation'. Particularly in the case of 'h'
for hash and 'a' for array, it is completely redundant with the @ or %
already on the variable. I realise Perl doesn't have a dedicated sigil
for 'boolean', but it's still not useful

None of the modules will allow using the combination of short and long
forms (e.g. -v and --verbose), AFAIK

Getopt::Long will, among others.

I'm trying to avoid code duplication as it becomes a headache to
maintain it later, so i'll opt for having only one loop with the
checks inside or without the checks, if a solution can be found

Yup, that's sensible.

my $get_val = $bVerbose
    ? sub { join "", @_ }
    : sub { $_[2] };
while (<FH>) {
    ...;
    $just_hash->{$key} = $get_val->($val1, $val2, $val3);
}

It's nice, but the logic didn't change. You're still performing the
check of $bVerbose at every iteration.

No I'm not. $bVerbose is checked once, outside the loop, and then
whichever subref we picked is called inside. Maybe it would be clearer
if I wrote it like this?

my $get_val;
if ($bVerbose) {
    $get_val = sub { join "", @_ };
}
else {
    $get_val = sub { $_[2] };
}

while (...) {
    ... = $get_val->(...);
}

The ternary operator could be replaced by the following, which is what
i did:
$just_hash -> {$key} = ($val1.$val2) x!! $bVerbose . $val3;

Oh, yuck. That's just *nasty*. In any case, it's more work than the
ternary.

In fact, i don't have any big issue with the code, it works fine.
It's just my personal curiosity..

Maybe it is possible to eval in a tricky way like this:

# define the pattern before loop
my $val = ($val1.$val2) x!! $bVerbose . $val3;
# and later in the loop:
...
$just_hash -> {$key} = eval $val; #?? <make the assignment happen
                                  # again, but with the desired pattern>

This is pretty-much exactly what I was doing with the anon subs above,
but without needing to resort to 'eval'. I could have written it like
this:

my $get_val = $bVerbose
    ? '$val1.$val2.$val3' # note single quotes!
    : '$val3';

while (...) {
    ... = eval $get_val;
}

which works exactly the same, except that the anon subs are a whole lot
cleaner and safer.

Ben

Thanks for the detailed explanations!

Indeed the following snippet:
my $get_val = $bVerbose
? sub { join "", @_ }
: sub { $_[2] };

does what i was looking for, somehow i missed that in your first post.
But it is not as fast as the ternary operator inside the loop.

Thanks everyone for spending your time!
Cheers,
Xze
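
For anyone who wants to check the speed difference on their own machine, here is a rough sketch using the core Benchmark module. The field values are made up, and the numbers depend on the machine and the perl version, so no output is shown:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $bVerbose = 1;
my ($val1, $val2, $val3) = ('a', 'b', 'c');   # made-up field values
my $get_val = $bVerbose ? sub { join "", @_ } : sub { $_[2] };

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    ternary => sub { my $v = $bVerbose ? $val1.$val2.$val3 : $val3 },
    subref  => sub { my $v = $get_val->($val1, $val2, $val3) },
});
```

A sub call in Perl generally costs more than evaluating one conditional, so a result in favour of the ternary would not be surprising; the subref approach tends to pay off when the per-iteration decision is more involved than a single flag.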

Peter J. Holzer · Feb 25, 2012

Is this Hungarian notation? Perl variables already have sigils, they
don't need more prefixes.

While I don't use Hungarian notation myself (I think it's ugly, the
wrong way around, and not much use in practice), I disagree:

Perl sigils are no adequate substitute for the type prefixes in
Hungarian notations. Ignoring the fact that sigils don't even really
denote the type of a variable, there are only three of them, and even in
Systems Hungarian notation there are many more types which can be
distinguished:

$bVerbose (Verbose is a boolean)
$nObjects (Objects is a count, i.e., a non-negative integer)
$fLength (Length is floating point number)
$bsParam (Param is a byte string, you still need to decode() it)
$csParam (Param is a character string)
$dogSpot (Spot is object of class Animals::Dog)

If you go into Application Hungarian Notation (the original kind
invented by Charles Simonyi, which I find a lot more useful than systems
HN), you distinguish logical types, for example:

my $lenWidth = 5; # a length
my $lenHeight = 3; # another length
my $arRect; # an area

$arRect = $lenWidth * $lenHeight; # ok. an Area is the product of
# two lengths

if ($arRect > $lenHeight) # not ok. You can't directly compare an
# area and a length

Or if you have to use lengths in inches and centimeters, you could
encode the type to avoid crashing your space probe ...
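
A tiny sketch of the units case. The prefixes cm and in, the helper cm_from_in, and the numbers are all illustrative, not an established convention:

```perl
use strict;
use warnings;

# Prefixes name the unit: cm = centimetres, in = inches (illustrative names).
sub cm_from_in { my ($in) = @_; return $in * 2.54 }

my $cmWidth  = 10;
my $inHeight = 4;

# "cm = cm_from_in(in)" reads as consistent, whereas an expression like
# $cmWidth * $inHeight would mix prefixes and look suspect on sight.
my $cmHeight = cm_from_in($inHeight);
my $arRect   = $cmWidth * $cmHeight;   # area in square centimetres

printf "%.1f\n", $arRect;   # 101.6
```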

hp

Dr.Ruud · Feb 25, 2012

Perl sigils are no adequate substitute for the type prefixes in
Hungarian notations. Ignoring the fact that sigils don't even really
denote the type of a variable, there are only three of them, and even in
Systems Hungarian notation there are many more types which can be
distinguished:

$bVerbose (Verbose is a boolean)
$nObjects (Objects is a count, i.e., a non-negative integer)
$fLength (Length is floating point number)
$bsParam (Param is a byte string, you still need to decode() it)
$csParam (Param is a character string)
$dogSpot (Spot is object of class Animals::Dog)

In Perl, the sigil tells you about the structure (or count) of the
variable (or expression).

Perl is a strongly typed language. The datatype is in the operators.

my $elements = @array; # normally equal to $#array + 1

my $array = \@array; # take a reference

In p5p there is an auto-dereference thread, that wants to make

push $array, @elements;

be syntactic sugar for

push @$array, @elements;

I opposed that, by pointing at the consequence that

my $array = \@array;

should then mean that $array contains 2 values:
1. the number of elements (to be returned in numeric context), and
2. the array-reference (to be returned in scalar context, specifically
in auto-dereference context)

Be aware that in Perl it is normal for a variable to have multiple
values at the same time.

Peter J. Holzer · Feb 26, 2012

In Perl, the sigil tells you about the structure (or count) of the
variable (or expression).

I'm not sure whether you are agreeing or disagreeing with me, and
what this has to do with Hungarian notation.

Perl is a strongly typed language. The datatype is in the operators.

I am aware that no two people agree on what "strongly typed" means
exactly, but I'm sure any definition of the term which includes Perl is
far from the mainstream.

In p5p there is an auto-dereference thread, that wants to make

push $array, @elements;

be syntactic sugar for

push @$array, @elements;

AFAIK that has already happened (5.14, I think).
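
For reference: 5.14 did accept push on a scalar holding an array reference, as an experimental feature, and that feature was removed again in 5.24. The explicit dereference is therefore the form that works across versions. A short sketch with made-up data:

```perl
use strict;
use warnings;

my @array    = (1, 2, 3);
my $array    = \@array;     # take a reference
my @elements = (4, 5);

push @$array, @elements;    # explicit dereference, valid on every perl 5

my $count = @array;         # an array in scalar context gives its length
print "$count\n";           # 5
print $#array + 1, "\n";    # 5
```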

I opposed that, by pointing at the consequence that

my $array = \@array;

should then mean that $array contains 2 values:
1. the number of elements (to be returned in numeric context), and
2. the array-reference (to be returned in scalar context, specifically
in auto-dereference context)

I don't see how that is a necessary consequence of allowing push to
autodereference its first parameter if is a scalar. Many builtins (and
user-defined subs, too) allow different parameters.

Even if autodereference is used in a lot more contexts, I think it's a
bit weird to say that $array contains the number of elements in case 1.
It makes more sense to say that $array is autodereferenced in a numeric
context, so that ($array + 0) is equivalent to (@$array + 0). That would
also be mostly backwards compatible I think, since a reference in
numeric context returns no usable value. I would want an exception for
== and !=, though: Comparing references for equality is useful.

Autodereferencing array references in string context is probably not a
good idea, however: Printing it as "ARRAY(0x8182818)" is too useful for
debugging.

Be aware that in Perl it is normal for a variable to have multiple
values at the same time.

Yes, although you can't currently have a reference and something else in
a scalar, and the different values you can put into a scalar (PV, NV,
IV) are normally just different representations of the same value
(yes, I know about Scalar::Util::dualvar).
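
For completeness, a short sketch of that exception, where the numeric and string values of one scalar are unrelated. The values are made up:

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# A scalar whose numeric and string values are independent of each other.
my $status = dualvar(404, 'Not Found');

print $status + 0, "\n";   # 404
print "$status\n";         # Not Found
```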

hp

Peter J. Holzer · Feb 26, 2012

Is this Hungarian notation? Perl variables already have sigils, they
don't need more prefixes.

While I don't use Hungarian notation myself (I think it's ugly, the
wrong way around, and not much use in practice), I disagree:

Perl sigils are no adequate substitute for the type prefixes in
Hungarian notations. Ignoring the fact that sigils don't even really
denote the type of a variable, there are only three of them, and even in
Systems Hungarian notation there are many more types which can be
distinguished: [...]
If you go into Application Hungarian Notation (the original kind
invented by Charles Simonyi, which I find a lot more useful than systems
HN), you distinguish logical types, for example:

I just stumbled across this article by Joel Spolsky about Hungarian
notation (especially about the advantages of Apps Hungarian):

http://www.joelonsoftware.com/articles/Wrong.html

It's well worth reading and thinking about.

hp
