Constructing a value beforehand

Xze

Hi everyone,

I'm trying to implement the 'verbose' mode and got stuck.
Here is the requirement:
if verbose mode is on then: $just_hash -> {$key} = $val1.$val2.$val3
if verbose mode is off then: $just_hash -> {$key} = $val3

Now, how do I avoid useless checks 'is verbose on or off' in the loop,
having that this is known before entering the loop (please see the
code)?

This is what I currently have:

my %hOpt = qw{-v 1 --verbose 1};
my $bVerbose = 0;
for (@ARGV) {
$hOpt{$_} ? $bVerbose = 1 : usage ("Unknown option: $_");
}

while(<FH>){
chomp;
my @aflds = split/,/;
#Verbose on: $just_hash -> {$key} = $val1.$val2.$val3;
#Verbose off: $just_hash -> {$key} = $val3;
$bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) :
$just_hash -> {$key} = ($val3);
}

but this code checks $bVerbose at every iteration, which is ugly. I
feel that there should be a Perl-ish way to accomplish this.
Is it possible to construct a pattern of the value beforehand, based
on $bVerbose?

Please advise,
Thanks
 
Rainer Weikusat

Xze said:
I'm trying to implement the 'verbose' mode and got stuck.
Here is the requirement:
if verbose mode is on then: $just_hash -> {$key} = $val1.$val2.$val3
if verbose mode is off then: $just_hash -> {$key} = $val3

Now, how do I avoid useless checks 'is verbose on or off' in the loop,
[...]

my %hOpt = qw{-v 1 --verbose 1};
my $bVerbose = 0;

Do you have other kinds of 'verbosity' beyond b-verbosity? If not,
consider dropping the b.
for (@ARGV) {
$hOpt{$_} ? $bVerbose = 1 : usage ("Unknown option: $_");
}

while(<FH>){
chomp;
my @aflds = split/,/;
#Verbose on: $just_hash -> {$key} = $val1.$val2.$val3;
#Verbose off: $just_hash -> {$key} = $val3;
$bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) :
$just_hash -> {$key} = ($val3);
}

but this code checks $bVerbose at every iteration, which is ugly.

There are a variety of options but the most straight-forward one (and
the only without a performance penalty) is to move the verbose check
out of the loop and use two loops, one which does the 'verbose' stuff
and one which doesn't. An aesthetically nicer one would be to assign a
reference to a subroutine, roughly like this:

$value = $verbose ? sub { return $val1.$val2.$val3; } : sub { return $val3; };
while (<FH>) {
    ...;
    $just_hash->{$key} = $value->();
}
 
Tim McDaniel

$bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) : $just_hash -> {$key} = ($val3);

Just a tactical thing.

The ?: operator works on values just as well. The left-hand sides are
identical, and I believe that, where it's at all convenient, identical
code should be factored out and kept in one place.

So I'd write it as

$just_hash->{$key} = ($bVerbose ? $val1.$val2.$val3 : $val3);

According to "man perlop", I believe that the parentheses are not
needed. However, for operators that I use rarely and where I'm not
sure of the precedence off the top of my head, I tend to
(over)parenthesize just to be certain.
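
To make the precedence point concrete, here is a small sketch. The field values are made up (the thread never shows real data). Since '.' binds tighter than '?:', which in turn binds tighter than '=', the parenthesized and bare forms should give the same result:

```perl
use strict;
use warnings;

# Hypothetical sample fields; the original post does not show real data.
my ($val1, $val2, $val3) = ('a', 'b', 'c');
my %result;

for my $verbose (0, 1) {
    # '.' binds tighter than '?:', which binds tighter than '=',
    # so both forms below parse the same way.
    my $with_parens = ($verbose ? $val1.$val2.$val3 : $val3);
    my $bare        = $verbose ? $val1.$val2.$val3 : $val3;
    $result{$verbose} = [$with_parens, $bare];
}

print "verbose=$_: @{ $result{$_} }\n" for sort keys %result;
```

The parentheses cost nothing, so keeping them for readability is a matter of taste.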
 
Tim McDaniel

We disagree, Xze: I think you're being overly fastidious. I don't see
a clean alternative and the inefficiency is trivial.
There are a variety of options but the most straight-forward one (and
the only without a performance penalty) is to move the verbose check
out of the loop and use two loops, one which does the 'verbose' stuff
and one which doesn't.

I think it's very much more important to factor it so that common code
usually occurs in only one place. When maintaining it, you'd have to
read both loops to see that they're close to identical. If you
need to change it you'd have to remember to change it in both places,
and if you don't, Perl won't catch it and you'll likely have a
hard-to-find bug.
An aesthetically nicer one would be to assign a reference to a
subroutine, roughly like this:

$value = $verbose ? sub { return $val1.$val2.$val3; } : sub { return $val3; };
while (<FH>) {
.
.
.
$just_hash->{$key} = $value->();
}

It is clever; I hadn't thought of that possibility.

Then you have to have $val1, $val2, and $val3 defined in the outer
scope, or at least create an extra {...} pair to enclose their
declarations and this code.

More seriously,
$just_hash->{$key} = $value->();
By looking at this line, you can't tell what variables it's using --
it's splitting a lot of important information between two widely
separated parts. You could make it
$just_hash->{$key} = $value->($val1, $val2, $val3);
(which solves the scope problem I mentioned). But it's still not
evident from looking at the line what it does with them -- it's still
splitting information. It's also not that efficient.

I think all these alternatives are much uglier than just checking one
variable in a loop.
 
Rainer Weikusat

Just a tactical thing.

The ?: operator works on values just as well. The left-hand sides are
identical, and I believe that, where it's at all convenient, identical
code should be factored out and kept in one place.

So I'd write it as

$just_hash->{$key} = ($bVerbose ? $val1.$val2.$val3 : $val3);

This lends itself to another variant:

my $verbose; # Perl

alternatively,

my $verbose = ''; # Chicken-Little-Perl

and then use

$just_hash->{$key} = ($verbose && $val1.$val2).$val3;
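
One caveat with this variant, sketched below with made-up field values and a hypothetical helper name: '&&' returns the false value itself, so the result depends on which false value the flag holds. An empty string disappears in the concatenation, a 0 does not, and an undef should additionally trigger an 'uninitialized value' warning under 'use warnings'.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the expression from the post.
sub build_value {
    my ($verbose, $val1, $val2, $val3) = @_;
    # '.' binds tighter than '&&', so this is $verbose && ($val1.$val2)
    return ($verbose && $val1.$val2) . $val3;
}

my $on       = build_value(1,  'a', 'b', 'c');   # "abc"
my $off      = build_value('', 'a', 'b', 'c');   # "c"
my $off_zero = build_value(0,  'a', 'b', 'c');   # "0c": the false value 0 leaks in

print "$on $off $off_zero\n";
```

So this form is only safe when the flag is guaranteed to be either true or the empty string.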
 
Rainer Weikusat

We disagree, Xze: I think you're being overly fastidious. I don't see
a clean alternative and the inefficiency is trivial.

The amount of additional work this does is proportional to the number
of lines in the input file. And this means it can become arbitrarily
large.

[...]
It is clever; I hadn't thought of that possibility.

Then you have to have $val1, $val2, and $val3 defined in the outer
scope, or at least create an extra {...} pair to enclose their
declarations and this code.

Since it is completely unknown what these $valn are etc because the
original example code didn't contain them anywhere, it is somewhat
pointless to make speculative arguments like the one above: I was
using the same dysfunctional example code that was originally posted
to mean 'whatever is supposed to be done in this case', as it was in
the original.

More seriously,
By looking at this line, you can't tell what variables it's using --
it's splitting a lot of important information between two widely
separated parts.

Indeed: By introducing an abstraction which encapsulates some
irrelevant detail of the original code, this irrelevant detail is
moved out of sight and what remains is the more general loop

while (<FH>) {
 
Xze

Is this Hungarian notation? Perl variables already have sigils, they
don't need more prefixes.

The 'b' stands for boolean, a convention used in this particular code
to denote the variable type/value
Also, there's no need for that '= 0'. Newly-declared variables are
initialised to undef, which is a perfectly good false value.


You might want to consider using one of the Getopt modules instead of
rolling your own.

None of the modules will allow using the combination of short and long
forms (e.g. -v and --verbose), AFAIK
while(<FH>){

Don't use global bareword filehandles, use filehandles in lexical
variables. If you'd shown us how you opened the file I could show you
what I mean; something like

    open my $FH, "<", "file" or die ...;
   chomp;
      my @aflds = split/,/;
      #Verbose on: $just_hash -> {$key} = $val1.$val2.$val3;
      #Verbose off: $just_hash -> {$key} = $val3;
      $bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) :
$just_hash -> {$key} = ($val3);

Please post complete code. Where do $val1 &c. come from? Do you just
mean @aflds[0..3], or are they supposed to come from outside the loop?
but this code checks $bVerbose at every iteration, which is ugly.

It's not bad, actually. If your code really is this simple it may well
be the best option: all the other choices have overhead, and a simple ?:
is neither expensive nor unclear.
I
feel that there should be a perl-ish way to accomplish this
Is it possible to construct a pattern of the value beforehand, based
on $bVerbose?

There are basically two options: duplicate the loop, or use a subref.
The first is more verbose than what you have:

    if ($bVerbose) {
        while (<FH>) {
            ...;
            $just_hash->{$key} = $val1.$val2.$val3;
        }
    }
    else {
        while (<FH>) {
            ...;
            $just_hash->{$key} = $val3;
        }
    }

I'm trying to avoid code duplication as it becomes a headache to
maintain it later, so i'll opt for having only one loop with the
checks inside or without the checks, if a solution can be found
and probably isn't even faster. The second looks something like this:

    my $get_val = $bVerbose
        ? sub { $val1.$val2.$val3 }
        : sub { $val3 };

    while (<FH>) {
        ...;
        $just_hash->{$key} = $get_val->();
    }

unless $valN actually live inside the loop. In that case you have to
pass them in to the subref, so you end up with

    my $get_val = $bVerbose
        ? sub { join "", @_ }
        : sub { $_[2] };

    while (<FH>) {
        ...;
        $just_hash->{$key} = $get_val->($val1, $val2, $val3);
    }

It's nice, but the logic didn't change. You're still performing the
check of $bVerbose at every iteration.

Below is the full code:

sub populate_hash {
my $just_file = shift;
my $just_hash = shift;
open(FH,$just_file) or die "Error reading file $just_file: $!";
while(<FH>){
chomp;
my @aflds = split/,/;
($val1, $val2, $val3) = @aflds[5,6,9];
#Verbose on: $just_hash -> {$key} = $val1.$val2.$val3;
#Verbose off: $just_hash -> {$key} = $val3;
#$bVerbose ? $just_hash -> {$key} = ($val1.$val2.$val3) : $just_hash -> {$key} = ($val3);
$just_hash -> {$key} = ($val1.$val2) x!! $bVerbose . $val3;
}
close (FH);
}

The ternary operator could be replaced by the following, which is what
I did:
$just_hash -> {$key} = ($val1.$val2) x!! $bVerbose . $val3;

In fact, I don't have any big issue with the code, it works fine.
It's just my personal curiosity.

Maybe it is possible to eval in a tricky way like this:

# define the pattern before loop
my $val = ($val1.$val2) x!! $bVerbose . $val3;
# and later in the loop:
....
$just_hash -> {$key} = eval $val; #?? <make the assignment happen again, but with the desired pattern>


Thanks a lot to everyone for spending your time!!
Xze
 
Tim McDaniel

Use of undef in a boolean context doesn't provoke any warnings.
However, personally I dislike implicit initialization and avoid undef
(except when I need a value distinguishable from all valid values).
So I tend to initialize to an appropriate value, like 0.
None of the modules will allow using the combination of short and long
forms (e.g. -v and --verbose), AFAIK

http://perldoc.perl.org/Getopt/Long.html

It allows alternatives, like
GetOptions ('length|height=f' => \$length);
and by default it allows automatic abbreviation to a unique prefix,
which may be one letter. I haven't tried it, but I'd try
noauto_abbrev to get rid of that automatic shortening, and list
explicit one-character synonyms as needed.
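
For the -v / --verbose case specifically, a minimal sketch could look like this. The helper name parse_verbose is made up for illustration; GetOptionsFromArray is Getopt::Long's variant of GetOptions that reads from an array instead of @ARGV, which makes the example easy to try without real command-line arguments:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse an argument list and return the verbose flag.
sub parse_verbose {
    my @args = @_;
    my $verbose = 0;
    # 'verbose|v' declares --verbose with -v as an explicit alias.
    GetOptionsFromArray(\@args, 'verbose|v' => \$verbose)
        or die "Unknown option\n";
    return $verbose;
}

print parse_verbose('-v'), parse_verbose('--verbose'), parse_verbose(), "\n";
```

In a real script the same option spec would be passed to GetOptions, which works on @ARGV directly.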
The ternary operator could be replaced by the following, which is what
i did:
$just_hash -> {$key} = ($val1.$val2) x!! $bVerbose . $val3;

*blink* *blink*
Oooookay, that works, but I'd suggest ($verbose && ($val1.$val2)) as
someone else suggested.
May be it is possible to eval in a tricky way like this:

# define the pattern before loop
my $val = ($val1.$val2) x!! $bVerbose . $val3;
# and later in the loop:
...
$just_hash -> {$key} = eval $val; #?? <make the assignment happen again, but with the desired pattern>

You'd need to have
my $val = 'some code here';
to set it up for that eval. It is ridiculously inefficient to call
the Perl parser on each iteration.

If you want to generate code on the fly, I think it's standard to have
the code be a sub, like
my $code = 'sub { the code you want }';
my $code_ref = eval $code;
loop {
    ...
    ... $code_ref->(args)
}
That way the overhead for the eval is paid only once.
But there's no need for an eval in the situation you posit.
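
Purely as an illustration of the compile-once idea (the rows and the flag value are made up, and as noted an eval is not needed here), a runnable version of that outline might be:

```perl
use strict;
use warnings;

my $verbose = 1;    # assumed to be set from the command line

# Build the source text once, based on the flag ...
my $code = $verbose
    ? 'sub { join "", @_ }'
    : 'sub { $_[2] }';

# ... and compile it once; the loop then only calls the compiled sub.
my $code_ref = eval $code or die "Compile failed: $@";

my @out;
for my $row (['a', 'b', 'c'], ['d', 'e', 'f']) {    # made-up rows
    push @out, $code_ref->(@$row);
}
print "@out\n";
```

Writing the two subs directly, without the string and the eval, gives the same $code_ref with less risk.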
 
Rainer Weikusat

[...]
    my $get_val = $bVerbose
        ? sub { $val1.$val2.$val3 }
        : sub { $val3 };

    while (<FH>) {
        ...;
        $just_hash->{$key} = $get_val->();
    }

unless $valN actually live inside the loop. In that case you have to
pass them in to the subref, so you end up with

    my $get_val = $bVerbose
        ? sub { join "", @_ }
        : sub { $_[2] };

    while (<FH>) {
        ...;
        $just_hash->{$key} = $get_val->($val1, $val2, $val3);
    }

It's nice, but the logic didn't change. You're still performing the
check of $bVerbose at every iteration.

This is wrong: The value of $get_val will be a code reference to a
subroutine which performs either the verbose or the non-verbose
operation. Which one it will be is determined at the time of the
assignment. Later on, the code in the loop-body just invokes whatever
subroutine reference was assigned to $get_val.
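
A self-contained sketch of that arrangement, loosely modelled on the posted populate_hash, follows. Several things are assumptions: the flag is passed in as a parameter, field 0 is used as the hash key (the posted code never shows where $key comes from), and the input is an in-memory string with made-up records rather than a real file.

```perl
use strict;
use warnings;

# Hypothetical rework of populate_hash: the value builder is picked once,
# before the loop. Using field 0 as the key is an assumption.
sub populate_hash {
    my ($fh, $hash, $verbose) = @_;

    my $get_val = $verbose
        ? sub { join '', @_ }
        : sub { $_[2] };

    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /,/, $line;
        $hash->{ $fields[0] } = $get_val->(@fields[5, 6, 9]);
    }
    return $hash;
}

# In-memory file with two made-up records of ten fields each.
my $data = "k1,0,0,0,0,a,b,0,0,c\nk2,0,0,0,0,d,e,0,0,f\n";

open my $fh_v, '<', \$data or die "open: $!";
my $verbose_hash = populate_hash($fh_v, {}, 1);

open my $fh_q, '<', \$data or die "open: $!";
my $quiet_hash = populate_hash($fh_q, {}, 0);

print "$verbose_hash->{k1} $quiet_hash->{k1}\n";
```

Inside the while loop there is no test of $verbose at all, only a call through $get_val.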

This is actually a generally useful technique for eliminating
comparisons whose outcome is already known at some 'point of call'
(I refuse to use the term 'pattern' for that):
Instead of using a switching statement a la

if (...)    { do_w() }
elsif (...) { do_x() }
elsif (...) { do_y() }
else        { do_z() }

a variable $operation is introduced and the code in the loop just does
$operation->(). The code dealing with the state changes sets the
value of $operation to something representing the code which is to be
executed in the current state, as opposed to setting some state
variable and selecting one of several code alternatives based on the
current value of that whenever 'the operation associated with the
current state' needs to be performed.
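
A toy example of this, with everything in it invented for illustration: a line processor that only keeps lines between 'BEGIN' and 'END' markers. The state change replaces $operation, and the loop never inspects a state variable.

```perl
use strict;
use warnings;

# Made-up example; all names here are illustrative only.
my @kept;
my ($skip, $collect);    # declared first so the subs can refer to each other
my $operation;

$skip = sub {
    my ($line) = @_;
    $operation = $collect if $line eq 'BEGIN';    # state change swaps the sub
};
$collect = sub {
    my ($line) = @_;
    if ($line eq 'END') { $operation = $skip }
    else                { push @kept, $line }
};

$operation = $skip;
$operation->($_) for qw(x BEGIN a b END y BEGIN c END);

print "@kept\n";
```

Each sub still tests for its own marker, but the question 'which state am I in?' is never asked inside the loop.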
 
Martijn Lievaart

(I have always quite wanted an 'else' which would turn 'and' and 'or'
into a ternary construction...)

If you like'm ugly and unreadable:

(this() and (that(),1)) or so();

:)
M4
 
Xze

The 'b' stands for boolean, a convention used in this particular code
to denote the variable type/value

Yes. That's called 'Hungarian notation'. Particularly in the case of 'h'
for hash and 'a' for array, it is completely redundant with the @ or %
already on the variable. I realise Perl doesn't have a dedicated sigil
for 'boolean', but it's still not useful
None of the modules will allow using the combination of short and long
forms (e.g. -v and --verbose), AFAIK

Getopt::Long will, among others.
I'm trying to avoid code duplication as it becomes a headache to
maintain it later, so i'll opt for having only one loop with the
checks inside or without the checks, if a solution can be found

Yup, that's sensible.
    my $get_val = $bVerbose
        ? sub { join "", @_ }
        : sub { $_[2] };
    while (<FH>) {
        ...;
        $just_hash->{$key} = $get_val->($val1, $val2, $val3);
    }
It's nice, but the logic didn't change. You're still performing the
check of $bVerbose at every iteration.

No I'm not. $bVerbose is checked once, outside the loop, and then
whichever subref we picked is called inside. Maybe it would be clearer
if I wrote it like this?

    my $get_val;
    if ($bVerbose) {
        $get_val = sub { join "", @_ };
    }
    else {
        $get_val = sub { $_[2] };
    }

    while (...) {
        ... = $get_val->(...);
    }
The ternary operator could be replaced by the following, which is what
i did:
$just_hash -> {$key} = ($val1.$val2) x!! $bVerbose . $val3;

Oh, yuck. That's just *nasty*. In any case, it's more work than the
ternary.
In fact, i don't have any big issue with the code, it works fine.
It's just my personal curiosity..
May be it is possible to eval in a tricky way like this:
# define the pattern before loop
my $val = ($val1.$val2) x!! $bVerbose . $val3;
# and later in the loop:
...
$just_hash -> {$key} = eval $val; #?? <make the assignment happen
again, but with the desired pattern>

This is pretty-much exactly what I was doing with the anon subs above,
but without needing to resort to 'eval'. I could have written it like
this:

    my $get_val = $bVerbose
        ? '$val1.$val2.$val3'   # note single quotes!
        : '$val3';

    while (...) {
        ... = eval $get_val;
    }

which works exactly the same, except that the anon subs are a whole lot
cleaner and safer.

Ben

Thanks for the detailed explanations!

Indeed the following snippet:
my $get_val = $bVerbose
? sub { join "", @_ }
: sub { $_[2] };

does what I was looking for; somehow I missed that in your first post.
But it is not as fast as the ternary operator inside the loop :)
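
For anyone who wants to repeat the comparison, a sketch using the core Benchmark module is below. The flag value and the field values are made up, and no timings are shown because they depend on the machine and the perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $verbose = 1;                              # assumed flag value
my ($val1, $val2, $val3) = ('a', 'b', 'c');   # made-up fields

my $get_val = $verbose
    ? sub { join '', @_ }
    : sub { $_[2] };

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    ternary => sub { my $v = $verbose ? $val1.$val2.$val3 : $val3 },
    subref  => sub { my $v = $get_val->($val1, $val2, $val3) },
});
```

A result in favour of the ternary would not be surprising, since a sub call in Perl costs more than a simple conditional.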

Thanks everyone for spending your time!
Cheers,
Xze
 
Peter J. Holzer

Is this Hungarian notation? Perl variables already have sigils, they
don't need more prefixes.

While I don't use Hungarian notation myself (I think it's ugly, the
wrong way around, and not much use in practice), I disagree:

Perl sigils are no adequate substitute for the type prefixes in
Hungarian notations. Ignoring the fact that sigils don't even really
denote the type of a variable, there are only three of them, and even in
Systems Hungarian notation there are many more types which can be
distinguished:

$bVerbose (Verbose is a boolean)
$nObjects (Objects is a count, i.e., a non-negative integer)
$fLength (Length is a floating point number)
$bsParam (Param is a byte string, you still need to decode() it)
$csParam (Param is a character string)
$dogSpot (Spot is an object of class Animals::Dog)

If you go into Application Hungarian Notation (the original kind
invented by Charles Simonyi, which I find a lot more useful than systems
HN), you distinguish logical types, for example:

my $lenWidth = 5; # a length
my $lenHeight = 3; # another length
my $arRect; # an area

$arRect = $lenWidth * $lenHeight; # ok. an Area is the product of
# two lengths

if ($arRect > $lenHeight) # not ok. You can't directly compare an
# area and a length

Or if you have to use lengths in inches and centimeters, you could
encode the type to avoid crashing your space probe ...

hp
 
Dr.Ruud

Perl sigils are no adequate substitute for the type prefixes in
Hungarian notations. Ignoring the fact that sigils don't even really
denote the type of a variable, there are only three of them, and even in
Systems Hungarian notation there are many more types which can be
distinguished:

$bVerbose (Verbose is a boolean)
$nObjects (Objects is a count, i.e., a non-negative integer)
$fLength (Length is a floating point number)
$bsParam (Param is a byte string, you still need to decode() it)
$csParam (Param is a character string)
$dogSpot (Spot is an object of class Animals::Dog)

In Perl, the sigil tells you about the structure (or count) of the
variable (or expression).

Perl is a strongly typed language. The datatype is in the operators.


my $elements = @array; # normally equal to $#array + 1

my $array = \@array; # take a reference


In p5p there is an auto-dereference thread, that wants to make

push $array, @elements;

be syntactic sugar for

push @$array, @elements;


I opposed that, by pointing at the consequence that

my $array = \@array;

should then mean that $array contains 2 values:
1. the number of elements (to be returned in numeric context), and
2. the array-reference (to be returned in scalar context, specifically
in auto-dereference context)

Be aware that in Perl it is normal for a variable to have multiple
values at the same time.
 
Peter J. Holzer

In Perl, the sigil tells you about the structure (or count) of the
variable (or expression).

I'm not sure whether you are agreeing or disagreeing with me, and
what this has to do with Hungarian notation.

Perl is a strongly typed language. The datatype is in the operators.

I am aware that no two people agree on what "strongly typed" means
exactly, but I'm sure any definition of the term which includes Perl is
far from the mainstream.

In p5p there is an auto-dereference thread, that wants to make

push $array, @elements;

be syntactic sugar for

push @$array, @elements;

AFAIK that has already happened (5.14, I think).

I opposed that, by pointing at the consequence that

my $array = \@array;

should then mean that $array contains 2 values:
1. the number of elements (to be returned in numeric context), and
2. the array-reference (to be returned in scalar context, specifically
in auto-dereference context)

I don't see how that is a necessary consequence of allowing push to
autodereference its first parameter if it is a scalar. Many builtins (and
user-defined subs, too) allow different parameters.

Even if autodereference is used in a lot more contexts, I think it's a
bit weird to say that $array contains the number of elements in case 1.
It makes more sense to say that $array is autodereferenced in a numeric
context, so that ($array + 0) is equivalent to (@$array + 0). That would
also be mostly backwards compatible I think, since a reference in
numeric context returns no usable value. I would want an exception for
== and !=, though: Comparing references for equality is useful.

Autodereferencing array references in string context is probably not a
good idea, however: Printing it as "ARRAY(0x8182818)" is too useful for
debugging.

Be aware that in Perl it is normal for a variable to have multiple
values at the same time.

Yes, although you can't currently have a reference and something else in
a scalar, and the different values you can put into a scalar (PV, NV,
IV) are normally just different representations of the same value
(yes, I know about Scalar::Util::dualvar).
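
For readers who haven't met it, a minimal sketch of what dualvar does (the values are arbitrary):

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# One scalar carrying an unrelated number and string.
my $both = dualvar(5, 'five');

my $as_number = $both + 0;    # numeric context uses the number slot
my $as_string = "$both";      # string context uses the string slot

print "$as_number $as_string\n";
```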

hp
 
Peter J. Holzer

Is this Hungarian notation? Perl variables already have sigils, they
don't need more prefixes.

While I don't use Hungarian notation myself (I think it's ugly, the
wrong way around, and not much use in practice), I disagree:

Perl sigils are no adequate substitute for the type prefixes in
Hungarian notations. Ignoring the fact that sigils don't even really
denote the type of a variable, there are only three of them, and even in
Systems Hungarian notation there are many more types which can be
distinguished: [...]
If you go into Application Hungarian Notation (the original kind
invented by Charles Simonyi, which I find a lot more useful than systems
HN), you distinguish logical types, for example:

I just stumbled across this article by Joel Spolsky about Hungarian
notation (especially about the advantages of Apps Hungarian):

http://www.joelonsoftware.com/articles/Wrong.html

It's well worth reading and thinking about.

hp
 
