random array elements and speed

John Bokma

Peter J. Holzer said:
I'm sure that linker was quite up to date when Simonyi wrote his paper
:).

Do you have a problem with the past tense? I wrote "the original
proposal made a lot of sense".

Because of the limitations of a linker? In that case, the original
proposal (which I haven't read, so I might be wrong) should have
included that the system is a workaround for a linker issue.

But it's easier to read when it does.

If it does, it's short, and has a few variables which probably don't
need a lot of explanation in their names.

The advantage of splitting a non-trivial formula over several lines is
that it can be documented per line.

A formula in a text book is read in the context of the page it's on. It
often has some explanation of all variables. It assumes you read the
context.

With programming, I prefer to have the context as close to the real
thing as possible, and if I can make things clear by formatting my code,
and picking good names (etc), I prefer that over a comment block.

In the non-trivial case I need documentation anyway. Code only shows
how something is done but not why it is done that way.

Picking good names, and a good structure however, makes this
documentation shorter, and easier to maintain. Add to this that a lot of
code is not documented or not in sync, and picking good names becomes
even more important.


--
John Bokma Freelance software developer
&
Experienced Perl programmer: http://castleamber.com/
 
Peter J. Holzer

Dr.Ruud said:
Peter J. Holzer schreef:


Often it doesn't, like:

if ( /^BEGIN/ ) {...}

if ( index( $_, 'BEGIN' ) == 0 ) {...}

if ( substr( $_, 0, 5 ) eq 'BEGIN' ) {...}

I don't get your argument here.

but in a more complex algorithm, you can make a lot of the why-s show
through by choosing good variable names etc., and of course by
including appropriate comments. (But maybe you meant code without
comments?)

Yes. I mentioned that I preferred to use short variable names and
comments near their declaration, and John objected to that on the
grounds that comments and code could get out of sync.

(I might mention that variable names and their usage can get out of
sync, too).

hp
 
Peter J. Holzer

John said:
Because of the limitations of a linker?

There are two aspects to the proposal.

The main aspect (in my opinion) was that the "type" (as determined by
usage of the variable, NOT the type system of the language) of the
variable should be indicated in the variable name. So if a variable is a
length it should indicate this in the name (and if you have lengths in
cm and inches in your program, you should indicate that too in your
variable names, lest your space probes miss their targets). That part is
still valid, I think. Especially in a typeless language like Perl.
(BTW, BCPL was a typeless language, too)
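
A minimal sketch of the unit-in-the-name idea described above. All names here (`$CM_PER_INCH`, `inches_to_cm`, `$shelf_width_in`, `$shelf_width_cm`) are hypothetical and chosen only for illustration; the point is that the suffix carries the "type" that Perl itself does not track.

```perl
use strict;
use warnings;

# Hypothetical names: the unit is part of every variable name, so a
# cm/inch mix-up is visible at the point of use.
my $CM_PER_INCH = 2.54;

sub inches_to_cm {
    my ($length_in) = @_;
    return $length_in * $CM_PER_INCH;
}

my $shelf_width_in = 10;
my $shelf_width_cm = inches_to_cm($shelf_width_in);

print "$shelf_width_in in is $shelf_width_cm cm\n";
```

With names like these, an expression such as `$shelf_width_cm + $shelf_width_in` looks wrong on sight, which a bare `$width` would not.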

The second aspect is the specific form: Single- or double-character
lower case prefixes before an upper case variable name. This aspect is I
think an artifact of the development environment he was using at the
time: Long variable names simply were impractical then, and if you've
ever used a VT100 terminal, you know why most unix commands are so short
:). This aspect is IMHO obsolete.

In that case, the original proposal (which I haven't read, so I might
be wrong) should have included that the system is a workaround for a
linker issue.

I read that paper about 15 years ago and I don't remember whether it
included that or not. It may have. But even if it didn't, that's hardly a
fair critique. It was written at a time when severe limits on identifier
names were *normal*. It wasn't a workaround for the limitations of a
specific linker. Most linkers and programming languages had limitations
on the length of their identifiers, and a proposal which would have
required 30 character variable names would have been laughed out of the
door because nobody would have been able to use it. You have to consider
the time when a paper was written, especially in a fast-moving field
like IT.

If it does, it's short, and has a few variables which probably don't
need a lot of explanation in their names.

No, I'm specifically talking about long formulas. A short formula is
easy to understand whether it takes half a line or three. But a long
formula is a lot easier to understand if it fits on a line or is broken
into a few lines according to the structure of the formula than if it has
to be split onto 20 lines in completely arbitrary places just to fit
horizontally into the editor window.

The advantage of splitting a non-trivial formula over several lines is
that it can be documented per line.

Another reason to use short variable names: Leaves more space for your
inline comments.

A formula in a text book is read in the context of the page it's on.
It often has some explanation of all variables. It assumes you read
the context.

A formula in a program must be read in the context of the function it's
in. Explanations of the variables can be included at the declaration of
the variables. Scope should be used to minimize the context that the
programmer has to consider.

With programming, I prefer to have the context as close to the real
thing as possible, and if I can make things clear by formatting my
code, and picking good names (etc), I prefer that over a comment
block.

I agree with that completely. I am just arguing that with variable
names, "long" does not equal "good". A variable name should be long
enough to make the purpose of the variable clear, but no longer.

As an example, consider a loop like

for my $i (1 .. $#src - 1) {
    $result[$i] = (sort { $a <=> $b } @src[$i - 1 .. $i + 1])[1];
}

vs.

for my $source_array_counter (1 .. $#source_array - 1) {
    $result_array[$source_array_counter] = (sort { $a <=> $b }
        @source_array[$source_array_counter - 1 .. $source_array_counter + 1]
    )[1];
}

Using $source_array_counter instead of $i for the loop variable does not
make the code more readable, quite the contrary.
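
For reference, the short-name loop can be wrapped into a runnable sketch. The sub name `median3_filter` and the handling of the two edge elements (copied through unchanged) are assumptions made here, not part of the loop as posted.

```perl
use strict;
use warnings;

# Median-of-three filter: each interior element is replaced by the
# median of itself and its two neighbours.
# Assumption: the first and last elements are copied unchanged.
sub median3_filter {
    my @src    = @_;
    my @result = @src;    # edges keep their original values
    for my $i (1 .. $#src - 1) {
        # Sort the 3-element window numerically and take the middle one.
        $result[$i] = (sort { $a <=> $b } @src[$i - 1 .. $i + 1])[1];
    }
    return @result;
}

my @smoothed = median3_filter(1, 9, 2, 3, 8, 4);
print "@smoothed\n";    # prints "1 2 3 3 4 4"
```

The loop index lives for three lines and is used only as a subscript, so a one-letter name costs the reader nothing here.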

hp
 
Dr.Ruud

Peter J. Holzer schreef:
Dr.Ruud:

I don't get your argument here.

It is just an example of when the "why" is not shown. Those three lines
are alternatives.

Why someone chooses one way over another is often not laid down in a
comment. I would prefer the first, and would use the second maybe if no
regexes are used anywhere else in the whole code-file, or maybe if it is
very frequently called.
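
A small sketch showing that the three alternatives give the same answer for a plain string. The sub names (`starts_regex`, `starts_index`, `starts_substr`) are hypothetical wrappers added for illustration only.

```perl
use strict;
use warnings;

# Three ways to test whether a string starts with 'BEGIN'.
# The sub names are hypothetical; each returns 1 or 0.
sub starts_regex  { return $_[0] =~ /^BEGIN/ ? 1 : 0 }
sub starts_index  { return index($_[0], 'BEGIN') == 0 ? 1 : 0 }
sub starts_substr { return substr($_[0], 0, 5) eq 'BEGIN' ? 1 : 0 }

for my $line ('BEGIN block', 'no BEGIN here', 'BEG', '') {
    my @answers = (
        starts_regex($line),
        starts_index($line),
        starts_substr($line),
    );
    print "'$line': @answers\n";
}
```

Which one is fastest is not claimed here; if it matters, the core Benchmark module could be used to measure the three on realistic input.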
 
John Bokma

Peter J. Holzer said:
John Bokma wrote:

[ ..]
The main aspect (in my opinion) was that the "type" (as determined by
usage of the variable, NOT the type system of the language) of the
variable should be indicated in the variable name. So if a variable is
a length it should indicate this in the name (and if you have lengths
in cm and inches in your program, you should indicate that too in your
variable names, lest your space probes miss their targets). That part
is still valid, I think.

Yes, but that is just sound variable naming. As long as it's not a rule
one *must* follow, no matter what, it's fine with me. If I have a
variable containing a length of something, I call it length. If from the
context it's not clear what length, I use what_length, with "what"
making it clear.

The second aspect is the specific form: Single- or double-character
lower case prefixes before an upper case variable name. This aspect is
I think an artifact of the development environment he was using at the
time: Long variable names simply were impractical then, and if you've
ever used a VT100 terminal, you know why most unix commands are so
short
:). This aspect is IMHO obsolete.

Yes (terminal), and yes, agreed.

No, I'm specifically talking about long formulas. A short formula is
easy to understand whether it takes half a line or three. But a long
formula is a lot easier to understand if it fits on a line or is
broken into a few lines according to the structure of the formula than
if it has to be split onto 20 lines in completely arbitrary places just
to fit horizontally into the editor window.

Then don't split arbitrarily :) You need some quite long names to make
that happen nowadays anyway.

Another reason to use short variable names: Leaves more space for your
inline comments.

Uhm, it's a bit silly to do things like:

* a # multiply with the area of the floor

Moreover, this sounds like a contradiction of: you can fit more on one
line if you use short names.

A formula in a program must be read in the context of the function
it's in. Explanations of the variables can be included at the
declaration of the variables.

With clear variable names, this is less needed, of course.

my $a = 40; # default area of the floor in m2

vs.

my $floor_area_m2 = 40;

Scope should be used to minimize the context that the
programmer has to consider.

And hence you don't want (well, I don't) to have several very short named
variables, each with a comment, and then a formula which requires
scrolling up and down to understand each variable.

I agree with that completely. I am just arguing that with variable
names, "long" does not equal "good". A variable name should be long
enough to make the purpose of the variable clear, but no longer.

Yup, agree. That's why I have problems with Hungarian, no matter what
system.

As an example, consider a loop like

for my $i (1 .. $#src - 1) {
    $result[$i] = (sort { $a <=> $b } @src[$i - 1 .. $i + 1])[1];
}

vs.

for my $source_array_counter (1 .. $#source_array - 1) {
    $result_array[$source_array_counter] = (sort { $a <=> $b }
        @source_array[$source_array_counter - 1 .. $source_array_counter + 1]
    )[1];
}

Using $source_array_counter instead of $i for the loop variable does
not make the code more readable, quite the contrary.

I probably would use:

$result[ $_ ] = ( sort { $a <=> $b } @src[ $_ - 1 .. $_ + 1 ] )[ 1 ]
for 1 .. $#src - 1;

( and maybe write it differently anyway )

Anyway, I think we agree on many things :)

 
David Combs

It is not really my method, though. Now, if I could just find that book
;-)

Maybe you mean his 2nd book, "MORE Programming Pearls", in which
the 13th chapter ("column"), "A Sample of Brilliance", talks
about Floyd's Algorithm, and then about Random Permutations
(pg 142)? (Not that I've read it!)

Also, maybe 10 years ago or so, ACM's "Computing Surveys"
had a maybe 50-page survey-article on generating combinations,
permutations, etc.

David
 
David Combs

JJ> Actually, what I've found is that in this context swap/pop is faster
JJ> than splice for large arrays but slower for small arrays. I have no
JJ> explanation.

that isn't hard to explain at all. splice is a single internal perl op
and swap/pop is several perl ops. a very rough cost estimate of perl
speed is how many perl ops are executed since the op dispatch loop is
the big overhead compared to many of the actual operations. so for short
arrays, splice will be 1 perl op and move a small number of elements
(which is done in c). when the size gets larger the element move takes
longer. the swap/pop is a fixed amount of work for any size array but it

What other changes to make when worrying about too many GCs, ie to
minimize the eating up of the heap?

David
 
