Using reference for performance gain?

howa

hi,

consider the following code...

it is interesting that there is no performance gain when we return
by reference... or has Perl already optimized this for us?

#--------------------------------------------------------------------
sub test {
    my %h = (
        "a" => "asdsaefsayd7asdsdsdsdsdsdsatd7as6fdsa76smndfkjtpyty",
        "b" => "vdfdgdgregrehgregdsa6td76satd7as6fdsa76smndfdfdgfhf",
        "c" => "zxczxrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76hthth",
        "d" => "jhthfgdsadsadhsayd7asysgdsa6td76satd7as6fdsahrhrhrh",
        "e" => "sfeghhfhsadsadhsayd7asysgdsa6td76satd7as6fdsahththt",
        "f" => "jgjgjgsadsadhsayd7asysgdsa6td76satd7as6fdseretttttg",
        "g" => "ekreorrsadsadhsayd7asysgdsa6td76satd7as6fdsadgdgggg",
        "h" => "sadsadhsayd7asysgdsa6td76satd7as6fdsa76smndfkjdfdfd",
        "i" => "mhjghhtsadsadhsayd7asysgdsa6td76satd7ahghghrrtrtrrt",
        "j" => "ewtyhrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76smndf",
        "k" => "ykiyuiyutdhsayd7asysgdsa6td76satd7as6fdsa76smndfkjd"
    );

    # return %h;
    return \%h;
}

my $startTime = time();

for (my $count = 1; $count < 500000; $count++) {
    my $tmp = test();
}

my $endTime = time();
print $endTime - $startTime;
 
anno4000

For timing Perl code there is the standard module Benchmark.

Your benchmarks don't tell much about the actual performance of
returning a list vs. returning a reference. For one, you are building
the hash on each call to the routine. That is going to swamp out
other performance differences. Secondly, you are assigning the list
that is returned in one case to a scalar. That means that only one
scalar assignment is actually done, the others will be more or less
efficiently optimized away.
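
The second point can be illustrated with a minimal sketch. The small hash and the sub names by_list and by_ref are hypothetical and not from the original post. It shows that calling a sub that does `return %h` in scalar context copies no key/value pairs, while list context copies all of them and a reference copies nothing:

```perl
use strict;
use warnings;

# Hypothetical small hash standing in for the larger one in the post.
my %h = (a => 1, b => 2, c => 3);

sub by_list { return %h }     # yields the key/value pairs in list context
sub by_ref  { return \%h }    # yields one reference to the same hash

# Scalar context: %h is evaluated as a scalar, so no pairs are copied.
# The exact value depends on the perl version (a key count on newer
# perls, a bucket-usage string on older ones).
my $scalar = by_list();

# List context: every pair is copied into a new hash.
my %copy = by_list();

# Reference: nothing is copied; $ref points at the original %h.
my $ref = by_ref();

$copy{a}  = 100;   # changes only the copy
$ref->{b} = 200;   # changes %h itself

print "copy has ", scalar(keys %copy), " keys\n";
print "original a = $h{a}, b = $h{b}\n";
```

So `my $tmp = test();` in the original loop never pays for a list return, whichever `return` line is active.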

Here is a better benchmark:

use Benchmark qw( cmpthese);

my %h = (
"a" => "asdsaefsayd7asdsdsdsdsdsdsatd7as6fdsa76smndfkjtpyty",
"b" => "vdfdgdgregrehgregdsa6td76satd7as6fdsa76smndfdfdgfhf",
"c" => "zxczxrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76hthth",
"d" => "jhthfgdsadsadhsayd7asysgdsa6td76satd7as6fdsahrhrhrh",
"e" => "sfeghhfhsadsadhsayd7asysgdsa6td76satd7as6fdsahththt",
"f" => "jgjgjgsadsadhsayd7asysgdsa6td76satd7as6fdseretttttg",
"g" => "ekreorrsadsadhsayd7asysgdsa6td76satd7as6fdsadgdgggg",
"h" => "sadsadhsayd7asysgdsa6td76satd7as6fdsa76smndfkjdfdfd",
"i" => "mhjghhtsadsadhsayd7asysgdsa6td76satd7ahghghrrtrtrrt",
"j" => "ewtyhrhsadsadhsayd7asysgdsa6td76satd7as6fdsa76smndf",
"k" => "ykiyuiyutdhsayd7asysgdsa6td76satd7as6fdsa76smndfkjd"
);


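# Not named ref(): a plain sub called ref does not override the
# built-in ref, so 'ref()' in the timed string would call the built-in.
sub ret_list { %h }
sub ret_ref  { \%h }

cmpthese -1, {
    list_ret => 'my %x = ret_list()',
    ref_ret  => 'my $x = ret_ref()',
};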

...which prints

              Rate list_ret  ref_ret
list_ret   61837/s       --     -98%
ref_ret  3633678/s    5776%       --

In other words, in this case returning a reference is about 60 times
faster than returning the list.
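
A variant of the same comparison, as a sketch only: it passes code references to Benchmark instead of strings, so the calls are compiled in the file where the subs and the lexical %h are visible. The sub names by_list and by_ref and the generated 50-character values are assumptions for illustration. The rates depend on the machine and the perl version, so no output is shown.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Hypothetical stand-in data: 11 keys with 50-character values.
my %h = map { $_ => 'x' x 50 } 'a' .. 'k';

sub by_list { return %h }     # list return: pairs are copied by the caller
sub by_ref  { return \%h }    # reference return: a single scalar

# Run each snippet for at least one CPU second; 'none' suppresses the
# per-test timing lines so only the comparison chart is printed.
my $results = timethese(-1, {
    list_ret => sub { my %x = by_list() },
    ref_ret  => sub { my $x = by_ref() },
}, 'none');

cmpthese($results);
```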

Anno
 
howa

anno4000 wrote:
> Your benchmarks don't tell much about the actual performance of
> returning a list vs. returning a reference. For one, you are building
> the hash on each call to the routine. That is going to swamp out
> other performance differences.

i have considered this, but it just reflects the real-world situation.

> Secondly, you are assigning the list
> that is returned in one case to a scalar. That means that only one
> scalar assignment is actually done, the others will be more or less
> efficiently optimized away.

yes, you are right.

thanks.
 
Uri Guttman

h> i have considered this, but it just reflects the real-world situation.

what real world? if you are building constant hashes in each call, then
your real world is very slow. and as anno said, it will swamp out any
return overhead.

you need to show real world code if you want to get real world
optimizations.

uri
 
howa

Uri Guttman wrote:
> h> i have considered this, but it just reflects the real-world situation.
>
> what real world? if you are building constant hashes in each call, then
> your real world is very slow. and as anno said, it will swamp out any
> return overhead.

in the real world, complex data is always generated inside a function
and returned from it, and this is the reason why we need to consider
using a reference instead.

of course, purely comparing the speed of returning a reference and
returning a value might give a factor of 100, but in a real-world
situation this might be just a factor of 2, as there is always some
minimum overhead from everything else.

thanks anyway.
 
Uri Guttman

h> in the real world, complex data is always generated inside a function
h> and returned from it, and this is the reason why we need to consider
h> using a reference instead.

h> of course, purely comparing the speed of returning a reference and
h> returning a value might give a factor of 100, but in a real-world
h> situation this might be just a factor of 2, as there is always some
h> minimum overhead from everything else.

you aren't getting it, but i can't think of any way to explain it to
you. optimization is a skill in itself and it requires an understanding
of when things happen. loading large CONSTANT hashes inside a sub is a
killer of cpu power. it will also distort any benchmarks you are
doing. as for the real world, the whole point of a benchmark is to try
to simulate real-world conditions. your benchmark was useless for that
and for isolating whether returning a hash or a reference was faster
(besides it being broken in how it returned stuff).
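
a minimal sketch of the point about constant hashes, using a hypothetical colour table and sub names that are not from the thread: one sub rebuilds the hash on every call, the other returns a reference to a hash that was built once.

```perl
use strict;
use warnings;

# Hypothetical constant table, built once when the file is loaded.
my %COLOR_CODE = (red => '#ff0000', green => '#00ff00', blue => '#0000ff');

# Rebuilds the whole hash on every call, then returns a reference to it.
sub table_rebuilt {
    my %h = (red => '#ff0000', green => '#00ff00', blue => '#0000ff');
    return \%h;
}

# Returns a reference to the hash that already exists.
sub table_shared {
    return \%COLOR_CODE;
}

my ($first, $second) = (table_rebuilt(), table_rebuilt());
my ($third, $fourth) = (table_shared(),  table_shared());

# Comparing references numerically compares the addresses they point to.
print "rebuilt: ", ($first == $second ? "same hash" : "two different hashes"), "\n";
print "shared:  ", ($third == $fourth ? "same hash" : "two different hashes"), "\n";
```

the trade-off is that every caller of table_shared sees the same hash, so a caller that modifies it changes it for everyone.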

uri
 
