Why is this so much faster?

K

Keir Rice

Hi All,

The following function was showing up in my profiles as a large bottle neck:

# Slow version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
intermediateResult = map(lambda (i, h): h*(i**2), zip(histogram, range(255)))
totalSum = sum(intermediateResult)

# calculate rms
return math.sqrt(totalSum / self.Area())

So after a bit of trial and error I came up with the following function which is a lot faster:

# Fast version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
totalSum = 0
for i, h in enumerate(histogram):
totalSum += h*(i**2)

# calculate rms
return math.sqrt(totalSum / self.Area())

My question is why is the second method so much faster?
Is it the removal of the extra function calls?
Is it skipping the creation of a list?
Do the basic arithmetic operators get pushed up into C code?

Any insights into the whats really happening behind the scenes would be appreciated.

Keir
 
T

Terry Reedy

Hi All,

The following function was showing up in my profiles as a large bottle neck:

# Slow version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
intermediateResult = map(lambda (i, h): h*(i**2), zip(histogram, range(255)))
totalSum = sum(intermediateResult)

# calculate rms
return math.sqrt(totalSum / self.Area())

So after a bit of trial and error I came up with the following function which is a lot faster:

# Fast version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
totalSum = 0
for i, h in enumerate(histogram):
totalSum += h*(i**2)
# calculate rms
return math.sqrt(totalSum / self.Area())

My question is why is the second method so much faster?
Is it the removal of the extra function calls?

Yes. Map is only 'fast' when one already has a function and is going to
call it repeatedly regardless of the other code. When one has an
expression, wrapping it as a function to use map is surely slower. Have
you tried

return math.sqrt(sum([h*i*i for i,h in enumerate(histogram)])
/ self.Area())

or same without [] brackets?

i*i should be faster than i**2 in any version.
Is it skipping the creation of a list?

A bit.

See Tim's response.
 
I

Ian Kelly

First of all, do you have the parameters to the lambda the wrong way around
in the map() version? zip(histogram, range(255)) will return (histogram
value, index), but enumerate(histogram) will return (index, histogram
value). But the parameters haven't been swapped, resulting in the histogram
value being squared in the map version, but the index being squared in the
manual summing version. Depending on the values, this could result in a
large performance increase in the second version (if the total value exceeds
the maximum size of your platform's "long" type).

In addition to what Tim said, I question whether you should even be
multiplying by the index at all. The usual definition of "root mean
square" is this:

math.sqrt(sum(x * x for x in values) / len(values))

I don't know the problem that you're trying to solve, though, so maybe
I am just being confused by your terminology.

Cheers,
Ian
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,968
Messages
2,570,153
Members
46,701
Latest member
XavierQ83

Latest Threads

Top