efficient running median

J

Janto Dreijer

Some numbers:

10.197 seconds for running_median_scipy_medfilt
25.043 seconds for running_median_python
13.040 seconds for running_median_python_msort
14.280 seconds for running_median_python_scipy_median
4.024 seconds for running_median_numpy
0.221 seconds for running_median_insort

What would be an acceptable performance, by the way?

That's great!
Well, the faster it works, the better. It means I can process more
data before getting frustrated. So if you have a faster version I'd
like to see it :)

Thankyou!
Janto
 
E

Ethan Furman

Janto said:
Very nice! I assume you mean I can use it to quickly insert items into
the sliding window?

Thanks
Janto

I'm afraid I can't help any further. Going from your post, I thought a
quicker list implementation might be useful, but beyond that I have no
knowledge to share.

Who said ignorance is bliss? *hangs head*

~Ethan~
 
R

Raymond Hettinger

[Janto Dreijer]
I found a PDF by Soumya D. Mohanty entitled "Efficient Algorithm for
computing a Running Median" (2003) by Googling. It has code snippets
at the end, but it's not going to be a simple cut-and-paste job. It
will take some work figuring out the missing parts.

See http://code.activestate.com/recipes/576930/ for working Python
code
adapted from that paper.


Raymond
 
J

Janto Dreijer

[Janto Dreijer]
I found a PDF by Soumya D. Mohanty entitled "Efficient Algorithm for
computing a Running Median" (2003) by Googling. It has code snippets
at the end, but it's not going to be a simple cut-and-paste job. It
will take some work figuring out the missing parts.

Seehttp://code.activestate.com/recipes/576930/for working Python
code
adapted from that paper.

Raymond

Wow! Thanks. I'll have a look.

Janto
 
D

denis

Folks,
approximate medians -- would you settle for 49 - 51 % ? --
open up new possibilities, and there's quite a lot of work on that,
for huuuge datasets.

A class of problems:
from a data stream X1 X2 ... you want, every so often,
a histogram / quantile summary / distribution estimator such that
H approximates the distribution of X, so
median of H(t) ~ median of some of X.
Some parameters of this class:
NH, size of H: 100 => H(p) ~ pth percentile of X
A, how often you want H
window, drop or age old data
runtime, memory of course
accuracy -- in theory, in practice

A histogram viewer in which you can change buckets, colors, zoom,
in REALTIME sounds tantalizing -- fast loop >> fast algorithm + slow
loop.
Anyone know of one off-the -shelf, python or anything else ?

Refs:
Manku et al., Approximate medians in one pass with limited memory,
1998, 10p
under http://infolab.stanford.edu/~manku/papers.html
nice tree pictures
they optimize mem (NH * Nbuf) not runtime, and don't window

Suri +, Quantiles on Streams, 2006, 5p, http://www.cs.ucsb.edu/~suri/psdir/ency.pdf
~ 20 refs zzz

cheers
-- denis
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,184
Messages
2,570,978
Members
47,561
Latest member
gjsign

Latest Threads

Top