Parallel processing on shared data structures

psaffrey

I'm filing 160 million data points into a set of bins based on their
position. At the moment, this takes just over an hour using interval
trees. I would like to parallelise this to take advantage of my quad
core machine. I have some experience of Parallel Python, but PP seems
to only really work for problems where you can do one discrete bit of
processing and recombine these results at the end.

I guess I could thread my code and use mutexes to protect the shared
lists that everybody is filing into. However, my understanding is that
Python threads still run inside a single process, so this won't let me
use more than one core.

Does anybody have any suggestions for this?

Peter
 
MRAB

Could you split your data set and run multiple instances of the script
at the same time and then merge the corresponding lists?
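
The split-and-merge idea above can be sketched in one script. This is only a rough illustration under stated assumptions: it uses the standard multiprocessing module instead of separate script instances, it assumes each point is a (position, value) pair, and it replaces the interval trees with a hypothetical fixed BIN_WIDTH. The function names are made up for the example.

```python
from collections import defaultdict
from multiprocessing import Pool

BIN_WIDTH = 1000  # hypothetical fixed bin width; the real bins may be irregular


def bin_chunk(points):
    """File one chunk of (position, value) points into a private dict of bins."""
    bins = defaultdict(list)
    for position, value in points:
        bins[position // BIN_WIDTH].append(value)
    return dict(bins)


def merge_bins(partials):
    """Concatenate the per-chunk lists that belong to the same bin."""
    merged = defaultdict(list)
    for partial in partials:
        for key, values in partial.items():
            merged[key].extend(values)
    return dict(merged)


def bin_in_parallel(points, workers=4):
    """Split points into one chunk per worker, bin each chunk, then merge."""
    size = max(1, -(-len(points) // workers))  # ceiling division
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    with Pool(workers) as pool:
        # Each worker fills its own bins, so no locks are needed.
        partials = pool.map(bin_chunk, chunks)
    return merge_bins(partials)
```

Call bin_in_parallel(points, workers=4) from under an `if __name__ == "__main__":` guard. With 160 million points, sending the chunks to the workers is itself a cost, so it may be better for each worker to read its own slice of the input file.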
 
Hendrik van Rooyen

So why not make four sets of bins, one for each core of your quad,
split the points into quarters, run four processes, and merge the
results later?

This assumes that the actual filing process is the bottleneck,
and that the bins are just sets, where position, etc., does not matter.

If it takes an hour just to read the input, then nothing you can do
will make it better.

- Hendrik
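
One way to check the caveat about reading the input is to time the two phases separately on a sample before parallelising anything. The sketch below is only an illustration: it assumes the input is text lines of the form "position value" and uses a hypothetical fixed BIN_WIDTH in place of the interval trees. All names are invented for the example.

```python
import time
from collections import defaultdict

BIN_WIDTH = 1000  # hypothetical fixed bin width, not from the original post


def read_points(lines):
    """Parse 'position value' text lines into (int, str) tuples."""
    return [(int(pos), val) for pos, val in (line.split() for line in lines)]


def file_points(points):
    """File the points into bins keyed by position // BIN_WIDTH."""
    bins = defaultdict(list)
    for position, value in points:
        bins[position // BIN_WIDTH].append(value)
    return bins


def time_phases(lines):
    """Return (read_seconds, filing_seconds, bins) for one serial run."""
    start = time.perf_counter()
    points = read_points(lines)
    after_read = time.perf_counter()
    bins = file_points(points)
    after_file = time.perf_counter()
    return after_read - start, after_file - after_read, bins
```

If the read time dominates, splitting the filing across four processes will gain little; if the filing time dominates, the four-way split is worth trying.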
 
