Hi Chris,
I had been looking into threads and process/subprocess myself a while ago
and couldn't decide which would suit what I needed to do best. I'm still
very confused about the whole thing. Can you elaborate on the above a bit
please?
Cheers,
Jack
Threads and processes are concepts that exist in your operating
system, and Python can use either of them to advantage, depending on the
problem. Note that different operating systems also handle them
differently, so code that's optimal on one system might not be as
optimal on another. Still, some generalizations can be made.
Each process is a separate program, with its own address space and its
own file handles, etc. You can examine them separately with Task
Manager, for example. If you launch multiple processes, they don't
even all have to be Python, so if one problem can be handled by an
existing program, just run it as a separate process. Processes are
generally very protected from each other, and the OS is generally better
at scheduling them than it is at scheduling threads within a single
process. If you have multiple cores, the processes can really run
simultaneously, frequently with very small overhead. The downside is
that you cannot share variables between processes without extra work, so
if the two tasks are very interdependent, it's more of a pain to use
separate processes.
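
To make the "just run it as a separate process" idea concrete, here is a
minimal sketch using the standard subprocess module. The child here is
simply a second Python interpreter printing a number (that command is my
own stand-in for illustration; in practice it could be any existing
program). Note that the only thing that comes back is the child's
output as text, not its variables.

```python
import subprocess
import sys

# Launch a second Python interpreter as a separate process.
# sys.executable is the path of the interpreter running this script;
# the "-c" one-liner is a stand-in for whatever program you need to run.
result = subprocess.run(
    [sys.executable, "-c", "print(6 * 7)"],
    capture_output=True,  # collect the child's stdout and stderr
    text=True,            # decode the output as str instead of bytes
    check=True,           # raise CalledProcessError on a non-zero exit
)

# The child has its own address space, so all we get back is its output.
child_output = result.stdout.strip()
print("child said:", child_output)
```

If the two sides need to exchange more than a bit of text, that is where
the "extra work" comes in (pipes, files, sockets, or the multiprocessing
module).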
Within one process, you can have multiple threads. On some operating
systems, and in some languages, this can be extremely efficient. Some
programs launch hundreds of threads, and use them to advantage. By
default, it's easy to share data between threads, since they're in the
same address space. But the downsides are: 1) it's very easy to trash
another thread by walking on its variables; 2) Python (at least CPython,
because of its global interpreter lock) does a lousy job of letting
threads work independently. For CPU-bound tasks, using separate threads
is likely to be slower than just doing it all in one thread.
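
As a rough illustration of both the easy sharing and the "walking on
variables" risk, here is a small sketch with the standard threading
module. The names (counter, add_many) and the thread and iteration
counts are made up for the example. All four threads update the same
global, and the lock is what keeps them from trashing each other's
updates.

```python
import threading

counter = 0              # shared by every thread in this process
lock = threading.Lock()  # guards updates to counter


def add_many(n):
    """Increment the shared counter n times."""
    global counter
    for _ in range(n):
        # "counter += 1" is a read-modify-write; holding the lock keeps
        # another thread from slipping in between the read and the write.
        with lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every thread to finish

print(counter)
```

Without the lock, the final count is not guaranteed to be right, which
is exactly the kind of bug that makes threaded code harder to get right
than it first appears.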