Threading problem

sdistefano

I have the following issue:

My program runs a thread called the MainThread that loops through a
number of URLs and decides when it's the right time for one to be
fetched. Once a URL has to be fetched, it's added to a Queue object,
where the FetchingThread picks it up and does the actual work. Often,
URLs have to be fetched at intervals of 100ms, so the objects will
get added to the queue repeatedly. Even though it takes more than
100ms to get the URL and process it, ideally what would happen is:
ms0: Request 1 is sent
ms100: request 2 is sent
ms150: request 1 is processed
ms200: request 3 is sent
ms250: request 2 is processed

and so on... The problem is that for some reason Python runs the main
thread considerably more than the checking thread. If I ask them both
to 'print' when run, this becomes obvious, even if I create more
check threads than main threads (I even tried 50 check threads and 1
main thread). Despite my terminology, both the MainThread and
FetchingThread are created in exactly the same way.

What part of threading in Python am I not properly understanding?

Thanks!
 
Patrick Maupin

What part of threading in Python am I not properly understanding?

Unless I'm missing something, your description doesn't make this sound
like either a Python-specific problem or a threading problem. Threads
run when it's their turn and they aren't blocked, and you haven't
described any code that would ever block your main thread, but your
subsidiary threads will often be blocked at a socket waiting for their
HTTP requests to complete.
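
A minimal sketch of that point, assuming Python 3 and using time.sleep as a stand-in for the blocking HTTP request. The names fetch_worker, run, FETCH_SECONDS and INTERVAL_SECONDS are made up for illustration and are not from the original program. The scheduling loop only sleeps between requests, while each worker spends most of its time blocked, so several workers are needed for requests to overlap:

```python
import queue
import threading
import time

FETCH_SECONDS = 0.15    # assumed time a fetch spends blocked on the network
INTERVAL_SECONDS = 0.1  # one request every 100 ms, as in the question


def fetch_worker(url_queue, log, start):
    """Take URLs off the queue until a None sentinel arrives."""
    while True:
        url = url_queue.get()      # blocks while the queue is empty
        if url is None:
            break
        time.sleep(FETCH_SECONDS)  # stand-in for the blocking HTTP request
        log.append((url, time.monotonic() - start))


def run(n_requests=5, n_workers=3):
    """Queue n_requests at fixed intervals and return (url, finish_time) pairs."""
    url_queue = queue.Queue()
    log = []  # list.append is thread-safe in CPython
    start = time.monotonic()
    workers = [
        threading.Thread(target=fetch_worker, args=(url_queue, log, start))
        for _ in range(n_workers)
    ]
    for worker in workers:
        worker.start()
    for i in range(n_requests):
        url_queue.put("request %d" % (i + 1))
        time.sleep(INTERVAL_SECONDS)  # the scheduling loop is idle in between
    for _ in workers:
        url_queue.put(None)           # one sentinel per worker
    for worker in workers:
        worker.join()
    return log


if __name__ == "__main__":
    for url, finished in run():
        print("%s processed at about %.0f ms" % (url, finished * 1000))
```

With a single worker, each request would have to wait for the previous one to finish, so the 150 ms fetches could not keep up with the 100 ms schedule.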
 
Dennis Lee Bieber

What part of threading in Python am I not properly understanding?

The part where you supply a minimal functioning example of the code
that demonstrates the problem.

You don't state what type of processing is done on the fetched
data... After all, the minimal fetcher I can imagine would consist of:

    import urllib

    while True:
        url = urlQueue.get()  # blocks until a URL is queued
        (fname, headers) = urllib.urlretrieve(url)  # Python 2; urllib.request.urlretrieve in Python 3

Also consider that 100 ms is 0.1 seconds -- it is quite possible
that the remote end cannot serve the URL content in that short a time
span. (Ignoring TCP/IP overhead, on my connection, 0.1 second would
allow for no more than 18 kB IF I obtained maximum speed in the
transfer -- I've got HTML files that are easily 9-15 kB in size.)
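
The arithmetic behind that estimate can be sketched as follows. This assumes 1 kB = 1000 bytes and a link speed of 180 kB/s, which is only what "18 kB in 0.1 second" implies; the function name transfer_seconds is hypothetical:

```python
LINK_BYTES_PER_SECOND = 180_000  # implied by 18 kB in 0.1 s (an assumption)


def transfer_seconds(size_bytes, bytes_per_second=LINK_BYTES_PER_SECOND):
    """Best-case transfer time, ignoring latency and TCP/IP overhead."""
    return size_bytes / bytes_per_second


if __name__ == "__main__":
    for size_kb in (9, 15, 18):
        ms = transfer_seconds(size_kb * 1000) * 1000
        print("%d kB takes at least %.0f ms" % (size_kb, ms))
```

Even in this best case a 9-15 kB page uses up most of the 100 ms window before any request latency or processing time is counted.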
 
