malv said:
How would you approach the following?
In a multithreaded realtime data acquisition system (all Python v2.4),
after hours of running without a snag, Python suddenly hangs without
warning, leaving no error message or traceback at all. After the
incident, the OS itself (Linux) appears to remain fully functional.
1. Define "hangs" more precisely. It's rare that an app can hang
without showing any behaviour that tells you something more about it.
For example, can you Ctrl-C out of it? Does it still have sockets open?
Is it using any CPU time? And so on. While you're at it, define
"realtime" too, since many people use the term incorrectly and it might
mean something different to you than it does to everyone else reading
this.
2. Use logging. "import logging" and proceed from there. Without
logging, you have no idea where the problem occurred, and you are going
to spend lots of time guessing at the cause.
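3. Make sure you are using Queues exclusively for inter-thread
communication. A Queue does its own locking, whereas data shared
between threads by hand is a classic source of races and deadlocks.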
4. Consider whether your external source of data (assuming there is one)
is the cause. How are you interfacing to it? Serial port? ctypes?
Something SWIGged for Python? A true "hang" is much more likely in an
external package than in Python code.
5. Tell us more about the platform. Which Linux (including version)?
Does it have a GUI? What's the basic architecture of the app as it
relates to the threading and data acquisition? (Of course, the problem
might not come from threads or data acquisition at all, but until we
learn more, who can say?)
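
To make points 2 and 3 concrete, here's a rough sketch of what I mean, not
a drop-in fix for your app. It is written for a current Python 3; on 2.4
the module is called "Queue" rather than "queue", and other details may
need adjusting. The names read_sample, producer, consumer and run are all
made up for illustration, and read_sample just stands in for whatever
really talks to your hardware. The point is that every thread logs under
its own name, and the consumer uses a timeout on get() so a stalled
producer shows up in the log instead of as a silent hang.

```python
import logging
import queue      # assumption: Python 3 name; the module is "Queue" on Python 2.x
import threading

# Include a timestamp and the thread name in every log line.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(threadName)s %(levelname)s %(message)s",
)
log = logging.getLogger("acq")

STOP = object()  # sentinel telling the consumer to exit


def read_sample(i):
    """Hypothetical stand-in for the real acquisition call."""
    return i * i


def producer(out_q, count):
    # The producer's only contact with other threads is the queue.
    for i in range(count):
        log.debug("reading sample %d", i)
        out_q.put(read_sample(i))
    out_q.put(STOP)
    log.debug("producer done")


def consumer(in_q, results, timeout=2.0):
    while True:
        try:
            # Block for at most `timeout` seconds rather than forever.
            item = in_q.get(timeout=timeout)
        except queue.Empty:
            log.warning("no data for %.1f s; producer may be stuck", timeout)
            continue
        if item is STOP:
            break
        results.append(item)
    log.debug("consumer done, %d samples", len(results))


def run(count=5):
    q = queue.Queue()
    results = []  # written only by the consumer, read only after join()
    threads = [
        threading.Thread(target=producer, args=(q, count), name="producer"),
        threading.Thread(target=consumer, args=(q, results), name="consumer"),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

If the app stops again, the last few log lines from each thread should at
least tell you which one went quiet and roughly where.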
-Peter