python reliability with EINTR handling in general modules

  • Thread starter oleg korenevich

oleg korenevich

I have a Linux board based on the Samsung S3C6410 SoC (ARM11). I build
the rootfs with buildroot: Python 2.7.1, uClibc-0.9.31. Linux kernel:
Linux buildroot 2.6.28.6 #177 Mon Oct 3 12:50:57 EEST 2011 armv6l
GNU/Linux

My app, written in Python, raises these exceptions under some
mysterious conditions:

1) exception:

    File "./dfbUtils.py", line 3209, in setItemData
    ValueError: (4, 'Interrupted system call')

code:

    currentPage = int(math.floor(float(rowId) / self.pageSize)) == self.selectedPage

2) exception:

    File "./terminalGlobals.py", line 943, in getFirmawareName
    OSError: [Errno 4] Interrupted system call: 'firmware'

code:

    for fileName in os.listdir('firmware'):

Some info about the app: it has 3-7 threads, listens on serial ports
via the 'serial' module, and uses a GUI implemented via a C extension
that wraps DirectFB. I can't reproduce these exceptions; they are not
predictable.

I googled for EINTR exceptions in Python, but only found that EINTR
can occur only on slow system calls, and that Python's socket and
subprocess modules (and one other) already handle EINTR. So what
happens in my app? Why can a simple call to a math function interrupt
the program at any time? It's not reliable at all. I have only
guesses: a uClibc bug, or a kernel/hardware handling bug. But these
guesses don't point me to a solution.

For now I have created wrapper functions (that restart the operation
in case of EINTR) around some functions from the os module, but
wrapping the math module would double the execution time. That raises
another question: if math can be interrupted, then other modules can
be too, so how do I get reliability?
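
A minimal sketch of what such a restart wrapper might look like, for
illustration only. The name retry_on_eintr is hypothetical and not
taken from the app, and the sketch assumes the wrapped call is safe to
repeat (reasonable for os.listdir, not necessarily for a partially
completed write).

```python
import errno
import functools


def retry_on_eintr(func):
    """Return a wrapper that calls func again whenever it fails with EINTR."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        while True:
            try:
                return func(*args, **kwargs)
            except (OSError, IOError) as exc:
                # Only EINTR is retried; every other error propagates unchanged.
                if exc.errno != errno.EINTR:
                    raise
    return wrapper
```

Usage would be along the lines of listdir = retry_on_eintr(os.listdir).
Python 3.5 and later retry most interrupted system calls on their own
(PEP 475), so a wrapper like this mainly matters on 2.7.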
 
Dennis Lee Bieber

I see nothing in your traceback that indicates that the interrupt
occurred in the math library call -- unless you deleted that line. In
the first one, I'd be more likely to suspect your C extension/wrapper...
(are the fields .pageSize and .selectedPage coming from an object
implemented in C?)

As for the math stuff... I presume both rowId and .pageSize are
constrained to be 0 or positive integers. If that is the case, invoking
math.floor() is just redundant overhead, as the documented behavior of
int() is to truncate towards 0, which for a positive value is the same
as floor().
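
A quick check of that claim, as a sketch. The helper page_index and
the sample values are hypothetical, not taken from the app.

```python
import math


def page_index(row_id, page_size):
    """Page number for a non-negative row id, using integer floor division."""
    return row_id // page_size


row_id, page_size = 37, 10           # hypothetical non-negative values
quotient = float(row_id) / page_size

# For non-negative values int() and math.floor() agree.
assert int(quotient) == math.floor(quotient) == page_index(row_id, page_size)

# For negative values they differ: int() truncates toward zero,
# math.floor() rounds down.
assert int(-3.7) == -3
assert math.floor(-3.7) == -4
```

Integer floor division avoids the float conversion and the math call
altogether, as long as both operands are integers.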

In the second case... Well, os.listdir() is most likely translated
into some operating system call.

http://www.gnu.org/software/libc/manual/html_node/Interrupted-Primitives.html

And, while that call is waiting for I/O to complete, some sort of signal
is being received.
 
oleg korenevich

Thanks for the help. In the first case all variables are Python
integers. Maybe math.floor is redundant, but I'm afraid the same error
with a math module call will occur in other places in the app where
math is needed. The strange thing here is that a math library call is
not a system call. The ValueError exception is strange too (all the
values are valid), and why do I get (4, 'Interrupted system call') in
parentheses?

For the second case: if Python really does make a slow system call
from the os module, why doesn't it handle EINTR and restart the call?
Could the SA_RESTART flag for the signal be a solution? But how can I
set this flag? By setting the flag for the signal handler in a C
extension (or through ctypes manipulation)?
 
Dennis Lee Bieber

oleg korenevich said:
Thanks for the help. In the first case all variables are Python
integers. Maybe math.floor is redundant, but I'm afraid the same error
with a math module call will occur in other places in the app where
math is needed. The strange thing here is that a math library call is
not a system call. The ValueError exception is strange too (all the
values are valid), and why do I get (4, 'Interrupted system call') in
parentheses?

math.floor() may still be a system call of some sort if access to
the math processor requires synchronization between processes (that is,
the math processor/registers are maintained as a separate structure
apart from the task status during process switches). {Yes -- that is a
wild hypothesis}

Or perhaps you haven't enabled the full math support in the runtime
-- from http://uclibc.org/FAQ.html:

    In other cases, uClibc leaves certain features (such as full C99
    Math library support, wordexp, IPV6, and RPC support) disabled by
    default.

FYI: uClibc 0.9.33 has been released -- might be worth checking the
change logs to see if something might apply to the problems.

oleg korenevich said:
For the second case: if Python really does make a slow system call
from the os module, why doesn't it handle EINTR and restart the call?
Could the SA_RESTART flag for the signal be a solution? But how can I
set this flag? By setting the flag for the signal handler in a C
extension (or through ctypes manipulation)?

Note that it is identifying "firmware" as the system call (if I
understand the nature of the traceback). Is "firmware" a real file
system directory, or some virtual entry that translates into scanning
the hardware/ROM for information? (If it isn't obvious, I'm not an OS
internals expert -- though I was able to get around in DEC (Open)VMS
back in the 90s.)

What has me curious is: WHAT signal is being seen that is
interrupting the system call?
 
Mel Wilson

Dennis said:
math.floor() may still be a system call of some sort if access to
the math processor requires synchronization between processes (that is,
the math processor/registers are maintained as a separate structure
apart from the task status during process switches). {Yes -- that is a
wild hypothesis}

One thing to remember about errno is that C library code will set it to a
non-zero value when an error is encountered, but (I believe) there's no
requirement to clear it in the absence of an error. EINTR might just be
left over from some long-gone I/O call, then reported "just in case" in
handling an exception that didn't involve the C library at all.

As a C coder there are times when it's wise to clear errno yourself to make
sure your code doesn't get fooled.
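
The same idea can be tried from Python through ctypes, as a sketch. It
assumes a Unix-like system where ctypes.CDLL(None) exposes the C
library, and it uses abs() only as a stand-in for a C call that
succeeds without touching errno. The helper call_with_clean_errno is a
hypothetical name.

```python
import ctypes
import errno

# use_errno=True makes ctypes swap its private errno copy in and out
# around every call made through this handle.
libc = ctypes.CDLL(None, use_errno=True)


def call_with_clean_errno(func, *args):
    """Clear errno, call func, and return (result, errno after the call)."""
    ctypes.set_errno(0)
    result = func(*args)
    return result, ctypes.get_errno()


# Simulate a stale EINTR left behind by some earlier, unrelated call.
ctypes.set_errno(errno.EINTR)
libc.abs(-5)  # succeeds and leaves errno alone
stale = ctypes.get_errno()

# Clearing errno first shows that the call itself reported no error.
result, fresh = call_with_clean_errno(libc.abs, -5)
```

Here stale still holds EINTR even though abs() succeeded, while fresh
is 0. Code that reads errno without clearing it first, or without
checking that the call actually failed, can report an error that
belongs to something else.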

Mel.
 
