Dave Roberts
Okay, so I'm trying to write a server application that runs on both
Windows and Linux using NIO. I originally coded up the application on
Windows and I'm now testing it on Linux.
The structure of the application is fairly simple. There is a main
server thread that basically calls Selector.select() on a set of keys.
The keys include the server socket and any other sockets that have been
accepted. When the server thread gets an event on a socket, it queues
the event for another thread to deal with. To avoid getting repeated
events until the event is dealt with, the server thread removes the
read ops from the key's interestOps and then goes back to Selector.select().
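
To make the structure concrete, here is a minimal sketch of one pass of the selecting thread as described above. It is an illustration rather than the actual application code: the class name, the method name, the work queue type, and the select timeout are all assumptions, and accept handling is left out.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.concurrent.BlockingQueue;

public class SelectorLoopSketch {
    // One pass of the selecting thread (hypothetical helper, not from the original code).
    // Waits for events, then hands each readable key to the worker queue.
    static int selectAndDispatch(Selector selector,
                                 BlockingQueue<SelectionKey> workQueue,
                                 long timeoutMillis) throws IOException {
        selector.select(timeoutMillis);
        int dispatched = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove(); // the selected-key set is never cleared automatically
            if (key.isValid() && key.isReadable()) {
                // Drop OP_READ so the same event is not reported again
                // until a worker has dealt with it.
                key.interestOps(key.interestOps() & ~SelectionKey.OP_READ);
                workQueue.add(key);
                dispatched++;
            }
        }
        return dispatched;
    }
}
```

The selecting thread would call this in a loop; the interest set is changed here by the same thread that calls select(), so this half never contends with a selection in progress.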
Now, the problem comes in my other thread. The other thread gets the
event queued by the selecting server thread. It reads/writes data
from/to the network socket and then tries to re-enable the various
interest operations. It then wakes up the selecting server thread so
that the server thread can "discover" the new selection interest
information.
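
The worker side, again as a rough sketch under the same assumptions (hypothetical class and method names, read-only handling, a caller-supplied buffer), looks roughly like this. The interestOps calls are where the post reports the worker getting stuck on Linux while the server thread sits in select().

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;

public class WorkerRearmSketch {
    // Called by a worker thread for a key taken off the work queue
    // (hypothetical helper, not from the original code).
    static int readAndRearm(SelectionKey key, ByteBuffer buffer) throws IOException {
        ReadableByteChannel channel = (ReadableByteChannel) key.channel();
        buffer.clear();
        int n = channel.read(buffer);
        if (n < 0) {
            // Peer closed the connection: closing the channel also cancels the key.
            channel.close();
            return n;
        }
        // Re-enable read interest. Reading and writing the interest set is the
        // step that may block, per the SelectionKey JavaDocs, if the implementation
        // synchronizes it with a selection operation that is in progress.
        key.interestOps(key.interestOps() | SelectionKey.OP_READ);
        // Wake the selecting thread so it picks up the new interest set.
        key.selector().wakeup();
        return n;
    }
}
```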
This works well on Windows. It deadlocks on Linux. The worker thread is
blocked getting the current interest set from the selection key. I found
this nice ditty at the end of the SelectionKey JavaDocs:
"Selection keys are safe for use by multiple concurrent threads. The
operations of reading and writing the interest set will, in general, be
synchronized with certain operations of the selector. Exactly how this
synchronization is performed is implementation-dependent: In a naive
implementation, reading or writing the interest set may block
indefinitely if a selection operation is already in progress; in a
high-performance implementation, reading or writing the interest set may
block briefly, if at all. In any case, a selection operation will always
use the interest-set value that was current at the moment that the
operation began."
Okay, so I think Sun chose the naive implementation for Linux. If I
"tickle" the selector thread using another incoming connection, it
unblocks, which then causes the worker thread to progress and then the
interest set is updated in the key the worker is dealing with.
So, I can't, for the life of me, figure out how NIO is useful with this
behavior. Can somebody show me what I'm missing here? I just can't see
how one can use multiple threads with NIO given the current behavior. I
mean, I can't call SelectionKey.selector().wakeup() before I try to set
the interest set as that would cause a race between the selector thread
going back to select and the worker thread trying to set the
interestOps. If the selector thread is blocking on Selector.select(),
however, then I can't retrieve/set the interestOps for *any* key that is
registered with that selector.
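
One arrangement that is often suggested for this kind of ordering problem, sketched here under assumptions rather than as a confirmed fix for the behavior above, is to let only the selecting thread touch interestOps: workers queue the key they want re-armed and call wakeup(), and the selecting thread applies the queued changes just before each select(). The class, field, and method names are hypothetical. It relies on the documented Selector.wakeup() behavior that, if no selection is in progress, the next selection operation returns immediately, so a request queued between the drain and the select() is not lost.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class PendingOpsSketch {
    private final Selector selector;
    // Keys whose OP_READ interest a worker wants restored (hypothetical queue).
    private final Queue<SelectionKey> rearmQueue = new ConcurrentLinkedQueue<>();

    PendingOpsSketch(Selector selector) {
        this.selector = selector;
    }

    // Worker thread: never touches the interest set itself.
    void requestRead(SelectionKey key) {
        rearmQueue.add(key);
        // If select() is not running right now, this makes the next one return at once.
        selector.wakeup();
    }

    // Selecting thread: apply queued changes, then select.
    int applyPendingAndSelect(long timeoutMillis) throws IOException {
        SelectionKey key;
        while ((key = rearmQueue.poll()) != null) {
            if (key.isValid()) {
                key.interestOps(key.interestOps() | SelectionKey.OP_READ);
            }
        }
        return selector.select(timeoutMillis);
    }
}
```

The cost is an extra wakeup and one more trip around the select loop per re-arm, since a change queued during a select() only takes effect on the following pass.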
Interestingly, Sun seems to have chosen the non-naive ("high
performance") behavior for Windows, where this works as one would expect.
For what it's worth, this is being tested with RH 9. I have tried to
disable NPTL, just in case, using "export LD_ASSUME_KERNEL=2.4.1" and it
didn't help.
Thanks for any help somebody can provide to either educate me or confirm
that this is just useless behavior.
-- Dave