IO::Select::select() says no readable data even if there are

jari.eskelinen

Hi there,

I am writing a simple server-client program with
ActiveState ActivePerl 5.8.7 on Win32, but I am having some
weird problems reading from sockets. I have googled
for a day now and have found no solution.

I am using line-based communication in order to keep
reading from and writing to sockets easy and to avoid buffering
problems. However, buffering problems are exactly what I have:

When I write two lines from the server to the socket and call
IO::Select::select() (or can_read()) in the client, select()
returns the handle that can be read. I read the first line from
the socket using <$socket> and then call select() again. select() should
return the handle again because there is an unread second line.
However, it does not return anything. If I read from the socket
despite select()'s return value, I get the second line just as
expected.

So the behaviour of select() is unexpected. Or should select()
behave like that, returning a filehandle that can be read only
once, even if there are more data? I read select()'s
documentation, but got the impression that it should ALWAYS return
handles if there are data available.

What am I doing wrong? Thank you very much in advance if
somebody can give me an answer!

P.S. Code for reproducing the problem:

--------- server.pl -----------
#!/usr/bin/perl -w
use strict;
use IO::Socket;
use IO::Select;

my $socket = new IO::Socket::INET(
    LocalAddr=>'localhost',
    LocalPort=>'1234',
    Proto=>'TCP',
    Listen =>1);
binmode($socket);
$socket->autoflush(1);

my $select = new IO::Select();
$select->add($socket);

while (1) {
    my @readable = $select->can_read(0);
    foreach my $rsock (@readable) {
        if ($rsock == $socket) {
            my $ns = $rsock->accept();
            binmode($ns);
            $ns->autoflush(1);
            $select->add($ns);
            print {$ns} "Hello there, I am your server.\r\n";
            print {$ns} "Ready to obey?.\r\n";
        } elsif (eof($rsock)) {
            $select->remove($rsock);
            close($rsock);
        } else {
            my $buffer = <$rsock>;
            print "Client said: $buffer\n";
        }
    }

    sleep(1);
}
----------------------------------

----------- client.pl ------------
#!/usr/bin/perl -w
use strict;
use IO::Socket;
use IO::Select;
use Data::Dumper;

my $socket = new IO::Socket::INET(
    PeerAddr=>'localhost',
    PeerPort=>'1234',
    Proto=>'TCP',
);
binmode($socket);
$socket->autoflush(1);

my $select = new IO::Select();
$select->add($socket);

while (1) {
    my @readable = $select->can_read(0);
    print Dumper \@readable;
    foreach my $rsock (@readable) {
        my $buffer = <$rsock>;
        print "Server said: $buffer\n";
    }

    sleep(1);
}
----------------------------------

When running server.pl and client.pl, the client prints the following:

$VAR1 = [];
$VAR1 = [
bless( \*Symbol::GEN0, 'IO::Socket::INET' )
];
Server said: Hello there, I am your server.

$VAR1 = [];
$VAR1 = [];

Etc. It is clearly seen that select() (or in this case can_read())
indicates that there is nothing more to read, even when there is.

Best regards,
Jari Eskelinen
 
jari.eskelinen

> Etc. It is clearly seen that select() (or in this case can_read())
> indicates that there is nothing more to read, even when there is.

Found a solution. When using $socket->recv() instead of <$socket>,
it won't interfere with select(). So I wrote a subroutine which reads
from the socket until a newline:

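sub recvSocket {
    my ($socket) = @_;

    my $buffer = '';
    my $c;
    # recv() returns undef on error; on a connected TCP socket a successful
    # call may return an empty (false) string, so test definedness instead
    # of truth. An empty $c means the peer has closed the connection.
    while (defined $socket->recv($c, 1) && length $c && $c ne "\n") {
        $buffer .= $c;
    }
    $buffer .= "\n";

    return $buffer;
}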

Wham, everything works. Still, I cannot help but wonder whether there is a bug in
IO::Socket / IO::Select under Win32, or whether this is supposed to work like
this... Cygwin's perl also did the same thing when I tried.

Best regards,
Jari Eskelinen
 
vparseval

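> Found a solution. When using $socket->recv() instead of <$socket>,
> it won't interfere with select(). So I wrote a subroutine which reads
> from the socket until a newline:
>
> sub recvSocket {
>     my ($socket) = @_;
>
>     my $buffer = '';
>     my $c;
>     # recv() returns undef on error; on a connected TCP socket a successful
>     # call may return an empty (false) string, so test definedness instead
>     # of truth. An empty $c means the peer has closed the connection.
>     while (defined $socket->recv($c, 1) && length $c && $c ne "\n") {
>         $buffer .= $c;
>     }
>     $buffer .= "\n";
>
>     return $buffer;
> }
>
> Wham, everything works. Still, I cannot help but wonder whether there is a bug in
> IO::Socket / IO::Select under Win32, or whether this is supposed to work like
> this... Cygwin's perl also did the same thing when I tried.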

This is not surprising. You said you were originally using line-based
reading in order to avoid buffering-issues. However, line-based
communication is the very thing that will expose your program to
these issues.

When you read line-wise, select() may not know that there is more
data to read. On the first line-read, perl's IO-layer probably reads a
bigger chunk than just the first line and keeps it in an internal
buffer.

This is what happens when you are using fread(3) in C on a FILE
struct which contains such a buffer in order to avoid doing too many
read(2) system-calls.

select(2) on the other hand acts on the underlying file-descriptor and
does not take into account what is left in the buffer.
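
A minimal sketch of that effect, assuming a Unix-like perl where select() also works on pipes (on Win32, select() only supports sockets, so a real socket would be needed there). The pipe and all variable names are stand-ins for illustration, not part of the original programs:

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

# A pipe stands in for the socket; to select() both are just descriptors.
pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);

# Two lines arrive before the reader looks at the descriptor.
print {$writer} "Hello there, I am your server.\n";
print {$writer} "Ready to obey?\n";

my $select = IO::Select->new($reader);

my @before = $select->can_read(0);   # descriptor has unread data
my $first  = <$reader>;              # fills perl's buffer, returns line 1
my @after  = $select->can_read(0);   # descriptor itself is now drained
my $second = <$reader>;              # line 2 is served from perl's buffer

printf "handles ready before first read: %d\n", scalar @before;
printf "handles ready after first read:  %d\n", scalar @after;
print  "second line was still there: $second";
```

The second can_read(0) should come back empty even though <$reader> still has a line to hand out, which is the behaviour reported in the original post.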

Even though the perldocs explicitly state that select() should not be
mixed with line-wise processing, there is a dim chance that you could
still make it work by slurping into an array instead of reading just
one line. Instead of:
    my $buffer = <$rsock>;
    print "Client said: $buffer\n";

try:
    my @buffer = <$rsock>;
    print "Client said: ", join '', @buffer;

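Be aware that in list context <$rsock> reads until end-of-file, so on an
open socket it blocks until the peer closes the connection. Line-wise
reading also stalls when, say, only one and a half lines were sent by
the client: the half line remains in the buffer waiting for the other
half (with the terminating newline) to arrive.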

The correct solution is to use sysread() or -- as you did -- recv()
which may even end up doing the same system-calls depending
on the platform.
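
One way the sysread() approach could look is sketched below, assuming one read per handle that select() reported as readable and a per-handle buffer kept by the program itself. The helper name read_lines and the %pending hash are made up for this example, and the pipe again stands in for a socket on a Unix-like perl:

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

my %pending;    # per-handle bytes received but not yet returned as lines

# Hypothetical helper: call once for each handle select() reported readable.
# Returns the complete lines now available (possibly none) and sets
# $$eof_ref to true once the peer has closed the connection.
sub read_lines {
    my ($fh, $eof_ref) = @_;
    my $key = fileno $fh;
    $pending{$key} = '' unless defined $pending{$key};

    # sysread() bypasses perl's own buffer, so select() and the program
    # agree about what is still unread on the descriptor.
    my $got = sysread($fh, $pending{$key}, 4096, length $pending{$key});
    die "sysread failed: $!" unless defined $got;
    $$eof_ref = ($got == 0);

    my @lines;
    # Hand out complete lines; a trailing partial line stays in %pending.
    while ($pending{$key} =~ s/\A(.*?)\r?\n//) {
        push @lines, $1;
    }
    return @lines;
}

pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);
print {$writer} "Hello there, I am your server.\r\nReady to obey?\r\nhalf a li";

my $select = IO::Select->new($reader);
my (@lines, $eof);
for my $fh ($select->can_read(0)) {
    push @lines, read_lines($fh, \$eof);
}
print "Server said: $_\n" for @lines;
```

The incomplete "half a li" stays in the program's own buffer until a later read delivers the rest of the line, so nothing is lost and nothing blocks.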

Cheers,
Tassilo
 
Ben Morrow

Quoth (e-mail address removed):
> I am using line-based communication in order to keep
> reading from and writing to sockets easy and to avoid buffering
> problems. However, buffering problems are exactly what I have:
>
> When I write two lines from the server to the socket and call
> IO::Select::select() (or can_read()) in the client, select()
> returns the handle that can be read. I read the first line from
> the socket using <$socket> and then call select() again. select() should
> return the handle again because there is an unread second line.
> However, it does not return anything. If I read from the socket
> despite select()'s return value, I get the second line just as
> expected.

As you said, you have buffering problems. select(2) operates on the
low-level filedescriptor (or its analogue under Win32); when you use
<> Perl reads a whole bufferful of data and then returns only the first
line. There really is nothing to read from the raw socket: the next line
is already read into the buffer.

With 5.8 you ought to be able to use <> in an unbuffered fashion by
pushing the :unix PerlIO layer; I haven't tried it though. It may be
very inefficient: a better solution would be to use sysread/syswrite and
do the buffering yourself.

:) It's nice to see someone getting it right...

Some unrelated comments:
> --------- server.pl -----------
> #!/usr/bin/perl -w

You want
    use warnings;
nowadays, instead of -w.

> use strict;
> use IO::Socket;
> use IO::Select;
>
> my $socket = new IO::Socket::INET(
>     LocalAddr=>'localhost',
>     LocalPort=>'1234',
>     Proto=>'TCP',
>     Listen =>1);
> binmode($socket);
> $socket->autoflush(1);
>
> my $select = new IO::Select();
> $select->add($socket);
>
> while (1) {
>     my @readable = $select->can_read(0);
                                      ^^
This, plus the sleep below, means you are polling for data (trying again
and again at regular intervals). This is very unfriendly to other
processes on the system: much better is to specify infinite timeout
(can_read with no argument) and let the system wake you up when
something happens.

>     foreach my $rsock (@readable) {
>         if ($rsock == $socket) {
>             my $ns = $rsock->accept();
>             binmode($ns);
>             $ns->autoflush(1);
>             $select->add($ns);
>             print {$ns} "Hello there, I am your server.\r\n";
>             print {$ns} "Ready to obey?.\r\n";
>         } elsif (eof($rsock)) {

Personally I wouldn't use eof. I have no idea if it works reliably on
sockets, and in any case I don't think it plays with sysread/syswrite/
select at all. Detect EOF by waiting for sysread to return 0.

>             $select->remove($rsock);
>             close($rsock);
>         } else {
>             my $buffer = <$rsock>;
>             print "Client said: $buffer\n";
>         }
>     }
>
>     sleep(1);
> }
> ----------------------------------
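
A rough sketch of both suggestions together, assuming a Unix-like perl where a pipe can stand in for a client connection. It uses can_read(1) rather than no argument, as one possible compromise for a program that also has periodic work to do: the call returns as soon as data arrives, so the timeout takes the place of sleep(1). Variable names are illustrative only:

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

# A pipe stands in for a client connection so the sketch runs on its own.
pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);
print {$writer} "Hello, server.\n";
close $writer;                        # the "client" disconnects

my $select = IO::Select->new($reader);
my ($received, $saw_eof, $idle_ticks) = ('', 0, 0);

while ($select->count) {
    # Wait for at most one second; wakes up immediately when data arrives.
    # With no argument, can_read() would wait indefinitely instead.
    my @readable = $select->can_read(1);

    for my $fh (@readable) {
        my $got = sysread($fh, my $chunk, 4096);
        die "sysread failed: $!" unless defined $got;
        if ($got == 0) {              # sysread returning 0 means EOF
            $select->remove($fh);
            close $fh;
            $saw_eof = 1;
        } else {
            $received .= $chunk;
        }
    }

    # Periodic work (filesystem checks and so on) would go here.
    $idle_ticks++ unless @readable;
}
print "received: $received";
```

A closed connection shows up as a readable handle whose sysread() returns 0, so no separate eof() call is needed.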

Ben
 
anno4000

Ben Morrow said:
> Quoth (e-mail address removed):
>
> :) It's nice to see someone getting it right...

Are you saying it is always wrong to use "data" with a singular verb?

My understanding is that in English "data" can mean a collection of
individual facts, in which case it is plural. It can also mean a body
of facts, synonymous with "information" and as such is construed singular.

In Latin "data" is of course unambiguously plural.

Anno
 
Ben Morrow

Quoth (e-mail address removed)-berlin.de:
> Are you saying it is always wrong to use "data" with a singular verb?
>
> My understanding is that in English "data" can mean a collection of
> individual facts, in which case it is plural. It can also mean a body
> of facts, synonymous with "information" and as such is construed singular.

Traditionally, 'data' was always plural, a straight steal from Latin. It
has become common practice to use it as a singular (continuous) noun,
and this in turn has become one of the things people who complain about
grammar complain about :). In particular, in a statistical/whatever
context, 'The data suggests <foo>' is definitely wrong.

However, when one is using the pipe analogy of data flowing from one
place to another, it is very hard to resist using a continuous noun, so
I would say usage has changed and in some cases the singular is correct.
After all, we're dealing with rather more than just a few sets of
numbers nowadays, so a continuous noun is really more appropriate.

I have noticed in the past that Damian at least always uses 'data' as a
plural...

Ben
 
David Squire

> Are you saying it is always wrong to use "data" with a singular verb?
>
> My understanding is that in English "data" can mean a collection of
> individual facts, in which case it is plural. It can also mean a body
> of facts, synonymous with "information" and as such is construed singular.
>
> In Latin "data" is of course unambiguously plural.

Pedants will say that even in English "data" is always plural, and even
when referring to a collection it should be "the data are". When
referring to a single item they would have us say "the datum is".

I, however, am of the descriptive rather than prescriptive school [1],
and in modern English usage "the data is" is perfectly acceptable.


DS


[1] Still, it's worth knowing what prescriptivists want, so that you can
tailor your responses should one be interviewing you for a job :)
 
jari.eskelinen

> line. There really is nothing to read from the raw socket: the next line
> is already read into the buffer.

Thank you very much for the information, and thanks to Tassilo too.
Now I understand why this works the way it does, and understanding something
is more valuable than just solving the problem.

> This, plus the sleep below, means you are polling for data (trying again
> and again at regular intervals). This is very unfriendly to other
> processes on the system: much better is to specify infinite timeout
> (can_read with no argument) and let the system wake you up when
> something happens.

Yes, I know that polling keeps the system busy, and that's why I am
sleep(1)'ing on every iteration of the loop. My server program (the example
code was just a fraction) does a lot of event-based work (it also polls the
filesystem, calculates some things and does various things depending on the
situation), and socket communication is only a tiny part of it. So I loop my
program infinitely, sleeping a while between iterations and polling
the sockets etc. The server just can't halt while waiting for data from a
socket - it must work all the time. The same goes for the client - it is a Perl/GTK2
program which cannot halt while waiting for data from the server, otherwise the user
interface will also halt.

My first idea was to do polling as described above. I am not very
familiar with this kind of programming, so I do not know if there would
be much better solutions. Perhaps using threads, event modules or
something like that. Infinite-loop polling seemed to be the easiest to
implement, so I went that way. If there are much better ways, however,
please point me in the right direction (threading...?).

> Personally I wouldn't use eof. I have no idea if it works reliably on
> sockets, and in any case I don't think it plays with sysread/syswrite/
> select at all. Detect EOF by waiting for sysread to return 0.

Just saw that in some socket examples so I adopted habit of using eof.
I'll switch to sysread, thanks for pointing it out.

Best regards,
Jari Eskelinen
 
David Squire

Ben said:
> I have noticed in the past that Damian at least always uses 'data' as a
> plural...

You are talking about the man who wrote Lingua::Romana::perligata :)


DS
 
Ben Morrow

[please attribute quotations]

Quoth (e-mail address removed):
> Yes, I know that polling keeps the system busy, and that's why I am
> sleep(1)'ing on every iteration of the loop. My server program (the example
> code was just a fraction) does a lot of event-based work (it also polls the
> filesystem, calculates some things and does various things depending on the
> situation), and socket communication is only a tiny part of it. So I loop my
> program infinitely, sleeping a while between iterations and polling
> the sockets etc. The server just can't halt while waiting for data from a
> socket - it must work all the time.

You should use a proper event system to handle this: perhaps POE or
Event.pm. I confess I've never used either of these. I presume you could
also use Glib.pm, if you like, to match the client.
> The same goes for the client - it is a Perl/GTK2
> program which cannot halt while waiting for data from the server, otherwise the user
> interface will also halt.

GTK2 has a facility (Gtk2::Helper->add_watch) to watch a filehandle in
addition to waiting for GUI events. In general any toolkit that does its
own event management will do the same.
> Perhaps using threads,

You could cook something up with threads, but I'd strongly recommend
against it. It would certainly be slower (unless your app ends up
CPU-bound and you have a multi-processor machine), and it would probably
be much harder to get right. At least with the client you're going to
have to structure your program 'inside-out' with callbacks anyway, so
there would be no gain whatever.

Ben
 
Dr.Ruud

Ben Morrow schreef:
anno4000:

> Traditionally, 'data' was always plural, a straight steal from Latin.
> It has become common practice to use it as a singular (continuous)
> noun, and this in turn has become one of the things people who
> complain about grammar complain about :). In particular, in a
> statistical/whatever context, 'The data suggests <foo>' is definitely
> wrong.
>
> However, when one is using the pipe analogy of data flowing from one
> place to another, it is very hard to resist using a continuous noun,
> so I would say usage has changed and in some cases the singular is
> correct. After all, we're dealing with rather more than just a few
> sets of numbers nowadays, so a continuous noun is really more
> appropriate.

In Dutch you can on top of that choose between data and datums, where
both data and datums can mean "multiple points in time" (as the English
"dates", but then <g> mostly without the place-connotation).

http://en.wikipedia.org/wiki/Datum http://en.wikipedia.org/wiki/Data
http://nl.wikipedia.org/wiki/Datum http://nl.wikipedia.org/wiki/Data
http://de.wikipedia.org/wiki/Datum http://de.wikipedia.org/wiki/Daten
 
Charles DeRykus

David said:
> ...
>
> [1] Still, it's worth knowing what prescriptivists want, so that you can
                                     ^^^^^^^^^^^^^^^

Somehow 'prescribers' seems more reminiscent of those administering
awful medicine .... :)

Forgive me -- no Perl content at all...
 
David Squire

Charles said:
> David said:
>> ...
>>
>> [1] Still, it's worth knowing what prescriptivists want, so that you can
>                                     ^^^^^^^^^^^^^^^
>
> Somehow 'prescribers' seems more reminiscent of those administering
> awful medicine .... :)

True, but not a synonym in this case. In grammar-land, the wars rage
between the descriptive and prescriptive schools, and they call
themselves descriptivists and prescriptivists. See
http://en.wikipedia.org/wiki/Prescriptive_grammar

Still, as a descriptivist, I guess I should swallow your 'prescribers'
prescription :)

ObPerl: Perl's flexible syntax makes it perhaps the most
descriptivist-feeling programming language I have used.
 
jari.eskelinen

> You should use a proper event system to handle this: perhaps POE or
> Event.pm. I confess I've never used either of these. I presume you could
> also use Glib.pm, if you like, to match the client.

Thanks Ben. I was too narrow-minded and thought that I had no energy
to learn to use events in addition to learning how to use sockets. However,
I decided to give it a try and was then able to cut my
networking/polling code in half and in general to make the code much, much
simpler using Glib::Event. Even more importantly, my program now
responds immediately instead of sleep()ing between polls.

Thanks to everyone for helping me out; I think I learned a lot.

Best regards,
Jari Eskelinen
 
