ssize_t and size_t

kid joe

Hi all,

I was thinking about interfaces like this one for Unix read

ssize_t read(int fd, void *buf, size_t count);

(I know that read isn't an ISO C function, but the question is nothing to
do with read, just the function interface).

The return value is -1 in case of error, or an integer between 0 and count
giving the number of bytes read into buf.

This seems like a poor choice to me.... If I pass SIZE_MAX for count, then
there's no way of distinguishing between an error (-1) and a successful
read of ((size_t) -1) bytes.

Wouldn't it be better for the function to return an unsigned size_t and
notify the caller of errors in another way (eg an extra parameter)?

Cheers,
Joe
 
Antoninus Twink

This seems like a poor choice to me.... If I pass SIZE_MAX for count,
then there's no way of distinguishing between an error (-1) and a
successful read of ((size_t) -1) bytes.

This is hardly likely to be a problem in practise - if you try to
allocate a buffer of size SIZE_MAX to pass to read(), it's rather likely
that malloc() will return a null pointer...
 
Nate Eldredge

kid joe said:
Hi all,

I was thinking about interfaces like this one for Unix read

ssize_t read(int fd, void *buf, size_t count);

(I know that read isn't an ISO C function, but the question is nothing to
do with read, just the function interface).

The return value is -1 in case of error, or an integer between 0 and count
giving the number of bytes read into buf.

This seems like a poor choice to me.... If I pass SIZE_MAX for count, then
there's no way of distinguishing between an error (-1) and a successful
read of ((size_t) -1) bytes.

There is, actually: set `errno' to 0 beforehand and check afterwards to
see if it is nonzero. Not the most convenient thing, but your only
option if you might pass SIZE_MAX as an argument.

However, better would be not to do that at all, and that's usually the
official stance: don't do that. For instance, my system's man page for
`read' says that any value for `count' larger than INT_MAX is
erroneous, and read() returns -1 and sets errno to EINVAL. So if
`count' is SIZE_MAX (typically larger than INT_MAX), you know ahead of
time that the read() is going to fail.

You're right, though, it isn't an ideally designed interface. However,
it has the benefit of being simple.

Wouldn't it be better for the function to return an unsigned size_t and
notify the caller of errors in another way (eg an extra parameter)?

Probably. But the design of the interface is so historical that it
isn't likely to be changed at this point. When it was designed, I
presume all these arguments were expected to be `int', and sizes that
might not fit in a positive `int' were probably considered outlandishly
large (on 32-bit machines, at least).
 
Keith Thompson

kid joe said:
I was thinking about interfaces like this one for Unix read

ssize_t read(int fd, void *buf, size_t count);

(I know that read isn't an ISO C function, but the question is nothing to
do with read, just the function interface).

The return value is -1 in case of error, or an integer between 0 and count
giving the number of bytes read into buf.

This seems like a poor choice to me.... If I pass SIZE_MAX for count, then
there's no way of distinguishing between an error (-1) and a successful
read of ((size_t) -1) bytes.

Wouldn't it be better for the function to return an unsigned size_t and
notify the caller of errors in another way (eg an extra parameter)?

There's certainly a good case to be made for that.
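As a rough sketch of what such an interface could look like, here is a thin wrapper over read(). The name read_with_status and its err parameter are made up for illustration and are not part of POSIX. Because it still calls read() underneath, it does not lift the limit on how large count can usefully be; it only shows the shape of an interface that keeps the error report out of the return value.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical wrapper, not a POSIX function.
 * Returns the number of bytes read as an unsigned size_t.
 * On success *err is set to 0; on failure *err receives the errno
 * value from read() and the function returns 0. */
size_t read_with_status(int fd, void *buf, size_t count, int *err)
{
    assert(err != NULL);

    ssize_t n = read(fd, buf, count);
    if (n < 0) {
        *err = errno;   /* error is reported out of band */
        return 0;
    }
    *err = 0;
    return (size_t)n;
}
```

A caller would test err rather than the return value, so a return of 0 with err == 0 still means end of file, as it does with plain read().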

On the other hand, both Unix and C have a long tradition of squeezing
error indications into results alongside useful data. Examples in
standard C include the time() function, which returns either the
current time or (time_t)-1, and getchar(), which returns either a
character value or EOF; both of these can cause problems.
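To illustrate the getchar() case: the usual way to keep EOF distinct from real data is to store the result in an int rather than a char. The sketch below uses getc() on an arbitrary stream instead of getchar() so it can be run against any FILE *; the helper name count_bytes is an assumption made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in a stream. */
long count_bytes(FILE *fp)
{
    assert(fp != NULL);

    long n = 0;
    int c;  /* int, not char: must hold every unsigned char value plus EOF */

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

With c declared as char, a data byte of 0xFF could compare equal to EOF on some implementations and end the loop early, which is one of the problems this kind of in-band signalling can cause.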

The read function originally returned an int (source: K&R1 chapter 8,
The UNIX System Interface). It was probably just assumed that you'd
never try to read more than INT_MAX bytes at a time.

Today, on a 32-bit system, size_t and ssize_t are probably going to be
32 bits; you *might* want to read between 2 and 4 gigabytes in a
single operation, but it's not likely. On a 64-bit system, size_t and
ssize_t are probably going to be 64 bits; it's going to be a while
before single read operations between 8 and 16 exabytes become common,
or even possible. (Of course those aren't the only possibilities
consistent with the standard.)
 
Keith Thompson

Nate Eldredge said:
There is, actually: set `errno' to 0 beforehand and check afterwards to
see if it is nonzero. Not the most convenient thing, but your only
option if you might pass SIZE_MAX as an argument.
[...]

That doesn't work reliably. I'm not sure about read(), but generally
for functions that can set errno, you shouldn't check the value of
errno until the function actually tells you that there's been an error
(by whatever mechanism it uses).

For example, I've seen cases where an output routine checks whether
its output stream is a tty. The code that performs this check might
set errno as a side effect. The higher-level routine needn't set
errno to a meaningful value unless it fails.

So you need to set errno to 0, then call the routine, then check
whether it signals an error, and only then check the value of errno.
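A minimal sketch of that ordering, using strtol() as the example since it is standard C and reports range errors only through errno. The choice of strtol() and the helper name parse_long are assumptions for illustration; nothing in this thread names them.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a decimal long from s.
 * Returns 0 and stores the value in *out on success,
 * -1 if no digits were found or the value is out of range. */
int parse_long(const char *s, long *out)
{
    assert(s != NULL && out != NULL);

    char *end;
    errno = 0;                        /* 1. clear errno */
    long v = strtol(s, &end, 10);     /* 2. call the routine */

    if (end == s)
        return -1;                    /* no conversion performed */

    /* 3. look at errno only after the return value signals trouble */
    if ((v == LONG_MAX || v == LONG_MIN) && errno == ERANGE)
        return -1;

    *out = v;
    return 0;
}
```

The return value is examined first, so a stray errno value left behind by a successful call is never mistaken for a failure.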
 
Nate Eldredge

pete said:
If ssize_t can represent SIZE_MAX,

then (-1 != (ssize_t)SIZE_MAX) is defined as true.

Typically it cannot. On all of the four different Unix systems I just
checked, size_t and ssize_t are unsigned and signed versions of the same
integer type (e.g. unsigned int vs signed int, unsigned long vs signed
long, uint64_t vs int64_t).
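A small check along these lines can be compiled on a given system. It is only a sketch: the function names are made up, and it assumes a POSIX environment where SSIZE_MAX comes from <limits.h> and ssize_t from <sys/types.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>     /* SSIZE_MAX (POSIX) */
#include <stddef.h>
#include <stdint.h>     /* SIZE_MAX, uintmax_t */
#include <sys/types.h>  /* ssize_t */

/* Returns 1 if ssize_t can represent every size_t value, 0 otherwise. */
int ssize_can_hold_size_max(void)
{
    assert(SSIZE_MAX > 0);
    return (uintmax_t)SSIZE_MAX >= (uintmax_t)SIZE_MAX;
}

/* Returns 1 if size_t and ssize_t occupy the same number of bytes. */
int same_width(void)
{
    return sizeof(size_t) == sizeof(ssize_t);
}
```

On systems like the ones described above, same_width() would be expected to return 1 and ssize_can_hold_size_max() to return 0.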
 
James Kuyper

kid said:
Hi all,

I was thinking about interfaces like this one for Unix read

ssize_t read(int fd, void *buf, size_t count);

(I know that read isn't an ISO C function, but the question is nothing to
do with read, just the function interface).

The return value is -1 in case of error, or an integer between 0 and count
giving the number of bytes read into buf.

This seems like a poor choice to me.... If I pass SIZE_MAX for count, then
there's no way of distinguishing between an error (-1) and a successful
read of ((size_t) -1) bytes.


The man page for read() on my desktop says "If count is greater than
SSIZE_MAX, the result is unspecified." ((size_t)-1) will generally be
greater than SSIZE_MAX, so you shouldn't even be attempting this.
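One way to stay inside the specified range is to clamp the request before calling read(). This is a sketch under the assumption of a POSIX system that defines SSIZE_MAX in <limits.h>; the wrapper name read_clamped is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>   /* SSIZE_MAX (POSIX) */
#include <stddef.h>
#include <unistd.h>

/* Hypothetical wrapper: never asks read() for more than SSIZE_MAX bytes,
 * so every successful result fits in ssize_t and -1 can only mean an error. */
ssize_t read_clamped(int fd, void *buf, size_t count)
{
    assert(buf != NULL || count == 0);

    if (count > (size_t)SSIZE_MAX)
        count = (size_t)SSIZE_MAX;

    return read(fd, buf, count);
}
```

Since read() may already return fewer bytes than requested, callers that loop until they have everything they need see no difference from the clamp.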
 
Stephen Sprunk

Nate said:
There is, actually: set `errno' to 0 beforehand and check afterwards to
see if it is nonzero. Not the most convenient thing, but your only
option if you might pass SIZE_MAX as an argument.

However, better would be not to do that at all, and that's usually the
official stance: don't do that. For instance, my system's man page for
`read' says that any value for `count' larger than INT_MAX is
erroneous, and read() returns -1 and sets errno to EINVAL. So if
`count' is SIZE_MAX (typically larger than INT_MAX), you know ahead of
time that the read() is going to fail.

Aside: This is due to the fact that read() dates from C's "everything
is an int" days long ago. The "count" parameter was originally an int,
therefore it was not possible to pass a count larger than INT_MAX, and
therefore a negative (int) return value unquestionably meant an error.
However, when ANSI created size_t, the POSIX folks went back and changed
many of their types to size_t, which enabled passing larger values in
many cases because size_t was unsigned; however, the return value for
many other functions needed to accommodate negative values so it was
changed to ssize_t (which they invented for the purpose and is not part
of ANSI/ISO C).

The result is what Nate explains: you can't meaningfully pass a value
larger than SIZE_MAX/2 to read(), even though the parameter has type
size_t -- and most implementations will check for that case and
immediately return an error if you try it. Implementations with a
64-bit size_t and a 32-bit int may also disallow passing a count greater
than INT_MAX, even though that would be meaningful.

Any time you see a return value of ssize_t, expect this limitation to
rear its ugly head.

S
 
