size_t problems

R

Richard Tobin

Why? It returns, in the case of a mad string (i.e. bigger than int),
when i wraps to 0. Assuming i does that in the standard.

Integer overflow is allowed to be an error (for signed types it is
undefined behaviour). But on most systems, huge positive integers wrap
around to huge negative ones and only get to zero again when they are
doubly huge.

-- Richard
 
M

Martin Wells

Keith Thompson:
"const" in a parameter declaration doesn't do anything useful for the
caller, since (as I'm sure you know) a function can't modify an
argument anyway.


Agreed, it's just a waste of letters.

It does prevent the function from (directly)
modifying its own parameter (a local object), but that's of no concern
to the caller.


If I don't plan on changing a variable's value, then I make it const,
including in the parameter list of a function.

It would make more sense to be able to specify "const" in the
*definition* of a function but not in the *declaration*. And gcc
seems to allow this:

int foo(int x);

int main(void)
{
    return foo(0);
}

int foo(const int x)
{
    return x;
}

but I'm not sure whether it's actually legal. In any case, it's not a
style that seems to be common.


I haven't written much C in a while, but I think I used to do that and
had no problem.

Martin
 
M

Martin Wells

Ian Collins:
If you use casts frequently in C, you are doing something wrong.


Depends entirely on the nature of the code. I've written portable code
before that is littered with casts for very good reasons.

If you use naked casts at all in C++, you are doing something very wrong.


No, this is a phobia. If a C++ programmer had any sense, they'd
realise that the following two expressions are identical in every way:

MyType(x)

(MyType)x

Try it if you don't believe me.

I only use the more flowery casts when I'm actually dealing with
user-defined class types and so forth.

There's nothing at all wrong with writing the following in C++:

int x;

char *p = (char*)&x;

Going to the effort of writing "static_cast" just exposes a phobia.

Anyway, back to c.

In my shops we always have a rule that all casts require a comment, a
good way to make developers think twice before using them.


In the little snippet I wrote just above, I'd only write a comment
with it if my target audience only started programming yesterday at 3
O'Clock.

I can't find a compiler that issues a warning without the cast; just
out of interest, which one does?


IMO, any decent compiler should issue truncation warnings.

Martin
 
M

Martin Wells

jacob navia:
int Strlen_i(char *s)
{
    char *start = s;
    while (*s)
        s++;
    return s - start;
}


If you want an ounce of efficiency then try:

int Strlen_i(char const *const p)
{
    return (int)strlen(p);
}

That is to say, the platform's built-in strlen function is extremely
likely to be more efficient than anything you write.

Martin
 
C

CBFalconer

Martin said:
Ian Collins:


Depends entirely on the nature of the code. I've written portable
code before that is littered with casts for very good reasons.


No, this is a phobia. If a C++ programmer had any sense, they'd
realise that the following two expressions are identical in
every way:

MyType(x)

(MyType)x

Try it if you don't believe me.

Please don't confuse this newsgroup with C++. There is a separate
newsgroup where that (different) language is on topic.
 
C

CBFalconer

jacob said:
.... snip ...

Just

int Strlen_i(char *s)
{
    char *start = s;
    while (*s)
        s++;
    return s - start;
}
#define strlen Strlen_i;

At which point your code has undefined behaviour. Please read the
standard some day.
 
B

Ben Pfaff

Malcolm McLean said:
If you are indexing an arbitrary-length array, effectively now it is
an error to use int. That's a big change from what most people would
recognise as "C". It is also very undesirable that i, which holds the
index, is described as a "size_t" when it certainly doesn't hold a
size. N, the count, doesn't hold an amount of memory either, but is
also a size_t.

An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.
 
M

Malcolm McLean

Ben Pfaff said:
An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.
An arbitrary function, let's call it mean(), ought to be able to take any
array.

so
double mean(double *x, size_t N)

is correct. int will work, but might be a nuisance to caller.

However if we are to have a really whizzy mean, we will sort the numbers
before adding them.

So let's call qsort

void qsort(void *x, size_t N, size_t sz, int (*comp)(const void *, const void *));

Yes qsort() takes two size_t's as well. So we are OK. The system does work,
but only so long as we are absolutely consistent in using size_t everywhere.

My proposal is to 1) make size_t signed, 2) rename it int.
 
E

Ed Jensen

user923005 said:
Can those same languages create objects with a size too large to be
held in an integer?

Consider this Java code:

byte[] foo = new byte[N];

N must be type int. In Java, an int is a 32 bit signed value.
Therefore, you can't create a byte array with more than 2^31-1
elements.

Now consider this Java code:

short[] foo = new short[N];

Presumably, this could work on a 64 bit JVM, where N = 2^31-1.

The size of the resulting object, in bytes, is larger than the maximum
value a Java int can hold.

Full disclosure: I do not have access to a system capable of testing
this. These conclusions are based on my understanding of the Java
language.
If 'yes', then those languages are defective. If 'no', then integer
is the correct return.

A pointless observation. All programming languages are defective in
at least one way or another. ALL of them.

My point stands: Somehow, other programming languages get by just fine
returning an int when asked for the length of a string.
I can create a language with a single type. Somehow, I think it will
be less effective than C for programming tasks.

You may decide a programming language with only signed integer types
is less effective than C for programming tasks if you like; however,
it doesn't diminish the success or usefulness of those other languages.
Nor is that the only thing that should be considered when choosing a
programming language.
The way to minimize the pain of writing 100% portable code is to write
it correctly, according to the language standard. For instance, that
would include using size_t for object sizes. Now, pre-ANSI C did not
have size_t. So that code will require effort to repair.

Writing 100% portable C code is extremely non-trivial and when taken
to an extreme can interfere with the progress of a project.

I understand why size_t was invented, but I have some suspicions a
more pragmatic approach may have been superior, such as returning int
from strlen() instead of size_t.
 
C

Charlton Wilbur

BP> Implicit function declarations are part of C89. A compiler
BP> that rejects programs that use this feature is not an
BP> implementation of C89.

Yes, but --

a conforming compiler may issue any diagnostics it wishes, which means
it may certainly say "WARNING: implicitly declared function" or
something to that effect; and

most compilers need to be instructed to compile in strict ANSI/ISO
mode anyway, and so making the default behavior for implicitly
declared functions an error and only accepting them in strict mode
would be nicely consonant with that.

Charlton
 
R

Richard Tobin

Ben Pfaff said:
An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.

For a sufficiently restricted interpretation of array index. p[-3]
can be perfectly legal.

-- Richard
 
F

Flash Gordon

Malcolm McLean wrote, On 31/08/07 16:18:
An arbitrary function, let's call it mean(), ought to be able to take
any array.

so
double mean(double *x, size_t N)

is correct. int will work, but might be a nuisance to caller.

Only if the caller does not write correct code.
However if we are to have a really whizzy mean, we will sort the numbers
before adding them.

So let's call qsort

void qsort(void *x, size_t N, size_t sz, int (*comp)(const void *, const void *));

Yes qsort() takes two size_t's as well. So we are OK. The system does
work, but only so long as we are absolutely consistent in using size_t
everywhere.

Ah, he sees the light.
My proposal is to 1) make size_t signed, 2) rename it int.

Or perhaps not. Almost 20 years after a language is standardised is a
bit late to start trying to change it. Especially when it has proved
extremely successful.
 
K

Keith Thompson

jacob navia said:

There's no need to shout.
Just

int Strlen_i(char *s)
{
    char *start = s;
    while (*s)
        s++;
    return s - start;
}
#define strlen Strlen_i;

I think redefining strlen invokes undefined behavior; you're likely to
get away with it, but it might break when your code is compiled by
some other compiler. And if 'strlen' is already defined as a
function-like macro, redefining it as an object-like macro (without
first '#undef'ing it) is a constraint violation. (I'm assuming you
have a '#include <string.h>'.)

Why reimplement strlen rather than just calling it? It's a simple
enough function, but the implementation's strlen could well be faster
than your re-write. And by calling strlen and converting the result
to int, you make it easier to add a range check later.
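A sketch of the range check alluded to above (the error policy here,
returning -1 when the length won't fit in an int, is an assumption;
one could equally abort):

```c
#include <limits.h>
#include <string.h>

/* Wraps the library strlen and converts the result to int,
   refusing strings whose length will not fit. */
int Strlen_i(const char *s)
{
    size_t len = strlen(s);
    if (len > (size_t)INT_MAX)
        return -1;          /* hypothetical error policy */
    return (int)len;
}
```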
 
J

jacob navia

Martin said:
jacob navia:



If you want an ounce of efficiency then try:

int Strlen_i(char const *const p)
{
    return (int)strlen(p);
}

That is to say, the platform's built-in strlen function is extremely
likely to be more efficient than anything you write.

Martin

I just did it that way so that it was defined before the macro. But you
are right. Should do that.
 
M

Malcolm McLean

Flash Gordon said:
Malcolm McLean wrote, On 31/08/07 16:18:

Ah, he sees the light.
That's why Basic Algorithms is absolutely consistent in using int. Otherwise
I would either have to translate everything to size_t, or you would rapidly
risk a mess.
Or perhaps not. Almost 20 years after a language is standardised is a bit
late to start trying to change it. Especially when it has proved extremely
successful.
Effectively we are in a hiatus between standards. It looks like C99 will
never be widely implemented. So now is the time to get those nasty size_t's
out of our code.
 
