size_t problems

R

Richard Tobin

Why? It returns, in the case of a mad string (i.e. bigger than int),
when i wraps to 0. Assuming i does that in the standard.

Integer overflow is allowed to be an error (for signed types it is
undefined behaviour). But on most systems, huge positive integers wrap
around to huge negative ones and only get to zero again when they are
doubly huge.

-- Richard
 
M

Martin Wells

Keith Thompson:
"const" in a parameter declaration doesn't do anything useful for the
caller, since (as I'm sure you know) a function can't modify an
argument anyway.


Agreed, it's just a waste of letters.

It does prevent the function from (directly)
modifying its own parameter (a local object), but that's of no concern
to the caller.


If I don't plan on changing a variable's value, then I make it const,
including in the parameter list of a function.

It would make more sense to be able to specify "const" in the
*definition* of a function but not in the *declaration*. And gcc
seems to allow this:

int foo(int x);

int main(void)
{
    return foo(0);
}

int foo(const int x)
{
    return x;
}

but I'm not sure whether it's actually legal. In any case, it's not a
style that seems to be common.


I haven't written much C in a while, but I think I used to do that and
had no problem.

Martin
 
M

Martin Wells

Ian Collins:
If you use casts frequently in C, you are doing something wrong.


Depends entirely on the nature of the code. I've written portable code
before that is littered with casts for very good reasons.

If you use naked casts at all in C++, you are doing something very wrong.


No, this is a phobia. If a C++ programmer had any sense, they'd
realise that the following two expressions are identical in every way:

MyType(x)

(MyType)x

Try it if you don't believe me.

I only use the more flowery casts when I'm actually dealing with
user-defined class types and so forth.

There's nothing at all wrong with writing the following in C++:

int x;

char *p = (char*)&x;

Going to the effort of writing "static_cast" just exposes a phobia.

Anyway, back to c.

In my shops we always have a rule that all casts require a comment, a
good way to make developers think twice before using them.


In the little snippet I wrote just above, I'd only write a comment
with it if my target audience only started programming yesterday at 3
O'Clock.

I can't find a compiler that issues a warning without the cast; just
out of interest, which one does?


IMO, any decent compiler should issue truncation warnings.

Martin
 
M

Martin Wells

jacob navia:
int Strlen_i(char *s)
{
    char *start = s;
    while (*s)
        s++;
    return s - start;
}


If you want an ounce of efficiency then try:

int Strlen_i(char const *const p)
{
    return (int)strlen(p);
}

That is to say, the platform's built-in strlen function is extremely
likely to be more efficient than anything you write.

Martin
 
C

CBFalconer

Martin said:
Ian Collins:


Depends entirely on the nature of the code. I've written portable
code before that is littered with casts for very good reasons.


No, this is a phobia. If a C++ programmer had any sense, they'd
realise that the following two expressions are identical in
every way:

MyType(x)

(MyType)x

Try it if you don't believe me.

Please don't confuse this newsgroup with C++. There is a separate
newsgroup where that (different) language is on topic.
 
C

CBFalconer

jacob said:
.... snip ...

Just

int Strlen_i(char *s)
{
    char *start = s;
    while (*s)
        s++;
    return s - start;
}
#define strlen Strlen_i;

At which point your code has undefined behaviour. Please read the
standard some day.
 
B

Ben Pfaff

Malcolm McLean said:
If you are indexing an arbitrary-length array, effectively now it is
an error to use int. That's a big change from what most people would
recognise as "C". It is also very undesirable that i, which holds the
index, is described as a "size_t" when it certainly doesn't hold a
size. N, the count, doesn't hold an amount of memory either, but is
also a size_t.

An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.
 
M

Malcolm McLean

Ben Pfaff said:
An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.
An arbitrary function, let's call it mean(), ought to be able to take any
array.

so
double mean(double *x, size_t N)

is correct. int will work, but might be a nuisance to caller.

However if we are to have a really whizzy mean, we will sort the numbers
before adding them.

So let's call qsort

void qsort(void *x, size_t N, size_t sz, int (*comp)(const void *, const void *));

Yes qsort() takes two size_t's as well. So we are OK. The system does work,
but only so long as we are absolutely consistent in using size_t everywhere.

My proposal is to 1) make size_t signed, 2) rename it int.
 
E

Ed Jensen

user923005 said:
Can those same languages create objects with a size too large to be
held in an integer?

Consider this Java code:

byte[] foo = new byte[N];

N must be type int. In Java, an int is a 32 bit signed value.
Therefore, you can't create a byte array with more than 2^31-1
elements.

Now consider this Java code:

short[] foo = new short[N];

Presumably, this could work on a 64 bit JVM, where N = 2^31-1.

The size of the resulting object, in bytes, is larger than the maximum
value a Java int can hold.

Full disclosure: I do not have access to a system capable of testing
this. These conclusions are based on my understanding of the Java
language.
If 'yes', then those languages are defective. If 'no', then integer
is the correct return.

A pointless observation. All programming languages are defective in
at least one way or another. ALL of them.

My point stands: Somehow, other programming languages get by just fine
returning an int when asked for the length of a string.
I can create a language with a single type. Somehow, I think it will
be less effective than C for programming tasks.

You may decide a programming language with only signed integer types
is less effective than C for programming tasks if you like; however,
it doesn't diminish the success or usefulness of those other languages.
Nor is that the only thing that should be considered when choosing a
programming language.
The way to minimize the pain of writing 100% portable code is to write
it correctly, according to the language standard. For instance, that
would include using size_t for object sizes. Now, pre-ANSI C did not
have size_t. So that code will require effort to repair.

Writing 100% portable C code is extremely non-trivial and when taken
to an extreme can interfere with the progress of a project.

I understand why size_t was invented, but I have some suspicions a
more pragmatic approach may have been superior, such as returning int
from strlen() instead of size_t.
 
C

Charlton Wilbur

BP> Implicit function declarations are part of C89. A compiler
BP> that rejects programs that use this feature is not an
BP> implementation of C89.

Yes, but --

a conforming compiler may issue any diagnostics it wishes, which means
it may certainly say "WARNING: implicitly declared function" or
something to that effect; and

most compilers need to be instructed to compile in strict ANSI/ISO
mode anyway, and so making the default behavior for implicitly
declared functions an error and only accepting them in strict mode
would be nicely consonant with that.

Charlton
 
R

Richard Tobin

Ben Pfaff said:
An array of char can potentially have an index range of
0...SIZE_MAX. An array of any larger object type has a more
limited index range. Therefore, size_t is always a suitable type
for representing an array index.

For a sufficiently restricted interpretation of array index. p[-3]
can be perfectly legal.

-- Richard
 
F

Flash Gordon

Malcolm McLean wrote, On 31/08/07 16:18:
An arbitrary function, let's call it mean(), ought to be able to take
any array.

so
double mean(double *x, size_t N)

is correct. int will work, but might be a nuisance to caller.

Only if the caller does not write correct code.
However if we are to have a really whizzy mean, we will sort the numbers
before adding them.

So let's call qsort

void qsort(void *x, size_t N, size_t sz, int (*comp)(const void *, const void *));

Yes qsort() takes two size_t's as well. So we are OK. The system does
work, but only so long as we are absolutely consistent in using size_t
everywhere.

Ah, he sees the light.
My proposal is to 1) make size_t signed, 2) rename it int.

Or perhaps not. Almost 20 years after a language is standardised is a
bit late to start trying to change it. Especially when it has proved
extremely successful.
 
K

Keith Thompson

jacob navia said:

There's no need to shout.
Just

int Strlen_i(char *s)
{
    char *start = s;
    while (*s)
        s++;
    return s - start;
}
#define strlen Strlen_i;

I think redefining strlen invokes undefined behavior; you're likely to
get away with it, but it might break when your code is compiled by
some other compiler. And if 'strlen' is already defined as a
function-like macro, redefining it as an object-like macro (without
first '#undef'ing it) is a constraint violation. (I'm assuming you
have a '#include <string.h>'.)

Why reimplement strlen rather than just calling it? It's a simple
enough function, but the implementation's strlen could well be faster
than your re-write. And by calling strlen and converting the result
to int, you make it easier to add a range check later.
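A sketch of the range check alluded to above (the error policy here,
returning -1 when the length won't fit in an int, is an assumption;
one could equally abort):

```c
#include <limits.h>
#include <string.h>

/* Wraps the library strlen and converts the result to int,
   refusing strings whose length will not fit. */
int Strlen_i(const char *s)
{
    size_t len = strlen(s);
    if (len > (size_t)INT_MAX)
        return -1;          /* hypothetical error policy */
    return (int)len;
}
```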
 
J

jacob navia

Martin said:
jacob navia:



If you want an ounce of efficiency then try:

int Strlen_i(char const *const p)
{
    return (int)strlen(p);
}

That is to say, the platform's built-in strlen function is extremely
likely to be more efficient than anything you write.

Martin

I just did it that way so that it was defined before the macro. But you
are right. Should do that.
 
M

Malcolm McLean

Flash Gordon said:
Malcolm McLean wrote, On 31/08/07 16:18:

Ah, he sees the light.
That's why Basic Algorithms is absolutely consistent in using int. Otherwise
I would either have to translate everything to size_t, or you would rapidly
risk a mess.
Or perhaps not. Almost 20 years after a language is standardised is a bit
late to start trying to change it. Especially when it has proved extremely
successful.
Effectively we are in a hiatus between standards. It looks like C99 will
never be widely implemented. So now is the time to get those nasty size_t's
out of our code.
 
