size_t problems


Tor Rustad

Malcolm said:
The snail, going West, is moving towards the Andromeda galaxy at
50.000001 km/s. The jet, going East, is moving towards Andromeda at
about 49.660 km/s, assuming it's a Concorde.

That was a fast snail! :)

AFAIK, M31 is closing in on the Sun at 300 km/s, and on Milky Way at 100
km/s, so how did you arrive at ca. 50 km/s?


Rolling your own standard functions is usually a bad idea,
particularly when the replacement is *much* slower.
 

Peter J. Holzer

Not necessarily. The library in question is for the Intel architecture,
which allows unaligned access. It is possible that an unaligned access
to a 4-byte word is still faster than 4 accesses to single bytes.

The call itself may be longer than the execution of the inline, for
suitably short strings.

Which call? The C compiler could inline any call to strlen. It knows
what it does, after all. Inlining calls to my_strlen is much harder.

(In fact, gcc does not only inline calls to strlen, it replaces calls to
strlen on a string literal with a suitable integer constant).

hp
 

Malcolm McLean

Tor Rustad said:
That was a fast snail! :)

AFAIK, M31 is closing in on the Sun at 300 km/s, and on Milky Way at 100
km/s, so how did you arrive at ca. 50 km/s?

That's a wiki fact.
It could easily be wrong. I know Andromeda is moving towards us much faster
than any aeroplane could possibly fly, but I didn't know the value off the
top of my head.
 

Kelsey Bjarnason

[snips]

In fact if you use size_t safely and consistently, virtually all ints need
to be size_t's.

Complete, total and utter bollocks.

size_t is used primarily for sizes and indexes.
int is used primarily for general calculations.

I have reams of code using both and it is the _unusual_ case where the
twain meet at all.

Where do you get this nonsense?
 

jacob navia

Peter said:
Not necessarily. The library in question is for the Intel architecture,
which allows unaligned access. It is possible that an unaligned access
to a 4-byte word is still faster than 4 accesses to single bytes.

This will be slower if the address is unaligned.

[snip]
(In fact, gcc does not only inline calls to strlen, it replaces calls to
strlen on a string literal with a suitable integer constant).

I thought about it, but I haven't got the time to do it...
 

Kelsey Bjarnason

[snips]

Yes qsort() takes two size_t's as well. So we are OK. The system does work,
but only so long as we are absolutely consistent in using size_t everywhere.

My proposal is to 1) make size_t signed, 2) rename it int.

And thereby break reams of code, and produce yet another situation where
the solution simply does not work.

Current:

void *buff = malloc( 40000U );
int size = 40000;

Oops. The malloc works, but the size is wrong. So your solution requires
either:

1) Forcing 16-bit implementations to limit allocations to 32767 or fewer
bytes

2) Forcing 16-bit implementations to emulate larger ints.


Hmm. Turns out the same problem occurs with 32-bit implementations, as
one can trivially allocate regions > 2GB, but a 32-bit int won't work. So
32-bit implementations (and 16-bit ones, too) will have to use 64-bit ints
for everything, making both 32-bit and 16-bit implementations hellishly
inefficient, or simply crippled in terms of memory allocation.

So your proposal is to cripple the language or force it to result in
massively inefficient operations, all because you don't like some feature
which has worked perfectly well for at least some 17 years.

Somehow I think we'll choose to err on the side of sanity on this one; not
a single thing you've offered has given a single real justification for
eradicating size_t, other than to suit your personal pet peeves - and that
might make you happy, but crippling the language to make you happy just
isn't a terribly compelling argument.
 

Kelsey Bjarnason

That's why Basic Algorithms is absolutely consistent in using int.

And producing broken code in the process.

Why you refuse to deal with that side of the equation, I don't know. Yes,
fine, it makes you happy to use int, but it makes your code at best
undesirable and at worst unusable in real programs.

Effectively we are in a hiatus between standards. It looks like C99 will
never be widely implemented. So now is the time to get those nasty
size_t's out of our code.

Now is the time to fix the standard - which means leaving size_t in, as it
actually solves a problem and has a justification for existing, whereas
none of your counter-proposals solve anything or make any sense.
 

Kelsey Bjarnason

[snips]

It obviously performs some output, otherwise the program can be optimised to
nothing.

What defines "output"? If I write a program that, oh, takes an argument
specifying a number of seconds to delay, then pauses for that number of
seconds (busy loop or something implementation-specific) before exiting,
what exactly is the output? The return 0 from main? Either way, I would
be most unhappy if a compiler optimized this to nothing - it's not
nothing, it simply does no output in any conventional sense.
 

Kelsey Bjarnason

[snips]

No, limited experience. Not the same thing as incompetence at all.
If you write, say, mainly code to drive GUIs under Windows, you will find
that there's little point making much portable. Everything has to be ripped
up and rewritten whenever the denizens of Redmond decide to release a new
compiler anyway.

He said incompetence, and you just demonstrated it.

If I were writing such apps, I'd write the body of the code to be as
conforming as possible, meaning it is effectively immune to switching to
a different OS, compiler, or version of a compiler.

Some stuff - GUI code, network code, etc - will possibly have to be
rewritten at each change, but if that's a significant portion of the code,
chances are you're doing something very badly wrong.

However if you are writing mostly scientific programs, as I am doing at
present, everything has got to be portable. I've no business writing
code that can't be shifted to a mainframe or PC or whatever, as need
arises.

Oh, you mean like how you use int instead of size_t, thus crippling your
code on 16-bit (and even 32-bit) implementations? That sort of
"portable"?

Even slash slash comments, which I thought were surely as good as
standard by now, are not accepted by the parallel compiler.

Why would you think they're standard? They're in C++ and C99, but most of
the C world uses C90, not C99 - and those are not part of C90, are they?
What, you think standards magically change to suit your whim?
 

Kelsey Bjarnason

Malcolm McLean said:


I think it *is* true, by and large. Mistakes *are* pointed out, and
rightly so, but to point out a mistake is *not* the same as to
criticise the person who made it. I make my fair share of mistakes (or
perhaps more!), but when people in clc point this out, I don't feel
threatened or intimidated by the fact. On the contrary, I welcome
corrections for what they really are - opportunities to learn and to
improve my programming.

Yeah, you just tend to make your mistakes so esoteric only about three
people are qualified to find them, let alone figure out the right way. :)
 

Malcolm McLean

Kelsey Bjarnason said:
He said incompetence, and you just demonstrated it.

If I were writing such apps, I'd write the body of the code to be as
conforming as possible, meaning it is effectively immune to switching to
a different OS, compiler, or version of a compiler.

What you not uncommonly find is that the actual processing that the app
performs is trivial - maybe it adds a few columns of numbers together and
produces a report. However the GUI to allow the user to enter these numbers,
check them, specify which columns to add up, and format the report might be
very non-trivial. So in fact the portable bit of the code is a sum()
function and maybe a few histogram or pie chart generators, without the
graphical part.
 

Kelsey Bjarnason

Yes. But psychological factors are also important.

Indeed, they are. So if you could kindly cease your asinine rantings
against size_t and for 64-bit ints, the rest of us wouldn't have to deal
with it and be happier. Of course, this would also suggest you're going
to fix your currently hopelessly broken code in your book, but I suspect
we'll just have to live with that, reminding newbies of the dangers
inherent in it.

If an index variable
is called "size" then of course the compiler will happily chug through
and index the array by variable "size". However to anyone reading the
program it is intensely irritating.

So don't call it size. Call it index. Just make sure it's a size_t.

If you allow a meaningful, but wrong type, called "int", and a correct
but misleadingly named type, called "size_t", how many programmers are
going to be consistent with their use of size_t?

Those with experience, skill, ability, or simply enough smarts or
curiosity to ask _why_ two different types exist, for starters. Oh, and
any smart enough to ask for more experienced programmers to look over
their code now and then and actually learn from the recommendations
provided.
 

Malcolm McLean

Kelsey Bjarnason said:
[snips]

Yes qsort() takes two size_t's as well. So we are OK. The system does
work,
but only so long as we are absolutely consistent in using size_t
everywhere.

My proposal is to 1) make size_t signed, 2) rename it int.

And thereby break reams of code, and produce yet another situation where
the solution simply does not work.

Current:

void *buff = malloc( 40000U );
int size = 40000;

Oops. The malloc works, but the size is wrong. So your solution requires
either:

1) Forcing 16-bit implementations to limit allocations to 32767 or fewer
bytes

Or forcing someone in the unusual situation of allocating more than half of
the address space in one go into using an unusual type.
Engineering doesn't usually offer perfect solutions. If you want to
simultaneously have signed arithmetic, an efficient integer
representation, and use all bits of the integer, something has got to give.
The ability to manipulate huge arrays of bytes, without using a "special"
type, is the thing that should give.
That's not to say you won't be able to come up with some real examples of
situations where it is extremely inconvenient. Engineering is like that.
There's always someone who wants screws with non-standard threads.
 

Martin Wells

jacob navia:
Standard C doesn't have

1) Any serious i/o. To do anything fast you need system specific stuff.
2) Any notion of the keyboard. To handle the keyboard you need system
specific stuff.
3) Any graphics. Ditto.
4) Any networking.
5) Any timers with reasonable accuracy.


Good job, then, that 99% of algorithms don't need any of the above.

C is very popular for systems programming but none of those programs
is written in standard C.


Where possible, I'd hope that they are.

I am porting the lcc-win IDE and debugger. Written in C but system
specific. And I do not give a damn about portability of a Windows
debugger to the latest toaster with embedded Linux :)


I don't see what you're smiling about since you've already had a
headache with strlen.

All these people talking about "Portable standard C" are just talking
nonsense.


I've written countless fully-portable C programs.

#include <stdio.h>
int main(void){printf("hello\n");}

is portable since the errors of printf are NOT specified, so you have no
way to know what happened if printf returns a negative result, besides
going into implementation specific stuff!


In the absence of an output error you're guaranteed of the results.
But then how often do we get an output error in such a small program?
0% of the time? Or would it be something considerably bigger like
0.0000000000000000000000000000000000000001% of the time?

Martin
 

Martin Wells

Malcolm:
What are those ints going to be used for? We don't know, but such a useful
function would surely find a place in calculating array indices, or
intermediate values, such as counts, to calculating array indices.

So we need another version

void AddFiveToEachElementsz(size_t *p, size_t len)

The fuse has gone off. That's what the admission of size_t does to your
code.


Sorry I haven't a clue what you're talking about. Could you please
explain more clearly?

Martin
 

Martin Wells

Ed:
But, instead of pointless and unfounded insults, let's try a real
world test for a change. You paste one or two thousand lines of C
code you've written from your most recent project, and we'll see if
anyone on the newsgroup can identify any code that's not 100% portable
C code.

Since you've made the claim that writing 100% portable C code isn't
just easy, but VERY easy, I'm quite sure you're up to the challenge.
It's time to put your code where your mouth is.

Ask me to write an algorithm. Or even an algorithm contained in a very
small program. I'd write it efficiently using fully-portable code.
Something along the lines of the Euro Banknote Serial Number Checker.

Martin
 

Martin Wells

Chris:
Furthermore, the easy-ness of portability varies with the goal of
the code. Clearly, something like "calculate mortgage payments"
or "concatenate files specified by argv[] elements" is going to be
easier than "write an operating system" or "do bitmapped graphics":
the latter two *require* at least *some* non-portable code.


Let's say, for instance, that on a particular platform an
"unsigned int" contains padding bits (or whatever they call those bits
that don't take part in indicating what the number is). This could
possibly throw a big spanner in the works for playing around with
bitmaps.

However, it's still not impossible to achieve what you want if you
play around with arrays of "unsigned char". Indeed, the code might be
ugly, but it's definitely possible. And probably fun to write too.

Martin
 

Keith Thompson

Martin Wells said:
jacob navia: [...]
#include <stdio.h>
int main(void){printf("hello\n");}

is portable since the errors of printf are NOT specified, so you have no
way to know what happened if printf returns a negative result, besides
going into implementation specific stuff!

I assume jacob meant to say "non-portable".

The failure to return a value from main() makes the program less
portable than it could be, since the program is very likely to be
compiled by a compiler that doesn't fully implement C99. Adding a
"return 0;" costs practically nothing and doesn't hurt C99
conformance.

In the absence of an output error you're guaranteed of the results.
But then how often do we get an output error in such a small program?
0% of the time? Or would it be something considerably bigger like
0.0000000000000000000000000000000000000001% of the time?

Output errors happen. Suppose stdout is directed to a file on a
filesystem that has no remaining space. (I was going to give
"./prog > /dev/nosuchdevice" as an example, but that fails in the
calling environment before the program is invoked, at least on my
system.)

For a simple program like this, there's probably no good way to handle
an output error, so ignoring it is probably acceptable. But if I
wanted to be a bit more paranoid, I might check the result of printf()
and print a message to stderr (it's at least possible that stderr is
writable even if stdout isn't). Or at least the program could do
'exit(EXIT_FAILURE);' on an output error, so that whatever entity
invoked it can detect that something went wrong and perform some
cleanup. For example, if only part of the output is written, you
might want to delete the output file altogether rather than leaving an
incorrect partial file lying around.
 

Malcolm McLean

Kelsey Bjarnason said:
Yeah, you just tend to make your mistakes so esoteric only about three
people are qualified to find them, let alone figure out the right way. :)

Harald van Dijk found one, I found another, so can you find a third?
 
