sizeof...


Jack Klein

As long as we're being pedantic, why do you assume that the phrase
"size of" refers to the C "sizeof" operator?

Perhaps from the C standard?

6.5.3.4 The sizeof operator (paragraph 2)

"The sizeof operator yields the size (in bytes) of its operand"

In C, there is no way to know the size of an object or type without
using the sizeof operator, with one exception.

That exception, of course, being the character types, for which sizeof
yields a value of 1, and which occupy 1 byte, by definition.
 

Mark McIntyre

Let's do it somewhat simpler : _you_ come up with a _single_ piece of
hardware that does divisions quicker than additions.

I have no need to. I'm making no claims at all. You're the one
asserting it's slower, and using this as the basis for some requirement
to understand assembler or hardware or something.
I would be mightily impressed.

And I care because?
 

Paul Mesken

I have no need to. I'm making no claims at all. You're the one
asserting it's slower, and using this as the basis for some requirement
to understand assembler or hardware or something.

It's completely logical that a division will always be slower than an
addition, given the algorithms that are used to implement it.

Here's an overview :

http://www.ecst.csuchico.edu/~juliano/csci380/Papers/sOberman1997division.pdf

And as for proof that divisions are slower than additions on all
conceivable hardware :

Of course, I don't have such proof since I don't know all conceivable
hardware (who knows? some quantum computer).

But that doesn't disprove my assertion. Everyone with a tiny bit of
knowledge about how divisions and additions are implemented in
hardware will realize that additions are quicker (implementing an
addition with boolean primitives is about the very first exercise one
does when learning hardware implementation).

It's simply logical.

Of course, you can choose to ignore this and program as if divisions
are just as quick as additions.

Personally, I'll never choose ignorance over knowledge. Ignorance is a
flaw, not a feature.
And I care because?

You have to ask me that?
 

Jack Klein

[snip]
If speed and efficiency are of no concern then there are better
languages than C. C is not the only portable language (how about
Java?).

Java? Portable? You have just disqualified yourself from any serious
discussion of C on various architectures. Such as the 98% of
processors and controllers manufactured in the world today that are
not used as the main CPU in desktop/laptop/server applications.
 

Paul Mesken

Java? Portable? You have just disqualified yourself from any serious
discussion of C on various architectures.

Well, lucky me then that such discussions aren't topical here ;-)
 

Malcolm

Jack Klein said:
In C, there is no way to know the size of an object or type without
using the sizeof operator, with one exception.
size_t sizeof(int)
{
    int *mem = malloc(1024);
    unsigned char *ptr = (unsigned char *) mem;
    size_t answer = 0;

    memset(mem, 0, 1024);
    *mem = ~0;
    while (ptr[answer] != 0)
        answer++;

    return answer;
}

(OK maybe we've got padding bytes. So write to mem[1] and check.)
 

Anonymous 7843

In C, there is no way to know the size of an object or type without
using the sizeof operator, with one exception.

look at the declaration for the object and figure it out (human error possible)

offsetof() the type's successor in a struct (padding may add to apparent size)

pointer differences in an array of the type (padding)

set last (subpart) to a known value and use memchr to search for it
(begs the question of the size of the last subpart, but that should
be smaller and simpler and not be turtles all the way down)
 

pete

Malcolm said:
Jack Klein said:
In C, there is no way to know the size of an object or type without
using the sizeof operator, with one exception.
size_t sizeof(int)
{
    int *mem = malloc(1024);
    unsigned char *ptr = (unsigned char *) mem;
    size_t answer = 0;

    memset(mem, 0, 1024);
    *mem = ~0;
    while (ptr[answer] != 0)
        answer++;

    return answer;
}

(OK maybe we've got padding bytes. So write to mem[1] and check.)

sizeof(int) isn't limited to 1024.

~0 is equal to zero on one's complement machines.
Assignment is by value.
 

Malcolm

Paul Mesken said:
But that doesn't disprove my assertion. Everyone with a tiny bit of
knowledge about how divisions and additions are implemented in
hardware will realize that additions are quicker (implementing an
addition with boolean primitives is about the very first exercise one
does when learning hardware implementation).
But replacing a division with an addition is a micro-optimisation. Only
rarely will it make any real difference to a program's running time.
As for the single counterexample, consider this.

I've got a weighing factor I've got to apply to some variables. If I didn't
have to worry about running time at all I would implement it like this.

double weight = loadweightfactor();
....
x /= weight;

However let's say that the array x consists of integers, and the weight is
usually 5/16. (ie 3.2)

So let's optimise our code

if (weight == 5/16)
{
    /* in loop */

    x = ((x << 2) + x) >> 4;
    y += strlen(str);
}

So we've got rid of our division and replaced it with an addition and two
shifts.

However this thing is running on a Pentium. It is not entirely impossible
that the floating point pipeline is idle, whilst the main pipeline is
overloaded. So what we have done is replaced a free instruction with several
instructions that actually cost microseconds.

So it isn't always the case that replacing a division with an addition will
speed things up.
 

Mark McIntyre

It's completely logical that a division will always be slower than an
addition, given the algorithms that are used to implement it.

And I say again, you can prove this to be completely true, for all
hardware, including specially optimised hardware? Imagine a system
that does addition on an 8080 and division on a Pentium. Which
operation is faster?

This may sound daft. But it's perfectly possible to build hardware with
an FP unit which runs at higher clock speeds than the main CPU (in fact
it probably makes sense; FP operations often take more clock cycles).
So I can easily envisage hardware on which addition and division were
of comparable speeds, or better.

Or consider for instance floating point addition, and integer division
by two. Which is faster?
Of course, I don't have such proof since I don't know all conceivable
hardware

Indeed. But no need to invoke magic. Today's hardware could do it.
But that doesn't disproof my assertion.

The thing about assertions is, to make them into facts, you have to
prove them to be true.
Everyone with a tiny bit of knowledge

I'm sure we all remember what they say about a little knowledge.
Of course, you can choose to ignore this and program as if divisions
are just as quick as additions.

Or you can program as if that's something you should not be worrying
about yet. Do you recall the three laws of optimisation?
Personally, I'll never choose ignorance over knowledge. Ignorance is a
flaw, not a feature.

And arrogance about one's knowledge is a bigger one, IMHO. YMMV.
 

Paul Mesken

But replacing a division with an addition is a micro-optimisation. Only
rarely will it make any real difference to a program's running time.

This is true. The first optimization should, of course, be the
algorithm itself. However, there might be cases in which
micro-optimizations make a big difference. In such a case it is handy
to know that not all arithmetical operations are created equal,
performance-wise.

As things are now (in general) :

+ - & | ~ ^ >> << are the quick ones
* is mediocre
/ % are the slow ones

Also, I didn't say that divisions should be replaced by additions
(although, if it can be done in an obvious way, like dividing by a
half, then by all means one should). I said that divisions are slower
than additions.
As for the single counterexample, consider this.

I've got a weighing factor I've got to apply to some variables. If I didn't
have to worry about running time at all I would implement it like this.

double weight = loadweightfactor();
...
x /= weight;

However let's say that the array x consists of integers, and the weight is
usually 5/16. (ie 3.2)

So let's optimise our code

if (weight == 5/16)
{
    /* in loop */

    x = ((x << 2) + x) >> 4;
    y += strlen(str);
}


Note that I, in another post in this thread, opposed this kind of
optimization :

"In general, I think that a single operation should not be replaced by
two or more operations which just happen to be faster at the
_present_. It also tends to hurt readability.".

Also, using an "if" as an optimization is hardly an optimization in
pipelined architectures.

For example :

Today's x86s suffer a lot from mispredicted-branch penalties, and
static branch prediction will favor a fall-through in your
optimization (and in most cases it actually will _not_ fall through, so
static branch prediction will mispredict most of the time).

Even if the branch is predicted correctly most of the time (in some
loop with dynamic branch prediction, for example) there's still some
overhead involved in using "if"s. And then there is the condition
itself as well, of course, which is an extra operation.

But this single x /= weight; operation can be replaced by another
single operation.

I would use the simple (and still readable) optimization of using a
multiplication instead of a division. So the function
loadweightfactor() should not return something like 5/16 but 16/5 and
x will not be divided by weight but multiplied by it.

So :

double weight = loadweightfactor();
x *= weight;

Of course, assuming that this course of action will not cause
performance penalties in the function loadweightfactor().

As a side note :

x = ((x << 2) + x) >> 4;

is not equivalent to :

x /= 5/16;

But the intention was clear :)
So we've got rid of our division and replaced it with an addition and two
shifts.

And an if statement, that's the expensive one :)

As I said : one typically shouldn't replace a single operation by two or
more (in your example, it was replaced by four, probably even more
operations in the generated machine code). It will hurt readability and
it might not be much of an optimization in the future.
 

Paul Mesken

And I say again, you can prove this to be completely true, for all
hardware, including specially optimised hardware? Imagine a system
that does addition on an 8080 and division on a Pentium. Which
operation is faster?

Well, if the 8080 is wildly overclocked (the Z80, which is
code-compatible with the 8080, can go as fast as 32 MHz, last time I
checked) and the division is a division by zero on a Pentium, then the
8080 might still be faster ;-)

Also, remember (I hope) that a Pentium works with a relatively long
pipeline and there will be a distinct difference between throughput
and latency.

My statement doesn't provide for such hypothetical systems. I work
with real systems and I'm confident enough (armed with the knowledge
about how divisions and additions are carried out in hardware) to
state that divisions will not be quicker than additions.

I will grant you that my statements are not universally valid (what
statement is?).
This may sound daft.

Agreed :)
But it's perfectly possible to build hardware with
an FP unit which runs at higher clock speeds than the main CPU (in fact
it probably makes sense; FP operations often take more clock cycles).
So I can easily envisage hardware on which addition and division were
of comparable speeds, or better.

I can envision such hardware as well. But I haven't actually _seen_
such hardware.

You're trying to undermine a solid programming practice by coming up
with counter examples that rely on hypothetical machines. I dare to
say that most programmers don't program for hypothetical machines but
real ones.
Or consider for instance floating point addition, and integer division
by two. Which is faster?

If the integer division is replaced by a single shift to the right
then it might be very close. But a shift is not a division.
Indeed. But no need to invoke magic. Today's hardware could do it.

The thing about assertions is, to make them into facts, you have to
prove them to be true.

Take Newton's F = M * A

It's good enough for "daily use" even though QM and relativity theory
have shown it to be not precise enough for the very fast or the very
small.

But that doesn't make it less helpful, for "daily use".

On most real machines my statement about divisions and additions will
hold, but there might be some exceptions (although I believe most such
exceptions to be hypothetical).
Or you can program as if thats something you should not be worrying
about yet. Do you recall the three laws of optimisation?

There are many such sets but you're probably referring to the one which
starts with "Law #1: Don't do it" :)
 

pete

Paul said:
On Tue, 31 May 2005 23:47:40 +0100, Mark McIntyre


Well, if the 8080 is wildly overclocked

I think what Mark McIntyre meant is that you're off topic.
Which code is faster
isn't part of the definition of the language.
 

Keith Thompson

Malcolm said:
Jack Klein said:
In C, there is no way to know the size of an object or type without
using the sizeof operator, with one exception.
size_t sizeof(int)
{
    int *mem = malloc(1024);
    unsigned char *ptr = (unsigned char *) mem;
    size_t answer = 0;

    memset(mem, 0, 1024);
    *mem = ~0;
    while (ptr[answer] != 0)
        answer++;

    return answer;
}

(OK maybe we've got padding bytes. So write to mem[1] and check.)

<quibble>
If your C compiler allows you to name a function "sizeof", send it
back.
</quibble>
 

Paul Mesken

Today's x86s suffer a lot from mispredicted-branch penalties, and
static branch prediction will favor a fall-through in your
optimization (and in most cases it actually will _not_ fall through, so
static branch prediction will mispredict most of the time).

I must have been half asleep. Of course, your code example _will_ fall
through most of the time. But then again: the code example still
needs an "else", and that will result in possibly two branches.
 

Mark McIntyre

I can envision such hardware as well. But I haven't actually _seen_
such hardware.

The point is, you're making generalisations which aren't true, and you
don't even bother to caveat them. You're misleading people. I dislike
that.
You're trying to undermine a solid programming practice

but it's /not/ solid programming practice! Leave the micro-optimisations
to the compiler writer, the hardware and the OS, until you actually
know you need to optimise. Crivens, this is practically a Law. In
fact, it's three Laws.
If the integer division is replaced by a single shift to the right
then it might be very close. But a shift is not a division.

If you're allowed to move the goalposts by excluding some classes of
operation which inconveniently (for you) can be optimised in hardware
or software to act as counterexamples, then you can of course prove
your theorem. Where I come from, that's called "cheating"...
Take Newton's F = M * A

Why? Is it relevant to whether division is invariably slower than
addition?
On most real machines my statement about divisions and additions will
hold,

Again, you make an unfounded assertion which isn't even generally
true. I've given a real-world counterexample but you deny its
existence. That's OK, I'm sure others will have spotted by now that
you're in a hole.
There are many such sets but you're probably referring to the one which
starts with "Law #1: Don't do it" :)

That's the set. You remember the 2nd and 3rd laws?
 

Mark McIntyre

I think what Mark McIntyre meant is that you're off topic.

That's one of the things I meant.
Which code is faster isn't part of the definition of the language.

Absolutely. Asserting something about which code is faster is wildly
offtopic in CLC, even if correct. Which it isn't. :)
 

Paul Mesken

On Wed, 01 Jun 2005 00:59:15 GMT, in comp.lang.c, pete


Absolutely. Asserting something about which code is faster is wildly
offtopic in CLC, even if correct. Which it isn't. :)

Ooowww, you slipped up there Mark. Now _you_ are asserting things and
need to come up with universal proofs ;-)
 
