sizeof...

thushianthan15

Greetings!!
Currently I am learning C from K&R. I have DOS and Linux on my
system. When I used the sizeof operator to compute the size of an
integer, it showed different results in DOS and Linux. In DOS it shows
that an integer occupies 2 bytes, but in Linux it shows 4 bytes. Does
the sizeof operator "really" show the machine data or register size?
Please help me!!
 
C.E.P.A.

Greetings!!
Currently I am learning C from K&R. I have DOS and Linux on my
system. When I used the sizeof operator to compute the size of an
integer, it showed different results in DOS and Linux. In DOS it shows
that an integer occupies 2 bytes, but in Linux it shows 4 bytes. Does
the sizeof operator "really" show the machine data or register size?
Please help me!!

It depends on the compiler - 32-bit compilers (GCC etc.) return 4
bytes per integer, and old 16-bit compilers (Pacific C, Turbo C)
return 2 bytes per integer.
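
A quick way to see it for yourself is a one-line test program (a
minimal sketch; the value printed depends on the compiler, not on
sizeof itself). Compile it under both systems and compare:

#include <stdio.h>

int main(void)
{
    /* sizeof yields a size_t; cast to unsigned long so the format
       specifier also works on pre-C99 compilers */
    printf("sizeof(int) = %lu\n", (unsigned long)sizeof(int));
    return 0;
}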
 
Paul Mesken

Greetings!!
Currently I am learning C from K&R. I have DOS and Linux on my
system. When I used the sizeof operator to compute the size of an
integer, it showed different results in DOS and Linux. In DOS it shows
that an integer occupies 2 bytes, but in Linux it shows 4 bytes. Does
the sizeof operator "really" show the machine data or register size?
Please help me!!

DOS was made for the old 8086/88 processors. These were 16-bit
processors (thus : int being 2 bytes). On newer machines, the CPU runs
in a compatibility mode when running DOS (either real mode or V86
mode). It doesn't mean that the processor is suddenly a 16-bit
processor, it only acts like one.

So, in this case sizeof doesn't show the register size (which is 32
bits), only the default operand size, which is 16 bits in this
compatibility mode.
 
Emmanuel Delahaye

Currently I am learning C from K&R. I have DOS and Linux on my
system. When I used the sizeof operator to compute the size of an
integer, it showed different results in DOS and Linux. In DOS it shows
that an integer occupies 2 bytes, but in Linux it shows 4 bytes. Does
the sizeof operator "really" show the machine data or register size?

sizeof returns the number of bytes occupied by a constant or an object.

Yes, an int can use 2 chars on x86 real mode (PC/MS-DOS) and 4 chars on
x86 protected/extended mode (PC/Win32, Linux). It could also use 1 char
on a DSP such as the TMS320C54 (Texas Instruments), where a char and an
int are both 16 bits wide.
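
A small illustration (a sketch assuming a hosted implementation; what
varies between platforms is the numbers printed, not the meaning of
sizeof):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* sizeof counts in units of char; CHAR_BIT is the width of a char */
    printf("char is %d bits\n", CHAR_BIT);
    printf("int occupies %lu chars, i.e. %lu bits\n",
           (unsigned long)sizeof(int),
           (unsigned long)(sizeof(int) * CHAR_BIT));
    return 0;
}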

--
Emmanuel
The C-FAQ: http://www.eskimo.com/~scs/C-faq/faq.html
The C-library: http://www.dinkumware.com/refxc.html

"Mal nommer les choses c'est ajouter du malheur au
monde." -- Albert Camus.
 
Keith Thompson

Paul Mesken said:
DOS was made for the old 8086/88 processors. These were 16-bit
processors (thus : int being 2 bytes). On newer machines, the CPU runs
in a compatibility mode when running DOS (either real mode or V86
mode). It doesn't mean that the processor is suddenly a 16-bit
processor, it only acts like one.

So, in this case sizeof doesn't show the register size (which is 32
bits), only the default operand size, which is 16 bits in this
compatibility mode.

sizeof(int) simply yields the size, in bytes, of an int. It's up to
the compiler implementer to decide how big an int is going to be.
It's often whatever size will fit in a machine register, but that's
not required.

On the hardware level, an x86 machine (which I assume is what the OP
is using) can operate on 8-bit, 16-bit, or 32-bit quantities. The C
language specifies a range of integer types: char (at least 8 bits),
short (at least 16 bits), int (at least 16 bits), and long (at least
32 bits). (On some older machines, 32-bit operations might be more
difficult). As you can see, a compiler implementer has some
flexibility in choosing how to assign sizes to the various C types.
Typical choices are:
char 8 bits
short 16 bits
int 16 bits
long 32 bits
and
char 8 bits
short 16 bits
int 32 bits
long 32 bits

It happens that the former choice (16-bit int) is more convenient on
the older systems that DOS was designed for, and the latter (32-bit
int) is more convenient for the newer systems on which Linux typically
runs.

(I'm ignoring the type "long long", introduced in C99, which is at
least 64 bits.)

All this is only indirectly related to the "default operand size"; I'm
not even sure what that means.
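
A portable way to see which choice an implementation made is to print
the <limits.h> macros rather than assume sizes (a minimal sketch using
only standard macros):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* The standard guarantees minimum ranges, not exact sizes */
    printf("INT_MIN  = %d\n", INT_MIN);
    printf("INT_MAX  = %d\n", INT_MAX);
    printf("LONG_MIN = %ld\n", LONG_MIN);
    printf("LONG_MAX = %ld\n", LONG_MAX);
    return 0;
}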
 
Malcolm

Greetings!!
Currently I am learning C from K&R. I have DOS and Linux on my
system. When I used the sizeof operator to compute the size of an
integer, it showed different results in DOS and Linux. In DOS it shows
that an integer occupies 2 bytes, but in Linux it shows 4 bytes. Does
the sizeof operator "really" show the machine data or register size?
Please help me!!
All C objects are a multiple of sizeof(char), which is defined to be one.
Usually char is 8 bits, but not always.
int is designed to be the "natural" integer size of the machine. On some
machines it is not obvious what the natural integer size should be, and the
intermediate x86 chips are a case in point, because of the way the
instruction set allows registers to be treated as pairs.

In any case there is no absolute guarantee that an int will fit in a
register. int must be at least 16 bits, and C compilers are available
for processors with 8-bit registers.
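
The practical upshot (a hedged example, not a requirement of the
standard) is to pick a type by its guaranteed range rather than by
habit:

#include <stdio.h>

int main(void)
{
    /* int is only guaranteed to represent -32767..32767, so a value
       that may exceed that range belongs in a long (at least 32 bits) */
    long total = 100000L;
    printf("total = %ld\n", total);
    return 0;
}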
 
Paul Mesken

sizeof(int) simply yields the size, in bytes, of an int. It's up to
the compiler implementer to decide how big an int is going to be.
It's often whatever size will fit in a machine register, but that's
not required.

On the hardware level, an x86 machine (which I assume is what the OP
is using) can operate on 8-bit, 16-bit, or 32-bit quantities.

Yes, 16 bit and 32 bit operations can be mixed but one of them will
most certainly (there are some exceptions) incur a performance penalty
since it requires a size prefix. Which one (16 or 32) depends on what
the "default bit" of the code segment is. With an assembler, the
programmer sets the default operand size of the code segment by
instructions like USE16 or USE32.
The C
language specifies a range of integer types: char (at least 8 bits),
short (at least 16 bits), int (at least 16 bits), and long (at least
32 bits). (On some older machines, 32-bit operations might be more
difficult). As you can see, a compiler implementer has some
flexibility in choosing how to assign sizes to the various C types.
Typical choices are:
char 8 bits
short 16 bits
int 16 bits
long 32 bits
and
char 8 bits
short 16 bits
int 32 bits
long 32 bits

It happens that the former choice (16-bit int) is more convenient on
the older systems that DOS was designed for, and the latter (32-bit
int) is more convenient for the newer systems on which Linux typically
runs.

(I'm ignoring the type "long long", introduced in C99, which is at
least 64 bits.)

All this is only indirectly related to the "default operand size"; I'm
not even sure what that means.

Consider this :

MOV EAX, 0x01020304
MOV AX, 0x0102

These two instructions load an immediate value into a register.

The first one uses all 32 bits of the EAX register (the general
purpose registers of the x86 became 32 bits starting with the 386).

The second one uses all 16 bits of the AX register (which is really
the lower 16 bits of the EAX register).

BUT both have the same opcode : 0xb8.

This seems strange since the operations are really different : one is
16 bits, the other is 32 bits. The CPU _cannot_ distinguish between
the two different operations because they have the same opcode.

However : the CPU uses the "default bit" (or "D bit") of the code
segment to establish whether a 16 bit operand size is the default and,
thus, the lower 16 bits of EAX (aka AX) are used OR a 32 bit operand
is the default and, thus, all of the 32 bits of EAX are used.

One can, however, change this behaviour by using a size prefix (0x66)
so that the behaviour according to the D bit of the code segment is
inverted. Of course, the assembler might do this for the programmer
automatically based on whether "EAX" or "AX" is used in this example.

However, using such a prefix (and there are more such prefixes for
mixing 16 and 32 bit code) comes with a performance penalty.

Since DOS works in real or V86 mode in which 16 bit is the default
size (after all : it mimics the 8086/88), it makes perfect sense to
have int being 16 bits (for the 8086/88, there wasn't even an option
since the registers were only 16 bits wide, using 32 bits would be
dreadfully slow). Even though the Standard doesn't require it, int is
supposed to be the "quick" datatype.
 
Jack Klein

Yes, 16 bit and 32 bit operations can be mixed but one of them will
most certainly (there are some exceptions) incur a performance penalty
since it requires a size prefix. Which one (16 or 32) depends on what
the "default bit" of the code segment is. With an assembler, the
programmer sets the default operand size of the code segment by
instructions like USE16 or USE32.

What does assembly language or modes of the Intel processor have to do
with C? Caring about whether or not there is a performance penalty
for one thing or another thing is off-topic here, is a micro
optimization, and has nothing to do with the C language, which does
not specify anything at all about the speed or efficiency of anything.
Consider this :

MOV EAX, 0x01020304
MOV AX, 0x0102


[snip much extremely off-topic text]

If you want to expound about the bizarre quirks of the Frankenstein
patchwork Intel X86 architecture, I'd suggest you do so in
news:comp.lang.asm.x86, where it is appreciated and topical.

It is neither of those things here.
 
Paul Mesken

Caring about whether or not there is a performance penalty
for one thing or another thing is off-topic here, is a micro
optimization, and has nothing to do with the C language, which does
not specify anything at all about the speed or efficiency of anything.

Nownownow Jack, that's not the right attitude for a C programmer ;-)

Don't you think our non-Assembly programming C brothers/sisters have a
right to know that a division is typically slower than an addition,
for example? Or is such a statement off topic here because the
Standard doesn't mention this? (even though it is true in the real
world).

If speed and efficiency are of no concern then there are better
languages than C. C is not the only portable language (how about
Java?).

Isn't it so that C is a "Language of Choice" because it's "close to
the metal"? Computer architecture _is_ that "metal". A lot of
decisions made for the implementation of the compiler make more sense
(like the difference in sizeof(int) the OP experienced) when the
underlying computer architecture (which the compiler targets) is
explained.

Of course, we could deny the fact that programs written in C are
actually meant to run on a computer. But this would reduce discussion
to a completely academic one, interesting only to mathematicians and
linguists and devoid of practical experience. We would just be quoting
the Standard all the time.
If you want to expound about the bizarre quirks of the Frankenstein
patchwork Intel X86 architecture, I'd suggest you do so in
news:comp.lang.asm.x86, where it is appreciated and topical.

Well, the question wasn't asked there and even though it's obvious
that you want me to save that group, I'm a bit rusty (from too much
C/C++/SQL programming) and afraid that Terje would come up with
solutions quicker than my own ;-)
 
Artie Gold

Paul said:
Nownownow Jack, that's not the right attitude for a C programmer ;-)

Actually it is.
Don't you think our non-Assembly programming C brothers/sisters have a
right to know that a division is typically slower than an addition,
for example? Or is such a statement off topic here because the
Standard doesn't mention this? (even though it is true in the real
world).

All of which would be relevant if `implementation' were part of the name
of this newsgroup -- or if the standard set out any particular
requirements about how constructs behave.
If speed and efficiency are of no concern then there are better
languages than C. C is not the only portable language (how about
Java?).

All of which would be relevant if `advocacy' were part of the title of
this newsgroup.
Isn't it so that C is a "Language of Choice" because it's "close to
the metal"? Computer architecture _is_ that "metal". A lot of
decisions made for the implementation of the compiler make more sense
(like the difference in sizeof(int) the OP experienced) when the
underlying computer architecture (which the compiler targets) is
explained.

No, it's just an area where implementations are free to make certain
choices, according to the standard.
Of course, we could deny the fact that programs written in C are
actually meant to run on a computer. But this would reduce discussion
to a completely academic one, interesting only to mathematicians and
linguists and devoid of practical experience. We would just be quoting
the Standard all the time.
Programs written in C are designed to run on the virtual machine that
the language definition provides. The underlying platform could just as
easily be a computer or a flock of cooperative pigeons.

Topicality is important, serving two purposes: maintaining the
signal/noise ratio at an appropriate level (lest those who *can* answer
questions bail) and limiting the subject matter in order that answers
can be properly vetted by the community.

Cheers,
--ag
 
Peter Nilsson

Paul said:
Nownownow Jack, that's not the right attitude for a C programmer ;-)

But it's the right attitude for clc. You can introduce specific
architectures if you want to cite examples of how different
implementations will yield different results. But readers should
not be encouraged to think of C in terms of writing code for one (or
two) specific architectures.
Don't you think our non-Assembly programming C brothers/sisters
have a right to know that a division is typically slower than an
addition, for example?

Yes, but I don't want to see discussions of say opcodes and actual
cycle/latency times on an x86.
Or is such a statement off topic here because the Standard doesn't
mention this? (even though it is true in the real world).

The critical point is that what the standard(s) do say in terms of
language semantics are the priority.
...
A lot of decisions made for the implementation of the compiler make
more sense (like the difference in sizeof(int) the OP experienced)
when the underlying computer architecture (which the compiler
targets) is explained.

CLC is not here to discuss implementations and their internals.
There are plenty of other groups that discuss that. CLC is here to
discuss C programming, i.e. writing C code that is portable to all
implementations.

It is a natural human (or at least programmer) tendency to want to
dissect C code in terms of disassemblies and working out what the
compiler actually does with source code. However, it has been my
long experience that this only coerces students of C (and other
languages) into writing architecture-specific code. The consequences
of this attitude can and often do come back to haunt students.

Unlearning acquired bad habits is difficult and time consuming.
Of course, we could deny the fact that programs written in C are
actually meant to run on a computer.

Who is denying this? C's main goal was to implement Unix and
associated tools. You don't get much more practical than that.
But this would reduce discussion to a completely academic one,
interesting only to mathematicians and linguists and devoid of
practical experience.

CLC is an esoteric group in that it focuses exclusively on the
application of the language definition, without trying to steer
readers towards architecture-specific habits.

How does it harm students to understand the language abstraction?
We would just be quoting the Standard all the time.

And what is wrong with that? Too few C programmers sit down to
actually analyse the language they are employing. This just
promotes a 'near enough is good enough' approach. With a language
like C, this is a dangerous attitude.

Time and time again you will see people questioning why certain
code works on one machine but not on another. The reason is invariably
because they have been using (and been encouraged to use) Machine-X C,
rather than learn the portable aspects of the core C language itself.

If you want language efficiencies on specific architectures, then
there are no shortage of groups that can help. On the other hand,
discussing strictly conforming code has pretty much only one group,
namely clc (and its moderated derivative).
 
Prashant Mahajan

All C objects are a multiple of sizeof(char), which is defined to be one.
Usually char is 8 bits, but not always.

Can you please tell where/what the different sizes of char are? I am
new to C and I haven't seen any compiler where the size of char is
different from 8. Please specify.
 
Keith Thompson

Prashant Mahajan said:
Can you please tell where/what the different sizes of char are? I am
new to C and I haven't seen any compiler where the size of char is
different from 8. Please specify.

I've never heard of a hosted implementation of C with CHAR_BIT != 8.
The need to exchange octet-oriented data with other systems strongly
encourages implementers to use 8-bit chars even on systems that can't
directly access 8-bit quantities. (Search for references to "Cray
vector" in this newsgroup for one example.)

Values of CHAR_BIT other than 8 are most common on DSPs (Digital
Signal Processors). These are special-purpose systems that process
data 16 or 32 bits at a time.

Some older systems have had 9-bit bytes, or word sizes such as 36 or
60 bits, but most or all of them probably predate the ANSI C standard;
powers of two have pretty much taken over in recent decades.
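
If code genuinely depends on 8-bit chars, one common defensive idiom
(a sketch; nothing in the standard obliges you to do this) is to make
the assumption fail loudly at compile time:

#include <limits.h>

#if CHAR_BIT != 8
#error This code assumes 8-bit chars
#endif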
 
CBFalconer

He did. There is no system where sizeof(char) is not one.
 
Keith Thompson

CBFalconer said:
He did. There is no system where sizeof(char) is not one.

As long as we're being pedantic, why do you assume that the phrase
"size of" refers to the C "sizeof" operator?
 
Mark McIntyre

Nownownow Jack, that's not the right attitude for a C programmer ;-)

Smiley noted, but unless you're planning on becoming a troll, you need
to understand the topic of this group a tad better.
Don't you think our non-Assembly programming C brothers/sisters

Whatever. This is not comp.lang.x86.assembler or whatever.
Or is such a statement off topic here because the
Standard doesn't mention this?

Yes.
(even though it is true in the real world).

And you have proof positive of this for all conceivable hardware?
Including specialist math hardware designed to do divisions
ultra-fast?
If speed and efficiency are of no concern then there are better
languages than C. C is not the only portable language (how about
Java?).

It's either an island or a coffee. The latter is semitopical here,
the former not.

(snip rest of foolish troll)
 
Malcolm

Paul Mesken said:
Nownownow Jack, that's not the right attitude for a C programmer ;-)

Don't you think our non-Assembly programming C brothers/sisters have a
right to know that a division is typically slower than an addition,
for example? Or is such a statement off topic here because the
Standard doesn't mention this? (even though it is true in the real
world).
I used to spend a lot of time cracking multiplications
(replacing x *= 9 with x = (x << 3) + x;).
Now I hardly ever do. The real world changes.
If speed and efficiency are of no concern then there are better
languages than C. C is not the only portable language (how about
Java?).
Efficiency is certainly a reason for using C. It is not the only one. When
the C++ standard template library came out I saw some figures that made a
very convincing case that the classes were more efficient than corresponding
C constructs. I did seriously consider doing everything in STL, but decided
against it, largely because the syntax made it too difficult to integrate
modules.
Isn't it so that C is a "Language of Choice" because it's "close to
the metal"? Computer architecture _is_ that "metal". A lot of
decisions made for the implementation of the compiler make more sense
(like the difference in sizeof(int) the OP experienced) when the
underlying computer architecture (which the compiler targets) is
explained.
You do need to understand how a computer works to be a good programmer,
irrespective of language. However you shouldn't have to understand what your
particular architecture is. I have a UNIX box to run heavy-duty programs on.
I don't actually know some basic things such as how many bits a pointer
takes up, what the processor is, what the size of long is. I don't need to.
I just give it C programs and it spits back the results.
Of course, we could deny the fact that programs written in C are
actually meant to run on a computer. But this would reduce discussion
to a completely academic one, interesting only to mathematicians and
linguists and devoid of practical experience. We would just be quoting
the Standard all the time.
You need both theoretical and practical approaches. It is nice to know that
a ZX81 can emulate a Cray given a large enough supply of tapes, though no
one would ever try to do this. In reality the whole world runs on Microsoft
products, but it is nice to pretend sometimes that Mr Gates is an obscure
software vendor whose operating system we have only vaguely heard of. Why?
Not because every single regular on this ng hasn't written a program to run
under MS at some stage, but because you need to separate the language from
the implementation to improve the structure of your programs.
 
thushianthan15

Thank you for all your replies!!!
Malcolm said:
I used to spend a lot of time cracking multiplications
(replacing x *= 9 with x = (x << 3) + x;).
Now I hardly ever do. The real world changes.
Efficiency is certainly a reason for using C. It is not the only one. When
the C++ standard template library came out I saw some figures that made a
very convincing case that the classes were more efficient than corresponding
C constructs. I did seriously consider doing everything in STL, but decided
against it, largely because the syntax made it too difficult to integrate
modules.
You do need to understand how a computer works to be a good programmer,
irrespective of language. However you shouldn't have to understand what your
particular architecture is. I have a UNIX box to run heavy-duty programs on.
I don't actually know some basic things such as how many bits a pointer
takes up, what the processor is, what the size of long is. I don't need to.
I just give it C programs and it spits back the results.
You need both theoretical and practical approaches. It is nice to know that
a ZX81 can emulate a Cray given a large enough supply of tapes, though no
one would ever try to do this. In reality the whole world runs on Microsoft
products, but it is nice to pretend sometimes that Mr Gates is an obscure
software vendor whose operating system we have only vaguely heard of. Why?
Not because every single regular on this ng hasn't written a program to run
under MS at some stage, but because you need to separate the language from
the implementation to improve the structure of your programs.
 
Paul Mesken

I used to spend a lot of time cracking multiplications
(replacing x *= 9 with x = (x << 3) + x;).
Now I hardly ever do. The real world changes.

That's a good point. Multiplications aren't as slow as they once
were (although your example is still faster than the multiplication
on a P3 or P4, but that's beside the point).

In general, I think that a single operation should not be replaced by
two or more operations which just happen to be faster at the
_present_. It also tends to hurt readability.

On the other hand, replacing one operation with another one can be
quite helpful.

For example :

a = b % 16;

a = b & 15;

(for some unsigned int b)

Of course, the compiler should optimize such simple constructs but I
have too much experience with compilers that don't do obvious
optimizations to have any trust in them :)
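
A quick way to convince yourself of the equivalence (a sketch; note
that it relies on b being unsigned and the divisor being a power of
two):

#include <stdio.h>

int main(void)
{
    unsigned int b;

    /* for unsigned b and a power-of-two divisor, b % 16 == (b & 15) */
    for (b = 0; b < 1000; b++) {
        if (b % 16 != (b & 15)) {
            printf("mismatch at %u\n", b);
            return 1;
        }
    }
    printf("b %% 16 == (b & 15) held for all tested values\n");
    return 0;
}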

I think it's perfectly safe to assume that multiplications, divisions
and modulus operations will never be faster than the boolean
operations (and, or, not), additions and subtractions.

Having said that : I believe that readability (and, thus,
maintainability) is the most important factor for the bulk of the
code. If a program needs optimization for speed (and that optimization
is needed in the code, not in the hardware) then, typically, only a
very small part of the program needs that optimization (the "90% of
the time, 10% of the code is executed" thing :)
 
Paul Mesken

And you have proof positive of this for all conceivable hardware?
Including specialist math hardware designed to do divisions
ultra-fast?

Let's do it somewhat simpler : _you_ come up with a _single_ piece of
hardware that does divisions quicker than additions.

I would be mightily impressed.
 
