Portably accessing 1 byte

rsood · May 5, 2006

Hi

I'm developing a program, and naturally I want it to be as portable as
possible. I need to be able to access specific numbers of bytes in it,
but as far as I know, there is no keyword in the C language such as
'byte'. Is it always okay to assume that the char data type is always
1 byte, or is there some other way to be sure you are getting 1 byte
that is not processor/OS dependent that is better, or is there no way
to be both absolutely sure and absolutely portable?

Thanks for any help.

void * clvrmnky() · May 5, 2006

rsood said:
I'm developing a program, and naturally I want it to be as portable as
possible. I need to be able to access specific numbers of bytes in it,
but as far as I know, there is no keyword in the C language such as
'byte'. Is it always okay to assume that the char data type is always
1 byte, or is there some other way to be sure you are getting 1 byte
that is not processor/OS dependent that is better, or is there no way
to be both absolutely sure and absolutely portable?

sizeof(char) is always 1, isn't it? sizeof returns bytes, so I think
this is your basic byte type.
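
A minimal sketch of that point, assuming nothing beyond the standard headers. The function names are made up for illustration. The first value is fixed by the language; the second is whatever the implementation chose, as long as it is at least 8.

```c
#include <limits.h>
#include <stddef.h>

/* sizeof counts in units of char, so this is 1 on every
   conforming implementation. */
size_t bytes_per_char(void)
{
    return sizeof(char);
}

/* Number of bits in one C byte: at least 8, possibly more. */
int bits_per_byte(void)
{
    return CHAR_BIT;
}
```

Printing both values on a given target is a quick way to see whether a C byte and an octet happen to coincide there.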

Lew Pitcher · May 5, 2006

sizeof(char) is always 1, isn't it? sizeof returns bytes, so I think
this is your basic byte type.

True enough.

Is the OP interested in C 'bytes' (which are synonymous with 'char', and
are /at least/ 8 bits wide), or is he interested in 8-bit quantities
(which are sometimes called 'bytes' outside of the C arena)?

If he is interested in C 'bytes', then char will do.
If he is interested in 8-bit quantities only, then (AFAIK) there is no
/portable/ way to manage that in C.

--

Lew Pitcher, IT Specialist, Corporate Technology Solutions,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed here are my own, not my employer's)

Eric Sosman · May 5, 2006

rsood wrote On 05/05/06 14:57,:

Hi

I'm developing a program, and naturally I want it to be as portable as
possible. I need to be able to access specific numbers of bytes in it,
but as far as I know, there is no keyword in the C language such as
'byte'. Is it always okay to assume that the char data type is always
1 byte, or is there some other way to be sure you are getting 1 byte
that is not processor/OS dependent that is better, or is there no way
to be both absolutely sure and absolutely portable?

One char is one byte, always and forever.

BUT "one byte" might not have the meaning you expect.
In C, "one byte" has at least eight bits, but might have
more. Machines with 9-bit bytes are now rare but were
once fairly common, and machines with 32-bit bytes can be
found even today.

If what you want is "one octet," that is, one eight-bit
quantity, you can store the value in a byte/char but you
must use a little care in manipulating it. For example,
if you start incrementing an `unsigned char' repeatedly,
it might take more than 256 increments before the original
value reappears.
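
One way to take that care, sketched under the assumption that the value is supposed to behave as an octet: mask with 0xFF after each arithmetic step. The helper names are hypothetical.

```c
/* Increment a value that is meant to behave as an 8-bit octet.
   The mask discards everything above the low eight bits, so the
   result wraps from 255 to 0 even where unsigned char is wider. */
unsigned char octet_increment(unsigned char value)
{
    return (unsigned char)((value + 1u) & 0xFFu);
}

/* Add two octets modulo 256, independent of CHAR_BIT. */
unsigned char octet_add(unsigned char a, unsigned char b)
{
    return (unsigned char)(((unsigned)a + b) & 0xFFu);
}
```

On a machine with 8-bit bytes the mask changes nothing; on anything wider it is what keeps the value in the range 0..255.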

rsood · May 5, 2006

Thank you very much.

One char is one byte, always and forever.

You guys have been very helpful ^...

If he is interested in 8-bit quantities only, then (AFAIK) there is no
/portable/ way to manage that in C.

...although somewhat discouraging ^

I am actually interested in 8 bit quantities, because I am attempting
to write a cross-compiler/assembler, so it is rather important whether
it outputs 8 bits, 9 bits or 32 bits. I guess I'll have to rethink my
idea, or just accept it being limited somewhat.

cbmanica · May 5, 2006

rsood said:
I'm developing a program, and naturally I want it to be as portable as
possible. I need to be able to access specific numbers of bytes in it,

As noted, sizeof( char ) is guaranteed to be 1 by the Standard.
However, if you plan to manipulate the bits of the bytes, IIRC you want
"unsigned char"; fiddling with the bits of signed chars is not
guaranteed to be safe, and it is implementation defined whether plain
"char" is signed or unsigned.
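
A small sketch of bit fiddling done on unsigned char, as suggested above. The helper names are illustrative only, and the bit index n is assumed to be in the range 0..7.

```c
/* Upper four bits of an octet stored in an unsigned char. */
unsigned high_nibble(unsigned char b)
{
    return (b >> 4) & 0x0Fu;
}

/* Lower four bits of the same octet. */
unsigned low_nibble(unsigned char b)
{
    return b & 0x0Fu;
}

/* Return b with bit n set (n assumed to be 0..7). */
unsigned char set_bit(unsigned char b, unsigned n)
{
    return (unsigned char)(b | (1u << n));
}

/* Return b with bit n cleared (n assumed to be 0..7). */
unsigned char clear_bit(unsigned char b, unsigned n)
{
    return (unsigned char)(b & ~(1u << n));
}
```

Because the operand is never negative, the shifts and masks here do not depend on how the implementation represents signed values.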

Eric Sosman · May 5, 2006

rsood wrote On 05/05/06 16:18,:

Thank you very much.

You guys have been very helpful ^...

...although somewhat discouraging ^
I am actually interested in 8 bit quantities, because I am attempting
to write a cross-compiler/assembler, so it is rather important whether
it outputs 8 bits, 9 bits or 32 bits. I guess I'll have to rethink my
idea, or just accept it being limited somewhat.

Consider that I/O always performs a translation of some
kind between the numbers stored inside the computer and the
magnetic squiggles, sonic squeals, photonic phlashes, or
whatever the recording medium uses. Even if the machine uses
eleven-bit characters, there is probably some way to get its
I/O gadgetry to traffic in octets. Obviously, a system that
can support an Internet connection must be capable of doing
so, whatever its internal architecture may look like. So
go ahead without too much fear; quite likely the worst that
can happen is that you'll need to post-process your program's
output through a filter of some kind.

Keith Thompson · May 5, 2006

rsood said:
Thank you very much.

You guys have been very helpful ^...

...although somewhat discouraging ^
I am actually interested in 8 bit quantities, because I am attempting
to write a cross-compiler/assembler, so it is rather important whether
it outputs 8 bits, 9 bits or 32 bits. I guess I'll have to rethink my
idea, or just accept it being limited somewhat.

The word for an 8-bit quantity is "octet".

As far as I know, all existing conforming hosted C implementations
have CHAR_BIT==8 (i.e., bytes are 8 bits). The only current C
implementations I've heard of with CHAR_BIT>8 are DSPs (digital signal
processors). Most embedded systems other than DSPs have 8-bit bytes,
conceivably all of them (I actually don't know).

If you want absolute 100% portability, you'll have to think about how
the octets are transferred from one system to another. (If you copy a
sequence of octets from a system with 8-bit bytes to one with 9-bit
bytes, how is it done, and what does it look like when it gets to the
other end?) The C language doesn't say anything about how this is
done, and I don't know how it's done in practice. You could, for
example, require the octets to be represented as a sequence of bytes,
each of which has a value in the range 0..255, even if a byte is 16 or
32 bits.

On the other hand, if you don't mind limiting your program to run only
on systems with 8-bit bytes, that might be portable enough. If so,
you can add something like this:

#include <limits.h>
#if CHAR_BIT != 8
# error "Not supported on systems with CHAR_BIT != 8"
#endif

Gimmmo · May 6, 2006

Hi

I never use *direct* C data type, instead I utilize 'typedef'.

E.g. in the global header file included in every C file:

#if defined(processor_A)

typedef unsigned long uint32_t; /* long is 32-bit */
typedef signed long sint32_t; /* long is 32-bit */

#elif defined(processor_B)

typedef unsigned int uint32_t; /* int is 32-bit */
typedef signed int sint32_t; /* int is 32-bit */

#endif

So, in my C files, I never use

unsigned long x; /* I have to think about the size for the particular
processor here :-( */

but

uint32_t x; /* 32-bit for x regardless of processor type */

I just need to #define the processor type accordingly.

Rgds.

Ben Pfaff · May 6, 2006

Gimmmo said:
I never use *direct* C data type, instead I utilize 'typedef'.

Why?

Gimmmo · May 6, 2006

Because portability.

I'm developing embedded software for various micros.

When I need 16-bit size, then I just use uint16_t (it
is written clearly xxxx16xx).

Otherwise, I have to think first what is 16-bit for
this micro... for every variable I use... remember
my source tree is commonly used for various micros.

Rgds.

Ben C · May 6, 2006

Hi

I never use *direct* C data type, instead I utilize 'typedef'.

E.g. in the global header file included in every C file:

#if defined(processor_A)

typedef unsigned long uint32_t; /* long is 32-bit */
typedef signed long sint32_t; /* long is 32-bit */

#elif defined(processor_B)

typedef unsigned int uint32_t; /* int is 32-bit */
typedef signed int sint32_t; /* int is 32-bit */

#endif

You get all those in stdint.h anyway (although the signed ones are just
called int32_t, not sint32_t).

But I wouldn't recommend the practice of _always_ using types that have
exact sizes.

It's better to use them only when you require an exact size (e.g. you're
doing something like processing audio data in a particular format).

A lot of the time you just want a number. Just use int, then the
compiler can pick its favourite size of machine operation for what works
best on the target.

A question I have: is uint8_t always one octet (precisely 8 bits), or
one byte (CHAR_BIT bits which could be anything e.g. 9 or 32)?

The name implies 8 bits. If so, this sounds like a good solution for the
OP.

Keith Thompson · May 6, 2006

Gimmmo said:
Because portability.

I'm developing embedded software for various micros.

When I need 16-bit size, then I just use uint16_t (it
is written clearly xxxx16xx).

Otherwise, I have to think first what is 16-bit for
this micro... for every variable I use... remember
my source tree is commonly used for various micros.

Please don't top-post. See <http://www.caliburn.nl/topposting.html>.

If you specifically need a 16-bit type, yes, you need to use a
typedef. (I've used systems that don't even have a 16-bit integer
type.) C99 introduces a new standard header, <stdint.h>, that defines
typedefs for exact-width types. If you don't have a C99
implementation, see <http://sherman.lysator.liu.se/c/q8/index.html>
for Doug Gwyn's public-domain C90-compatible implementation of
<stdint.h> and other headers.

On the other hand, the exact size or range of a type is sometimes the
most important thing about it, but not always. In many contexts,
"int" is the most appropriate type to use; it's guaranteed to be at
least 16 bits, and it's likely to be the fastest type for arithmetic.
For other purposes, you can use a type whose name is based on its
purpose rather than its size (such as size_t).

Keith Thompson · May 6, 2006

Ben C said:
A question I have: is uint8_t always one octet (precisely 8 bits), or
one byte (CHAR_BIT bits which could be anything e.g. 9 or 32)?

uint8_t, as the name implies, is always exactly 8 bits. An
implementation with CHAR_BIT > 8 cannot have a type uint8_t.
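
A hedged sketch of how a program might cope with that, assuming a C99 <stdint.h>. There, UINT8_MAX is defined only when uint8_t exists, while uint_least8_t is always available. The names octet_t, OCTET_IS_EXACT and to_octet are invented for this example.

```c
#include <stdint.h>

#ifdef UINT8_MAX
typedef uint8_t octet_t;        /* exactly 8 bits on this target */
#define OCTET_IS_EXACT 1
#else
typedef uint_least8_t octet_t;  /* at least 8 bits; mask by hand */
#define OCTET_IS_EXACT 0
#endif

/* Reduce an arbitrary value to the range 0..255 and store it.
   The explicit mask keeps the result correct on both branches. */
octet_t to_octet(unsigned long v)
{
    return (octet_t)(v & 0xFFu);
}
```

With the fallback typedef the code still compiles where CHAR_BIT > 8, at the cost of masking explicitly wherever wraparound matters.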

Malcolm · May 7, 2006

rsood said:
...although somewhat discouraging ^
I am actually interested in 8 bit quantities, because I am attempting
to write a cross-compiler/assembler, so it is rather important whether
it outputs 8 bits, 9 bits or 32 bits. I guess I'll have to rethink my
idea, or just accept it being limited somewhat.

So store your machine-code output in arrays of unsigned chars. If CHAR_BIT
is greater than eight, it is no problem: if any of the top bits are set, then
there is a bug in your program.
Then output with calls to fputc(), to a file opened in binary mode. If the
file system uses some format whereby bytes are not 8 bits, there will be a
converter available. If not, there is not much you can do; you cannot declare
a nine-bit file system an eight-bit system. The computer is inherently
non-portable, and the only workaround is to output a text file with the
results as human-readable numbers. This almost certainly isn't worth
supporting.
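
The approach above could be sketched roughly as follows. The function names are hypothetical, and the least-significant-first byte order is only an assumption about the target; a real assembler would use whatever order its target requires.

```c
#include <stdio.h>
#include <stddef.h>

/* Append a 32-bit value to buf as four octets, least significant
   first. Each element holds a value in 0..255 whatever CHAR_BIT is.
   Returns the new write position. */
size_t emit_u32_le(unsigned char *buf, size_t pos, unsigned long value)
{
    int i;
    for (i = 0; i < 4; i++) {
        buf[pos++] = (unsigned char)((value >> (8 * i)) & 0xFFu);
    }
    return pos;
}

/* Write len octets with fputc to a stream opened in binary mode.
   Returns 0 on success, -1 on a write error or if any element has
   bits set above the low eight (a bug in the caller). */
int write_octets(FILE *out, const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if ((buf[i] & ~0xFFu) != 0) {
            return -1;
        }
        if (fputc(buf[i], out) == EOF) {
            return -1;
        }
    }
    return 0;
}
```

A caller would open the output with fopen(name, "wb"), fill a buffer with emit_u32_le, and hand it to write_octets.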
