Portably accessing 1 byte

rsood

Hi

I'm developing a program, and naturally I want it to be as portable as
possible. I need to be able to access specific numbers of bytes in it,
but as far as I know, there is no keyword in the C language such as
'byte'. Is it always okay to assume that the char data type is always
1 byte, or is there some other way to be sure you are getting 1 byte
that is not processor/OS dependent that is better, or is there no way
to be both absolutely sure and absolutely portable?

Thanks for any help.
 
void * clvrmnky()

rsood said:
I'm developing a program, and naturally I want it to be as portable as
possible. I need to be able to access specific numbers of bytes in it,
but as far as I know, there is no keyword in the C language such as
'byte'. Is it always okay to assume that the char data type is always
1 byte, or is there some other way to be sure you are getting 1 byte
that is not processor/OS dependent that is better, or is there no way
to be both absolutely sure and absolutely portable?

sizeof(char) is always 1, isn't it? sizeof returns bytes, so I think
this is your basic byte type.
 
Lew Pitcher

sizeof(char) is always 1, isn't it? sizeof returns bytes, so I think
this is your basic byte type.

True enough.

Is the OP interested in C 'bytes' (which are synonymous with 'char', and
are /at least/ 8 bits wide), or is he interested in 8-bit quantities
(which are sometimes called 'bytes' outside of the C arena)?

If he is interested in C 'bytes', then char will do.
If he is interested in 8-bit quantities only, then (AFAIK) there is no
/portable/ way to manage that in C.

--

Lew Pitcher, IT Specialist, Corporate Technology Solutions,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed here are my own, not my employer's)
 
Eric Sosman

rsood wrote On 05/05/06 14:57,:
Hi

I'm developing a program, and naturally I want it to be as portable as
possible. I need to be able to access specific numbers of bytes in it,
but as far as I know, there is no keyword in the C language such as
'byte'. Is it always okay to assume that the char data type is always
1 byte, or is there some other way to be sure you are getting 1 byte
that is not processor/OS dependent that is better, or is there no way
to be both absolutely sure and absolutely portable?

One char is one byte, always and forever.

BUT "one byte" might not have the meaning you expect.
In C, "one byte" has at least eight bits, but might have
more. Machines with 9-bit bytes are now rare but were
once fairly common, and machines with 32-bit bytes can be
found even today.

If what you want is "one octet," that is, one eight-bit
quantity, you can store the value in a byte/char but you
must use a little care in manipulating it. For example,
if you start incrementing an `unsigned char' repeatedly,
it might take more than 256 increments before the original
value reappears.
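
A minimal sketch of that "little care", assuming the value is meant to be an octet held in an `unsigned char`. The helper names `octet_increment` and `octet_cycle_length` are made up for illustration and are not from any library. Masking with 0xFF after each operation keeps the value in 0..255 even where CHAR_BIT is larger than 8:

```c
/* Hypothetical helper: increment a value that is meant to be an octet.
   The mask forces wrap-around at 256 even if unsigned char is wider
   than 8 bits on this implementation. */
unsigned char octet_increment(unsigned char value)
{
    return (unsigned char)((value + 1u) & 0xFFu);
}

/* Hypothetical helper: count how many masked increments it takes for
   the starting value to reappear. The start value is masked first so
   that the loop always terminates. */
unsigned long octet_cycle_length(unsigned char start)
{
    const unsigned char origin = (unsigned char)(start & 0xFFu);
    unsigned char v = origin;
    unsigned long steps = 0;

    do {
        v = octet_increment(v);
        steps++;
    } while (v != origin);

    return steps;
}
```

With the mask in place, the cycle length is 256 whatever CHAR_BIT is. Without it, a plain `v++` wraps at UCHAR_MAX + 1, which is only guaranteed to be at least 256.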
 
rsood

Thank you very much.
One char is one byte, always and forever.

You guys have been very helpful ^...
If he is interested in 8-bit quantities only, then (AFAIK) there is no
/portable/ way to manage that in C.

....although somewhat discouraging ^ :)
I am actually interested in 8 bit quantities, because I am attempting
to write a cross-compiler/assembler, so it is rather important whether
it outputs 8 bits, 9 bits or 32 bits. I guess I'll have to rethink my
idea, or just accept it being limited somewhat.
 
cbmanica

rsood said:
I'm developing a program, and naturally I want it to be as portable as
possible. I need to be able to access specific numbers of bytes in it,

As noted, sizeof( char ) is guaranteed to be 1 by the Standard.
However, if you plan to manipulate the bits of the bytes, IIRC you want
"unsigned char"; fiddling with the bits of signed chars is not
guaranteed to be safe, and it is implementation defined whether plain
"char" is signed or unsigned.
 
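To illustrate the point about `unsigned char` made in the post above, here is a small sketch of bit manipulation on a byte. The helper names are hypothetical, and the bit index `n` is assumed to be in the range 0..7. Because the operand is unsigned, the results do not depend on whether plain `char` is signed on a given implementation:

```c
/* Hypothetical helpers; n is assumed to be in the range 0..7. */

/* Return b with bit n set. */
unsigned char set_bit(unsigned char b, unsigned n)
{
    return (unsigned char)(b | (1u << n));
}

/* Return b with bit n cleared. */
unsigned char clear_bit(unsigned char b, unsigned n)
{
    return (unsigned char)(b & ~(1u << n));
}

/* Return 1 if bit n of b is set, otherwise 0. */
int test_bit(unsigned char b, unsigned n)
{
    return (int)((b >> n) & 1u);
}
```

With a signed char, a value whose top bit is set may be negative, and right-shifting a negative value is implementation-defined, which is one reason to prefer the unsigned type here.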
Eric Sosman

rsood wrote On 05/05/06 16:18,:
Thank you very much.

You guys have been very helpful ^...

...although somewhat discouraging ^ :)
I am actually interested in 8 bit quantities, because I am attempting
to write a cross-compiler/assembler, so it is rather important whether
it outputs 8 bits, 9 bits or 32 bits. I guess I'll have to rethink my
idea, or just accept it being limited somewhat.

Consider that I/O always performs a translation of some
kind between the numbers stored inside the computer and the
magnetic squiggles, sonic squeals, photonic phlashes, or
whatever the recording medium uses. Even if the machine uses
eleven-bit characters, there is probably some way to get its
I/O gadgetry to traffic in octets. Obviously, a system that
can support an Internet connection must be capable of doing
so, whatever its internal architecture may look like. So
go ahead without too much fear; quite likely the worst that
can happen is that you'll need to post-process your program's
output through a filter of some kind.
 
Keith Thompson

rsood said:
Thank you very much.

You guys have been very helpful ^...

...although somewhat discouraging ^ :)
I am actually interested in 8 bit quantities, because I am attempting
to write a cross-compiler/assembler, so it is rather important whether
it outputs 8 bits, 9 bits or 32 bits. I guess I'll have to rethink my
idea, or just accept it being limited somewhat.

The word for an 8-bit quantity is "octet".

As far as I know, all existing conforming hosted C implementations
have CHAR_BIT==8 (i.e., bytes are 8 bits). The only current C
implementations I've heard of with CHAR_BIT>8 are DSPs (digital signal
processors). Most embedded systems other than DSPs have 8-bit bytes,
conceivably all of them (I actually don't know).

If you want absolute 100% portability, you'll have to think about how
the octets are transferred from one system to another. (If you copy a
sequence of octets from a system with 8-bit bytes to one with 9-bit
bytes, how is it done, and what does it look like when it gets to the
other end?) The C language doesn't say anything about how this is
done, and I don't know how it's done in practice. You could, for
example, require the octets to be represented as a sequence of bytes,
each of which has a value in the range 0..255, even if a byte is 16 or
32 bits.
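
One way to sketch the "sequence of bytes, each in the range 0..255" representation: split a 32-bit value into four octets with shifts and masks, so the result does not depend on CHAR_BIT or on the host's byte order. The function names `put_u32_be` and `get_u32_be` are hypothetical, and the most-significant-first order is just an assumption for the example:

```c
/* Hypothetical helper: store the low 32 bits of value as four octets,
   most significant first. Each element ends up in the range 0..255,
   however wide unsigned char really is. unsigned long is used because
   it is guaranteed to hold at least 32 bits. */
void put_u32_be(unsigned char out[4], unsigned long value)
{
    out[0] = (unsigned char)((value >> 24) & 0xFFu);
    out[1] = (unsigned char)((value >> 16) & 0xFFu);
    out[2] = (unsigned char)((value >> 8) & 0xFFu);
    out[3] = (unsigned char)(value & 0xFFu);
}

/* Hypothetical helper: rebuild the 32-bit value from four octets.
   Any bits above the low eight in each element are ignored. */
unsigned long get_u32_be(const unsigned char in[4])
{
    return ((unsigned long)(in[0] & 0xFFu) << 24)
         | ((unsigned long)(in[1] & 0xFFu) << 16)
         | ((unsigned long)(in[2] & 0xFFu) << 8)
         |  (unsigned long)(in[3] & 0xFFu);
}
```

How those octets then travel between systems is still outside the language, as noted above; this only fixes what the in-memory sequence looks like.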

On the other hand, if you don't mind limiting your program to run only
on systems with 8-bit bytes, that might be portable enough. If so,
you can add something like this:

#include <limits.h>
#if CHAR_BIT != 8
# error "Not supported on systems with CHAR_BIT != 8"
#endif
 
Gimmmo

Hi

I never use *direct* C data type, instead I utilize 'typedef'.


E.g. in the global header file included in every C file:


#if defined(processor_A)

typedef unsigned long uint32_t; /* long is 32-bit */
typedef signed long sint32_t; /* long is 32-bit */

#elif defined(processor_B)

typedef unsigned int uint32_t; /* int is 32-bit */
typedef signed int sint32_t; /* int is 32-bit */

#endif


So, in my C files, I never use

unsigned long x; /* I have to think about the size on each
particular processor here :-( */

but

uint32_t x; /* 32-bit regardless of processor type */

I just need to #define the processor type accordingly.

Rgds.
 
Gimmmo

Because portability.

I'm developing embedded software for various micros.

When I need 16-bit size, then I just use uint16_t (it
is written clearly xxxx16xx).

Otherwise, I have to think first what is 16-bit for
this micro... for every variable I use... remember
my source tree is commonly used for various micros.


Rgds.
 
Ben C

Hi

I never use *direct* C data type, instead I utilize 'typedef'.


E.g. in the global header file included in every C file:


#if defined(processor_A)

typedef unsigned long uint32_t; /* long is 32-bit */
typedef signed long sint32_t; /* long is 32-bit */

#elif defined(processor_B)

typedef unsigned int uint32_t; /* int is 32-bit */
typedef signed int sint32_t; /* int is 32-bit */

#endif

You get all those in stdint.h anyway (although the signed ones are just
called int32_t, not sint32_t).

But I wouldn't recommend the practice of _always_ using types that have
exact sizes.

It's better to use them only when you require an exact size (e.g. you're
doing something like processing audio data in a particular format).

A lot of the time you just want a number. Just use int, then the
compiler can pick its favourite size of machine operation for what works
best on the target.

A question I have: is uint8_t always one octet (precisely 8 bits), or
one byte (CHAR_BIT bits which could be anything e.g. 9 or 32)?

The name implies 8 bits. If so, this sounds like a good solution for the
OP.
 
Keith Thompson

Gimmmo said:
Because portability.

I'm developing embedded software for various micros.

When I need 16-bit size, then I just use uint16_t (it
is written clearly xxxx16xx).

Otherwise, I have to think first what is 16-bit for
this micro... for every variable I use... remember
my source tree is commonly used for various micros.

Please don't top-post. See <http://www.caliburn.nl/topposting.html>.

If you specifically need a 16-bit type, yes, you need to use a
typedef. (I've used systems that don't even have a 16-bit integer
type.) C99 introduces a new standard header, <stdint.h>, that defines
typedefs for exact-width types. If you don't have a C99
implementation, see <http://sherman.lysator.liu.se/c/q8/index.html>
for Doug Gwyn's public-domain C90-compatible implementation of
<stdint.h> and other headers.

On the other hand, the exact size or range of a type is sometimes the
most important thing about it, but not always. In many contexts,
"int" is the most appropriate type to use; it's guaranteed to be at
least 16 bits, and it's likely to be the fastest type for arithmetic.
For other purposes, you can use a type whose name is based on its
purpose rather than its size (such as size_t).
 
Keith Thompson

Ben C said:
A question I have: is uint8_t always one octet (precisely 8 bits), or
one byte (CHAR_BIT bits which could be anything e.g. 9 or 32)?

uint8_t, as the name implies, is always exactly 8 bits. An
implementation with CHAR_BIT > 8 cannot have a type uint8_t.
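
Since uint8_t is optional, a program can test for it at compile time. The sketch below assumes a C99 <stdint.h>, where the UINT8_MAX macro is defined only when uint8_t exists. The names `octet_t`, `OCTET_IS_EXACT` and `to_octet` are invented for this example. It falls back to uint_least8_t, which every C99 implementation must provide:

```c
#include <stdint.h>

/* UINT8_MAX is only defined when the implementation provides uint8_t,
   so it can serve as a compile-time test for the exact-width type. */
#ifdef UINT8_MAX
typedef uint8_t octet_t;        /* exactly 8 bits */
#define OCTET_IS_EXACT 1
#else
typedef uint_least8_t octet_t;  /* at least 8 bits, possibly more */
#define OCTET_IS_EXACT 0
#endif

/* Hypothetical helper: reduce a value to the octet range 0..255.
   The mask matters only when octet_t is wider than 8 bits. */
octet_t to_octet(unsigned value)
{
    return (octet_t)(value & 0xFFu);
}
```

On an implementation with CHAR_BIT > 8 the fallback branch is taken, and the explicit mask is what keeps stored values within 0..255.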
 
Malcolm

rsood said:
...although somewhat discouraging ^ :)
I am actually interested in 8 bit quantities, because I am attempting
to write a cross-compiler/assembler, so it is rather important whether
it outputs 8 bits, 9 bits or 32 bits. I guess I'll have to rethink my
idea, or just accept it being limited somewhat.

So store your machine-code output in arrays of unsigned char. If CHAR_BIT
is greater than eight, that is no problem: if any of the top bits are set,
there is a bug in your program.

Then output with calls to fputc(), to a file opened in binary mode. If the
file system uses some format whereby bytes are not 8 bits, there will be a
converter available. If not, there is not much you can do to treat a nine-bit
file system as an eight-bit one; the computer is inherently
non-portable, and the only workaround is to output a text file with the
results as human-readable numbers. This almost certainly isn't worth
supporting.
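
A rough sketch of that approach, assuming the generated code already sits in an array of unsigned char. The function name `write_octets` and its return convention are made up for the example. Any element above 0xFF is treated as a bug in the caller, as described above:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write count octets to the named file, one per
   fputc() call, with the file opened in binary mode.
   Returns 0 on success and -1 on any failure. */
int write_octets(const char *path, const unsigned char *data, size_t count)
{
    FILE *fp = fopen(path, "wb");
    size_t i;

    if (fp == NULL)
        return -1;

    for (i = 0; i < count; i++) {
        /* On a system with CHAR_BIT > 8, a value above 0xFF means the
           top bits are set, i.e. a bug in the code that filled the
           array. With CHAR_BIT == 8 this test can never be true. */
        if (data[i] > 0xFFu || fputc(data[i], fp) == EOF) {
            fclose(fp);
            return -1;
        }
    }

    return fclose(fp) == 0 ? 0 : -1;
}
```

Checking the result of fclose() matters here because buffered output may only reach the file, and so only fail, at that point.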
 
