beginner question

Michael · Apr 22, 2004

In the past I have developed many micro-controller products using
assembler. I have recently started using C, and so far I love it! I am
very motivated to learn and become accustomed to the language.
I have a couple of simple questions; just to make sure I'm not going
about things the wrong way.

When writing functions that handle four bytes, I use the 'unsigned
long' variable type. At first to access the individual bytes of an
unsigned long, I would create a union as such:

union {
unsigned long longword;
char byte[4];
} varname;

Then use varname.longword or varname.byte[0-3] as required.

But while experimenting, I found a better way: (&(char)longword)[0-3]

Is it ok for code to be dependent on big or little endian? Or does
this indicate bad programming?
There are so many ways to do things in C. I'd appreciate any
guidance you might have to offer.

Regards, Michael.

Arthur J. O'Dwyer · Apr 22, 2004

In the past I have developed many micro-controller products using
assembler. I have recently started using C, and so far I love it! I am
very motivated to learn and become accustomed to the language.

It's fun, isn't it?

I have a couple of simple questions; just to make sure I'm not going
about things the wrong way.

When writing functions that handle four bytes, I use the 'unsigned
long' variable type. At first to access the individual bytes of an
unsigned long, I would create a union as such:

union {
unsigned long longword;
char byte[4];
} varname;

Then use varname.longword or varname.byte[0-3] as required.

This works. Of course, you're not guaranteed to be able to put values
in through 'byte' and get anything sensible out through 'longword', but
given that you say you're doing microcontroller stuff, you probably can
find out exactly what happens on your platform when you do that.

The portable way to extract the bytes of an 'unsigned long' is to
look at it as an array of unsigned char, like this:

unsigned long lu = 42uL;
unsigned char *lu_bytes = &lu;
int i;

puts("The bytes of lu, from low to high address, are:\n");
for (i=0; i < sizeof lu; ++i)
    printf("0x%x\n", (unsigned) lu_bytes[i]);

Note the use of 'sizeof lu' in place of '4'. On many machines, this
will print 2A 00 00 00 (with proper linebreaks, of course); but it's
not guaranteed to. Other machines will print 00 00 00 2A, or
2A 00 00 00 00 00 00 00, or (on the DS9000, which uses a highly
sophisticated system of padding bits) DEADBEEF DEADBEEF 0 DEADBEEF.

But while experimenting, I found a better way: (&(char)longword)[0-3]

This is wrong. I think you meant to say,

((char *)&longword)[0-3]

Even now that the syntax is correct, I strongly recommend the use of
'unsigned char' instead of 'char' when dealing with bytes. Save plain
'char' for dealing with actual CHARacters, as the name implies.
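
To illustrate the 'unsigned char' advice, here is a minimal sketch in standard C. Whether plain 'char' is signed is implementation-defined, so a byte such as 0xF0 may read back as a negative number through a 'char *', while through an 'unsigned char *' it always reads as 240. The function names are made up for this example.

```c
#include <limits.h>

/* Hypothetical helper: 1 if plain char is a signed type here, else 0. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: read the first byte of an object through
   unsigned char. The result is always in the range 0..UCHAR_MAX. */
int first_byte_unsigned(const void *p)
{
    return ((const unsigned char *)p)[0];
}

/* Hypothetical helper: read the first byte through plain char.
   Where char is signed, bytes with the top bit set come back negative. */
int first_byte_plain(const void *p)
{
    return ((const char *)p)[0];
}
```

Code that later shifts or compares such a value usually wants the unsigned reading, which is why 'unsigned char' is the safer default for raw bytes.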

Is it ok for code to be dependent on big or little endian? Or does
this indicate bad programming?

A lot of the time, it indicates non-portable programming. Here in
c.l.c, we strive for portability all the time; so if you want to learn
C targeted at your specific compiler/embedded system, I recommend you
find a different newsgroup. That said, Non-Portable is not Bad,
necessarily; it just means your code would be harder to port to a
new system, should you ever decide to try it.
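
If code does depend on byte order, one option is to check the assumption at run time instead of leaving it silent. The following is only a sketch; 'is_little_endian' is a hypothetical name, and the check says nothing useful on machines where sizeof(unsigned long) is 1 or where the byte order is mixed.

```c
/* Hypothetical helper: returns 1 if the byte at the lowest address of an
   unsigned long holds its least significant bits, 0 otherwise. */
int is_little_endian(void)
{
    unsigned long probe = 1UL;

    /* Inspecting an object through unsigned char * is always allowed. */
    return *(const unsigned char *)&probe == 1;
}
```

A program could call this once at startup and refuse to run, or pick a different code path, when the result is not what the rest of the code expects.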

There are so many ways to do things in C. I'd appreciate any
guidance you might have to offer.

See the welcome messages posted here occasionally, and Google for
'c.l.c FAQ'. The FAQ is pretty long and sometimes dense, but it will
give you a *lot* of "guidance." That's what it's for.

HTH,
-Arthur

Henri Manson · Apr 22, 2004

It's not bad programming, as long as you document that your code is
dependent on processor architecture. If you want the bytes of the
longword in LSB-to-MSB order in a system-independent way, you can get
them by using the shift operators, e.g.:

#include <stdio.h>

int main(void)
{
    /* unsigned, so the shifts below are well defined */
    unsigned long x = 0xABCD1234UL;
    unsigned long y;

    /* get bytes out of x. b1 is LSB, b4 is MSB */
    unsigned char b1 = (unsigned char) (x & 0xFF);
    unsigned char b2 = (unsigned char) ((x >> 8) & 0xFF);
    unsigned char b3 = (unsigned char) ((x >> 16) & 0xFF);
    unsigned char b4 = (unsigned char) ((x >> 24) & 0xFF);

    printf("b1 = %02x, b2 = %02x, b3 = %02x, b4 = %02x\n", b1, b2, b3, b4);
    /* create a long value out of the bytes; widen each byte before
       shifting so the shift is not done in (possibly 16-bit) int */
    y = (unsigned long) b1 | ((unsigned long) b2 << 8) |
        ((unsigned long) b3 << 16) | ((unsigned long) b4 << 24);
    printf("y = %lx\n", y);
    return 0;
}

HTH

Henri Manson

Jack Klein · Apr 22, 2004

In the past I have developed many micro-controller products using
assembler. I have recently started using C, and so far I love it! I am
very motivated to learn and become accustomed to the language.

It's fun, isn't it?

I have a couple of simple questions; just to make sure I'm not going
about things the wrong way.

When writing functions that handle four bytes, I use the 'unsigned
long' variable type. At first to access the individual bytes of an
unsigned long, I would create a union as such:

union {
unsigned long longword;
char byte[4];
} varname;

Then use varname.longword or varname.byte[0-3] as required.

This works. Of course, you're not guaranteed to be able to put values
in through 'byte' and get anything sensible out through 'longword', but
given that you say you're doing microcontroller stuff, you probably can
find out exactly what happens on your platform when you do that.

The portable way to extract the bytes of an 'unsigned long' is to
look at it as an array of unsigned char, like this:

unsigned long lu = 42uL;
unsigned char *lu_bytes = &lu;

ITYM unsigned char *lu_bytes = (unsigned char *)&lu;

...to avoid the constraint violation and required diagnostic.
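
With that cast in place, the byte dump can be sketched as a small function. This is one possible arrangement, not the code from the earlier post; 'print_bytes' is a hypothetical name, and the output order depends on the machine's byte order.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: print the bytes of *value from low to high address
   and return how many bytes were printed. */
size_t print_bytes(const unsigned long *value)
{
    /* The explicit cast avoids the constraint violation. */
    const unsigned char *bytes = (const unsigned char *)value;
    size_t i;

    for (i = 0; i < sizeof *value; ++i)
        printf("0x%02x\n", (unsigned)bytes[i]);

    return sizeof *value;
}
```

Using size_t for the index also avoids comparing a signed int against the unsigned result of sizeof.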

Jack Klein · Apr 22, 2004

In the past I have developed many micro-controller products using
assembler. I have recently started using C, and so far I love it! I am
very motivated to learn and become accustomed to the language.

I've been doing embedded system programming for 25 years, using C in
that type of work for 20.

You are making some non-portable assumptions, and ones that are more
likely to trip you up in embedded systems than they are in desktop
type programming.

I have a couple of simple questions; just to make sure I'm not going
about things the wrong way.

When writing functions that handle four bytes, I use the 'unsigned
long' variable type. At first to access the individual bytes of an
unsigned long, I would create a union as such:

The first mistake you are making is assuming that there are always 8
bits in a byte. That is not how C defines a byte, nor any real
authoritative source. A quantity of exactly 8 bits is an "octet". C
defines a byte as the smallest addressable unit of storage, and also
the size of an object that can contain characters. A byte in C must
contain at least 8 bits, but can contain more.

union {
unsigned long longword;
char byte[4];
} varname;

Most of the code I have written for the past few months, and also for
the next few months as well, has been for a Texas Instruments 2812,
sort of a hybrid microcontroller and Digital Signal Processor. It has
a good C compiler, but it doesn't do 8 bits, not at all, the hardware
does not support it.

A byte on this processor has 16 bits. All memory accesses are 16 bits.
In its C compiler, the standard C macro CHAR_BIT, defined in
<limits.h>, is 16. sizeof(char) == sizeof(short) == sizeof(int) == 1.

Then use varname.longword or varname.byte[0-3] as required.

But while experimenting, I found a better way: (&(char)longword)[0-3]

Is it ok for code to be dependent on big or little endian? Or does
this indicate bad programming?
There are so many ways to do things in C. I'd appreciate any
guidance you might have to offer.

Regards, Michael.

In the past I have written C for an Analog Devices SHARC DSP as well.
That is a 32 bit only architecture, where all the integer types, char
through long, have sizeof 1 and 32 bits.
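
A quick way to see what a given compiler does is to ask it. The sketch below uses only CHAR_BIT from <limits.h> and sizeof; the function names are invented for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: bits in the storage of an unsigned long
   (this count may include padding bits). */
size_t bits_in_unsigned_long(void)
{
    return sizeof(unsigned long) * CHAR_BIT;
}

/* Hypothetical helper: number of C bytes needed to hold 32 bits here:
   4 when CHAR_BIT is 8, 2 when it is 16, 1 when it is 32. */
size_t bytes_for_32_bits(void)
{
    return (32 + CHAR_BIT - 1) / CHAR_BIT;
}
```

On a typical desktop compiler 'bytes_for_32_bits()' gives 4; on the 16-bit-byte and 32-bit-byte parts described above it would give 2 and 1, which is why a hard-coded 'char byte[4]' is not a safe assumption.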

I would suggest using Henri's suggestion and using shift and mask.
Then endianness does not matter.

It is actually not hard in general to write code for a platform where
chars have more than 8 bits.
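
As a rough sketch of the shift-and-mask approach, the two functions below store a 32-bit value as four octets and read it back. The names and the most-significant-first order are arbitrary choices for this example. Each array element holds one octet (0..255) even when an unsigned char is wider than 8 bits, so the result does not depend on endianness or on CHAR_BIT.

```c
/* Hypothetical helper: store the low 32 bits of value into out[0..3],
   most significant octet first. */
void put_u32_be(unsigned char out[4], unsigned long value)
{
    out[0] = (unsigned char)((value >> 24) & 0xFFUL);
    out[1] = (unsigned char)((value >> 16) & 0xFFUL);
    out[2] = (unsigned char)((value >> 8) & 0xFFUL);
    out[3] = (unsigned char)(value & 0xFFUL);
}

/* Hypothetical helper: rebuild the value from four octets. Each octet is
   widened to unsigned long before shifting so no shift happens in int. */
unsigned long get_u32_be(const unsigned char in[4])
{
    return ((unsigned long)(in[0] & 0xFFu) << 24) |
           ((unsigned long)(in[1] & 0xFFu) << 16) |
           ((unsigned long)(in[2] & 0xFFu) << 8) |
            (unsigned long)(in[3] & 0xFFu);
}
```

For example, storing 0xABCD1234 yields the octets AB, CD, 12, 34 in that order on any conforming implementation.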

Michael · Apr 22, 2004

Thanks Arthur!
