C question

Paul

Hi,

Just a quick question, my C is rusty, so perhaps someone can refresh my
memory.

Say I allocate a buffer like so:

#include <stdlib.h>

char *buffer;
unsigned long somevalue = 0;

void whatever(void)
{
    buffer = malloc(4096);  /* the result should be checked for NULL */
}

Now say I want to load "somevalue" (an unsigned long, so 4 bytes in size)
with the data stored at buffer[10], that is, the 4 consecutive bytes
starting at buffer[10] should form the value stored in the unsigned long.

What's the correct syntax?

I can't simply use:

somevalue=(unsigned long)buffer[10];

Can I? What's the "correct" and legal manner for doing this?

paul
 
Joona I Palaste

Paul said:
Just a quick question, my C is rusty, so perhaps someone can refresh my
memory.
say I allocate a buffer like so:
char *buffer;
unsigned long somevalue=0;
void whatever()
{
buffer=malloc(4096);
}
Now say I want to load "somevalue" (as an unsigned long, so 4 bytes in size)
with the
data stored at buffer[10]; (The 4 consecutive bytes at buffer[10] to form
the value which
will be stored in the unsigned long)
What's the correct syntax?
I can't simply use:
somevalue=(unsigned long)buffer[10];
Can I? What's the "correct" and legal manner for doing this?

Look up memcpy(). Note that the data in the buffer must have the correct
endianness for your platform.
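
For illustration, a minimal sketch of the memcpy() route. It assumes the bytes in the buffer are already in the host's native byte order and that the stored value is exactly sizeof(unsigned long) bytes wide; the helper name load_ulong_native is hypothetical and not from the thread.

```c
#include <string.h>

/* Hypothetical helper: copy sizeof(unsigned long) bytes starting at
 * buffer[offset] into an unsigned long. memcpy() places no alignment
 * requirement on its source, so any offset inside the buffer works.
 * Assumption: the bytes are in the host's native byte order. */
unsigned long load_ulong_native(const char *buffer, size_t offset)
{
    unsigned long value;

    memcpy(&value, buffer + offset, sizeof value);
    return value;
}
```

With the original declarations this would be called as somevalue = load_ulong_native(buffer, 10); the caller still has to make sure that offset + sizeof(unsigned long) does not run past the end of the allocation.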
 
Paul

Joona I Palaste said:
Look up memcpy(). Note that the data in the buffer must have the correct
endianness for your platform.

I know memcpy (but I know this can also be done without memcpy, by the use
of pointers/casts)

Any tips?
 
Joona I Palaste

I know memcpy (but I know this can also be done without memcpy, by the use
of pointers/casts)
Any tips?

Casting the address of your unsigned long into a pointer to unsigned
char gives you a pointer that you can use to iterate through the bytes
of your unsigned long. Now just iterate sizeof(unsigned long) bytes
through both your unsigned long and the appropriate location in the
buffer, copying the bytes from the buffer to the unsigned long.
Any pointer type can be safely cast into a pointer to unsigned char, but
how the individual bytes make up the value is implementation-defined.
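
A rough sketch of the byte-by-byte copy described above, for readers who want to see it spelled out. The helper name load_ulong_bytewise is hypothetical, and the code assumes the buffer bytes are already in the host's native byte order, which is exactly the implementation-defined part.

```c
#include <stddef.h>

/* Hypothetical helper: copy the object representation one byte at a
 * time through unsigned char pointers, which is in effect what
 * memcpy() does. Assumption: the buffer bytes are in the host's
 * native byte order. */
unsigned long load_ulong_bytewise(const char *buffer, size_t offset)
{
    unsigned long value = 0;
    unsigned char *dst = (unsigned char *)&value;
    const unsigned char *src = (const unsigned char *)buffer + offset;
    size_t i;

    for (i = 0; i < sizeof value; i++)
        dst[i] = src[i];
    return value;
}
```

No unsigned long is ever read through a misaligned pointer here, so alignment is not an issue; byte order and the width of unsigned long still are.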
 
Michael Mair

Paul said:
Hi,

Just a quick question, my C is rusty, so perhaps someone can refresh my
memory.

say I allocate a buffer like so:

char *buffer;
unsigned long somevalue=0;

void whatever()
{
buffer=malloc(4096);
}

Now say I want to load "somevalue" (as an unsigned long, so 4 bytes in size)
with the
data stored at buffer[10]; (The 4 consecutive bytes at buffer[10] to form
the value which
will be stored in the unsigned long)

What's the correct syntax?

I can't simply use:

somevalue=(unsigned long)buffer[10];

If anything: somevalue = *( (unsigned long *)&buffer[10] );
However, this is a Bad Idea and not guaranteed to work as the
alignment requirements for long may not be fulfilled.
Can I? What's the "correct" and legal manner for doing this?

#include <stdlib.h>
#include <limits.h>

.....
unsigned char *buffer = NULL;
unsigned long somevalue = 0;
size_t i;

.....
buffer = malloc(10 + sizeof somevalue);
if (buffer == NULL) {
    exit(EXIT_FAILURE); /* or something more graceful */
}

.....
/* buffer[10] becomes the lowest-order byte */
for (i = 0, somevalue = 0; i < sizeof somevalue; i++)
    somevalue += (unsigned long)buffer[10+i] << (i*CHAR_BIT);

/* or, with buffer[10] as the highest-order byte:
for (i = 0, somevalue = 0; i < sizeof somevalue; i++)
    somevalue = (somevalue << CHAR_BIT) + buffer[10+i];
*/
.....
.....
The choice of for loop is entirely yours; the first makes
buffer[10] the lowest byte, the second the highest.
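
To make the difference between the two loops visible, here is a small sketch that wraps each one in a function. The function names are hypothetical, and n must not exceed sizeof(unsigned long), otherwise the shift in the first function is undefined.

```c
#include <limits.h>
#include <stddef.h>

/* First loop: buf[0] becomes the lowest-order byte.
 * Precondition: n <= sizeof(unsigned long). */
unsigned long assemble_low_first(const unsigned char *buf, size_t n)
{
    unsigned long value = 0;
    size_t i;

    for (i = 0; i < n; i++)
        value += (unsigned long)buf[i] << (i * CHAR_BIT);
    return value;
}

/* Second loop: buf[0] ends up as the highest-order byte.
 * Precondition: n <= sizeof(unsigned long). */
unsigned long assemble_high_first(const unsigned char *buf, size_t n)
{
    unsigned long value = 0;
    size_t i;

    for (i = 0; i < n; i++)
        value = (value << CHAR_BIT) + buf[i];
    return value;
}
```

On a system with 8-bit bytes, the four bytes 0x78 0x56 0x34 0x12 give 0x12345678 from the first function and 0x78563412 from the second, whatever the host's own byte order is.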


Cheers
Michael
 
CBFalconer

Paul said:
say I allocate a buffer like so:

char *buffer;
unsigned long somevalue=0;

void whatever()
{
buffer=malloc(4096);
}

Now say I want to load "somevalue" (as an unsigned long, so 4
bytes in size) with the data stored at buffer[10]; (The 4
consecutive bytes at buffer[10] to form the value which will
be stored in the unsigned long)

What's the correct syntax? I can't simply use:

somevalue=(unsigned long)buffer[10];

Can I? What's the "correct" and legal manner for doing this?

You have passed the first step, recognizing that there is a
problem. The next step is deciding what order the bytes are in within
the buffer. They may be high byte first, low byte first, or some other
combination. The bytes may contain only 8 bits of information, or
may contain CHAR_BIT bits of information (but the most common value
of CHAR_BIT is 8). So, assuming high byte first, and 8 bits of
usable information per byte, you might write:

int i;
unsigned long somevalue;
unsigned char *p;

/* buffer is a char *, so converting to unsigned char * needs a cast */
for (i = 0, p = (unsigned char *)&buffer[start], somevalue = 0; i < 4; i++, p++) {
    somevalue = 256 * somevalue + (*p & 0xff);
}

The assumptions of high byte first and length of 4 apply only to
the data in the buffer with this approach. It doesn't matter if
CHAR_BIT is larger than 8. What your actual C system does for
endianness is immaterial. You have to ensure that p never gets
outside the range of buffer.
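
One possible way to enforce that last requirement is to pass the buffer length along and refuse out-of-range reads. This is only a sketch; the name read_be32 and its return convention are hypothetical choices, not something from the post.

```c
#include <stddef.h>

/* Hypothetical helper: read a 4-octet, high-byte-first value starting
 * at buf[start]. Returns 1 on success, or 0 (leaving *out untouched)
 * if the four octets would run past the end of the len-byte buffer. */
int read_be32(const unsigned char *buf, size_t len, size_t start,
              unsigned long *out)
{
    unsigned long value = 0;
    size_t i;

    if (len < 4 || start > len - 4)
        return 0;
    for (i = 0; i < 4; i++)
        value = 256 * value + (buf[start + i] & 0xff);
    *out = value;
    return 1;
}
```

The range test is written as start > len - 4 rather than start + 4 > len so that a very large start cannot wrap around.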
 
Paul

Michael Mair said:
somevalue=(unsigned long)buffer[10];

If anything: somevalue = *( (unsigned long *)&buffer[10] );
However, this is a Bad Idea and not guaranteed to work as the
alignment requirements for long may not be fulfilled.

If endianness is not a concern in this situation, and I have told the
compiler that everything is to be byte aligned, should this work OK?

Paul.
 
Paul

I should better explain what I am trying to do with this C code.

I have some data stored in a buffer that I have previously
malloc()'d. The data contains cluster values for the FAT (file allocation
table).

On an Intel-based system (which I am writing the C code under), the
endianness is fine: the data is stored in a manner that needs no
translation or conversion. Microsoft gives this example in C on how to
access the data in the buffer (taken from their document on the FAT32 spec):

FAT32ClusEntryVal = (*((unsigned long *) &Buffer[ThisFATEntOffset]));

What does the (*((unsigned long *) mean? Could someone explain this?

The &Buffer[ThisFATEntOffset] is pretty self-explanatory. The cast to the
left of it confuses me.

I like to try and understand what code is doing rather than just copying and
pasting and hoping for the best.

Paul
 
CBFalconer

Joona said:
Casting the address of your unsigned long into a pointer to unsigned
char gives you a pointer that you can use to iterate through the bytes
of your unsigned long. Now just iterate sizeof(unsigned long) bytes
through both your unsigned long and the appropriate location in the
buffer, copying the bytes from the buffer to the unsigned long.
Any pointer type can be safely cast into a pointer to unsigned char, but
how the individual bytes make up the value is implementation-defined.

Bad habits to get into. This introduces unnecessary dependencies
on the internal byte sex and byte size. The only thing that should
count is the format in the buffer.
 
Joona I Palaste

Bad habits to get into. This introduces unnecessary dependencies
on the internal byte sex and byte size. The only thing that should
count is the format in the buffer.

Yes I agree they're bad habits, but for some reason the OP specifically
asked for this.
 
Michael Mair

Paul said:
I should better explain what I am trying to do with this C code.

I have some data stored in a buffer that I have previously
malloc()'d. The data contains cluster values for the FAT (file allocation
table).

On an Intel-based system (which I am writing the C code under), the
endianness is fine: the data is stored in a manner that needs no
translation or conversion. Microsoft gives this example in C on how to
access the data in the buffer (taken from their document on the FAT32 spec):

FAT32ClusEntryVal = (*((unsigned long *) &Buffer[ThisFATEntOffset]));

What does the (*((unsigned long *) mean? Could someone explain this?

The &Buffer[ThisFATEntOffset] is pretty self-explanatory. The cast to the
left of it confuses me.

The cast means that the address of Buffer[....] is treated as if it was
the address of an unsigned long value.
The * outside the parentheses just gives you the value of the variable
at this address; as we claim that it is unsigned long, you get the
unsigned long value derived from the sizeof(unsigned long) bytes
starting at the address.

Short: Take address -- treat it as unsigned long * -- get the value
the pointer points to.
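
The three steps can be written out one per line. In this sketch (the function name read_via_cast is hypothetical) the access is only well defined if the bytes really belong to a properly aligned unsigned long object; that holds for the usage shown afterwards but is not guaranteed for an arbitrary offset into a char buffer.

```c
/* Hypothetical demonstration of the three steps in the expression
 * *((unsigned long *) &Buffer[offset]).
 * Assumption: addr really points at a properly aligned unsigned long
 * object; otherwise the dereference is undefined behaviour. */
unsigned long read_via_cast(char *addr)
{
    char *byte_address = addr;                  /* 1. take the address   */
    unsigned long *as_ulong =
        (unsigned long *)byte_address;          /* 2. treat it as
                                                      unsigned long *    */
    return *as_ulong;                           /* 3. fetch the value it
                                                      points to          */
}
```

For example, with unsigned long storage = 0xDEADBEEFUL; the call read_via_cast((char *)&storage) returns 0xDEADBEEF, because the pointer is simply converted back to the type of the object it came from.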

I like to try and understand what code is doing rather than just copying and
pasting and hoping for the best.

Excellent.


Cheers
Michael
 
Michael Mair

Paul said:
somevalue=(unsigned long)buffer[10];

If anything: somevalue = *( (unsigned long *)&buffer[10] );
However, this is a Bad Idea and not guaranteed to work as the
alignment requirements for long may not be fulfilled.

If endianness is not a concern in this situation, and I have told the
compiler that everything is to be byte aligned, should this work ok?

Well, you asked for the Right Way to do it, and the Right Way
cannot assume any of this beforehand.
If you say that you want to do it a little bit right, then (with
byte alignment and endianness taken care of) yes, this should
work OK.
Essentially, it is the same thing as the suggested solution you
presented elsethread.


Cheers
Michael
 
Jens.Toerring

Paul said:
I should better explain what I am trying to do with this C code.
I have some data stored in a buffer that I have previously
malloc()'d. The data contains cluster values for the FAT (file allocation
table).
On an Intel-based system (which I am writing the C code under), the
endianness is fine: the data is stored in a manner that needs no
translation or conversion. Microsoft gives this example in C on how to
access the data in the buffer (taken from their document on the FAT32 spec):
FAT32ClusEntryVal = (*((unsigned long *) &Buffer[ThisFATEntOffset]));
What does the (*((unsigned long *) mean? Could someone explain this?
The &Buffer[ThisFATEntOffset] is pretty self-explanatory. The cast to the
left of it confuses me.

Think about what type '&Buffer[ThisFATEntOffset]' has. It's the address
of a char. So if you dereferenced that directly you would get the
value of the single char that's stored at that address, converted
to the type you have on the left-hand side of the assignment. But you
don't want that char value, you want sizeof(unsigned long) chars
(probably 4 on the platform it was written for), starting at that
address and taken together as an unsigned long value. For that reason
you must tell the compiler that '&Buffer[ThisFATEntOffset]' should be
treated as if it were not a char pointer but a pointer to unsigned
long:

(unsigned long *) &Buffer[ThisFATEntOffset]

When you now dereference this by putting the asterisk in front of it,
the code assumes that the compiler will give you the value stored
in the (probably 4) bytes at that offset, interpreted as an unsigned
long.

The problem with this is that while it probably will run without
problems on an Intel machine it may fail on other architectures. And
that's due to alignment issues. On several non-Intel architectures
an (unsigned) long can only start at addresses that can be divided
by either 2 or 4 (or possibly 8). On the other hand, a char array
can start at any address. Therefore, '&Buffer[ThisFATEntOffset]' may
be at an address at which an unsigned long can't start and when you
force the machine to try anyway by casting you get a bus error. That's
why people have been telling you not to do that but use memcpy()
instead. Using memcpy() is the only way where you are guaranteed that
it will work correctly on all machines. The programmers at Microsoft
of course don't have to care since all they support are Intel-like
architectures, so they can get away with doing it.
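
To make the alignment point concrete, here is a rough sketch of how one might test whether an address is suitable for an unsigned long. It rests on several assumptions: a C11 compiler (for _Alignof), an implementation that provides uintptr_t, and the usual behaviour where converting a pointer to an integer yields its numeric address. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: report whether p is suitably aligned for an
 * unsigned long. Assumptions: C11 _Alignof is available, uintptr_t
 * exists, and the pointer-to-integer conversion gives the numeric
 * address (implementation-defined, but true on common platforms). */
int is_aligned_for_ulong(const void *p)
{
    return (uintptr_t)p % _Alignof(unsigned long) == 0;
}
```

The address of a real unsigned long object always passes this test, while one byte past it fails on any platform where unsigned long needs more than 1-byte alignment, which is why an arbitrary offset into a char buffer is not safe to cast and dereference.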

Regards, Jens
 
Paul

Michael Mair said:
Paul said:
somevalue=(unsigned long)buffer[10];

If anything: somevalue = *( (unsigned long *)&buffer[10] );
However, this is a Bad Idea and not guaranteed to work as the
alignment requirements for long may not be fulfilled.

If endianness is not a concern in this situation, and I have told the
compiler that everything is to be byte aligned, should this work ok?

Well, you asked for the Right Way to do it. This means that we
do not know this beforehand.
If you say that you want to do it a little bit right, then (with
byte alignment and endianness taken care of) yes, this should
work ok.
Essentially, it is the same thing as you presented as suggested
solution elsethread.


Cheers
Michael

That's fine :) I was happy to see the options available to me.
I was interested to see what others thought was the best way to go about it.

I'm happy with the answers I received :)

Thanks to all!
 
CBFalconer

Joona said:
CBFalconer scribbled the following:
.... snip ...


Yes I agree they're bad habits, but for some reason the OP
specifically asked for this.

So what? Do you feed an infant candy because it wants it? And I
think you misread his request in the first place.
 
CBFalconer

.... snip ...

The problem with this is that while it probably will run without
problems on an Intel machine it may fail on other architectures. And
that's due to alignment issues. On several non-Intel architectures
an (unsigned) long can only start at addresses that can be divided
by either 2 or 4 (or possibly 8). On the other hand, a char array
can start at any address. Therefore, '&Buffer[ThisFATEntOffset]' may
be at an address at which an unsigned long can't start and when you
force the machine to try anyway by casting you get a bus error. That's
why people have been telling you not to do that but use memcpy()
instead. Using memcpy() is the only way where you are guaranteed that
it will work correctly on all machines. The programmers at Microsoft
of course don't have to care since all they support are Intel-like
architectures, so they can get away with doing it.

No no no. Don't use memcpy. Don't make unwarranted assumptions
about the byte sex and byte size in the buffer. Don't assume it
agrees with the same quantities on your machine.
 
Jens.Toerring

CBFalconer said:
The problem with this is that while it probably will run without
problems on an Intel machine it may fail on other architectures. And
that's due to alignment issues. On several non-Intel architectures
an (unsigned) long can only start at addresses that can be divided
by either 2 or 4 (or possibly 8). On the other hand, a char array
can start at any address. Therefore, '&Buffer[ThisFATEntOffset]' may
be at an address at which an unsigned long can't start and when you
force the machine to try anyway by casting you get a bus error. That's
why people have been telling you not to do that but use memcpy()
instead. Using memcpy() is the only way where you are guaranteed that
it will work correctly on all machines. The programmers at Microsoft
of course don't have to care since all they support are Intel-like
architectures, so they can get away with doing it.
No no no. Don't use memcpy. Don't make unwarranted assumptions
about the byte sex and byte size in the buffer. Don't assume it
agrees with the same quantities on your machine.

With "guaranteed to work correctly" I just meant that it is the
only way to get the data out of the buffer cleanly in a larger
chunk than a single char, not that the value would necessarily make
any sense. Of course, if one has to deal with binary data one must
check byte order and size, and only if they fit should one
use memcpy() to get at the data without getting into trouble with
alignment issues.
Regards, Jens
 
CBFalconer

CBFalconer said:
Jens.Toerring wrote:
The problem with this is that while it probably will run without
problems on an Intel machine it may fail on other architectures. And
that's due to alignment issues. On several non-Intel architectures
an (unsigned) long can only start at addresses that can be divided
by either 2 or 4 (or possibly 8). On the other hand, a char array
can start at any address. Therefore, '&Buffer[ThisFATEntOffset]' may
be at an address at which an unsigned long can't start and when you
force the machine to try anyway by casting you get a bus error. That's
why people have been telling you not to do that but use memcpy()
instead. Using memcpy() is the only way where you are guaranteed that
it will work correctly on all machines. The programmers at Microsoft
of course don't have to care since all they support are Intel-like
architectures, so they can get away with doing it.
No no no. Don't use memcpy. Don't make unwarranted assumptions
about the byte sex and byte size in the buffer. Don't assume it
agrees with the same quantities on your machine.

With "guaranteed to work correctly" I just meant that that is the
only way to get the data out of the buffer cleanly in a larger
chunk than a single char, not that the value would necessarily make
any sense. Of course, if one has to deal with binary data one must
check byte order and size, and only if that fits then one should
use memcpy() to get at the data without getting into trouble with
alignment issues.

But that is the point - you don't check "if that fits" - you make
the code dependent only on the buffer format, which should be well
defined. Don't get into evil habits, or we will have to call you
Schildt. :)
 
Chris Torek

I should better explain what I am trying to do with this C code.
... On an intel based system ... Microsoft give this example in C
on how to access the data in the buffer ...

FAT32ClusEntryVal = (*((unsigned long *) &Buffer[ThisFATEntOffset]));

Others have explained the cast, but just to reiterate, a cast
-- which is just a type-name enclosed in parentheses, such as

(unsigned long *)

-- means "take a value, and convert it to a new (and possibly very
different) value, as if by assignment to a temporary variable whose
type is given by the cast". The conversion happens just as for
ordinary assignment, except that if there is something inherently
suspicious -- or even seriously wrong -- with that conversion, the
compiler should do its best to do it anyway without complaint.

As such, casts are quite powerful and should be used with care.
Think of them as being like nitroglycerine: in small quantities,
it can even save your life (cardiac patients use it to avoid
heart attacks), and when treated with care, it can be very handy,
but it can also blow your fingertips right off. :)

Having converted the pointer value to a new one of type
"unsigned long *", the unary "*" at the front of the whole
expression follows the new pointer, retrieving an "unsigned long"
from that address -- or crashing, or doing something else bad,
if you are not on an Intel x86 and the address is not aligned.
Of course, the result is machine-dependent, because FAT32
entries are written as four little-endian octets, regardless of
the underlying machine's representation for an "unsigned long",
which brings me to the thing no one else has mentioned yet in
the part of this thread that made it to my news server....

Since a FAT32 value is always between 0x00000000 and 0x0fffffff,
the type "unsigned long" is guaranteed to be big enough to hold
it. Unfortunately, it might well be *too* big. So this
expression is not only machine-endianness-dependent, but also
"machine uses 32-bit long"-dependent. Some C compilers for 64-bit
architectures (including x86-64) are now starting to have 64-bit
"long"s. Thus, even restricting oneself to Intel CPUs, this
expression is rather inadvisable. Building up a 32-bit value
from four 8-bit octets, using shift-and-mask code, will work
on machines other than the Intel. (The built-up value can be
safely stored in any "unsigned long", because all C implementations
are required to have a ULONG_MAX that is at least 0xffffffff,
though it may be larger, as on those 64-bit systems.)
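
As a sketch of the shift-and-mask approach for this particular case: the code below assembles one FAT32 entry from four little-endian octets. The function name fat32_entry is hypothetical, and the final & 0x0fffffffUL is an assumption based on the value range stated above (the top four bits are treated as not part of the cluster value).

```c
#include <stddef.h>

/* Hypothetical helper: assemble the FAT32 entry stored in
 * buf[offset] .. buf[offset + 3] from four little-endian octets.
 * The & 0xff keeps only 8 bits per byte even if CHAR_BIT > 8, and the
 * final mask keeps the result in the 0x00000000..0x0fffffff range
 * (an assumption, see the lead-in). The caller must ensure that
 * offset + 3 lies inside the buffer. */
unsigned long fat32_entry(const unsigned char *buf, size_t offset)
{
    unsigned long value;

    value  = (unsigned long)(buf[offset]     & 0xff);
    value |= (unsigned long)(buf[offset + 1] & 0xff) << 8;
    value |= (unsigned long)(buf[offset + 2] & 0xff) << 16;
    value |= (unsigned long)(buf[offset + 3] & 0xff) << 24;
    return value & 0x0fffffffUL;
}
```

Because the value is built arithmetically, the result is the same on little-endian and big-endian hosts and with 32-bit or 64-bit long, and no unsigned long is ever read from a possibly misaligned address.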
 
Jack Klein

Michael Mair said:
somevalue=(unsigned long)buffer[10];

If anything: somevalue = *( (unsigned long *)&buffer[10] );
However, this is a Bad Idea and not guaranteed to work as the
alignment requirements for long may not be fulfilled.

If endianness is not a concern in this situation, and I have told the
compiler that everything is to be byte aligned, should this work ok?

C has no concept of "byte aligned", for anything other than bytes. If
your compiler provides some sort of non-standard extension like this,
ask in a support group for that compiler.

It's guaranteed to generate an address abort trap on an ARM processor,
no matter what you tell the compiler.
 
