Using exact-size structs to go thru raw byte buffers

toe · Feb 22, 2008

Assume we're working on a system where CHAR_BIT == 8.

Let's say we have a raw byte buffer in memory:

char unsigned data[112];

Within this buffer is data that you got from your network card, an
ethernet frame to be exact. An ethernet frame is laid out as follows:

First 6 octets: Destination MAC address
Second 6 octets: Source MAC address
Next two octets: Protocol

In order to analyse the ethernet frame, I was thinking that maybe we
could make an exact-size struct as follows:

struct FrameHeader {
    uint8 dest[6],src[6];
    uint16 proto;
};

(I realise that we'd need a special compiler that will allow us to
specify no padding between members. Also I realise we'd have to be
careful about alignment).

And then do the following:

if ( 0x800 == ((struct FrameHeader const*)data)->proto )
    puts("Contains an IP packet");

So far, I believe we have two issues:
1) The alignment of "proto"
2) The byte order of "proto"

Firstly, to get around the byte order issue, I was thinking of
changing the structure to:

struct FrameHeader {
    uint8 dest[6],src[6];
    uint8 proto[2];
};

And then making a macro function to turn a "uint8[2]" into a "uint16"
using BigEndian:

#define OCTETS_TO_16(p) ( (uint16)*(p) << 8 | (p)[1] )

so that we could do:

if ( 0x800 == OCTETS_TO_16( ((struct FrameHeader const*)data)->proto ) )
    puts("Contains an IP packet");

Does this sound good?

The program that's being written is a network protocol analyser. I
myself am not writing it, but I've been asked to give a little advice.
The program is being written for MS Windows, but since the person's
using a cross-platform library for networking, I think they might try
to get it to compile for Linux and Mac as well.

On these three OS's, are there any alignment requirements for integer
types, and will the program crash if we try to access a misaligned
integer?

Also, is endianness determined by the CPU, or is it determined by the
OS? Does anyone know what the endiannesses are for the common CPU's
and OS's?

Any tips appreciated.

toe · Feb 22, 2008

Just as an aside, some of you may remember that I posted recently
looking for a fully-portable implementation of the SHA-1 algorithm. I
had some code which was supposedly fully-portable, but when I ran it
on a Sun Solaris machine it gave me the wrong answer. It didn't crash
or anything, it just gave me a wrong answer. The reason it was wrong
is that the code assumed the machine to be little-endian (which is
what Intel x86 machines are -- and yes by the way I did just Google
that 60 seconds ago), whereas the Sun machines are big-endian.

CBFalconer · Feb 22, 2008

Just as an aside, some of you may remember that I posted recently
looking for a fully-portable implementation of the SHA-1 algorithm. I
had some code which was supposedly fully-portable, but when I ran it
on a Sun Solaris machine it gave me the wrong answer. It didn't crash
or anything, it just gave me a wrong answer. The reason it was wrong
is that the code assumed the machine to be little-endian (which is
what Intel x86 machines are -- and yes by the way I did just Google
that 60 seconds ago), whereas the Sun machines are big-endian.

Then the implementation was NOT fully portable. Probably did some
unclean conversions between integers and bytes. Just a guess.
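
For what it's worth, a "clean" conversion usually means building the integer from individual bytes with shifts, so the host's byte order never enters into it. A minimal sketch, assuming C99's <stdint.h> is available; the names load_be32 and store_be32 are made up for illustration and are not taken from the SHA-1 code under discussion:

```c
#include <stdint.h>

/* Hypothetical helper: read a 32-bit big-endian value from 4 octets.
   Only shifts and ORs on byte values are used, so the result is the
   same on any host, whatever its byte order. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |
            (uint32_t)p[3];
}

/* Hypothetical helper: write a 32-bit value as 4 big-endian octets. */
static void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)((v >> 24) & 0xff);
    p[1] = (unsigned char)((v >> 16) & 0xff);
    p[2] = (unsigned char)((v >>  8) & 0xff);
    p[3] = (unsigned char)( v        & 0xff);
}
```

Code that instead casts an unsigned char pointer to an integer pointer and dereferences it gets whatever byte order the host happens to use, which would explain a wrong answer on a big-endian machine.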

Nick Keighley · Feb 22, 2008

Assume we're working on a system where CHAR_BIT == 8.

Possibly stick an assert in somewhere so people have it drawn to their
attention if this isn't so. Many on this newsgroup will tell you to
write the code so it doesn't make this assumption.
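
One way to do that is a compile-time check instead of a run-time assert, so the build stops on a platform where the assumption fails. A minimal sketch using only <limits.h>:

```c
#include <limits.h>

/* Refuse to compile where a char is not exactly 8 bits wide. */
#if CHAR_BIT != 8
#error "This code assumes CHAR_BIT == 8"
#endif
```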

Let's say we have a raw byte buffer in memory:

char unsigned data[112];

Within this buffer is data that you got from your network card, an
ethernet frame to be exact. An ethernet frame is laid out as follows:

First 6 octets: Destination MAC address
Second 6 octets: Source MAC address
Next two octets: Protocol

In order to analyse the ethernet frame, I was thinking that maybe we
could make an exact-size struct as follows:

struct FrameHeader {
    uint8 dest[6],src[6];
    uint16 proto;
};

(I realise that we'd need a special compiler that will allow us to
specify no padding between members. Also I realise we'd have to be
careful about alignment).

I tend not to be a fan of this technique. But in practice
if all the members are unsigned chars you should be ok.
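
A struct made only of unsigned char arrays has no member that needs more than byte alignment, but the standard still permits padding, so it seems safer to check the layout than to assume it. A sketch under that assumption; the two typedef names are hypothetical, and the negative-array-size trick stands in for a static assertion on pre-C11 compilers:

```c
#include <stddef.h>

/* Every member is an array of unsigned char, so no member requires
   more than byte alignment. */
struct FrameHeader {
    unsigned char dest[6];
    unsigned char src[6];
    unsigned char proto[2];
};

/* Compile-time layout checks: the array size becomes -1, which is a
   compile error, if the condition is false. */
typedef char frame_header_is_14_octets[
    sizeof(struct FrameHeader) == 14 ? 1 : -1];
typedef char proto_is_at_offset_12[
    offsetof(struct FrameHeader, proto) == 12 ? 1 : -1];
```

If either check fails to compile on some platform, the struct overlay cannot be used there and the code has to fall back to indexing the raw buffer.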

And then do the following:

if ( 0x800 == ((struct FrameHeader const*)data)->proto )
    puts("Contains an IP packet");

So far, I believe we have two issues:
1) The alignment of "proto"
2) The byte order of "proto"

Firstly, to get around the byte order issue, I was thinking of
changing the structure to:

struct FrameHeader {
    uint8 dest[6],src[6];
    uint8 proto[2];
};

Better.

And then making a macro function to turn a "uint8[2]" into a "uint16"
using BigEndian:

#define OCTETS_TO_16(p) ( (uint16)*(p) << 8 | (p)[1] )

so that we could do:

if ( 0x800 == OCTETS_TO_16( ((struct FrameHeader const*)data)->proto ) )
    puts("Contains an IP packet");

Does this sound good?

Reasonable approach.

The program that's being written is a network protocol analyser. I
myself am not writing it, but I've been asked to give a little advice.
The program is being written for MS Windows, but since the person's
using a cross-platform library for networking, I think they might try
to get it to compile for Linux and Mac as well.

On these three OS's, are there any alignment requirements for integer
types, and will the program crash if we try to access a misaligned
integer?

Probably. Alignment tends to be a hardware rather than an OS thing.
And Linux runs on a *lot* of hardware.
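
Where a multi-byte value does have to be read as an integer from an arbitrary offset, copying it out with memcpy sidesteps the alignment question, because the buffer is never dereferenced through a wider pointer type. A sketch, assuming <stdint.h> is available; read_u16_unaligned is a hypothetical name:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch a 16-bit value from any byte offset.
   memcpy copies byte by byte as far as the language is concerned, so
   no misaligned integer access takes place. The value comes back in
   whatever byte order the bytes are stored in, so byte order still
   has to be handled separately. */
static uint16_t read_u16_unaligned(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers commonly turn a small fixed-size memcpy like this into a plain load on hardware that tolerates misaligned access, so the portable form need not cost anything there.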

Also, is endianness determined by the CPU, or is it determined by the OS?

The CPU, though some CPUs make it selectable. Presumably the OS
decides in that case.

Does anyone know what the endiannesses are for the common CPU's and
OS's?

Any tips appreciated.

You have a special case here. Comms protocols usually specify the
byte order. The implementation then provides macros or functions
(htons(), ntohs() et al) to convert between platform (host) and
network (on-the-wire) byte order. If network and host order
correspond, the macros do nothing. To port, you just rewrite the
macros, or you auto-detect the byte order and then use the correct
macro.
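
The auto-detect idea can be sketched as follows. This is an illustration only, assuming <stdint.h> and a host that is either plain big-endian or plain little-endian; host_is_big_endian, net_to_host16 and decode_net16 are hypothetical names, with net_to_host16 standing in for what a platform's ntohs() does:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 1 on a big-endian host, 0 on a little-endian
   one. It looks at which byte of a known value is stored first.
   Mixed-endian hosts are not handled. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    return *(const unsigned char *)&probe == 0x01;
}

/* Hypothetical stand-in for ntohs(): convert a 16-bit value from
   network (big-endian) order to host order. Does nothing when the
   two orders already agree. */
static uint16_t net_to_host16(uint16_t v)
{
    if (host_is_big_endian())
        return v;
    return (uint16_t)(((v & 0x00ffu) << 8) | ((v & 0xff00u) >> 8));
}

/* Copy two octets off the wire, then fix up the byte order. */
static uint16_t decode_net16(const unsigned char *p)
{
    uint16_t raw;
    memcpy(&raw, p, sizeof raw);
    return net_to_host16(raw);
}
```

With the octets 0x08, 0x00 in the buffer, decode_net16 gives 0x0800 on either kind of host.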

Richard Bos · Feb 22, 2008

Let's say we have a raw byte buffer in memory:

char unsigned data[112];

Within this buffer is data that you got from your network card, an
ethernet frame to be exact. An ethernet frame is laid out as follows:

First 6 octets: Destination MAC address
Second 6 octets: Source MAC address
Next two octets: Protocol

In order to analyse the ethernet frame, I was thinking that maybe we
could make an exact-size struct as follows:

Why go to all that trouble? One thing which is guaranteed to work, as
long as your layout is correct and chars are indeed 8 bits, is

#define PROTOCOL 12
#if (ENDIAN)
#define RAW_I16(x,y) ((((int)(x)&0xff)<<8) + ((y)&0xff))
#else
#define RAW_I16(x,y) ((((int)(y)&0xff)<<8) + ((x)&0xff))
#endif

if (RAW_I16(buffer[PROTOCOL], buffer[PROTOCOL+1]) == 0x0800)
    puts("Contains an IP packet.");

christian.bau · Feb 22, 2008

Why go to all that trouble? One thing which is guaranteed to work, as
long as your layout is correct and chars are indeed 8 bits, is

#define PROTOCOL 12
#if (ENDIAN)
#define RAW_I16(x,y) ((((int)(x)&0xff)<<8) + ((y)&0xff))
#else
#define RAW_I16(x,y) ((((int)(y)&0xff)<<8) + ((x)&0xff))
#endif

if (RAW_I16(buffer[PROTOCOL], buffer[PROTOCOL+1]) == 0x0800)
    puts("Contains an IP packet.");

This looks very wrong. I would expect that the buffer, as an array of
unsigned char, contains exactly the same data, whether it is running
on a bigendian, littleendian or some other machine. If an IP packet is
defined by byte 12 = 0x08, byte 13 = 0x00, then you would take the
first of your two definitions for RAW_I16, no matter what your
implementation looks like.
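
To make that point concrete, here is a sketch that decodes the protocol field straight from the octets with no endianness conditional at all. The names ETHERTYPE_OFFSET, ETHERTYPE_IPV4 and frame_is_ipv4 are hypothetical and chosen only for this example:

```c
#include <stddef.h>

#define ETHERTYPE_OFFSET 12      /* follows the two 6-octet MAC addresses */
#define ETHERTYPE_IPV4   0x0800u

/* Returns 1 if the frame's protocol field says IPv4, otherwise 0.
   Octet 12 is always the high byte and octet 13 the low byte, because
   that is how the field appears on the wire; the host's own byte
   order plays no part. */
static int frame_is_ipv4(const unsigned char *frame, size_t len)
{
    unsigned proto;

    if (len < ETHERTYPE_OFFSET + 2)
        return 0;                /* too short to hold the field */

    proto = ((unsigned)frame[ETHERTYPE_OFFSET] << 8)
          | frame[ETHERTYPE_OFFSET + 1];
    return proto == ETHERTYPE_IPV4;
}
```

Called as frame_is_ipv4(data, sizeof data) on the buffer from the original question, this behaves identically on big-endian and little-endian machines.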
