weird variable declarations


John Williams

I've been spending some time expanding my rather primitive knowledge of
C++ through various exercises. One of them is learning a bit about how
compression works. I've started with a Huffman compression algorithm and
ran into this example code to get me started:
http://www.flipcode.com/cgi-bin/fcarticles.cgi?show=64109

Unfortunately I have some questions about why certain things were done
or what they mean, and about whether this might just be a very bad
example to try to glean anything from before writing my own implementation.

The first thing I had issues with was his class declarations.

class cHuffmanNode
{
    friend class cHuffmanTree;

private:
    U8 ch;
    U8 reserved[3];
    U32 frequency;
    class cHuffmanNode *parent,
                       *left_child,
                       *right_child;

public:

    cHuffmanNode ()
    {
        ch=0;
        frequency=0;
        reserved[0]=reserved[1]=reserved[2]=0;
        parent=left_child=right_child=NULL;
    }

    int operator () (const cHuffmanNode& a, const cHuffmanNode& b) const;
    int operator () (const cHuffmanNode* a, const cHuffmanNode* b) const;
};

He defines a few variables in a way I've never seen before (and correct
me if I'm misinterpreting anything): he declares them as type U8 or U32.
From the rest of his code I gather that U8 defines a variable of, say,
1 byte and U32 one of 4 bytes, which obviously makes some sense... but
what does one gain by using these types? I didn't see any typedefs in the
code, so they don't appear to be just aliases for the normal chars and
ints. So what are these? Are they considered normal in any way, and if
they are, why would you use them in place of the more usual types?

Secondly, and this is what makes me fear that this might just be a bad
example to try to learn anything from:

bool cHuffmanTree::write_bit (bool bit, U8& theByte, bool reset)
{
    static long bit_index = 7;
    static U8 byte = 0x00;

    bool flush = false;

    if (reset)
    {
        bit_index=7;
        byte = 0x00;
        return false;
    }

    if(bit_index == -1)
    {
        // flush byte
        bit_index = 7;
        byte = 0x00;
    }

    //byte |= (bit ? 1 : 0) << bit_index;
    byte |= bit << bit_index;
    bit_index--;

    if(bit_index == -1)
    {
        theByte = byte;
        flush=true;
    }
    return flush;
}

He declares his bit_index as type long when it appears it should be
bounded by -1 and 7. Now it might not be horrible (or it might even
have a purpose that someone more experienced could explain), but I would
think that this should be declared as a smaller type? In this example
it probably wouldn't cause any real issues, but as a habit in more
complex programs I would think it would lead to increased overhead, and
poor memory usage. Unless there is some reason it's
safer/better/necessary to do it this way that I'm missing?
 

Victor Bazarov

John said:
I've been spending some time expanding my rather primitive knowledge
of c++ through various exercises. [...]
The first thing I had issues with was his class declarations.

class cHuffmanNode
{
    friend class cHuffmanTree;

private:
    U8 ch;
    U8 reserved[3];
    U32 frequency;
    class cHuffmanNode *parent,
                       *left_child,
                       *right_child;

public:

    cHuffmanNode ()
    {
        ch=0;
        frequency=0;
        reserved[0]=reserved[1]=reserved[2]=0;
        parent=left_child=right_child=NULL;
    }

    int operator () (const cHuffmanNode& a, const cHuffmanNode& b) const;
    int operator () (const cHuffmanNode* a, const cHuffmanNode* b) const;
};

He defines a few variables in a way I've never seen before (and
correct me if I'm misinterpreting anything): he declares them as
type U8 or U32. From the rest of his code I gather that U8 defines
a variable of, say, 1 byte and U32 one of 4 bytes, which obviously
makes some sense... but what does one gain by using these types?

On different implementations or systems even the same type (say, 'int')
can have different sizes (and bit counts). To always use a specific
size in bits for something (I am not questioning the need, mind you),
one would define custom types and use them.
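
A minimal sketch of what such custom types might look like. The article's own definitions are not shown in this thread, so the typedefs below are an assumption, not the author's code. The sketch leans on the C++11 header <cstdint>; with an older compiler one would instead pick built-in types per platform (for example unsigned char and unsigned int) and check their sizes by hand. The helper bit_width_of is a hypothetical name introduced only for illustration.

```cpp
#include <cassert>   // assert, used in the usage note below
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::uint8_t, std::uint32_t (C++11)

// Hypothetical aliases: the real header from the article is not shown.
typedef std::uint8_t  U8;   // exactly 8 bits where the platform provides it
typedef std::uint32_t U32;  // exactly 32 bits where the platform provides it

// Compile-time checks: the build fails if the assumptions do not hold.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert(sizeof(U8) == 1, "U8 must occupy exactly one byte");
static_assert(sizeof(U32) * CHAR_BIT == 32, "U32 must be 32 bits wide");

// Hypothetical helper: storage width of T in bits.
template <typename T>
constexpr int bit_width_of()
{
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}
```

With these definitions, assert(bit_width_of<U8>() == 8) and assert(bit_width_of<U32>() == 32) both hold, whatever size plain 'int' happens to have on the platform.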
I didn't see any typedefs in the code, so they don't appear to be
just aliases for the normal chars and ints.

If you didn't see the typedefs, it doesn't mean they aren't there.
So what are these? Are they considered normal in any way, and if
they are, why would you use them in place of the more usual types?

They are *most likely* typedefs. Look more carefully and you'll find them.
Secondly, and this is what makes me fear that this might just be a bad
example to try to learn anything from:

It might be a bad example, but for different reasons.
bool cHuffmanTree::write_bit (bool bit, U8& theByte, bool reset)
{
    static long bit_index = 7;
    static U8 byte = 0x00;

The use of hex notation here is not justified. And there really is no
need to initialise a _static_ integral value to 0; that is the default.
    bool flush = false;

    if (reset)
    {
        bit_index=7;
        byte = 0x00;
        return false;
    }

    if(bit_index == -1)
    {
        // flush byte
        bit_index = 7;
        byte = 0x00;
    }

    //byte |= (bit ? 1 : 0) << bit_index;
    byte |= bit << bit_index;
    bit_index--;

    if(bit_index == -1)
    {
        theByte = byte;
        flush=true;
    }
    return flush;
}
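
On the remark about static initialisers: a small sketch showing that a function-local static integer with no initialiser starts at zero. The function next_count is a hypothetical name used only for this illustration; it is not part of the article's code.

```cpp
#include <cassert>   // assert, used in the usage note below

// Hypothetical helper: counts how many times it has been called.
unsigned char next_count()
{
    // No initialiser: objects with static storage duration are
    // zero-initialised before first use, so this starts at 0.
    static unsigned char count;

    // The value persists between calls, as 'byte' and 'bit_index'
    // do in the quoted write_bit.
    return ++count;
}
```

The first call returns 1 and the second returns 2, so assert(next_count() == 1) followed by assert(next_count() == 2) passes. An ordinary (non-static) local without an initialiser gives no such guarantee.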

He declares his bit_index as type long when it appears it should be
bounded by -1 and 7. Now it might not be horrible (or it might even
have a purpose that someone more experienced could explain), but I
would think that this should be declared as a smaller type?

I would argue that 'int' is sufficient, yes.
In this example it probably wouldn't cause any real issues, but as a
habit in more complex programs I would think it would lead to increased
overhead, and poor memory usage. Unless there is some reason it's
safer/better/necessary to do it this way that I'm missing?

If 'bit_index' were a member of some data structure, it might make
a difference. However, when it's a local object, there is probably no
reason for concern, unless this is all supposed to work really fast
even on a 16-bit processor (where operations with 'long' could be
a tad more expensive). Even then, you should measure first and only
optimize what matters.
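
To illustrate the data-structure case, here is a rough sketch comparing two hypothetical structs that hold the same state as write_bit. The names WideState, NarrowState and bytes_for are assumptions made up for this example. The exact sizes depend on the platform's padding and on the size of 'long', so the only claim made here is that the narrow version is smaller on common implementations.

```cpp
#include <cassert>   // assert, used in the usage note below
#include <cstddef>   // std::size_t

// Hypothetical: the state kept as in the quoted code, with a 'long' index.
struct WideState
{
    long          bit_index;  // only ever holds -1..7
    unsigned char byte;
};

// Hypothetical: the same state with the smallest type covering -1..7.
struct NarrowState
{
    signed char   bit_index;  // signed char is guaranteed to hold -127..127
    unsigned char byte;
};

// Hypothetical helper: storage needed for an array of n such objects.
template <typename T>
std::size_t bytes_for(std::size_t n)
{
    return n * sizeof(T);
}
```

On typical platforms assert(sizeof(NarrowState) < sizeof(WideState)) holds, and the gap grows linearly with the number of objects. For a single local or static variable, as in write_bit, the difference is a handful of bytes at most.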

V
 

SasQ

On Mon, 19 Mar 2007 13:23:45 -0400, Victor Bazarov wrote:
On different implementations or systems even the same type
(say, 'int') can have different sizes (and bit counts).
To always use a specific size in bits for something (I am
not questioning the need, mind you), one would define
custom types and use them.

But from what might one define custom types, if the sizes of
all built-in types are "swampy ground" to build on? What should
one use to achieve sizes defined precisely to the one bit?
A bit field? Or maybe something else?
 

Victor Bazarov

SasQ said:
On Mon, 19 Mar 2007 13:23:45 -0400, Victor Bazarov wrote:

But from what might one define custom types, if the sizes of
all built-in types are "swampy ground" to build on? What should
one use to achieve sizes defined precisely to the one bit?
A bit field? Or maybe something else?

It depends on the problem to be solved. It depends on the set
of platforms to which the code is to be ported. It depends on
how many (or how few) concessions the programmer is willing to
make. There is no hard and fast answer.
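
As one data point for the bit-field option raised in the question, here is a small sketch. The struct ThreeBits and the function store_in_three_bits are hypothetical names for illustration only. An unsigned bit-field does give a value range defined to the bit, but how the field is laid out inside the enclosing object is implementation-defined, so this does not by itself pin down a storage or wire format.

```cpp
#include <cassert>   // assert, used in the usage note below

// Hypothetical type: one unsigned field exactly 3 bits wide (values 0..7).
struct ThreeBits
{
    unsigned int value : 3;
};

// Hypothetical helper: store x in a 3-bit field and read it back.
unsigned int store_in_three_bits(unsigned int x)
{
    ThreeBits t;
    t.value = x;      // unsigned conversion: reduced modulo 2^3
    return t.value;
}
```

For example, assert(store_in_three_bits(7) == 7) and assert(store_in_three_bits(9) == 1) both hold, since 9 modulo 8 is 1. Note that sizeof(ThreeBits) is still at least one whole byte.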

It basically comes back to the same question: "why does C or C++ not
have (or did not have, if we take C99) any integral types of fixed
sizes, like 16 bits, 32 bits, et cetera?" It does not (or did
not) have them because the most generic version of every
algorithm that can be expressed in the language's terms never
requires (required) fixed-size types. The need for those first
appeared when the language was used to describe interaction
with hardware. And then somebody decided that this quite
non-portable area needed to be added to the language. Why?...

What kind of general problem requires "sizes defined precisely
to the one bit"? Just curious.

V
 

SasQ

On Mon, 19 Mar 2007 17:29:05 -0400, Victor Bazarov wrote:
It does not (or did not) have them because the most generic
version of every algorithm that can be expressed in the
language terms never requires (required) fixed-sized types.

I agree with that and understand.
But there is also the other side of the Moon ;J
The need for those first appeared when the language was
used to describe interaction with hardware. And then
somebody decided that this quite non-portable area
needed to be added to the language. Why?...

Speed and access to hardware controlling registers.
What kind of general problem requires "sizes defined
precisely to the one bit"? Just curious.

1. Encryption algorithms.
I once implemented the AES [Rijndael] encryption
algorithm, which is defined in terms of precise bit
sizes [e.g. 128-bit matrices]. I had to implement it
using awful low-level code based on unportable
assumptions about my particular platform [but it was
for a school exercise, so who cares ;P].
2. Compression algorithms.
They often involve twiddling particular bits.
3. Controlling hardware devices through memory-mapped registers.
[though bit-fields may be usable here, I think].
4. Multimedia.
This is related to compression algorithms too.
5. Networking.
Networking protocols are precisely defined, sometimes
to the one bit [e.g. octets, packet formats, etc.]

So? What is the best and the most elegant approach to
deal with that kind of programming?
 

Victor Bazarov

SasQ said:
On Mon, 19 Mar 2007 17:29:05 -0400, Victor Bazarov wrote:
[..] What is the best and the most elegant approach to
deal with that kind of programming?

Did you read the first paragraph of my reply to which you're
responding? It depends. There is no *single* "best and the most
elegant" approach.

V
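
For the networking and file-format cases listed earlier in the thread, one commonly used approach (among several, in keeping with "it depends") is to build the octets explicitly with shifts and masks rather than relying on how a built-in type sits in memory. The sketch below is an illustration under that assumption; put_u32_be and get_u32_be are hypothetical names, and std::uint32_t assumes a C++11 <cstdint>.

```cpp
#include <cassert>   // assert, used in the usage note below
#include <cstdint>   // std::uint32_t (C++11)

// Hypothetical helper: write a 32-bit value as four octets,
// most significant octet first (big-endian / network order).
void put_u32_be(std::uint32_t value, unsigned char out[4])
{
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[3] = static_cast<unsigned char>(value & 0xFFu);
}

// Hypothetical helper: rebuild the value from four big-endian octets.
std::uint32_t get_u32_be(const unsigned char in[4])
{
    return (static_cast<std::uint32_t>(in[0] & 0xFFu) << 24)
         | (static_cast<std::uint32_t>(in[1] & 0xFFu) << 16)
         | (static_cast<std::uint32_t>(in[2] & 0xFFu) << 8)
         |  static_cast<std::uint32_t>(in[3] & 0xFFu);
}
```

Writing 0x12345678 produces the octets 0x12, 0x34, 0x56, 0x78 in that order, and reading them back yields 0x12345678 again, independent of the host's byte order or the size of 'int'.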
 
