John Williams
I've been spending some time expanding my rather primitive knowledge of
c++ through various exercises. One of them that I've been working on is
learning a bit about how compression works. I've started with a huffman
compression algorithm and ran into this example code to get me started:
http://www.flipcode.com/cgi-bin/fcarticles.cgi?show=64109
Unfortunately I have some questions as to why certain things were done
or what they mean, as well as whether this might just be a very bad
example to try to glean anything from before writing my own implementation.
The first thing I had issues with was his class declaration.
class cHuffmanNode
{
    friend class cHuffmanTree;

private:
    U8 ch;
    U8 reserved[3];
    U32 frequency;
    class cHuffmanNode *parent,
                       *left_child,
                       *right_child;

public:
    cHuffmanNode ()
    {
        ch = 0;
        frequency = 0;
        reserved[0] = reserved[1] = reserved[2] = 0;
        parent = left_child = right_child = NULL;
    }

    int operator () (const cHuffmanNode& a, const cHuffmanNode& b) const;
    int operator () (const cHuffmanNode* a, const cHuffmanNode* b) const;
};
He defines a few variables in a way I've never seen before (and correct
me if I'm misinterpreting anything): declaring them to be of type U8 or
U32. From the rest of his code I gather that U8 is a 1-byte variable
and U32 a 4-byte one, which obviously makes some sense... but what does
one gain by using these types? I didn't see any typedefs in the code,
so they don't appear to just be aliases for the normal chars and ints.
So what are they? Are they considered normal in any way, and if they
are at all normal, why would you use them in place of the more usual
types?
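For what it's worth, my best guess is that the typedefs simply live in a
header that isn't shown in the article. This is just a sketch of what I
imagine they might look like (the U8/U32 names are his; the <cstdint>
aliases are how I'd write it today):

    #include <cstdint>

    // Guessed typedefs -- not from the article. Many codebases define
    // fixed-width aliases like these, usually on top of <cstdint>, so
    // the on-disk/in-memory sizes are the same on every platform
    // (plain char and int make no such guarantee).
    typedef std::uint8_t  U8;   // exactly 1 byte, unsigned
    typedef std::uint32_t U32;  // exactly 4 bytes, unsigned

    // Sanity checks: the sizes match what the Huffman code assumes.
    static_assert(sizeof(U8)  == 1, "U8 should be 1 byte");
    static_assert(sizeof(U32) == 4, "U32 should be 4 bytes");

That would also explain the reserved[3] member: with a 1-byte ch
followed by a 4-byte frequency, the three reserved bytes pad the struct
to a 4-byte boundary explicitly.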
Secondly, and this is what makes me fear that this might just be a bad
example to try to learn anything from:
bool cHuffmanTree::write_bit (bool bit, U8& theByte, bool reset)
{
    static long bit_index = 7;
    static U8 byte = 0x00;
    bool flush = false;

    if (reset)
    {
        bit_index = 7;
        byte = 0x00;
        return false;
    }
    if (bit_index == -1)
    {
        // flush byte
        bit_index = 7;
        byte = 0x00;
    }
    //byte |= (bit ? 1 : 0) << bit_index;
    byte |= bit << bit_index;
    bit_index--;
    if (bit_index == -1)
    {
        theByte = byte;
        flush = true;
    }
    return flush;
}
He declares his bit_index as type long when it appears it should be
bounded by -1 and 7. That might not be horrible (or it might even have
a purpose that someone more experienced could explain), but I would
think it should be declared as a smaller type. In this example it
probably wouldn't cause any real issues, but as a habit in more complex
programs I would think it could lead to increased overhead and poor
memory usage. Unless there is some reason it's safer/better/necessary
to do it this way that I'm missing?
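What bothers me more than the long is the static local state: a second
tree, or a second thread, would corrupt the partially filled byte. If I
were to rewrite it myself, I imagine something like this (just my own
sketch, not from the article, and it appends to a vector instead of
handing back one byte at a time):

    #include <cstdint>
    #include <vector>

    // Same bit-packing idea as his write_bit, but the state lives in a
    // small class instead of static locals, so each encoder instance
    // keeps its own partial byte.
    class BitWriter {
    public:
        // Append one bit (MSB first); once 8 bits have accumulated,
        // push the completed byte onto `out` and start a fresh one.
        void write_bit(bool bit, std::vector<std::uint8_t>& out) {
            byte_ |= static_cast<std::uint8_t>(bit) << bit_index_;
            if (--bit_index_ < 0) {
                out.push_back(byte_);
                byte_ = 0;
                bit_index_ = 7;
            }
        }

    private:
        int bit_index_ = 7;        // counts 7 down to 0 within a byte
        std::uint8_t byte_ = 0;    // the byte being assembled
    };

So writing a 1 followed by seven 0s would produce the single byte 0x80,
and there's no reset flag to remember because a fresh BitWriter starts
clean.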