Sure. The image format is PNG. I deliberately didn't mention that
originally to make sure that my question wouldn't accidentally be
dismissed as off-topic.
Understandable. Just FYI, there are libraries that make dealing with
PNG easier, although no library is required to use this open format.
The machines this code runs on will in practice most likely have 8 bit
chars, but I'd rather not rely on any such implementation-specific
details if I can avoid it.
I see no reason to worry about implementation-specific details
regarding the use of PNG either. The only thing that really stands out
about it in my mind is that it stores multi-byte integers in network
byte order (most significant byte first). As another user mentions,
there are functions for converting to network byte order, or you can
just write the bytes out in that order in your code.
Apparently my only problem with endianness was that I didn't understand
the concept. I also believe I see why the exact number of bits in a char
doesn't matter as far as storing and manipulating single bytes in memory
goes.
Correct.
What still escapes me, however, is how to ensure that the bytes I
write into the file are 'eight bit bytes' regardless of how many bits
the host machine uses to store a char.
I'm not sure you really want to do this.
Generally, media files are formatted such that they have a header and
data portions. Typically the header consists of one or more data
structures or tables that describe the data portions of the file
through the use of byte values. Regardless of the number of bits in
each of the bytes that make up these structures, there is still only a
finite number of meaningful values available to each byte.
Read and parse the bytes as per the specification:
http://www.libpng.org/pub/png/spec/1.2/PNG-Structure.html#PNG-file-signature
Parse the header and handle it separately from the data.
In any case, regardless of whether they are 8 or 9 bit bytes, the
number of bytes is the same, since each byte is used to represent a
finite number of values.
Probably I've misunderstood something again; I'll try to explain my
reasoning. Suppose I want to write the byte 1111 1111 to my file.
Suppose also that the machine the code is running on uses a 9 bit char,
so my byte's actual representation in memory is 0 1111 1111.
Yes, and if you consider the meaning of this, there is no value
difference between the two bytes. Both are 255 in decimal or 0xFF in
hex.
I don't think you have to worry about handling decimal values ranging
from 256 to 511 in a single byte for PNG.
If I just
output this char to the file using an fstream in binary mode, won't it
write those 9 bits to the file? So if I wrote four chars on the
9-bit-char machine wouldn't I get a file with 4*9 bits, not 4*8 bits?
Indeed you would get 4*9 bits and not 4*8 bits in the case you have
described, but that does not mean you don't have equivalent values in
each byte.
In summary this is all very simple.
Your file format (PNG) deals with bytes and so does C++. You can read
http://www.parashift.com/c++-faq-lite/intrinsic-types.html section
26.6.
At such time that you need to worry about a specific bit, you count
from the right to the left, starting at 1.
For example, to obtain the value of bit 5, one way is to RIGHT SHIFT
your byte by 4, i.e. byte >> 4, and then AND the result with 1, i.e.
(byte >> 4) & 1. If the resulting value is 1, then the bit is on; if
it is 0, then it is off.
In the case of 9-, 16-, 32-, or even 64-bit bytes, the above holds true
for obtaining a bit's value.
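
The shift-and-mask step can be sketched as a small function. The name
bit_is_set is just an illustration, and it assumes bit positions are
counted from 1 at the rightmost (least significant) bit, as described
above.

```cpp
// Hypothetical helper: reports whether the bit at `position` is set,
// where position 1 is the rightmost (least significant) bit.
// Assumes 1 <= position <= number of bits in an unsigned char.
bool bit_is_set(unsigned char byte, unsigned position)
{
    // Move the wanted bit into the lowest place, then mask off the rest.
    return ((byte >> (position - 1)) & 1u) != 0;
}
```

For bit 5 this is exactly (byte >> 4) & 1, so bit_is_set(0x10, 5) is
true and bit_is_set(0x0F, 5) is false.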
Of course there will be times when more than one bit has meaning, but
again, consult the standard for the file format you are dealing with
at that time. MPEG is a good example of this.
Once again, the 9th bit you are trying to account for should pose no
issues for you when using PNG and I see no need for such
considerations.
Just handle the byte ordering and you should be fine.
Thanks for your patience, Charles.
Kristian
I hope this helps,
Charles