Specific sizes of variable types

Keith Thompson · Jan 22, 2006

Joe Wright said:
pete wrote: [...]

Binary files aren't closely associated with portability.
Text files are.

Go pete! That's why source code is in text files!

Not to talk down to anyone but ASCII is American Standard Code for
Information Interchange. It is text. It is Open Standard.

It is the right way, usually, to communicate among disparate systems
and architectures.

ASCII isn't really portable either. Plain text is portable, assuming
it can be translated as it's transferred from one machine to another,
and assuming you restrict it to a portable subset of the available
characters. (Try reading ASCII on a system that uses EBCDIC, or vice
versa.)

Malcolm · Jan 22, 2006

Joe Wright said:
Trying to write programs to handle format differences among binary files
from disparate systems will make you very tired, frustrated and old.

And it is hard to parse text files.
Basically a binary file is either coherent or non-coherent. A text file can
have any sort of nonsense such as
Nemployees 10000000000000000000000

trying to handle that robustly is a real headache.
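
One way to cope with input like the line above is to let strtol do the range checking. The following is only a sketch: the helper name parse_count is made up for illustration, and it assumes the count has already been separated from its label as a NUL-terminated string.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a non-negative decimal count from text.
   Returns 0 and stores the value in *out on success;
   returns -1 on empty input, trailing junk, a negative value,
   or a value that does not fit in a long. */
int parse_count(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text)        /* no digits were consumed */
        return -1;
    if (errno == ERANGE)    /* out of range for long */
        return -1;
    if (*end != '\0')       /* something follows the number */
        return -1;
    if (value < 0)
        return -1;
    *out = value;
    return 0;
}
```

With this, a field such as 10000000000000000000000 is rejected instead of silently wrapping, at the cost of a few extra checks per field.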

Joe Wright · Jan 22, 2006

Keith said:
Joe Wright said:

pete wrote:
[...]

Binary files aren't closely associated with portability.
Text files are.

Go pete! That's why source code is in text files!

Not to talk down to anyone but ASCII is American Standard Code for
Information Interchange. It is text. It is Open Standard.

It is the right way, usually, to communicate among disparate systems
and architectures.

ASCII isn't really portable either. Plain text is portable, assuming
it can be translated as it's transferred from one machine to another,
and assuming you restrict it to a portable subset of the available
characters. (Try reading ASCII on a system that uses EBCDIC, or vice
versa.)

Are you picking on me?

I have never encountered a 'text' file in EBCDIC. Have you? Well maybe..

Keith Thompson · Jan 22, 2006

Joe Wright said:
Keith Thompson wrote: [...]

ASCII isn't really portable either. Plain text is portable, assuming
it can be translated as it's transferred from one machine to another,
and assuming you restrict it to a portable subset of the available
characters. (Try reading ASCII on a system that uses EBCDIC, or vice
versa.)

Are you picking on me?

Not at all.

I have never encountered a 'text' file in EBCDIC. Have you? Well maybe..

Yes, I've worked on IBM mainframe systems (but not lately).

The C standard places some specific requirements on the character set,
but it doesn't mention ASCII. The characters '0'..'9' are required to
be in order and contiguous; 'A'..'Z' and 'a'..'z' are not (because
they aren't contiguous in EBCDIC). EBCDIC is probably slowly dying
out, but it's not dead yet; there are still plenty of running systems
that use it.

Expanded character sets that are strict supersets of 7-bit ASCII seem
to be the wave of the future, but nothing is certain -- and there's
rarely any need to write code that will work on an ASCII system but
fail on an EBCDIC system.
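
To illustrate the contiguity point, here is a small sketch. The function names digit_value and letter_index are invented for this example. Digits can be converted by subtraction because the standard guarantees their order; letters are looked up in a string instead, since c - 'a' is not reliable on EBCDIC.

```c
#include <string.h>

/* Portable: '0'..'9' are guaranteed contiguous and in order. */
int digit_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}

/* Index of a lowercase letter (0 for 'a', 25 for 'z'), or -1.
   Uses a lookup rather than c - 'a', which assumes contiguous letters. */
int letter_index(int c)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
    const char *p;

    if (c == '\0')          /* strchr would match the terminator */
        return -1;
    p = strchr(letters, c);
    return p ? (int)(p - letters) : -1;
}
```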

tmp123 · Jan 22, 2006

gamehack said:
Hi all,

I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?
Should I have a platform dependent header file which checks for
CHAR_BIT and sizeof(int) etc and have a few typedefs like int16, long32
etc.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?
</OT>

Thanks

The family of functions (non-standard, but very common) "htons", ...
could also help to code the I/O functions.
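
A rough sketch of that idea follows. It assumes a POSIX system (htons and ntohs come from <arpa/inet.h>, not from standard C), a C99 <stdint.h> that provides uint16_t, and 8-bit bytes. The helper names write_u16 and read_u16 are made up for illustration.

```c
#include <arpa/inet.h>   /* POSIX, not standard C: htons, ntohs */
#include <stdint.h>      /* C99: uint16_t (optional type) */
#include <stdio.h>

/* Hypothetical helper: write a 16-bit value in network (big-endian)
   byte order. Returns 0 on success, -1 on failure. */
int write_u16(FILE *fp, uint16_t value)
{
    uint16_t wire = htons(value);
    return fwrite(&wire, sizeof wire, 1, fp) == 1 ? 0 : -1;
}

/* Hypothetical helper: read a 16-bit value stored in network order.
   Returns 0 on success, -1 on EOF or error. */
int read_u16(FILE *fp, uint16_t *out)
{
    uint16_t wire;

    if (fread(&wire, sizeof wire, 1, fp) != 1)
        return -1;
    *out = ntohs(wire);
    return 0;
}
```

The file then has the same byte layout whatever the host's byte order, provided it is opened in binary mode.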

Kind regards.

Note: I work every day with an EBCDIC mainframe, so character-set
conversion is mandatory even for a simple "ftp".

DISCLAIMER:
If someone follows this suggestion, their program might not work, and
their boss could fire them. Or the program could work, become unnecessary,
and make them redundant as well. It is also possible that the best option
is to do nothing: you could be made redundant, but with less effort
spent.

Dave Thompson · Feb 6, 2006

Malcolm wrote:

Not good enough. You haven't allowed for the possible variation of
CHAR_BIT. Also you must not create an integer overflow. Thus:

You haven't either. His code malfunctions if the bytes in (or to be
precise read from) the file aren't 8-bit values (perhaps padded);
yours returns wrong results, which isn't correct either.

unsigned int fget16(FILE *fp) {
    unsigned int ans;

    ans = (fgetc(fp) & 0xff) << 8;

And this doesn't prevent overflow on an implementation with 16-bit int
and not 'safe' (e.g. 2sC-bitwise) shifts. You need the left operand
unsigned int (at least) _before_ shifting, either explicitly:
ans = (unsigned)fgetc(fp) << 8;
or if you insist on the masking you can 'hide' it there:
ans = (fgetc(fp) & 0xffU) << 8;
although unless you comment this some reader(s) and particularly a
maintenance programmer may not realize the reason and "fix" it.

    ans |= (fgetc(fp) & 0xff);
    return ans;
}
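
Putting the pieces together, a version with the conversion to unsigned done before the shift might look like the sketch below. The name fget16_checked and the EOF handling are additions for illustration, not part of the code quoted above, and it still assumes each byte read from the file carries an 8-bit value.

```c
#include <stdio.h>

/* Reads a 16-bit big-endian value from fp.
   Returns 0 and stores the value in *out on success;
   returns -1 if EOF or a read error occurs first. */
int fget16_checked(FILE *fp, unsigned int *out)
{
    int hi = fgetc(fp);
    int lo;

    if (hi == EOF)
        return -1;
    lo = fgetc(fp);
    if (lo == EOF)
        return -1;

    /* Convert to unsigned BEFORE shifting, so the shift cannot
       overflow a 16-bit int. Do not "simplify" the casts away. */
    *out = (((unsigned int)hi & 0xffU) << 8) | ((unsigned int)lo & 0xffU);
    return 0;
}
```

Reading the two bytes into separate variables also fixes the order of the fgetc calls and leaves room for the EOF checks.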

- David.Thompson1 at worldnet.att.net
