Specific sizes of variable types

Keith Thompson

Joe Wright said:
pete wrote: [...]
Binary files aren't closely associated with portability.
Text files are.

Go pete! That's why source code is in text files!

Not to talk down to anyone but ASCII is American Standard Code for
Information Interchange. It is text. It is Open Standard.

It is the right way, usually, to communicate among disparate systems
and architectures.

ASCII isn't really portable either. Plain text is portable, assuming
it can be translated as it's transferred from one machine to another,
and assuming you restrict it to a portable subset of the available
characters. (Try reading ASCII on a system that uses EBCDIC, or vice
versa.)
 
Malcolm

Joe Wright said:
Trying to write programs to handle format differences among binary files
from disparate systems will make you very tired, frustrated and old.
And it is hard to parse text files.
Basically a binary file is either coherent or non-coherent. A text file can
have any sort of nonsense such as
Nemployees 10000000000000000000000

trying to handle that robustly is a real headache.
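
One way to take some of the pain out of a value like the one above is to let strtol do the range checking. What follows is only a sketch under stated assumptions: the function name parse_nemployees and the exact line format (the keyword "Nemployees", one space, then a non-negative decimal count) are invented for illustration and are not specified anywhere in this thread.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses a line of the form "Nemployees <count>".
   Returns 0 and stores the count in *out on success, -1 on malformed input. */
int parse_nemployees(const char *line, long *out)
{
    static const char key[] = "Nemployees ";
    const size_t keylen = sizeof key - 1;
    const char *start;
    char *end;
    long value;

    if (strncmp(line, key, keylen) != 0)
        return -1;                  /* keyword missing or misspelled */

    start = line + keylen;
    errno = 0;
    value = strtol(start, &end, 10);

    if (end == start)
        return -1;                  /* no digits were consumed */
    if (errno == ERANGE)
        return -1;                  /* value does not fit in a long */
    if (value < 0)
        return -1;                  /* a negative head count makes no sense */

    if (*end == '\n')
        end++;                      /* tolerate one trailing newline */
    if (*end != '\0')
        return -1;                  /* trailing junk after the number */

    *out = value;
    return 0;
}
```

With this sketch, "Nemployees 42" is accepted, while "Nemployees 10000000000000000000000" is rejected because strtol reports ERANGE. It does not make text parsing easy, but it turns the overflow case into an ordinary error return.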
 
Joe Wright

Keith said:
Joe Wright said:
pete wrote:
[...]
Binary files aren't closely associated with portability.
Text files are.

Go pete! That's why source code is in text files!

Not to talk down to anyone but ASCII is American Standard Code for
Information Interchange. It is text. It is Open Standard.

It is the right way, usually, to communicate among disparate systems
and architectures.


ASCII isn't really portable either. Plain text is portable, assuming
it can be translated as it's transferred from one machine to another,
and assuming you restrict it to a portable subset of the available
characters. (Try reading ASCII on a system that uses EBCDIC, or vice
versa.)

Are you picking on me?

I have never encountered a 'text' file in EBCDIC. Have you? Well maybe..
 
Keith Thompson

Joe Wright said:
Keith Thompson wrote: [...]
ASCII isn't really portable either. Plain text is portable, assuming
it can be translated as it's transferred from one machine to another,
and assuming you restrict it to a portable subset of the available
characters. (Try reading ASCII on a system that uses EBCDIC, or vice
versa.)
Are you picking on me?

Not at all.
I have never encountered a 'text' file in EBCDIC. Have you? Well maybe..

Yes, I've worked on IBM mainframe systems (but not lately).

The C standard places some specific requirements on the character set,
but it doesn't mention ASCII. The characters '0'..'9' are required to
be in order and contiguous; 'A'..'Z' and 'a'..'z' are not (because
they aren't contiguous in EBCDIC). EBCDIC is probably slowly dying
out, but it's not dead yet; there are still plenty of running systems
that use it.

Expanded character sets that are strict supersets of 7-bit ASCII seem
to be the wave of the future, but nothing is certain -- and there's
rarely any need to write code that will work on an ASCII system but
fail on an EBCDIC system.
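
The character-set guarantee described above (digits contiguous and in order, letters not necessarily so) can be illustrated with a small sketch. The function names digit_value and upper_index are hypothetical, chosen only for this example.

```c
#include <string.h>

/* Portable: the C standard guarantees that '0'..'9' are contiguous and
   ascending, so subtracting '0' yields the digit's value. */
int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}

/* Portable letter index: looks the character up in a table instead of
   computing c - 'A', which is only meaningful where the letters are
   contiguous (true in ASCII, not in EBCDIC).
   Returns 0..25 for 'A'..'Z', or -1 for anything else. */
int upper_index(char c)
{
    static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p;

    if (c == '\0')
        return -1;              /* strchr would match the terminator */
    p = strchr(letters, c);
    return p ? (int)(p - letters) : -1;
}
```

The table lookup costs a little more than a subtraction, but it makes no assumption about how the execution character set lays out the alphabet.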
 
tmp123

gamehack said:
Hi all,

I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?
Should I have a platform dependent header file which checks for
CHAR_BIT and sizeof(int) etc and have a few typedefs like int16, long32
etc.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?
</OT>

Thanks

The family of functions (non-standard, but very common) "htons", ...
could also help to code the I/O functions.
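
As a rough sketch of how htons and ntohs might be used for this, assuming a POSIX system that provides <arpa/inet.h>, 8-bit bytes, and the C99 type uint16_t (none of which ISO C guarantees). The helper names put16_net and get16_net are made up for the example.

```c
#include <arpa/inet.h>  /* POSIX, not ISO C: htons, ntohs */
#include <stdint.h>
#include <string.h>

/* Stores a 16-bit value into buf in network (big-endian) byte order. */
void put16_net(unsigned char buf[2], uint16_t value)
{
    uint16_t net = htons(value);    /* host order -> network order */
    memcpy(buf, &net, sizeof net);
}

/* Reads a 16-bit value that was stored in network byte order. */
uint16_t get16_net(const unsigned char buf[2])
{
    uint16_t net;
    memcpy(&net, buf, sizeof net);
    return ntohs(net);              /* network order -> host order */
}
```

The two-byte buffer can then be written with fwrite and read back with fread on any host, since the byte order in the file no longer depends on the machine that produced it.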

Kind regards.

Note: I work every day with an EBCDIC mainframe, so character-set
conversion is mandatory even for a simple "ftp".

DISCLAIMER:
If someone follows this suggestion, their program might not work, and
their boss could fire them. Or the program could work, become unnecessary,
and be made redundant as well. It is also possible that the best option is
to do nothing: you could be made redundant, but with less effort
spent.
 
Dave Thompson

Malcolm wrote:

Not good enough. You haven't allowed for the possible variation of
CHAR_BIT. Also you must not create an integer overflow. Thus:
You haven't either. His code malfunctions if the bytes in (or to be
precise read from) the file aren't 8-bit values (perhaps padded);
yours returns wrong results, which isn't correct either.

unsigned int fget16(FILE *fp) {
    unsigned int ans;

    ans = (fgetc(fp) & 0xff) << 8;

And this doesn't prevent overflow on an implementation with 16-bit int
and not 'safe' (e.g. 2sC-bitwise) shifts. You need the left operand
unsigned int (at least) _before_ shifting, either explicitly:

    ans = (unsigned)fgetc(fp) << 8;

or if you insist on the masking you can 'hide' it there:

    ans = (fgetc(fp) & 0xffU) << 8;

although unless you comment this some reader(s) and particularly a
maintenance programmer may not realize the reason and "fix" it.

    ans |= (fgetc(fp) & 0xff);
    return ans;
}

- David.Thompson1 at worldnet.att.net
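
Putting the point about converting to unsigned before shifting into a complete function, one possible sketch follows. It is a variation, not the code from the posts above: the name fget16_checked, the extra out parameter, and the EOF handling are assumptions added for illustration, and it still assumes the file holds 8-bit byte values in big-endian order.

```c
#include <stdio.h>

/* Reads a 16-bit big-endian value from fp.
   Returns 0 and stores the value in *out on success,
   -1 on end-of-file or read error. */
int fget16_checked(FILE *fp, unsigned int *out)
{
    int hi, lo;

    hi = fgetc(fp);
    if (hi == EOF)
        return -1;
    lo = fgetc(fp);
    if (lo == EOF)
        return -1;

    /* Convert to unsigned before shifting so the shift cannot overflow
       a signed 16-bit int; the mask keeps only the low 8 bits. */
    *out = (((unsigned int)hi & 0xffu) << 8) | ((unsigned int)lo & 0xffu);
    return 0;
}
```

Checking each fgetc result separately avoids folding EOF into the value, which the masked one-liner would silently do.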
 
