[...]
That seems to me an inaccurate description of this thread.
Kanze has pointed out the strengths of text formats, but
has also noted that there are times when binary formats
are needed. Who has been saying that text formats are
"universally preferable" to binary formats?
I think he missed a "when possible", or something similar.
*You* are accusing *me* of missing the fine print??!!
Let's see what I have written. From my post
http://groups.google.no/group/comp.lang.c++/msg/1c4004bbac86a046
[RA] > > File I/O operations with text-formatted floating-point data
[JK] > A lot of time compared to what?
[RA] Wall clock time. Relative time, compared to dumping
binary data to disk. Any way you want.
...
[RA] > > The rule-of-thumb is 30-60 seconds per 100 MBytes of
[JK] > Try it on what machine.
[RA] Any machine. The problem is to decode text-formatted numbers
to binary.
...
Here is a test I wrote in matlab a few years ago, to demonstrate
the problem (WinXP, 2.4GHz, no idea about disk):
[matlab code snipped]
Output:
------------------------------------
Wrote ASCII data in 24.0469 seconds
Read ASCII data in 42.2031 seconds
Wrote binary data in 0.10938 seconds
Read binary data in 0.32813 seconds
------------------------------------
Binary writes are 24.0/0.1 = 240x faster than text write.
Binary reads are 42.2/0.32 = 130x faster than text read.
...
The timing numbers (both absolute and relative) would be of
similar orders of magnitude if you repeated the test with C++.
...
The application I'm working with would need to crunch through
some 10 GBytes of numerical data per hour.
I think these excerpts should be sufficient to sketch what
kind of world I am living and working in.
Do note that I never - unlike some other participants in this
thread - claimed my numbers to be exact. I am fairly certain
my English is good enough that the above would reasonably be
expected to be interpreted by a reader as *representative*
numbers. If you look closely, I also commented that coding
up a program in C++ instead of matlab as I had done, would
result in *different* numbers, but not solve the fundamental
problem.
So I can't see any reason why you attack me for my numbers
being "wrong"; I never stated they were exact.
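For anyone who wants to repeat the comparison in C++, a minimal sketch of such a test could look like the following. This is not the snipped matlab code. The names (`Timings`, `seconds`, `run_io_test`), the file paths, the random test data and the choice of iostreams are all assumptions made for illustration. The numbers will depend on machine, compiler, library and disk cache, so treat whatever comes out as representative at best.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iomanip>
#include <limits>
#include <random>
#include <string>
#include <vector>

// All names below are hypothetical; nothing here is taken from the original test.
struct Timings {
    double write_text_s;
    double read_text_s;
    double write_binary_s;
    double read_binary_s;
    bool text_round_trip_ok;    // values read back from text equal the originals
    bool binary_round_trip_ok;  // values read back from binary equal the originals
};

// Wall clock time, in seconds, spent in f().
template <typename F>
double seconds(F&& f) {
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

// Writes and reads n doubles, once as text and once as a raw binary dump.
Timings run_io_test(std::size_t n, const std::string& text_path,
                    const std::string& bin_path) {
    std::vector<double> data(n);
    std::mt19937_64 gen(42);  // fixed seed, so runs are repeatable
    std::uniform_real_distribution<double> dist(-1.0e6, 1.0e6);
    for (double& x : data) x = dist(gen);

    Timings t{};

    t.write_text_s = seconds([&] {
        std::ofstream out(text_path);
        // max_digits10 significant digits, so that the text can round-trip
        out << std::setprecision(std::numeric_limits<double>::max_digits10);
        for (double x : data) out << x << '\n';
    });  // the stream is closed (and flushed) when the lambda returns

    std::vector<double> from_text(n);
    t.read_text_s = seconds([&] {
        std::ifstream in(text_path);
        for (double& x : from_text) in >> x;  // the text-to-binary decoding step
    });

    t.write_binary_s = seconds([&] {
        std::ofstream out(bin_path, std::ios::binary);
        out.write(reinterpret_cast<const char*>(data.data()),
                  static_cast<std::streamsize>(n * sizeof(double)));
    });

    std::vector<double> from_bin(n);
    t.read_binary_s = seconds([&] {
        std::ifstream in(bin_path, std::ios::binary);
        in.read(reinterpret_cast<char*>(from_bin.data()),
                static_cast<std::streamsize>(n * sizeof(double)));
    });

    t.text_round_trip_ok = (from_text == data);
    t.binary_round_trip_ok = (from_bin == data);

    std::remove(text_path.c_str());  // clean up the temporary files
    std::remove(bin_path.c_str());
    return t;
}
```

Calling `run_io_test` from `main` with a few million values and printing the four timings gives the same kind of table as the matlab output quoted above. Note that the binary dump is only readable on a machine with the same floating-point representation and byte order, which is one of the costs the other side of this thread has been pointing at.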
A few posts further out:
http://groups.google.no/group/comp.lang.c++/msg/0abdc440e78f98d6
[RA] So what do text-based formats actually buy you?
[JK] Shorter development times, less expensive development, greater
reliability...
In sum, lower cost.
[RA] As long as you keep two factors in mind:
1) The user's time is not yours (the programmer) to waste.
2) The user's storage facilities (disk space, network
bandwidth etc) are not yours (the programmer) to waste.
[JK] The user pays for your time. Spending it to do something which
results in a less reliable program, and that he doesn't need, is
irresponsible, and borders on fraud.
This one really pissed me off. Here I had explained to you
what application I am working with, made you aware of the users'
requirements in the operational situation, and you explicitly
state that paying attention to such concerns is 'borderline fraud'!
So I cannot interpret this in any other way than that you will
use text-based formats, come hell or high water. Which essentially
invalidates any otherwise relevant arguments you might have presented
throughout the thread.
Binary formats are an optimization:
No, it's not. The selection of file formats is a strategic design
decision on a par with using binary O(lgN) or linear O(N) search
engines; like choosing between an O(NlgN) quick sort or an O(N^2)
bubble sort algorithm.
Such factors govern what problems can be handled by the software
with reasonable effort and within reasonable time.
True, both binary and text-based numerical IO are O(N), but since
text-based numerical IO is orders of magnitude slower, the strategic
impact on design decisions is the same.
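To put rough numbers on that, here is a small back-of-the-envelope sketch in C++. It combines the 30-60 seconds per 100 MBytes rule of thumb with the 10 GBytes per hour requirement, both quoted above. The function names are made up for illustration, 1 GByte is taken as 1000 MBytes, and carrying the roughly 130x read ratio from the matlab test over to this workload is an assumption, not a measurement.

```cpp
// Hypothetical helpers; the figures come from the excerpts quoted above.
struct Budget {
    double low_minutes;   // using 30 seconds per 100 MBytes
    double high_minutes;  // using 60 seconds per 100 MBytes
};

// Seconds needed to decode `megabytes` of text-formatted data at a given
// cost in seconds per 100 MBytes.
double decode_seconds(double megabytes, double seconds_per_100mb) {
    return megabytes / 100.0 * seconds_per_100mb;
}

// Minutes of text decoding needed for each hour's worth of incoming data.
Budget hourly_text_budget(double gbytes_per_hour) {
    const double mb = gbytes_per_hour * 1000.0;  // assumption: 1 GByte = 1000 MBytes
    return Budget{decode_seconds(mb, 30.0) / 60.0,
                  decode_seconds(mb, 60.0) / 60.0};
}

// Same workload in binary, if the text/binary ratio from a test carries over.
double binary_seconds(double text_seconds, double speedup_ratio) {
    return text_seconds / speedup_ratio;
}
```

With 10 GBytes per hour, `hourly_text_budget(10.0)` comes out at 50 to 100 minutes of decoding for every hour of data, so the text-based reader is at best barely keeping up and at worst falling behind. If a ratio of about 130 held, `binary_seconds` would put the same hour of data at well under a minute. Both algorithms are O(N); only one of them fits inside the hour.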
you sometimes need this
optimization (and you certainly should be aware of the
possibility of using it), but you don't use them unless timing
or data size constraints make it necessary.
Hypocrite!
This is exactly what I have been arguing for days and weeks already.
What changed?
Rune