I have written a program that uses pre-calculated data that is currently
in a binary file. The program needs to access about 1 MB of data in the
binary file that is scattered across the 500 MB file.
Should the program read piecewise from the file to get all the data it
needs, or load the entire contents into memory and then read the bits it
needs?
Yes.
Which is just a way of saying: it depends. The general rule
would be to write the data as simply formatted text, and parse
it. If it's 500 MB in binary, however, that's likely to be a
little slow. And you can't portably seek to an arbitrary
position in a file opened in text mode. A binary format might
help: it could be faster, and depending on the format, you may
or may not be able to use seek effectively to read only the
relevant parts.
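As a minimal sketch of the piecewise approach, assuming the program
already knows the byte offset and length of each piece it needs (the
Chunk struct and the read_chunks function are hypothetical names, not
something from the question):

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one wanted piece of the file.
struct Chunk {
    std::streamoff offset;  // byte offset from the start of the file
    std::size_t    length;  // number of bytes to read
};

// Reads only the requested chunks and returns one buffer per chunk.
// Returns an empty vector if the file cannot be opened or a read fails.
std::vector<std::vector<char>> read_chunks(const std::string& path,
                                           const std::vector<Chunk>& chunks)
{
    std::vector<std::vector<char>> result;
    std::ifstream in(path, std::ios::binary);  // binary mode, so seeking is by byte
    if (!in) return result;

    for (const Chunk& c : chunks) {
        assert(c.offset >= 0);
        std::vector<char> buf(c.length);
        in.seekg(c.offset, std::ios::beg);
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        if (!in) {               // seek or read failed (e.g. past end of file)
            result.clear();
            return result;
        }
        result.push_back(std::move(buf));
    }
    return result;
}
```

Sorting the chunks by offset before calling this would probably help
on a rotating disk; whether it matters at all is something to measure.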
If the data has no historical value (i.e. you don't have to save
it; it's only used for communicating between the program that
writes it and the program that reads it), and you can ensure
that the two programs are running on the same machine, and have
been compiled with the same compiler (and version), using the
same options, then you can consider using a binary dump of the
memory. In that case, the "best" solution is probably
implementation specific: mmap under Unix, CreateFileMapping
under Windows.
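For the Unix case, a rough sketch of a read-only mapping might look
like the following. It assumes a POSIX system; the MappedFile class
and its member names are invented for illustration, and error
reporting is reduced to an ok() flag:

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>      // open
#include <sys/mman.h>   // mmap, munmap
#include <sys/stat.h>   // fstat
#include <unistd.h>     // close

// Hypothetical RAII wrapper around a read-only POSIX memory mapping.
class MappedFile {
public:
    explicit MappedFile(const char* path)
    {
        int fd = ::open(path, O_RDONLY);
        if (fd == -1) return;
        struct stat st;
        if (::fstat(fd, &st) == 0 && st.st_size > 0) {
            void* p = ::mmap(nullptr, static_cast<std::size_t>(st.st_size),
                             PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                data_ = static_cast<const unsigned char*>(p);
                size_ = static_cast<std::size_t>(st.st_size);
            }
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
    }
    ~MappedFile()
    {
        if (data_) ::munmap(const_cast<unsigned char*>(data_), size_);
    }
    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    bool ok() const { return data_ != nullptr; }
    std::size_t size() const { return size_; }

    // Returns the byte at the given offset; the offset must be in range.
    unsigned char at(std::size_t offset) const
    {
        assert(offset < size_);
        return data_[offset];
    }

private:
    const unsigned char* data_ = nullptr;
    std::size_t size_ = 0;
};
```

With a mapping like this, the system typically only reads the pages
that are actually touched, so accessing 1 MB scattered through the
file should not require reading all of it. Under Windows, the
corresponding calls would be CreateFileMapping and MapViewOfFile.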
Maybe more importantly, is the binary file technique the best
one to use given the circumstances or is there a better
technique out there?
It depends a lot on how long the data have to persist. If
there's even the slightest risk that you'll have to read them
with a future version of your program, or even a recompiled
version, then you need to define a format, and use it.
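As one illustration of what "defining a format" can mean for a single
field: decide that a 32-bit unsigned integer is always stored as four
bytes, most significant first, and read and write it byte by byte
rather than dumping the in-memory representation. The function names
below are hypothetical, and the choice of big-endian is just an
assumption for the sketch:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes value as exactly 4 bytes, most significant byte first,
// regardless of the byte order of the machine running the program.
void write_u32_be(std::ostream& out, std::uint32_t value)
{
    const unsigned char bytes[4] = {
        static_cast<unsigned char>(value >> 24),
        static_cast<unsigned char>(value >> 16),
        static_cast<unsigned char>(value >> 8),
        static_cast<unsigned char>(value)
    };
    out.write(reinterpret_cast<const char*>(bytes), 4);
}

// Reads 4 bytes written by write_u32_be. Returns false on a short read.
bool read_u32_be(std::istream& in, std::uint32_t& value)
{
    unsigned char bytes[4];
    if (!in.read(reinterpret_cast<char*>(bytes), 4)) return false;
    value = (static_cast<std::uint32_t>(bytes[0]) << 24)
          | (static_cast<std::uint32_t>(bytes[1]) << 16)
          | (static_cast<std::uint32_t>(bytes[2]) << 8)
          |  static_cast<std::uint32_t>(bytes[3]);
    return true;
}

// Small self-check: the bytes produced are fixed by the format,
// not by the compiler or the host.
void self_check()
{
    std::ostringstream out;
    write_u32_be(out, 0x01020304u);
    assert(out.str() == std::string("\x01\x02\x03\x04", 4));
}
```

The streams must be opened in binary mode when they refer to files.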
The format may be binary: binary formats are a lot harder to
debug, but generally end up with smaller files and faster
formatting and parsing, although the difference isn't always as
large as one might think. Note too that it's possible to read
and write a file containing text in binary mode, to allow
seeking. If you want to go that route, you'll probably want to
ensure that all "records" have a fixed length. (If there are
different record types, consider storing them in separate
files.)
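A sketch of the fixed-length text record idea, assuming each record
is padded with spaces to 31 characters and terminated by a newline
(the record size and the function names are made up for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical record size: 31 characters of text plus '\n'.
constexpr std::size_t kRecordSize = 32;

// Pads text with spaces to kRecordSize - 1 characters and appends '\n'.
std::string make_record(const std::string& text)
{
    assert(text.size() < kRecordSize);
    assert(text.find('\n') == std::string::npos);
    std::string rec = text;
    rec.resize(kRecordSize - 1, ' ');
    rec += '\n';
    return rec;
}

// Reads record number index (0-based) from a stream opened in binary
// mode and strips the padding. Returns false if the record is not there.
bool read_record(std::istream& in, std::size_t index, std::string& text)
{
    std::string rec(kRecordSize, '\0');
    in.seekg(static_cast<std::streamoff>(index * kRecordSize), std::ios::beg);
    if (!in.read(&rec[0], static_cast<std::streamsize>(kRecordSize)))
        return false;
    // find_last_not_of returns npos for an all-padding record; npos + 1 is 0.
    text = rec.substr(0, rec.find_last_not_of(" \n") + 1);
    return true;
}
```

The file stays readable in a text editor, but record n always starts
at byte n * kRecordSize, so the program can seek straight to it. Note
that this particular padding scheme loses any trailing spaces in the
original text.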