While I agree that in this situation I should do both, what would you
propose for cases where the data being sent is supposed to be
executable code?
I happen to know that for enterprise disk drives (the kind Google uses
to store everything), the firmware is protected by exactly what I
describe. Since the firmware has to be able to run, the kind of fix
you propose is not possible. I would assume that if this kind of data
transfer were considered insecure, Google and others would be demanding
something better (can you imagine Google's database going down because
someone overwrote the firmware on their hard drives?).
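(For concreteness, here is a minimal sketch of the kind of protection I
have in mind, assuming it amounts to a MAC check over the blob before it
is ever executed; the key and function names are mine, not anything a
real drive vendor ships.)

    import hmac
    import hashlib

    SHARED_KEY = b"provisioning-key"  # hypothetical; real devices use burned-in keys

    def accept_firmware(blob: bytes, tag: bytes) -> bool:
        """Run the blob only if its MAC checks out; reject it otherwise."""
        expected = hmac.new(SHARED_KEY, blob, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)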
Again, I suppose that in this case writing a parser is the better
option (parsing a dict of strings by hand is faster than reading the
documentation for someone else's parser anyway), but doing both is by
far the best option.
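Something like this is all I mean by a hand-rolled parser. The
one-pair-per-line format here is made up, but the point stands: the
parser can only ever produce strings, never code.

    def parse_string_dict(text: str) -> dict[str, str]:
        """Parse lines of 'key: value' into a dict of plain strings."""
        result = {}
        for lineno, line in enumerate(text.splitlines(), start=1):
            if not line.strip():
                continue  # skip blank lines
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"line {lineno}: expected 'key: value'")
            result[key.strip()] = value.strip()
        return result

    print(parse_string_dict("name: drive0\nfw_rev: 1.2.3"))
    # {'name': 'drive0', 'fw_rev': '1.2.3'}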
Again, thank you all for your help.
As a fan of biological structures, I tend to favor the 'many-small'
strategy: expose your servers, but only a fraction of them to any given
source. If one of them crashes, blacklist its recent sources.
Distribute and decentralize ("redundantify"). Compare it, I guess, to a
jet plane with 1,000 engines, a few of which can fail with no problem.
Resources can be expendable in small proportions.
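A rough sketch of what I mean, with all the numbers and names made up:

    import hashlib

    NUM_SERVERS = 1000
    FRACTION = 3  # each source is confined to 3 of the 1,000 "engines"

    blacklist: set[str] = set()
    recent_sources: dict[int, set[str]] = {i: set() for i in range(NUM_SERVERS)}

    def servers_for(source: str) -> list[int]:
        """Deterministically pick a small slice of servers for a source."""
        h = int(hashlib.sha256(source.encode()).hexdigest(), 16)
        return [(h + i) % NUM_SERVERS for i in range(FRACTION)]

    def handle(source: str, server: int) -> None:
        recent_sources[server].add(source)  # remember who this server served

    def on_crash(server: int) -> None:
        """A crash takes out one engine; its recent sources get blacklisted."""
        blacklist.update(recent_sources[server])
        recent_sources[server].clear()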
More generally, think of a minimalist operating system that can
tolerate malicious code execution and just crashes and reboots a lot.
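In that spirit, a toy supervisor might look like this ('worker.py'
standing in for whatever untrusted work you want to contain):

    import subprocess
    import time

    while True:
        result = subprocess.run(["python", "worker.py"])
        if result.returncode != 0:
            print(f"worker crashed (exit {result.returncode}); rebooting it")
        time.sleep(1)  # brief backoff before the next restart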
If 'foreign code' execution is fundamental to the project, you might
even look at custom hardware. Otherwise, if it's a lower priority,
just run a custom Python install and delete modules like os.py,
posixpath.py (the file behind os.path), and maybe even sys. Either
remove those libraries outright, or create a wrapper that gets Admin
approval for calls like subprocess.call and os.remove.
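The wrapper can be as simple as shadowing the dangerous call; input()
here stands in for whatever real Admin-approval channel you have (and
this only guards Python-level callers, of course):

    import os

    _real_remove = os.remove

    def guarded_remove(path):
        """Delete a file only after interactive approval."""
        if input(f"Allow removal of {path!r}? [y/N] ").lower() == "y":
            _real_remove(path)
        else:
            raise PermissionError(f"removal of {path!r} denied")

    os.remove = guarded_remove  # untrusted code now hits the wrapper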
You'll notice Windows now asks for User approval the first time a
program it doesn't recognize tries to access the internet.