Dimitri Maziuk
Roedy Green sez:
Think of what fraction of the
planet's XML or HTML documents would pass a complete W3C validation
suite, perhaps under 1%. Using a binary format solves that problem in
one fell swoop with the additional benefits of:
1. more compact, faster download.
2. faster processing.
3. tighter specification.
4. fewer people have to understand it.
5. simpler classes needed to process it, important in handhelds.
That is assuming
0. the software that translates source into binary works correctly:
we know it doesn't. And when it doesn't we get to the interesting
part: failure modes. An HTML browser can fail to "View source" and
the user will still see the content. A binary browser?
1. binary representation is not necessarily more compact. E.g.
using double-byte characters vs. single-byte + charset header.
2. nobody cares about processing speed. The bottleneck is network
I/O, not CPU speed. What we do care about is byte ordering, original
word sizes, and all the other fun stuff you need to deal with when
getting raw bytes over the wire.
3. non-issue, as there's no reason why text markup format specs
must necessarily be less tight than binary format specs. What
happened in Real Life when Nutscrape, Microshaft, and whathaveyou
added feechoorz to their software and then shoved them up the HTML
specs would happen with any format, binary, shminary.
4. non-issue. Your own estimate is that under 1% of HTML is good, ergo
under 1% of webshite designers and authors of HTML editing software
understand HTML. Ergo, they don't _have_ to understand it already;
obscuring the format further won't change anything.
(Obviously, the assumption that people will make better $foo if
they don't understand $foo is in itself rather amusing. E.g.
people would make better cars if they didn't understand how cars
work.)
5. who said anything about classes? You can process HTML with sed:
s/<[^>]*>//g will give you nice plain text output, and you can add
bells and whistles as appropriate for your hardware.
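
On point 5, the exact pattern matters: a greedy <.+> eats everything
from the first "<" to the last ">" on a line, text included, while
<[^>]*> removes one tag at a time. A minimal sketch, using Python's re
module purely for illustration (the sample string is made up, and tags
that span lines are not handled):

```python
import re

html = "<p>Hello, <b>world</b>!</p>"  # made-up sample line

# Greedy: ".+" runs from the first "<" to the last ">" on the line,
# so the text between the tags is swallowed along with them.
greedy = re.sub(r"<.+>", "", html)

# Tag at a time: "[^>]*" cannot cross a ">", so only the tags go.
per_tag = re.sub(r"<[^>]*>", "", html)

print(repr(greedy))   # ''
print(repr(per_tag))  # 'Hello, world!'
```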
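
Points 1 and 2 are easy to check on any machine. A rough sketch in
Python (chosen for illustration only), assuming "double-byte" means
something like UTF-16 and "single-byte" something like Latin-1; the
sample string and the 32-bit integer are arbitrary:

```python
import struct

text = "binary, shminary"  # arbitrary ASCII-only sample

# Point 1: the same text at one byte per character vs. two.
single_byte = text.encode("latin-1")
double_byte = text.encode("utf-16-le")
print(len(single_byte), len(double_byte))  # 16 32

# Point 2: the same four raw bytes read with two different byte orders.
raw = struct.pack(">I", 1)               # 32-bit value 1, big-endian
as_big = struct.unpack(">I", raw)[0]     # reader agrees with the writer
as_little = struct.unpack("<I", raw)[0]  # reader assumes the other order
print(as_big, as_little)  # 1 16777216
```

A text format sidesteps the second problem by spelling the number out
as digits; a binary format has to pin the byte order down in the spec.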
Furrfu
Dima