IMO it's too bad that "they" chose \r\n as the standard. Having two
bytes as the end of line marker makes sense on typewriters and
similarly operating printing equipment.
I may well be mistaken, but I think at the time they set that standard,
such equipment was still in use. So it may have been a consideration.
Nowadays, I think having a single byte as the EOL marker is quite a bit
clearer.
Rather than thinking in bytes and the like when inserting an EOL marker,
inserting an abstract EOL marker (that then gets translated by low level
code to the appropriate byte sequence as needed) is probably the less
archaic way to do that.
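To make that concrete, here is a minimal sketch of how Python's text I/O
layer already works this way. The helper name encode_lines is made up for
illustration; io.BytesIO and io.TextIOWrapper are standard library. The
code only ever writes the abstract "\n", and the wrapper substitutes
whatever byte sequence was asked for.

```python
import io

def encode_lines(lines, newline):
    # Hypothetical helper: the caller only deals with the abstract "\n"
    # marker; the text layer decides which bytes actually get written.
    raw = io.BytesIO()
    # newline="\r\n" makes the wrapper translate every "\n" written
    # into "\r\n"; newline="\n" writes it through unchanged.
    text = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    for line in lines:
        text.write(line + "\n")
    text.flush()
    data = raw.getvalue()
    text.detach()  # leave the underlying buffer alone instead of closing it
    return data

dos_bytes = encode_lines(["first", "second"], "\r\n")
unix_bytes = encode_lines(["first", "second"], "\n")
```

The same source lines come out as different byte sequences, and the code
that produced them never mentioned "\r" at all.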
On the other hand, with the use of UTF-8 encodings and the like, the
byte-to-character mapping is gone anyway, so perhaps I should just get
used to it ;-)
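A quick illustration of that point, as a sketch: the sample word is
arbitrary, and the escape \u00ef is used so the example does not depend on
how this message itself is encoded.

```python
# "naïve" spelled with an escape for the i-diaeresis (U+00EF).
text = "na\u00efve"
encoded = text.encode("utf-8")

char_count = len(text)     # counts characters (code points)
byte_count = len(encoded)  # counts bytes after UTF-8 encoding
```

Here char_count is 5 and byte_count is 6, because the one non-ASCII
character takes two bytes in UTF-8.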
Yes
"Bytes" is definitely getting too low level. Especially with higher
level languages like Python... there are not many byte manipulation
facilities anyway. The language is at a much higher level, and in that
sense the classic strings are a bit out of line, it seems.
Which are those reasons, except for backward compatibility?
I don't know how many reasons you need besides backward compatibility, but
all the DOS (still around!) and Windows apps that would break... ?!? I
think breaking that compatibility would be more expensive than the whole
Y2k bug story. And don't be fooled... you may run a Linux system, but you'd
pay your share of that bill anyway.
Fewer FAQs in this group about people putting tabs, newlines and other
characters in their filenames because they forget to escape their
backslashes?
Or forget to use raw strings. (If you don't want it to be escaped, please
say so.)
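For anyone who hasn't been bitten by this yet, a small sketch of the
problem. The path is a made-up example, chosen because its directory names
happen to start with "t" and "n".

```python
# Hypothetical Windows path, written three ways.
plain = "C:\temp\new"       # "\t" becomes a tab, "\n" becomes a newline
raw = r"C:\temp\new"        # raw string: backslashes are kept literally
escaped = "C:\\temp\\new"   # doubled backslashes: same result as raw

has_tab = "\t" in plain
has_newline = "\n" in plain
same = raw == escaped
```

The plain literal silently ends up with a tab and a newline in the "file
name", while the raw and the escaped spellings are the same 11-character
string.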
But similar to what I wrote above about the EOL thing, I think that the
whole backslash escape character story is not quite well-chosen. In a way,
this is a mere C compatibility pain in the neck... (Of course there are
implementation and efficiency reasons, mainly because Python is based on C
APIs, but all that is as arbitrary as the selection of the backslash as
path separator.)
There could be other solutions (in Python, I mean). Only accept raw strings
in APIs that deal with paths? Force coders to create paths as objects, in a
portable way, maybe by removing the possibility to create paths from
strings that are more than one level in the path? Or introduce a Unicode
character that means "portable path separator"? Or whatever...
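As a sketch of the "paths as objects" idea: Python's standard library has
had a pathlib module since 3.4 that goes in roughly this direction (it
still accepts multi-level strings, so it is not the strict variant
suggested above). The directory and file names below are invented for the
example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical path components, given one level at a time so that no
# separator character ever appears in the source.
parts = ("projects", "demo", "readme.txt")

win = PureWindowsPath("C:/").joinpath(*parts)
posix = PurePosixPath("/home/user").joinpath(*parts)

win_text = str(win)            # rendered with backslashes
posix_text = str(posix)        # rendered with forward slashes
win_portable = win.as_posix()  # Windows path rendered with forward slashes
```

The separator only shows up when the path object is turned back into a
string, which is about as close to a "portable path separator" as one can
get without a new character.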
Strings and filenames are usually tightly coupled in any program
handling files, though.
Yes, and that's IMO something from way below in the implementation depths.
While file names and paths are strings, not every string is a valid and
useful file name or path. This shows that using strings for file names and
paths has tradition (coming from low level languages like C), but IMO is
not quite appropriate for a higher abstraction level.
Almost every programming language I know of uses it as the escape
character, except for perhaps VB Script and the likes. Not sure about
the different assembly languages, though.
There are so many languages... and I know so few of them...
http://en.wikipedia.org/wiki/Category:Programming_languages
Now it may be predominant (I still think it's mostly present in languages
that are in some way influenced by C), but in the '70s?
IIRC, Pascal uses '^' for a similar purpose (not quite the same, but
similar). This form is still in ample use in documentation to mean
Ctrl-<key>.
Sure. I've talked more about this specific subject in this thread than
in the rest of my life ;-)
There's a first for everything.
I think cooperation and uniformity can be a very good thing. On the other
hand, Microsoft want the software written for their platform to stay on
their platform. That's probably one of the major reasons to remain
incompatible with other systems.
Probably. But even if I'd had a say there (and I hate switching between
separator characters just as much as the next guy, and possibly do so more
than you given that I work on a Windows system, with slashes in repository
paths and URIs), I'm not sure I'd make the jump away from the backslash as
path separator. That's just breaking too much code. You don't want to have
all these curses directed at you...
Gerhard