James said:
There is a huge volume of programs that can and do use no text.
However, I don't know of any program today that uses text in
ASCII;
You must be thinking of shrink-wrap-type user-interactive programs rather
than in-house development tools, for example.
text is used to communicate with human beings, and ASCII
isn't sufficient for that.
Millions of posts on USENET seem to contradict that statement.
Except that the examples are false. C/C++/Java and Ada require
Unicode.
To be general they do. One could easily eliminate that requirement and still
get much work done. I'm "arguing" not against Unicode, but that the ASCII
subset, in and of itself, is useful.
Practically everything on the network is UTF-8.
Basically, except for some historical tools, ASCII is dead.
Nah, it's alive and well, even if you choose to call it a subset of
something else. Parse all of the non-binary group posts and see how many
non-ASCII characters come up (besides your tagline!).
As long as you're the only person using your code, you can do
what you want.
person, or company, or group, or alliance all work. Standards were meant to
be... ignored (there's always a better way)!
I understand the rationale.
First, there is no such thing as an ASCII text file.
Then what is a file that contains only ASCII printable characters (throw in
LF and HT for good measure)?
For that
matter, under Unix, there is no such thing as a text file. A
file is a sequence of bytes.
And if the file is opened in text mode?
How those bytes are interpreted
depends on the application.
So the distinction between text and binary mode is .... ?
Internally, the program is still working with ASCII strings,
assuming English is the language (PURE English that recognizes
only 26 letters, that is).
Pure English has [...]
_I_ was giving the definition of "Pure English" in the context (like a
glossary). How many letters are there in the English alphabet? How many?
Surely I wasn't taught umlauts in grade school. You are arguing semantics and
I'm arguing practicality: if I can make a simplifying assumption, I'm gonna
do it (and eval that assumption given the task at hand)!
accented characters in some words (at least
according to Merriam-Webster, for American English). Pure
English distinguishes between opening and closing quotes, both
single and double. Real English distinguishes between a hyphen,
an en dash and an em dash.
But that's all irrelevant, because in the end, you're writing
bytes, and you have to establish some sort of agreement between
what you mean by them, and what the programs reading the data
mean. (*If* we could get by with only the characters in
traditional ASCII, it would be nice, because for historical
reasons, most of the other encodings encountered encode those
characters identically. Realistically, however, any program
dealing with text has to support more, or nobody will use it.)
Where did you get that bullshit?
This week's trade rags (it's still around here, so if you want the exact
reference, just ask me). It makes sense too: Apple moved off of PowerPC also
probably to avoid doom. I'm a Wintel developer exclusively right now also,
so it makes double sense to me.
Sun does sell x86 processors
(using the AMD chip). And IBM and HP are quite successful with
their lines of non-x86 processors. (IMHO, where Sun went wrong
was in abandoning its traditional hardware market, and moving
into software adventures like Java.)
Topic for another thread for sure (those kinds of threads are fun, but don't
result in anything useful). What you said parenthetically above, I kinda
agree with: Open Solaris looked like a winner to me until they made it
subservient to Java (a platform to push Java). Dumb Sun move #2. (But I only
track these things lightly on the surface).
I'm not referencing any application domain in particular.
Apparently you referenced OSes a few times.
Practically all of the Unix applications I know take the
encoding from the environment; those that don't, use UTF-8 (the
more recent ones, anyway). All of the Windows applications I
know use UTF-16LE.
Do you think anyone would use MS Office or Open Office if they
only supported ASCII?
I was talking about an even simpler class of programs and libraries: say, a
program's options file and the ini-file parser (designated subset of 7-bit
ASCII).
Apparently there is a semantic gap in our "debate". I'm not sure where it
is, but I think it may be in that you are talking about what goes on behind
the scenes in an OS, for example, and I'm just using the simple ini-file
parser using some concoction called ASCIIString as the workhorse.
Yes. That's where I live and work. In the real world. I
produce programs that other people use. (In practice, my
programs don't usually deal with text, except maybe to pass it
through, so I'm not confronted with the problem that often. But
often enough to be aware of it.)
You opportunistically took that out of context. I was alluding toward the
difference between the problem domain (the real world) and the solution
domain (technology).
Well you snipped off the context so I don't know how I meant that.
Programs assign semantics to those ones and zeros.
Even at the hardware level---a float and an int may contain the
same number of bits, but the code uses different instructions
with them. Programs interpret the data.
Which brings us back to my point above---you don't generally
control how other programs are going to interpret the data you
write.
If you say so. But if I specify that the ini-files for my program may
contain only the designated subset of 7-bit ASCII, and someone puts an
invalid character in there, expect a nasty error box popping up.
Sorry, I don't know what you're talking about.
Nevermind. It just seemed like you were arguing both sides of the point in
the two threads combined.
I've already had to deal with C with the symbols in Kanji.
So use it once and then jettison all simpler things? The C/C++ APIs are
overly-general (IMO) that's why I don't use them unless the situation
warrants it. Generality makes complexity. Every developer should know how to
implement a linked list, for example. Every developer should have a number
of linked lists he uses, as having only one design paradigm ensures every
program/project is a compromise. IMO. YMMV.
That
would have been toward the end of the 1980s. And I haven't seen
a program in the last ten years which didn't use symbols and
have comments in either French or German.
But you're in/from France right? Us pesky "americans" huh.
Fine. If you write a compiler, and you're the only person to
use it, you can do whatever you want. But there's no sense in
talking about it here, since it has no relevance in the real
world.
You're posting in extremism to promote generalism? Good engineering includes
exploiting simplifying assumptions (and avoiding the hype, on the flip
side). (You'd really put non-ASCII characters in source code comments?
Bizarre.)
Most programs don't need to be international. Data and development tools are
not the same.
Well it would be for me! So yes it is!
(Actually, the most difficult language to program
in is English,
Not for me! Context matters! (I was the context, along with many other
developers here).
It's one of my primarly languages as well. Not the only one,
obviously, but one of them.
"primarly" (hehe
). "A set of primary languages?". One primary or none
probably. (None is as good as one, I'm not dissing... I only know two
languages and a third ever so lightly for "I took it in HS").
That has nothing to do with the operating system. Read the
language standards.
Ah ha! The golden calf. I had a feeling there was a god amongst us. :/
I'm not "big" on "standards". (Separate thread!).
No. Do you know any of the languages in question? All of them
clearly require support for at least the Basic Multilingual Plane of Unicode in
the compiler. You may not use that possibility---a lot of
people don't---but it's a fundamental part of the language.
THAT _IS_ the point (!): if a program (or other) doesn't require it, then it
is just CHAFF. This ever-espoused over-generality and
general-is-good-and-always-better gets very annoying in these NGs. Save the
committee stuff for c.l.c++.moderated or the std group. The chaff is probably
holding back practicality for those who can't distinguish politics.