Francesco
Hi there,
sorry for posting this as a separate thread, but the other one got off
on the wrong foot.
After posting (there) that C++ program with Chinese characters used as
identifiers, I began to wonder: what if those identifiers aren't
really valid?
So I started searching to check whether that program was really valid
C++, as I had prematurely claimed.
Searching the web, I wasn't able to find any source that clarifies
this issue. I was looking for some Unicode table classifying
characters as "digit", "alphabetic" and so on, and I couldn't find
one - maybe such a table doesn't even exist. I found an online
interface to a Chinese character DB reporting codes, stroke
classifications and so on, but that's about it.
Then, browsing my copy of TC++PL, my eye fell on the grammar.
An identifier is defined this way:
-------
identifier:
    nondigit
    identifier nondigit
    identifier digit
-------
and also:
-------
nondigit: one of
    universal-character-name
    _ a b c [...] x y z
    A B C [...] X Y Z
-------
Of course, there is a universal-character-name for each digit,
punctuation sign and so on, but since those are defined as specific
grammar items (e.g. "digit", "preprocessing-op-or-punc" and so on) I
assume that the "universal-character-name" alternative of "nondigit"
excludes those characters by definition.
So then, does it mean that "universal-character-name" stands for [a
representation of] _any_ character other than those defined by other
parts of the grammar - even if it represents a digit in some other
language?
For instance, take the character 二 (two) - in case it doesn't display
for you, the glyph looks like an equals sign "=".
That's a digit in Chinese; does C++ consider it a digit or a nondigit?
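
To make the question concrete, here is a minimal sketch of the quoted
grammar read literally, the way I described above. The function names
are made up for illustration. Treating every code point at or above
0x80 as a "universal-character-name" is my own assumption: it ignores
any further restriction the standard may place on which of those names
are allowed in identifiers. "digit" is taken to mean only the ASCII
characters 0 to 9.

```cpp
#include <cassert>
#include <string>

// "digit" as a grammar item: only the ten ASCII digits.
bool is_digit(char32_t c) {
    return c >= U'0' && c <= U'9';
}

// "nondigit": underscore, ASCII letters, or a universal-character-name.
// ASSUMPTION: any code point >= 0x80 counts as a universal-character-name
// here; the real rule may well be narrower than this.
bool is_nondigit(char32_t c) {
    return c == U'_'
        || (c >= U'a' && c <= U'z')
        || (c >= U'A' && c <= U'Z')
        || c >= 0x80;
}

// identifier: nondigit | identifier nondigit | identifier digit
// i.e. one nondigit followed by any mix of nondigits and digits.
bool is_identifier(const std::u32string& s) {
    if (s.empty() || !is_nondigit(s[0])) {
        return false;
    }
    for (char32_t c : s) {
        if (!is_nondigit(c) && !is_digit(c)) {
            return false;
        }
    }
    return true;
}

// A few checks; U+4E8C is the character 二.
void check_examples() {
    assert(is_identifier(U"x1"));
    assert(!is_identifier(U"1x"));
    assert(!is_digit(U'\u4E8C'));
    assert(is_identifier(U"\u4E8C"));
}
```

Under this literal reading, 二 falls on the "nondigit" side and could
even start an identifier; whether the standard really agrees is exactly
what I'm asking.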
Thank you for your attention,
best regards,
Francesco