Neal said:
The type system looks very interesting!
It's just a pity they based the syntax on C rather
than something more enlightened. (Why do people
keep doing that when they design languages?)
[...]It's just a pity they based the syntax on C rather than something more
enlightened. (Why do people keep doing that when they design languages?)
When the only tool you've used is a hammer, every tool you design ends up
looking like a hammer.
Mark said: As a rule of thumb, people don't like change? This obviously assumes
that language designers are people.
That's probably true (on both counts).
I guess this means we need to encourage more
Pythoneers to become language designers!
Rick said: The multiplication operator can ONLY be used on
numerics.
I'm not convinced about that part. I notice that
subtraction, multiplication and division are bundled
into a single interface Numeric, but there is a
separate one called Summable for addition --
apparently so that they could use + for string
concatenation.
This seems to be a case of one rule for the language
designers and a different one for everyone else.
If it's okay for '+' to be used on something that's
not a number, why not '*'?
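
For comparison, here is a minimal Python sketch of the same question. In Python, str defines both '+' and '*', so neither operator is reserved for numbers. The Summable and Multipliable protocols below are hypothetical stand-ins, loosely named after the interfaces mentioned above; they are not Ceylon's actual definitions.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Summable(Protocol):
    """Hypothetical stand-in: anything that supports '+'."""

    def __add__(self, other): ...


@runtime_checkable
class Multipliable(Protocol):
    """Hypothetical stand-in: anything that supports '*'."""

    def __mul__(self, other): ...


# In Python, str satisfies both protocols, so '*' is not numeric-only.
greeting = "ab" + "cd"  # concatenation via __add__
repeated = "ab" * 3     # repetition via __mul__

print(greeting, repeated)
print(isinstance("ab", Summable), isinstance("ab", Multipliable))
```

Whether allowing '*' on strings is a good idea is a separate design question; the sketch only shows that a language can treat the two operators uniformly.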
character
Satisfied Interfaces: Comparable<Character>, Enumerable<Character>, Ordinal<Other>
A 32-bit Unicode character.
string
Satisfied Interfaces: Category, Cloneable<List<Element>>, Collection<Element>,
Comparable<String>, Correspondence<Integer,Element>, Iterable<Element,Null>,
List<Character>, Ranged<Integer,String>, Summable<String>
A string of characters. Each character in the string is a 32-bit Unicode
character. The internal UTF-16 encoding is hidden from clients.
A string is a Category of its Characters, and of its substrings:
Clean. Far, far away from Unicode handling that may require
18 bytes (!) more to encode a non-ASCII string of n characters than an
ASCII string of n characters.
(With performance, as expected, broadly following the same logic.)
jmf
I'm trying to figure this out. Reading the docs hasn't answered this.
If each character in a string is a 32-bit Unicode character, and (as
can be seen in the examples) string indexing and slicing are
supported, then does string indexing mean counting from the beginning
to see if there were any surrogate pairs?
The string reference says:
"""Since a String has an underlying UTF-16 encoding, certain operations are
expensive, requiring iteration of the characters of the string. In
particular, size requires iteration of the whole string, and get(), span(),
and segment() require iteration from the beginning of the string to the
given index."""
The get and span operations appear to be equivalent to indexing and slicing.
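
To illustrate why indexing by character is O(n) over UTF-16 storage, here is a rough Python sketch. It is not Ceylon's implementation; the helper names utf16_units and code_point_at are made up for this example. Because a character may occupy one or two 16-bit units, finding the i-th character means walking the units from the start.

```python
def utf16_units(text: str) -> list[int]:
    """Encode text as a list of 16-bit UTF-16 code units."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def is_high_surrogate(unit: int) -> bool:
    """A high surrogate starts a two-unit (non-BMP) character."""
    return 0xD800 <= unit <= 0xDBFF


def code_point_at(units: list[int], index: int) -> str:
    """Return the character at code point `index`, scanning from the start."""
    if index < 0:
        raise IndexError("negative indexes are not handled in this sketch")
    pos = 0  # position measured in code units, not characters
    for _ in range(index):
        # Skip one unit for a BMP character, two for a surrogate pair.
        pos += 2 if is_high_surrogate(units[pos]) else 1
    first = units[pos]  # raises IndexError when index is out of range
    if is_high_surrogate(first):
        low = units[pos + 1]
        return chr(0x10000 + ((first - 0xD800) << 10) + (low - 0xDC00))
    return chr(first)


units = utf16_units("a\U0001F600b")  # 'a', a non-BMP emoji, 'b'
print(len(units))                     # code units stored
print(code_point_at(units, 2))        # third character, found by scanning
```

The loop is the cost the reference describes: each lookup has to inspect every unit before the requested index.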
Unless they have done something *really* clever, the language designers
lose a hundred million points for screwing up text strings. There is
*absolutely no excuse* for a new, modern language with no backwards
compatibility concerns to choose one of the three bad choices:
I can't figure out what that means, since it contradicts itself. First
it says *every* character is 32 bits (presumably UTF-32), then it says
that internally it uses UTF-16. At least one of these statements is
wrong. (They could both be wrong, but they can't both be right.)
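
One possible reading, going by the "hidden from clients" wording and the string reference quoted earlier in the thread, is that the two statements describe different layers: the API hands out whole code points, while the storage uses 16-bit units. This is an interpretation rather than something the docs state outright. A small Python sketch of that distinction, using only standard str methods:

```python
text = "a\U0001F600b"  # 'a', an emoji outside the BMP, 'b'

# Client-facing view: one integer code point per character.
code_points = [ord(ch) for ch in text]

# Storage view under a UTF-16 encoding: 16-bit code units.
utf16_bytes = text.encode("utf-16-le")
code_unit_count = len(utf16_bytes) // 2

print(len(code_points))  # characters a client would count
print(code_unit_count)   # 16-bit units actually stored
```

Under that reading, the character count and the unit count differ whenever the text contains a character outside the Basic Multilingual Plane, which is also why size would need to iterate over the whole string.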
Chris Angelico said: Right, that's what I was looking for and didn't find. (I was searching
the one-page reference manual rather than reading in detail.) So, yes,
they're O(n) operations. Thanks for hunting that down.
ChrisA