Patrick Tyler wrote in post #990031:
Hello,

I know that this has been covered a bit here:
http://www.ruby-forum.com/topic/186437 but I'm still not certain that I
understand.

s = "foo"

s[3] is nil, like I would expect.

s[3,0] is "", instead of nil.

That behaviour is contrary to the description in the 1.9.2 docs here:

http://www.ruby-doc.org/core/classes/Array.html
The docs certainly could be more clear but the actual behavior is
self-consistent and useful.
Note: I'm assuming 1.9.X version of String.
It helps to consider the numbering in the following way:
   -4  -3  -2  -1  <-- numbering for single argument indexing
    0   1   2   3
  +---+---+---+---+
  | a | b | c | d |
  +---+---+---+---+
  0   1   2   3   4  <-- numbering for two argument indexing or start of range
 -4  -3  -2  -1
The common (and understandable) mistake is to assume that the semantics
of the single argument index are the same as the semantics of the
*first* argument in the two argument scenario (or range). They are not
the same thing in practice and the documentation doesn't reflect this.
The error though is definitely in the documentation and not in the
implementation:
single argument: the index represents a single character position
within the string. The result is either the single character string
found at the index or nil because there is no character at the given
index.
s = ""
s[0] # nil because no character at that position
s = "abcd"
s[0] # "a"
s[-4] # "a"
s[-5] # nil, no characters before the first one
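To see the single-argument rule across the whole string, it can help to walk every index from one below the most negative valid position to one past the last. This is only an illustrative sketch, assuming Ruby 1.9 or later; the loop bounds are chosen just for the demonstration.

```ruby
s = "abcd"

# Character positions run from -s.length up to s.length - 1.
# Anything outside that window has no character, so the result is nil.
(-s.length - 1).upto(s.length) do |i|
  puts "s[#{i}] => #{s[i].inspect}"
end
```

Only the first and last lines printed should show nil; every index in between names a real character.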
two integer arguments: the arguments identify a portion of the string to
extract or to replace. In particular, zero-width portions of the string
can also be identified so that text can be inserted before or after
existing characters including at the front or end of the string. In this
case, the first argument does *not* identify a character position but
instead identifies the space between characters as shown in the diagram
above. The second argument is the length, which can be 0.
s = "abcd" # each example below assumes s is reset to "abcd"
To insert text before 'a': s[0,0] = "X" # "Xabcd"
To insert text after 'd': s[4,0] = "Z" # "abcdZ"
To replace first two characters: s[0,2] = "AB" # "ABcd"
To replace last two characters: s[-2,2] = "CD" # "abCD"
To replace middle two characters: s[1,2] = "XX" # "aXXd"
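The same edge numbering explains the reading side, which is what the original question was about: "foo"[3,0] is "" because edge 3 is the end of the string, a perfectly valid place to start a zero-width slice. A small sketch of the boundary cases, assuming Ruby 1.9 or later:

```ruby
s = "abcd"

p s[4]     # nil: there is no character at position 4
p s[4, 0]  # "":  edge 4 exists (just after "d"), zero characters follow it
p s[4, 2]  # "":  still starts at a valid edge, nothing left to take
p s[5, 0]  # nil: there is no edge 5

# Assignment is stricter: an edge that does not exist raises IndexError.
begin
  s.dup[5, 0] = "Z"
rescue IndexError
  puts "IndexError: no edge 5 to insert at"
end
```

The dup is there only so the example never modifies s itself.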
The behavior of a range is pretty interesting. The starting point is the
same as the first argument when two arguments are provided (as described
above) but the end point of the range can be the 'character position' as
with single indexing or the "edge position" as with two integer
arguments. The difference is determined by whether the double-dot range
or triple-dot range is used:
s = "abcd" # as before, each example assumes s is reset to "abcd"
s[1..1] # "b"
s[1..1] = "X" # "aXcd"
s[1...1] # ""
s[1...1] = "X" # "aXbcd", the range specifies a zero-width portion of the string
s[1..3] # "bcd"
s[1..3] = "X" # "aX", positions 1, 2, and 3 are replaced.
s[1...3] # "bc"
s[1...3] = "X" # "aXd", positions 1 and 2, but not 3, are replaced.
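One way to check the range rules is to translate each range into the equivalent two-argument call and compare. The sketch below does that for every non-negative start and end with start <= end; it assumes Ruby 1.9 or later, and the restriction to non-negative bounds is just to keep the arithmetic obvious.

```ruby
s = "abcd"

(0..s.length).each do |a|
  (a..s.length).each do |b|
    # a..b includes position b, so it covers b - a + 1 characters.
    raise "mismatch at #{a}..#{b}"  unless s[a..b]  == s[a, b - a + 1]
    # a...b stops at edge b, so it covers b - a characters.
    raise "mismatch at #{a}...#{b}" unless s[a...b] == s[a, b - a]
  end
end
puts "range and two-argument forms agree"
```

If the script reaches the final puts, every range form matched its two-argument counterpart.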
If you go back through these examples and insist on using the single
index semantics for the double or range indexing examples you'll just
get confused. You've got to use the alternate numbering I show in the
ascii diagram to model the actual behavior.
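The edge numbering can also be written down as code. The helper below, edge_slice, is hypothetical (it is not part of Ruby); it re-implements the reading form s[start, length] using nothing but the edge model from the diagram, then compares itself against the real String#[] over a range of inputs. Treat it as a sketch of the model, assuming Ruby 1.9 or later, not as a description of how the interpreter is implemented.

```ruby
# Hypothetical helper (not part of Ruby): re-implements s[start, length]
# for reading, using only the "edge" numbering from the diagram.
def edge_slice(str, start, length)
  last_edge = str.length              # edges are numbered 0..str.length
  start += last_edge if start < 0     # negative edges count back from the end
  return nil if start < 0 || start > last_edge || length < 0
  # Skip to the starting edge, then take up to `length` characters.
  str.chars.drop(start).take(length).join
end

s = "abcd"
(-6..6).each do |start|
  (0..5).each do |length|
    expected = s[start, length]
    actual   = edge_slice(s, start, length)
    raise "differs at [#{start}, #{length}]" unless actual == expected
  end
end
puts "edge model matches String#[] for every case tried"
```

Note that the only nil cases are a start that is not a valid edge or a negative length; a zero length at any valid edge, including the last one, gives "".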
Gary Wright