Bizarre Range behavior

Scott Briggs

Can someone please explain this behavior in ruby (1.8.6p111):
("2"..."8").to_a => ["2", "3", "4", "5", "6", "7"]
("2".."8").to_a => ["2", "3", "4", "5", "6", "7", "8"]
("2".."9").to_a => ["2", "3", "4", "5", "6", "7", "8", "9"]
("2".."10").to_a => []
("2".."11").to_a => []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]


Cheers,
Scott
 
Yossef Mendelssohn

Can someone please explain this behavior in ruby (1.8.6p111):

("2"..."8").to_a
=> ["2", "3", "4", "5", "6", "7"]
("2".."8").to_a
=> ["2", "3", "4", "5", "6", "7", "8"]
("2".."9").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9"]
("2".."10").to_a
=> []
("2".."11").to_a
=> []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]


It gets better.
("100".."11").to_a
=> ["100"]

It seems you're running not so much into strange Range behavior as
strange String behavior in certain numeric circumstances. Or maybe a
combination of strange Range and String behavior. If you want the
ranges to make more sense, use actual numbers.
=> [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

If you want strings in the result, you can get that with a little bit
of work.
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]
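
A minimal sketch of both suggestions, using only core Ruby: iterate over an Integer range, then convert each element with to_s if strings are wanted. The variable names are purely illustrative.

```ruby
# Integer endpoints are compared and stepped numerically.
numbers = (2..11).to_a

# Convert afterwards if the result should hold strings.
strings = (2..11).map { |n| n.to_s }

p numbers
p strings
```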
 
Matthew K. Williams

Can someone please explain this behavior in ruby (1.8.6p111):
("2"..."8").to_a => ["2", "3", "4", "5", "6", "7"]
("2".."8").to_a => ["2", "3", "4", "5", "6", "7", "8"]
("2".."9").to_a => ["2", "3", "4", "5", "6", "7", "8", "9"]
("2".."10").to_a => []
("2".."11").to_a => []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

It's because you're using strings -- "11" comes before "2", hence the
failure, because it's an invalid range, just as (11 .. 2) is invalid.

Matt
 
Scott Briggs

Matt, that doesn't explain why "1".."11" works and "2".."11" doesn't
work.

Scott
("2".."11").to_a => []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

It's because you're using strings -- "11" comes before "2", hence the
failure, because it's an invalid range, just as if you had (11 .. 2) is
invalid.

Matt
 
Scott Briggs

Ah, I should clarify that. When ruby interprets "11" as an integer 11
for "1".."11", then why doesn't it do the same when it's "2".."11"?

Scott

Scott said:
Matt, that doesn't explain why "1".."11" works and "2".."11" doesn't
work.

Scott
("2".."11").to_a
=> []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

It's because you're using strings -- "11" comes before "2", hence the
failure, because it's an invalid range, just as if you had (11 .. 2) is
invalid.

Matt
 
Rob Biedenharn

Can someone please explain this behavior in ruby (1.8.6p111):
("2"..."8").to_a
=> ["2", "3", "4", "5", "6", "7"]
("2".."8").to_a
=> ["2", "3", "4", "5", "6", "7", "8"]
("2".."9").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9"]
("2".."10").to_a
=> []
("2".."11").to_a
=> []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

Well, you need to think about String#succ when the Range endpoints are
String.
It gets better.
("100".."11").to_a
=> ["100"]

Now, that one is odd. I'd have predicted a result of:
=> ["100", "101", "102", "103", "104", "105", "106", "107", "108",
"109"]
on the basis of starting with "100" and applying #succ until the value
was > "11", like this loop does:

a = []
v = "100"
loop do
  break if v > "11"
  a << v
  v = v.succ
end
p a

This loop produced the "right" result for "2".."11" (namely an empty
array) so the actual result defies (my) explanation.
It seems you're running not so much into strange Range behavior as
strange String behavior in certain numeric circumstances. Or maybe a
combination of strange Range and String behavior. If you want the
ranges to make more sense, use actual numbers.
=> [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

If you want strings in the result, you can get that with a little bit
of work.
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]



Of course, you can also do things like:

("a".."ah").to_a
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "aa",
"ab", "ac", "ad", "ae", "af", "ag", "ah"]

Which might help label your spreadsheet columns.
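
The same labels can be built by calling String#succ directly, which avoids depending on how Range decides where to stop. This is only a sketch; column_labels is a hypothetical helper name, not something Ruby provides.

```ruby
# Hypothetical helper: the first `count` spreadsheet-style column labels.
def column_labels(count)
  labels = []
  label = "A"
  count.times do
    labels << label
    label = label.succ # "Z".succ rolls over to "AA"
  end
  labels
end

p column_labels(28)
```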

-Rob

Rob Biedenharn http://agileconsultingllc.com
(e-mail address removed)
 
Matthew K. Williams

Matt, that doesn't explain why "1".."11" works and "2".."11" doesn't
work.

irb(main):015:0> "1" < "11"
=> true
irb(main):016:0> "2" < "11"
=> false


irb(main):021:0> "11" < "2"
=> true

This is true because it's comparing strings to get the range -- it
compares the strings character by character, starting with the first,
and stops at the first position where they differ (or where one of the
strings runs out).
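
A short sketch of that ordering, using String#<=> on the pairs from this thread. The second half contrasts a plain sort with a sort on the integer value; the sample array is made up for illustration.

```ruby
# String#<=> returns -1, 0 or 1 based on character-by-character comparison.
pairs = [["1", "11"], ["2", "11"], ["11", "2"], ["100", "11"]]
pairs.each do |a, b|
  puts "#{a.inspect} <=> #{b.inspect} gives #{a <=> b}"
end

# Sorting strings uses the same ordering; sort_by with to_i sorts by value.
mixed = %w[10 9 2 11 1]
by_string = mixed.sort
by_value  = mixed.sort_by { |s| s.to_i }
p by_string
p by_value
```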

Try this for an example of how the expansion is occurring:

("a".."cat").to_a

(I'm only putting a portion of it here)

=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n",
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "aa", "ab",
....
"caa", "cab", "cac", "cad", "cae", "caf", "cag", "cah", "cai", "caj",
"cak", "cal", "cam", "can", "cao", "cap", "caq", "car", "cas", "cat"]

In string order, it's going to compare strings of length 1 first, then
strings of length 2, etc... Here's another example (with an attempt at an
explanation):

irb(main):019:0> ("11" .. "2").to_a
=> ["11"]

As we've seen before, "11" < "2", so it's a part of the range, but then it
stops, we're done.

Matt
Scott
("2".."11").to_a
=> []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

It's because you're using strings -- "11" comes before "2", hence the
failure, because it's an invalid range, just as if you had (11 .. 2) is
invalid.

Matt
 
Rob Biedenharn

("2"..."8").to_a
=> ["2", "3", "4", "5", "6", "7"]
("2".."8").to_a
=> ["2", "3", "4", "5", "6", "7", "8"]
("2".."9").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9"]
("2".."10").to_a => []
("2".."11").to_a => []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

It's because you're using strings -- "11" comes before "2", hence
the failure, because it's an invalid range, just as if you had
(11 .. 2) is invalid.

Matt


Well, it certainly isn't invalid. You can easily have a Range where
the end is less than the begin value.

r = 3..-1
=> 3..-1
irb> r.to_a
=> []
irb> "hello"[r]
=> "lo"

-Rob

Rob Biedenharn http://agileconsultingllc.com
(e-mail address removed)
 
Matthew K. Williams

It gets better.
("100".."11").to_a
=> ["100"]

Now, that one is odd. I'd have predicted a result of:
=> ["100", "101", "102", "103", "104", "105", "106", "107", "108", "109"]
on the basis of starting with "100" and applying #succ until the value
was > "11", like this loop does:

It's doing a comparison of the strings -- it has to do with the
length of the string. "100" is longer than "11", but it also happens to
compare as less (and, based on #succ, it's "less").

In order to find the range, it's going to compare the two strings --

+ it compares for the string lengths to get whether the beginning is less
than the end

+ It then uses #succ to try to expand the range, but since "100" has more
characters than "11", it stops...

Hope I've not muddied it too much.....

Matt
 
Matthew K. Williams

Well, it certainly isn't invalid. You can easily have a Range where the end
is less than the begin value.

r = 3..-1
=> 3..-1
irb> r.to_a
=> []
irb> "hello"[r]
=> "lo"

I guess the code for substring treats it differently than #to_a -- just
taking the bounds. Huh. That's pretty interesting. Learn something
every day. Makes sense when I stop to think about it, though.

Just don't try "hello"[3,-1]....
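
The three cases side by side, as a small sketch:

```ruby
r = 3..-1

p r.to_a          # iterating from 3 up to -1 visits nothing
p "hello"[r]      # as an index, -1 counts back from the end of the string
p "hello"[3, -1]  # start/length form: a negative length returns nil
```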

I need to read the rdocs more often....
Matt
 
Rob Biedenharn

It gets better.
("100".."11").to_a
=> ["100"]

Now, that one is odd. I'd have predicted a result of:
=> ["100", "101", "102", "103", "104", "105", "106", "107", "108",
"109"]
on the basis of starting with "100" and applying #succ until the
value was > "11", like this loop does:

It's doing a comparison of the strings -- it has to do with the
length of the string. "100" is longer than "11", it also happens to
be less characters (and, based on #succ, it's "less").

In order to find the range, it's going to compare the two strings --

+ it compares for the string lengths to get whether the beginning
is less than the end

+ It then uses #succ to try to expand the range, but since "100" has
more characters than "11", it stops...

Hope I've not muddied it too much.....

Matt


Well, the Range#to_a is actually Enumerable#to_a and uses Range#each
defined in range.c

After checking that the beginning of the range responds to :succ and
if it is a Fixnum (which are special), it finds that the Range.begin
is a String:

else if (TYPE(beg) == T_STRING) {
    VALUE args[5];
    long iter[2];

    args[0] = beg;
    args[1] = end;
    args[2] = range;
    iter[0] = 1;
    iter[1] = 1;
    rb_iterate(str_step, (VALUE)args, step_i, (VALUE)iter);
}

str_step calls rb_str_upto defined in string.c

VALUE
rb_str_upto(VALUE beg, VALUE end, int excl)
{
    VALUE current, after_end;
    ID succ = rb_intern("succ");
    int n;

    StringValue(end);
    n = rb_str_cmp(beg, end);
    if (n > 0 || (excl && n == 0)) return beg;
    after_end = rb_funcall(end, succ, 0, 0);
    current = beg;
    while (!rb_str_equal(current, after_end)) {
        rb_yield(current);
        if (!excl && rb_str_equal(current, end)) break;
        current = rb_funcall(current, succ, 0, 0);
        StringValue(current);
        if (excl && rb_str_equal(current, end)) break;
        StringValue(current);
        if (RSTRING_LEN(current) > RSTRING_LEN(end) || RSTRING_LEN(current) == 0)
            break;
    }

    return beg;
}

Now, not having read a lot of Ruby's C code, I'm not sure what some
bits are for (like calling StringValue(current) so much), but it does
ultimately behave almost like Matt said. The difference being that
the rb_yield(current) has already happened once before the length
check (RSTRING_LEN(current) > RSTRING_LEN(end)). I think the
RSTRING_LEN(current)==0 is there to catch "".succ == "", but that just
means that (""..any).to_a is [""] and yet ("".."").to_a is [] (because
after_end will be "" and the loop is never entered).
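
To make the control flow easier to poke at, here is a rough Ruby transliteration of the rb_str_upto quoted above. It is only a sketch: legacy_str_upto is a hypothetical name, and it collects the values into an array instead of yielding them. Because it does not go through Range#each, its output does not depend on which Ruby version runs it.

```ruby
# Hypothetical helper mirroring the 1.8.6 rb_str_upto logic shown above.
def legacy_str_upto(beg, last, excl = false)
  result = []
  n = beg <=> last
  # Start sorts after end (or equals it in an exclusive range): nothing to do.
  return result if n > 0 || (excl && n == 0)

  after_end = last.succ
  current = beg
  until current == after_end
    result << current                  # the rb_yield happens before any length check
    break if !excl && current == last
    current = current.succ
    break if excl && current == last
    # Stop once succ has grown past the end's length (or hit "".succ == "").
    break if current.length > last.length || current.empty?
  end
  result
end

p legacy_str_upto("2", "11")        # start sorts after end
p legacy_str_upto("100", "11")      # one yield, then the length check stops it
p legacy_str_upto("2", "20", true)  # exclusive end
```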

So it's the odd situation that String is given some special treatment
and has the unusual property that there are strings a,b such that:
a < b && a.length > b.length

Knowing this, here's an even more bizarre-looking example:

irb> "19".succ
=> "20"
irb> ("2".."19").to_a
=> []
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]


-Rob


Rob Biedenharn http://agileconsultingllc.com
(e-mail address removed)
 
Daniel Berger

-----Original Message-----
From: Yukihiro Matsumoto [mailto:[email protected]]
Sent: Tuesday, August 04, 2009 10:56 PM
To: ruby-talk ML
Subject: Re: Bizarre Range behavior

Hi,

In message "Re: Bizarre Range behavior"
on Wed, 5 Aug 2009 05:32:37 +0900, Rob Biedenharn

|Knowing this, here's an even more bizarre-looking example:
|
|irb> "19".succ
|=> "20"
|irb> ("2".."19").to_a
|=> []
|irb> ("2"..."20").to_a
|=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
|"14", "15", "16", "17", "18", "19"]

What if I sprinkle more magic to the language and change String#upto
to generate numerical sequences when all characters in edges are
digits, so that

irb> ("2".."19").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"]

Any opinion? I already made a patch for trunk.

I vote against. If people want numeric ranges, it's their job to use
numerics, not magically convert stringy numbers into actual numbers. This
isn't Perl after all.

Regards,

Dan
 
Brian Candler

Yukihiro said:
What if I sprinkle more magic to the language and change String#upto
to generate numerical sequences when all characters in edges are
digits, so that

irb> ("2".."19").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"]

Any opinion?

-1 for added complexity with little benefit
 
James Coglan

[Note: parts of this message were removed to make it a legal post.]

2009/8/5 Brian Candler said:
Yukihiro said:
What if I sprinkle more magic to the language and change String#upto
to generate numerical sequences when all characters in edges are
digits, so that

irb> ("2".."19").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"]

Any opinion?

-1 for added complexity with little benefit


I'm also against this. I prefer explicit type conversions here: changing
behaviour because a string happens to look like a number will likely cause
more problems than it solves. In fact this kind of thing shows up in
JavaScript and it usually masks bugs where the developer has failed to
properly handle user input.

If this change were to go ahead, I'd also argue for changing String#+ to
recognise numbers, which might also mean changing Numeric#+ for symmetry.
 
Robert Klemme

2009/8/5 Daniel Berger said:
-----Original Message-----
From: Yukihiro Matsumoto [mailto:[email protected]]
Sent: Tuesday, August 04, 2009 10:56 PM
To: ruby-talk ML
Subject: Re: Bizarre Range behavior

Hi,

In message "Re: Bizarre Range behavior"
    on Wed, 5 Aug 2009 05:32:37 +0900, Rob Biedenharn

|Knowing this, here's an even more bizarre-looking example:
|
|irb> "19".succ
|=> "20"
|irb> ("2".."19").to_a
|=> []
|irb> ("2"..."20").to_a
|=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
|"14", "15", "16", "17", "18", "19"]

What if I sprinkle more magic to the language and change String#upto
to generate numerical sequences when all characters in edges are
digits, so that

irb> ("2".."19").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"]

Any opinion? I already made a patch for trunk.

I vote against. If people want numeric ranges, it's their job to use
numerics, not magically convert stringy numbers into actual numbers. This
isn't Perl after all.

I strongly agree. Typing .to_i isn't too hard and it makes clear what
is intended.
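
Spelled out, the explicit conversion might look like the sketch below. numeric_string_range is a hypothetical helper, and this version uses Integer() rather than to_i so that non-numeric input raises instead of silently becoming 0; that choice is an assumption of the sketch, not something proposed in the thread.

```ruby
# Hypothetical helper: take string endpoints, iterate numerically,
# and hand back strings.
def numeric_string_range(first, last)
  # Integer(str, 10) raises ArgumentError for input such as "two".
  (Integer(first, 10)..Integer(last, 10)).map(&:to_s)
end

p numeric_string_range("2", "11")
```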

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
James Gray

-----Original Message-----
From: Yukihiro Matsumoto [mailto:[email protected]]
Sent: Tuesday, August 04, 2009 10:56 PM
To: ruby-talk ML
Subject: Re: Bizarre Range behavior

Hi,

In message "Re: Bizarre Range behavior"
on Wed, 5 Aug 2009 05:32:37 +0900, Rob Biedenharn

|Knowing this, here's an even more bizarre-looking example:
|
|irb> "19".succ
|=> "20"
|irb> ("2".."19").to_a
|=> []
|irb> ("2"..."20").to_a
|=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
|"14", "15", "16", "17", "18", "19"]

What if I sprinkle more magic to the language and change String#upto
to generate numerical sequences when all characters in edges are
digits, so that

irb> ("2".."19").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"]

Any opinion? I already made a patch for trunk.

I vote against. If people want numeric ranges, it's their job to use
numerics, not magically convert stringy numbers into actual numbers.
This isn't Perl after all.

I agree. I'm against the change.

James Edward Gray II
 
Piyush Ranjan

+1 for that. and -1 for the change.

It just makes the developer look too stupid. Can't we let the developers
understand the difference between a string and an integer ?


I vote against. If people want numeric ranges, it's their job to use
 
Scott Briggs

Piyush said:
+1 for that. and -1 for the change.

It just makes the developer look too stupid. Can't we let the developers
understand the difference between a string and an integer ?

If it was 20 years ago, I'd understand this sentiment. What I don't
understand is why programming languages seem to insist on using
semantics that don't adapt to the natural ways that humans interact or
think. It's almost as if people prefer to fight against the inevitable
evolution of programming languages.

In this case, your only argument for not introducing this "magic" is
because people need to understand the difference between a string and an
integer, why is that so critical in this case? There's no ambiguity in
"2".."11".

There are a lot of constructs in ruby that make it much easier to use
and understand from a natural language point of view, one of the big
strengths of ruby, and this in turn makes it more accessible to people
who are interested in programming and not getting bogged down in the
minutiae of why "2" is greater than "11".
 
David Masover

If it was 20 years ago, I'd understand this sentiment. What I don't
understand is why programming languages seem to insist on using
semantics that don't adapt to the natural ways that humans interact or
think.

Because the semantics with which humans interact and think are ambiguous,
often illogical, and often rely on intuition.

We can't give our languages intuition, but the more we try to do so, and the
more magic we introduce, the less predictable things get.
There are a lot of constructs in ruby that make it much easier to use
and understand from a natural language point of view, one of the big
strengths of ruby, and this in turn makes it more accessible to people
who are interested in programming and not getting bogged down in the
minutiae of why "2" is greater than "11".

Programming inevitably leads to at least understanding these minutiae. I use
Ruby, and I love it for that natural-language expressiveness, and also just
for the conciseness, even where I know it's less efficient:

(2..11).map(&:to_s)

But there's a case to be made that at a certain point, you need to understand
what's going on. A simple example: What's the difference between a string and a
symbol? Someone who uses strings where they should use symbols is making their
program needlessly inefficient and verbose; someone who does the opposite is
introducing a rather serious memory leak and potential DoS vulnerability.

You could make the case that we should just use strings, and find ways to make
them really efficient. But hey, at least the semantics of symbols are adequately
covered by strings -- the semantics of numbers really aren't.

Put another way: Currently, we're allowed to do:

puts 'Ho! '*3 + 'Merry Christmas!'

Now, suppose we start making + and * smart, so that '2'*'3'='6'. Now what does
'2'*3 do? Is it '6', or 6, or '222'? It certainly seems feasible a newbie
would get stuck here -- for example, what if they feel like adding 000 as a
delimiter -- '0'*80 instead of '-'*80 to make a horizontal line -- did they
get eighty zeros, or the product of 0*80=0?

Or suppose they added a space into their number accidentally -- is '2 '*80
equal to '160' or '2 2 2 2 ...'? Maybe it's just me, but '2 ' seems like a
much more probable mistake (and a harder one to catch) than saying '2' when
you mean 2.
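
For reference, a small sketch of how String#* behaves today, where the right-hand side must be an Integer repeat count and nothing is guessed from the string's contents. The variable names are illustrative only.

```ruby
greeting = 'Ho! ' * 3 + 'Merry Christmas!'
puts greeting

rule = '0' * 80        # eighty zeros, never the product 0
spaced = '2 ' * 3      # plain repetition, stray space included
p rule.length
p spaced

# A String on the right-hand side is rejected outright.
begin
  '2' * '3'
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```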

By making the easy stuff ridiculously easy (and assuming users are idiots), it
adds enough ambiguity to drive users crazy later on.

Maybe I'm overreacting, and this would be fine for ranges, but I think "magic"
only makes sense when it's very well understood and predictable. 'puts'
calling #to_s on everything, and 'p' calling #inspect on everything, makes
sense. Range calling #to_i sometimes just seems like it's asking for trouble.
 
James Coglan

2009/8/19 David Masover said:
Because the semantics with which humans interact and think are ambiguous,
often illogical, and often rely on intuition.

We can't give our languages intuition, but the more we try to do so, and the
more magic we introduce, the less predictable things get.


Programming inevitably leads to at least understanding these minutiae. I use
Ruby, and I love it for that natural-language expressiveness, and also just
for the conciseness, even where I know it's less efficient:


I second this. "Magic" (for want of a better word) is only useful when it
gives you a faster way to achieve the same result. To anyone with moderate
or above programming experience, the difference between strings and numbers
is important and I for one would be annoyed at finding strings being
magically handled as numbers when that isn't what I wanted -- especially if
it were happening to user-supplied data.

This isn't an implementation detail that ought to be hidden from the user to
make things easier (like dynamic typing, or automatic garbage collection):
strings and numbers are conceptually different types of data that support
different operations and different semantics. I think trying to do too much
automatic type conversion is likely to end up producing a lot of the
problems that exist with number/string/boolean comparison in PHP and (to a
lesser extent) JavaScript.

David mentions concatenation vs addition -- what about splitting? I can
split "1234" into "12" and "34" and I have two perfectly valid strings; if I
split the number 1234 into 12 and 34 I've not done something meaningful. In
a number the digits have meaning based on their position within the number,
which itself depends on the base used to represent the number. A string is
just a sequence of glyphs, which have no intrinsic meaning at a technical
level.
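
A tiny sketch of that asymmetry; the values are arbitrary. Splitting the string and gluing the halves back together is lossless, while the two halves of the number only recombine if the place value is put back.

```ruby
text = "1234"
left, right = text[0, 2], text[2, 2]
p [left, right]
p left + right == text         # concatenation restores the string

number = 1234
high, low = number.divmod(100)
p [high, low]
p high + low == number         # plain addition does not restore it
p high * 100 + low == number   # the position (base 10, two digits) must be reapplied
```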

Ruby's design is said to follow the principle of least surprise; to me this
means that consistency and correctness should be maintained. Blurring the
boundaries between strings and numbers is a frequent cause of bugs for
beginners in some other languages, and I think Ruby does well to enforce
some separation between them to guide people in the right direction.
 
