sorting by rand?

P

Peter Seebach

Browsing something at a bookstore recently, I saw an example of
an array shuffle allegedly implemented with
x.sort { rand }
or something to that effect.

This seems sort of unlikely, to me, to be a correct shuffling
algorithm. I'm not even sure it's safe at all. I know that
doing the same thing to C qsort implementations occasionally
causes one to blow up spectacularly, because its assumption
that the comparison between any two items is constant gets
broken.

So, two vaguely related questions:
1. Is there a reasonably sane way to do this, should I ever find
myself wanting one?
2. Am I right in guessing that handing garbage to Array#sort
is probably crazy?

-s
 
R

Rick DeNatale

Browsing something at a bookstore recently, I saw an example of
an array shuffle allegedly implemented with
x.sort { rand }
or something to that effect.
This seems sort of unlikely, to me, to be a correct shuffling
algorithm. I'm not even sure it's safe at all. I know that
doing the same thing to C qsort implementations occasionally
causes one to blow up spectacularly, because its assumption
that the comparison between any two items is constant gets
broken.

So, two vaguely related questions:
1. Is there a reasonably sane way to do this, should I ever find
myself wanting one?
2. Am I right in guessing that handing garbage to Array#sort
is probably crazy?

I believe that the "something to that effect was probably:

x.sort_by {rand}

sort_by takes a block which normally takes one argument. sort_by
calls this block once for each element of the enumeration and
associates that value with the element, it then sorts those values and
returns an array of the asssociated elements in the order of the
values. So for example:

(1..50).sort_by{|e| -e}
==>[50, 49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36,
35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19,
18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

The block without the argument just assignes a random value to each element.

Perfectly safe as far as I can see.

x.sort {rand}

doesn't really work, whether or not it's safe. Rand without an
argument returns a float >= 0.0 and < 1.0, the block for sort is
expected to return -1, 0, or 1. Depending on the implementaiton of the
particular sort method, you might get different results. For arrays,
it looks like you'll get the same results with repeated calls:

irb(main):020:0> (1..10).to_a.sort {rand}
=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]
irb(main):021:0> (1..10).to_a.sort {rand}
=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]

Whereas sort_by seems to do what's expected.

irb(main):022:0> (1..10).to_a.sort_by {rand}
=> [5, 9, 2, 1, 7, 6, 4, 10, 8, 3]
irb(main):023:0> (1..10).to_a.sort_by {rand}
=> [4, 8, 5, 6, 2, 3, 10, 7, 1, 9]
 
R

Rick DeNatale

Well, okay. It works in practice. Is it guaranteed to work, and is
it actually reasonable to expect the output to be randomly shuffled?

See my longer explanation of sort_by which crossed in the e-mail.
Just for comparison, I tried:

l = a.length
a.each_index { |x| a[x] = a[rand(l)] }

I'm pretty sure I got the generic shuffle algorithm right (Knuth FTW),
and if so, it produces quality random results in about 1/4 the time.

Looks suspicious to me, you are replacing each element of x in turn
with a randomly selected element, which means that you are likely to
end up with a different set of elements in the end. One of the
requirements of shuffling an array would seem to be that

shuffle(a).sort == a.sort

Something like:

a.each_index {|x| y=rand; a[x], a[y] = a[y], a[x]}

would seem to be a better approach, but a.sort_by {rand} does the same
thing more concisely.
 
M

M. Edward (Ed) Borasky

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Peter said:
Browsing something at a bookstore recently, I saw an example of
an array shuffle allegedly implemented with
x.sort { rand }
or something to that effect.

This seems sort of unlikely, to me, to be a correct shuffling
algorithm. I'm not even sure it's safe at all. I know that
doing the same thing to C qsort implementations occasionally
causes one to blow up spectacularly, because its assumption
that the comparison between any two items is constant gets
broken.

So, two vaguely related questions:
1. Is there a reasonably sane way to do this, should I ever find
myself wanting one?
2. Am I right in guessing that handing garbage to Array#sort
is probably crazy?

-s

Welll ... I do this in Excel all the time -- add a column of random
numbers to the right of the (CSV) file I want to shuffle, sort it about
five times, hitting F9 before each sort to change the random numbers,
and then deleting the random column and saving the file. Translate that
to Ruby and I think you've got it.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGNYhUaZUb+jwczfoRAgn3AKDJnG8yQYVTWkoPTw9YNT42/6M/OACg1NFg
76A4qzYkZjIIKBA/YAPblDM=
=KVsa
-----END PGP SIGNATURE-----
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,241
Messages
2,571,219
Members
47,850
Latest member
StewartTha

Latest Threads

Top