Beginner Question--rand( )

H

hedylogus

I've just begun learning C++ and I'm trying to write a program to shuffle a
deck of cards. I've succeeded....for the most part....but every now and
then rand() produces duplicate random numbers causing me to "lose" a card.
How do I avoid this??? I've attached my code below for reference. Thanks
in advance.

------------------------
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <string>
#include <map>
#include <algorithm>

using namespace std;

string cardDeck[] = {"As", "Ah", "Ad", "Ac",
"2s", "2h", "2d", "2c",
"3s", "3h", "3d", "3c",
"4s", "4h", "4d", "4c",
"5s", "5h", "5d", "5c",
"6s", "6h", "6d", "6c",
"7s", "7h", "7d", "7c",
"8s", "8h", "8d", "8c",
"9s", "9h", "9d", "9c",
"Ts", "Th", "Td", "Tc",
"Js", "Jh", "Jd", "Jc",
"Qs", "Qh", "Qd", "Qc",
"Ks", "Kh", "Kd", "Kc"};

map<int,string> shuffledDeck;
int randomNumber[52]; //used for debug purposes

int main()
{
srand( time(NULL) );

cout << "Current seed value: "
<< time(NULL)
<< "\n\n";

for (int i=0; i<52; ++i)
{
randomNumber = rand();
shuffledDeck[ randomNumber ] = cardDeck;
}

for (map<int,string>::iterator p = shuffledDeck.begin(); p !=
shuffledDeck.end(); ++p)
{
cout << p->first
<< "\t"
<< p->second
<< endl;
}

if (shuffledDeck.size () != 52) //indicates duplicate random numbers
{
sort(randomNumber, randomNumber+52); //easy viewing
cout << endl
<< "DUPLICATE RANDOM NUMBERS!\n\n";
for (int i=0; i<52; ++i)
cout << randomNumber
<< "\t";

cout << "\n";
}

cout << endl;
system("PAUSE");
return 0;
}
 
P

Pete Becker

hedylogus said:
every now and
then rand() produces duplicate random numbers

Of course it does. If it didn't, the numbers wouldn't be random.
How do I avoid this???

std::random_shuffle.

--

"To delight in war is a merit in the soldier,
a dangerous quality in the captain, and a
positive crime in the statesman."
George Santayana

"Bring them on."
George W. Bush
 
C

Cy Edmunds

Pete Becker said:
Of course it does. If it didn't, the numbers wouldn't be random.

Repeat values are a consequence of finite numbers of bits computed by the
algorithm, not randomness itself. Of course with certain lousy
implementations of rand() only a miserly 15 bits (the absolute minimum
allowed by the standard) are provided, resulting in repetition in very short
run lengths.
std::random_shuffle.

Or get a decent random number generator (31 bits or more). They are readily
available. You will still get repeats if you let your program run for the
rest of your life.
 
B

Buster Copley

Cy said:
Repeat values are a consequence of finite numbers of bits computed by the
algorithm, not randomness itself. Of course with certain lousy
implementations of rand() only a miserly 15 bits (the absolute minimum
allowed by the standard) are provided, resulting in repetition in very short
run lengths.

That's silly. If the numbers were truly random, the expected number of
draws before a duplicate would be much less than the number of possible
results.
Or get a decent random number generator (31 bits or more). They are readily
available. You will still get repeats if you let your program run for the
rest of your life.

If only 52 draws are made, what difference does drawing from 2^32 values
rather than 2&16 make? The chance of some number occurring twice in
those 52 draws would presumably still be significantly greater than
zero in the latter case.

(To the OP:)
Search for Fisher-Yates shuffle for the usual (optimal) implementation
of random_shuffle.

Regards,
Buster
 
B

Buster Copley

Buster said:
If only 52 draws are made, what difference does drawing from 2^32 values
rather than 2&16 make? The chance of some number occurring twice in
those 52 draws would presumably still be significantly greater than
zero in the latter case.

That is, 2^16, and I hope you can guess which case I meant.
 
P

Pete Becker

Cy said:
Repeat values are a consequence of finite numbers of bits computed by the
algorithm, not randomness itself.

No. Suppose you roll a six-sided die five times and you get 5, 2, 3, 6,
and 1. Do you expect 4 to come up next?

Of course with certain lousy
implementations of rand() only a miserly 15 bits (the absolute minimum
allowed by the standard) are provided, resulting in repetition in very short
run lengths.


Or get a decent random number generator (31 bits or more).

Won't help, if your technique is wrong. That's why random_shuffle isn't
called rand. They do different things.

--

"To delight in war is a merit in the soldier,
a dangerous quality in the captain, and a
positive crime in the statesman."
George Santayana

"Bring them on."
George W. Bush
 
D

Danny Anderson

....but every now and
then rand() produces duplicate random numbers causing me to "lose" a card.
How do I avoid this??? I've attached my code below for reference. Thanks
in advance.

Despite quite a few answers already, I will take a stab at this in a
'newbie' context, since I am barely, if at all, past that stage myself.
rand() produces a stream of psuedo-random numbers based on a seed, the
computer clock in your case.
rand() cannot ensure that there are no duplicates in that stream. I can
think of three options for you in this case:

1) Find a more sophisticated random number generator. This is reasonable,
but IMHO, you would be better off with #2 or #3

2) Develop a test condition so you can "draw again" if the random number
tries to draw more than once from the same spot in CardDeck[]. This is a
reasonable solution, except for the fact that your code show s that you
have at least a passing familiarity with STL.

3) Go for the double bonus: go for random_shuffle like Rolf suggested. Not
only will you solve your duplication problem, your code will be much smaller
and cleaner. The bonus-bonus is that your code will be very readable, a
definite plus as far as grading goes.
 
P

Pete Becker

Danny said:
1) Find a more sophisticated random number generator. This is reasonable,
but IMHO, you would be better off with #2 or #3

This is not the problem.
2) Develop a test condition so you can "draw again" if the random number
tries to draw more than once from the same spot in CardDeck[]. This is a
reasonable solution, except for the fact that your code show s that you
have at least a passing familiarity with STL.

This becomes slower as you use up numbers. Bad approach.
3) Go for the double bonus: go for random_shuffle like Rolf suggested. Not
only will you solve your duplication problem, your code will be much smaller
and cleaner. The bonus-bonus is that your code will be very readable, a
definite plus as far as grading goes.

Or learn how to do it yourself. It's not that hard, and provides some
useful insights.

--

"To delight in war is a merit in the soldier,
a dangerous quality in the captain, and a
positive crime in the statesman."
George Santayana

"Bring them on."
George W. Bush
 
D

Danny Anderson

3) Go for the double bonus: go for random_shuffle like Rolf suggested.
Not
Or learn how to do it yourself. It's not that hard, and provides some
useful insights.

Maybe so- my take was that the guy was using STL structures and some general
algorithms like sort(), but didn't seem to know about random_shuffle. In
that case, I think random_shuffle is a really good choice. I understand the
DIY rationalization, but it seems to me a lot of stuff works in the opposite
order: learn to use the tools available first, the improve and fabricate new
better tools when the need arises.

Just an opinion, don't flog me for it too much. :)

Danny
 
C

Cy Edmunds

Pete Becker said:
No. Suppose you roll a six-sided die five times and you get 5, 2, 3, 6,
and 1. Do you expect 4 to come up next?

No, I don't. But if I did get a 4, on the seventh roll I would certainly get
a duplicate. This has nothing to do with randomness. It only has to do with
the fact that a die only has six sides. Hence:

"Repeat values are a consequence of finite numbers of bits computed by the
algorithm, not randomness itself."

<snip>
 
S

Sam Holden

No, I don't. But if I did get a 4, on the seventh roll I would certainly get
a duplicate. This has nothing to do with randomness. It only has to do with
the fact that a die only has six sides. Hence:

"Repeat values are a consequence of finite numbers of bits computed by the
algorithm, not randomness itself."

Don't let hundreds of years of mathematics stand in the way of your unique
of the universe, will you?

Here's an algorithm for generating an unlimited number of "bits" and hence
there is no finite limit to the number of bits.

Roll a die, if the result is not a one or two record the result. If the result
is anthing else roll two dice (or one die twice) and concatanate the results
given by applying this mechanism recursively.

If on our first "roll" we end up rolling:

3
4
1
3
2
1
6
3
4
1
2
6
2
4
1
1

Giving:

12112211

As my randombly generated result.

To be random it must be possible that that exact result will be repeated the
next time I try. Even though, there are an infinite number of potential
outcomes.
 
C

Cy Edmunds

Sam-

I suggest you read my original post. The perhaps you can formulate a
relevant reply.
 
S

Sam Holden

Sam-

I suggest you read my original post. The perhaps you can formulate a
relevant reply.

No thanks.

Any random process can generate repeats, else it isn't random.

Simple mathematics.

You can claim otherwise, for all I care. You can claim pi is rational too,
it really doesn't bother me.
 
C

Cy Edmunds

Sam Holden said:
No thanks.

Any random process can generate repeats, else it isn't random.

Simple mathematics.

You can claim otherwise, for all I care. You can claim pi is rational too,
it really doesn't bother me.

sigh. OK, let's go over it one point at a time. But this time we will deal
with actual quotes rather than inuendos. Here is what I actually said:

1. "Repeat values are a consequence of finite numbers of bits computed by
the algorithm, not randomness itself."

In case English isn't your first language, let me explain what this means.
If an algorithm generates a sequence of values, each of which is described
by a finite number of bits, the sequence must eventually generate repeat
values. This is true for random number generators but also for any other
algorithm. It is a direct consequence of the fact that n bits can only
express 2^n different values.

Does this somehow contract "hundreds of years of mathematics?" If so, how?

2. "Of course with certain lousy implementations of rand() only a miserly 15
bits (the absolute minimum allowed by the standard) are provided, resulting
in repetition in very short run lengths."

The maximum run length before repeats must occur is 2^n. Obviously higher
for higher values of n. If you sample with replacement the situation is more
complex although easily analyzable. But the trend is still, more bits ->
longer cycle time.

Does this somehow contract "hundreds of years of mathematics?" If so, how?

That's it. Only two points, both of which are obviously true. Now to Pete's
response:

"No. Suppose you roll a six-sided die five times and you get 5, 2, 3, 6, and
1. Do you expect 4 to come up next?"

Now which of my two points was Pete addressing? It certainly doesn't
contradict point 1. It also has nothing to do with point 2. In fact his
assertion is certainly true but has nothing to do with either of my points.

Now my response:

"No, I don't."

In other words, I agree with his assertion.

"But if I did get a 4, on the seventh roll I would certainly get a
duplicate. This has nothing to do with randomness. It only has to do with
the fact that a die only has six sides."

Here I am using his own example to restate my point #1.

Now, to you. You assert:

"Any random process can generate repeats, else it isn't random."

Technically, this is incorrect. For instance, the mathematics of sampling
from a Normal distribution, whose variables are truly continuous, allows for
no repeats. But given the context of computer algorithms, which after all is
what we were talking about, the number of bits must be finite and your
assertion is certainly correct.

Now for the punch line: having reviewed the dialog so far I ask you, where
do you see me saying otherwise? I want a direct quote or an apology.

If you want to make snotty responses I suggest you actually read the posts.

Cy
 
S

Sam Holden

On Sun, 27 Jul 2003 22:48:40 GMT,

[snip]
sigh. OK, let's go over it one point at a time. But this time we will deal
with actual quotes rather than inuendos. Here is what I actually said:

1. "Repeat values are a consequence of finite numbers of bits computed by
the algorithm, not randomness itself."

In case English isn't your first language, let me explain what this means.
If an algorithm generates a sequence of values, each of which is described
by a finite number of bits, the sequence must eventually generate repeat
values. This is true for random number generators but also for any other
algorithm. It is a direct consequence of the fact that n bits can only
express 2^n different values.

I disagree on this point.

Yes, with a finite number of outcomes, repeats are required for infinite
(or a finite number greater than the number of outcomes) applications.
That's a given.

However, a random number generator which has the domain of the nonnegative
integers, for example, to be truly random must be capable of generating
repeats. Of course
the probability of a repeat is infinitesimal.

If repeats aren't possible then the process isn't random.

The disagreement is probably about the definition of "random" which, as
with most terms, means many quiet different things. In comp.lang.c++ the
two normal definitions are "wierd or uncertain" as in "my program is
randomly crashing", or "to select from a set with equal probability of
all elements being selected", as in "how do I generate a random number?".
Does this somehow contract "hundreds of years of mathematics?" If so, how?

2. "Of course with certain lousy implementations of rand() only a miserly 15
bits (the absolute minimum allowed by the standard) are provided, resulting
in repetition in very short run lengths."

The maximum run length before repeats must occur is 2^n. Obviously higher
for higher values of n. If you sample with replacement the situation is more
complex although easily analyzable. But the trend is still, more bits ->
longer cycle time.

Does this somehow contract "hundreds of years of mathematics?" If so, how?

That's it. Only two points, both of which are obviously true. Now to Pete's
response:

"No. Suppose you roll a six-sided die five times and you get 5, 2, 3, 6, and
1. Do you expect 4 to come up next?"

Now which of my two points was Pete addressing? It certainly doesn't
contradict point 1. It also has nothing to do with point 2. In fact his
assertion is certainly true but has nothing to do with either of my points.

Does everything have to be in response to your specific points?

Are people not allowed to go off on tangents, discuss misinterpretations,
ignore the actual point, discuss things from the perspective of C++, etc?
Now my response:

"No, I don't."

In other words, I agree with his assertion.

"But if I did get a 4, on the seventh roll I would certainly get a
duplicate. This has nothing to do with randomness. It only has to do with
the fact that a die only has six sides."

Here I am using his own example to restate my point #1.

Now, to you. You assert:

"Any random process can generate repeats, else it isn't random."

Technically, this is incorrect. For instance, the mathematics of sampling
from a Normal distribution, whose variables are truly continuous, allows for
no repeats. But given the context of computer algorithms, which after all is
what we were talking about, the number of bits must be finite and your
assertion is certainly correct.

Now for the punch line: having reviewed the dialog so far I ask you, where
do you see me saying otherwise? I want a direct quote or an apology.

If you want to make snotty responses I suggest you actually read the posts.

This is usenet, I'm free to jump in and reply to a post I read. Honestly
the back story doesn't worry me. If it isn't quoted in the message I'm
replying to. I'm replying to a particular point in a particular post, if
you don't like that then there is no obligation to read my posts.
 
P

Peter van Merkerk

Cy Edmunds said:
by

No, I don't. But if I did get a 4, on the seventh roll I would certainly get
a duplicate. This has nothing to do with randomness. It only has to do with
the fact that a die only has six sides. Hence:

"Repeat values are a consequence of finite numbers of bits computed by the
algorithm, not randomness itself."

You probably mean "Repeating *sequences* of values are a consequence of
finite numbers of bits computed by the *pseudo random* algorithm.".
Within a pseudo random sequence the same value may occure multiple
times. Even if a algorithm would use an infinite number of bits to
compute a value, it could still produce many times the same value.
 
W

Wellu =?iso-8859-1?Q?M=E4kinen?=

Within a pseudo random sequence the same value may occure multiple
times. Even if a algorithm would use an infinite number of bits to
compute a value, it could still produce many times the same value.

Note that even real random number generator might produce
sequence of million same numbers.
 
B

Buster

sigh. OK, let's go over it one point at a time. But this time we
Your quote above has TWO parts:

1 (a) "Repeat values are a consequence of finite numbers of bits computed
by the algorithm."

1 (b) "Repeat values are not a consequence of randomness itself."

The first part is true, and utterly vacuous. The second part, which is
false, has been the target of the arguments presented so far. If you
didn't notice this, you cannot have understood those arguments. If you
noticed it but decided to ignore it, then I can think of several apt
Latin words, along with any number in demotic anglo-saxon.

No, but no one said it does. You're defending one half of your
statement while Pete and Simon question the other half.

Here's a demonstration that (b) is false.

Let X[0], X[1], ..., X[k-1] be discrete random variables uniformly
distributed on the set { 0, 1, 2, ..., n-1 }. Let A be the event
that there is at least one repetition in this collection, that is,
A = (X==X[j]) for some i and j with 0 <= i < j < k. The complementary
event B is that all the X's are distinct:
B = (X[1]!=X[0]) && (X[2] is not in { X[0], X[1] })
&& (X[3] is not in { X[0], X[1], X[2] })
&& ...
&& (X[k-1] is not in { X[0], X[1], ..., X[k-2] })

We can see that B has probability
P (B) = ((n-1)/n) * ((n-2)/n) * ... * ((n-k+1)/n)
= n!/((n-k)! * pow (N, k-1)).
And of course P (A) = P (B) - 1.

We'll suppose k = 52 as in the original post. With n = 1 << 15,
P (A) comes to 0.03968, and with n = 1 << 31 it's 0.0000006175.

This is an approximation to the probability of the OP's algorithm failing
to do its job on any given run. If I were him I'd use the Fisher-Yates
shuffle instead. Then I'm absolutely certain to get what I want.

Caveat: for all I know, you can choose your pseudo-random number
generator in such a way that you are guaranteed not to get any
repetition during any run of k draws, whether k be 52 or 5002.
However, the OP had settled on using std::rand (), so he could rely
on no such guarantee.

Regards,
Buster Copley
 
P

Peter van Merkerk

Within a pseudo random sequence the same value may occure multiple
Note that even real random number generator might produce
sequence of million same numbers.

Though the probability of that happing is very small (but not
impossible).
 
C

Cy Edmunds

Buster said:
Your quote above has TWO parts:

1 (a) "Repeat values are a consequence of finite numbers of bits computed
by the algorithm."

1 (b) "Repeat values are not a consequence of randomness itself."

The first part is true, and utterly vacuous. The second part, which is
false, has been the target of the arguments presented so far. If you
didn't notice this, you cannot have understood those arguments. If you
noticed it but decided to ignore it, then I can think of several apt
Latin words, along with any number in demotic anglo-saxon.

No, but no one said it does. You're defending one half of your
statement while Pete and Simon question the other half.

Here's a demonstration that (b) is false.

Let X[0], X[1], ..., X[k-1] be discrete random variables uniformly
distributed on the set { 0, 1, 2, ..., n-1 }.

Note that this algorithm computes a finite number of bits. Thus the
remainder of your proof which quite correctly shows that duplicate values
are inevitable does not contradict 1b.
Let A be the event
that there is at least one repetition in this collection, that is,
A = (X==X[j]) for some i and j with 0 <= i < j < k. The complementary
event B is that all the X's are distinct:
B = (X[1]!=X[0]) && (X[2] is not in { X[0], X[1] })
&& (X[3] is not in { X[0], X[1], X[2] })
&& ...
&& (X[k-1] is not in { X[0], X[1], ..., X[k-2] })

We can see that B has probability
P (B) = ((n-1)/n) * ((n-2)/n) * ... * ((n-k+1)/n)
= n!/((n-k)! * pow (N, k-1)).
And of course P (A) = P (B) - 1.

We'll suppose k = 52 as in the original post. With n = 1 << 15,
P (A) comes to 0.03968, and with n = 1 << 31 it's 0.0000006175.

This is an approximation to the probability of the OP's algorithm failing
to do its job on any given run. If I were him I'd use the Fisher-Yates
shuffle instead. Then I'm absolutely certain to get what I want.

Caveat: for all I know, you can choose your pseudo-random number
generator in such a way that you are guaranteed not to get any
repetition during any run of k draws, whether k be 52 or 5002.
However, the OP had settled on using std::rand (), so he could rely
on no such guarantee.

Regards,
Buster Copley


1b is just as "vacuous" as 1a.You get repeat values from algorithms which
have nothing to do with randomness. For instance, consider a generator which
produces j % 4 where j = 0, 1, 2, ... infinity. You get your first duplicate
at j = 4. Since an algorithm with absolutely no element of randomness can
generate duplicate values, we are safe in saying:

"Repeat values are not a consequence of randomness itself."

Go over your own proof in the case of an infinite population size. The
probability of a repeat value is 1/infinity which is zero (not
"infinitesimal" as someone else asserted). Thus duplicate values can never
happen unless the population from which you are sampling is finite. Or,

"Repeat values are a consequence of finite numbers of bits
computed by the algorithm, not randomness itself."

Cy
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,236
Members
46,822
Latest member
israfaceZa

Latest Threads

Top