I've just been looking at the details of random number
generators (RNGs), in particular R's runif
function. We use these a lot for simulations, and (my immediate
concern) for random-walk MCMC samplers. The questions I wanted
to answer were:
- How often does runif(n, 0, 1) produce values equal to 0 or 1? (Never.)
- Can a sequence of random numbers from runif have two values that are exactly equal? (Yes.)
- If I plug in the same value for set.seed, will I always get the same sequence of numbers? (Yes, provided you also use the same RNG kind.)
- If I plug in different values for set.seed, will I get different sequences? (Not guaranteed.)
The internals of runif
Most pseudo-random number generators actually produce long
integers, which have values up to 2^31 - 1 = 2147483647, or
about 2.15 billion. These are then converted by
runif to give the values you need.
If your call was
runif(n, 0, 1), the interval (0,
1) is divided into 2147483647 segments and the output gives the
mid-points of the segments.
The smallest value you can get is 2^-33 = 1.16e-10 and the
largest is 1 - 2^-33 = 1 - 1.16e-10. Even allowing for the
imprecision of floating-point numbers, those are well clear of
0 and 1.
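As a quick empirical check of the bounds above, the following sketch draws a million values and confirms that none of them reaches either end of the interval. The seed and the sample size are arbitrary choices made for illustration, and a check like this can only support the claim, not prove it.

```r
# Lower limit quoted in the text
2^-33                   # about 1.16e-10

set.seed(42)            # arbitrary seed, used only so the run is repeatable
x <- runif(1e6)         # one million draws on (0, 1)

range(x)                # both ends lie strictly inside (0, 1)
any(x <= 0 | x >= 1)    # FALSE: no draw hits 0 or 1
```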
Since there are only a limited number of possible
values in the output (albeit 2 billion), a long sequence of
numbers will have exact duplicates. A sequence of 100 million
numbers had 1% duplicated.
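Counting duplicates yourself is straightforward. The sketch below uses one million draws rather than 100 million to keep memory use modest, so the proportion of duplicates will be much smaller than the 1% quoted above. The seed and the variable names are arbitrary.

```r
set.seed(1)                   # arbitrary seed, for a repeatable run
n <- 1e6                      # smaller than the 100 million used in the text
x <- runif(n)

# duplicated() flags every value that has already appeared earlier in x
n_dup <- sum(duplicated(x))
n_dup                         # number of repeated values
n_dup / n                     # proportion of the sequence that is duplicated
```

The proportion of duplicates grows with the length of the sequence, so a longer run is needed to approach the figure reported for 100 million numbers.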
What set.seed does
The actual seed for most RNGs is very long. For the Mersenne-Twister
RNG, the default in R, it is a vector of 624 integers (use
.Random.seed in R to see it; the first number codes the kind of RNG).
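To look at the seed vector directly, something like the following should work in a fresh session that still uses the default generator. Note that .Random.seed only exists once the RNG has been used or seeded, which is why the sketch calls set.seed first. The seed value 1 is arbitrary.

```r
set.seed(1)               # creates .Random.seed in the global environment

RNGkind()[1]              # "Mersenne-Twister" unless the default was changed

# For Mersenne-Twister the vector has 626 entries: a code for the RNG kind,
# the current position in the state, and then the 624 state integers.
length(.Random.seed)

head(.Random.seed, 3)     # kind code, position, first state integer
```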
When you call set.seed with an integer argument, a specific
vector of values is generated for .Random.seed.
Using the same integer again will give the same .Random.seed, so
you'll get the same sequence of pseudorandom numbers provided
you are using the same kind of RNG.
But there is no guarantee that a different argument will give a
different .Random.seed and a different sequence.
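Both points can be spot-checked in a few lines. The seed values 123, 1 and 2 are arbitrary. The second comparison only shows that this particular pair of arguments gives different states; it says nothing about every possible pair.

```r
# Same argument, same RNG kind: identical sequences
set.seed(123); a <- runif(5)
set.seed(123); b <- runif(5)
identical(a, b)           # TRUE

# Different arguments: compare the full seed vectors they produce
set.seed(1); s1 <- .Random.seed
set.seed(2); s2 <- .Random.seed
identical(s1, s2)         # FALSE for this pair, not guaranteed in general
```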