6 April 2017
The basic idea is to obtain a sample of values \(\{\theta^{(1)}, \theta^{(2)}, \theta^{(3)},\ldots,\theta^{(n)} \}\) from the distribution of interest and use this sample to estimate properties of the distribution.
Say \(\theta\) is a univariate random variable.
Now make a histogram to get an idea of the shape of the distribution:
Let's approximate the mean of the distribution by the average of the sampled values:
\(E[\theta] \approx \bar{\theta} = \frac{1}{1000} \sum_{i=1}^{1000} \theta^{(i)} =\) 3.8390958
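As a sketch in Python: since the notes don't specify the distribution of \(\theta\), I use a Normal(4, 1) as a stand-in target; the idea is only that the sample average approximates the true mean.

```python
import random

random.seed(1)  # arbitrary seed, for reproducibility

# Hypothetical target distribution: Normal(4, 1) stands in for the
# unspecified distribution of theta in the notes.
n = 1000
samples = [random.gauss(4, 1) for _ in range(n)]

# Monte Carlo estimate of E[theta]: the average of the sampled values
theta_bar = sum(samples) / n
print(round(theta_bar, 2))  # close to the true mean, 4
```

With 1000 draws the standard error of the estimate is about \(1/\sqrt{1000} \approx 0.03\), so the sample average lands near 4.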
We omit the details of how RNGs actually work (see the notes for a short description).
Important: the standard random number generators used as the default in many languages (e.g. java.util.Random) are not adequate for scientific simulation studies.
I recommend instead the Mersenne Twister which is implemented in most languages. It is the default generator in Python.
In Python, to repeat a sequence of random bits, choose a seed a and call random.seed(a) to put the generator in the same state.
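For example (the seed value 42 is arbitrary):

```python
import random

random.seed(42)  # put the generator in a known state
first = [random.random() for _ in range(3)]

random.seed(42)  # same seed -> same state -> same sequence
second = [random.random() for _ in range(3)]

print(first == second)  # the two sequences are identical
```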
To sample from a discrete r.v. \(X\) with \(P(X = x_i) = p_i\): 1. generate \(u \sim U(0,1)\); 2. if \(u < p_1\), set \(X = x_1\); otherwise, if \[\sum_{j = 1}^{i-1}p_j \leq u < \sum _{j = 1}^i p_j,\] set \(X = x_i\).
Suppose \(X\) can take values A, B, C, D or E with the probabilities
##   value probability
## 1     A        0.10
## 2     B        0.30
## 3     C        0.15
## 4     D        0.35
## 5     E        0.10
Suppose we draw \(u \sim U(0,1)\) and get \(u =\) 0.503.
Then \(x\) takes the value C since \[\Pr(A) + \Pr(B) = 0.1+0.3 = 0.4 \leq u = 0.503 < 0.55 = \Pr(A) + \Pr(B) + \Pr(C).\]
Drawing a uniform rv of 0.503 produces a C.
So generate 20 realisations of the rv \(X\) by generating 20 draws of the uniform rv \(U\) and transforming them:
##  [1] 0.601 0.223 0.376 0.048 0.474 0.274 0.197 0.765 0.197 0.435 0.103
## [12] 0.260 0.691 0.794 0.800 0.808 0.528 0.375 0.740 0.250
Produces \(x\): D, B, B, A, C, B, B, D, B, C, B, B, D, D, D, D, C, B, D, B
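The discrete sampling recipe above can be sketched in Python; the `sample_discrete` helper and the seed are my own illustration, not from the notes.

```python
import random

values = ["A", "B", "C", "D", "E"]
probs = [0.10, 0.30, 0.15, 0.35, 0.10]

def sample_discrete(u, values, probs):
    """Return the value whose cumulative-probability interval contains u."""
    cum = 0.0
    for v, p in zip(values, probs):
        cum += p
        if u < cum:
            return v
    return values[-1]  # guard against floating-point round-off at u near 1

# The worked example: u = 0.503 falls in [0.40, 0.55), giving C
print(sample_discrete(0.503, values, probs))

# Generate 20 realisations by drawing uniforms and transforming
random.seed(0)  # arbitrary seed, for reproducibility
draws = [sample_discrete(random.random(), values, probs) for _ in range(20)]
print(draws)
```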
Suppose we want to simulate from \(X\), which has the cumulative distribution function (cdf) \(F(x)\).
Remember, if \(X\) has probability density function \(f(x)\), then the cdf is \(F(x) = \int_{-\infty}^x f(t) \, dt\).
If \(U \sim U(0,1)\), then \(X = F^{-1}(U)\) is a draw from the distribution of \(X\), where \(F^{-1}\) is the inverse of \(F\).
So whenever the inverse of the cdf is known, sampling from the distribution is easy.
If \(X \sim \mbox{Exp}(\lambda)\) then \(F(x) = 1-e^{-\lambda x}\) and so \(F^{-1}(u) = - \frac{\log(1-u)}{\lambda}\)
So generate exponential rv \(x\) by generating the uniform rv \(u\) and set \(x = - \log(1-u)/\lambda\)
So to draw 20 exponential rvs (here with \(\lambda = 1\)), draw 20 uniform rvs and transform them:
\(u\): 0.012, 0.057, 0.187, 0.238, 0.251, 0.295, 0.298, 0.351, 0.397, 0.464, 0.473, 0.479, 0.485, 0.536, 0.602, 0.646, 0.721, 0.761, 0.915, 0.938
\(x\): 0.01, 0.06, 0.21, 0.27, 0.29, 0.35, 0.35, 0.43, 0.51, 0.62, 0.64, 0.65, 0.66, 0.77, 0.92, 1.04, 1.28, 1.43, 2.47, 2.78
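The exponential transformation is easy to check in code. A minimal sketch, assuming \(\lambda = 1\) (consistent with the \(x\) values above); the seed is arbitrary and not from the notes.

```python
import math
import random

lam = 1.0  # rate parameter; the x values in the notes match lambda = 1

# Check one transformation by hand: the first u in the notes, u = 0.012
x0 = -math.log(1 - 0.012) / lam
print(round(x0, 2))  # matches the first x value, 0.01

# Generate 20 fresh exponential draws the same way
random.seed(0)  # arbitrary seed, for reproducibility
xs = [-math.log(1 - random.random()) / lam for _ in range(20)]
print([round(x, 2) for x in xs])
```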