March 31 2017
Named after Jacob Bernoulli, a Swiss mathematician.
If \(p = 0.5\), 30 random draws from a Bernoulli distribution look like
sample(c(0,1),30,prob = c(0.5,0.5), replace = TRUE)
## [1] 0 1 1 0 0 1 1 0 0 0 1 1 0 1 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0
While when \(p = 0.2\), 30 random draws look like
sample(c(0,1),30,prob = c(0.8,0.2), replace = TRUE)
## [1] 0 0 0 0 0 1 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0
Geometric r.v. is the number of failures before the first success in repeated Bernoulli trials.
The probability distribution function (pdf) of geometric \(X\) is \(P(X = x) = (1-p)^x p\).
The expectation of \(X\) is \(E[X] = \frac qp\), where \(q = 1-p\).
The variance of \(X\) is \(\mathrm{Var}(X) = \frac q{p^2}\).
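As a quick sanity check (a sketch, with an arbitrary seed, sample size and choice of \(p\)), the empirical mean and variance of many simulated geometric draws should be close to \(q/p\) and \(q/p^2\):

```r
set.seed(42)                 # arbitrary seed, for reproducibility only
p <- 0.2; q <- 1 - p
x <- rgeom(1e5, prob = p)    # 100,000 geometric draws
mean(x)                      # should be close to q/p = 4
var(x)                       # should be close to q/p^2 = 20
```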
If \(p = 0.5\), 30 random draws from a geometric distribution look like
rgeom(30,prob = 0.5)
## [1] 0 2 2 3 0 0 2 3 0 0 1 0 0 0 2 1 1 0 2 1 1 5 0 0 0 1 0 0 0 6
While when \(p = 0.2\), 30 random draws look like
rgeom(30,prob = 0.2)
## [1] 0 1 3 6 5 2 3 2 7 2 3 18 4 6 7 5 2 4 0 3 1 7 3
## [24] 6 0 2 10 7 4 3
A binomial r.v. is the number of successes in \(n\) Bernoulli trials.
Fix \(n \geq 0\). Perform \(n\) Bernoulli trials with probability of success \(p\) and count the number of successes. The result is a binomial random variable.
Takes values between 0 and \(n\).
Two parameters:
\(n\), the number of Bernoulli trials undertaken
\(p\), the probability of success in the Bernoulli trials.
\[ f(x) = P(X = x) = {n \choose x} p^x (1-p)^{n-x} \mbox{ for } x = 0,1,2,\ldots, n.\] where \({n \choose x} = \frac{n!}{x!(n-x)!}.\)
Expectation of \(X\) is \(E[X] = np\).
The variance of \(X\) is \(\mathrm{Var}[X] = np(1-p) = npq\).
Write \(X \sim Bin(n,p)\) when \(X\) has a binomial distribution with parameters \(n\) and \(p\).
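The pmf above can be evaluated with R's built-in dbinom and checked against the formula directly (a sketch with arbitrary values of \(n\), \(p\) and \(x\)):

```r
n <- 10; p <- 0.2; x <- 3
dbinom(x, size = n, prob = p)           # built-in binomial pmf
choose(n, x) * p^x * (1 - p)^(n - x)    # the formula above; same value
```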
Let \(n = 20\). If \(p = 0.5\), 30 random draws from a Bin(n = 20, p = 0.5) distribution look like
rbinom(30, size = 20,prob = 0.5)
## [1] 11 12 8 5 11 6 11 10 7 9 9 10 15 8 9 10 11 10 10 11 11 12 12
## [24] 10 6 10 11 9 9 9
While when \(p = 0.2\), 30 random draws from Bin(n = 20, p = 0.2) look like
rbinom(30, size = 20,prob = 0.2)
## [1] 4 5 1 3 5 6 3 3 8 5 3 0 3 5 4 6 2 3 3 6 3 5 9 7 2 2 3 3 3 3
Named after the French mathematician Siméon Denis Poisson.
\[f(x) = \exp(-\lambda) \frac{\lambda^x}{x!} \mbox{ for } x = 0,1,2,3,\ldots\] where \(0! = 1\) by definition.
If \(X\) is Poisson it has expectation \(E[X] = \lambda\)
and variance \(\mathrm{Var}[X] = \lambda.\)
Write \(X \sim Poiss(\lambda)\) when \(X\) has a Poisson distribution with parameter \(\lambda\).
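Similarly, the Poisson pmf is available as dpois, which matches the formula above (a sketch with an arbitrary rate):

```r
lambda <- 2
x <- 0:5
dpois(x, lambda)                          # built-in Poisson pmf
exp(-lambda) * lambda^x / factorial(x)    # the formula above; identical values
```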
30 random draws from a Poisson distribution with rate \(\lambda = 2\) look like
rpois(30,lambda = 2)
## [1] 0 3 1 1 2 2 3 2 0 2 2 3 2 2 6 3 3 2 3 5 3 0 1 1 1 1 0 2 2 5
While when \(\lambda = 0.5\), 30 random draws look like
rpois(30,lambda = 0.5)
## [1] 0 0 1 0 0 1 0 1 1 0 0 0 0 1 0 1 1 1 0 0 1 0 0 0 0 0 1 0 0 1
When \(X\) is discrete and takes \(n\) possible values, the uniform pdf is \(P(X = x_i) = 1/n\) for all \(x_i\).
When \(X\) is uniform on \([a,b]\), the density function is \(f(x) = \frac{1}{b-a}\) for \(a \leq x \leq b\).
Write \(X \sim U([a,b])\).
30 random draws from a discrete uniform distribution with possible outcomes 1,2,..,8:
sample(1:8, size = 30, replace = TRUE)
## [1] 3 2 4 8 5 1 7 6 7 7 1 7 5 7 2 4 5 6 2 6 3 4 4 8 4 6 2 1 6 8
30 random draws from \(U([2,6])\) (rounded to 3 dp)
round(runif(30, min = 2, max = 6), digits = 3)
## [1] 3.499 5.299 3.616 2.802 4.969 4.070 5.145 2.276 2.760 2.653 5.248
## [12] 3.506 4.207 4.106 5.016 3.010 3.107 3.324 3.539 3.997 3.635 5.468
## [23] 3.886 5.890 2.906 2.104 2.604 3.396 2.582 2.729
The exponential distribution describes the waiting time between independent events.
Takes any non-negative value: \(X \geq 0\).
It has a single parameter, \(\lambda\), known as the rate.
The probability density function for exponential \(X\) is \(f(x) = \lambda e^{-\lambda x}\), where \(x \geq 0\).
Write \(X \sim Exp(\lambda)\)
The expectation of \(X\) is \(E[X] = \frac 1 \lambda\).
The variance of \(X\) is \(\mathrm{Var}(X) = \frac 1 {\lambda^2}\).
30 random draws from an exponential distribution with rate parameter \(\lambda = 2\).
round(rexp(30,rate = 2), digits = 3)
## [1] 0.376 0.662 0.287 0.239 1.512 0.362 0.130 0.146 1.110 0.214 0.034
## [12] 0.586 0.388 0.321 0.148 0.216 1.627 0.318 0.145 1.520 0.556 0.734
## [23] 2.294 1.182 0.190 0.023 0.343 0.722 1.044 0.626
When \(\lambda = 0.5\), 30 draws look like
round(rexp(30,rate = 0.5),digits = 3)
## [1] 2.592 1.537 0.319 2.918 0.403 0.037 0.292 1.059 1.115 0.243 0.251
## [12] 3.196 0.375 0.020 1.310 3.279 1.451 2.007 0.814 2.259 2.056 1.682
## [23] 0.975 1.332 0.290 0.472 0.728 5.088 0.224 0.495
An important property of the exponential distribution is memorylessness.
Memorylessness is a property shared with the geometric distribution and with no other distributions.
So if you insist on the memoryless property, you are insisting on a geometric or exponential distribution.
Formally, if \(X\) is exponentially distributed, it has the memoryless property that \(P(X > x + y \mid X > y) = P(X > x)\) for all \(x, y \geq 0\).
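The memoryless property can be checked numerically from the exponential cdf via pexp (a sketch with arbitrary values of \(\lambda\), \(x\) and \(y\)):

```r
lambda <- 2; x <- 0.7; y <- 1.3
# P(X > x + y | X > y), computed from tail probabilities
cond   <- pexp(x + y, rate = lambda, lower.tail = FALSE) /
          pexp(y,     rate = lambda, lower.tail = FALSE)
uncond <- pexp(x, rate = lambda, lower.tail = FALSE)   # P(X > x)
cond - uncond                                          # zero up to rounding error
```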
The Gamma distribution arises as the sum of \(k\) independent exponential random variables (when \(k\) is a positive integer).
Two parameters: \(k\) and \(\theta\) called the shape and scale, respectively.
These parameters can be used to specify the mean and variance of the distribution.
Write \(X \sim Gamma(k,\theta).\)
\[ f(x)=\frac{1}{\theta^k\Gamma(k)}x^{k-1}\exp(-x/\theta) \mbox{ for } x>0, \] where \(\Gamma(k) = \int_0^\infty t^{k - 1}e^{-t} \, dt\) is the gamma function (the extension of the factorial function, \(k !\), to all real numbers).
The mean of a gamma distributed random variable \(X\) is \(E[X] = k\theta\).
The variance of \(X\) is \(\mathrm{Var}(X) = k\theta^2\).
The Gamma distribution has different parameterisations which result in different-looking (but mathematically equivalent) expressions for the density, mean and variance, so be sure to check which parametrisation is being used.
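The notes give no gamma draws, but mirroring the earlier examples we can check the mean and variance formulas by simulation (a sketch with an arbitrary seed, \(k\) and \(\theta\), using R's shape/scale parametrisation to match the density above):

```r
set.seed(1)                                # arbitrary seed
k <- 2; theta <- 3
x <- rgamma(1e5, shape = k, scale = theta) # 100,000 gamma draws
mean(x)                                    # should be close to k * theta = 6
var(x)                                     # should be close to k * theta^2 = 18
```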
The normal distribution arises as a consequence of the central limit theorem, which says that (under a few weak assumptions) the sum of a set of independent, identically distributed random variables is well approximated by a normal distribution.
Thus when random effects all add together, they often result in a normal distribution. Measurement error terms are typically modeled as normally distributed.
The Normal distribution, with mean \(\mu\) and variance \(\sigma^2\), (\(\mu\in\mathbb{R}, \sigma>0\)) has density function \[ f(x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{1}{2\sigma^2}\left(x-\mu\right)^2\right\} \]
We write \(X \sim N(\mu,\sigma^2)\).
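A small simulation illustrates the central limit theorem claim (a sketch with arbitrary choices of seed and sample sizes): sums of i.i.d. exponential draws, though each individual draw is highly skewed, have mean and standard deviation matching a normal approximation:

```r
set.seed(1)                                      # arbitrary seed
sums <- replicate(1e4, sum(rexp(20, rate = 1)))  # 10,000 sums of 20 Exp(1) draws
mean(sums)                                       # close to 20 * E[X] = 20
sd(sums)                                         # close to sqrt(20 * Var(X)) ~ 4.47
```

A histogram of `sums` would show the familiar bell shape, despite the skewness of each exponential draw.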
As we have seen, many distributions are derived by transforming one or more random variables drawn from another distribution: e.g., a binomial is a sum of independent Bernoulli trials, and a gamma is a sum of independent exponentials.
These relationships will help us later when we need to simulate from different distributions: if we can simulate draws from one distribution, we may be able to transform them into draws from another distribution.
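As a concrete sketch of such a transformation (the inverse-cdf method, with an arbitrary seed and rate): if \(U \sim U([0,1])\), then \(-\log(U)/\lambda\) has an exponential distribution with rate \(\lambda\):

```r
set.seed(1)                # arbitrary seed
lambda <- 2
u <- runif(1e5)            # uniform draws on [0, 1]
x <- -log(u) / lambda      # inverse of the exponential cdf applied to u
mean(x)                    # should be close to 1/lambda = 0.5
```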
There are some nice diagrams showing the complex relationships here and here.