Definition: Random Variable
A random variable (often abbreviated to “R.V.”) is a function that maps each outcome in the sample space of an experiment to a real number. Mathematically, we express this as \(X: \Omega \to \mathbb{R}\), where \(X\) is the random variable and the numerical value for some outcome \(\omega\) is \(X(\omega)\). There are two main types of random variable: discrete and continuous.
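To make the "function on outcomes" idea concrete, here is a minimal R sketch for a hypothetical experiment (not taken from the text above): flipping a coin twice and counting heads. The names `omega` and `X` are chosen only for this illustration.

```r
# Sample space for two coin flips (hypothetical example)
omega <- c("HH", "HT", "TH", "TT")

# X maps each outcome to a real number: the number of heads it contains
X <- function(outcome) {
  sapply(strsplit(outcome, ""), function(flips) sum(flips == "H"))
}

# Apply X to every outcome in the sample space
values <- X(omega)
names(values) <- omega
values
```

Each outcome \(\omega\) is a string, and \(X(\omega)\) is the number attached to it; two different outcomes ("HT" and "TH") can map to the same value.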
Definition: Discrete Random Variable
A random variable \(X\) is discrete if the set of values it takes with positive probability is finite or countably infinite. We might think of this as any variable whose possible values can be listed, such as one that takes values in the integers \(\mathbb{Z}\).
What are some examples of discrete variables we might use in political science?
Definition: The Bernoulli Distribution
A random variable \(X\) has a Bernoulli distribution with parameter \(p\) if \(\mathbb{P}(X = 1) = p\) and \(\mathbb{P}(X = 0) = 1-p\). This is written as \(X \sim \text{Bern}(p)\) (spoken in English as “\(X\) is distributed Bernoulli \(p\)”). We might call such a variable a Bernoulli random variable.
In other words, \(X\) takes the value 1 with probability \(p\) and the value 0 with probability \(1-p\), where \(p \in [0,1]\).
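A quick way to see this definition in action is to simulate it. Base R has no separate Bernoulli sampler, but `rbinom()` with `size = 1` draws values that are 1 with probability `prob` and 0 otherwise. The sketch below assumes \(p = 0.3\) and an arbitrary seed, both chosen only for illustration.

```r
set.seed(123)  # arbitrary seed so the draws are reproducible

p <- 0.3       # assumed success probability, chosen for illustration
n_draws <- 100000

# rbinom() with size = 1 draws Bernoulli(p) values (each is 0 or 1)
x <- rbinom(n = n_draws, size = 1, prob = p)

# The share of 1s should be close to p, and the share of 0s close to 1 - p
share_ones <- mean(x == 1)
share_zeros <- mean(x == 0)
c(share_ones, share_zeros)
```

With this many draws, the two shares should land near 0.3 and 0.7, though the exact figures depend on the seed.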
What are some examples of Bernoulli random variables we might use in political science?
Definition: Probability Mass Function
A Probability Mass Function (often abbreviated to “PMF”) is a function that gives the probability that a discrete random variable is exactly equal to some value. Mathematically, we express this as \(\mathbb{P}(X = x)\). A PMF has a couple of useful properties:
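Two properties that hold for every PMF are that each probability lies between 0 and 1, and that the probabilities over all possible values sum to 1. A minimal check in R for a Bernoulli variable follows, where `bern_pmf` is a hypothetical helper written for this sketch and \(p = 0.3\) is an arbitrary choice.

```r
# Hypothetical helper: PMF of a Bernoulli(p) random variable
bern_pmf <- function(x, p) {
  ifelse(x == 1, p, ifelse(x == 0, 1 - p, 0))
}

p <- 0.3
support <- c(0, 1)              # values taken with positive probability
probs <- bern_pmf(support, p)

all(probs >= 0 & probs <= 1)    # every probability lies in [0, 1]
sum(probs)                      # probabilities sum to 1
```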
Definition: Cumulative Distribution Function
The Cumulative Distribution Function (often abbreviated to “CDF”) is a function that returns the probability that a variable is less than or equal to a particular value. Mathematically, we express this as \(F_X(x) \equiv \mathbb{P}(X \leq x)\). A CDF has a few useful properties:
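Properties that hold for every CDF include that it never decreases as \(x\) grows, that it tends to 0 as \(x \to -\infty\), and that it tends to 1 as \(x \to \infty\). For a discrete variable, the CDF is simply the running sum of the PMF. The sketch below uses a hypothetical fair six-sided die, which is an assumed example rather than one from the text.

```r
# Hypothetical example: a fair six-sided die
values <- 1:6
pmf <- rep(1 / 6, 6)

# CDF at each value: P(X <= x) is the running total of the PMF
cdf <- cumsum(pmf)
round(cdf, 3)

all(diff(cdf) >= 0)   # the CDF never decreases
cdf[6]                # it reaches 1 at the largest value
```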
Definition: The Bernoulli Distribution PMF and CDF
The PMF of the Bernoulli Distribution can be expressed as
\[\begin{equation*} \mathbb{P}(X = x) = f(x;p) = \begin{cases} p & \text{if } x = 1\\ 1-p & \text{if } x = 0 \end{cases} \end{equation*}\]
The CDF of the Bernoulli Distribution can be expressed as
\[\begin{equation*}
\mathbb{P}(X \leq x) = F_X(x;p) =
\begin{cases}
0 & \text{if } x < 0\\
1 - p & \text{if } 0 \leq x < 1\\
1 & \text{if } x \geq 1
\end{cases}
\end{equation*}\]
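Both functions can be evaluated in R with `dbinom()` and `pbinom()` by setting `size = 1`, since a Bernoulli variable is a Binomial variable with a single trial (the Binomial distribution is defined in the next section). The value \(p = 0.3\) below is arbitrary.

```r
p <- 0.3  # arbitrary value chosen for illustration

# PMF at the two possible values: P(X = 0) and P(X = 1)
pmf <- dbinom(x = c(0, 1), size = 1, prob = p)
pmf

# CDF evaluated below 0, at 0, between 0 and 1, at 1, and above 1
q <- c(-0.5, 0, 0.5, 1, 1.5)
cdf <- pbinom(q = q, size = 1, prob = p)
cdf
```

The CDF is a step function: it jumps from 0 to \(1-p\) at \(x = 0\), stays flat until \(x = 1\), and equals 1 from \(x = 1\) onward.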
Definition: The Binomial Distribution
Let \(X\) be the number of successes in \(n\) independent Bernoulli trials each with success probability \(p\). Then, \(X\) follows a Binomial Distribution with parameters \(n\) and \(p\), expressed as \(X \sim \text{Bin}(n,p)\).
The PMF of the Binomial Distribution is
\[\begin{equation*} \mathbb{P}(X = x) = f(x;n,p) = {n \choose x} p^x (1-p)^{n-x} \quad \text{for } x = 0, 1, \dots, n \end{equation*}\]
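One way to connect the definition to the formula is to build Binomial draws directly from Bernoulli trials and compare the simulated frequency with the PMF. The values \(n = 10\), \(p = 0.4\), \(x = 3\), and the seed are arbitrary choices for this sketch.

```r
set.seed(42)  # arbitrary seed for reproducibility

n <- 10       # number of Bernoulli trials per experiment
p <- 0.4      # success probability of each trial
reps <- 100000

# Each row is one experiment: n independent Bernoulli(p) draws
trials <- matrix(rbinom(reps * n, size = 1, prob = p), nrow = reps, ncol = n)

# X is the number of successes in each experiment
x <- rowSums(trials)

# Empirical share of experiments with exactly 3 successes
# versus the PMF formula evaluated at x = 3
empirical <- mean(x == 3)
theoretical <- choose(n, 3) * p^3 * (1 - p)^(n - 3)
c(empirical, theoretical)
```

The two numbers should agree to roughly two decimal places; the gap shrinks as `reps` grows.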
What are some examples of Binomial random variables we might use in political science?
Exercise: Using the Binomial Distribution
Given that \(n = 12\) and \(p = 0.75\), use R to find the probability of seeing a) 8 successes and b) 0 successes.
Solution: We can do this in a couple of ways. First, by plugging the values into the PMF directly.
# a) n = 12, p = 0.75, and x = 8
# We plug those values into our PMF
choose(n = 12, k = 8) * 0.75^(8) * (1 - 0.75)^(12 - 8)
## [1] 0.1935777
# b) n = 12, p = 0.75, and x = 0
choose(n = 12, k = 0) * 0.75^(0) * (1 - 0.75)^(12 - 0)
## [1] 5.960464e-08
We can also use the dbinom() function in R.
# a) n = 12, p = 0.75, and x = 8
dbinom(x = 8, size = 12, prob = 0.75)
## [1] 0.1935777
# b) n = 12, p = 0.75, and x = 0
dbinom(x = 0, size = 12, prob = 0.75)
## [1] 5.960464e-08
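A third check goes through the CDF. Because \(X\) only takes integer values, \(\mathbb{P}(X = 8) = F_X(8) - F_X(7)\), and `pbinom()` is R's Binomial CDF. This is offered as a cross-check rather than as part of the original solution.

```r
n <- 12
p <- 0.75

# a) P(X = 8) as a difference of CDF values: P(X <= 8) - P(X <= 7)
prob_8 <- pbinom(q = 8, size = n, prob = p) - pbinom(q = 7, size = n, prob = p)

# b) P(X = 0) equals P(X <= 0), since X cannot be negative
prob_0 <- pbinom(q = 0, size = n, prob = p)

c(prob_8, prob_0)
```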
Intuition Check: does it make sense that \(\mathbb{P}(X = 8)\) is greater than \(\mathbb{P}(X = 0)\)? Why?
Definition: Continuous Random Variable
A random variable \(X\) is said to be continuous if there exists a nonnegative function on the real numbers \(\mathbb{R}\), called a probability density function (often abbreviated to “PDF”) and written \(f_X\), such that, for any interval \([a,b]\): \[\begin{equation*} \mathbb{P}(a \leq X \leq b) = \int_{a}^{b} f_X(x) dx \end{equation*}\]
There are some important properties of the PDF:
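Three properties that hold for any PDF are stated below. The first two are part of what it means to be a density; the third follows from the integral definition, since an interval of zero width has zero area.

\[\begin{equation*} f_X(x) \geq 0 \text{ for all } x \in \mathbb{R}, \qquad \int_{-\infty}^{\infty} f_X(x) dx = 1, \qquad \mathbb{P}(X = x) = 0 \text{ for any single point } x \end{equation*}\]

Note that \(f_X(x)\) is a density, not a probability, so it can exceed 1 at some points as long as the total area is 1.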
The CDF of a continuous random variable \(X\) is given by \[\begin{equation*}
\mathbb{P}(X \leq x) = F_X(x) = \int_{-\infty}^{x} f_X(t) dt
\end{equation*}\]
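These integrals can be checked numerically with R's `integrate()`. The sketch below uses the Exponential density with rate 1 (`dexp()` and `pexp()` in R) purely as an assumed example of a continuous distribution; the endpoints \(a = 0.5\) and \(b = 2\) are arbitrary.

```r
# Illustration with the Exponential(rate = 1) density, an assumed example
a <- 0.5
b <- 2

# P(a <= X <= b) as the integral of the PDF over [a, b]
by_integral <- integrate(dexp, lower = a, upper = b, rate = 1)$value

# The same probability as a difference of CDF values: F(b) - F(a)
by_cdf <- pexp(b, rate = 1) - pexp(a, rate = 1)

# The PDF integrates to 1 over its whole support
total_area <- integrate(dexp, lower = 0, upper = Inf, rate = 1)$value

c(by_integral, by_cdf, total_area)
```

The first two numbers should match up to numerical integration error, and the third should be 1 up to the same error.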
What are some examples of continuous random variables we might use in political science?
Definition: The Normal Distribution
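A continuous random variable \(Z\) follows the standard normal distribution, written \(Z \sim \mathcal{N}(0,1)\), if it has the PDF:
\[\begin{equation*}
\phi(z) = \frac{1}{\sqrt{2 \pi}} e^{\frac{-z^2}{2}}
\end{equation*}\]
Its CDF, usually written \(\Phi\), is the integral of this density:
\[\begin{equation*}
\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2 \pi}} e^{\frac{-t^2}{2}}dt
\end{equation*}\]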
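In R, `dnorm()` returns the standard normal density \(\frac{1}{\sqrt{2\pi}} e^{-z^2/2}\) and `pnorm()` returns the area under that density up to \(z\). The sketch below compares both against hand-written versions; the evaluation points are arbitrary.

```r
z <- c(-2, -1, 0, 1, 2)

# Standard normal density written out by hand
phi_manual <- 1 / sqrt(2 * pi) * exp(-z^2 / 2)

# R's built-in standard normal density (mean = 0, sd = 1 by default)
phi_builtin <- dnorm(z)

all.equal(phi_manual, phi_builtin)

# CDF at 1.96: area under the density from -Inf to 1.96
area <- integrate(dnorm, lower = -Inf, upper = 1.96)$value
c(area, pnorm(1.96))
```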
The standard normal distribution has a bunch of nice properties. Here are a few:
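Properties commonly cited for the standard normal include symmetry about 0, a mean of 0, a variance of 1, and roughly 95% of the probability lying within 1.96 of 0. A numerical check in R, offered as a sketch rather than a proof:

```r
# Symmetry about zero: the density is the same at z and -z
z <- c(0.5, 1, 2)
all.equal(dnorm(z), dnorm(-z))

# Mean 0 and variance 1, computed by integrating against the density
mean_z <- integrate(function(t) t * dnorm(t), lower = -Inf, upper = Inf)$value
var_z <- integrate(function(t) t^2 * dnorm(t), lower = -Inf, upper = Inf)$value
c(mean_z, var_z)

# Probability of falling within 1.96 of zero (about 0.95)
within_196 <- pnorm(1.96) - pnorm(-1.96)
within_196
```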
What are some examples of normal random variables we might use in political science?