This assignment is designed to review the material you learned in lab. Be sure to comment your code to clarify what you are doing. Not only does this help with grading, but it will also help you when you revisit the code in the future. Please use either RMarkdown or knitr to turn in your assignment; both are fully compatible with R and LaTeX. If your code uses random number generation, please use set.seed(12345) for replicability. Please post any questions on Piazza.
Intuitively, why should we care that the OLS estimator \(\hat{\beta}\) is “BLUE”?
Explain, in words (i.e., with minimal math and no equations or algebra), the logic of the proof of the Gauss-Markov Theorem. What is the goal of the Gauss-Markov Theorem, and what steps must we take to show it is true? What assumptions are necessary?
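Before turning to the formal setup below, a short simulation can make the “Best” in BLUE concrete. This is purely an illustrative sketch, not part of the required answer: the data-generating process and the alternative “endpoint” estimator are our own assumptions. Both estimators are linear in \(Y\) and unbiased, but OLS has the smaller sampling variance.

```r
## Illustrative sketch (hypothetical DGP): compare OLS with another
## linear, unbiased estimator of the slope in y = 1 + 2x + e
set.seed(12345)

n    <- 50
x    <- runif(n, 0, 10)      # fixed design, matching the conditional-on-X setup
sims <- 2000
b_ols <- b_alt <- numeric(sims)

for (s in seq_len(sims)) {
  y <- 1 + 2 * x + rnorm(n)  # Gauss-Markov assumptions hold here
  b_ols[s] <- coef(lm(y ~ x))[2]
  ## "Endpoint" estimator: also linear in y and unbiased, but it
  ## discards every observation except the two extremes of x
  i <- which.min(x); j <- which.max(x)
  b_alt[s] <- (y[j] - y[i]) / (x[j] - x[i])
}

c(mean_ols = mean(b_ols), mean_alt = mean(b_alt))  # both near 2 (unbiased)
c(var_ols  = var(b_ols),  var_alt  = var(b_alt))   # OLS variance is smaller
```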
Let \(\hat{\beta} =
(\textbf{X}^{\top}\textbf{X})^{-1}\textbf{X}^{\top}Y\), and let
\(\tilde{\beta} = \textbf{C}Y\), with
\(\textbf{CX} = \textbf{I}\) and \(\textbf{C} =
(\textbf{X}^{\top}\textbf{X})^{-1}\textbf{X}^{\top} +
\textbf{D}\), such that \(\textbf{DX} =
\textbf{0}_{(K+1) \times (K+1)}\). Assume all of our standard assumptions from
Checking Intuition hold. Show:
(a) \(\tilde{\beta}\) is an unbiased estimator of \(\beta\);
(b) \(\mathbb{V}\left(\tilde{\beta} \,|\, \textbf{X}\right) \geq \mathbb{V}\left(\hat{\beta} \,|\, \textbf{X}\right)\).
For both proofs, explain in English why every step is a valid
operation. In other words, every line of your proofs must be
accompanied by at least one English sentence justifying it.
To get you started, your answers should look something like
this:
Showing \(\tilde{\beta}\) is
an unbiased estimator for \(\beta\): first, substitute \(\tilde{\beta} = \textbf{C}Y\) and \(\textbf{C} =
(\textbf{X}^{\top}\textbf{X})^{-1}\textbf{X}^{\top} +
\textbf{D}\), implying that \(\tilde{\beta} = \left((\textbf{X}^{\top}\textbf{X})^{-1}
\textbf{X}^{\top} + \textbf{D}\right)Y\), into our equation.
\[\begin{equation*}
\begin{aligned}
\mathbb{E}\left[\tilde{\beta} - \beta \,|\, \textbf{X} \right] & =
\mathbb{E}\left[ \left( (\textbf{X}^{\top}\textbf{X})^{-1} \textbf{X}^{\top} +
\textbf{D} \right) Y - \beta \,|\, \textbf{X} \right]\\
& \vdots \\
& \text{(you fill out the rest of these steps!)} \\
& \vdots \\
& = 0
\end{aligned}
\end{equation*}\]
Showing \(\mathbb{V}\left(\tilde{\beta} \,|\, \textbf{X} \right) \geq \mathbb{V}\left(\hat{\beta}\,|\, \textbf{X} \right)\). This should follow the same structure and format as 2a.
(This is a continuation of the assignment from Lab 5; you may reuse
the same model if you wish.)
Load the gavote data from the faraway package.
Create a new variable, undercount, by calculating the percentage
of ballots for which no vote was counted, using the variables
votes and ballots. We will use undercount as
our outcome variable of interest. Choose three other variables to use as
predictors, and justify those choices. If you wish to construct a new
variable (such as perGore from previous weeks), you may do so
as long as you explain why it is meaningful. Write out your linear model
and run a linear regression using lm(). Interpret your
results. A rough sketch of these steps appears below.
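To get you oriented, here is a minimal sketch of these steps. The three predictors used here (perAA, rural, and equip) are our own illustrative choices, not required ones; pick and justify your own.

```r
library(faraway)   # provides the gavote data
data(gavote)

## Undercount: share of ballots cast with no recorded vote
## (computed as a proportion; multiply by 100 for a percentage)
gavote$undercount <- (gavote$ballots - gavote$votes) / gavote$ballots

## Example specification (predictor choice is illustrative only)
fit <- lm(undercount ~ perAA + rural + equip, data = gavote)
summary(fit)
```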
Let’s manually calculate the standard errors for our predictors.
Create the necessary vectors and matrices (don’t forget that we need a
column of 1’s for the intercept!), and pay close attention to your data
types: R will store scalars generated through matrix operations
as \(1 \times 1\) matrices. Compare
these standard errors with the ones generated using lm().
Are they equal? One possible sketch follows.
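One way to approach the manual calculation, continuing from the illustrative fit above, is to build the design matrix with model.matrix() and apply the homoskedastic variance formula \(\hat{\sigma}^2 (\textbf{X}^{\top}\textbf{X})^{-1}\). Treat this as a sketch under those assumptions, not the required solution.

```r
## Design matrix with a column of 1's for the intercept; model.matrix()
## expands the factor predictors the same way lm() does
X <- model.matrix(~ perAA + rural + equip, data = gavote)
y <- gavote$undercount

## OLS coefficients and residuals via matrix algebra
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y
resid    <- y - X %*% beta_hat

## sigma^2-hat = e'e / (n - k); note that t(resid) %*% resid is a
## 1 x 1 matrix, so we coerce it to a scalar with as.numeric()
n      <- nrow(X)
k      <- ncol(X)
sigma2 <- as.numeric(t(resid) %*% resid) / (n - k)

## Standard errors: square roots of the diagonal of sigma^2 (X'X)^{-1}
se_manual <- sqrt(diag(sigma2 * solve(t(X) %*% X)))

## Compare with the standard errors reported by lm()
cbind(manual = se_manual,
      lm     = summary(fit)$coefficients[, "Std. Error"])
```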