This assignment is designed to review the material covered in the
lab. Be sure to comment your code to clarify what you are doing. Not
only does this help with grading, but it will also help you when you
revisit your code in the future. Please use either RMarkdown or
knitr to turn in your assignment. Both are fully compatible
with R and LaTeX. If your code uses random number generation,
please use set.seed(12345) for replicability. Please post
any questions on Piazza.
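As a brief illustration of why the seed matters, the sketch below draws the same random numbers twice. The variable names draw_1 and draw_2 are placeholders chosen for this example, not part of the assignment.

```r
# Setting the same seed before each draw reproduces the same numbers.
set.seed(12345)
draw_1 <- rnorm(5)

set.seed(12345)
draw_2 <- rnorm(5)

# TRUE when both draws are exactly the same
same_draws <- identical(draw_1, draw_2)
print(same_draws)
```

Calling set.seed(12345) once at the top of your document is usually enough, as long as the code below it runs in the same order each time.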
Think about an outcome \(Y\) you care about for your research. Name at least two explanatory variables \(X\) that you think influence \(Y\). Discuss what each of the four least squares assumptions (the linear model, exogeneity, homoskedastic and mean-zero errors, and independent observations) means in terms of your variables of interest.
Let \(\textbf{A}\) be any \(n \times n\) matrix. Show that \(\textbf{AA} = \textbf{I}\) if and only if \((\textbf{I} - \textbf{A})(\textbf{I} + \textbf{A}) = \textbf{0}\).
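Before writing the proof, it can help to check the claim numerically. The sketch below is a sanity check on two hand-picked \(2 \times 2\) matrices, not a proof; the matrices A and B and all variable names are assumptions made for this example.

```r
I2 <- diag(2)                          # 2 x 2 identity matrix

# A swaps the two coordinates, so applying it twice gives the identity.
A <- matrix(c(0, 1, 1, 0), nrow = 2)
# B is a counterexample candidate: BB is not the identity.
B <- matrix(c(1, 0, 1, 1), nrow = 2)

a_is_involutory <- isTRUE(all.equal(A %*% A, I2))
a_product_zero  <- all((I2 - A) %*% (I2 + A) == 0)

b_is_involutory <- isTRUE(all.equal(B %*% B, I2))
b_product_zero  <- all((I2 - B) %*% (I2 + B) == 0)

# Both conditions hold for A and both fail for B.
print(c(a_is_involutory, a_product_zero, b_is_involutory, b_product_zero))
```

The two conditions agree in both cases, which is consistent with the statement but does not establish it for every matrix.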
Let \(\textbf{A}\) be a \(2 \times 2\) matrix, such that \(\textbf{A} = \begin{bmatrix}a & b \\ c & d\end{bmatrix}\). Under what conditions is \(\textbf{AA} = \textbf{I}\)?
Load the gavote data from the faraway package.
Create a new variable undercount by calculating the percentage
of ballots that were not counted as votes, using the variables
votes and ballots. We will use undercount as
our outcome variable of interest. Choose three other variables to use as
predictors, and justify those choices. If you wish to construct a new
variable (such as perGore from last week), you may do so as
long as you explain why it is meaningful. Write out your linear model
and run a linear regression using lm(). Interpret your
results.
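The steps above might be sketched as follows, assuming the faraway package is installed. The three predictors in the formula (perAA, rural, equip) are placeholders chosen only to make the example run; they are not a recommended specification, and you should pick and justify your own.

```r
library(faraway)   # assumes the package is already installed
data(gavote)

# Percentage of ballots that did not turn into counted votes
gavote$undercount <- 100 * (gavote$ballots - gavote$votes) / gavote$ballots
summary(gavote$undercount)

# Placeholder predictors for illustration only; substitute your own choices
fit <- lm(undercount ~ perAA + rural + equip, data = gavote)
summary(fit)
```

Because undercount is on a 0 to 100 scale here, each coefficient is read in percentage points of undercount per unit change in the predictor.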
Let’s manually calculate the coefficients for our predictors. Create
the necessary vectors and matrices (don’t forget that we need a column
of 1’s for the intercept!); pay close attention to your data types.
R will store scalars generated through matrix operations as a
\(1 \times 1\) matrix. Compare these
coefficients with the ones generated using lm(). Are they
equal?
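The matrix calculation can be sketched on simulated data, so that it does not depend on your choice of predictors. Everything below (the sample size, the variables x1 and x2, and the true coefficients) is made up for illustration; the estimator itself is the usual \(\hat{\beta} = (\textbf{X}'\textbf{X})^{-1}\textbf{X}'\textbf{y}\).

```r
set.seed(12345)

# Simulated data: hypothetical predictors and outcome
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)

# Design matrix with a leading column of 1's for the intercept
X <- cbind(1, x1, x2)

# OLS coefficients from the normal equations; the result is a 3 x 1 matrix
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y
beta_manual <- as.vector(beta_hat)     # drop the matrix dimensions

# Same model through lm()
fit <- lm(y ~ x1 + x2)
coefs_match <- isTRUE(all.equal(beta_manual, unname(coef(fit))))
print(coefs_match)

# A "scalar" from matrix algebra is really a 1 x 1 matrix
resid_vec <- y - X %*% beta_hat
rss <- t(resid_vec) %*% resid_vec
dim(rss)                               # 1 1
rss_scalar <- as.numeric(rss)          # plain numeric scalar
```

Using all.equal() rather than == compares the two sets of coefficients up to a small numerical tolerance, which matters because lm() solves the problem with a QR decomposition rather than an explicit inverse.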