Econ 322 - Fall 2023
Problem Set 2
Due October 6th, 2023, 8 pm
Please answer each of the questions below. Note that writing only a numeric answer to the question is
not enough to receive full credit unless otherwise stated. Please upload your R (or other language) code
along with your answers.
Total: 60 points.
Question 1: Sampling Distributions in R (20 Points)
Suppose that we are interested in studying commuting patterns at Rutgers University-New
Brunswick. As a first step, we want to better understand the distribution of distances that Rutgers
students need to travel to come to class. For this purpose, we gathered a dataset containing the
distances (in miles, which we denote by D) traveled by every Rutgers student, which you can find
in the dataset “rutgers_distances.csv”. Although unrealistic, assume that the total population
of Rutgers students is 1,000 (that is, the dataset contains the full population of interest).
In this question, we will get started using R by computing basic summary statistics (means and
variances) and sampling from the population distribution (i.e., all 1,000 Rutgers students). This
exercise will be useful to revisit some of the key concepts seen in lectures and, hopefully, provide
some practical intuition on the consequences of using smaller vs larger samples.
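As a rough illustration of the operations described above, the following R sketch computes a population mean and variance, plots a histogram, and compares sample means across sample sizes. It uses simulated data rather than the course dataset: the gamma distribution, its parameters, the seed, and all variable names are assumptions made purely for illustration. One detail worth knowing is that R's built-in var() divides by N - 1, so it returns the sample variance; the sketch also shows the version that divides by N. Which convention is expected for grading is not stated here.

```r
# Hypothetical stand-in for a population of N = 1,000 distances (in miles).
# The distribution and its parameters are assumptions for illustration only;
# with real data one would load the CSV instead (for example with read.csv()).
set.seed(322)
N <- 1000
D <- rgamma(N, shape = 2, scale = 3)

# Population mean.
pop_mean <- mean(D)

# var() divides by N - 1 (sample-variance formula). When the data are the
# full population, the population variance divides by N instead.
sample_style_var <- var(D)
pop_var <- mean((D - pop_mean)^2)

# Histogram of the population distribution.
hist(D, breaks = 30, main = "Distribution of D", xlab = "Distance (miles)")

# Draw random samples of increasing size (without replacement) and compare
# each sample mean with the population mean.
sample_sizes <- c(10, 25, 300)
sample_means <- sapply(sample_sizes, function(n) mean(sample(D, size = n)))

# Summary table, rounded to three decimal places.
round(data.frame(n = sample_sizes,
                 sample_mean = sample_means,
                 abs_error = abs(sample_means - pop_mean)), 3)
```

Any single draw can land closer to or farther from the population mean by chance, so rerunning the last part with different seeds gives a better feel for how the spread of sample means shrinks as n grows.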
Answer the following questions:
1. (6 points) Compute the population mean of D. Compute the variance of D. Plot a histogram
of D. Round your answers to three decimal places. No need to explain how you did it.
2. (5 points) Compute the sample mean for the sample containing the first 10 observations in the
dataset. Note that, since the order of the students in the dataset is random, this is equivalent
to drawing a random sample of 10 observations. Do the same for the first 25 and the first 300
observations. Which sample mean is closer to the population mean in part 1? Why? Round
your answers to three decimal places.
3. (9 points) In this part, we will provide some evidence in favor of the Central Limit Theorem.
Recall that, in lecture, we said that if we drew s random samples from the population and
computed the sample mean for each of these s samples, the probability distribution of this
sample mean could be approximated by a normal distribution with mean $\mu_D$ and variance
$\sigma_D^2/n$ for a large enough sample size n (sample size refers to the number of observations in each
random sample). Answer the questions below: