Assignment 2
100 points
Your submission to Canvas must be named HW2_FirstName_LastName and contain your answers in a
Word file, your R source code, and the .csv data file used for the assignment. (Do NOT use any
other extension, such as .7z or .tar, since I can’t open them. If I cannot open your file to
check your answers and your code, I cannot give you credit for the assignment.)
You may work in groups of at most three students, but you must include the names of the
students you worked with in your assignment. If you work in a group, submit only one
assignment for the whole group; duplicate submissions are not needed.
Part 1: Linear Regression (40 points: a 15 points, b 15 points, c 10 points)
Answer the questions on pp. 396–398 of the textbook using R.
Part 2: Linear Regression in Excel (30 points: a 10 points, b 10 points, c 10 points)
Answer the questions on pp. 396–398 of the textbook using Excel.
Part 3 (30 points)
Read the following two articles from HBR OnPoint posted on EduCat and write a summary of
each, making sure to answer the following questions (it is OK to repeat exactly what is in the
text):
1. The Best Data Scientists Get Out and Talk to People (15 points) [BestDataScientists.pdf]
   1. What do great data scientists do in general? What would they do if they were in
      the oil business in the example? What would they uncover?
   2. What are 3 things great data scientists do?
2. A Refresher on Regression Analysis (15 points) (this is really just linear regression)
   [RefresherLinearRegression.pdf]
   1. What is the dependent variable? What are the independent variables? What is the
      regression line?
   2. What do people mean when they say “correlation is not causation”? What is the
      example about travel and weight gain?
   3. What are three mistakes some people make about regression analysis?