Step 1: Generating cars dataset

This block of Python code will generate the sample data for you. You will not be generating the data set using the numpy module this week. Instead, the data set will be imported from a CSV file. To make the data unique to you, a random sample of size 30, without replacement, will be drawn from the data in the CSV file. The data set will be saved in a Python dataframe that will be used in later calculations.

Click the block of code below and hit the Run button above.

    import pandas as pd
    from IPython.display import display, HTML

    # read data from mtcars.csv data set.
    cars_df_orig = pd.read_csv("https://s3-us-west-2.amazonaws.com/data-analytics.zybooks.com/mtcars.csv")

    # randomly pick 30 observations from the data set to make the data set unique to you.
    cars_df = cars_df_orig.sample(n=30, replace=False)

    # print only the first five observations in the dataset.
    print("Cars data frame (showing only the first five observations)\n")
    display(HTML(cars_df.head().to_html()))

Cars data frame (showing only the first five observations)

        Unnamed: 0          mpg   cyl  disp   hp   drat  wt     qsec   vs  ...
    14  Cadillac Fleetwood  10.4  8    472.0  205  2.93  5.250  17.98  0   ...
    20  Toyota Corona       21.5  4    120.1  97   3.70  2.485  20.01  1   ...
    23  Camaro Z28          13.3  8    350.0  245  3.73  3.840  15.41  0   ...
    4   Hornet Sportabout   18.7  8    380.0  175  3.15  3.440  17.02  0   ...
    19  Toyota Corolla      33.9  4    71.1   85   4.22  1.895  19.90  1   ...

Step 2: Scatterplot of miles per gallon against weight

The block of code below will create a scatterplot of the variables "miles per gallon" (coded as mpg in the data set) and "weight of the car" (coded as wt).

Click the block of code below and hit the Run button above.
NOTE: If the plot is not created, click the code section and hit the Run button again.

    import matplotlib.pyplot as plt

    # create scatterplot of variables mpg against wt.
    plt.plot(cars_df["wt"], cars_df["mpg"], "o", color="red")

    # set a title for the plot, x-axis, and y-axis.
    plt.title("MPG against Weight")
    plt.xlabel("Weight (1000s lbs)")
    plt.ylabel("MPG")

    # show the plot.
    plt.show()

Step 3: Scatterplot of miles per gallon against horsepower

The block of code below will create a scatterplot of the variables "miles per gallon" (coded as mpg in the data set) and "horsepower of the car" (coded as hp).

Click the block of code below and hit the Run button above.
NOTE: If the plot is not created, click the code section and hit the Run button again.

    import matplotlib.pyplot as plt

    # create scatterplot of variables mpg against hp.
    plt.plot(cars_df["hp"], cars_df["mpg"], "o", color="blue")

    # set a title for the plot, x-axis, and y-axis.
    plt.title("MPG against Horsepower")
    plt.xlabel("Horsepower")
    plt.ylabel("MPG")

    # show the plot.
    plt.show()

Step 4: Correlation matrix for miles per gallon, weight and horsepower

Now you will calculate the correlation coefficient between the variables "miles per gallon" and "weight". You will also calculate the correlation coefficient between the variables "miles per gallon" and "horsepower". The corr method of a dataframe returns the correlation matrix with the correlation coefficients between all variables in the dataframe. You will specify to only return the matrix for the three variables.

Click the block of code below and hit the Run button above.

    # create correlation matrix for mpg, wt, and hp.
    # The correlation coefficient between mpg and wt is contained in the cell for mpg row and wt column (or wt row and mpg column).
    # The correlation coefficient between mpg and hp is contained in the cell for mpg row and hp column (or hp row and mpg column).
    mpg_wt_corr = cars_df[['mpg', 'wt', 'hp']].corr()
    print(mpg_wt_corr)

              mpg        wt        hp
    mpg  1.000000 -0.855046 -0.788600
    wt  -0.855046  1.000000  0.670918
    hp  -0.788600  0.670918  1.000000

Step 5: Multiple regression model to predict miles per gallon using weight and horsepower

This block of code produces a multiple regression model with "miles per gallon" as the response variable, and "weight" and "horsepower" as predictor variables. The ols method in the statsmodels.formula.api submodule returns all statistics for this multiple regression model.

Click the block of code below and hit the Run button above.

    from statsmodels.formula.api import ols

    # create the multiple regression model with mpg as the response variable; weight and horsepower as predictor variables.
    model = ols('mpg ~ wt+hp', data=cars_df).fit()
    print(model.summary())

    ==============================================================================
    Dep. Variable:                    mpg   R-squared:                       0.815
    Model:                            OLS   Adj. R-squared:                  0.801
    Method:                 Least Squares   F-statistic:                     59.52
    Date:                Sun, 11 Dec 2022   Prob (F-statistic):           1.27e-10
    Time:                        04:25:46   Log-Likelihood:                -69.947
    No. Observations:                  30   AIC:                             145.9
    Df Residuals:                      27   BIC:                             150.1
    Df Model:                           2
    Covariance Type:            nonrobust
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept     36.5178      1.736     21.031      0.000      32.955      40.081
    wt            -3.6425      0.686     -5.312      0.000      -5.049      -2.236
    hp            -0.0330      0.009     -3.503      0.002      -0.052      -0.014
    ==============================================================================
    Omnibus:                        7.195   Durbin-Watson:                   2.073
    Prob(Omnibus):                  0.027   Jarque-Bera (JB):                5.624
    Skew:                           1.011   Prob(JB):                       0.0601
    Kurtosis:                       3.641   Cond. No.                         624.
    ==============================================================================
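
The coef, std err, t, and interval columns of the Step 5 output are tied together by simple arithmetic. The sketch below is one way to see that, using only the numbers printed in the summary; the example car (3,000 lbs, 110 hp), the helper name `predict_mpg`, and the rounded critical value `T_CRIT` are assumptions added for illustration and are not part of the original script.

```python
# Coefficients and standard errors copied from the Step 5 summary.
coef = {"Intercept": 36.5178, "wt": -3.6425, "hp": -0.0330}
std_err = {"Intercept": 1.736, "wt": 0.686, "hp": 0.009}


def predict_mpg(wt, hp):
    """Fitted equation: mpg = b0 + b1*wt + b2*hp (wt in 1000s of lbs)."""
    return coef["Intercept"] + coef["wt"] * wt + coef["hp"] * hp


# Hypothetical car, not a row from the data set: 3,000 lbs and 110 horsepower.
example_mpg = predict_mpg(wt=3.0, hp=110)
print(f"Predicted mpg: {example_mpg:.2f}")

# t statistic = coefficient / standard error. The printed standard errors are
# rounded, so these values only approximately match the t column.
t_stats = {name: coef[name] / std_err[name] for name in coef}

# 95% interval = coefficient +/- T_CRIT * standard error.
# T_CRIT is the two-sided 5% critical value of t with 27 residual degrees of
# freedom, rounded to three decimals (an assumption of this sketch).
T_CRIT = 2.052
conf_int = {
    name: (coef[name] - T_CRIT * std_err[name],
           coef[name] + T_CRIT * std_err[name])
    for name in coef
}

for name in coef:
    lower, upper = conf_int[name]
    print(f"{name:>9}: t = {t_stats[name]:7.3f}, "
          f"95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the sample of 30 cars is random, rerunning the notebook would give different coefficients; the numbers here apply only to the output shown above.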
1. Report the level of significance. What is the slope coefficient for the weight variable? Is this coefficient significant at the 5% level of significance (alpha = 0.05)? (Hint: Check the P-value, shown as P>|t|, for weight in the Python output. Recall that this is the individual t-test for the beta parameter.) See Step 5 in the Python script.
2. What is the slope coefficient for the horsepower variable? Is this coefficient significant at the 5% level of significance (alpha = 0.05)? (Hint: Check the P-value, shown as P>|t|, for horsepower in the Python output. Recall that this is the individual t-test for the beta parameter.) See Step 5 in the Python script.
3. What is the purpose of performing individual t-tests after carrying out the overall F-test? What are the differences in the interpretation of the two tests?
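
The three questions come down to comparing printed P-values with alpha and telling the joint F-test apart from the per-coefficient t-tests. The following is a minimal sketch under the assumption that the Step 5 output is the one shown in the question; every number is copied from that summary, and the variable names are illustrative only.

```python
ALPHA = 0.05  # level of significance stated in the questions

# Values copied from the Step 5 regression summary.
p_values = {"wt": 0.000, "hp": 0.002}  # P>|t| column, rounded to 3 decimals
r_squared = 0.815                      # R-squared
n_obs = 30                             # No. Observations
n_predictors = 2                       # Df Model
prob_f = 1.27e-10                      # Prob (F-statistic)

# Individual t-tests: each slope is tested on its own (H0: beta_j = 0),
# holding the other predictor in the model.
significant = {name: p < ALPHA for name, p in p_values.items()}
for name, is_sig in significant.items():
    verdict = "significant" if is_sig else "not significant"
    print(f"{name}: P>|t| = {p_values[name]:.3f} -> {verdict} at alpha = {ALPHA}")

# Overall F-test: both slopes are tested jointly (H0: beta_wt = beta_hp = 0).
# F = (R^2 / k) / ((1 - R^2) / (n - k - 1)). R^2 is rounded in the summary,
# so the result is close to, but not exactly, the printed F-statistic.
df_resid = n_obs - n_predictors - 1
f_stat = (r_squared / n_predictors) / ((1 - r_squared) / df_resid)
print(f"F({n_predictors}, {df_resid}) = {f_stat:.2f}")
print("Overall model significant:", prob_f < ALPHA)
```

The F-test says whether at least one predictor is useful; the t-tests then indicate which individual predictors contribute once the other is already in the model.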





