08.0 Scatterplots and Linear Models
School
Ivy Tech Community College, Indianapolis *
*We aren’t endorsed by this school
Course
123
Subject
Mathematics
Date
Apr 3, 2024
Type
Pages
19
Uploaded by ConstableSpiderPerson901
8. Linear Models

Trends in Data

Our world is more data-driven than ever. The ability to use data to make predictions has become an essential skill. Below is a scatterplot of data based on the average MATH 123 grades for 302 students and the number of class sessions they missed:

1) The independent variables are all the values found along the x-axis (horizontal axis) of a graph. What is the independent variable of this graph? Why might it be called “independent”?
2) The dependent variables are all the values found along the y-axis (vertical axis) of a graph. What is the dependent variable of this graph?
3) Data is graphed by plotting ordered pairs, often written as (x, y). Interpret the ordered pair provided on the graph, and be sure to include what each value represents.
4) A positive trend occurs when one variable increases and the other variable also increases; a negative trend occurs when one variable increases and the other variable decreases. Using the names of the two variables from the scatterplot above, describe the kind of trend occurring here.
5) Based on this data, what course grade would you predict for a student who misses 12 class sessions? What are some limitations on your prediction?

[Figure: scatterplot "Average Course Scores vs. Attendance 2021-2022"; x-axis: Number of Absences (0 to 16); y-axis: Overall Course Score (0 to 100); labeled point: (5, 61)]
Discover!
Modeling Linear Data with Equations

The new Student Life Director received the following graphs of account balances for different student groups for the first 5 weeks of the semester. He would like to write an equation that would model the balance in each account for a given week.

1) How are the graphs similar to one another? What patterns do you notice?
2) What are the key characteristics that would be needed to describe the patterns shown?
3) Where does each of the account balances begin? This value is also known as the y-intercept.
   a) LSU Account Balance Starting value:
   b) SGA Account Balance Starting value:
   c) PTK Account Balance Starting value:

Next, we’ll consider how each account’s balance is changing from time to time.
Discover!
The y-intercept, represented by b, is the point at which the graph crosses the y-axis. b is also the y-value that corresponds to an x-value of zero. b represents the starting value or initial amount for the model.

[Figures: three line graphs titled "LSU Account Balance", "SGA Account Balance", and "PTK Account Balance"; x-axis: Week (0 to 5); y-axis: balance from -$20 to $30]
4) How much does each account change per week? Discuss how you determined this rate for each account. This value is also known as the slope or unit rate of change.
   a) LSU Account Balance Slope value:
   b) SGA Account Balance Slope value:
   c) PTK Account Balance Slope value:
   d) How are the values of the slope reflected in the graphs?

Notice that the slope is consistent between the points on the line. We can use any two points of data to find the slope. For example, the LSU account has points at (2, 10) and (5, 25):

m = (absolute change in y) / (absolute change in x) = ($25 − $10) / (5 − 2) = $15 / 3 weeks = $______ per week

Pick any two different points from the graph to confirm that you get the same result.

Linear models are equations that can be used to predict values of the independent or dependent variable when the absolute change between consecutive y-values is constant. In the y = mx + b linear model, m and b will be fixed numerical values for a given situation and x and y will be variables.

5) What equations would model the account balances shown in the graphs?
   a) LSU Account Balance Equation:
   b) SGA Account Balance Equation:
   c) PTK Account Balance Equation:
6) Use the equation for the PTK Account Balance to predict the balance for week 6.

Amount = Unit Rate (Quantity) + Starting Value

This is commonly written in slope-intercept form: y = mx + b.

The slope of a linear model, represented by m, describes the steepness of the line. It can be calculated from the ratio of absolute change in y to the absolute change in x. The slope is the unit rate of change (increase or decrease) in the model.
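The two-point slope calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names `slope`, `intercept`, and `predict` are made up for this sketch, and the only data used are the two LSU points quoted in the text, (2, 10) and (5, 25).

```python
def slope(p1, p2):
    """Unit rate of change: (change in y) / (change in x) between two points."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)


def intercept(m, point):
    """Solve y = m*x + b for b using one known point on the line."""
    x, y = point
    return y - m * x


# The two LSU points quoted in the text: (week, balance in dollars)
lsu_points = [(2, 10), (5, 25)]

m = slope(lsu_points[0], lsu_points[1])
b = intercept(m, lsu_points[0])


def predict(week):
    """Evaluate the linear model y = m*x + b for a given week."""
    return m * week + b


print(f"slope m = {m:g} dollars per week, intercept b = {b:g} dollars")
print(f"predicted balance in week 6: {predict(6):g} dollars")
```

Swapping in any other pair of points read from the same line should give the same slope, which is the check the worksheet asks for.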
Creating a Linear Model

David weighs 240 pounds and sets a weight loss goal of 2 pounds per week.

1) We will create a linear model to predict David’s weight for a given number of weeks after he begins his plan.
   a) What will the dependent variable, y, represent in our model? What units will y be measured in?
   b) What will the independent variable, x, represent in our model? What units will x be measured in?
   c) What is the y-intercept, b, the starting point?
   d) What is the slope, m, the rate at which y changes when x increases by 1 week? Is it increasing or decreasing?
   e) Put the pieces together to write the model y = mx + b.
2) How much does David hope to weigh after 4 weeks?
3) How much does David hope to weigh after 10 weeks?
4) How long would it take David to reach his goal weight of 180 pounds on this plan?
5) Complete the graph of David’s plan below.
6) If David were to raise his weight loss goal to 3 pounds per week, how would the model and the graph change?
7) Is it reasonable to assume David could stick to this weight loss indefinitely?

[Figure: blank graph "Weight Loss Plan"; x-axis: Weeks (0 to 30); y-axis: Weight in Pounds (180 to 240)]
Connect!
Using Excel for Linear Models

We can use Excel to quickly generate the values for several (x, y) points based on an equation.

Step 1: First we will generate a column of consecutive values for our x values. We’ve labeled column A as ‘Weeks’, and entered a ‘0’ and ‘1’ in the column. Now highlight the cells with both the 0 and 1 and grab the bottom right corner to autofill the values up to the desired number. Place the mouse on that corner (a solid black + cursor should appear) and drag down while holding the left button. We’ve auto-filled this column to 10. Note the icon in the bottom right that appears when dragging the cells.

Step 2: Now we will enter an equation in column B that will generate values for our dependent variable. These values will depend on the values in column A. In cell B2, we’ve entered the formula =-2*A2+240. The use of the cell reference A2 in the formula is called a relative reference. After the formula has been entered in cell B2, select cell B2 and use the auto-fill drag feature to auto-fill the values for column B. You can also double-click the bottom right corner of the cell to auto-fill.

Step 3: After the data for column B has been generated, highlight both columns and insert a graph. For this graph, we chose a “Scatterplot with Smooth Lines and Markers.” Remember that you can (and should) add axis titles to the graph by clicking the + icon next to the graph or using the Design menu.
Excel!
Breaking Even

The local Phi Theta Kappa chapter is selling lemon shakeups on campus as a fundraiser for their trip to Catalyst this year. PTK members are volunteering their time to work at the booth, but the chapter spent $90 on supplies to make the shakeups. They intend to charge $3 per shakeup. Use the questions below to construct a linear model that will give the profit or loss for a given number of shakeups sold.

1) What will x and y represent in terms of the problem?
2) What is the slope, m?
3) What is the starting amount, b, prior to selling the shakeups? Hint: Is the group starting with a profit or a loss?
4) Construct a linear model that represents the money raised by this fundraiser.
5) The chapter purchased enough materials to make 225 shakeups. How much money will they have raised if they sell all 225 shakeups?
6) How many shakeups would they have to sell to raise $300?
7) The break-even point of a business model is the point where y = 0 (also called the x-intercept). In other words, it is the point where all of the initial costs have been paid for, and the profit is $0. How many shakeups will the chapter have to sell to break even?

Constructing Models from Data Points

1) In 1980, the average cost of a pack of cigarettes was $1.00. In 2000, the average cost was $3.00 per pack. We will use this information to create a linear model for prices.
   a) Calculate the slope needed for a linear model.
   b) Use the slope value in a meaningful sentence.
   c) Construct a linear model that predicts the cost of a pack of cigarettes after 1980.
   d) Use this information to estimate the cost of a pack of cigarettes in 2020. Hint: what value will you use for x for 2020?
Connect!
Connect!
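The Breaking Even fundraiser above can be sketched in Python as a way to check answers. This sketch assumes the $90 spent on supplies is treated as a starting loss (b = -90) and the $3 price as the slope; the names `profit` and `shakeups_needed` are made up for this illustration.

```python
import math

PRICE_PER_SHAKEUP = 3   # dollars earned per shakeup sold (slope m)
SUPPLY_COST = 90        # dollars spent before any sales (assumed b = -90)


def profit(shakeups_sold):
    """Profit (positive) or loss (negative) after selling this many shakeups."""
    return PRICE_PER_SHAKEUP * shakeups_sold - SUPPLY_COST


def shakeups_needed(target_profit):
    """Smallest whole number of shakeups whose profit reaches the target."""
    return math.ceil((target_profit + SUPPLY_COST) / PRICE_PER_SHAKEUP)


# Break-even is the x-intercept: the number of sales where profit is $0
break_even = shakeups_needed(0)

print("profit after 225 shakeups:", profit(225))
print("shakeups needed to raise $300:", shakeups_needed(300))
print("shakeups needed to break even:", break_even)
```

Rounding up with `math.ceil` reflects that a fraction of a shakeup cannot be sold; when the division comes out even, it has no effect.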
2) The population of Flat Rock, Illinois was 4,800 in the year 1900 but had dropped to just 450 by 2020.
   a) Calculate the slope rounded to the hundredths place. What does this value represent?
   b) Construct a linear model that predicts the population when given a year since 1900.
   c) Use your model to predict the population in the year 2025.
   d) Can this trend continue indefinitely? Explain.

Alternative Starting Year

Suppose the local population of a certain insect was estimated to be 4.8 million in 1952 and 11.4 million in 2002:

1) Calculate the unit rate of change of the population per year.
2) Construct a linear model that assumes 1952 is the starting point.
3) Use your model to find the insect population for 1992.

In the examples so far, we have equated the starting year with x = 0. Using such models requires us to determine what value to use for x by finding how many years a given year is from the start year. In the alternative approach below, the y-intercept is shifted so that x = 0 literally starts at year 0. Using this technique allows the user to plug the actual year in for x.

4) Replace y, x, and m in the y = mx + b formula with the slope you found in number 1 above and the point (2002, 11.4). Solve the equation for b.
5) Write the linear model using the m from number 1 and the b from number 4 above.
6) Use this model to find the insect population for 1992. How does this value compare to your answer from number 3 above?
7) Replace y, x, and m in the y = mx + b formula with the slope you found in number 1 above and the point (1952, 4.8). Solve the equation for b. How does this value compare to your answer from number 4 above?
Explore!
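The two ways of setting up the insect model can be compared side by side in a short Python sketch. It uses only the two data points from the text, (1952, 4.8) and (2002, 11.4), with population in millions; the function names `since_1952` and `by_year` are made up for this illustration.

```python
# The two estimates given in the text: (year, population in millions)
p_1952 = (1952, 4.8)
p_2002 = (2002, 11.4)

# Unit rate of change: millions of insects per year
m = (p_2002[1] - p_1952[1]) / (p_2002[0] - p_1952[0])


def since_1952(year):
    """Approach 1: x is the number of years since 1952, so b is the 1952 value."""
    return m * (year - 1952) + p_1952[1]


# Approach 2: x is the actual year, so b is found by solving y = m*x + b
b_from_2002 = p_2002[1] - m * p_2002[0]
b_from_1952 = p_1952[1] - m * p_1952[0]


def by_year(year):
    """Approach 2: plug the calendar year straight into y = m*x + b."""
    return m * year + b_from_2002


print("slope:", round(m, 3))
print("b from (2002, 11.4):", round(b_from_2002, 3))
print("b from (1952, 4.8): ", round(b_from_1952, 3))
print("1992 estimate, years-since-1952 model:", round(since_1952(1992), 2))
print("1992 estimate, actual-year model:     ", round(by_year(1992), 2))
```

Both points lie on the same line, so solving for b from either one gives the same shifted intercept, and the two models agree on every prediction.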
Using Linear Models

1) Consider the graph to the right:
   a) Write an equation to model the number of items sold as shown in the graph.
   b) Use the model to determine how many items would be sold after 11 festivals were attended.
   c) How many festivals would be required to exceed 350 sales?
2) A bowling alley does not allow any personal footwear on their lanes; instead, they charge $3.50 for shoe rental. For each game bowled, bowlers are charged an additional $5.25.
   a) Construct the linear model that would determine the total cost for any number of games bowled.
   b) If you only have $20 to spend, what is the maximum number of games you could bowl?
3) The median adjusted income in Indiana in 1990 was $46,300 and in 2011 was $44,450.
   a) Find the unit rate of change.
   b) Write a linear model to predict the median adjusted income for any given year.
   c) Use your linear model to predict the median adjusted income in Indiana for the year 2000.

[Figure: "Sales of Handmade Goods"; x-axis: Festivals Attended (ticks 0 to 6); y-axis: Number of Items Sold (0 to 200); data labels: 15, 40, 65, 90, 115, 140, 165, 190]
Connect!
Percent of Smokers in the United States

The following graphic comes from a Gallup survey and shows the results of a longitudinal study that asked if the respondent had smoked in the past week. Select points are labeled.

[Figure: "Percentage of Smokers U.S."; x-axis: Year (1971 to 2021); y-axis: Percentage Points (0 to 50); labeled points: (1971, 42), (1978, 36), (1983, 38), (1988, 32), (2000, 25), (2009, 20), (2014, 21), (2021, 16)]

4) A dark line has been drawn from the starting value in 1971 to the end value for 2021. Find the linear model for this line.
   a) If we let 1971 be the start time, then x will be the years since 1971; this means for the year 1971, x = 0. If 1971 is the start year, what is the y-intercept, b?
   b) Calculate the slope, m, the rate at which y changes when x increases by 1 year. Is it increasing or decreasing?
   c) Put the pieces together to write the model y = mx + b.
5) Was the actual decrease in the percentage of smokers in the U.S. constant or did it vary from year to year?
6) Does the model assume a constant decrease or a varying decrease?
7) If the decrease had been constant from 1971 to 2021, what would the estimated percentage of smokers have been at the turn of the century? (Hint: What is the value of x?)
8) Compare the estimate for the year 2000 with the actual value from the graph. Write a sentence comparing the two values. (Think absolute and relative change!)
9) Use the linear model to estimate the percentage of smokers in 2011.
Connect!
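The endpoint model for the smokers graph, and the absolute and relative comparison against the labeled 2000 value, can be sketched as follows. The sketch uses only the labeled points (1971, 42), (2021, 16), and (2000, 25) from the graph; the variable and function names are made up for this illustration, and the relative difference is taken relative to the actual value, which is one reasonable choice of reference.

```python
START = (1971, 42)   # (year, percent of smokers), first labeled point
END = (2021, 16)     # last labeled point

# Slope in percentage points per year; intercept is the value when x = 0 (1971)
m = (END[1] - START[1]) / (END[0] - START[0])
b = START[1]


def model(year):
    """Endpoint model y = m*x + b, where x is the number of years since 1971."""
    x = year - START[0]
    return m * x + b


actual_2000 = 25                       # labeled point (2000, 25)
estimate_2000 = model(2000)

# Absolute difference: in percentage points
absolute_diff = estimate_2000 - actual_2000
# Relative difference: as a fraction of the actual value
relative_diff = absolute_diff / actual_2000

print(f"model estimate for 2000: {estimate_2000:.2f}")
print(f"absolute difference:     {absolute_diff:.2f} percentage points")
print(f"relative difference:     {relative_diff:.1%} of the actual value")
```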
Trendline

The previous example used the beginning and end points of the data to create a model, but this may not provide accurate predictions. A better method is to use linear regression to find the best fit line or trendline. Linear regression is a complex and tedious mathematical process, but Excel can easily do the work for us. The resulting line is the one that lies closest, overall, to all the datapoints on the scatterplot.

[Figure: scatterplot "Average Course Scores vs. Attendance 2021-2022" with trendline y = -4.6x + 81.2; x-axis: Number of Absences (0 to 16); y-axis: Overall Course Score (0 to 100)]

1) What is the slope of the linear model? Explain the meaning of this value in context.
2) What course grade does the model predict for students who missed 0 classes?
3) Use the equation to predict the course grade for students who missed 5 classes.
4) The actual data point for 5 missed classes is (5, 61). Compare the model’s value to the actual value.

The scope of the model refers to the range of values of the independent (x) variable covered by the data. It is not recommended that the model be used for estimating values very far outside the scope of the model since the trend may change outside the range of values for which there is data.

5) What is the scope of the model for Average Course Scores vs. Attendance?
6) Would it make sense to estimate the average course grade for a student who misses 16 class sessions? Why or why not?
Discover!
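To give a feel for what a trendline calculation does, here is a minimal least-squares sketch in plain Python. It is an illustration under stated assumptions: it fits only the eight labeled points from the earlier "Percentage of Smokers U.S." graph (not the full survey data), and the function name `least_squares` is made up for this sketch. The result is compared with the endpoint slope built from the first and last points.

```python
# Labeled points from the "Percentage of Smokers U.S." graph: (year, percent)
points = [(1971, 42), (1978, 36), (1983, 38), (1988, 32),
          (2000, 25), (2009, 20), (2014, 21), (2021, 16)]


def least_squares(xs, ys):
    """Slope and intercept of the line minimizing the squared vertical errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x   # the fitted line passes through (mean_x, mean_y)
    return m, b


# Use years since 1971 as x, matching the earlier endpoint model
xs = [year - 1971 for year, _ in points]
ys = [pct for _, pct in points]

m_fit, b_fit = least_squares(xs, ys)

# Endpoint slope for comparison: uses only the first and last points
m_end = (16 - 42) / (2021 - 1971)

print(f"least-squares line: y = {m_fit:.3f}x + {b_fit:.2f}")
print(f"endpoint slope:     {m_end:.3f}")
```

The fitted line uses every point rather than just two, which is why its slope and intercept differ slightly from the endpoint model.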
Linear Correlation

Statistical studies relating two variables often report the strength of linear correlation, or relationship, between variables. Correlation is most commonly used in analyzing data that follows a linear pattern. The strength or weakness of the correlation is determined by how clustered or widespread the data values are from a straight line. The closer the datapoints are to a line, the stronger the correlation.

Correlation is a statistical measure of the relationship between two variables.

[Figures: example scatterplots labeled "No Linear Correlation", "Strong Positive Correlation", "Moderate Positive Correlation", "Strong Negative Correlation", and "Moderate Negative Correlation"]

Assessing Linear Correlation

1) The data from the Average Course Scores vs. Attendance is an example of strong negative linear correlation. Does this guarantee that all students who have more absences will have a lower course score?

Correlation is a mathematical relationship and does NOT imply causation. Two variables may be related, but that does not mean changes in one variable cause changes in the other. Below are some examples to highlight the difference between correlation and causation. Can you think of other examples?

Correlation (Relationship)
• Time spent studying and test scores
• Education and salary
• Alcoholism and smoking

Causation (Cause and Effect)
• Taking a test and earning points
• Completing requirements and being certified
• Alcoholism and cirrhosis of the liver

Linear Correlation Coefficient

In addition to the best fit trendline, Excel can also provide an R² value that quantifies the strength or weakness of the correlation between x and y. The linear correlation coefficient, r, can be found by taking the square root of R². This calculation will result in a value between 0 and 1, with 0 being no correlation and 1 being a perfect fit line. The correlation will be positive or negative based on the slope of the line. The table below provides a rough guide to r value and strength of correlation.

r Value     Strength of Correlation
0-0.29      None or Very Weak
0.3-0.49    Weak
0.5-0.69    Moderate
0.7-0.89    Strong
0.9-1       Very Strong

The trendline on the previous page has an R² value of 0.9411. Taking √0.9411 yields 0.97, and because the slope of that line is negative, this is a very strong negative correlation.

1) The graph on the next page has an R² value of 0.3439. What rating would you give the correlation?
Discover!
Explore!
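The step from R² to a signed r value and a strength rating can be sketched in Python. This is a rough illustration: the cutoffs come from the guide table in the text, the function names are made up, and values that fall in the small gaps between the table's ranges (for example 0.295) are assigned to the lower category as an assumption of this sketch.

```python
import math

# (lower bound of |r|, rating), taken from the rough guide in the text
STRENGTH_GUIDE = [
    (0.9, "Very Strong"),
    (0.7, "Strong"),
    (0.5, "Moderate"),
    (0.3, "Weak"),
    (0.0, "None or Very Weak"),
]


def correlation_from_r_squared(r_squared, slope):
    """Square root of R-squared, with the sign taken from the trendline slope."""
    r = math.sqrt(r_squared)
    return -r if slope < 0 else r


def strength(r):
    """Look up the rating for r using the guide table (sign is ignored)."""
    for lower_bound, rating in STRENGTH_GUIDE:
        if abs(r) >= lower_bound:
            return rating


# Example from the text: R-squared of 0.9411 on the trendline y = -4.6x + 81.2
r = correlation_from_r_squared(0.9411, slope=-4.6)
print(f"r = {r:.2f}: {strength(r)}")
```

The square root alone is never negative, so the sign has to be supplied separately from the slope of the trendline.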
Life Expectancy and GDP

[Figure: scatterplot "Life Expectancy at birth vs. GDP per capita" with trendline y = 0.0002x + 66.819; x-axis: $0 to $140,000; y-axis: 40 to 90; labeled countries: Gabon, Iceland, Luxembourg, Qatar, Russia, Swaziland, U.S.]

1) Consider the graph above. Would you say that there is a strong, moderate, or weak correlation between life expectancy and GDP per capita?
2) What is the scope of the model?
3) Should we use the model to predict life expectancy beyond $150,000? Explain.
4) Use the linear model to estimate the life expectancy when x = $102,000. How does this compare to the actual life expectancy of Luxembourg?

Using Excel for Scatterplots and Linear Models

Given a list of x and y values, or just y values, we can generate a Scatterplot in Excel and then insert a trendline that best fits that data.

Step 1: Enter data - The data should be in two adjoining columns. If your data is not in the desired order, you may have to rearrange the data using cut and paste, or you can use the ‘Select Data’ option after the graph is created and make changes there. Excel will make the values in the left column the x-values for the graph, and the values in the right column the y-values. If only one column of data is supplied, Excel will make that data the y-values and start the x-values at zero by default.
Explore!
Excel!
Step 2: Highlight the data - Highlight the desired data for the scatter plot. Be sure to highlight all the data by pulling the cursor all the way to the bottom of the data set.

Step 3: Insert a scatterplot - From the tabs at the top of the page, select the “Insert” tab. Then, in the “Chart” area, find and select the scatterplots. A chart-type box will appear. Select the first chart type for scatterplot only (circled).

Step 4: Adjust an axis - Notice how the scatter plot is bunched. The x-axis values are marked 0-450, but the data set only has values from about 320-400. We can set the axis options to make the chart more readable. Right-click on the x-axis values and select “Format Axis” from the menu. In the Format Axis menu, enter a new minimum and maximum, and you can enter a different major scale. Here the minimum is reset at 320, the maximum is 400 and the major scale is set at 10 points. Notice how the scatterplot is easier to read.

[Figure: Excel scatterplot before the axis adjustment, with the default "Chart Title"; x-axis: 0.00 to 500.00; y-axis: 13.80 to 14.70; callout: "Right-click the x-axis values"]

Step 5: Add trendline - Edit the Chart title, then add and edit the axis titles (see Excel in section 5). Select the graph so that the ‘Design’ menu tab is open at the top of Excel. Click on ‘Add Chart Element’ in the Design Menu and arrow down to the ‘Trendline’ and then the ‘Linear’ option. You can also add a trendline by clicking the plus symbol to the right of the graph or by right-clicking on the data points to open a menu to add the trendline.
Step 6: Add the linear equation - Click on “Add Chart Element” in the Design Menu again, arrow down to “Trendline” and then to “More Trendline Options” to open the “Format Trendline” menu. Scroll to the bottom and check the box for “Display Equation on Chart.” You may need to move the box with the cursor to put it in a more desirable location. The R² value can also be optionally added. Remember that you can (and should) add axis titles to the graph using the + icon beside the graph or using the Design Menu.

Global Temperature and Atmospheric CO2

[Figure: scatterplot "Average Global Temperature and Atmospheric CO2 Concentration" with trendline y = 0.0099x + 10.754, R² = 0.8885; x-axis: CO2 Concentration in ppm (320 to 400); y-axis: Avg Global Temp in Celsius (13.6 to 15.0)]

The chart above shows the atmospheric CO2 concentration in ppm (parts per million) and the corresponding average global temperature for a particular year from 1970 to 2021.

1) Which variable is dependent upon which according to the graph?
   ___________________________ is dependent upon __________________________
2) Interpret the slope of the model.
3) Use the linear model to estimate the average temperature expected when CO2 is 400 ppm.
4) How strong is the correlation between the average global temperature and the atmospheric concentration of CO2?
5) Can you use this linear model to estimate the CO2 concentration when the temperature is 14 °C?
6) Can it be concluded from the graph that CO2 causes global climate change? Explain.
Connect!
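A linear model can be evaluated in the forward direction (x to y) or rearranged algebraically to go from y back to x. The sketch below does both for the trendline shown on the CO2 chart, y = 0.0099x + 10.754. The function names are made up for this illustration, and the rearrangement is only the algebra: whether a backward estimate is meaningful for this data is the question the worksheet poses.

```python
SLOPE = 0.0099       # degrees Celsius per ppm of CO2, from the chart's trendline
INTERCEPT = 10.754   # degrees Celsius, from the chart's trendline


def temperature(co2_ppm):
    """Forward use of the model: y = m*x + b."""
    return SLOPE * co2_ppm + INTERCEPT


def co2_for_temperature(temp_c):
    """Backward use: solve y = m*x + b for x, giving x = (y - b) / m."""
    return (temp_c - INTERCEPT) / SLOPE


print(f"model temperature at 400 ppm: {temperature(400):.2f} degrees C")
print(f"CO2 level the model pairs with 14 degrees C: {co2_for_temperature(14):.1f} ppm")
```

Feeding the output of one function into the other returns the starting value, which is a quick way to confirm the rearrangement was done correctly.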
Global Temperature

Consider the similarities and differences in the graphs below:

1) Interpret the slope of the linear model for the graphs.
2) What does x represent in each of the graphs?
3) Use the trendline equation (linear model) from the first graph to predict the average global temperature for 2020 rounded to the nearest tenth.
4) Use the trendline equation from the second graph to predict the average global temperature for 2020 rounded to the nearest tenth.

[First graph: "Average Global Temperature 1970-2021" with trendline y = 0.01737x + 13.92332; x-axis: Years Since 1970 (0 to 60); y-axis: Degrees Celsius (13.6 to 15.0)]
[Second graph: "Average Global Temperature 1970-2021" with trendline y = 0.01737x - 20.28805; x-axis: Year (1970 to 2030); y-axis: Degrees Celsius (13.6 to 15.0)]
Connect!
Breast Cancer Mortality

We can use this chart to help us better understand the impact of breast cancer in Indiana:

[Figure: "Breast Cancer Mortality in Indiana, Ages 50 and Under" with trendline y = -0.1351x + 276.25; x-axis: 1975 to 2015; y-axis: "Mortality rate per 100" (3 to 11)]

1) What is the scope of the model in years?
2) How would you classify this correlation? Is it positive or negative; strong, moderate, or weak?
3) Could you say that the passage of time is causing mortality rates to decrease? Explain.
4) Use the linear model to estimate the mortality rate for 2001. How does this compare to the actual mortality rate for 2001?
5) Use the linear model to predict the mortality rate for 2015. Will the model be accurate for 2015?
6) Find the following:
   a) Estimate the absolute change in actual mortality rates from 1975 to 2010.
   b) Approximate the relative change in actual mortality rates from 1975 to 2010.
   c) Which measure of change would you share if:
      - you were hoping to implement a new experimental treatment?
      - you were throwing a benefit for cancer survivors?
Connect!
Firearm Death Rates

On the graph below, firearm death rates for 2013 were broken down by state and compared to gun ownership by state:

1) Indiana’s actual firearm death rate per 100,000 was 13.04. If the population of Indiana was 6.6 million in 2013, how many firearm deaths were there in Indiana? (Round to the nearest whole number.)
2) Discuss the mathematical trend of this data. What does the trend mean in context?
3) Classify the correlation as either weak, moderate, or strong.
4) Does this mean that owning a gun causes firearm deaths?
5) 39.1% of Hoosiers owned a gun in 2013. Use the linear model to predict the expected firearm death rate in Indiana. Remember to use the decimal version of the percent in the model.
6) How does the actual firearm death rate of 13.04 compare to the predicted value found in #5?

[Figure: scatterplot "Firearm Death Rates per 100,000 (2013)" with trendline y = 22.946x + 3.1999; x-axis: Percent of Population that are Registered Gun Owners (0.00% to 60.00%); y-axis: Number of Firearm Deaths per 100,000 People (0 to 25); labeled states: AK, LA, WY, KY, IN, SD, IL, HI]
Explore!
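Two unit conversions in the firearm example are easy to get wrong: turning a rate per 100,000 into a count, and entering a percent into the model as a decimal. The Python sketch below illustrates both with the figures given in the text; the variable and function names are made up for this illustration.

```python
RATE_PER_100K = 13.04    # Indiana's actual firearm deaths per 100,000 people (2013)
POPULATION = 6_600_000   # Indiana's population as given in the text

# A rate "per 100,000" becomes a count by scaling to the whole population
deaths = round(RATE_PER_100K / 100_000 * POPULATION)


def predicted_rate(ownership_fraction):
    """Trendline from the graph; x must be a decimal fraction, not a percent."""
    return 22.946 * ownership_fraction + 3.1999


# 39.1% has to be entered as 0.391
predicted = predicted_rate(39.1 / 100)

# Actual minus predicted: positive means the actual rate sits above the line
residual = RATE_PER_100K - predicted

print("estimated firearm deaths in Indiana:", deaths)
print(f"predicted rate at 39.1% ownership: {predicted:.2f} per 100,000")
print(f"actual minus predicted: {residual:.2f} per 100,000")
```

Entering 39.1 instead of 0.391 would inflate the prediction roughly a hundredfold, which is why the text reminds readers to use the decimal version.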
Linear Models

1) Leasing a particular car costs $850 plus a monthly fee of $225. Create a model that gives the total cost for a given number of months leased.
   a) What will the dependent variable, y, represent in our model? What units will y be measured in?
   b) What will the independent variable, x, represent in our model? What units will x be measured in?
   c) What is the starting value for the lease? In other words, what is the y-intercept, b?
   d) What is the slope, m? Slope represents the rate at which y changes when x increases by 1 month. Is it increasing or decreasing?
   e) Put the pieces together to write the model y = mx + b.
   f) How much would it cost to lease the car for 6 months?
   g) How many months could you lease the car if your budget was $3,800?
2) The Happy Widget Company has a fixed cost of $1,500 each day to run their factory and a variable cost of $1.25 for each widget they produce.
   a) Create a linear model for their daily cost. What do your x and y represent?
   b) How much does it cost them to produce 300 widgets?
   c) How many widgets can they produce for $2,040?
Practice!
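When a budget question is answered with a linear model, the algebra often produces a fractional x that has to be rounded down to a whole unit. The sketch below illustrates this for the car lease, assuming the $850 is a one-time starting cost and that only whole months can be leased; the function names are made up for this illustration.

```python
import math

STARTING_COST = 850   # dollars, assumed to be a one-time up-front cost (b)
MONTHLY_FEE = 225     # dollars per month (m)


def lease_cost(months):
    """Total cost of leasing for a given number of months: y = m*x + b."""
    return MONTHLY_FEE * months + STARTING_COST


def max_months(budget):
    """Largest whole number of months whose total cost fits within the budget."""
    return math.floor((budget - STARTING_COST) / MONTHLY_FEE)


months = max_months(3800)
print("cost for 6 months:", lease_cost(6))
print("months affordable on $3,800:", months)
print("cost for that many months:", lease_cost(months))
```

Rounding down matters here because one more month would push the total past the budget, even though the exact solution of the equation is not a whole number.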
3) Christmas bonuses are based on years of service. Dante is a new employee and his Christmas bonus was $225. Tim, who also works in Dante’s department, has been with the company for 22 years; his bonus was $675.
   a) Calculate and interpret the slope.
   b) Construct a linear model that predicts the Christmas bonus for any number of years of service.
   c) Use your model to predict the Christmas bonus for an employee who has worked with the company for 13 years.

Trendlines and Linear Models

[Figure: "Meat Consumption Per Capita U.S. 1909-2012" with trendline y = 1.0472x - 1916.4; x-axis: 1905 to 2005; y-axis: Pounds Per Person (60.0 to 200.0); labeled point: (2010, 172.9)]

1) Interpret the slope of the model.
2) Use the linear model to estimate the meat consumption for 2010.
3) How does the model estimate compare with the actual values labeled on the graph?
4) Would it be reasonable to use the model to predict consumption in 2025?

(Data taken from www.earth-policy.org/?/data_center/C23; Scatterplot created by authors)
Practice!