---
title: "Lab 2: Introduction to Data"
author: "STAT 311"
date: "Winter 2024"
output: openintro::lab_report
---
Adapted from: https://openintrostat.github.io/oilabs-tidy/02_intro_to_data/intro_to_data.html
```{r load-packages, message=FALSE, warning=FALSE}
library(tidyverse)
library(openintro)
```
In this lab we explore flights, specifically a random sample of domestic flights that departed from the three major New York City airports in 2013. We will generate simple graphical and numerical summaries of data on these flights and explore delay times. Since this is a large data set, along the way you’ll also learn the indispensable skills of data processing and subsetting.
First, we’ll view the nycflights data frame. Let’s load it into our environment.
```{r}
data(nycflights)
```
The data set nycflights is a data matrix, with each row representing an observation and each column representing a variable. R calls this data format a data frame, which is a term that will be used throughout the labs. For this data set, each observation is a single flight.
To view the names of the variables, type the command
```{r}
names(nycflights)
```
The codebook (description of the variables) can be accessed by pulling up the help file:
```{r}
?nycflights
```
Remember that you can use glimpse to take a quick peek at your data to understand its contents better.
```{r}
glimpse(nycflights)
```
The nycflights data frame is a massive trove of information. Let’s think about some
questions we might want to answer with these data:
- How delayed were flights that were headed to Los Angeles?
- How do departure delays vary by month?
- Which of the three major NYC airports has the best on time percentage for departing flights?
Let’s start by examining the distribution of departure delays of all flights with a
histogram.
```{r}
ggplot(data = nycflights, aes(x = dep_delay)) +
geom_histogram()
```
This function says to plot the dep_delay variable from the nycflights data frame on
the x-axis. It also defines a geom (short for geometric object), which describes the type of plot you will produce.
Histograms are generally a very good way to see the shape of a single distribution of numerical data, but that shape can change depending on how the data is split between the different bins. You can easily define the binwidth you want to use:
```{r}
ggplot(data = nycflights, aes(x = dep_delay)) +
geom_histogram(binwidth = 15)
```
```{r}
ggplot(data = nycflights, aes(x = dep_delay)) +
geom_histogram(binwidth = 150)
```
### Exercise 1
**Question:** Look carefully at these three histograms. How do they compare? Are features revealed in one that are obscured in another?
#### Solution
<!-- End of Exercise 1 -->
If you want to visualize only delays of flights headed to Los Angeles, you need to first filter the data for flights with that destination (dest == "LAX") and then make a histogram of the departure delays of only those flights.
```{r}
lax_flights <- nycflights %>% filter(dest == "LAX")
lax_flights
```
(Note, it’s common to add a break to a new line after %>% to help readability)
Let’s decipher this command: Take the nycflights data frame, filter for flights headed to LAX, and save the result as a new data frame called lax_flights.
- == means “if it’s equal to”.
- LAX is in quotation marks since it is a character string.
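
As a minimal sketch of the same pattern, written with the line break after `%>%`, the chunk below filters a small made-up data frame. The name `toy_flights` and its values are hypothetical and only stand in for `nycflights`.

```{r}
library(dplyr)

# Hypothetical toy data standing in for nycflights (values are made up)
toy_flights <- data.frame(
  dest      = c("LAX", "SFO", "LAX", "ORD"),
  dep_delay = c(-2, 15, 40, 0)
)

# Same kind of filter as above, with a line break after the pipe
toy_lax <- toy_flights %>%
  filter(dest == "LAX")

toy_lax
```

Only the two rows whose `dest` is exactly the string "LAX" are kept.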
```{r}
#Histogram of departure delays for flights flying to LAX
ggplot(data = lax_flights, aes(x = dep_delay)) +
geom_histogram()
```
Logical operators: Filtering for certain observations (e.g. flights from a particular airport) is often of interest in data frames where we might want to examine observations with certain characteristics separately from the rest of the data. To do so, you can use the filter function and a series of logical operators. The most commonly used logical operators for data analysis are as follows:
- == means “equal to”
- != means “not equal to”
- > or < means “greater than” or “less than”
- >= or <= means “greater than or equal to” or “less than or equal to”
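
These operators can be combined inside a single `filter` call. The sketch below uses a made-up data frame (`toy_flights` and its values are hypothetical). It also relies on two things not covered in the list above: separating conditions with a comma requires all of them to hold, and `|` means "or".

```{r}
library(dplyr)

# Hypothetical toy data (values are made up for illustration)
toy_flights <- data.frame(
  origin    = c("JFK", "LGA", "EWR", "JFK", "LGA"),
  month     = c(1, 2, 2, 7, 7),
  dep_delay = c(5, 70, -3, 120, 0)
)

# Comma-separated conditions must all hold (same as joining them with &)
not_jfk_long <- toy_flights %>%
  filter(origin != "JFK", dep_delay >= 60)

# | keeps rows where at least one condition holds
feb_or_long <- toy_flights %>%
  filter(month == 2 | dep_delay > 100)

not_jfk_long
feb_or_long
```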
You can also obtain numerical summaries for these flights:
```{r}
lax_flights %>%
summarise(mean_dd = mean(dep_delay), median_dd = median(dep_delay), n = n())
```
Note that in the summarise function you created a list of three different numerical summaries that you were interested in. The names of these elements are user-defined (here mean_dd, median_dd, and n), and you can customize these names as you like (just don’t use spaces in your names). Calculating these summary statistics also requires that you know the function calls. Note that n() reports the sample size.
Summary statistics: Some useful function calls for summary statistics for a single numerical variable are as follows:
- mean
- median
- sd
- var
- IQR
- min
- max
Note that each of these functions takes a single vector as an argument and returns a single value.
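
To illustrate, here is a short sketch that applies each function to one small vector. The vector `delays` is hypothetical, with values made up for illustration.

```{r}
# Hypothetical delays in minutes (made up for illustration)
delays <- c(-5, 0, 3, 10, 42)

mean(delays)    # arithmetic mean: 10
median(delays)  # middle value: 3
sd(delays)      # standard deviation
var(delays)     # variance (the square of sd)
IQR(delays)     # interquartile range
min(delays)     # smallest value: -5
max(delays)     # largest value: 42
```

Each call takes the whole vector and returns one number, which is why these functions fit naturally inside `summarise`.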
### Exercise 2
**Question:** Create a new data frame that includes flights headed to SFO in February, and save this data frame as `sfo_feb_flights`. How many flights meet these criteria?
#### Solution
```{r code-chunk-label}
# creating a data set with flights flying to SFO in Feb
#outputting the number of flights to SFO in February
```
### Exercise 3
**Question:** Describe the distribution of the **arrival** delays of these flights using a histogram and appropriate summary statistics.
#### Solution
Exploring a histogram with the data.
```{r, message = FALSE, warning = FALSE}
```
Looking at the summary statistics:
```{r}
# display summary statistics
```
The histogram shows a possible right skew in the data. This is further reinforced by the fact that the mean is larger than the median.
A right skew here would indicate that most flights arrive early or close to on time, but a few heavily delayed flights pull the mean toward higher values.
<!-- End of Exercise 3 -->
Another useful technique is quickly calculating summary statistics for various groups in your data frame. For example, we can modify the above command using the group_by function to get the same summary stats for each origin airport:
```{r}
# sfo_feb_flights %>%
# group_by(origin) %>%
# summarise(median_dd = median(dep_delay), iqr_dd = IQR(dep_delay), n_flights = n())
```
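
Since the chunk above is commented out, here is a minimal runnable sketch of the same `group_by` and `summarise` pattern on a made-up data frame. The name `toy_flights` and its values are hypothetical.

```{r}
library(dplyr)

# Hypothetical toy data (values are made up for illustration)
toy_flights <- data.frame(
  origin    = c("JFK", "JFK", "JFK", "EWR", "EWR"),
  dep_delay = c(-2, 4, 10, 0, 30)
)

# One row of summary statistics per origin airport
toy_summary <- toy_flights %>%
  group_by(origin) %>%
  summarise(
    median_dd = median(dep_delay),
    iqr_dd    = IQR(dep_delay),
    n_flights = n()
  )

toy_summary
```

The result has one row per group, so `n_flights` counts the flights within each origin rather than in the whole data frame.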
### Exercise 4
**Question:** Calculate the median and interquartile range for `arr_delay` of flights in the `sfo_feb_flights` data frame, grouped by carrier. Which carrier has the most variable arrival delays?
#### Solution
```{r}
```
We can see that there are only 5 carriers that operated flights (in our data set) to SFO in February of 2013.
Based on the IQR values above, it looks like Delta Airlines and United Airlines have the most variable arrival delays because they have the highest IQR.
<!-- End of Exercise 4 -->
### Exercise 4.1
**Question:** Which month has the highest average delay departing from an NYC airport?
```{r}
```
### Exercise 5
**Question:** Suppose you really dislike departure delays and you want to schedule your travel in a month that minimizes your potential departure delay leaving NYC. One option is to choose the month with the lowest mean departure delay. Another option is to choose the month with the lowest median departure delay. What are the pros and cons of these two choices?
#### Solution
**Con of using the median departure delay:**
**Con of using the mean departure delay:**
<!-- End of Exercise 5 -->
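
One way to see the trade-off is to compare the two statistics on a small set of delays that contains one extreme value. The vector `delays` below is hypothetical, with values made up for illustration.

```{r}
# Hypothetical departure delays in minutes (made up), with one extreme value
delays <- c(0, 2, 3, 5, 300)

mean(delays)    # 62: pulled upward by the single 300-minute delay
median(delays)  # 3: does not depend on how large the extreme value is
```

In this sketch the mean reflects the rare severe delay, while the median describes a typical flight but hides that delay entirely.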
### Exercise 6
Suppose you will be flying out of NYC and want to know which of the three major NYC airports has the best on time departure rate. Also suppose that, for you, a flight that is delayed for less than 5 minutes is basically “on time.” You consider any flight delayed for 5 minutes or more to be “delayed”.
**Question:** If you were selecting an airport simply based on on time departure percentage, which NYC airport would you choose to fly out of?
In order to determine which airport has the best on time departure rate, you can
- first classify each flight as “on time” or “delayed”,
- then group flights by origin airport,
- then calculate on time departure rates for each origin airport,
- and finally filter the summary output to include only the airport with the highest on time departure rate.
Let’s start with classifying each flight as “on time” or “delayed” by creating a new variable with the mutate function.
```{r}
nycflights <- nycflights %>%
mutate(dep_type = ifelse(dep_delay < 5, "on time", "delayed"))
nycflights
```
The first argument in the mutate function is the name of the new variable we want to create, in this case dep_type. Then if dep_delay < 5, we classify the flight as "on time" and "delayed" if not, i.e. if the flight is delayed for 5 or more minutes. Note that we are also overwriting the nycflights data frame with the new version of this data frame that includes the new dep_type variable.
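
To see how `ifelse` treats the cutoff, here is a small sketch on a plain vector. The values are hypothetical and chosen to include the boundary case of exactly 5 minutes.

```{r}
# Hypothetical departure delays in minutes (made up), including the boundary value 5
dep_delay <- c(-3, 4, 5, 60)

# ifelse() checks the condition element by element
dep_type <- ifelse(dep_delay < 5, "on time", "delayed")

dep_type
```

A delay of exactly 5 minutes fails the test `dep_delay < 5`, so it is labeled "delayed", which matches the definition used in this exercise.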
We can handle all of the remaining steps in one code chunk:
```{r}
nycflights %>%
group_by(origin) %>% #group nyc flights by origin airport
summarise(ot_dep_rate = sum(dep_type == "on time") / n()) %>% #on-time dep rate
filter(ot_dep_rate == max(ot_dep_rate)) #airport with best on-time dep rate
```
We can see that `LGA` has the best on time departure percentage.
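
An equivalent way to compute the rate, shown here as a sketch on made-up data, is to take the mean of the logical comparison, since R treats TRUE as 1 and FALSE as 0. The name `toy_flights` and its values are hypothetical.

```{r}
library(dplyr)

# Hypothetical toy data (values are made up for illustration)
toy_flights <- data.frame(
  origin   = c("JFK", "JFK", "LGA", "LGA", "LGA", "LGA"),
  dep_type = c("on time", "delayed", "on time", "on time", "on time", "delayed")
)

toy_rates <- toy_flights %>%
  group_by(origin) %>%
  summarise(
    rate_sum  = sum(dep_type == "on time") / n(),  # count divided by group size
    rate_mean = mean(dep_type == "on time")        # proportion of TRUE values
  )

toy_rates
```

Both columns hold the same proportions, so which form to use is a matter of taste.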
You can also visualize the distribution of on time departure rate across the three airports using a segmented bar plot.
```{r}
ggplot(data = nycflights, aes(x = origin, fill = dep_type)) +
geom_bar()
```
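
The bars above show counts, so airports with more flights get taller bars. If proportions are easier to compare, one option is the `position = "fill"` argument of `geom_bar`, sketched below on a made-up data frame. The name `toy_flights`, its values, and the axis label are hypothetical.

```{r}
library(ggplot2)

# Hypothetical toy data (values are made up for illustration)
toy_flights <- data.frame(
  origin   = c("JFK", "JFK", "LGA", "LGA", "LGA", "LGA"),
  dep_type = c("on time", "delayed", "on time", "on time", "on time", "delayed")
)

# position = "fill" rescales every bar to height 1, so segments show proportions
p_fill <- ggplot(data = toy_flights, aes(x = origin, fill = dep_type)) +
  geom_bar(position = "fill") +
  labs(y = "proportion")

p_fill
```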
### Exercise 7
**Question:** Mutate the data frame so that it includes a new variable that contains the average speed, `avg_speed` traveled by the plane for each flight (in mph). **Hint:** Average speed can be calculated as distance divided by number of hours of travel, and note that `air_time` is given in minutes.
#### Solution
This exercise is asking us to create the new `avg_speed` variable. Recall that the equation for average speed in miles per hour is given by
$$ \text{avg speed} = \dfrac{\text{distance in miles}}{\text{time in hours}} $$
Also recall that the `air_time` variable is given in minutes. This means
$$ \mathtt{avg\_speed} = \dfrac{\mathtt{distance \ in \ miles}}{\mathtt{time \ in \ minutes}} \times \dfrac{\mathtt{60 \ minutes}}{\mathtt{1 \ hr}} $$
Coding this in R:
```{r}
```
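
As a hedged sketch of that unit conversion, the chunk below applies it to a made-up data frame rather than to `nycflights`. The name `toy_flights` and its values are hypothetical; only the column names `distance` and `air_time` come from the exercise.

```{r}
library(dplyr)

# Hypothetical flights (values are made up): distance in miles, air_time in minutes
toy_flights <- data.frame(
  distance = c(2475, 200),
  air_time = c(330, 50)
)

# Divide air_time by 60 to convert minutes to hours, giving miles per hour
toy_flights <- toy_flights %>%
  mutate(avg_speed = distance / (air_time / 60))

toy_flights
```

For the first row, 330 minutes is 5.5 hours, so the average speed is 2475 / 5.5 = 450 mph.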
### Exercise 8
**Question:** Make a scatterplot of `avg_speed` vs. `distance`. Describe the relationship between average speed and distance. Hint: Use `geom_point()`.
#### Solution
```{r, message=FALSE}
```
##### Exercise 8 Follow Up
**Why are the `distance` values on the x-axis represented as a "stack" of points?**
To determine the answer to this question, it is best to start by looking at the first 20 observations for the `distance` variable.
```{r}
head(nycflights$distance, 20)
```
Notice that these values are all integers. The fact that `distance` is made up of only integer values (whole numbers) means there is a finite (limited) number of choices for the value of `distance`, which makes "stacked" points almost inevitable for large data sets. Let's look at how many unique distances were traveled.
```{r}
```
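
One way such a count could be obtained is with `n_distinct` from dplyr, sketched here on a made-up vector. The name `toy_distance` and its values are hypothetical.

```{r}
library(dplyr)

# Hypothetical distances in miles (values are made up), with repeats
toy_distance <- c(2475, 2475, 1089, 200, 200, 200)

length(toy_distance)      # number of observations: 6
n_distinct(toy_distance)  # number of different values: 3
```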
Note that our data set has 32735 observations in total, but only 204 different distances were traveled.
### Exercise 9
**Question:** Replicate the plot at the bottom of this page: https://openintrostat.github.io/oilabs-tidy/02_intro_to_data/intro_to_data.html
**Hint:** The data frame plotted only contains flights from American Airlines, Delta Airlines, and United Airlines, and the points are `color`ed by `carrier`. Once you replicate the plot, determine (roughly) what the cutoff point is for departure delays where you can still expect to get to your destination on time.
#### Solution
Recall that the desired plot only contains observations for flights flown by Delta Airlines, American Airlines, and United Airlines. This requires that we `filter` the data based on this condition.
Let's start by filtering the flights flown by Delta (`DL`), American (`AA`), and United (`UA`).
```{r}
# filtering flights from Delta, American, and United
```
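
Filtering for several values of one variable is often written with the `%in%` operator. The sketch below shows the idea on a made-up data frame. The names `toy_flights` and `toy_filtered` and the values are hypothetical; the carrier codes `DL`, `AA`, and `UA` are the ones named above.

```{r}
library(dplyr)

# Hypothetical toy data (values are made up for illustration)
toy_flights <- data.frame(
  carrier   = c("DL", "AA", "UA", "B6", "WN"),
  dep_delay = c(3, -1, 20, 7, 0)
)

# %in% keeps rows whose carrier matches any of the listed codes
toy_filtered <- toy_flights %>%
  filter(carrier %in% c("DL", "AA", "UA"))

toy_filtered
```

This is shorter than chaining three `==` comparisons with `|`, and it extends easily if more carriers are added.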
```{r}
# plotting the filtered data, colored by carrier
```
In addition to replicating the above plot, we were also asked to determine what the "cutoff" point was for a flight that departed late to still arrive on time.
```{r}
#determine what departure/arrival flights are considered "on time" vs. "delayed" based on the 5 min criteria from exercise 6
```
```{r}
#find the max flight departure delay that does not result in a delayed arrival
```
##### Fun Fact about the `color = ` Option
The `color = ` option gets applied to any items you add to the plot after the `aes`
function. To see this, observe what happens when we use the `geom_smooth` function to add trend lines for each carrier.
```{r}
# ggplot(nycflights_filtered) +
# aes(x = dep_delay, y = arr_delay, color = carrier) +
# geom_point() +
# geom_smooth(method = "lm", se = FALSE) +
# labs(title = "Departure Delay Time vs. Arrival Delay Time",
# subtitle = "Plotted by Carrier for Delta Airlines, American Airlines, and United Airlines") +
# xlab("Length of Departure Delay (in minutes)") +
# ylab("Length of Arrival Delay (in minutes)")
```
Notice that adding lines to the plot also added lines to the legend.