
a.
Construct box plot of the variable price.
Identify whether there are outliers or not.
Find the median price.
Find the first quartile value.
Find the third quartile value.
a.

Answer to Problem 37CE
Output of box plot for the variable price using MINITAB software is,
Yes, there are 3 outliers in the dataset.
The median price is 3,733.
The first quartile value is 1,478.
The third quartile value is 6,141.
Explanation of Solution
Calculation:
The step-by-step procedure to obtain the box plot using MINITAB software is:
- Choose Graph > Boxplot.
- In Graph variables, enter the column Price.
- Click OK.
Outliers:
In the box plot, each outlier is marked with an asterisk. The box plot of this data set shows 3 asterisks. Hence, there are three outliers in the dataset.
Median:
The median is the middle value of the ordered data set. In the box plot, the line inside the box represents the median. This line corresponds to the value 3,733.
Hence, the median value is 3,733.
First quartile:
The left edge of the box represents the first quartile. In this box plot, the left edge of the box corresponds to a value of approximately 1,478.
Hence, the first quartile value is 1,478.
Third quartile:
The right edge of the box represents the third quartile. In this box plot, the right edge of the box corresponds to a value of approximately 6,141.
Hence, the third quartile value is 6,141.
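The values read off the box plot can also be checked numerically. The following is a minimal Python sketch, not MINITAB output. The `prices` list is made-up placeholder data (the diamond data set is not reproduced here), `box_plot_summary` is a hypothetical helper name, and the 1.5 × IQR fence rule is the usual box plot convention, assumed here to be what produces the asterisks.

```python
from statistics import quantiles

def box_plot_summary(values):
    """Return the quartiles and the outliers flagged by the 1.5 * IQR rule."""
    # method="exclusive" places quartile i at position i * (n + 1) / 4
    # in the sorted data (an assumed convention; other software may differ).
    q1, median, q3 = quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

# Placeholder values for illustration only; NOT the textbook's price data.
prices = [1, 2, 3, 4, 5, 6, 7, 100]
summary = box_plot_summary(prices)
print(summary)
```

Applied to the actual Price column, the same function should return values close to those read from the plot, although quartile conventions differ slightly between software packages.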
b.
Construct box plot of the variable size.
Identify whether there are outliers or not.
Find the median size.
Find the first quartile value.
Find the third quartile value.
b.

Answer to Problem 37CE
Output of box plot for the variable size using MINITAB software is,
Yes, there are 3 outliers in the dataset.
The median size is 0.84.
The first quartile value is 0.515.
The third quartile value is 1.12.
Explanation of Solution
Calculation:
The step-by-step procedure to obtain the box plot using MINITAB software is:
- Choose Graph > Boxplot.
- In Graph variables, enter the column Size.
- Click OK.
Outliers:
In the box plot, each outlier is marked with an asterisk. The box plot of this data set shows 3 asterisks. Hence, there are three outliers in the dataset.
Median:
The median is the middle value of the ordered data set. In the box plot, the line inside the box represents the median. This line corresponds to the value 0.84.
Hence, the median value is 0.84.
First quartile:
The left edge of the box represents the first quartile. In this box plot, the left edge of the box corresponds to a value of approximately 0.515.
Hence, the first quartile value is 0.515.
Third quartile:
The right edge of the box represents the third quartile. In this box plot, the right edge of the box corresponds to a value of approximately 1.12.
Hence, the third quartile value is 1.12.
c.
Construct a scatter diagram of the variables price and size.
Identify whether there is association between the two variables or not.
Identify whether association is direct or indirect.
Identify whether any point seems to be different from the others.
c.

Answer to Problem 37CE
Output of scatter diagram for variables price and size using MINITAB software is,
Yes, there is association between the variables price and size.
The association is direct.
Yes, the first observation of both the price and size is large when compared to other observations.
Explanation of Solution
Calculation:
The step-by-step procedure to obtain the scatter diagram using MINITAB software is:
- Choose Graph > Scatterplot > select Simple.
- In Y variable enter the column Price.
- In X variable enter the column Size.
- Click OK.
In the scatter diagram, Price increases as Size increases, which indicates an association between the two variables.
Hence, there is an association between the variables price and size.
An association is direct when the value of one variable tends to increase as the value of the other increases. In the scatter diagram, Price increases as Size increases, indicating a direct (positive) association.
Hence, the association is direct.
From the scatter diagram, it can be observed that one of the observations corresponding to the value of 5.03 carats for size and $44,312 for price is far from all the other observations. Hence, one point seems to be different from the others.
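The direction of an association can also be checked numerically with the sample correlation coefficient, a technique the solution itself does not use (it relies on the scatter diagram alone). The sketch below assumes made-up placeholder pairs rather than the diamond data, and `pearson_r` and `direction` are hypothetical helper names.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def direction(r):
    """Label the association by the sign of r."""
    if r > 0:
        return "direct"
    if r < 0:
        return "indirect"
    return "none"

# Placeholder (size, price) values for illustration only; NOT the textbook data.
sizes = [1.0, 2.0, 3.0]
prices = [2.0, 4.0, 6.0]
r = pearson_r(sizes, prices)
print(round(r, 3), direction(r))
```

A positive coefficient corresponds to the direct association seen in the scatter diagram. A single distant point, such as the 5.03-carat observation, can noticeably change the coefficient, so it is worth computing the value with and without that point.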
d.
Construct a contingency table for the variables shape and cut grade.
Find the most common cut grade.
Find the most common shape.
Find the most common combination of cut grade and shape.
d.

Answer to Problem 37CE
The contingency table for the variables shape and cut grade is,
Shape \ Cut Grade | Average | Good | Ideal | Premium | Ultra Ideal | Total |
Emerald | 0 | 0 | 1 | 0 | 0 | 1 |
Marquise | 0 | 2 | 0 | 1 | 0 | 3 |
Oval | 0 | 0 | 0 | 1 | 0 | 1 |
Princess | 1 | 0 | 2 | 2 | 0 | 5 |
Round | 1 | 3 | 3 | 13 | 3 | 23 |
Total | 2 | 5 | 6 | 17 | 3 | 33 |
The most common cut grade is premium.
The most common shape is round.
The most common combination of cut grade and shape is premium and round.
Explanation of Solution
Calculation:
Contingency table:
A contingency table classifies observations according to two identifiable characteristics. It is used to summarize two variables at once.
The variable cut grade has 5 categories: ‘average, good, ideal, premium, ultra ideal’. The variable shape has 5 categories: ‘emerald, marquise, oval, princess, and round’.
Count the number of diamonds with an average cut grade and an emerald shape. In the data, no diamond has this combination. Hence, the frequency is 0.
Similarly, count the frequency for each possible combination of cut grade and shape. Then calculate the totals for each row and column. The resulting contingency table is,
Shape \ Cut Grade | Average | Good | Ideal | Premium | Ultra Ideal | Total |
Emerald | 0 | 0 | 1 | 0 | 0 | 1 |
Marquise | 0 | 2 | 0 | 1 | 0 | 3 |
Oval | 0 | 0 | 0 | 1 | 0 | 1 |
Princess | 1 | 0 | 2 | 2 | 0 | 5 |
Round | 1 | 3 | 3 | 13 | 3 | 23 |
Total | 2 | 5 | 6 | 17 | 3 | 33 |
The cut grade ‘Premium’ has a total of 17, which is the largest total among the cut grades. This shows that the most common cut grade of diamonds is ‘Premium’.
Hence, the most common cut grade is premium.
The shape ‘Round’ has a total of 23, which is the largest total among the shapes. This shows that the most common shape of diamonds is ‘Round’.
Hence, the most common shape is round.
The combination of cut grade ‘Premium’ and shape ‘Round’ has a frequency of 13, which is the largest among all combinations. This shows that the most common combination is cut grade ‘Premium’ with shape ‘Round’.
Hence, the most common combination of cut grade and shape is premium and round.
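The counting procedure described above can be sketched in Python. Because the raw diamond records are not reproduced here, the sketch rebuilds one (shape, cut grade) record per diamond from the cell counts of the contingency table and then tallies them again. The names `TABLE`, `records`, and the three counters are illustrative choices, not part of the original solution.

```python
from collections import Counter

CUT_GRADES = ["Average", "Good", "Ideal", "Premium", "Ultra Ideal"]

# Cell counts copied from the contingency table (rows: shape, columns: cut grade).
TABLE = {
    "Emerald":  [0, 0, 1, 0, 0],
    "Marquise": [0, 2, 0, 1, 0],
    "Oval":     [0, 0, 0, 1, 0],
    "Princess": [1, 0, 2, 2, 0],
    "Round":    [1, 3, 3, 13, 3],
}

# Expand the counts into one (shape, cut) record per diamond.
records = [
    (shape, cut)
    for shape, counts in TABLE.items()
    for cut, count in zip(CUT_GRADES, counts)
    for _ in range(count)
]

# Tally each combination, then the row (shape) and column (cut grade) totals.
cell_counts = Counter(records)
shape_totals = Counter(shape for shape, _ in records)
cut_totals = Counter(cut for _, cut in records)

print(cut_totals.most_common(1))    # most common cut grade
print(shape_totals.most_common(1))  # most common shape
print(cell_counts.most_common(1))   # most common combination
```

With the original data file, `records` would instead be read directly from the Shape and Cut Grade columns, and the tallying steps would stay the same.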
Chapter 4 Solutions
EBK STATISTICAL TECHNIQUES IN BUSINESS


