Refer to the North Valley Real Estate data that reports information on homes sold during the last year. For the variable price, select an appropriate class interval and organize the selling prices into a frequency distribution. Write a brief report summarizing your findings. Be sure to answer the following questions in your report.
- a. Around what values of price do the data tend to cluster?
- b. Based on the frequency distribution, what is the typical selling price in the first class? What is the typical selling price in the last class?
- c. Draw a cumulative relative frequency distribution. Using this distribution, fifty percent of the homes sold for what price or less? Estimate the lower price of the top ten percent of homes sold. About what percent of the homes sold for less than $300,000?
- d. Refer to the variable bedrooms. Draw a bar chart showing the number of homes sold with 2, 3, 4 or more bedrooms. Write a description of the distribution.
a.
Find an appropriate class interval.
Create a frequency distribution for the selling prices and explain the results.
Identify the values of price around which the data tend to cluster.
Answer to Problem 51DA
The frequency distribution for the selling price is given below:
Selling price (1,000’s) | Frequency | Cumulative frequency |
120-240 | 26 | 26 |
240-360 | 36 | 62 |
360-480 | 27 | 89 |
480-600 | 7 | 96 |
600-720 | 4 | 100 |
720-840 | 2 | 102 |
840-960 | 3 | 105 |
Total | 105 | |
Explanation of Solution
Selection of number of classes:
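The “2 to the k rule” suggests that the number of classes is the smallest value of k for which $2^k > n$, where n is the number of observations.
It is given that the data set consists of 105 observations. The value of k can be obtained as follows:
$$2^6 = 64 < 105$$
so six classes are too few. Here,
$$2^7 = 128 > 105$$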
Therefore, the number of classes for the given data set is 7.
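The rule can be checked with a few lines of Python. This is a minimal sketch; the function name `number_of_classes` is an illustrative assumption and not part of the textbook solution.

```python
def number_of_classes(n: int) -> int:
    """Return the smallest k such that 2**k > n (the "2 to the k" rule)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


k = number_of_classes(105)
print(k, 2 ** k)  # 7 128
```

The loop stops at the first k whose power of two exceeds the number of observations, which matches the reasoning above for 105 homes.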
From the data set North Valley Real Estate Data, the maximum and minimum values are 919,480 and 167,962, respectively.
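The formula for the class interval is given as follows:
$$i \geq \frac{\text{Maximum value} - \text{Minimum value}}{k}$$
where i is the class interval and k is the number of classes.
Therefore, the class interval for the given data can be obtained as follows:
$$i \geq \frac{919{,}480 - 167{,}962}{7} = \frac{751{,}518}{7} \approx 107{,}359.71$$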
In practice, the class interval size is usually rounded up to some convenient number. Therefore, the reasonable class interval is 120,000.
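The arithmetic behind the class interval can be reproduced as below. This is a sketch under the assumption that the first class starts at 120,000, as chosen later in the solution; the variable names are illustrative.

```python
maximum = 919_480   # largest selling price in the data set
minimum = 167_962   # smallest selling price in the data set
k = 7               # number of classes from the 2 to the k rule

# Smallest class interval that lets k classes span the whole range.
min_interval = (maximum - minimum) / k
print(round(min_interval, 2))  # 107359.71

# Convenient rounded-up width and starting point chosen in the solution.
class_interval = 120_000
first_lower_limit = 120_000
last_upper_limit = first_lower_limit + k * class_interval

# The k classes must contain both the minimum and the maximum.
covers_data = first_lower_limit <= minimum and maximum < last_upper_limit
print(last_upper_limit, covers_data)  # 960000 True
```

Any width of at least 107,359.71 would satisfy the formula; 120,000 is simply a convenient value that also gives round class limits.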
Frequency distribution:
The frequency table is a collection of mutually exclusive and exhaustive classes, which shows the number of observations in each class.
Since the minimum value is 167,962 and the class interval is 120,000, the first class would be 120,000-240,000 or 160,000-280,000. Here, the first one is preferred as the first class of the frequency distribution. The frequency distribution for the selling price can be constructed as follows:
Selling price (1,000’s) | Frequency | Cumulative frequency |
120-240 | 26 | 26 |
240-360 | 36 | 62 |
360-480 | 27 | 89 |
480-600 | 7 | 96 |
600-720 | 4 | 100 |
720-840 | 2 | 102 |
840-960 | 3 | 105 |
Total | 105 | |
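From the above table, 89 of the 105 homes (about 85%) sold for between $120,000 and $480,000, so the selling prices tend to cluster in that range. The most frequent class is $240,000 to $360,000, with 36 homes.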
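The cumulative column and the share of homes in the first three classes follow directly from the class frequencies. The sketch below assumes only the frequencies listed in the table; the raw selling prices are not reproduced here.

```python
from itertools import accumulate

classes = ["120-240", "240-360", "360-480", "480-600",
           "600-720", "720-840", "840-960"]      # in $1,000s
frequencies = [26, 36, 27, 7, 4, 2, 3]

# Running total of the class frequencies.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]
print(cumulative)  # [26, 62, 89, 96, 100, 102, 105]

# Share of homes that fall in the first three classes.
share_first_three = cumulative[2] / total
print(f"{share_first_three:.1%}")  # 84.8%

# Class with the highest frequency.
modal_class = classes[frequencies.index(max(frequencies))]
print(modal_class)  # 240-360
```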
b.
Find the typical selling price in the first class.
Find the typical selling price in the last class.
Answer to Problem 51DA
The typical selling price of the first class is $180,000.
The typical selling price of the last class is $900,000.
Explanation of Solution
The lower and upper limits of the first class are $120,000 and $240,000.
The typical selling price of the first class is its midpoint, calculated as follows:
$$\text{Midpoint} = \frac{120{,}000 + 240{,}000}{2} = 180{,}000$$
Thus, the typical selling price of the first class is $180,000.
The lower and upper limits of the last class are $840,000 and $960,000.
The typical selling price of the last class is its midpoint, calculated as follows:
$$\text{Midpoint} = \frac{840{,}000 + 960{,}000}{2} = 900{,}000$$
Thus, the typical selling price of the last class is $900,000.
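The same midpoint calculation can be applied to every class at once. This is a small sketch that assumes the class limits used in the frequency distribution above, expressed in $1,000s.

```python
lower_limits = [120, 240, 360, 480, 600, 720, 840]  # in $1,000s
class_interval = 120                                # in $1,000s

# The midpoint of a class is halfway between its lower and upper limits.
midpoints = [(lower + (lower + class_interval)) / 2 for lower in lower_limits]
print(midpoints[0], midpoints[-1])  # 180.0 900.0
```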
c.
Create a cumulative relative frequency polygon for the frequency distribution.
Identify the price at or below which 50% of the homes sold.
Estimate the lower price of the top 10% of homes sold.
Estimate the percentage of homes that sold for less than $300,000.
Answer to Problem 51DA
The cumulative relative frequency polygon for the given data is as follows:
About 50% of the homes sold for approximately $254,000 or less.
The top 10% of homes sold for at least about $520,000.
About 59% of the homes sold for less than $300,000.
Explanation of Solution
For the given data set, the cumulative relative frequency table with midpoints of classes is obtained as follows:
Selling price (1,000’s) | Midpoint | Cumulative frequency | Relative cumulative frequency |
120-240 | 180 | 26 | 0.2476 |
240-360 | 300 | 62 | 0.5905 |
360-480 | 420 | 89 | 0.8476 |
480-600 | 540 | 96 | 0.9143 |
600-720 | 660 | 100 | 0.9524 |
720-840 | 780 | 102 | 0.9714 |
840-960 | 900 | 105 | 1.0000 |
The cumulative relative frequency polygon for the given data can be drawn using EXCEL.
Step-by-step procedure to obtain the frequency polygon using EXCEL is as follows:
- Enter the column of midpoints along with the cumulative relative frequency column.
- Select the total data range with labels.
- Go to Insert > Charts > line chart.
- Select the appropriate line chart.
- Click OK.
From the above cumulative relative frequency polygon, about 50% of the homes sold for approximately $254,000 or less.
The top 10% of homes sold for at least about $520,000.
About 59% of the homes sold for less than $300,000.
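Readings from the polygon can be cross-checked by linear interpolation between the plotted points. The sketch below assumes, as in the table above, that cumulative relative frequencies are plotted against class midpoints; the helper names `price_at` and `fraction_below` are illustrative. Values read from a chart by eye are approximate, so the interpolated figures need not match them exactly.

```python
from itertools import accumulate

midpoints = [180, 300, 420, 540, 660, 780, 900]   # class midpoints, in $1,000s
frequencies = [26, 36, 27, 7, 4, 2, 3]

total = sum(frequencies)
cum_rel = [cf / total for cf in accumulate(frequencies)]
points = list(zip(midpoints, cum_rel))


def price_at(fraction: float) -> float:
    """Price (in $1,000s) at which the polygon reaches `fraction`."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= fraction <= y1:
            return x0 + (fraction - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("fraction lies outside the plotted range")


def fraction_below(price: float) -> float:
    """Cumulative relative frequency of the polygon at `price` (in $1,000s)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= price <= x1:
            return y0 + (price - x0) / (x1 - x0) * (y1 - y0)
    raise ValueError("price lies outside the plotted range")


print(round(price_at(0.50), 1))      # 268.3
print(round(price_at(0.90), 1))      # 514.3
print(f"{fraction_below(300):.0%}")  # 59%
```

Under this plotting convention the 59% figure at $300,000 is reproduced exactly, while the interpolated 50% and 90% points land near, but not on, the values read from the chart.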
d.
Create a bar chart showing the number of homes sold with 2, 3, and 4 or more bedrooms, and describe the distribution.
Answer to Problem 51DA
The bar chart for the number of bedrooms for the variable bedrooms is as follows:
Explanation of Solution
For the variable bedrooms, the frequency table is obtained as follows:
Number of bedrooms | Frequency |
2 | 24 |
3 | 26 |
4 or more | 55 |
Total | 105 |
The bar chart for the given data can be drawn using EXCEL.
Step-by-step procedure to obtain the bar chart using EXCEL is as follows:
- Enter the column of bedrooms along with the frequency column.
- Select the total data range with labels.
- Go to Insert > Charts > bar chart.
- Select the appropriate bar chart.
- Click OK.
From the above bar chart, the highest frequency occurs in the last category (4 or more bedrooms), which accounts for 55 of the 105 homes. The shape of the distribution is therefore negatively skewed.
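For readers without Excel, the frequency table can be summarized and drawn as a rough text bar chart in Python. This is a sketch using only the counts in the table; the scaling of one `#` per two homes is an arbitrary choice for display.

```python
bedrooms = {"2": 24, "3": 26, "4 or more": 55}   # homes sold per category
total = sum(bedrooms.values())

for category, count in bedrooms.items():
    share = count / total
    bar = "#" * (count // 2)   # one '#' per two homes so bars fit on a line
    print(f"{category:>9} | {bar} {count} ({share:.1%})")

# Category with the highest frequency.
modal_category = max(bedrooms, key=bedrooms.get)
print(modal_category)  # 4 or more
```

The printed shares are 22.9%, 24.8%, and 52.4%, which makes the concentration in the last category easy to see.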
Chapter 2 Solutions
Loose Leaf for Statistical Techniques in Business and Economics