Homework 1-Anna Harvey

School

University of Massachusetts, Dartmouth

Course

500

Subject

Statistics

Date

Jul 2, 2024

Type

docx

Pages

9

Uploaded by MasterNarwhalPerson715

Homework 1
POM 500 Statistical Analysis

Note: Attempt all questions as per the rubric. Each problem, including the case study, carries a weightage of 10 marks. The maximum you can score is 50. Use Excel functions wherever possible.

Problem-1
Using ‘VehicleFailureData’ available on the course website, answer the following:

a) Classify each variable as nominal, ordinal, interval or ratio.
   a. Vehicle Number: Nominal, because the number is only a label that identifies the vehicle. Although the numbers are listed in ascending order, that order does not rank the vehicles.
   b. Failure Month: Ratio, because it is measured in numerical form from 0 up to a particular month, so zero is a true starting point.
   c. Mileage at Failure: Ratio, because it is measured in numerical form from 0 up to a particular number of miles, and 0 means no miles at all.
   d. Labor Hours: Ratio, because it is measured in numerical form from 0 up to a particular number of hours, and 0 means no labor at all.
   e. Labor Cost: Ratio, because it is measured in numerical form from 0 up to a particular dollar amount, and 0 means no cost at all.
   f. Material Cost: Ratio, because it is measured in numerical form from 0 up to a particular dollar amount, and 0 means no cost at all.
   g. State: Nominal, because the abbreviations are labels that identify an attribute of the vehicle (in this case, the state).

b) Identify qualitative and quantitative variables.
   a. Qualitative: State, Vehicle Number
   b. Quantitative: Material Cost, Labor Cost, Mileage at Failure, Labor Hours, Failure Month

c) How many vehicles failed in month-9? (Write Excel function)
   a. 71 vehicles failed in month 9.
   b. Function: =COUNTIF($B$2:$B$1625,B37), where cell B37 holds the value 9. Equivalently: =COUNTIF($B$2:$B$1625,9)

d) What was the maximum labor cost? (Write Excel function)
   a. $3,234.41 is the maximum labor cost.
   b. Function: =MAX($E$2:$E$1625)

e) What was the total failure cost for the data available? (Write Excel function)
   a. $685,836.05 is the total failure cost (labor cost plus material cost).
   b. Function: =SUM($E$2:$E$1625,$F$2:$F$1625)
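For anyone checking the answers to parts c) to e) outside Excel, the three functions can be mirrored in Python. This is only a sketch: the rows below are made-up illustrative values, not the actual VehicleFailureData, and the field names (failure_month, labor_cost, material_cost) are assumptions.

```python
# Hypothetical sample rows (NOT the course data); field names are assumed.
rows = [
    {"failure_month": 9, "labor_cost": 120.50, "material_cost": 80.00},
    {"failure_month": 3, "labor_cost": 300.00, "material_cost": 45.25},
    {"failure_month": 9, "labor_cost": 75.25, "material_cost": 10.00},
]


def count_failures_in_month(rows, month):
    """Counterpart of =COUNTIF(range, month): count rows with that failure month."""
    return sum(1 for r in rows if r["failure_month"] == month)


def max_labor_cost(rows):
    """Counterpart of =MAX(range) over the labor cost column."""
    return max(r["labor_cost"] for r in rows)


def total_failure_cost(rows):
    """Counterpart of =SUM(labor range, material range): labor plus material."""
    return sum(r["labor_cost"] + r["material_cost"] for r in rows)


print(count_failures_in_month(rows, 9))
print(max_labor_cost(rows))
print(total_failure_cost(rows))
```

Passing the month as an argument plays the same role as the criterion in COUNTIF, so the same function answers the question for any month.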
Problem-2
A seven-year medical research study reported that women whose mothers took the drug DES during pregnancy were twice as likely to develop tissue abnormalities that might lead to cancer as were women whose mothers did not take the drug.

a) This study involved the comparison of two populations. What were the populations?
   a. The population of women whose mothers took the drug DES during pregnancy.
   b. The population of women whose mothers did not take the drug DES during pregnancy.

b) Do you suppose the data were obtained in a survey or an experiment?
   a. The data were most likely obtained in a survey, given the nature of the study: the seven-year time frame, the suspected side effects, etc. An experiment, in which the one variable of interest (taking DES) is controlled, would make it easier to track how the drug influences the rate of tissue abnormalities, but deliberately assigning a drug with suspected harmful side effects to pregnant women would not be appropriate.

c) For the population of women whose mothers took the drug DES during pregnancy, a sample of 3980 women showed 63 developed tissue abnormalities that might lead to cancer. Provide a descriptive statistic that could be used to estimate the number of women out of 1000 in this population who have tissue abnormalities.
   a. Sample size = 3,980
      Developed abnormality = 63
      63 / 3,980 = 0.0158…
      0.0158 × 1,000 = 15.8
      This means that about 15.8 out of every 1,000 women whose mothers took the DES drug during pregnancy developed tissue abnormalities.

d) For the population of women whose mothers did not take the drug DES during pregnancy, what is the estimate of the number of women out of 1000 who would be expected to have tissue abnormalities?
   a. The estimate is 15.8 out of 1,000 for women whose mothers took the drug, and they are twice as likely to develop abnormalities as women whose mothers did not.
      15.8 / 2 = 7.9
      So, about 7.9 out of every 1,000 women whose mothers did not take the DES drug during pregnancy would be expected to develop tissue abnormalities.

e) Medical studies often use a relatively large sample (in this case, 3980). Why?
   a. The outcome is rare (roughly 16 cases per 1,000 women), so a large sample is needed to observe enough cases to show a relationship between the drug and the suspected outcome. This is also why such studies typically run for several years: the effects take time to develop.
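The arithmetic in parts c) and d) can be double-checked with a few lines of Python. This is a minimal sketch using only the figures given in the problem (63 cases in a sample of 3,980, and the "twice as likely" ratio); the function name is just an illustrative choice.

```python
def rate_per_thousand(cases, sample_size):
    """Sample proportion scaled to a rate per 1,000 people."""
    return cases / sample_size * 1000


# Women whose mothers took DES: 63 cases in a sample of 3,980.
exposed_rate = rate_per_thousand(63, 3980)

# The study reports the exposed group as twice as likely, so halve the rate.
unexposed_rate = exposed_rate / 2

print(round(exposed_rate, 1), round(unexposed_rate, 1))  # prints: 15.8 7.9
```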
Problem-3
ACNielsen conducts weekly surveys of television viewing throughout the United States. The ACNielsen statistical rating indicates the size of the viewing audience for each major network television program. Rankings of the television program and of the viewing audience market share for each network are published each week.

a) What is ACNielsen attempting to measure?
   a. ACNielsen is attempting to measure the size of the viewing audience (the television rating) for each major network television program, along with each network's share of the viewing audience. This helps show which programs viewers in certain demographics watch most.

b) What is the population?
   a. The population is all viewers of network television programs in the US.

c) Why would a sample be used in this situation?
   a. A sample would be used because it is unrealistic to collect data from every TV viewer throughout the United States. It would be extremely expensive and take too much time, so sampling allows ACNielsen to collect the data and carry out the survey more efficiently.

d) What kinds of decisions or actions are based on the ACNielsen studies?
   a. These studies let the networks see how many people in each demographic watch a given program, so they can tailor their strategies more effectively. This could influence the timing of programs, the commercials shown, the days a program runs, etc. The studies can also show which programs are not worth continuing because they are not building a big enough viewership.
Problem-4
Using ‘VehicleFailureData’, summarize the data for failures in top 10 (maximum number of vehicle failures) states by constructing the following:

a) Relative and percent frequency distributions

   STATE   FREQUENCY   RELATIVE   PERCENT
   TX      293         0.2907     29.07
   CA      200         0.1984     19.84
   FL      168         0.1667     16.67
   GA      75          0.0744     7.44
   AZ      61          0.0605     6.05
   LA      49          0.0486     4.86
   NC      48          0.0476     4.76
   PA      43          0.0427     4.27
   MI      36          0.0357     3.57
   CO      35          0.0347     3.47
   TOTAL   1008

b) Bar Chart
   [Bar chart: Number of Vehicle Failures (frequency) by State, for TX, CA, FL, GA, AZ, LA, NC, PA, MI and CO, with the frequencies listed in the table above.]

c) Pie Chart
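The slices of the pie chart in part c) correspond to the percent frequencies from part a). As a rough cross-check outside Excel, the relative and percent columns can be recomputed from the state frequencies listed above. This is a sketch, not the method used in the homework; the variable names are illustrative.

```python
# Frequencies for the top 10 states, taken from the table in part a).
frequencies = {
    "TX": 293, "CA": 200, "FL": 168, "GA": 75, "AZ": 61,
    "LA": 49, "NC": 48, "PA": 43, "MI": 36, "CO": 35,
}

total = sum(frequencies.values())

# Relative frequency = count / total; percent frequency = relative * 100.
relative = {state: count / total for state, count in frequencies.items()}
percent = {state: 100 * share for state, share in relative.items()}

print(f"{'STATE':<6}{'FREQUENCY':>10}{'RELATIVE':>10}{'PERCENT':>9}")
for state, count in frequencies.items():
    print(f"{state:<6}{count:>10}{relative[state]:>10.4f}{percent[state]:>9.2f}")
print(f"{'TOTAL':<6}{total:>10}")
```

Dividing by the total of the ten listed states (rather than all failures in the data set) matches how the table in part a) was built, so the relative frequencies sum to 1.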