School: Virginia Commonwealth University
Course: STAT-543
Subject: Statistics
Date: Feb 20, 2024

STAT 543 Homework #1

1. The newspaper article “Spray Away Flu” (Omaha World-Herald, June 8, 1998) reported on a study of the effectiveness of a new flu vaccine that is administered by nasal spray rather than by injection. The article states that the “researchers gave the spray to 1070 healthy children, 15 months to 6 years old, before the flu season two winters ago. One percent developed confirmed influenza, compared with 18% of the 532 children who received a placebo. And only one vaccinated child developed an ear infection after coming down with influenza. . . . Typically 30% to 40% of children with influenza later develop an ear infection.” The researchers concluded that the nasal flu vaccine was effective in reducing the incidence of flu and also in reducing the number of children with flu who subsequently develop ear infections.
a. What were the researchers trying to learn? What questions motivated their research?
b. Do you think that the study was conducted in a reasonable way? What additional information would you want in order to evaluate this study?

2. A recent article in Online Trends described a study in which researchers looked at a random sample of 500 publicly accessible Facebook pages of 18-year-olds. The content of each page was analyzed. One of the conclusions reported was that displaying sport or hobby involvement was associated with decreased references to risky behavior (sexual references or references to substance abuse or violence).
a. Is the study described an observational study or an experiment?
b. Is it reasonable to generalize the stated conclusion to all 18-year-olds with publicly accessible Facebook pages? What aspect of the study supports your answer?
c. Not all Facebook users have publicly accessible pages. Is it reasonable to generalize the stated conclusion to all 18-year-old Facebook users? Explain.
d. Is it reasonable to generalize the stated conclusion to all Facebook users with publicly accessible pages? Explain.

3. The head of the quality control department at a printing company would like to carry out an experiment to determine which of three different glues results in the greatest binding strength. One factor thought to affect binding strength is whether the book is being bound as a paperback or a hardback.
a. What type of experimental design do you suggest should be used here? Be sure to explain why you gave this answer.
b. What would be the treatments in your experiment?
c. What would be the response variable in your experiment (i.e., what measurement is used for group comparison)?

4. For each of the situations described, state whether the sampling procedure is simple random sampling, stratified random sampling, cluster sampling, or convenience sampling.
a. All first-year students at a university are enrolled in 1 of 30 sections of a seminar course. To select a sample of freshmen at this university, a researcher selects four sections of the seminar course at random from the 30 sections and all students in the four selected sections are included in the sample.
b. To obtain a sample of students, faculty, and staff at a university, a researcher randomly selects 50 faculty members from a list of faculty, 100 students from a list of students, and 30 staff members from a list of staff.
c. A university researcher obtains a sample of students at his university by using the 85 students enrolled in his Psychology 101 class.
d. To obtain a sample of the seniors at a particular high school, a researcher writes the name of each senior on a slip of paper, places the slips in a box and mixes them, and then selects 10 slips. The students whose names are on the selected slips of paper are included in the sample.

5. In a survey of 100 people who had recently purchased motorcycles, data on the following variables were recorded:
Gender of purchaser
Brand of motorcycle purchased
Number of previous motorcycles owned by purchaser
Telephone area code of purchaser
Weight of motorcycle as equipped at purchase
a. Which of these variables are categorical?
b. Which of these variables are discrete numerical?
c. Which type of graphical display would be an appropriate choice for summarizing the gender data, a bar chart or a histogram?
d. Which type of graphical display would be an appropriate choice for summarizing the weight data, a bar chart or a histogram?

6. “Crime Finds the Never Married” is the conclusion drawn in an article from USA Today (June 29, 2001). This conclusion is based on data from the Justice Department’s National Crime Victimization Survey, which estimated the number of violent crimes per 1000 people, 12 years of age or older, to be 51 for the never married, 42 for the divorced or separated, 13 for married individuals, and 8 for the widowed. Does being single cause an increased risk of violent crime? Describe a potential confounding variable that illustrates why it is unreasonable to conclude that a change in marital status causes a change in crime risk.

The following exercises require the use of statistical software. VCU provides free student software licenses for several different packages; you can find more information and helpful links on our Canvas webpage. Please state what software package you are using for your analysis and include only the relevant output needed to address each part of the following exercises. Also, please remember to put appropriate labels on your graphs/charts (e.g., axis labels).

7. The article “Determination of Most Representative Subdivision” (Journal of Energy Engineering [1993]: 43–55) gave data on various characteristics of subdivisions that could be used in deciding whether to provide electrical power using overhead lines or underground lines. Data on the variable x = total length of streets within a subdivision (measured in feet) are as follows:
2640 1250 2840 2400 1530 1240 1850 1896 1050 1450
3150 1300 2460 1120 3150 2120 1320 2100 7850 3350
1419  960 5020 4770 2250 2700 2320 2109 1280 3600
1670  840 1000 3870 1860 3380  960 2730 3330 2400
1890 4390 3060 4220 4270 4700  810
You can also download a CSV file containing this data from Canvas.
a. Construct a histogram of the data using boundaries of 0 to <1000, 1000 to <2000, and so on. Describe the resulting histogram (shape, features, etc.).
b. Construct another histogram of the data, but this time use boundaries of 0 to <500, 500 to <1000, and so on. Describe the resulting histogram (shape, features, etc.).
c. Which of the two histograms do you prefer? Explain why.
d. Construct a box-and-whisker plot of the data. Describe the resulting plot.
e. Compare the box-and-whisker plot to the histograms. Discuss similarities as well as any features that are only evident in one type of plot and not in the other.
f. Compute the sample mean, sample standard deviation, and five-number summary.
g. Based upon your graphs above, what measures of center and spread should you use?
h. Describe the data using appropriate measures of center and spread and describe any unusual features. What can you conclude about the subdivisions in this study?

8. The U.S. Department of Education publishes student loan default rates for colleges that participate in various student financial assistance programs, such as Pell grants and Perkins loans. On Canvas, you will find two files containing default rate information for borrowers who entered the repayment period in fiscal year 2018: one from a subset of public colleges, and one from private colleges.
a. Construct side-by-side boxplots of the numbers of students who defaulted (“NumberDefaulted”) for public vs. private schools. Describe any outliers or unusual features you see in the data. (You could also use histograms, but boxplots are often easier to compare.)
b. Now, do the same comparison using the default rates (“DefaultRate”).
c. If you were interested in comparing the loan repayment progress (or lack thereof) for public and private school borrowers, would you prefer the graph(s) from part (a) or from part (b)? Why?
d. Based on this data, could you draw any conclusions about the quality of public college educational programs, the efficacy of student loan counseling programs, etc.? Briefly explain your thoughts, mentioning any other likely factors that you think may be playing a role.
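
For exercise 7, the bin counts behind the two histograms and the numerical summaries in part (f) can be computed in any package. The following is a minimal sketch in Python using only the standard library, with the data values as listed in exercise 7. The helper names `bin_counts` and `five_number_summary` are illustrative and not part of the assignment. The quartiles here use the "inclusive" interpolation method, which is an assumption and may differ slightly from the convention in the course text or in other software.

```python
import statistics

# Street lengths (feet), copied from the data listed in exercise 7.
street_lengths = [
    2640, 1250, 2840, 2400, 1530, 1240, 1850, 1896, 1050, 1450,
    3150, 1300, 2460, 1120, 3150, 2120, 1320, 2100, 7850, 3350,
    1419, 960, 5020, 4770, 2250, 2700, 2320, 2109, 1280, 3600,
    1670, 840, 1000, 3870, 1860, 3380, 960, 2730, 3330, 2400,
    1890, 4390, 3060, 4220, 4270, 4700, 810,
]


def bin_counts(values, width):
    """Count values in left-closed bins [0, width), [width, 2*width), ..."""
    n_bins = int(max(values) // width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int(v // width)] += 1
    return counts


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)


counts_1000 = bin_counts(street_lengths, 1000)  # boundaries for part (a)
counts_500 = bin_counts(street_lengths, 500)    # boundaries for part (b)
mean = statistics.mean(street_lengths)          # sample mean
sd = statistics.stdev(street_lengths)           # sample SD (n - 1 divisor)
summary = five_number_summary(street_lengths)

for i, c in enumerate(counts_1000):
    print(f"{i * 1000} to <{(i + 1) * 1000}: {c}")
print("mean:", round(mean, 1), "sd:", round(sd, 1))
print("five-number summary:", summary)
```

The counts give the bar heights for the histograms in parts (a) and (b). The plots themselves, with axis labels, still need to be drawn in whichever package you report using.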