STAT 116 - 2.1 Categorical Variables 2 - Fall 2023

School

University of Kentucky

Course

296

Subject

Statistics

Date

Apr 3, 2024

STAT 116
Chapter 2 Describing Data / 2.1 Categorical variables

One categorical variable

Frequency tables, pie charts, and bar charts can all be used to display data concerning one categorical variable.

1. A frequency table contains the counts of how often each value occurs in the dataset. It may also include the percent of the dataset that falls into each category (the relative frequency).

   Do you believe in UFOs?

   Response                                             Frequency   Relative frequency
   Some have been alien spacecraft
   All explained by human activity/natural phenomena
   No opinion
   TOTAL

   Source: https://news.gallup.com/poll/350096/americans-believe-ufos.aspx

   What proportion of the respondents believe in UFOs?

   Proportion in a category = Number of cases in that category / Total number of cases

   Notation for a proportion:
   Sample proportion: p̂
   Population proportion: p

2. A pie chart displays data concerning one categorical variable by partitioning a circle into "slices" that represent the proportion (percentage) in each category.
3. A bar chart can be used to display data concerning one categorical variable. The bars, which may be vertical or horizontal, represent the number of cases in each category and are separated by spaces.

Two categorical variables

Data concerning two categorical variables can be displayed in a two-way table, a side-by-side bar chart, or a stacked (segmented) bar chart.

Two-way table

One variable is represented in the rows and a second variable is represented in the columns.

Ex: In the StudentSurvey dataset, 362 students were asked which award they would prefer to win: an Academy Award, a Nobel Prize, or an Olympic gold medal. 20 of the 31 students who prefer an Academy Award are female, 76 of the 149 students who prefer a Nobel Prize are female, and 73 of the 182 who prefer an Olympic gold medal are female.

a) Create a two-way table for these variables.
b) Which award is the most popular? What proportion of all students selected this award?
c) Determine the difference between the proportion of males and the proportion of females who prefer an Olympic gold medal.

[Blank two-way table with a Total column and a Total row]
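The award example can be worked through in a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the male counts are not stated in the handout, so they are derived here under the assumption that every student is recorded as either female or male. It builds the two-way table, then applies "proportion = number of cases in the category / total number of cases" to parts b) and c).

```python
# Counts stated in the example: column totals and female counts per award.
totals = {"Academy Award": 31, "Nobel Prize": 149, "Olympic gold medal": 182}
females = {"Academy Award": 20, "Nobel Prize": 76, "Olympic gold medal": 73}

# Assumption: every student is recorded as female or male, so male = total - female.
males = {award: totals[award] - females[award] for award in totals}

n_students = sum(totals.values())
n_female = sum(females.values())
n_male = sum(males.values())

# Print the two-way table with row and column totals.
print(f"{'':<8}" + "".join(f"{award:>20}" for award in totals) + f"{'Total':>8}")
rows = (("Female", females, n_female), ("Male", males, n_male), ("Total", totals, n_students))
for label, row, n in rows:
    print(f"{label:<8}" + "".join(f"{row[award]:>20}" for award in totals) + f"{n:>8}")

# Part b): the most popular award and its share of all students.
most_popular = max(totals, key=totals.get)
p_most_popular = totals[most_popular] / n_students

# Part c): each proportion is taken within its own gender group, not of all students.
olympic = "Olympic gold medal"
p_male_olympic = males[olympic] / n_male
p_female_olympic = females[olympic] / n_female
difference = p_male_olympic - p_female_olympic

print(f"Most popular: {most_popular} ({p_most_popular:.3f} of all students)")
print(f"Male - female difference for {olympic}: {difference:.3f}")
```

Note that the denominators in part c) are the gender totals (the row totals in this layout), while the denominator in part b) is the grand total.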
Stacked (segmented) bar chart

One categorical variable is represented on the x-axis and the second categorical variable is displayed as different parts (i.e., segments) of each bar.

Side-by-side bar chart

Each bar represents one combination of the two categorical variables. If you compare this to the two-way table above, each bar represents the value in one cell.

Extra Practice

2.29 Is There a Genetic Marker for Dyslexia?

A disruption of a gene called DYXC1 on chromosome 15 for humans may be related to an increased risk of developing dyslexia. Researchers studied the gene in 109 people diagnosed with dyslexia and in a control group of 195 others who had no learning disorder. The DYXC1 break occurred in 10 of those with dyslexia and in 5 of those in the control group.

a) Is this an experiment or an observational study? What are the variables?
b) How many rows and how many columns will the data table have?
c) Display the results of the study in a two-way table.
d) Compare the proportion of each group who have the break on the DYXC1 gene. Does there appear to be an association between this genetic marker and dyslexia for the people in this sample? Can we assume that the gene disruption causes dyslexia? Why or why not?
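For parts c) and d) of the dyslexia exercise, the two-way table and the within-group proportions can be computed as in the sketch below. The dictionary names are hypothetical, and the "no break" counts are derived by subtraction rather than given in the exercise. The code only does the arithmetic. Whether the difference indicates an association, and whether it supports a causal claim, is left to the questions above.

```python
# Counts from the exercise: group sizes and number with the DYXC1 break.
group_sizes = {"Dyslexia": 109, "Control": 195}
break_counts = {"Dyslexia": 10, "Control": 5}

# Derived cell: people in each group without the gene break.
no_break = {group: group_sizes[group] - break_counts[group] for group in group_sizes}

# Proportion with the break, computed within each group.
p_break = {group: break_counts[group] / group_sizes[group] for group in group_sizes}

# Two-way table with one row per group, plus the within-group proportion.
print(f"{'Group':<10}{'Break':>7}{'No break':>10}{'Total':>7}{'Prop.':>8}")
for group in group_sizes:
    print(
        f"{group:<10}{break_counts[group]:>7}{no_break[group]:>10}"
        f"{group_sizes[group]:>7}{p_break[group]:>8.3f}"
    )
print(
    f"{'Total':<10}{sum(break_counts.values()):>7}"
    f"{sum(no_break.values()):>10}{sum(group_sizes.values()):>7}"
)
```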
Section 2.1: Categorical Variables

Example 1: Talking About Sports

A survey in November 2012 asked a random sample of 2,000 US adults “How often do you talk about sports with family and friends?” The results are given in the following frequency table.

Response                         Frequency
Every day or nearly every day          302
About once a week                      277
Occasionally                           526
Rarely or never                        895
TOTAL                                 2000

a) What proportion rarely or never talk about sports?
b) What percent of people in the sample talk about sports once a week or more?
c) Give a relative frequency table for this dataset.

Quick Self-Quiz: Frequency and Relative Frequency Tables

In a blind taste test, people were given four different types of water and asked to select their top choice. Ten of the participants selected tap water, 25 selected Aquafina, 41 selected Fiji, and 24 selected Sam’s Choice.

a) Display the results in a frequency table.
b) What proportion selected Aquafina?
c) What proportion selected bottled (not tap) water?
d) Display the results in a relative frequency table.
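Both Example 1 and the taste-test quiz come down to dividing each count by the total. A minimal sketch follows. The helper name `relative_frequencies` and the dictionary keys are illustrative choices, not part of the worksheet.

```python
def relative_frequencies(counts):
    """Convert a frequency table (category -> count) into relative frequencies."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

# Example 1: how often people talk about sports.
sports = {
    "Every day or nearly every day": 302,
    "About once a week": 277,
    "Occasionally": 526,
    "Rarely or never": 895,
}
sports_rel = relative_frequencies(sports)

# "Once a week or more" combines the two most frequent categories.
weekly_or_more = sports_rel["Every day or nearly every day"] + sports_rel["About once a week"]

# Quick Self-Quiz: blind water taste test.
water = {"Tap": 10, "Aquafina": 25, "Fiji": 41, "Sam's Choice": 24}
water_rel = relative_frequencies(water)

# "Bottled" is everything except tap, so use the complement.
bottled = 1 - water_rel["Tap"]

for name, table in (("Sports", sports_rel), ("Water", water_rel)):
    print(name)
    for category, rel in table.items():
        print(f"  {category:<32}{rel:.4f}")
print(f"Talk about sports once a week or more: {weekly_or_more:.2%}")
print(f"Selected bottled water: {bottled:.2f}")
```

The relative frequencies in each table should sum to 1, which is a quick check on the arithmetic.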
Example 2: Relationship Status and Gender

169 college students were asked about relationship status and gender. The results are given in the following two-way table.

                      Female   Male   Total
In a relationship         32     10      42
It’s complicated          12      7      19
Single                    63     45     108
Total                    107     62     169

a) What proportion of students in this sample are in a relationship?
b) What proportion of females in this sample are in a relationship?
c) What proportion of the people who are in a relationship in this sample are female?
d) What proportion of males in this sample are in a relationship?
e) Using p̂_F to represent the proportion of females in a relationship and p̂_M to represent the proportion of males in a relationship, find the difference in proportions p̂_F − p̂_M.

Example 3: Handedness and Occupation

In a study of handedness in occupations, 10 out of 118 psychiatrists were left-handed, 26 out of 148 architects were left-handed, 5 of 132 orthopedic surgeons were left-handed, and 16 of 105 lawyers were left-handed.

a) Make a two-way table of this relationship.
b) What proportion of all the people in the sample are left-handed?

Quick Self-Quiz: Finding Proportions from Two-Way Tables

Errors in medical prescriptions occur relatively frequently. In a study, two groups of doctors had similar error rates, and one group switched to e-prescriptions while the other continued with hand-written prescriptions. One year later, the number of errors was measured. The results are given in the two-way table.

              Error   No Error
Electronic      254       3594
Written        1478       2370

a) Fill in the row and column totals.
b) What proportion of all the prescriptions had errors in them?
c) What proportion of electronic prescriptions had errors in them?
d) What proportion of written prescriptions had errors in them?
e) What proportion of the prescriptions with errors were written prescriptions?
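The questions in Example 2 differ mainly in which total goes in the denominator: the grand total, a column total, or a row total. The sketch below uses the relationship-status table to make that explicit. The variable names (including `p_hat_f` and `p_hat_m` for the two sample proportions) are illustrative assumptions. The same pattern applies to the prescription table once its row and column totals are filled in.

```python
# Two-way table from Example 2: rows are relationship status, columns are gender.
table = {
    "In a relationship": {"Female": 32, "Male": 10},
    "It's complicated": {"Female": 12, "Male": 7},
    "Single": {"Female": 63, "Male": 45},
}

# Margins: row totals, column totals, and the grand total.
row_totals = {status: sum(cells.values()) for status, cells in table.items()}
col_totals = {
    gender: sum(cells[gender] for cells in table.values())
    for gender in ("Female", "Male")
}
grand_total = sum(row_totals.values())

status = "In a relationship"

# a) Of all students: denominator is the grand total.
p_overall = row_totals[status] / grand_total

# b) and d) Within one gender: denominator is that gender's column total.
p_hat_f = table[status]["Female"] / col_totals["Female"]
p_hat_m = table[status]["Male"] / col_totals["Male"]

# c) Within one status: denominator is that status's row total.
p_female_given_relationship = table[status]["Female"] / row_totals[status]

# e) Difference in sample proportions, females minus males.
difference = p_hat_f - p_hat_m

print(f"a) {p_overall:.3f}")
print(f"b) {p_hat_f:.3f}")
print(f"c) {p_female_given_relationship:.3f}")
print(f"d) {p_hat_m:.3f}")
print(f"e) {difference:.3f}")
```

Parts b) and c) use the same cell (32) but different denominators, which is why "proportion of females who are in a relationship" and "proportion of those in a relationship who are female" give different answers.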