wor4_sols
School: Queensland University of Technology
Course: 500
Subject: Statistics
Date: Apr 3, 2024
Type: docx
Pages: 9

Workshop 4
Introduction to probability
MXN500

Introduction

About this workshop

In this week’s workshop we will be exploring some basic probability rules. We will consider a couple of questions and use R to solve them and produce effective diagrams. We assume you can get started in R with ease and that you’ve done the lecture and lab for this week and all previous weeks.

Assumed skills
  • Manipulating data objects in R
  • Calculating and interpreting summary statistics
  • Writing and executing functions in R
  • Visualising results with ggplot2

Learning objectives

Professional skills
  • Visualising results

A reminder of expectations in the workshop:
  • Keep a record of the work being completed, both the R script and this document
  • Allow everyone a chance to participate in the workshop
  • All opinions are valued provided they do not harm others
  • Everyone is expected to help out with completing the work; learning seldom occurs by watching someone else do maths
  • Shuffle the group around so that someone different is controlling R

Activity 1 – Bert and Ernie

Two friends, Bert and Ernie, normally spend Saturday night out in the Valley. They regularly argue about which club to go to just before 3am. Bert always wants to go to the “Electric Playground”. Ernie does not like to go to the same club every Saturday. For after 3am, he particularly likes both the “Electric Playground” and “Family”.

On a given night, Ernie wants to go to the “Electric Playground”, but definitely not to “Family”, with probability 0.3. Ernie is fine with going to either of the two clubs with
probability 0.2. He is okay with going to “Family” with probability 0.6. On all other Saturday nights, Ernie is too drunk to go anywhere at 3am. Write all relevant events using proper set notation and answer the following questions.

Exercise: Define all relevant events.

Answer: Let $E$ be the event that Ernie wants to (and is able to) go to the “Electric Playground”. Let $F$ be the event that Ernie wants to (and is able to) go to “Family”. Then

$$P(E \cap \bar{F}) = 0.3, \qquad P(E \cap F) = 0.2, \qquad P(F) = 0.6.$$

Exercise: Modify the following code to plot a Venn diagram of Ernie’s club preference.

    library(VennDiagram)

    draw.pairwise.venn(
      area1 = ...,
      area2 = ...,
      cross.area = ...,
      category = c("Electric Playground", "Family"),
      lty = rep("blank", 2),
      fill = c("light blue", "pink"),
      alpha = rep(0.5, 2),
      cat.pos = c(0, 0),
      cat.dist = rep(0.025, 2),
      scaled = FALSE
    )

Answer:

    library(VennDiagram)

    draw.pairwise.venn(
      area1 = 0.5,       # P(E) = P(E and F) + P(E and not F) = 0.2 + 0.3
      area2 = 0.6,       # P(F)
      cross.area = 0.2,  # P(E and F)
      category = c("Electric Playground", "Family"),
      lty = rep("blank", 2),
      fill = c("light blue", "pink"),
      alpha = rep(0.5, 2),
      cat.pos = c(0, 0),
      cat.dist = rep(0.025, 2),
      scaled = FALSE
    )

Exercise: What is the probability that, on a given night, Bert and Ernie happen to have an argument because Ernie wants to go to “Family” and not to the “Electric Playground”?
Answer:

$$
\begin{aligned}
P(\bar{E} \cap F) &= P(F) - P(E \cap F) && \text{(law of total probability)} \\
&= 0.6 - 0.2 \\
&= 0.4
\end{aligned}
$$

Exercise: What is the probability that, on a given night, Bert and Ernie happen to have no argument because they happily go to the Electric Playground?

Answer:

$$
\begin{aligned}
P(E) &= P(E \cap F) + P(E \cap \bar{F}) \\
&= 0.2 + 0.3 \\
&= 0.5
\end{aligned}
$$

Exercise: What is the probability that, on a given night, Ernie is too drunk to get into any club at 3am?

Answer:

$$
\begin{aligned}
P(\bar{E} \cap \bar{F}) &= 1 - P(E \cup F) && \text{(complement rule)} \\
&= 1 - \bigl(P(E) + P(F) - P(E \cap F)\bigr) && \text{(addition rule for non-disjoint events)} \\
&= 1 - (0.5 + 0.6 - 0.2) \\
&= 1 - 0.9 \\
&= 0.1
\end{aligned}
$$

Exercise: What is the probability that, on a given night, Ernie does not want to or is not able to go to the Electric Playground?

Answer:

$$
\begin{aligned}
P(\bar{E}) &= 1 - P(E) && \text{(complement rule)} \\
&= 1 - 0.5 \\
&= 0.5
\end{aligned}
$$

Activity 2 – VIP Pass

The company that owns WB Movie World, Sea World and Wet’n’Wild offers a “VIP Pass” for 99 dollars that gives you unlimited entry into all 3 theme parks for 1 year. Now the company is considering offering this “VIP Pass” for another year. However, managers are worried that there might be too many people who pay only once for the “VIP Pass” while going to the theme parks very often. They decide that they will not offer the “VIP Pass” for
another year if the probability that a customer with a “VIP Pass” goes to the parks more than 5 times is greater than 0.35.

From the data of the current year they know that the probability is 0.5 that a customer buying the “VIP Pass” is 18 years or younger, the probability is 0.4 that a customer with a “VIP Pass” is between 19 and 40 years of age, and the probability is 0.1 that a customer with a “VIP Pass” is 41 years or older. They also know that the probability is 0.4, 0.3 and 0.2 that a “VIP Pass” customer aged 18 or younger, 19 to 40, and 41 or older, respectively, goes to the parks more than 5 times during the year.

Exercise: Define all relevant events.

Answer: Let $Y$ be the event that a VIP pass customer is in the youngest age range. Let $M$ be the event that a VIP pass customer is in the middle age range. Let $O$ be the event that a VIP pass customer is in the oldest age range. Let $F$ be the event that a VIP pass customer uses the pass more than five times in a year. Then

$$P(Y) = 0.5, \qquad P(M) = 0.4, \qquad P(O) = 0.1,$$

$$P(F \mid Y) = 0.4, \qquad P(F \mid M) = 0.3, \qquad P(F \mid O) = 0.2.$$

Exercise: Modify the following code to create a visualisation of the scenario.

    library(DiagrammeR)

    nodes <- create_node_df(
      n = 10,
      type = "number",
      label = c("", "<=18", "19-40", ">=41", "<=5", ">5", "<=5", ">5", "<=5", ">5")
    )

    edges <- create_edge_df(
      from = c(1, 1, 1, 2, 2, 3, 3, 4, 4),
      to = c(2, 3, 4, 5, 6, 7, 8, 9, 10),
      label = c(...),
      rel = "leading to"
    )

    graph <- create_graph(nodes_df = nodes, edges_df = edges, attr_theme = NULL)

    # View the graph
    render_graph(graph)
Answer:

    library(DiagrammeR)

    nodes <- create_node_df(
      n = 10,
      type = "number",
      label = c("", "<=18", "19-40", ">=41", "<=5", ">5", "<=5", ">5", "<=5", ">5")
    )

    # The first three edges carry the age-group probabilities; each remaining
    # pair carries P(<=5 visits | age group) and P(>5 visits | age group)
    edges <- create_edge_df(
      from = c(1, 1, 1, 2, 2, 3, 3, 4, 4),
      to = c(2, 3, 4, 5, 6, 7, 8, 9, 10),
      label = c("0.5", "0.4", "0.1", "0.6", "0.4", "0.7", "0.3", "0.8", "0.2"),
      rel = "leading to"
    )

    graph <- create_graph(nodes_df = nodes, edges_df = edges, attr_theme = NULL)

    # View the graph
    render_graph(graph)

Exercise: Should they offer the “VIP Pass” for another year?

Answer: To answer this question, we need to determine the probability that a VIP pass customer will use the pass more than five times in a year.

$$
\begin{aligned}
P(F) &= P\bigl((F \cap Y) \cup (F \cap M) \cup (F \cap O)\bigr) && \text{(law of total probability)} \\
&= P(F \cap Y) + P(F \cap M) + P(F \cap O) && \text{(addition rule for disjoint events)} \\
&= P(F \mid Y)\,P(Y) + P(F \mid M)\,P(M) + P(F \mid O)\,P(O) && \text{(conditional probability)} \\
&= 0.4 \times 0.5 + 0.3 \times 0.4 + 0.2 \times 0.1 \\
&= 0.2 + 0.12 + 0.02 \\
&= 0.34
\end{aligned}
$$
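The calculation above can be reproduced numerically. The following is a minimal base-R sketch; the variable names `prior`, `p_frequent_given_age`, `p_frequent` and `threshold` are illustrative and do not come from the workshop.

```r
# Age-group probabilities P(Y), P(M), P(O) as given in the scenario
prior <- c(young = 0.5, middle = 0.4, old = 0.1)

# P(F | age group): probability of more than five visits within each group
p_frequent_given_age <- c(young = 0.4, middle = 0.3, old = 0.2)

# Law of total probability: sum over groups of P(F | group) * P(group)
p_frequent <- sum(p_frequent_given_age * prior)
print(p_frequent)

# Decision threshold stated in the scenario (illustrative variable name)
threshold <- 0.35
print(p_frequent > threshold)
```

Storing the probabilities as named vectors keeps each conditional probability aligned with its age group, so the element-wise product mirrors the terms of the sum.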
Since the probability that a customer uses the VIP pass more than 5 times is 0.34, which is less than 0.35, the parks should continue to make the offer.

Exercise: What is the probability that a customer who goes to the parks more than 5 times is 18 years or younger?

Answer:

$$
\begin{aligned}
P(Y \mid F) &= \frac{P(F \mid Y)\,P(Y)}{P(F)} && \text{(Bayes' theorem)} \\
&= \frac{0.4 \times 0.5}{0.34} \\
&\approx 0.59
\end{aligned}
$$

Activity 3 – Random Data Generation

In this activity, we will learn how to use R to generate random data.

Exercise: Modify the following code to generate 1000 uniform random variables on the interval [1,6].

    rand_df <- data.frame(x = runif(n = ..., min = ..., max = ...))

Answer:

    rand_df <- data.frame(x = runif(n = 1000, min = 1, max = 6))

Exercise: Create a histogram that shows your random numbers.

Answer:

    library(ggplot2)  # needed for ggplot() and geom_histogram()

    ggplot(data = rand_df, aes(x)) +
      geom_histogram(colour = 'darkgrey', fill = 'coral') +
      theme_bw()

Exercise: Use the round function to round your randomly generated data to whole numbers. Create a histogram with the rounded data.
Answer:

    rounded_df <- data.frame(x = round(rand_df$x, digits = 0))

    ggplot(data = rounded_df, aes(x)) +
      geom_histogram(colour = 'darkgrey', fill = 'coral', bins = 6) +
      scale_x_continuous(breaks = seq(1, 6, 1)) +
      theme_bw()

Exercise: Does this look like a discrete uniform distribution on {1,…,6} (like rolling a six-sided die)? If not, what would you change?

Answer: This does not look like a discrete uniform distribution on {1,…,6}. Rounding sends only the values in [1, 1.5) to 1 and only the values in [5.5, 6] to 6, while each of 2, 3, 4 and 5 collects values from an interval of width 1, so 1 and 6 appear about half as often as the other values. To fix this, we would need to widen the original interval of our continuous random variable to [0.5, 6.5] before rounding.

Exercise: Create a function that randomly generates from a discrete uniform distribution on {1,…,6}.

Answer:

    rand_dice <- function(obs) {
      # Round Uniform(0.5, 6.5) draws so each of 1,...,6 covers an interval of width 1
      return(round(runif(n = obs, min = 0.5, max = 6.5), digits = 0))
    }

Exercise: Using the function you just created, plot a histogram of 1000 observations.

Answer:

    rand_discrete_df <- data.frame(x = rand_dice(1000))

    ggplot(data = rand_discrete_df, aes(x)) +
      geom_histogram(colour = 'darkgrey', fill = 'coral', bins = 6) +
      scale_x_continuous(breaks = seq(1, 6, 1)) +
      theme_bw()
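The effect of the sampling interval can also be seen without a plot by tabulating the proportion of each rounded value. This is a rough base-R sketch; the seed, the sample size `n` and the names `narrow`, `wide`, `prop_narrow` and `prop_wide` are assumptions made for illustration, not part of the workshop.

```r
set.seed(500)  # arbitrary seed, used only to make the sketch reproducible
n <- 100000    # larger than the workshop's 1000 so the proportions settle

# Rounding Uniform(1, 6): the values 1 and 6 each cover an interval of width 0.5
narrow <- round(runif(n, min = 1, max = 6))

# Rounding Uniform(0.5, 6.5): every value 1,...,6 covers an interval of width 1
wide <- round(runif(n, min = 0.5, max = 6.5))

# Observed proportion of each face; factor() keeps all six levels in the table
prop_narrow <- table(factor(narrow, levels = 1:6)) / n
prop_wide <- table(factor(wide, levels = 1:6)) / n

print(round(prop_narrow, 3))  # roughly 0.1 for 1 and 6, roughly 0.2 otherwise
print(round(prop_wide, 3))    # roughly 1/6 for every face
```

Base R's `sample(1:6, size = obs, replace = TRUE)` draws from the same discrete uniform distribution directly, which avoids the rounding step altogether.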
Activity 4 – O Block Lifts

At one stage, lift breakdowns in O block were fairly common. Data was collected on two lifts at the same end of O block.

    ##   Probability A works A broken
    ## 1     B works     0.1      0.4
    ## 2    B broken     0.2      0.3

Exercise: Define all relevant events.

Answer: Let $A$ be the event that lift A works. Let $B$ be the event that lift B works. Then

$$P(A \cap B) = 0.1, \qquad P(A \cap \bar{B}) = 0.2, \qquad P(\bar{A} \cap B) = 0.4, \qquad P(\bar{A} \cap \bar{B}) = 0.3.$$

Exercise: What is the probability that lift A works given lift B works?

Answer:

$$
\begin{aligned}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)} && \text{(conditional probability)} \\
&= \frac{P(A \cap B)}{P(A \cap B) + P(\bar{A} \cap B)} && \text{(law of total probability)} \\
&= \frac{0.1}{0.1 + 0.4} \\
&= 0.2
\end{aligned}
$$
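The same conditional probability can be read off the table in R. This is a minimal sketch that assumes the joint probabilities are stored in a matrix called `joint`, a name chosen here for illustration, with rows for lift B and columns for lift A as in the printed table.

```r
# Joint probabilities from the table: rows are lift B, columns are lift A
joint <- matrix(
  c(0.1, 0.4,
    0.2, 0.3),
  nrow = 2, byrow = TRUE,
  dimnames = list(B = c("works", "broken"), A = c("works", "broken"))
)

# Law of total probability: summing a row gives the marginal for lift B
p_B <- rowSums(joint)

# Conditional probability: P(A works | B works) = P(A and B) / P(B)
p_A_given_B <- unname(joint["works", "works"] / p_B["works"])
print(p_A_given_B)
```

Indexing by the row and column names rather than by position makes it harder to mix up the two lifts when the conditioning event changes.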
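Exercise: What is the probability that lift B works given lift A works?

Answer:

$$
\begin{aligned}
P(B \mid A) &= \frac{P(B \cap A)}{P(A)} && \text{(conditional probability)} \\
&= \frac{P(B \cap A)}{P(A \cap B) + P(A \cap \bar{B})} && \text{(law of total probability)} \\
&= \frac{0.1}{0.1 + 0.2} \\
&= \frac{1}{3}
\end{aligned}
$$

Exercise: What is the probability that at least one lift is working?

Answer:

$$
\begin{aligned}
P(A \cup B) &= P(A) + P(B) - P(A \cap B) && \text{(addition rule for non-disjoint events)} \\
&= 0.3 + 0.5 - 0.1 \\
&= 0.7
\end{aligned}
$$

We calculated $P(A) = 0.3$ and $P(B) = 0.5$ in the previous questions.

Exercise: What is the probability that neither lift is working?

Answer:

$$P(\bar{A} \cap \bar{B}) = 0.3$$

Exercise: Are lift breakdowns independent? Why or why not?

Answer: If the lifts were independent, we would have $P(A \cap B) = P(A)\,P(B)$. Here $P(A \cap B) = 0.1$, whereas $P(A)\,P(B) = 0.3 \times 0.5 = 0.15$. Equivalently, $P(A \mid B) = 0.2$ differs from $P(A) = 0.3$. Therefore lift breakdowns are not independent. Note, there are many ways to show this.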
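One of the many ways to check independence is to compare every joint probability with the product of its marginals. The sketch below does this in base R; the matrix name `joint` and the other variable names are assumptions made for illustration.

```r
# Joint probabilities from the table: rows are lift B, columns are lift A
joint <- matrix(
  c(0.1, 0.4,
    0.2, 0.3),
  nrow = 2, byrow = TRUE,
  dimnames = list(B = c("works", "broken"), A = c("works", "broken"))
)

p_B <- rowSums(joint)  # marginal probabilities for lift B
p_A <- colSums(joint)  # marginal probabilities for lift A

# Under independence each cell would equal the product of its two marginals
expected_if_independent <- outer(p_B, p_A)
print(expected_if_independent)

# Compare with a numerical tolerance, ignoring the differing dimnames
independent <- isTRUE(all.equal(joint, expected_if_independent,
                                check.attributes = FALSE))
print(independent)

# Probability that at least one lift works: complement of both being broken
p_at_least_one <- 1 - joint["broken", "broken"]
print(p_at_least_one)
```

Comparing the whole table at once covers all four combinations of working and broken lifts, although a mismatch in a single cell is already enough to rule out independence.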