IE6400_Day14
School: Northeastern University
Course: 6400
Subject: Industrial Engineering
Date: Feb 20, 2024
IE6400 Foundations for Data Analytics Engineering
¶
Fall 2023
¶
Module 2: Joint, Marginal and Conditional Probability
¶
Probability Concepts
¶
1. Joint Probability:
¶
• Definition: The joint probability of two events, A and B, denoted as $P(A \cap B)$ or $P(A, B)$, is the probability that both events occur at the same time.
• Formula: $P(A \cap B) = P(A) \times P(B|A)$ or $P(A \cap B) = P(B) \times P(A|B)$
2. Conditional Probability:
¶
• Definition: The conditional probability of an event A given that another event B has occurred is denoted as $P(A|B)$. It represents the probability of A occurring, assuming that B has already occurred.
• Formula: $P(A|B) = \frac{P(A \cap B)}{P(B)}$
3. Marginal Probability:
¶
• Definition: The marginal probability of an event A is simply the probability of that event occurring without any condition on another event. It is also known as the "unconditional probability," or simply the "probability."
• Formula: For two events A and B, the marginal probability of A can be found by summing the joint probabilities of A occurring with each possible state of B. That is, $P(A) = \sum_{b} P(A, B=b)$, where B=b represents each possible state of B.
Relationship:
• These probabilities provide different perspectives on the likelihood of events: joint probability considers two events together, conditional probability considers one event given the occurrence of another, and marginal probability considers one event without any conditions.
Joint, Conditional and Marginal Probability
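Before turning to the exercises, the three definitions above can be checked numerically on a small joint distribution. A minimal sketch; the joint table below is made up for illustration and is not taken from any exercise:

```python
import numpy as np

# Illustrative joint distribution P(A, B) for two binary events.
# Row index = value of A, column index = value of B; entries sum to 1.
joint = np.array([[0.30, 0.20],    # P(A=0, B=0), P(A=0, B=1)
                  [0.10, 0.40]])   # P(A=1, B=0), P(A=1, B=1)

# Marginals: sum the joint over the other variable.
P_A = joint.sum(axis=1)            # P(A=a) = sum_b P(A=a, B=b)
P_B = joint.sum(axis=0)            # P(B=b) = sum_a P(A=a, B=b)

# Conditional: P(A=1 | B=1) = P(A=1, B=1) / P(B=1)
P_A1_given_B1 = joint[1, 1] / P_B[1]

print(P_A, P_B, P_A1_given_B1)
# Chain rule: the joint factors as P(B) * P(A | B)
print(np.isclose(joint[1, 1], P_B[1] * P_A1_given_B1))  # True
```

The same three operations (sum a row or column, divide a cell by a marginal, multiply back) reappear in every exercise below.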
¶
Exercise 1 Understanding Joint Probability through Dice Rolling Simulation
¶
Problem Statement
¶
Imagine you have two six-sided dice:
• Die A: A standard die with faces [1, 2, 3, 4, 5, 6].
• Die B: Another standard die with faces [1, 2, 3, 4, 5, 6].
We will simulate the rolling of die A and die B 10,000 times. Our goal is to calculate the joint probability of the following two specific events:
1. Event 1: Die A rolls a 2.
2. Event 2: Die B rolls a 4.
The joint probability is the probability of both events happening at the same time.
Objective:
¶
1. Simulate the rolling of two dice 10,000 times.
2. Calculate the joint probability of rolling a 2 with die A and a 4 with die B.
3. Visualize the outcomes.
4. Interpret the results.
Step 1: Importing Necessary Libraries
¶
In [1]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Step 2: Simulating the Dice Rolls
¶
We'll use numpy
to simulate the rolling of two dice 10,000 times.
In [2]:
np.random.seed(0) # for reproducibility
n_rolls = 10000
# Simulating the rolls
rolls_A = np.random.randint(1, 7, n_rolls)
rolls_B = np.random.randint(1, 7, n_rolls)
Step 3: Calculating the Joint Probability
¶
We'll calculate the joint probability of rolling a 2 with die A and a 4 with die B.
In [3]:
# Identifying the successful events
success_events = np.logical_and(rolls_A == 2, rolls_B == 4)
# Calculating the joint probability
joint_prob = np.sum(success_events) / n_rolls
# Print the result
print(f"Joint Probability of event A (rolling a 2) and event B (rolling a 4) is: {joint_prob}")
Joint Probability of event A (rolling a 2) and event B (rolling a 4) is: 0.028
Step 4: Visualization
¶
We'll visualize the outcomes of the dice rolls using seaborn
.
In [4]:
# Creating a DataFrame for visualization
import pandas as pd
df = pd.DataFrame({'Die A': rolls_A, 'Die B': rolls_B})
# Plotting
sns.histplot(df, bins=np.arange(1, 9), discrete=True, stat='probability', common_norm=False)
plt.title('Distribution of Dice Rolls')
plt.xlabel('Die Face')
plt.ylabel('Probability')
plt.legend(['Die A', 'Die B'])
plt.show()
Interpretation
¶
The joint probability calculated gives us the probability of both events (rolling a 2 with die A and rolling a 4 with die B) occurring together in a single roll. The visualization shows the distribution of outcomes for each die over the 10,000 rolls.
Conclusion
¶
Through simulation, we can estimate probabilities of various events. The joint probability provides insights into the likelihood of multiple events occurring together. Understanding this concept is crucial in various fields like statistics, data science, and various research areas where dependency between events is analyzed.
Exercise 2 Understanding Joint and Marginal Probabilities from Customer Complaints
¶
Problem Statement
¶
Consider a scenario at a popular company service center where they receive various complaints from their customers. Out of a total of 100 complaints:
• 80 customers complained about late delivery of the items.
• 60 customers complained about poor product quality.
We want to answer the following questions:
1. What is the probability that a customer complaint will be about both product quality and late delivery?
2. What is the probability that a complaint will be only about late delivery?
Objective:
¶
1. Calculate the joint probability of complaints about both product quality and late delivery.
2. Calculate the marginal probability of complaints only about late delivery.
3. Visualize the outcomes.
4. Interpret the results.
Step 1: Importing Necessary Libraries
¶
In [5]:
import matplotlib.pyplot as plt
Step 2: Calculating Probabilities
¶
Given the data, we can use the principle of Inclusion-Exclusion to find the joint and marginal probabilities.
In [6]:
# Given data
total_complaints = 100
late_delivery_complaints = 80
poor_quality_complaints = 60
# Using Inclusion-Exclusion principle to find complaints about both
both_complaints = late_delivery_complaints + poor_quality_complaints - total_complaints
# a) Probability of both complaints
prob_both = both_complaints / total_complaints
# b) Probability of only late delivery
prob_only_late_delivery = (late_delivery_complaints - both_complaints) / total_complaints
#prob_both, prob_only_late_delivery
print('a. Probability that a customer complaint will be about both product quality and late delivery is %1.4f' % prob_both)
print('b. Probability that a complaint will be only about late delivery is %1.4f' % prob_only_late_delivery)
a. Probability that a customer complaint will be about both product quality and late delivery is 0.4000
b. Probability that a complaint will be only about late delivery is 0.4000
Step 3: Visualization
¶
We'll visualize the complaints using a Venn diagram for better understanding.
In [7]:
!pip install matplotlib_venn
In [8]:
from matplotlib_venn import venn2
# Plotting Venn diagram
plt.figure(figsize=(8, 8))
# venn2 expects region sizes in the order (only A, only B, A and B)
venn2(subsets=(late_delivery_complaints - both_complaints,
               poor_quality_complaints - both_complaints,
               both_complaints),
      set_labels=('Late Delivery', 'Poor Quality'))
plt.title('Venn Diagram of Customer Complaints')
plt.show()
Interpretation
¶
From the calculated probabilities:
1. The joint probability represents the likelihood of a customer complaining about both late delivery and poor product quality.
2. The marginal probability for only late delivery gives us the proportion of customers who had issues solely with late delivery and not with product quality.
The Venn diagram visually represents the overlap between the two types of complaints, helping us understand the distribution of complaints better.
Conclusion
¶
Understanding joint and marginal probabilities is crucial in real-world scenarios, especially in customer service and product management. It helps businesses identify areas of improvement and prioritize issues based on their impact and frequency.
Conditional probability
¶
Conditional probability is a concept in probability theory that quantifies the likelihood of one event occurring given that another event has already occurred. It expresses how the probability of an event is influenced or constrained by the knowledge of another event. Conditional probability is denoted as P(A | B), where A is the event of interest, and B is the condition under which we're assessing the probability. The formula for conditional probability is:
\begin{equation} P(A | B) = \frac{P(A \cap B)}{P(B)} \end{equation}
Exercise 3 Understanding Conditional Probability with a Deck of Cards
¶
Problem Statement
¶
Given a standard deck of 52 playing cards, we want to:
1. Calculate the probability of drawing an Ace on the first draw.
2. Calculate the conditional probability of drawing a King on the second draw given that an Ace was drawn first.
Objective:
¶
1. Define the deck of cards.
2. Calculate the probabilities.
3. Interpret the results.
Step 1: Importing Necessary Libraries
¶
In [9]:
import numpy as np
import matplotlib.pyplot as plt
Step 2: Defining the Deck and Calculating Probabilities
¶
In [10]:
# Define the deck of cards
deck = ['Ace', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'Jack', 'Queen', 'King'] * 4  # Four suits
# Probability of drawing an Ace on the first draw
prob_Ace_first_draw = deck.count('Ace') / len(deck)
# Probability of drawing a King on the second draw after drawing an Ace
deck.remove('Ace')  # Remove one Ace: 51 cards remain, all four Kings among them
prob_King_second_draw = deck.count('King') / len(deck)
# The conditional probability P(King second | Ace first) is exactly this second-draw probability
conditional_probability = prob_King_second_draw
prob_Ace_first_draw, conditional_probability
Out[10]:
(0.07692307692307693, 0.0784313725490196)
Step 3: Visualization
¶
We'll visualize the probabilities using a bar chart for better understanding.
In [11]:
# Plotting the probabilities
labels = ['P(Ace on 1st draw)', 'P(King on 2nd draw | Ace on 1st draw)']
values = [prob_Ace_first_draw, prob_King_second_draw]
plt.bar(labels, values, color=['blue', 'green'])
plt.ylabel('Probability')
plt.title('Conditional Probability with a Deck of Cards')
plt.show()
print(f"Conditional Probability (P(King Second | Ace First)): {conditional_probability}")
Conditional Probability (P(King Second | Ace First)): 0.0784313725490196
Interpretation
¶
From the calculated probabilities:
1. The first probability gives us the likelihood of drawing an Ace from a full deck of cards.
2. The conditional probability represents the likelihood of drawing a King given that an Ace was drawn in the previous draw.
Conclusion
¶
Conditional probability is a fundamental concept in probability theory and statistics. The provided code demonstrates how to compute conditional probabilities using a real-world example of drawing cards from a deck: once an Ace has been drawn and removed, the probability of drawing a King is taken over the 51 cards that remain.
Exercise 5 Understanding the Probability of Consecutive Events with Dice Rolling
¶
Problem Statement
¶
Imagine you have a standard six-sided die. We want to understand the probability of a specific scenario:
What is the probability of rolling a "6" in two consecutive trials when rolling the die?
Through this exercise, we will simulate rolling the die multiple times and compute the probability of the event of interest.
Objective:
¶
1. Simulate rolling a die multiple times.
2. Calculate the probability of rolling a "6" in two consecutive trials.
3. Visualize the outcomes.
4. Interpret the results.
Step 1: Importing Necessary Libraries
¶
In [12]:
import numpy as np
import matplotlib.pyplot as plt
Step 2: Simulating Dice Rolls
¶
We'll use numpy
to simulate rolling a die multiple times.
In [13]:
np.random.seed(0) # for reproducibility
n_trials = 10000
# Simulating the rolls
rolls = np.random.randint(1, 7, n_trials)
# Checking for consecutive "6"s
consecutive_sixes = np.sum((rolls[:-1] == 6) & (rolls[1:] == 6))
# Probability of getting two consecutive "6"s
prob_consecutive_sixes = consecutive_sixes / (n_trials - 1)
prob_consecutive_sixes
Out[13]:
0.028502850285028504
Step 3: Visualization (Revised)
¶
We'll visualize the outcomes of the dice rolls and highlight the instances of consecutive
"6"s.
In [14]:
plt.figure(figsize=(15, 6))
plt.plot(rolls[:100], 'o-', label='Dice Rolls') # Plotting the first 100 rolls for clarity
# Identifying positions where a "6" is followed by another "6"
positions_of_consecutive_sixes = np.where((rolls[:99] == 6) & (rolls[1:100] == 6))[0]
plt.plot(positions_of_consecutive_sixes, rolls[positions_of_consecutive_sixes], 'ro', label='First of Consecutive 6s')
plt.plot(positions_of_consecutive_sixes + 1, rolls[positions_of_consecutive_sixes + 1],
'ro') # Second of Consecutive 6s
plt.xlabel('Trial')
plt.ylabel('Dice Face')
plt.title('First 100 Dice Rolls with Consecutive "6"s Highlighted')
plt.legend()
plt.grid(True)
plt.show()
Interpretation
¶
From the simulation:
• The probability calculated gives us the likelihood of rolling a "6" in two consecutive trials.
• The visualization provides a snapshot of the first 100 dice rolls, with the instances of consecutive "6"s highlighted in red.
Conclusion
¶
Understanding the probability of consecutive events is crucial in various scenarios, from gaming strategies to statistical analyses. Through this exercise, we've demonstrated how to compute such probabilities using a simple dice-rolling example.
Exercise 6 Understanding Conditional Probability with Sports Preferences and Gender
¶
Problem Statement
¶
A survey was conducted among 300 individuals, asking them about their favorite sport
among the following options: baseball, basketball, football, or soccer. The survey also recorded the gender of each respondent.
Given the survey results, we want to answer questions like:
1. What is the probability that a randomly selected individual prefers basketball?
2. Given that an individual is male, what is the probability they prefer basketball?
3. Given that an individual prefers basketball, what is the probability they are female?
Through this exercise, we will compute and understand conditional probabilities based on the survey results.
Objective:
¶
1. Analyze the survey results.
2. Calculate the probability of an individual preferring basketball.
3. Calculate the conditional probabilities based on gender.
4. Visualize the outcomes.
5. Interpret the results.
Step 1: Importing Necessary Libraries
¶
In [15]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Step 2: Creating the Survey Dataset
¶
We'll use the provided sample data to represent the survey results.
In [16]:
# Create pandas DataFrame with raw data
df = pd.DataFrame({'gender': np.repeat(np.array(['Male', 'Female']), 150),
                   'sport': np.repeat(np.array(['Baseball', 'Basketball', 'Football', 'Soccer',
                                                'Baseball', 'Basketball', 'Football', 'Soccer']),
                                      (34, 40, 58, 18, 34, 52, 20, 44))})
df.head()
Out[16]:
  gender     sport
0   Male  Baseball
1   Male  Baseball
2   Male  Baseball
3   Male  Baseball
4   Male  Baseball
Step 3: Calculating Probabilities
¶
Given the dataset, we can now calculate the required probabilities.
In [17]:
# Probability of preferring basketball
prob_basketball = len(df[df['sport'] == 'Basketball']) / len(df)
# Conditional probability of preferring basketball given male
prob_basketball_given_male = len(df[(df['sport'] == 'Basketball') & (df['gender'] == 'Male')]) / len(df[df['gender'] == 'Male'])
# Conditional probability of being female given preferring basketball
prob_female_given_basketball = len(df[(df['sport'] == 'Basketball') & (df['gender'] == 'Female')]) / len(df[df['sport'] == 'Basketball'])
prob_basketball, prob_basketball_given_male, prob_female_given_basketball
Out[17]:
(0.30666666666666664, 0.26666666666666666, 0.5652173913043478)
Step 4: Visualization
¶
We'll visualize the survey results and the calculated probabilities.
In [18]:
# Plotting the survey results based on gender and sport preference
pivot_count = df.groupby(['gender', 'sport']).size().unstack()
pivot_count.plot(kind='bar', stacked=True, figsize=(10, 7))
plt.title('Survey Results: Favorite Sports by Gender')
plt.ylabel('Number of Individuals')
plt.show()
Interpretation
¶
From the visualization and calculated probabilities:
• The bar chart shows the distribution of sports preferences among males and females.
• The calculated probabilities provide insights into specific scenarios, such as the likelihood of a male preferring basketball and the likelihood of a basketball fan being female.
Conclusion
¶
Conditional probability is a crucial concept in probability theory and statistics. Through this exercise, we've demonstrated how to compute conditional probabilities using a real-world example of a sports preference survey segmented by gender.
Exercise 7 Understanding Marginal Probability with Dice Rolling
¶
Problem Statement
¶
Imagine you have a standard six-sided die. We want to understand a specific probability scenario:
What is the marginal probability of rolling a "3" when rolling the die?
Through this exercise, we will compute the marginal probability of the event of interest
and visualize the outcomes of rolling the die multiple times.
Objective:
¶
1. Simulate rolling a die multiple times.
2. Calculate the marginal probability of rolling a "3".
3. Visualize the outcomes.
4. Interpret the results.
Step 1: Importing Necessary Libraries
¶
In [19]:
import numpy as np
import matplotlib.pyplot as plt
Step 2: Simulating Dice Rolls
¶
We'll use numpy
to simulate rolling a die multiple times.
In [20]:
np.random.seed(0) # for reproducibility
n_trials = 1000
# Simulating the rolls
rolls = np.random.randint(1, 7, n_trials)
# Marginal probability of rolling a "3"
prob_rolling_3 = np.sum(rolls == 3) / n_trials
prob_rolling_3
Out[20]:
0.157
Step 3: Visualization
¶
We'll visualize the outcomes of the dice rolls and highlight the instances of rolling a "3".
In [21]:
# Plotting the outcomes of the dice rolls
plt.figure(figsize=(15, 6))
plt.hist(rolls, bins=np.arange(1, 8) - 0.5, rwidth=0.8, align='mid', color='skyblue', edgecolor='black')
plt.xlabel('Dice Face')
plt.ylabel('Frequency')
plt.title('Distribution of 1000 Dice Rolls')
plt.xticks(np.arange(1, 7))
plt.axvline(x=3, color='red', linestyle='dashed', label='Dice Face = 3')
plt.legend()
plt.show()
print(f"Marginal Probability (P(3)): {prob_rolling_3}")
Marginal Probability (P(3)): 0.157
Interpretation
¶
From the simulation:
• The histogram shows the distribution of dice faces over 1000 rolls.
• The dashed red line indicates the dice face "3", for which we calculated the marginal probability.
• The calculated probability gives us the likelihood of rolling a "3" in any given trial.
Conclusion
¶
Marginal probability is a fundamental concept in probability theory. Through this exercise, we've demonstrated how to compute marginal probabilities using a simple dice-rolling example and visualized the outcomes for better understanding.
Probability Mass Function
¶
Probability Mass Function
A probability mass function (pmf) is the probability distribution of a discrete random variable: it lists the possible values of the variable and their associated probabilities, giving for each value the probability that X is equal to it.
Let X be a discrete random variable on a sample space S. Then the probability mass function f(x) is defined as f(x) = P[X = x].
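As a minimal sketch of the definition, the pmf of a fair six-sided die assigns 1/6 to each face; the probabilities sum to 1, and the pmf also yields the expected value (nothing here is assumed beyond the definition above):

```python
# pmf of a fair six-sided die: f(x) = P[X = x] = 1/6 for each face x
pmf = {x: 1/6 for x in range(1, 7)}

print(pmf[3])                              # f(3) = P[X = 3], about 0.1667
print(sum(pmf.values()))                   # a pmf's probabilities sum to 1
print(sum(x * p for x, p in pmf.items()))  # expected value E[X] = 21/6 = 3.5
```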
Exercise 8 Understanding Probability Mass Function (PMF) with Dice Rolling
¶
Problem Statement
¶
Imagine you have a standard six-sided die. We want to understand the Probability Mass Function (PMF) for this scenario.
What is the PMF when rolling a six-sided die?
Through this exercise, we will compute the PMF for each possible outcome of the die and visualize the results.
Objective:
¶
1. Define the PMF for a fair six-sided die.
2. Visualize the PMF.
3. Interpret the results.
Step 1: Importing Necessary Libraries
¶
In [22]:
import numpy as np
import matplotlib.pyplot as plt
Step 2: Defining the PMF
¶
For a fair six-sided die, each face has an equal probability of 1/6. We'll define the PMF accordingly.
In [23]:
# Possible outcomes of the die
outcomes = np.arange(1, 7)
# PMF for each outcome
pmf = [1/6 for _ in outcomes]
pmf
Out[23]:
[0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666,
0.16666666666666666]
Step 3: Visualization
¶
We'll visualize the PMF to better understand the distribution of probabilities for each outcome.
In [24]:
# Plotting the PMF
plt.figure(figsize=(10, 6))
plt.bar(outcomes, pmf, color='lightblue', edgecolor='black')
plt.xlabel('Dice Face')
plt.ylabel('Probability')
plt.title('Probability Mass Function (PMF) of a Fair Six-Sided Die')
plt.xticks(outcomes)
plt.ylim(0, 1/6 + 0.05) # Adjusting y-axis to better visualize the probabilities
plt.show()
Interpretation
¶
From the visualization:
• The bar chart shows the PMF of a fair six-sided die.
• Each face of the die has an equal probability of 1/6, as represented by the equal heights of the bars.
• This confirms our understanding that in a fair die, each face has an equal chance of landing face up.
Conclusion
¶
The Probability Mass Function (PMF) provides a clear way to represent the probabilities
of discrete random variables. Through this exercise, we've visualized the PMF for a simple dice-rolling scenario, reinforcing the concept of equal probabilities for each face
of a fair die.
Probability Density Function
¶
Probability Density Function
The probability density function (PDF) expresses the relative likelihood of a continuous random variable taking on a particular value. We can leverage powerful libraries like NumPy, SciPy, and Matplotlib to plot the PDF of a continuous random variable in Python. Probabilities correspond to areas under the density, $P(a \le X \le b) = \int_{a}^{b} f(x) \, dx$, and because any single point carries zero probability for a continuous variable,
$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$
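A minimal sketch of the area-under-the-curve idea, using the standard normal density from scipy; the interval [-1, 1] is an illustrative choice:

```python
# P(a <= X <= b) for a continuous variable is the integral of the pdf
# from a to b. Check it two ways for the standard normal.
from scipy.stats import norm
from scipy.integrate import quad

a, b = -1.0, 1.0
area, _ = quad(norm.pdf, a, b)       # numerical integral of the pdf
via_cdf = norm.cdf(b) - norm.cdf(a)  # same probability via the cdf

print(area)     # ~0.6827, the familiar "68% within one standard deviation"
print(via_cdf)  # agrees with the direct integral
```

Since endpoints have zero probability, the same number answers all four interval forms in the equation above.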
Exercise 9 Understanding Probability Density Function (PDF) with the Normal Distribution
¶
Problem Statement
¶
Consider a scenario where we are studying the heights of adult males in a particular region. The heights are normally distributed with a mean of 175 cm and a standard deviation of 7 cm.
Objective:
Understand and visualize the Probability Density Function (PDF) for this scenario.
Step 1: Importing Necessary Libraries
¶
In [25]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
Step 2: Defining the PDF
¶
For the given scenario, we'll use a normal distribution with the given mean and standard deviation to define the PDF. We'll generate a range of height values and compute the PDF for each value.
In [26]:
# Parameters for the normal distribution
mean = 175
std_dev = 7
# Generating a range of height values
heights = np.linspace(mean - 4*std_dev, mean + 4*std_dev, 1000)
# Computing the PDF for each height value
pdf_values = norm.pdf(heights, mean, std_dev)
Step 3: Visualization
¶
We'll visualize the PDF to better understand the distribution of heights.
In [27]:
# Plotting the PDF
plt.figure(figsize=(12, 6))
plt.plot(heights, pdf_values, color='blue', linewidth=2)
plt.fill_between(heights, pdf_values, color='skyblue', alpha=0.4)
plt.title('Probability Density Function (PDF) of Heights')
plt.xlabel('Height (cm)')
plt.ylabel('Density')
plt.grid(True)
plt.show()
Interpretation
¶
From the visualization:
• The curve represents the distribution of heights of adult males in the region.
• The peak of the curve is at the mean height of 175 cm.
• The spread of the curve is determined by the standard deviation, indicating the variability in heights.
• The area under the curve represents probability, and for a continuous distribution, the total area under the curve is 1.
Conclusion
¶
The Probability Density Function (PDF) provides a way to represent the probabilities of continuous random variables. Through this exercise, we've visualized the PDF for the distribution of heights, reinforcing the concept of how probabilities are distributed for continuous variables.
Cumulative Distribution Function
¶
Cumulative Distribution Function
The cumulative distribution function (cdf) is the probability that the variable takes a value less than or equal to x. That is, $F(x) = P[X \le x]$.
For a continuous distribution, this can be expressed mathematically as $F(x) = \int_{-\infty}^{x} f(u) \, du$
For a discrete distribution, the cdf can be expressed as $F(x) = \sum_{i=0}^{x} f(i)$
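For the discrete case, the cdf is just the running sum of the pmf. A minimal sketch for a fair die:

```python
import numpy as np

# Discrete cdf of a fair die: F(x) = P[X <= x], the cumulative sum of the pmf
faces = np.arange(1, 7)
pmf = np.full(6, 1/6)
cdf = np.cumsum(pmf)

for x, F in zip(faces, cdf):
    print(x, round(F, 4))  # F(3) is 0.5; F(6) is 1.0
```

The cdf is non-decreasing and reaches 1 at the largest value of the support.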
Exercise 10 Understanding Cumulative Distribution Function (CDF) with the Normal Distribution
¶
Problem Statement
¶
Imagine a scenario where we are studying the exam scores of students in a particular class. The scores are normally distributed with a mean of 70 and a standard deviation of 10.
Objective:
Understand and visualize the Cumulative Distribution Function (CDF) for this scenario, which represents the probability that a student scored less than or equal
to a particular score.
Step 1: Importing Necessary Libraries
¶
In [28]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
Step 2: Defining the CDF
¶
For the given scenario, we'll use a normal distribution with the given mean and standard deviation to define the CDF. We'll generate a range of score values and compute the CDF for each value.
In [29]:
# Parameters for the normal distribution
mean = 70
std_dev = 10
# Generating a range of score values
scores = np.linspace(mean - 4*std_dev, mean + 4*std_dev, 1000)
# Computing the CDF for each score value
cdf_values = norm.cdf(scores, mean, std_dev)
Step 3: Visualization
¶
We'll visualize the CDF to better understand the distribution of exam scores.
In [30]:
# Plotting the CDF
plt.figure(figsize=(12, 6))
plt.plot(scores, cdf_values, color='green', linewidth=2)
plt.title('Cumulative Distribution Function (CDF) of Exam Scores')
plt.xlabel('Score')
plt.ylabel('Probability')
plt.grid(True)
plt.show()
Interpretation
¶
From the visualization:
• The curve represents the cumulative probabilities of exam scores.
• For any given score on the x-axis, the corresponding y-value gives the probability that a student scored less than or equal to that score.
• The curve starts at 0 and ends at 1, representing the cumulative probability range.
• The steeper regions of the curve indicate where most students' scores lie, while flatter regions indicate fewer scores.
Conclusion
¶
The Cumulative Distribution Function (CDF) provides a way to represent the cumulative probabilities of continuous random variables. Through this exercise, we've visualized the CDF for the distribution of exam scores, reinforcing the concept of how cumulative probabilities are distributed for continuous variables.
Marginal Probability Distribution
¶
Definition:
The marginal probability distribution of a random variable in a multivariate
distribution represents the probability distribution of that single variable, ignoring the values of other variables in the distribution. It is obtained by summing (for discrete variables) or integrating (for continuous variables) the joint probability distribution over all possible values of the variable of interest.
Mathematical Representation:
For a discrete random variable X:
\begin{equation}P(X=x) = \sum_{y} P(X=x, Y=y) \end{equation}
This equation states that the probability that X takes on a specific value (x) is obtained
by summing over all possible values of the other random variable Y, considering all pairs (x, y) where x is the value of interest for X.
For a continuous random variable X: \begin{equation} f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y) \, dy \end{equation}
In this equation, $f_X(x)$ represents the probability density function (PDF) of X. It is obtained by integrating the joint density function $(f_{XY}(x, y))$ with respect to (y) over the entire range of possible (y) values.
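The continuous case can be checked numerically. A minimal sketch: take the illustrative joint density $f_{XY}(x, y) = x + y$ on the unit square (a valid density, since it integrates to 1), whose marginal is $f_X(x) = x + 1/2$ analytically:

```python
# Marginal density by integrating out y, per the equation above.
from scipy.integrate import quad

def joint_pdf(x, y):
    return x + y  # illustrative joint density on [0,1] x [0,1]

def marginal_x(x):
    # f_X(x) = integral over y of f_XY(x, y)
    val, _ = quad(lambda y: joint_pdf(x, y), 0.0, 1.0)
    return val

print(marginal_x(0.3))       # ~0.8, matching x + 1/2
total, _ = quad(marginal_x, 0.0, 1.0)
print(total)                 # ~1.0: the marginal is itself a density
```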
Exercise 11 Understanding Marginal Probability Distribution
¶
Problem Statement
¶
A company conducted a survey among its customers to understand their preferences for two products: Product A and Product B. The survey also recorded the age group of the respondents: "Young" (below 30) and "Old" (30 and above).
Given the joint distribution of age group and product preference, we want to find the marginal probabilities for each product and each age group.
Objective:
Understand and visualize the Marginal Probability Distribution for the given
scenario.
Step 1: Importing Necessary Libraries
¶
In [31]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Step 2: Generating the Dataset
¶
We'll create a dataset representing the joint distribution of age group and product preference.
In [32]:
# Sample data representing joint distribution
data = {
'Product': ['Product A', 'Product A', 'Product B', 'Product B'],
'Age Group': ['Young', 'Old', 'Young', 'Old'],
'Count': [120, 80, 100, 150]
}
df = pd.DataFrame(data)
df
Out[32]:
     Product Age Group  Count
0  Product A     Young    120
1  Product A       Old     80
2  Product B     Young    100
3  Product B       Old    150
Step 3: Calculating Marginal Probabilities
¶
Given the joint distribution, we'll calculate the marginal probabilities for each product and each age group.
In [33]:
# Marginal probability for each product
marginal_product = df.groupby('Product')['Count'].sum() / df['Count'].sum()
# Marginal probability for each age group
marginal_age = df.groupby('Age Group')['Count'].sum() / df['Count'].sum()
marginal_product, marginal_age
Out[33]:
(Product
 Product A    0.444444
 Product B    0.555556
 Name: Count, dtype: float64,
 Age Group
 Old      0.511111
 Young    0.488889
 Name: Count, dtype: float64)
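As a cross-check, the same marginals fall out of `pd.crosstab` with `normalize='all'`, which builds the full joint probability table first; its row and column sums are the marginals. This is a sketch that rebuilds the survey table from the exercise so it runs on its own.

```python
import pandas as pd

# Rebuild the survey data from the exercise above.
data = {
    'Product': ['Product A', 'Product A', 'Product B', 'Product B'],
    'Age Group': ['Young', 'Old', 'Young', 'Old'],
    'Count': [120, 80, 100, 150],
}
df = pd.DataFrame(data)

# Joint probability table: each cell is Count / total Count.
table = pd.crosstab(df['Product'], df['Age Group'], values=df['Count'],
                    aggfunc='sum', normalize='all')

# Row sums give the product marginals, column sums give the age-group marginals.
print(table.sum(axis=1))   # Product A ~0.444, Product B ~0.556
print(table.sum(axis=0))   # Old ~0.511, Young ~0.489
```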
Step 4: Visualization
¶
We'll visualize the marginal probabilities to better understand the distribution.
In [34]:
# Plotting the marginal probabilities
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(15, 6))
# For products
marginal_product.plot(kind='bar', ax=axes[0], color='lightblue', edgecolor='black')
axes[0].set_title('Marginal Probability Distribution for Products')
axes[0].set_ylabel('Probability')
axes[0].set_xlabel('Product')
# For age groups
marginal_age.plot(kind='bar', ax=axes[1], color='lightgreen', edgecolor='black')
axes[1].set_title('Marginal Probability Distribution for Age Groups')
axes[1].set_ylabel('Probability')
axes[1].set_xlabel('Age Group')
plt.tight_layout()
plt.show()
Interpretation
¶
From the visualization:
• The first bar chart shows the marginal probabilities for each product. This gives us the overall likelihood of a customer preferring each product, irrespective of their age group.
• The second bar chart shows the marginal probabilities for each age group. This provides the overall distribution of age groups among the respondents, irrespective of their product preference.
Conclusion
¶
Marginal Probability Distribution provides a way to understand the probabilities of individual events by summing or averaging out the other events. Through this exercise, we've visualized the marginal probabilities for product preferences and age groups, reinforcing the concept of how probabilities are distributed for individual events.
Joint Density Function
¶
It describes the probability distribution of two or more continuous random variables, typically denoted as X and Y, simultaneously taking on specific values.
The joint density function is primarily used when dealing with continuous random variables. These are random variables that can take on an uncountably infinite number of values within a certain range.
The joint density function $(f_{XY}(x, y))$ for two continuous random variables X and Y is defined as follows:
• It satisfies two key properties:
1. Non-negativity: $f_{XY}(x, y) \geq 0$ for all pairs of values $(x, y)$. This ensures that probabilities are always non-negative.
2. Total Probability: $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y) \, dx \, dy = 1$
Joint density functions are used in various statistical analyses, including:
• Probability Calculation: You can use $f_{XY}(x, y)$ to calculate probabilities associated with events involving both $X$ and $Y$.
• Expected Values: They are useful for calculating expected values (means) and variances of functions of $X$ and $Y$.
• Correlation and Covariance: Joint density functions are essential for understanding the correlation and covariance between $X$ and $Y$.
Exercise 12 Understanding Joint Density Function
¶
Problem Statement
¶
A company is analyzing the relationship between the ages and monthly expenditures of its customers. The age (in years) and monthly expenditure (in dollars) are continuous random variables.
Given a dataset of ages and expenditures, we want to understand the joint density of these two variables.
Objective:
Understand and visualize the Joint Density Function for the given scenario.
Step 1: Importing Necessary Libraries
¶
In [35]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Step 2: Generating the Dataset
¶
We'll create a dataset representing the ages and monthly expenditures of 1000 customers.
In [36]:
np.random.seed(0)
# Generating sample data
ages = np.random.normal(35, 10, 1000).astype(int)
expenditures = np.random.normal(500, 100, 1000) + (ages - 35) * 5
df = pd.DataFrame({'Age': ages, 'Expenditure': expenditures})
df.head()
Out[36]:
   Age  Expenditure
0   52   640.596268
1   39   609.247389
2   44   502.768518
3   57   620.471403
4   53   612.805333
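As a complement to the KDE used in the next step, a 2-D histogram gives a coarse, assumption-light empirical estimate of the joint density. This sketch regenerates the same simulated data and checks that the estimated density integrates to 1, as the Total Probability property requires.

```python
import numpy as np

# Regenerate the same simulated data as above.
np.random.seed(0)
ages = np.random.normal(35, 10, 1000).astype(int)
expenditures = np.random.normal(500, 100, 1000) + (ages - 35) * 5

# density=True rescales bin counts so the histogram integrates to 1 over the
# (age, expenditure) plane, i.e. it approximates f_XY(x, y) on a grid.
hist, age_edges, exp_edges = np.histogram2d(ages, expenditures,
                                            bins=[10, 10], density=True)

# Riemann-sum check: density times cell area should total 1.
cell_area = np.outer(np.diff(age_edges), np.diff(exp_edges))
print((hist * cell_area).sum())  # 1.0 up to floating-point error
```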
Step 3: Estimating Joint Density
¶
We'll use kernel density estimation (KDE) to estimate the joint density of age and expenditure.
In [37]:
# Estimating joint density using KDE
sns.jointplot(x='Age', y='Expenditure', data=df, kind='kde', cmap='Blues', fill=True)
plt.suptitle('Joint Density Estimation of Age and Expenditure', y=1.02)
plt.show()
Interpretation
¶
From the visualization:
• The plot shows the joint density of age and expenditure.
• The darker regions indicate higher density, meaning many customers fall into those age and expenditure brackets.
• We can observe a trend where as age increases, the expenditure also tends to
increase. This might be due to older customers having more disposable income or different purchasing habits.
Conclusion
¶
The Joint Density Function provides a way to understand the relationship between two continuous random variables. Through this exercise, we've visualized the joint density of age and expenditure, gaining insights into how these two variables are related in the dataset.
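For a known parametric joint density, marginalization can be verified numerically. This sketch (assuming SciPy is available) uses a bivariate normal whose parameters are derived from the simulation above: Var(age) = 10², Cov(age, spend) = 5 · Var(age) = 500, Var(spend) = 100² + 5² · Var(age) = 12500. Integrating the joint pdf over expenditure at a fixed age recovers the marginal N(35, 10²) density there.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

# Parametric analogue of the age/expenditure model above.
joint = multivariate_normal(mean=[35.0, 500.0],
                            cov=[[100.0, 500.0],
                                 [500.0, 12500.0]])

# Riemann-sum approximation of the integral of f_XY(40, y) over y.
ys = np.linspace(0.0, 1000.0, 4001)
dy = ys[1] - ys[0]
pts = np.column_stack([np.full_like(ys, 40.0), ys])
marginal_at_40 = joint.pdf(pts).sum() * dy

# Should closely match the marginal normal density f_X(40).
print(marginal_at_40, norm(35, 10).pdf(40))
```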
Variance of Random Variable
¶
Variance is a statistical measure that quantifies the degree of spread or dispersion in the values of a random variable. It provides insight into how much individual data points deviate from the expected or average value. In the context of a random variable, variance helps us understand the variability or uncertainty associated with its
possible outcomes.
Let $X$ be a random variable with mean $\mu$. The variance of $X$ -- denoted by $\sigma^2$ or $\sigma_X^2$ or $\mathbb{V}(X)$ or $\mathbb{V}X$ -- is defined by
$$ \sigma^2 = \mathbb{E}(X - \mu)^2 = \int (x - \mu)^2\; dF(x) $$
assuming this expectation exists. The standard deviation is $\text{sd}(X) = \sqrt{\mathbb{V}(X)}$ and is also denoted by $\sigma$ and $\sigma_X$.
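The definition can be applied directly to a small discrete distribution. For a fair six-sided die, each face is equally likely, so $\mathbb{E}(X - \mu)^2$ reduces to an ordinary average of squared deviations:

```python
import numpy as np

# Var(X) = E[(X - mu)^2] for a fair six-sided die.
faces = np.arange(1, 7)
mu = faces.mean()                      # 3.5
variance = ((faces - mu) ** 2).mean()  # 35/12, about 2.9167
print(mu, variance)
```

This theoretical value is what the simulation in the next exercise should approach.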
Exercise 13 Understanding Variance through Dice Rolling
¶
Problem Statement
¶
Imagine you have a standard six-sided die. We want to understand the variance in the outcomes when rolling this die.
Objective:
Simulate the rolling of a six-sided die 10,000 times, visualize the outcomes, and calculate the variance.
Step 1: Importing Necessary Libraries
¶
In [38]:
import numpy as np
import matplotlib.pyplot as plt
Step 2: Simulating Dice Rolls
¶
We'll simulate rolling the die 10,000 times and store the outcomes.
In [39]:
np.random.seed(0)
# Number of simulations
n_simulations = 10000
# Simulating dice rolls
rolls = np.random.choice([1, 2, 3, 4, 5, 6], n_simulations)
Step 3: Visualization
¶
We'll visualize the outcomes to understand the distribution of dice rolls.
In [40]:
# Plotting the outcomes
plt.figure(figsize=(10, 6))
plt.hist(rolls, bins=np.arange(1, 8) - 0.5, edgecolor='black', alpha=0.7, align='mid')
plt.xticks([1, 2, 3, 4, 5, 6])
plt.xlabel('Dice Face')
plt.ylabel('Frequency')
plt.title('Distribution of 10,000 Dice Rolls')
plt.grid(axis='y')
plt.show()
Step 4: Calculating Variance
¶
Variance measures how far a set of numbers are spread out from their average value. We'll calculate the variance of our simulated dice rolls.
In [41]:
# Calculating variance
variance = np.var(rolls)
variance
Out[41]:
2.92216279
Interpretation
¶
From the visualization:
• The histogram shows the distribution of outcomes from 10,000 dice rolls.
• Since it's a fair die, each face has an approximately equal chance of landing, as reflected in the similar heights of the bars.
From the variance calculation:
• The variance gives us a measure of how spread out the outcomes are from the mean. For a fair six-sided die, the theoretical variance is $(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2)/6 - 3.5^2 = 35/12 \approx 2.9167$, so the simulated value of about 2.92 is exactly what we should expect.
Conclusion
¶
Through this exercise, we've simulated the rolling of a fair six-sided die, visualized the outcomes, and calculated the variance. This helps reinforce the concept of variance and how it measures the spread of data.
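One detail worth knowing: `np.var` defaults to `ddof=0`, the population variance (dividing by $n$). For an unbiased sample estimate, pass `ddof=1` (dividing by $n-1$); with 10,000 rolls the two differ only slightly. A sketch reproducing the simulation above:

```python
import numpy as np

np.random.seed(0)
rolls = np.random.choice([1, 2, 3, 4, 5, 6], 10000)

pop_var = np.var(rolls)             # divides by n (default ddof=0)
sample_var = np.var(rolls, ddof=1)  # divides by n - 1 (unbiased estimator)
print(pop_var, sample_var)          # nearly identical at this sample size
```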
Co-variance
¶
Covariance is a statistical measure that quantifies the degree to which two random variables change together. In simpler terms, it tells us whether two variables tend to increase or decrease at the same time.
If $X$ and $Y$ are random variables, then the covariance and correlation between $X$ and $Y$ measure how strong the linear relationship between $X$ and $Y$ is. Let $X$ and $Y$ be random variables with means $\mu_X$ and $\mu_Y$ and standard deviations $\sigma_X$ and $\sigma_Y$. Define the covariance between $X$ and $Y$ by
$$ \text{Cov}(X, Y) = \mathbb{E}[(X - \mu_X)(Y - \mu_Y)] $$
and the correlation by
$$ \rho = \rho_{X, Y} = \rho(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} $$
Exercise 14 Understanding Co-variance
¶
Problem Statement
¶
Imagine a scenario where we are studying the relationship between the hours studied by students and their scores in a particular exam.
Objective:
Generate a dataset representing hours studied and exam scores, visualize the relationship, and calculate the co-variance.
Step 1: Importing Necessary Libraries
¶
In [42]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Step 2: Generating the Dataset
¶
We'll create a dataset representing the hours studied and the corresponding exam scores of 100 students.
In [43]:
np.random.seed(0)
# Generating sample data
hours_studied = np.random.normal(5, 2, 100) # Hours studied: normally distributed, mean 5 hours, SD 2
exam_scores = 50 + 10 * hours_studied + np.random.normal(0, 5, 100) # Base score is 50, with 10 points for each hour studied
df = pd.DataFrame({'Hours_Studied': hours_studied, 'Exam_Scores': exam_scores})
df.head()
Out[43]:
   Hours_Studied  Exam_Scores
0       8.528105   144.696800
1       5.800314   101.264349
2       6.957476   113.222335
3       9.481786   149.664848
4       8.735116   131.485543
Step 3: Visualization
¶
We'll visualize the relationship between hours studied and exam scores to get an initial
understanding.
In [44]:
# Scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(df['Hours_Studied'], df['Exam_Scores'], alpha=0.6)
plt.title('Relationship between Hours Studied and Exam Scores')
plt.xlabel('Hours Studied')
plt.ylabel('Exam Scores')
plt.grid(True)
plt.show()
Step 4: Calculating Co-variance
¶
Co-variance measures the joint variability of two random variables. We'll calculate the co-variance between hours studied and exam scores.
In [45]:
# Calculating co-variance
covariance_matrix = np.cov(df['Hours_Studied'], df['Exam_Scores'])
covariance = covariance_matrix[0, 1]
covariance
Out[45]:
42.22040604887268
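The off-diagonal entry of `np.cov` can be reproduced by hand from the definition. Note that `np.cov` defaults to `ddof=1`, so the manual formula must divide by $n-1$ to match. A sketch regenerating the same data:

```python
import numpy as np

np.random.seed(0)
hours_studied = np.random.normal(5, 2, 100)
exam_scores = 50 + 10 * hours_studied + np.random.normal(0, 5, 100)

# Sample covariance: mean product of deviations, with the n-1 denominator
# that np.cov uses by default.
x, y = hours_studied, exam_scores
manual_cov = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)
print(manual_cov)  # matches np.cov(x, y)[0, 1]
```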
Interpretation
¶
From the visualization:
• The scatter plot shows a positive relationship between hours studied and exam scores. As the hours studied increase, the exam scores tend to increase as well.
From the co-variance calculation:
• A positive co-variance value indicates that the two variables move in the same direction. In our case, as hours studied increases, exam scores also tend to increase.
• The magnitude of the co-variance gives us an idea of the strength of this relationship, though it's not bounded like correlation.
Conclusion
¶
Through this exercise, we've generated a dataset, visualized the relationship between hours studied and exam scores, and calculated the co-variance. This helps reinforce the concept of co-variance and its role in understanding the relationship between two variables.
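The link to the next section is direct: correlation is just covariance rescaled by both standard deviations, which bounds it to $[-1, 1]$. A sketch reusing the same simulated data:

```python
import numpy as np

np.random.seed(0)
hours = np.random.normal(5, 2, 100)
scores = 50 + 10 * hours + np.random.normal(0, 5, 100)

# rho = Cov(X, Y) / (sigma_X * sigma_Y); ddof=1 throughout for consistency.
cov = np.cov(hours, scores)[0, 1]
rho = cov / (np.std(hours, ddof=1) * np.std(scores, ddof=1))
print(rho)  # equals np.corrcoef(hours, scores)[0, 1]
```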
Correlation
¶
Correlation is a statistical measure that quantifies the strength and direction of the linear relationship between two variables. In simpler terms, it tells us how closely two variables are related and whether they tend to move together in a predictable way.
The formula for the (Pearson) correlation coefficient is
\begin{equation} r = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2 \sum_{i=1}^{n}(Y_i - \bar{Y})^2}} \end{equation}
Exercise 15 Understanding Correlation
¶
Problem Statement
¶
Imagine a scenario where we are analyzing the relationship between the daily exercise
duration and the corresponding energy levels of individuals.
Objective:
Generate a dataset representing daily exercise duration and energy levels,
visualize the relationship, and calculate the correlation coefficient.
Step 1: Importing Necessary Libraries
¶
In [46]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Step 2: Generating the Dataset
¶
We'll create a dataset representing the daily exercise duration (in hours) and the corresponding energy levels (on a scale of 1 to 10) of 200 individuals.
In [47]:
np.random.seed(0)
# Generating sample data
exercise_duration = np.random.normal(1.5, 0.5, 200) # Exercise duration: normally distributed, mean 1.5 hours, SD 0.5
energy_levels = 5 + 2 * exercise_duration + np.random.normal(0, 1, 200) # Base energy level is 5, with 2 points for each hour of exercise
df = pd.DataFrame({'Exercise_Duration': exercise_duration, 'Energy_Levels': energy_levels})
df.head()
Out[47]:
   Exercise_Duration  Energy_Levels
0           2.382026       9.394871
1           1.700079       8.160778
2           1.989369      10.078398
3           2.620447      10.896157
4           2.433779      10.507690
Step 3: Visualization
¶
We'll visualize the relationship between exercise duration and energy levels to get an initial understanding.
In [48]:
# Scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(df['Exercise_Duration'], df['Energy_Levels'], alpha=0.6, color='purple')
plt.title('Relationship between Exercise Duration and Energy Levels')
plt.xlabel('Exercise Duration (hours)')
plt.ylabel('Energy Levels (1-10 scale)')
plt.grid(True)
plt.show()
Step 4: Calculating Correlation
¶
Correlation measures the strength and direction of a linear relationship between two variables. We'll calculate the correlation coefficient between exercise duration and energy levels.
In [49]:
# Calculating correlation
correlation = df['Exercise_Duration'].corr(df['Energy_Levels'])
correlation
Out[49]:
0.7579114556127929
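The `Series.corr` result above can be reproduced directly from the summation formula for Pearson's $r$. A sketch regenerating the same data:

```python
import numpy as np

np.random.seed(0)
duration = np.random.normal(1.5, 0.5, 200)
energy = 5 + 2 * duration + np.random.normal(0, 1, 200)

# Pearson's r computed term-by-term from the formula:
# sum of deviation products over the product of root sums of squares.
dx = duration - duration.mean()
dy = energy - energy.mean()
r = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())
print(r)  # same value as df['Exercise_Duration'].corr(df['Energy_Levels'])
```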
Interpretation
¶
From the visualization:
• The scatter plot shows a clear positive relationship between exercise duration and energy levels. As the duration of exercise increases, the energy levels also seem to rise.
From the correlation calculation:
• A correlation value close to 1 indicates a strong positive linear relationship. In our case, the positive value suggests that as exercise duration increases, energy levels also tend to increase.
• The magnitude of the correlation coefficient gives us an idea of the strength of this linear relationship.
Conclusion
¶
Through this exercise, we've generated a dataset, visualized the relationship between
exercise duration and energy levels, and calculated the correlation coefficient. This helps reinforce the concept of correlation and its role in understanding the linear relationship between two variables.
Causation
¶
Causation, also known as cause and effect, refers to the relationship between two events where one event (the cause) brings about another event (the effect). In other words, causation implies that a change in one variable is responsible for a change in another. This is a stronger statement than correlation, which merely indicates that two
variables change together.
Key Points:
¶
1. Directionality: Causation indicates a direction. If A causes B, then changes in A will lead to changes in B, but not necessarily the other way around.
2. Isolation: All other factors are held constant when considering causation. This means that it's only the changes in A causing changes in B, and not some other lurking variable.
3. Consistency: The cause always leads to the effect. If A causes B, then every time A happens, B will also happen (assuming all other conditions are the same).
Equations:
¶
While causation in its essence is a conceptual relationship, in many statistical methods, we try to quantify this relationship. For instance, in a simple linear regression:
$y = \beta_0 + \beta_1 x + \epsilon$
Here, $x$ is the independent variable (potential cause), $y$ is the dependent variable (effect), and $\beta_1$ measures the change in $y$ for a unit change in $x$. If $\beta_1$ is statistically significant, it suggests that changes in $x$ are associated with changes in $y$. However, this doesn't necessarily mean $x$ causes $y$. Establishing causation requires more rigorous experimental design and evidence.
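For a simple linear regression, the least-squares slope has a closed form, $\hat{\beta}_1 = \text{Cov}(x, y)/\text{Var}(x)$. A toy sketch (the data here are hypothetical, with a known true slope of 2) shows the estimate recovering the generating parameters:

```python
import numpy as np

# Simulated data where y really is generated from x: y = 3 + 2x + noise.
np.random.seed(1)
x = np.random.normal(0, 1, 500)
y = 3 + 2 * x + np.random.normal(0, 0.5, 500)

# Closed-form least-squares estimates for slope and intercept.
beta1_hat = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
beta0_hat = y.mean() - beta1_hat * x.mean()
print(beta0_hat, beta1_hat)  # close to the true values (3, 2)
```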
Remember:
¶
Correlation does not imply causation. Just because two variables are correlated does not mean that changes in one variable cause changes in another. There could be lurking variables or other reasons for the observed correlation.
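A lurking variable is easy to demonstrate in simulation. In this hypothetical sketch, a hidden common cause `z` drives both `x` and `y`; the two end up strongly correlated even though neither causes the other.

```python
import numpy as np

np.random.seed(42)
z = np.random.normal(0, 1, 1000)        # hidden common cause (lurking variable)
x = z + np.random.normal(0, 0.5, 1000)  # x depends only on z, not on y
y = z + np.random.normal(0, 0.5, 1000)  # y depends only on z, not on x

r = np.corrcoef(x, y)[0, 1]
print(r)  # strongly positive despite no causal link between x and y
```

Intervening on `x` here would change nothing about `y`, which is exactly the distinction between correlation and causation.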
Exercise 16 Understanding Causation
¶
Problem Statement
¶
Imagine a scenario where a health organization is analyzing the relationship between the consumption of a new health supplement and improvement in immune system strength.
Objective:
Generate a dataset representing the daily dosage of the supplement and the corresponding immune strength levels. Analyze if the supplement causes an improvement in the immune system.
Step 1: Importing Necessary Libraries
¶
In [50]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
Step 2: Generating the Dataset
¶
We'll create a dataset representing the daily dosage of the supplement (in mg) and the
corresponding immune strength levels (on a scale of 1 to 10) of 200 individuals.
In [51]:
np.random.seed(0)
# Generating sample data
daily_dosage = np.random.normal(50, 10, 200) # Daily dosage: normally distributed, mean 50 mg, SD 10
immune_strength = 5 + 0.05 * daily_dosage + np.random.normal(0, 0.5, 200) # Base strength is 5, with a slight increase for each mg of supplement
df = pd.DataFrame({'Daily_Dosage': daily_dosage, 'Immune_Strength': immune_strength})
df.head()
Out[51]:
   Daily_Dosage  Immune_Strength
0     67.640523         8.197435
1     54.001572         7.580389
2     59.787380         8.539199
3     72.408932         8.948078
4     68.675580         8.753845
Step 3: Visualization
¶
We'll visualize the relationship between daily dosage and immune strength to get an initial understanding.
In [52]:
# Scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(df['Daily_Dosage'], df['Immune_Strength'], alpha=0.6, color='green')
plt.title('Relationship between Daily Dosage and Immune Strength')
plt.xlabel('Daily Dosage (mg)')
plt.ylabel('Immune Strength (1-10 scale)')
plt.grid(True)
plt.show()
Step 4: Regression Analysis
¶
To understand if there's a causal relationship, we'll perform a simple linear regression. If the coefficient for daily dosage is statistically significant, it suggests a potential causal relationship.
In [53]:
# Adding a constant for the intercept term
X = sm.add_constant(df['Daily_Dosage'])
Y = df['Immune_Strength']
model = sm.OLS(Y, X).fit()
model.summary()
Out[53]:
OLS Regression Results
Dep. Variable:     Immune_Strength     R-squared:            0.574
Model:             OLS                 Adj. R-squared:       0.572
Method:            Least Squares       F-statistic:          267.3
Date:              Mon, 09 Oct 2023    Prob (F-statistic):   1.38e-38
Time:              16:23:33            Log-Likelihood:       -132.90
No. Observations:  200                 AIC:                  269.8
Df Residuals:      198                 BIC:                  276.4
Df Model:          1
Covariance Type:   nonrobust

                 coef   std err        t    P>|t|   [0.025   0.975]
const          4.7590     0.169   28.117    0.000    4.425    5.093
Daily_Dosage   0.0535     0.003   16.348    0.000    0.047    0.060

Omnibus:         0.111    Durbin-Watson:     2.249
Prob(Omnibus):   0.946    Jarque-Bera (JB):  0.008
Skew:           -0.004    Prob(JB):          0.996
Kurtosis:        3.031    Cond. No.          262.
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Interpretation
¶
From the visualization:
• The scatter plot shows a positive relationship between daily dosage and immune strength.
From the regression analysis:
• The coefficient for daily dosage indicates the change in immune strength for each additional mg of the supplement.
• If the p-value for the coefficient is less than 0.05, it suggests that the relationship is statistically significant.
• The regression analysis indicates a statistically significant positive relationship between Daily_Dosage of the supplement and Immune_Strength. For each additional mg of the supplement, the Immune_Strength increases by approximately 0.0535. The model accounts for about 57.4% of the variability in immune strength (R-squared of 0.574). The residuals of the model appear to be normally distributed, and there's no evidence of autocorrelation, suggesting the model's assumptions are met.
Conclusion
¶
Through this exercise, we've generated a dataset, visualized the relationship between daily dosage and immune strength, and performed regression analysis. This helps
reinforce the concept of causation and the importance of experimental design in establishing causal relationships.
Revised Date: October 9, 2023
In [ ]: