Load the tips dataset from Seaborn

The Seaborn "Tips" dataset contains information about restaurant bills, tips, and customer demographics.

Here are the column descriptions of the Tips dataset:
  - total_bill: Meal cost.
  - tip: Tip amount.
  - sex: Payer gender.
  - smoker: Smoker (yes/no).
  - day: Day of week.
  - time: Lunch/dinner.
  - size: Party size.

[1]
    import seaborn as sns
    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np

    tips = sns.load_dataset("tips")
    display(tips.head(3))
    print(tips.info())
    print("\nDataset Description: ")
    print(tips.describe())

Output:

       total_bill   tip     sex smoker  day    time  size
    0       16.99  1.01  Female     No  Sun  Dinner     2
    1       10.34  1.66    Male     No  Sun  Dinner     3
    2       21.01  3.50    Male     No  Sun  Dinner     3

    RangeIndex: 244 entries, 0 to 243
    Data columns (total 7 columns):
     #   Column      Non-Null Count  Dtype
     0   total_bill  244 non-null    float64
     1   tip         244 non-null    float64
     2   sex         244 non-null    category
     3   smoker      244 non-null    category
     4   day         244 non-null    category
     5   time        244 non-null    category
     6   size        244 non-null    int64
    dtypes: category(4), float64(2), int64(1)
    memory usage: 7.4 KB
    None

    Dataset Description:
           total_bill         tip        size
    count  244.000000  244.000000  244.000000
    mean    19.785943    2.998279    2.569672
    std      8.902412    1.383638    0.951100
    min      3.070000    1.000000    1.000000
    25%     13.347500    2.000000    2.000000
    50%     17.795000    2.900000    2.000000
    75%     24.127500    3.562500    3.000000
    max     50.810000   10.000000    6.000000

Question 1: Exploratory Data Analysis

  - Create histograms and box plots for the total_bill and tip columns (four visuals in total).
  - What can we say about the distribution of each variable? (e.g., is it roughly normal, or skewed?)

[ ]
    plt.figure(figsize=(12, 6))
    plt.subplot(2, 2, 1)
    # Code goes here

[Figure: 2x2 grid with panels "Histogram of Total Bill", "Box Plot of Total Bill", "Histogram of Tip", "Box Plot of Tip"; x-axes total_bill and tip, histogram y-axis Count.]

Question 2: Count Plots

  - Create count plots for the sex, smoker, and day columns. Interpret the results in a few words: which group is larger?

[ ]
    plt.figure(figsize=(15, 5))
    plt.subplot(1, 4, 1)
    # Code goes here

[Figure: row of four panels, "Count Plot of Sex" (Male, Female), "Count Plot of Smoker" (Yes, No), "Count Plot of Day" (Thur, Fri, Sat, Sun), "Count Plot of Meal" (Lunch, Dinner); x-axes sex, smoker, day, time.]
