
Question

Need help with the machine learning question

Load the tips dataset from Seaborn
• The Seaborn "Tips" dataset contains information about restaurant bills, tips, and customer demographics.
• Here are the column descriptions of the Tips dataset:
  ◦ total_bill: Meal cost.
  ◦ tip: Tip amount.
  ◦ sex: Payer gender.
  ◦ smoker: Smoker (yes/no).
  ◦ day: Day of week.
  ◦ time: Lunch/dinner.
  ◦ size: Party size.
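import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the example "tips" dataset that ships with seaborn
tips = sns.load_dataset("tips")
display(tips.head(3))  # display() is provided by the notebook (IPython) environment
print(tips.info())  # info() prints its own report and returns None, hence the "None" in the output
print("\nDataset Description: ")
print(tips.describe())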
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype
 0   total_bill  244 non-null    float64
 1   tip         244 non-null    float64
 2   sex         244 non-null    category
 3   smoker      244 non-null    category
 4   day         244 non-null    category
 5   time        244 non-null    category
 6   size        244 non-null    int64
dtypes: category(4), float64(2), int64(1)
memory usage: 7.4 KB
None
Dataset Description:
       total_bill         tip        size
count  244.000000  244.000000  244.000000
mean    19.785943    2.998279    2.569672
std      8.902412    1.383638    0.951100
min      3.070000    1.000000    1.000000
25%     13.347500    2.000000    2.000000
50%     17.795000    2.900000    2.000000
75%     24.127500    3.562500    3.000000
max     50.810000   10.000000    6.000000
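The summary statistics already hint at the shape of the two numeric columns: for both total_bill and tip the mean is above the median (the 50% row), and the maximum sits far above the 75% quartile. The sketch below makes that comparison explicit. The helper name skew_summary is hypothetical (it is not part of the notebook), and it relies on pandas' built-in sample skewness, where a positive value points to a longer right tail.

```python
import pandas as pd


def skew_summary(df: pd.DataFrame, columns=("total_bill", "tip")) -> pd.DataFrame:
    """Compare mean and median and report sample skewness per column.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    cols = list(columns)
    summary = pd.DataFrame(
        {
            "mean": df[cols].mean(),
            "median": df[cols].median(),
            "skew": df[cols].skew(),  # > 0 suggests a right (positive) skew
        }
    )
    # A mean above the median is a common informal sign of right skew
    summary["mean_above_median"] = summary["mean"] > summary["median"]
    return summary
```

In the notebook, print(skew_summary(tips)) would show one row per column, which can back up the visual reading of the histograms and box plots.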
Question 1: Exploratory Data Analysis
• Create histograms and box plots for the total_bill and tip columns (four visuals in total).
• What can we say about the distribution of each variable (e.g., approximately normal or skewed)?
plt.figure(figsize=(12, 6))
plt.subplot(2, 2, 1)
# Code goes here
[Figure: expected output for Question 1, a 2 x 2 grid of plots. Top row: "Histogram of Total Bill" (y-axis: Count) and "Box Plot of Total Bill", both with x-axis total_bill. Bottom row: "Histogram of Tip" and "Box Plot of Tip", both with x-axis tip.]
Question 2: Count Plots
• Create count plots for the sex, smoker, and day columns.
• Interpret the results in a few words: which group is larger?
plt.figure(figsize=(15, 5))
plt.subplot(1, 4, 1)
# Code goes here
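A minimal sketch of one way to complete this cell follows; it is not the graded solution. The names plot_counts and largest_group are hypothetical. The task text lists three columns, but the starter code reserves a 1 x 4 grid, so the default below also includes time as a fourth panel; that default is an assumption, and the titles are derived from the column names.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_counts(df: pd.DataFrame, columns=("sex", "smoker", "day", "time")):
    """Draw one seaborn count plot per categorical column, side by side.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    fig, axes = plt.subplots(1, len(columns), figsize=(15, 5), squeeze=False)
    for ax, col in zip(axes[0], columns):
        sns.countplot(data=df, x=col, ax=ax)
        ax.set_title(f"Count Plot of {col.title()}")
    fig.tight_layout()
    return fig, axes


def largest_group(df: pd.DataFrame, column: str):
    """Return (category, count) for the most frequent value in a column."""
    counts = df[column].value_counts()
    return counts.idxmax(), int(counts.max())
```

In the notebook, plot_counts(tips) followed by plt.show() draws the panels, and largest_group(tips, "sex") (likewise for "smoker" and "day") answers the "which group is larger?" part with a number rather than an estimate read off the bars.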
[Figure: expected output for Question 2, four count plots side by side: "Count Plot of Sex" (x-axis sex: Male, Female), "Count Plot of Smoker" (x-axis smoker: Yes, No), "Count Plot of Day" (x-axis day: Thur, Fri, Sat, Sun), and "Count Plot of Meal" (x-axis time: Lunch, Dinner).]