Lab 2 Instructions
docx
keyboard_arrow_up
School
Concordia University *
*We aren’t endorsed by this school
Course
213
Subject
Communications
Date
Feb 20, 2024
Type
docx
Pages
4
Uploaded by PresidentFogReindeer35
COMM213 - Computing and Visualization
Tools for Business Analytic
Lab 2
Statistical tools for sampling
Descriptive Statistics in Excel
Descriptive Statistics in Tableau
In this lab, we are going to cover more data manipulation techniques and review the basic
descriptive analysis in Excel and Tableau. Use the Loan_clean.xlsx dataset to answer the
following questions. Below are the topics that will be covered in this lab session.
Statistical tools for sampling ●
Random sampling in excel
Q1. Using the rand() function, create a random sample of size 100
and copy it into a
new sheet. Name the new sheet Random_Sample.
●
Strata sampling in excel
Q2. Using the rand() function, create a random sample of size 100 which is
representative of proportions in the “term” variable. Copy the resulting sample into a new
sheet named Strata_Sample.
Descriptive Statistics in Excel
●
Measures of central tendency
●
Measures of dispersion ●
Pivot tables ●
Frequency distribution
●
Histograms
●
Box plots
Q3. Using Excel functions and Data Analysis ToolPak: a.
Fill the following table with required descriptive statistics for the “int_rate”. (You may
obtain different results when from mine examining random sample data, as your specific
random sample could differ.).
b.
Specify the type of each statistic/parameter.
Statistic/Parameter
Type
Function
Population
Random Sample
(
Your Answer may differ
from the one on video
)
Mean
Median
1
Mode
Kurtosis
Skewness
Minimum
Maximum
Standard Deviation
Variance
Range
IQR
Sum
Count
Pivot table: A PivotTable is a powerful tool to calculate, summarize, and analyze data that lets
you see comparisons, patterns, and trends in your data.
In the population dataset:
Q4. Using Excel pivot table, answer the following questions:
a.
What is the average int_rate for different loan terms?
36 months: 60 months: b.
What is the total loan amount for Home improvement that is fully paid by the borrowers? c.
What is the average loan_amount of the loans issued in July 2018 and are fully paid? d.
What is the maximum loan amount with a 60 months term that has been borrowed for
home improvement? Q5. Frequency distribution table: Using the pivot table, answer the following questions:
a.
Which loan title has been borrowed more frequently? b.
How many fully paid loans have an interest rate less than 10.31? c.
Create a histogram of interest rate distributions. Use a bin size of 3.
d.
In the loans with Current status, which interest rate range is more frequent? 2
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help
e.
Create the cross tabulation table of “initial_list_status” versus “term”, what percentage of
60 month loans were initially listed as whole? Q6. Compare interest rate distribution for various initial_list_status using a boxplot.
a.
Is there any difference in the interest rate for whole and fractional initial_list_status?
b.
Is there any outlier in interest rate values? If so, which loan id(s) are outlier?
Descriptive Statistics in Tableau
●
Summary Card
●
Histograms
A histogram is a chart that displays the shape of a distribution. A histogram looks like a
bar chart but groups values for a continuous measure into ranges, or bins.
●
Box plots
Using the original Loan_clean.xlsx:
Q7. Create a summary card showing descriptive statistics for interest rate same as Q3.
Q8.
Create separate histograms of interest rates for each loan term on one sheet. Q9.
Answer Q6 using Tableau. Filter your boxplots with loan terms.
3