f. Which measure of center (mean or median) is more appropriate for these data? Why? Consider the shape of the distribution discussed in part c. g. Calculate the standard deviation of the annual income data. h. Calculate the interquartile range of the annual income data.

f.
Which measure of center (mean or median) is more appropriate for these data?
Why? Consider the shape of the distribution discussed in part c.
g.
Calculate the standard deviation of the annual income data.
h.
Calculate the interquartile range of the annual income data.
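A minimal R sketch of parts g and h, using base R's `sd()` and `IQR()`. The eight income values are made up for illustration and only stand in for the real file; in the assignment you would presumably load the data with something like `read.csv("loan50.csv")` instead.

```r
# Hypothetical stand-in for the real data set (values are invented).
# In the assignment: loans <- read.csv("loan50.csv")
loans <- data.frame(
  annual_income = c(32000, 45000, 51000, 60000, 68000, 75000, 90000, 250000)
)

# Part g: sample standard deviation (divides by n - 1)
income_sd <- sd(loans$annual_income)

# Part h: interquartile range, Q3 - Q1, using R's default quantile method
income_iqr <- IQR(loans$annual_income)

income_sd
income_iqr
```

In this invented sample the single very large income inflates the standard deviation far more than the interquartile range, which is the kind of sensitivity to skew and outliers that part f asks you to weigh when choosing between the mean and the median.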
Visualizing Two Variables
Let's continue to explore the annual income data, but now consider how annual income
may vary by loan status (current or fully paid).
i.
Construct a side-by-side boxplot for annual income broken up by loan status. Include
informative labels and a title.
j.
How do the distributions of annual income compare across loan status? Comment on
the shape, center, spread, and presence of outliers for the two groups.
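One way to sketch the side-by-side boxplot is base R's formula interface to `boxplot()`. The data frame below is invented for illustration, and the column name `loan_status` and the category labels "Current" and "Fully Paid" are assumptions about how the real file is coded; check them against your copy of loan50.csv.

```r
# Hypothetical stand-in for the real data set (values are invented).
# In the assignment: loans <- read.csv("loan50.csv")
loans <- data.frame(
  annual_income = c(32000, 45000, 51000, 60000, 68000, 75000, 90000, 250000),
  loan_status   = c("Current", "Fully Paid", "Current", "Current",
                    "Fully Paid", "Current", "Fully Paid", "Current")
)

# One box per loan status; the formula reads "income split by status"
bp <- boxplot(annual_income ~ loan_status, data = loans,
              main = "Annual Income by Loan Status",
              xlab = "Loan status",
              ylab = "Annual income")

# Group medians, useful when comparing centers in writing
group_medians <- tapply(loans$annual_income, loans$loan_status, median)
group_medians
```

The object returned by `boxplot()` also stores the group names and sizes (`bp$names`, `bp$n`) and any points drawn as outliers (`bp$out`), which can help when describing each group.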
Exploring a Single Categorical Variable
Finally, we'll focus our attention only on the loan status variable.
k.
Construct a table of counts for the loan status variable. Report the number of
observations in each category below.
l.
Construct a table of proportions for the loan status variable. Report the proportions
for each category below.
m.
Construct a barplot that displays the distribution of loan status types. Include
informative labels and a title. Include your barplot below.
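The counts, proportions, and barplot can be sketched with base R's `table()`, `prop.table()`, and `barplot()`. The eight status values below are invented, and the column name `loan_status` is an assumption about the real file.

```r
# Hypothetical stand-in for the real data set (values are invented).
# In the assignment: loans <- read.csv("loan50.csv")
loans <- data.frame(
  loan_status = c("Current", "Fully Paid", "Current", "Current",
                  "Fully Paid", "Current", "Fully Paid", "Current")
)

# Table of counts: number of observations in each category
status_counts <- table(loans$loan_status)

# Table of proportions: each count divided by the total
status_props <- prop.table(status_counts)

# Barplot of the counts with a title and axis labels
barplot(status_counts,
        main = "Distribution of Loan Status",
        xlab = "Loan status",
        ylab = "Number of loans")

status_counts
status_props
```

Passing `status_props` to `barplot()` instead would put proportions on the vertical axis; the shape of the plot is the same either way.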
Part 2 of the Introduction to RStudio & Tutorial uses two variables from the Lending Club data:
loan amount and homeownership. For the assignment you'll submit, you will practice using
two different variables. Please make sure the assignment you submit uses the correct variables
(specified in the questions below).
Exploring a Single Quantitative Variable
For this portion of the assignment, you'll practice using R to explore the annual_income
variable in the loan50.csv data set.
a.
Construct a histogram of the annual income data. Include informative labels and a
title. Include your histogram below.
b.
Construct a boxplot of the annual income data. Include informative labels and a
title. Include your boxplot below.
c.
Using the histogram you constructed in part a and the boxplot from part b, describe
the shape of the distribution of the annual income variable and comment on the presence of
any outliers.
d.
Calculate the mean of the annual income data.
e.
Calculate the median of the annual income data.
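A minimal R sketch of parts a, b, d, and e using base graphics and base summary functions. The `annual_income` column name comes from the assignment; the eight values, the titles, and the axis labels are made up for illustration, and the real data would presumably be loaded with `read.csv("loan50.csv")`.

```r
# Hypothetical stand-in for the real data set (values are invented).
# In the assignment: loans <- read.csv("loan50.csv")
loans <- data.frame(
  annual_income = c(32000, 45000, 51000, 60000, 68000, 75000, 90000, 250000)
)

# Part a: histogram with a title and axis labels
hist(loans$annual_income,
     main = "Distribution of Annual Income",
     xlab = "Annual income",
     ylab = "Number of borrowers")

# Part b: boxplot of the same variable
boxplot(loans$annual_income,
        main = "Boxplot of Annual Income",
        ylab = "Annual income")

# Parts d and e: mean and median
income_mean   <- mean(loans$annual_income)
income_median <- median(loans$annual_income)

income_mean
income_median
```

If the real column contains missing values, `mean()` and `median()` return `NA` unless you pass `na.rm = TRUE`.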