Quiz L04 Submit - HW 4

School: Pennsylvania State University
Course: 365
Subject: Industrial Engineering
Date: Feb 20, 2024
Type: pdf
Pages: 7
Uploaded by Jordanmchampion

L04: Submit - HW 4
Started: Feb 7 at 1:27pm
Quiz: L04: Submit - HW 4 https://psu.instructure.com/courses/2313011/quizzes/4976401/take 2/7/2024, 1:29 PM

Quiz Instructions

Please be sure to read L04 before attempting this assignment.

The widely used Statlog German credit dataset (click here: https://psu.instructure.com/courses/2313011/files/158251342?wrap=1; download: https://psu.instructure.com/courses/2313011/files/158251342/download?download_frd=1), named 'South German Credit' data, contains some important features (i.e., variables) related to a debtor, which can be used to predict whether there is a credit risk once the debtor has applied for a loan from the bank.

ID#: Users’ ID
duration: Credit duration in months (in the range of 0 and 1200)
purpose: Purpose for which the credit is needed (Ten levels: furniture, car(used), car(new), retraining, repairs, domestic appliances, business, television, vacation, others)
employment_duration: Duration of debtor's employment with the current employer (Five levels: unemployed, less than 1 year, 1-4 years, 4-7 years, more than 7 yrs)
age: Age expressed in years (in the range of 18-75)
housing: Type of housing the debtor lives in (Three levels: rent, own, for free)
foreign_worker: Is the debtor a foreign worker? (Two levels: Yes, No)
number_credits: Number of credits including the current one the debtor has (or had) at this bank (Four levels: 1, 2-3, 4-5, equal or more than 6)
credit_history: The credit history of the debtor (Five levels)
  0 : delay in paying off in the past
  1 : critical account/other credits elsewhere
  2 : no credits taken/all credits paid back duly
  3 : existing credits paid back duly till now
  4 : all credits at this bank paid back duly
amount: Amount requested by the debtor (in the range of 1K and MAX)
installment_rate: Credit installments as a percentage of debtor's disposable income (Four levels)
  1 : >= 35
  2 : 25 <= … < 35
  3 : 20 <= … < 25
  4 : < 20
credit_risk: Credit risk assessed as the potential that a borrower fails to meet its obligations (Two levels: good, bad)

Question 1 (2 pts)
Use the appropriate function to read the data into R as a data frame named hw4.data. Complete the blanks below with the missing syntax you would use to complete this import. Please assume that you've already set your working directory to the folder containing the file.

____ = ____(____, header = ____)

Question 2 (4 pts)
Use R and the ggplot2 package to create the plot shown below.
Complete the blanks below with the missing syntax you would use to generate this plot in R.

plot1 <- ____(____, aes(____, ____))
plot1 + ____

Question 3 (1 pt)
Based on your plot, which subcategory of employment duration has the most outliers?
  <1 yr
  >=7 yrs
  1<=...<4 yrs
  4<=...<7 yrs
  unemployed

Question 4 (1.5 pts)
Based on your plot, which subcategory of employment duration has the most nearly normal distribution?
  <1 yr
  >=7 yrs
  1<=...<4 yrs
  4<=...<7 yrs
  unemployed

Question 5 (1 pt)
Based on your plot, which subcategory/subcategories of employment duration has/have a median less than or equal to 30? (Select all that apply)
  <1 yr
  >=7 yrs
  1<=...<4 yrs
  4<=...<7 yrs
  unemployed
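
As a rough illustration of the import and plotting pattern that Questions 1 to 5 revolve around, the sketch below writes a small synthetic table to a temporary CSV, reads it back, and draws one box per employment category. Several things here are assumptions rather than facts from the quiz: that the course file is a CSV (so `read.csv` applies), that the plot in Question 2 is a boxplot of `age` by `employment_duration` (the figure is not reproduced in this copy), and all of the data values, which are random and say nothing about the real answers.

```r
library(ggplot2)

# Synthetic stand-in data (assumption): the real course file is not available here.
set.seed(1)
emp_levels <- c("unemployed", "<1 yr", "1<=...<4 yrs", "4<=...<7 yrs", ">=7 yrs")
toy <- data.frame(
  age = sample(18:75, 200, replace = TRUE),
  employment_duration = sample(emp_levels, 200, replace = TRUE)
)

# Round-trip through a temporary CSV so the import step is runnable.
# With the real file, the path would be the (unstated) file name in the working directory.
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)
hw4.data <- read.csv(csv_path, header = TRUE)

# Fix the category order so the boxes appear from shortest to longest employment.
hw4.data$employment_duration <- factor(hw4.data$employment_duration, levels = emp_levels)

# One box per category; points drawn beyond the whiskers are the flagged outliers.
plot1 <- ggplot(hw4.data, aes(x = employment_duration, y = age))
print(plot1 + geom_boxplot())

# Per-category medians, a numeric check to read alongside the plot.
group_medians <- tapply(hw4.data$age, hw4.data$employment_duration, median)
print(group_medians)
```

The median table is only a cross-check for what the boxplot shows visually; with the real data, the thick line inside each box should match the corresponding value.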
Question 6 (4 pts)
Create a histogram for the continuous variable 'age'. Be sure to label your x-axis as "Age" and your y-axis as "Frequency". Complete the blanks below with the missing syntax you would use to generate this plot in R.

hist <- ____(____, aes(____))
hist + ____(binwidth = 0.5) + labs(x = ____, y = ____)

Question 7 (1.5 pts)
Based on your histogram, the most frequently occurring age can be found:
  at or near 20
  at or near 25
  at or near 30
  at or near 35

Question 8 (1 pt)
Create two additional histograms in R. One should be for the continuous variable "amount", and the other should be for the continuous variable "duration". When examining all the histograms created in R, which of these distributions appears to be the most normal?
  age
  amount
  duration

Question 9 (3 pts)
Create a scatterplot to show the relationship between "age" and "amount." Complete the blanks below with the missing syntax you would use to generate this plot in R.

scatter1 <- ____(____, aes(____, ____))
scatter1 + ____

Question 10 (1 pt)
Create another scatterplot in R. It should show the relationship between "duration" and "amount." When examining all the scatterplots created in R, which variable seems to be most strongly related to "amount"?
  age
  duration
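
The histogram and scatterplot steps in Questions 6 to 10 follow the same ggplot2 pattern, sketched below under stated assumptions. The data frame is synthetic (random values, not the South German Credit data), so the shapes it produces do not indicate the answers; the name `scatter2` and the correlation check at the end are additions for illustration and are not part of the quiz. The object name `hist` follows the quiz text, although it masks base R's `hist()` function for the rest of the session.

```r
library(ggplot2)

# Synthetic stand-in data (assumption): replace with the real hw4.data when available.
set.seed(2)
hw4.data <- data.frame(
  age = sample(18:75, 200, replace = TRUE),
  duration = sample(4:72, 200, replace = TRUE),
  amount = round(rlnorm(200, meanlog = 8, sdlog = 0.8))
)

# Histogram of age with the axis labels requested in Question 6.
hist <- ggplot(hw4.data, aes(x = age))
print(hist + geom_histogram(binwidth = 0.5) + labs(x = "Age", y = "Frequency"))

# Scatterplot of amount against age (Question 9).
scatter1 <- ggplot(hw4.data, aes(x = age, y = amount))
print(scatter1 + geom_point())

# Scatterplot of amount against duration (Question 10); scatter2 is a hypothetical name.
scatter2 <- ggplot(hw4.data, aes(x = duration, y = amount))
print(scatter2 + geom_point())

# Pearson correlations as a numeric companion to the visual comparison.
cor_age <- cor(hw4.data$age, hw4.data$amount)
cor_duration <- cor(hw4.data$duration, hw4.data$amount)
print(c(age = cor_age, duration = cor_duration))
```

The histograms for "amount" and "duration" in Question 8 would reuse the first pattern with a different variable in `aes()` and a bin width suited to that variable's scale.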