hw02-sol

School: University of California, Berkeley
Course: 142
Subject: Economics
Date: Feb 20, 2024


Problem Set 2: Summarize global cesarean delivery rates and GDP across 137 countries

Your name and student ID

January 29, 2024

Instructions

• Solutions will be released by Sunday, January 28th.
• This semester, problem sets are for practice only and will not be turned in for marks.

Helpful hints:

• Every function you need was taught during lecture! You may need to revisit the lecture code by opening the relevant files on Datahub. Alternatively, you can view the code in the condensed PDFs posted on the course website. Good luck!
• Knit your file early and often to minimize knitting errors! If you copy and paste code from the slides, you are bound to get an error that is hard to diagnose. Typing out the code is the way to smooth knitting! We recommend knitting your file each time you write a few sentences or add a new code chunk, so you can detect the source of knitting errors more easily. This will save you and the GSIs from frustration!
• If your code runs off the page of the knitted PDF, be sure to fix this! If it doesn't look right, go back to your .Rmd file and add new lines using the return or enter key so that the code wraps onto the next line.

Summarizing global cesarean delivery rates and GDP across 137 countries

Introduction

Recall from this week's lab that we constructed bar charts and histograms to explore a data set that looked at global rates of cesarean delivery and GDP. If you need a refresher, you can view your knitted file from lab and remind yourself about what you found.

In this week's problem set, you will describe these distributions using numbers. You will investigate the mean and median of the distribution of GDP. You will also examine the distribution of cesarean delivery separately for countries of varying income levels. Lastly, you will describe the spread of the distributions using quartiles and make a box plot.

Execute this code chunk to load the required libraries:

library(readr)

##
## Attaching package: 'readr'
## The following objects are masked from 'package:testthat':
##
##     edition_get, local_edition

library(dplyr)

##
## Attaching package: 'dplyr'
## The following object is masked from 'package:testthat':
##
##     matches
## The following objects are masked from 'package:stats':
##
##     filter, lag
## The following objects are masked from 'package:base':
##
##     intersect, setdiff, setequal, union

library(ggplot2)

Just like in lab, read in the data that is stored as a .csv file and assign it to an object called CS_data. We will also use dplyr's mutate() to create the new cesarean delivery variable that ranges between 0 and 100:

CS_data <- read_csv("data/cesarean.csv")

## Rows: 137 Columns: 7
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (4): Country_Name, CountryCode, Income_Group, Region
## dbl (3): Births_Per_1000, GDP_2006, CS_rate
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.

# The code below reorders the Factor variable `Income_Group` in the
# order specified in this function. This will affect the order the ggplot
# panels are shown in question 8 when we use `facet_wrap()`.
CS_data$Income_Group <- forcats::fct_relevel(CS_data$Income_Group,
                                             "Low income",
                                             "Lower middle income",
                                             "Upper middle income",
                                             "High income: nonOECD",
                                             "High income: OECD")

CS_data <- CS_data %>% mutate(CS_rate_100 = CS_rate * 100)
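To see what the two steps above do without loading the full data set, here is a minimal sketch on a small made-up data frame. The object name `toy`, its three rows, and the rate values are hypothetical and are not taken from cesarean.csv; only `fct_relevel()` and `mutate()` come from the chunk above.

```r
library(dplyr)

# Hypothetical toy data for illustration only (not from cesarean.csv).
toy <- data.frame(
  Income_Group = c("High income: OECD", "Low income", "Upper middle income"),
  CS_rate = c(0.25, 0.125, 0.5)
)

# Put the income groups in a meaningful order instead of alphabetical order.
toy$Income_Group <- forcats::fct_relevel(
  toy$Income_Group,
  "Low income", "Upper middle income", "High income: OECD"
)

# Rescale the proportion (0 to 1) to a percentage (0 to 100).
toy <- toy %>% mutate(CS_rate_100 = CS_rate * 100)

levels(toy$Income_Group)
toy$CS_rate_100
```

Checking `levels()` after releveling is a quick way to confirm the order that `facet_wrap()` panels would follow, and comparing `CS_rate` with `CS_rate_100` confirms the rescaling behaved as intended.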
