Activity 6

Tony Zhang

2/27/2024

# Use Headers

Use headers to organize your document. The first-level heading is denoted by a single pound sign (hash), `#`. Each new problem/exercise should get a Level 1 heading. For subparts, increase the heading level by increasing the number of hash signs. For example, if Problem 1 has Parts A (with parts i-ii) and B, your R Markdown file would have the following:

    # Problem 1
    [text]
    ## Part A
    [text]
    ### Part i
    [text]
    ### Part ii
    [text]
    ## Part B
    [text]

# Code

There are two ways to include code in your document: inline and chunks.

## Inline Code

To add inline code, type a grave mark (the backtick, on the key to the left of the numeral 1 key), followed by a lowercase r, a space, then the R commands you wish to run, and a final grave. For example, `` `r nrow(dataFrame)` `` would return the number of rows in the data frame named "dataFrame". Inline code is good for calling values you have stored and doing quick calculations on those values. Inline code will not be added to the Code Appendix.

## Code Chunks

For more complicated code, such as data manipulation and cleaning, creating graphs or tables, and model building and testing, you'll want to use code chunks. You can do this in two ways:

- Click the Insert button found just above RStudio's editor pane (its icon is a white circle with a green plus sign and a green square with a white C) and select R from the drop-down list.
- Create your own code chunk by typing three graves in a row, pressing Return twice, and typing three more graves. You should see the editor become shaded gray for those three lines. Write your code starting on the blank middle line. On the first line, right after the third grave, set the chunk options inside curly braces, including the coding language and chunk name as well as other options (e.g., figure caption and dimensions).
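To make the difference between the two concrete, here is a minimal sketch of a chunk followed by an inline expression. The chunk name `rowCount` and the small data frame are made up for illustration and are not part of the activity.

````markdown
```{r rowCount, echo = TRUE}
# Hypothetical data frame used only for this illustration
dataFrame <- data.frame(id = 1:3, score = c(90, 85, 77))
```

The data frame has `r nrow(dataFrame)` rows.
````

When the document is knit, the chunk is run and echoed, and the inline expression is replaced by its value, so the sentence reads "The data frame has 3 rows."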
# Mathematics

To type mathematical formulas, you will need to use LaTeX commands. For inline mathematics, enclose your mathematical expression in `\(` and `\)`. For display math (on its own line and centered), enclose the expression in `\[` and `\]`.

The following code will automatically create your Code Appendix by grabbing all of your code chunks and writing that code here. Take a moment to look through the appendix and make sure that your code is fully readable. Use comments in your code to mark what each piece of code does.
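The chunk that builds the appendix is not visible in the rendered output. As a hedged sketch, a chunk of this kind commonly relies on knitr's `ref.label` option together with `knitr::all_labels()`. The chunk name `codeAppendix` is an assumption, and the chunk in the original template may differ.

````markdown
```{r codeAppendix, ref.label = knitr::all_labels(), echo = TRUE, eval = FALSE}
```
````

Here `ref.label = knitr::all_labels()` pulls in the code of every labeled chunk, `echo = TRUE` prints that code, and `eval = FALSE` keeps it from being run a second time.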
# Code Appendix

    # This template file is based off of a template created by Alex Hayes
    # https://github.com/alexpghayes/rmarkdown_homework_template

    # Setting Document Options
    knitr::opts_chunk$set(
      echo = TRUE,
      warning = FALSE,
      message = FALSE,
      fig.align = "center"
    )

    install.packages("dcData", repos = "http://cran.us.r-project.org")
    install.packages("tidyverse", repos = "http://cran.us.r-project.org")
    library(dcData)
    library(tidyverse)
    data("BabyNames")

    # Step 1: inspect the data
    str(BabyNames)
    summary(BabyNames)
    head(BabyNames)
    tail(BabyNames)

    # Step 7: filter, group, summarize, and plot
    names_of_interest <- c("Tony", "John", "Emily", "Michael")
    filtered_data <- BabyNames %>%
      filter(name %in% names_of_interest)
    # .groups is an argument of summarize(), not group_by(); passing it to
    # group_by() would add a stray grouping column named ".groups".
    grouped_data <- filtered_data %>%
      group_by(name, year)
    summarized_data <- grouped_data %>%
      summarize(popularity = sum(count), .groups = "drop")
    # linewidth replaces size for lines in ggplot2 3.4.0 and later
    ggplot(summarized_data, aes(x = year, y = popularity, group = name, color = name)) +
      geom_line(linewidth = 1, alpha = 0.5) +
      ylab("Popularity") +
      xlab("Year")

The same code, as run in the body of the document, with its output:

    install.packages("dcData", repos = "http://cran.us.r-project.org")
    install.packages("tidyverse", repos = "http://cran.us.r-project.org")

    ## package 'tidyverse' successfully unpacked and MD5 sums checked
    ##
    ## The downloaded binary packages are in
    ##  C:\Users\tonyz\AppData\Local\Temp\RtmpMvZnPl\downloaded_packages

    library(dcData)
    library(tidyverse)
    data("BabyNames")

# Step 1

    str(BabyNames)

    ## 'data.frame':    1792091 obs. of  4 variables:
    ##  $ name : chr  "Mary" "Anna" "Emma" "Elizabeth" ...
    ##  $ sex  : chr  "F" "F" "F" "F" ...
    ##  $ count: int  7065 2604 2003 1939 1746 1578 1472 1414 1320 1288 ...
    ##  $ year : int  1880 1880 1880 1880 1880 1880 1880 1880 1880 1880 ...
    summary(BabyNames)

    ##      name               sex                count              year
    ##  Length:1792091     Length:1792091     Min.   :    5.0   Min.   :1880
    ##  Class :character   Class :character   1st Qu.:    7.0   1st Qu.:1948
    ##  Mode  :character   Mode  :character   Median :   12.0   Median :1981
    ##                                        Mean   :  186.1   Mean   :1972
    ##                                        3rd Qu.:   32.0   3rd Qu.:2000
    ##                                        Max.   :99674.0   Max.   :2013

    head(BabyNames)

    ##        name sex count year
    ## 1      Mary   F  7065 1880
    ## 2      Anna   F  2604 1880
    ## 3      Emma   F  2003 1880
    ## 4 Elizabeth   F  1939 1880
    ## 5    Minnie   F  1746 1880
    ## 6  Margaret   F  1578 1880

    tail(BabyNames)

    ##           name sex count year
    ## 1792086  Zyere   M     5 2013
    ## 1792087 Zyhier   M     5 2013
    ## 1792088  Zylar   M     5 2013
    ## 1792089 Zymari   M     5 2013
    ## 1792090 Zymeer   M     5 2013
    ## 1792091  Zyree   M     5 2013

# Step 3

## Part 1

The variable "sex" does not appear in the graph.

## Part 2

Popularity has been transformed; it is derived from count. Year has also been transformed; the graph shows it in broader intervals.

# Step 4

## Part 1

Names with low popularity have been filtered out.

## Part 2

Cases with the same name and differing years have been grouped together.
## Part 3

No new variables have been introduced.

# Step 5

1. Filter the table to only include names of interest.
2. Group the filtered data table by the "name" and "year" variables.
3. Summarize the "count" variable within each group.
4. Create a new data table containing the "name", "year", and "popularity" variables.
5. Plot the data using a line graph.

# Step 7

    names_of_interest <- c("Tony", "John", "Emily", "Michael")

    # Keep only the names of interest
    filtered_data <- BabyNames %>%
      filter(name %in% names_of_interest)

    # .groups is an argument of summarize(), not group_by(); passing it to
    # group_by() would add a stray grouping column named ".groups".
    grouped_data <- filtered_data %>%
      group_by(name, year)

    # Total count per name and year, then drop the grouping
    summarized_data <- grouped_data %>%
      summarize(popularity = sum(count), .groups = "drop")

    # linewidth replaces size for lines in ggplot2 3.4.0 and later
    ggplot(summarized_data, aes(x = year, y = popularity, group = name, color = name)) +
      geom_line(linewidth = 1, alpha = 0.5) +
      ylab("Popularity") +
      xlab("Year")

[Figure: line graph of Popularity (y-axis, 0 to 75000) against Year (x-axis, 1875 to 2000), with one colored line per name: Emily, John, Michael, Tony.]
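The filter, group, and summarize steps from Step 5 can be checked by hand on a very small table. The sketch below uses a made-up data frame, `toy_names`, with the same four columns as `BabyNames`. Its values are invented for illustration and it assumes only that the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy table with the same columns as BabyNames (values are made up)
toy_names <- data.frame(
  name  = c("Tony", "Tony", "John", "John", "Zed"),
  sex   = c("M", "F", "M", "M", "M"),
  count = c(10L, 2L, 30L, 5L, 7L),
  year  = c(2000L, 2000L, 2000L, 2001L, 2000L),
  stringsAsFactors = FALSE
)

names_of_interest <- c("Tony", "John")

toy_summary <- toy_names %>%
  filter(name %in% names_of_interest) %>%                 # 1. keep names of interest
  group_by(name, year) %>%                                # 2. group by name and year
  summarize(popularity = sum(count), .groups = "drop")    # 3-4. total count per group

print(toy_summary)
```

"Zed" is filtered out, and the two "Tony" rows for 2000 (one per sex) collapse into a single row with a popularity of 12. This is the same collapsing of "sex" that the full pipeline performs before plotting.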