Week-4-Assignment

School

Bunker Hill Community College

Course

IT471

Subject

Statistics

Date

Jun 24, 2024

Week 4
Shraddha Bijukchhe
2024-05-30

1. Loading Libraries: We start by loading the socviz and tidyverse libraries, which provide useful functions and datasets for data visualization and manipulation.

# Load the socviz library for social science data visualization
library(socviz)
# Load the tidyverse library for data science, which includes ggplot2 for plotting
library(tidyverse)

## Warning: package 'tidyverse' was built under R version 4.3.3
## Warning: package 'ggplot2' was built under R version 4.3.3
## Warning: package 'tidyr' was built under R version 4.3.2
## Warning: package 'readr' was built under R version 4.3.2
## Warning: package 'purrr' was built under R version 4.3.2
## Warning: package 'dplyr' was built under R version 4.3.2
## Warning: package 'stringr' was built under R version 4.3.2
## Warning: package 'lubridate' was built under R version 4.3.2
## -- Attaching core tidyverse packages ------------------------ tidyverse 2.0.0 --
## v dplyr     1.1.4     v readr     2.1.5
## v forcats   1.0.0     v stringr   1.5.1
## v ggplot2   3.5.0     v tibble    3.2.1
## v lubridate 1.9.3     v tidyr     1.3.1
## v purrr     1.0.2
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
## i Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

#socviz: This library provides datasets and functions specifically useful for social science data visualization.
#tidyverse: This is a collection of R packages designed for data science. It includes ggplot2 for data visualization, dplyr for data manipulation, tidyr for data tidying, and others.

Summarizing mpg DataFrame: We display summary information for the mpg dataframe (a dataset that ships with ggplot2) using the summary() function to understand its structure and the summary statistics of its variables.
# Display summary statistics for the mpg dataframe
summary(mpg)

##  manufacturer          model               displ            year
##  Length:234         Length:234         Min.   :1.600   Min.   :1999
##  Class :character   Class :character   1st Qu.:2.400   1st Qu.:1999
##  Mode  :character   Mode  :character   Median :3.300   Median :2004
##                                        Mean   :3.472   Mean   :2004
##                                        3rd Qu.:4.600   3rd Qu.:2008
##                                        Max.   :7.000   Max.   :2008
##       cyl            trans               drv                 cty
##  Min.   :4.000   Length:234         Length:234         Min.   : 9.00
##  1st Qu.:4.000   Class :character   Class :character   1st Qu.:14.00
##  Median :6.000   Mode  :character   Mode  :character   Median :17.00
##  Mean   :5.889                                         Mean   :16.86
##  3rd Qu.:8.000                                         3rd Qu.:19.00
##  Max.   :8.000                                         Max.   :35.00
##       hwy             fl               class
##  Min.   :12.00   Length:234         Length:234
##  1st Qu.:18.00   Class :character   Class :character
##  Median :24.00   Mode  :character   Mode  :character
##  Mean   :23.44
##  3rd Qu.:27.00
##  Max.   :44.00

#summary(mpg): This function reports summary statistics for each variable in the dataframe. For numeric columns it gives the minimum, quartiles, median, mean, and maximum; for character columns such as manufacturer or class it reports only the length, class, and mode, not a frequency distribution.

The mpg dataframe describes car models by manufacturer, model, engine displacement (displ), year, number of cylinders (cyl), transmission type (trans), drive type (drv), city miles per gallon (cty), highway miles per gallon (hwy), fuel type (fl), and vehicle class. The summary shows 234 entries, engine displacements from 1.6 to 7.0 liters, model years from 1999 to 2008 (median displayed as 2004), and cylinder counts from 4 to 8 (median 6). Fuel efficiency varies: city mileage ranges from 9 to 35 mpg (median 17) and highway mileage from 12 to 44 mpg (median 24). These statistics give an overview of the range and central tendency of each numeric variable.

2. Summarizing gapminder DataFrame: Similarly, we display summary information for the gapminder dataframe using the summary() function to understand its structure and the summary statistics of its variables.

# Load the library
library(gapminder)

## Warning: package 'gapminder' was built under R version 4.3.3

# Display summary statistics for the gapminder dataframe
summary(gapminder)

##         country        continent        year         lifeExp
##  Afghanistan:  12   Africa  :624   Min.   :1952   Min.   :23.60
##  Albania    :  12   Americas:300   1st Qu.:1966   1st Qu.:48.20
##  Algeria    :  12   Asia    :396   Median :1980   Median :60.71
##  Angola     :  12   Europe  :360   Mean   :1980   Mean   :59.47
##  Argentina  :  12   Oceania : 24   3rd Qu.:1993   3rd Qu.:70.85
##  Australia  :  12                  Max.   :2007   Max.   :82.60
##  (Other)    :1632
##       pop              gdpPercap
##  Min.   :6.001e+04   Min.   :   241.2
##  1st Qu.:2.794e+06   1st Qu.:  1202.1
##  Median :7.024e+06   Median :  3531.8
##  Mean   :2.960e+07   Mean   :  7215.3
##  3rd Qu.:1.959e+07   3rd Qu.:  9325.5
##  Max.   :1.319e+09   Max.   :113523.1
##

#summary(gapminder): This function provides summary statistics for each variable in the dataframe. Because country and continent are factors, summary() reports counts per level for them, and the usual range and quartile statistics for the numeric columns.

The summary of the gapminder dataframe gives a concise statistical overview of key demographic and economic variables. Each country contributes 12 observations, and the data span 1952 to 2007 across five continents (the 1,704 rows correspond to 142 countries). Life expectancy ranges from 23.6 to 82.6 years, with a median of 60.71 years. Population ranges from about 60,000 (6.001e+04) to 1.319 billion, and GDP per capita ranges from 241.2 to 113,523.1 with a median of 3,531.8. In both cases the mean is far above the median, which indicates strongly right-skewed distributions and large disparities among countries.

3. Creating a ggplot Object: We assign a ggplot object to the variable 'p'. This object serves as the foundation for visualizations based on the gapminder dataset. Specifically, we map GDP per capita (gdpPercap) to the x-axis and life expectancy (lifeExp) to the y-axis, in order to explore the relationship between a country's economic prosperity and the life expectancy of its population.
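As a minimal sketch of what such an object holds before any layers are added, the snippet below builds a bare ggplot object and inspects it. The name p0 is hypothetical (chosen so that it does not clash with p), and the snippet assumes the ggplot2 and gapminder packages are installed.

```r
library(ggplot2)
library(gapminder)

# Bare plot object: data plus aesthetic mappings, no geoms yet
# (p0 is a hypothetical name used only for this illustration)
p0 <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))

# The aesthetic mappings are stored on the object
mapped <- names(p0$mapping)
print(mapped)

# No layers have been added, so printing p0 would draw only empty axes
n_layers <- length(p0$layers)
print(n_layers)
```

Until a geom is added with `+`, the object carries the data and mappings but has nothing to draw.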
This assignment sets the stage for further customization and layering of graphical elements. In the code below, continent is also mapped to color and population to point size.

# Create a ggplot object 'p' from the gapminder dataset
p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  # Add points to the plot with alpha (opacity) set to 0.7
  geom_point(alpha = 0.7) +
  # Set x-axis scale to log10 and format labels as dollar values
  scale_x_log10(labels = scales::dollar_format()) +
  # Manually specify colors for the continents (five are used; the sixth value is spare)
  scale_color_manual(values = c("#F8766D", "#00BA38", "#619CFF", "#FFC100", "#A3A3A3", "#E76BF3")) +
  # Set size range for points and specify breaks for size legend
  scale_size(range = c(2, 12), breaks = c(1e+06, 5e+07, 1e+09)) +
  # Add axis and legend labels
  labs(x = "GDP per capita", y = "Life expectancy", size = "Population", color = "Continent") +
  # Add plot title
  ggtitle("Life expectancy and GDP per capita by continent") +
  # Customize plot theme to minimal
  theme_minimal() +
  # Customize various text elements in the plot
  theme(plot.title = element_text(hjust = 0.5, size = 16, face = "bold"),
        axis.text = element_text(size = 12),
        axis.title = element_text(size = 14),
        legend.title = element_text(size = 14),
        legend.text = element_text(size = 12))
# Print the plot
print(p)

[Figure: "Life expectancy and GDP per capita by continent". Scatter plot with GDP per capita on a log10 x-axis (ticks at $1,000, $10,000, $100,000) and life expectancy on the y-axis (ticks at 40, 60, 80). Point size shows population (legend breaks 1e+06, 5e+07, 1e+09); color shows continent (Africa, Americas, Asia, Europe, Oceania).]

#ggplot Object Creation: A ggplot object named 'p' is created using the ggplot() function. Data from the gapminder dataset is mapped to aesthetics (x, y, color, size) using the aes() function.
#Geometric Elements: Points are added to the plot using geom_point(), with alpha set to 0.7 (slightly transparent) so that overlapping points remain visible.
#Scale Transformations: The x-axis scale is set to log10 using scale_x_log10() to better visualize data with a wide range. Labels on the x-axis are formatted as dollar values using scales::dollar_format().
#Color and Size Scales: Colors for the continents are manually specified using scale_color_manual(), while the size of points is adjusted using scale_size() with a specified range and breaks.
#Axis and Legend Labels: Labels for the x-axis, y-axis, point size (population), and color (continent) are added using the labs() function.
#Themes and Text Customization: The plot theme is set to minimal using theme_minimal(), and various text elements (plot title, axis labels, legend titles, legend text) are customized using the theme() function.
#Printing the Plot: The plot object 'p' is printed using the print() function to display the visualization.

The output of the code is a scatter plot of life expectancy against GDP per capita. Each point represents one country in one survey year, with point size corresponding to population and color to continent. The plot shows a general trend of higher GDP per capita being associated with longer life expectancy, with notable variation among continents, such as generally higher values on both axes in Europe than in Africa.

4. Checking the Structure of p: This step uses the str() function to examine the internal structure of the ggplot object 'p'. Inspecting its components, such as data, aesthetic mappings, layers, scales, and theme, shows how the plot is constructed, which helps with later customization and troubleshooting.

# Display the internal structure of the ggplot object p
str(p)

## List of 11
##  $ data       : tibble [1,704 x 6] (S3: tbl_df/tbl/data.frame)
##   ..$ country  : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
##   ..$ year     : int [1:1704] 1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##   ..$ lifeExp  : num [1:1704] 28.8 30.3 32 34 36.1 ...
##   ..$ pop      : int [1:1704] 8425333 9240934 10267083 11537966 13079460 14880372 12881816 13867957 1
##   ..$ gdpPercap: num [1:1704] 779 821 853 836 740 ...
##  $ layers     :List of 1
##   ..$ :Classes 'LayerInstance', 'Layer', 'ggproto', 'gg' <ggproto object: Class LayerInstance, Layer,
##     aes_params: list
##     compute_aesthetics: function
##     compute_geom_1: function
##     compute_geom_2: function
##     compute_position: function
##     compute_statistic: function
##     computed_geom_params: list
##     computed_mapping: uneval
##     computed_stat_params: list
##     constructor: call
##     data: waiver
##     draw_geom: function
##     finish_statistics: function
##     geom: <ggproto object: Class GeomPoint, Geom, gg>
##         aesthetics: function
##         default_aes: uneval
##         draw_group: function
##         draw_key: function
##         draw_layer: function
##         draw_panel: function
##         extra_params: na.rm
##         handle_na: function
##         non_missing_aes: size shape colour
##         optional_aes:
##         parameters: function
##         rename_size: FALSE
##         required_aes: x y
##         setup_data: function
##         setup_params: function
##         use_defaults: function
##         super: <ggproto object: Class Geom, gg>
##     geom_params: list
##     inherit.aes: TRUE
##     layer_data: function
##     map_statistic: function
##     mapping: NULL
##     position: <ggproto object: Class PositionIdentity, Position, gg>
##         compute_layer: function
##         compute_panel: function
##         required_aes:
##         setup_data: function
##         setup_params: function
##         super: <ggproto object: Class Position, gg>
##     print: function
##     setup_layer: function
##     show.legend: NA
##     stat: <ggproto object: Class StatIdentity, Stat, gg>
##         aesthetics: function
##         compute_group: function
##         compute_layer: function
##         compute_panel: function
##         default_aes: uneval
##         dropped_aes:
##         extra_params: na.rm
##         finish_layer: function
##         non_missing_aes:
##         optional_aes:
##         parameters: function
##         required_aes:
##         retransform: TRUE
##         setup_data: function
##         setup_params: function
##         super: <ggproto object: Class Stat, gg>
##     stat_params: list
##     super: <ggproto object: Class Layer, gg>
##  $ scales     :Classes 'ScalesList', 'ggproto', 'gg' <ggproto object: Class ScalesList, gg>
##     add: function
##     add_defaults: function
##     add_missing: function
##     backtransform_df: function
##     clone: function
##     find: function
##     get_scales: function
##     has_scale: function
##     input: function
##     map_df: function
##     n: function
##     non_position_scales: function
##     scales: list
##     train_df: function
##     transform_df: function
##     super: <ggproto object: Class ScalesList, gg>
##  $ guides     :Classes 'Guides', 'ggproto', 'gg' <ggproto object: Class Guides, gg>
##     add: function
##     assemble: function
##     build: function
##     draw: function
##     get_custom: function
##     get_guide: function
##     get_params: function
##     get_position: function
##     guides: NULL
##     merge: function
##     missing: <ggproto object: Class GuideNone, Guide, gg>
##         add_title: function
##         arrange_layout: function
##         assemble_drawing: function
##         available_aes: any
##         build_decor: function
##         build_labels: function
##         build_ticks: function
##         build_title: function
##         draw: function
##         draw_early_exit: function
##         elements: list
##         extract_decor: function
##         extract_key: function
##         extract_params: function
##         get_layer_key: function
##         hashables: list
##         measure_grobs: function
##         merge: function
##         override_elements: function
##         params: list
##         process_layers: function
##         setup_elements: function
##         setup_params: function
##         train: function
##         transform: function
##         super: <ggproto object: Class GuideNone, Guide, gg>
##     package_box: function
##     print: function
##     process_layers: function
##     setup: function
##     subset_guides: function
##     train: function
##     update_params: function
##     super: <ggproto object: Class Guides, gg>
##  $ mapping    :List of 4
##   ..$ x     : language ~gdpPercap
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
##   ..$ y     : language ~lifeExp
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
##   ..$ colour: language ~continent
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
##   ..$ size  : language ~pop
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
##   ..- attr(*, "class")= chr "uneval"
##  $ theme      :List of 136
##   ..$ line :List of 6
##   .. ..$ colour : chr "black"
##   .. ..$ linewidth : num 0.5
##   .. ..$ linetype : num 1
##   .. ..$ lineend : chr "butt"
##   .. ..$ arrow : logi FALSE
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_line" "element"
##   ..$ rect :List of 5
##   .. ..$ fill : chr "white"
##   .. ..$ colour : chr "black"
##   .. ..$ linewidth : num 0.5
##   .. ..$ linetype : num 1
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_rect" "element"
##   ..$ text :List of 11
##   .. ..$ family : chr ""
##   .. ..$ face : chr "plain"
##   .. ..$ colour : chr "black"
##   .. ..$ size : num 11
##   .. ..$ hjust : num 0.5
##   .. ..$ vjust : num 0.5
##   .. ..$ angle : num 0
##   .. ..$ lineheight : num 0.9
##   .. ..$ margin : 'margin' num [1:4] 0points 0points 0points 0points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : logi FALSE
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ title : NULL
##   ..$ aspect.ratio : NULL
##   ..$ axis.title :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : num 14
##   .. ..$ hjust : NULL
##   .. ..$ vjust : NULL
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : NULL
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi FALSE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.title.x :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : NULL
##   .. ..$ hjust : NULL
##   .. ..$ vjust : num 1
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : 'margin' num [1:4] 2.75points 0points 0points 0points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.title.x.top :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : NULL
##   .. ..$ hjust : NULL
##   .. ..$ vjust : num 0
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : 'margin' num [1:4] 0points 0points 2.75points 0points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.title.x.bottom : NULL
##   ..$ axis.title.y :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : NULL
##   .. ..$ hjust : NULL
##   .. ..$ vjust : num 1
##   .. ..$ angle : num 90
##   .. ..$ lineheight : NULL
##   .. ..$ margin : 'margin' num [1:4] 0points 2.75points 0points 0points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.title.y.left : NULL
##   ..$ axis.title.y.right :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : NULL
##   .. ..$ hjust : NULL
##   .. ..$ vjust : num 1
##   .. ..$ angle : num -90
##   .. ..$ lineheight : NULL
##   .. ..$ margin : 'margin' num [1:4] 0points 0points 0points 2.75points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.text :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : chr "grey30"
##   .. ..$ size : num 12
##   .. ..$ hjust : NULL
##   .. ..$ vjust : NULL
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : NULL
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi FALSE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.text.x :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : NULL
##   .. ..$ hjust : NULL
##   .. ..$ vjust : num 1
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : 'margin' num [1:4] 2.2points 0points 0points 0points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.text.x.top :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : NULL
##   .. ..$ hjust : NULL
##   .. ..$ vjust : num 0
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : 'margin' num [1:4] 0points 0points 2.2points 0points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.text.x.bottom : NULL
##   ..$ axis.text.y :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : NULL
##   .. ..$ hjust : num 1
##   .. ..$ vjust : NULL
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : 'margin' num [1:4] 0points 2.2points 0points 0points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.text.y.left : NULL
##   ..$ axis.text.y.right :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : NULL
##   .. ..$ hjust : num 0
##   .. ..$ vjust : NULL
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : 'margin' num [1:4] 0points 0points 0points 2.2points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.text.theta : NULL
##   ..$ axis.text.r :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : NULL
##   .. ..$ hjust : num 0.5
##   .. ..$ vjust : NULL
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : 'margin' num [1:4] 0points 2.2points 0points 2.2points
##   .. .. ..- attr(*, "unit")= int 8
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi TRUE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ axis.ticks : list()
##   .. ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##   ..$ axis.ticks.x : NULL
##   ..$ axis.ticks.x.top : NULL
##   ..$ axis.ticks.x.bottom : NULL
##   ..$ axis.ticks.y : NULL
##   ..$ axis.ticks.y.left : NULL
##   ..$ axis.ticks.y.right : NULL
##   ..$ axis.ticks.theta : NULL
##   ..$ axis.ticks.r : NULL
##   ..$ axis.minor.ticks.x.top : NULL
##   ..$ axis.minor.ticks.x.bottom : NULL
##   ..$ axis.minor.ticks.y.left : NULL
##   ..$ axis.minor.ticks.y.right : NULL
##   ..$ axis.minor.ticks.theta : NULL
##   ..$ axis.minor.ticks.r : NULL
##   ..$ axis.ticks.length : 'simpleUnit' num 2.75points
##   .. ..- attr(*, "unit")= int 8
##   ..$ axis.ticks.length.x : NULL
##   ..$ axis.ticks.length.x.top : NULL
##   ..$ axis.ticks.length.x.bottom : NULL
##   ..$ axis.ticks.length.y : NULL
##   ..$ axis.ticks.length.y.left : NULL
##   ..$ axis.ticks.length.y.right : NULL
##   ..$ axis.ticks.length.theta : NULL
##   ..$ axis.ticks.length.r : NULL
##   ..$ axis.minor.ticks.length : 'rel' num 0.75
##   ..$ axis.minor.ticks.length.x : NULL
##   ..$ axis.minor.ticks.length.x.top : NULL
##   ..$ axis.minor.ticks.length.x.bottom: NULL
##   ..$ axis.minor.ticks.length.y : NULL
##   ..$ axis.minor.ticks.length.y.left : NULL
##   ..$ axis.minor.ticks.length.y.right : NULL
##   ..$ axis.minor.ticks.length.theta : NULL
##   ..$ axis.minor.ticks.length.r : NULL
##   ..$ axis.line : list()
##   .. ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##   ..$ axis.line.x : NULL
##   ..$ axis.line.x.top : NULL
##   ..$ axis.line.x.bottom : NULL
##   ..$ axis.line.y : NULL
##   ..$ axis.line.y.left : NULL
##   ..$ axis.line.y.right : NULL
##   ..$ axis.line.theta : NULL
##   ..$ axis.line.r : NULL
##   ..$ legend.background : list()
##   .. ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##   ..$ legend.margin : 'margin' num [1:4] 5.5points 5.5points 5.5points 5.5points
##   .. ..- attr(*, "unit")= int 8
##   ..$ legend.spacing : 'simpleUnit' num 11points
##   .. ..- attr(*, "unit")= int 8
##   ..$ legend.spacing.x : NULL
##   ..$ legend.spacing.y : NULL
##   ..$ legend.key : list()
##   .. ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##   ..$ legend.key.size : 'simpleUnit' num 1.2lines
##   .. ..- attr(*, "unit")= int 3
##   ..$ legend.key.height : NULL
##   ..$ legend.key.width : NULL
##   ..$ legend.key.spacing : 'simpleUnit' num 5.5points
##   .. ..- attr(*, "unit")= int 8
##   ..$ legend.key.spacing.x : NULL
##   ..$ legend.key.spacing.y : NULL
##   ..$ legend.frame : NULL
##   ..$ legend.ticks : NULL
##   ..$ legend.ticks.length : 'rel' num 0.2
##   ..$ legend.axis.line : NULL
##   ..$ legend.text :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : num 12
##   .. ..$ hjust : NULL
##   .. ..$ vjust : NULL
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : NULL
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi FALSE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ legend.text.position : NULL
##   ..$ legend.title :List of 11
##   .. ..$ family : NULL
##   .. ..$ face : NULL
##   .. ..$ colour : NULL
##   .. ..$ size : num 14
##   .. ..$ hjust : num 0
##   .. ..$ vjust : NULL
##   .. ..$ angle : NULL
##   .. ..$ lineheight : NULL
##   .. ..$ margin : NULL
##   .. ..$ debug : NULL
##   .. ..$ inherit.blank: logi FALSE
##   .. ..- attr(*, "class")= chr [1:2] "element_text" "element"
##   ..$ legend.title.position : NULL
##   ..$ legend.position : chr "right"
##   ..$ legend.position.inside : NULL
##   ..$ legend.direction : NULL
##   ..$ legend.byrow : NULL
##   ..$ legend.justification : chr "center"
##   ..$ legend.justification.top : NULL
##   ..$ legend.justification.bottom : NULL
##   ..$ legend.justification.left : NULL
##   ..$ legend.justification.right : NULL
##   ..$ legend.justification.inside : NULL
##   ..$ legend.location : NULL
##   ..$ legend.box : NULL
##   ..$ legend.box.just : NULL
##   ..$ legend.box.margin : 'margin' num [1:4] 0cm 0cm 0cm 0cm
##   .. ..- attr(*, "unit")= int 1
##   ..$ legend.box.background : list()
##   .. ..- attr(*, "class")= chr [1:2] "element_blank" "element"
##   ..$ legend.box.spacing : 'simpleUnit' num 11points
##   .. ..- attr(*, "unit")= int 8
##   .. [list output truncated]
##   ..- attr(*, "class")= chr [1:2] "theme" "gg"
##   ..- attr(*, "complete")= logi TRUE
##   ..- attr(*, "validate")= logi TRUE
##  $ coordinates:Classes 'CoordCartesian', 'Coord', 'ggproto', 'gg' <ggproto object: Class CoordCartesi
##     aspect: function
##     backtransform_range: function
##     clip: on
##     default: TRUE
##     distance: function
##     expand: TRUE
##     is_free: function
##     is_linear: function
##     labels: function
##     limits: list
##     modify_scales: function
##     range: function
##     render_axis_h: function
##     render_axis_v: function
##     render_bg: function
##     render_fg: function
##     setup_data: function
##     setup_layout: function
##     setup_panel_guides: function
##     setup_panel_params: function
##     setup_params: function
##     train_panel_guides: function
##     transform: function
##     super: <ggproto object: Class CoordCartesian, Coord, gg>
##  $ facet      :Classes 'FacetNull', 'Facet', 'ggproto', 'gg' <ggproto object: Class FacetNull, Facet,
##     compute_layout: function
##     draw_back: function
##     draw_front: function
##     draw_labels: function
##     draw_panels: function
##     finish_data: function
##     init_scales: function
##     map_data: function
##     params: list
##     setup_data: function
##     setup_params: function
##     shrink: TRUE
##     train_scales: function
##     vars: function
##     super: <ggproto object: Class FacetNull, Facet, gg>
##  $ plot_env   :<environment: R_GlobalEnv>
##  $ layout     :Classes 'Layout', 'ggproto', 'gg' <ggproto object: Class Layout, gg>
##     coord: NULL
##     coord_params: list
##     facet: NULL
##     facet_params: list
##     finish_data: function
##     get_scales: function
##     layout: NULL
##     map_position: function
##     panel_params: NULL
##     panel_scales_x: NULL
##     panel_scales_y: NULL
##     render: function
##     render_labels: function
##     reset_scales: function
##     resolve_label: function
##     setup: function
##     setup_panel_guides: function
##     setup_panel_params: function
##     train_position: function
##     super: <ggproto object: Class Layout, gg>
##  $ labels     :List of 5
##   ..$ title : chr "Life expectancy and GDP per capita by continent"
##   ..$ x     : chr "GDP per capita"
##   ..$ y     : chr "Life expectancy"
##   ..$ size  : chr "Population"
##   ..$ colour: chr "Continent"
##  - attr(*, "class")= chr [1:2] "gg" "ggplot"

#str(p): The str() function gives a detailed breakdown of the internal structure of the ggplot object 'p'.

The output shows that 'p' is a list of 11 components: data, layers, scales, guides, mapping, theme, coordinates, facet, plot_env, layout, and labels. The data component is the 1,704 x 6 gapminder tibble, layers holds a single GeomPoint layer, mapping holds the four aesthetics (x, y, colour, size), theme holds 136 elements including the customized text sizes, and labels holds the title and axis and legend labels. Reading this output shows how the plot is constructed and helps with further customization or troubleshooting.

5. Add geom_point() to the p object and show p: We add a layer of points to the ggplot object 'p', which draws the individual observations on the plot. Because 'p' already contains a geom_point() layer from step 3, this adds a second point layer on top of the first rather than creating the plot from scratch.

# Add a geom_point() layer to the existing ggplot object 'p'
p <- p + geom_point()
# Print the ggplot object
print(p)

[Figure: "Life expectancy and GDP per capita by continent". Same scatter plot as in step 3: GDP per capita (log10 x-axis, $1,000 to $100,000) against life expectancy (40 to 80), with point size for population and color for continent.]
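One way to check what p <- p + geom_point() does to a plot that already contains points is to list the geoms of its layers. The sketch below rebuilds a reduced version of the plot; the names base_plot, with_second_layer, and layer_geoms are hypothetical, and the snippet assumes the ggplot2 and gapminder packages are installed.

```r
library(ggplot2)
library(gapminder)

# Reduced stand-in for 'p' as it exists after step 3: one point layer
base_plot <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.7)

# Same operation as in step 5: add another geom_point() layer
with_second_layer <- base_plot + geom_point()

# Helper (hypothetical): return the geom class name of every layer in a plot
layer_geoms <- function(plot) {
  unname(vapply(plot$layers, function(layer) class(layer$geom)[1], character(1)))
}

geoms_before <- layer_geoms(base_plot)
geoms_after <- layer_geoms(with_second_layer)
print(geoms_before)
print(geoms_after)
```

The `+` operator appends a layer and never replaces an existing one, so the points are drawn twice; the later layer uses the default alpha of 1 and is drawn over the translucent first layer.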
#p <- p + geom_point(): This line modifies the existing ggplot object 'p' by adding a new layer of points to it. The geom_point() function draws scatterplot points from the data and mappings stored in the ggplot object.
#print(p): This line prints the ggplot object 'p'. A ggplot object stored in a variable is not drawn until it is printed or plotted.

Printing 'p' produces the visualization with GDP per capita on the x-axis and life expectancy on the y-axis, based on the gapminder dataset and the aesthetic mappings defined earlier. The code runs without error because 'p' was initialized in step 3. Since that object already contained a geom_point() layer, 'p' now holds two point layers, and the new layer (drawn with the default alpha of 1) overlaps the semi-transparent points from step 3.

6. Replace geom_point() with geom_smooth() and show p: This step is meant to show a smoothed line generated with geom_smooth() in place of the individual data points. Note that the code below adds geom_smooth() to the existing object, so the point layers from the earlier steps remain in the plot. Unlike discrete points, a smoothed line represents a trend in the data, calculated with a statistical method such as linear regression, LOESS, or a generalized additive model; here ggplot2 chose LOESS, as the message in the output shows. The smoothed line makes the overall relationship between the variables easier to see and can highlight patterns that are not obvious from individual points.

# Add a smoothed line to the existing ggplot object 'p' using geom_smooth()
p <- p + geom_smooth()
# Print the modified ggplot object
print(p)

## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## i Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
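Both warnings above trace back to size = pop being mapped in ggplot(), where every layer inherits it, including geom_smooth(). A possible way to avoid them, sketched here under the assumption that ggplot2 (3.4.0 or later, for linewidth) and gapminder are installed, is to map size only inside geom_point(). The name p_fixed and the se = FALSE and linewidth = 1 settings are illustrative choices, not part of the original assignment.

```r
library(ggplot2)
library(gapminder)

# Plot-level mappings shared by all layers: x, y, and colour only
p_fixed <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  # size is mapped here, so only the points use it
  geom_point(aes(size = pop), alpha = 0.7) +
  # the smoother no longer inherits size; line thickness is set with linewidth
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE, linewidth = 1) +
  scale_x_log10(labels = scales::dollar_format())

print(p_fixed)

# Record which aesthetics live at the plot level and which on the point layer
plot_level_aes <- names(p_fixed$mapping)
point_layer_aes <- names(p_fixed$layers[[1]]$mapping)
```

Stating method and formula explicitly also suppresses the informational message about the defaults that geom_smooth() picked.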
[Figure: "Life expectancy and GDP per capita by continent". x-axis: GDP per capita (ticks at $1,000, $10,000, $100,000); y-axis: Life expectancy (ticks at 40, 60, 80); point size: Population; color: Continent (Africa, Americas, Asia, Europe, Oceania).]
#p <- p + geom_smooth(): This line adds a geom_smooth() layer to the existing ggplot object p. geom_smooth() fits a smooth curve to the data in p; the console message shows that LOESS was used here.
#print(p): This line prints the modified p, displaying the plot with the smoothed line included.
The output is a plot in the graphics device. Because the smoother was added to p rather than substituted for the points, the plot shows the original scatterplot of GDP per capita against life expectancy with the smoothed line overlaid. The line summarizes the general trend and helps identify the relationship between the variables on the x and y axes.
7. Return to geom_point() and add geom_smooth() to p: This step shows the raw data and the general trend together. geom_point() displays the individual data points, while geom_smooth() overlays a smoothed line. Because p already contains layers from the previous steps, this code adds a further point layer and a further smoother on top of them instead of starting from an empty plot.
# Add individual data points and a smoothed line to the existing ggplot object p
p <- p + geom_point() + geom_smooth()
# Print the modified ggplot object
print(p)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
[Figure: "Life expectancy and GDP per capita by continent". x-axis: GDP per capita (ticks at $1,000, $10,000, $100,000); y-axis: Life expectancy (ticks at 40, 60, 80); point size: Population; color: Continent (Africa, Americas, Asia, Europe, Oceania).]
#p <- p + geom_point() + geom_smooth(): This line adds two new layers to the existing ggplot object p: geom_point() and geom_smooth().
#geom_point() adds individual data points to the plot, showing the raw data distribution.
#geom_smooth() overlays a smoothed line that represents the general trend in the data.
#print(p): This line prints the modified p, displaying the plot with both the data points and the smoothed line. By printing p, we can see the combined effect of adding
these layers to the plot, which shows both the individual observations and the general trend.
The output is a plot with the individual data points for GDP per capita and life expectancy and a smoothed line indicating the general trend. The geom_smooth() message appears twice in the output because p now holds two smoother layers.
8. Add the linear element to the geom_smooth() function and show p: In this step, we add a smoother that fits a linear model by passing method = "lm" to geom_smooth(). The result is a straight regression line, which shows the direction and strength of any linear relationship between the variables. As before, the new layer is added to p, so the LOESS smoothers from the earlier steps remain in the plot.
# Add a smoother that fits a linear model using the "lm" method
p <- p + geom_smooth(method = "lm")
# Print the modified ggplot object
print(p)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
[Figure: "Life expectancy and GDP per capita by continent". x-axis: GDP per capita (ticks at $1,000, $10,000, $100,000); y-axis: Life expectancy (ticks at 40, 60, 80); point size: Population; color: Continent (Africa, Americas, Asia, Europe, Oceania).]
#p <- p + geom_smooth(method = "lm"): This line adds another geom_smooth() layer to the existing ggplot object p. The method = "lm" argument tells geom_smooth() to fit a linear model, so this layer is drawn as a straight line.
#print(p): This line prints the modified p, displaying the plot with the linear fit included.
The output is a plot with a straight fitted line added to the earlier layers. The line shows the direction and strength of the linear relationship between GDP per capita and life expectancy, which makes linear patterns in the data easier to identify and interpret.
9. Changing X-Axis Scale to Log10: In this step, we set the x-axis of the plot to a log10 scale. A log scale is helpful for data that covers a wide range of values, because it compresses the axis and makes relationships visible across different orders of magnitude. GDP per capita spans several orders of magnitude, so the log10 scale emphasizes relative differences more effectively than a linear scale.
# Transform the x-axis scale to log10 and format labels as dollar values
p <- p + scale_x_log10(labels = scales::dollar_format())
## Scale for x is already present.
## Adding another scale for x, which will replace the existing scale.
# Print the modified ggplot object
print(p)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
[Figure: "Life expectancy and GDP per capita by continent". x-axis: GDP per capita (ticks at $1,000, $10,000, $100,000); y-axis: Life expectancy (ticks at 40, 60, 80); point size: Population; color: Continent (Africa, Americas, Asia, Europe, Oceania).]
#p <- p + scale_x_log10(labels = scales::dollar_format()): This line sets the x-axis of p to a log10 scale using scale_x_log10() and formats the axis labels as dollar values using scales::dollar_format(). The "Scale for x is already present" message shows that p already had an x scale, which this call replaces.
#print(p): This line prints the modified p, displaying the plot with the log10 x-axis and dollar-formatted labels.
The output is a plot whose x-axis is on a log10 scale with labels formatted as dollar values. The log scale makes relationships visible across the several orders of magnitude that GDP per capita covers, and the dollar labels make clear that the axis shows a monetary value.
10. Trying Scale Y-Axis Log10: In this step, we experiment with a log10 scale on the y-axis. A log scale is most useful for variables with highly skewed distributions or values that span several orders of magnitude. Life expectancy covers a much narrower range than GDP per capita, so here the transformation is an experiment to see how the plot changes rather than a necessary adjustment.
# Transform the y-axis scale to log10
p <- p + scale_y_log10()
# Print the modified ggplot object
print(p)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
[Figure: "Life expectancy and GDP per capita by continent". x-axis: GDP per capita (ticks at $1,000, $10,000, $100,000); y-axis: Life expectancy (ticks at 30, 50, 70); point size: Population; color: Continent (Africa, Americas, Asia, Europe, Oceania).]
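The repeated geom_smooth() messages in the output suggest that each p + geom_smooth() call stacks a new layer on the plot instead of replacing the previous one. The following is a minimal sketch for checking this, assuming that the gapminder data frame comes from the gapminder package and using a simplified set of aesthetic mappings; the names n_layers, layer_geoms, base, and p_lm are introduced only for illustration.

```r
library(ggplot2)
library(gapminder)  # assumption: `gapminder` is the data frame from this package

# Start from a base plot (simplified mappings, not the report's exact plot)
p <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))

p <- p + geom_point()                # layer 1: points
p <- p + geom_smooth()               # layer 2: smoother with the default method
p <- p + geom_smooth(method = "lm")  # layer 3: linear fit; layer 2 is kept

# Count the layers and list the geom class behind each one
n_layers <- length(p$layers)
layer_geoms <- vapply(p$layers, function(l) class(l$geom)[1], character(1))
print(n_layers)
print(layer_geoms)

# To switch smoothers instead of stacking them, rebuild from the base plot
base <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp))
p_lm <- base + geom_point() + geom_smooth(method = "lm")
```

Keeping an unmodified base object and adding layers to it for each variant is one way to avoid carrying old smoothers from step to step.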
#p <- p + scale_y_log10(): This line sets the y-axis of the existing ggplot object p to a log10 scale using scale_y_log10(). A log scale compresses the axis and is most beneficial for variables with highly skewed distributions or very wide ranges of values.
#print(p): This line prints the modified p, displaying the plot with the log10 y-axis.
The output is a plot whose y-axis is on a log10 scale. The y-axis ticks now run from 30 to 70, which is well under one order of magnitude, so the transformation changes the appearance of the plot only slightly; for life expectancy a linear y-axis is usually sufficient.
11. Changing Method to 'gam' from 'lm': In this step, we use 'gam' (generalized additive model) as the smoothing method in geom_smooth() instead of 'lm' (linear model). Unlike linear models, generalized additive models can capture non-linear relationships in the data, so they may reveal patterns that a straight line misses. Because the code adds a new geom_smooth() layer to p, the GAM curve is drawn in addition to the LOESS and linear smoothers from the earlier steps.
# Add a smoother that uses the "gam" method (generalized additive model)
p <- p + geom_smooth(method = "gam")
# Print the modified ggplot object
print(p)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using formula = 'y ~ s(x, bs = "cs")'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
[Figure: "Life expectancy and GDP per capita by continent". x-axis: GDP per capita (ticks at $1,000, $10,000, $100,000); y-axis: Life expectancy (ticks at 30, 50, 70); point size: Population; color: Continent (Africa, Americas, Asia, Europe, Oceania).]
#p <- p + geom_smooth(method = "gam"): This line adds another geom_smooth() layer to the existing ggplot object p. The method = "gam" argument specifies that a generalized additive model (GAM) is used for smoothing; the console message shows the formula y ~ s(x, bs = "cs").
#print(p): This line prints the modified p, displaying the plot with the GAM smoother added.
The output is a plot that includes a GAM smoother alongside the earlier smoothers. GAMs are more flexible than linear models and can capture non-linear patterns that a straight line overlooks, which helps represent the underlying structure of the data.
12. Replacing Scientific Notation with Dollar Signs: In this step, we format the x-axis labels as dollar amounts instead of scientific notation. Dollar labels are easier to read for monetary values such as GDP per capita and make the meaning of the axis clear. The code for this step uses scale_x_continuous(), which also replaces the log10 x scale set in step 9, so the x-axis returns to a linear scale.
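As a hedged illustration of how scale replacement works, the sketch below compares a log10 x scale with dollar labels against the same plot after scale_x_continuous() is added. It assumes the gapminder package provides the data and uses simplified mappings; base, p_log, p_linear, max_x_log, and max_x_linear are illustrative names, not part of the original assignment.

```r
library(ggplot2)
library(gapminder)  # assumption: `gapminder` is the data frame from this package

base <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point()

# One scale call can do both jobs: log10 axis and dollar labels
p_log <- base + scale_x_log10(labels = scales::dollar)

# Adding scale_x_continuous() afterwards replaces the log10 scale;
# ggplot2 reports "Scale for x is already present" when this happens
p_linear <- p_log + scale_x_continuous(labels = scales::dollar)

# layer_data() returns x values as the scale sees them:
# log10(GDP per capita) for p_log, raw dollars for p_linear
max_x_log <- max(layer_data(p_log, 1)$x)
max_x_linear <- max(layer_data(p_linear, 1)$x)
print(c(log10_scale = max_x_log, linear_scale = max_x_linear))
```

If the goal is only to change the label format, passing labels to scale_x_log10() keeps the log axis intact.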
# Replace scientific notation with dollar signs on the x-axis
p <- p + scale_x_continuous(labels = scales::dollar_format())
## Scale for x is already present.
## Adding another scale for x, which will replace the existing scale.
# Print the modified ggplot object
print(p)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## `geom_smooth()` using formula = 'y ~ s(x, bs = "cs")'
## Warning: The following aesthetics were dropped during statistical transformation: size.
## i This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## i Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
[Figure: "Life expectancy and GDP per capita by continent". x-axis: GDP per capita (ticks at $0, $30,000, $60,000, $90,000); y-axis: Life expectancy (ticks at 30, 50, 100); point size: Population; color: Continent (Africa, Americas, Asia, Europe, Oceania).]
#p <- p + scale_x_continuous(labels = scales::dollar_format()): This line applies a continuous x-axis scale to the existing ggplot object p using scale_x_continuous(). The labels = scales::dollar_format() argument formats the axis labels as dollar values using the dollar_format() function from the scales package. As the "Scale for x is already present" message indicates, this scale replaces the log10 scale from step 9.
#print(p): This line prints the modified p, displaying the plot with the new x-axis scale and labels.
The output is a plot whose x-axis labels are formatted as dollar values. The labels were already shown in dollars after step 9, so the visible change is that the x-axis is linear again, with ticks at $0, $30,000, $60,000, and $90,000 instead of $1,000, $10,000, and $100,000. To keep both the log scale and the dollar labels, the labels argument can be passed to scale_x_log10() as in step 9.
13. Identify Continent with Color: In this step, we color each point according to its continent. The continent variable is mapped to the color aesthetic in the aes() call inside ggplot(), so every layer inherits the mapping: each point is colored by continent, and geom_smooth() fits a separate line for each continent. This visually distinguishes the continents and makes it easier to compare patterns across geographical regions.
# Create a ggplot object ' p ' with GDP per capita on the x-axis, life expectancy on the y-axis, and color
p <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +
  # Add points to the plot
  geom_point() +
  # Add a smoothed line with the linear model method
  geom_smooth(method = "lm") +
  # Transform the x-axis scale to log10
  scale_x_log10() +
  # Add a title to the plot
  labs(title = "Relationship between GDP per capita and life expectancy")
# Print the plot
print(p)
## `geom_smooth()` using formula = 'y ~ x'
[Figure: "Relationship between GDP per capita and life expectancy". x-axis: gdpPercap (ticks at 1e+03, 1e+04, 1e+05); y-axis: lifeExp (ticks at 40, 60, 80); color: continent (Africa, Americas, Asia, Europe, Oceania).]
#p <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +: This line initializes a ggplot object p with the gapminder dataset, mapping GDP per capita to the x-axis, life expectancy to the y-axis, and continent to color.
#geom_point() +: This line adds points to the plot, displaying each data point as a dot colored by its continent.
#geom_smooth(method = "lm") +: This line adds a smoother that uses the linear model method ('lm'). Because color is mapped to continent in ggplot(), the data are grouped by continent and a separate straight line is fitted for each one, showing the trend between GDP per capita and life expectancy within each continent.
#scale_x_log10() +: This line sets the x-axis to a log10 scale, which makes relationships visible across the several orders of magnitude that GDP per capita covers. No labels argument is given, so the axis uses the default labels (1e+03, 1e+04, 1e+05).
#labs(title = "Relationship between GDP per capita and life expectancy"): This line adds a title that describes the relationship being visualized.
#print(p): This line prints the plot p, displaying the points, fitted lines, log10 x-axis, and title.
The output of this code is a scatter plot of life expectancy against GDP per capita (on a logarithmic scale). Each point represents one country in one year from the gapminder dataset, color-coded by continent. A straight line is fitted for each continent using the linear model method, showing how the trend between GDP per capita and life expectancy differs across continents.
14. Add Labels to the Plot: The goal of this step is to add labels to the plot with the geom_text() function. Labels such as country names placed near the corresponding data points give viewers specific information about individual observations. The code for this step, however, does not call geom_text(); it redraws the plot with a constant color aesthetic, a GAM smoother, and a log10 x-axis with dollar labels.
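Since this step describes geom_text() labels, here is a minimal sketch of how they could be added. It assumes the gapminder package provides the data; the 2007 subset, the population threshold of 100 million, and the names gm_2007, big, and p_text are hypothetical choices made only to keep the labels readable.

```r
library(ggplot2)
library(gapminder)  # assumption: `gapminder` is the data frame from this package

# Hypothetical subset: one year only, so each country appears once
gm_2007 <- subset(gapminder, year == 2007)

# Hypothetical rule: label only countries with more than 100 million people
big <- subset(gm_2007, pop > 1e8)

p_text <- ggplot(gm_2007, aes(x = gdpPercap, y = lifeExp)) +
  # All countries as semi-transparent points
  geom_point(alpha = 0.4) +
  # Country names for the labeled subset, nudged above the points
  geom_text(
    data = big,
    mapping = aes(label = country),
    vjust = -0.8,
    size = 3,
    check_overlap = TRUE  # skip labels that would overlap an earlier label
  ) +
  scale_x_log10(labels = scales::dollar)

print(p_text)
```

Labeling every row of the full dataset would cover the plot in text, which is why the sketch labels only a small subset.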
# Create a ggplot object ' p ' with GDP per capita on the x-axis, life expectancy on the y-axis, and setti
p <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = "blue")) +
  # Add points to the plot
  geom_point() +
  # Add a smoothed line with the generalized additive model method
  geom_smooth(method = "gam") +
  # Transform the x-axis scale to log10 and label with dollar signs
  scale_x_log10(labels = scales::dollar)
# Print the plot
print(p)
## `geom_smooth()` using formula = 'y ~ s(x, bs = "cs")'
[Figure: x-axis: gdpPercap (ticks at $1,000, $10,000, $100,000); y-axis: lifeExp (ticks at 40, 60, 80); legend: colour, with the single entry "blue".]
#p <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = "blue")) +: This line initializes a ggplot object p with the gapminder dataset, mapping GDP per capita to the x-axis and life expectancy to the y-axis. Because color = "blue" is placed inside aes(), ggplot2 treats "blue" as a variable with one level rather than as a color name: it adds a legend titled colour with the single entry blue and draws the points in the first color of the default palette, not in blue. To draw blue points, color = "blue" would be set outside aes(), for example in geom_point(color = "blue").
#geom_point() +: This line adds points to the plot, displaying each data point as a dot.
#geom_smooth(method = "gam") +: This line adds a smoother that uses the generalized additive model method ('gam'), which can capture non-linear relationships in the data.
#scale_x_log10(labels = scales::dollar): This line sets the x-axis to a log10 scale using scale_x_log10() and formats its labels as dollar values using labels = scales::dollar.
#print(p): This line prints the plot p, displaying the points, the smoothed line, and the log10 x-axis.
The output is a plot of life expectancy against GDP per capita, with a smoothed line showing the general trend. The points share a single color and the legend shows the entry "blue". The x-axis is on a log10 scale with dollar labels, which keeps the data readable across the range of GDP per capita. No text labels appear, because geom_text() is not used.
15. Change Smoothing Method to Loess: In this step, we change the smoothing method in geom_smooth() from the generalized additive model used in the previous step ('gam') to local regression ('loess'). The 'loess' method fits a non-parametric curve by local regression, giving a smooth representation of the relationship between GDP per capita and life expectancy that is not restricted to a straight line. This can capture more nuanced trends in data with complex relationships.
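The code for this step places color = "blue" inside aes(). As a hedged sketch of the difference between mapping and setting a color, the example below builds one plot each way, with a LOESS smoother on the second. It assumes the gapminder package provides the data; p_mapped, p_set, mapped_cols, and set_cols are illustrative names.

```r
library(ggplot2)
library(gapminder)  # assumption: `gapminder` is the data frame from this package

# Mapping: "blue" inside aes() is treated as data, so a colour scale
# assigns it a palette color and a legend entry
p_mapped <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = "blue")) +
  geom_point()

# Setting: color given outside aes() is used literally, with no legend
p_set <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(color = "blue") +
  geom_smooth(method = "loess", color = "blue") +
  scale_x_log10(labels = scales::dollar)

# Inspect the colors actually used for the point layer of each plot
mapped_cols <- unique(layer_data(p_mapped, 1)$colour)
set_cols <- unique(layer_data(p_set, 1)$colour)
print(mapped_cols)
print(set_cols)
```

The second form is usually what is intended when a single fixed color is wanted for all points.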
# Create a ggplot object ' p ' with GDP per capita on the x-axis, life expectancy on the y-axis, and setti
p <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = "blue")) +
  # Add points to the plot
  geom_point() +
  # Add a smoothed line with the loess method
  geom_smooth(method = "loess") +
  # Transform the x-axis scale to log10 and label with dollar signs
  scale_x_log10(labels = scales::dollar) +
  # Add a title to the plot
  labs(title = "Relationship between GDP per capita and life expectancy")
# Print the plot
print(p)
## `geom_smooth()` using formula = 'y ~ x'
[Figure: "Relationship between GDP per capita and life expectancy". x-axis: gdpPercap (ticks at $1,000, $10,000, $100,000); y-axis: lifeExp (ticks at 40, 60, 80); legend: colour, with the single entry "blue".]
#p <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = "blue")) +: This line initializes a ggplot object p with the gapminder dataset, mapping GDP per capita to the x-axis and life expectancy to the y-axis. As in the previous step, color = "blue" inside aes() is treated as a one-level variable, so the plot gets a legend entry named blue and the points are drawn in the default palette color rather than in blue.
#geom_point() +: This line adds points to the plot, displaying each data point as a dot.
#geom_smooth(method = "loess") +: This line adds a smoother that uses the 'loess' method, which fits a non-parametric local regression curve to the data points.
#scale_x_log10(labels = scales::dollar) +: This line sets the x-axis to a log10 scale using scale_x_log10() and formats its labels as dollar values using labels = scales::dollar.
#labs(title = "Relationship between GDP per capita and life expectancy"): This line adds a title that describes the relationship being visualized.
#print(p): This line prints the plot p, displaying the points, the smoothed line, the log10 x-axis, and the title.
The output of this code is a scatter plot of life expectancy against GDP per capita. Each data point is drawn as a colored dot, and a smoothed line is fitted to the data using the 'loess' method. The x-axis is on a logarithmic scale with dollar labels, which keeps the data readable across different magnitudes of GDP per capita.
16. Modify Appearance Using Fill: In this step, we aim to improve the visual clarity of the plot. In ggplot2, the fill aesthetic controls the interior color of filled elements, such as bars, fillable point shapes, and the standard error ribbon drawn by geom_smooth(); lines are controlled by color instead. The code for this step does not set fill directly. It adjusts point transparency, the color and width of the fitted line, the axis labels, and the theme.
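As a hedged sketch of what the fill aesthetic can do, the example below maps both color and fill to continent so that each fitted line and its standard error ribbon share a color. It assumes the gapminder package provides the data; p_fill and ribbon_fills are illustrative names, and the choice of a linear smoother is arbitrary.

```r
library(ggplot2)
library(gapminder)  # assumption: `gapminder` is the data frame from this package

p_fill <- ggplot(
  gapminder,
  aes(x = gdpPercap, y = lifeExp, color = continent, fill = continent)
) +
  # Points use `color`; the default point shape has no interior to fill
  geom_point(alpha = 0.3) +
  # Lines use `color`; the standard error ribbons use `fill`
  geom_smooth(method = "lm") +
  scale_x_log10(labels = scales::dollar)

# The smoother layer carries one fill value per continent
ribbon_fills <- unique(layer_data(p_fill, 2)$fill)
print(ribbon_fills)

print(p_fill)
```

Without the fill mapping, the ribbons would all be drawn in the same default gray, which makes overlapping groups harder to tell apart.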
    # Create the ggplot object with GDP per capita on the x-axis and life expectancy on the y-axis
    p <- ggplot(data = gapminder,
                mapping = aes(x = gdpPercap, y = lifeExp)) +
      # Add points with alpha = 0.3 (30% opacity)
      geom_point(alpha = 0.3) +
      # Add a blue linear-model fit line with its confidence interval, at line width 1
      # (linewidth replaces the deprecated size argument for lines in ggplot2 >= 3.4.0)
      geom_smooth(color = "blue", se = TRUE, linewidth = 1, method = "lm") +
      # Customize x-axis label and format with commas
      scale_x_continuous(name = "GDP per capita", labels = scales::comma) +
      # Customize y-axis label
      scale_y_continuous(name = "Life expectancy") +
      # Apply classic theme for a cleaner look
      theme_classic()

    # Print the plot
    print(p)

    ## `geom_smooth()` using formula = 'y ~ x'
[Figure: scatter plot with a linear fit. x-axis: GDP per capita (ticks at 0, 30,000, 60,000, 90,000); y-axis: Life expectancy (ticks at 50, 100, 150).]

#p <- ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)): This line creates a ggplot object p using the gapminder dataset. It maps gdpPercap to the x-axis and lifeExp to the y-axis.
#geom_point(alpha = 0.3): Adds points to the plot. The alpha parameter controls opacity, so alpha = 0.3 makes each point 30% opaque (70% transparent), which helps show where points overlap.
#geom_smooth(color = "blue", se = TRUE, linewidth = 1, method = "lm"): Adds a fitted line using a linear model (method = "lm"). The line is colored blue (color = "blue"), the confidence interval is displayed (se = TRUE), and the line width is set to 1. In ggplot2 3.4.0 and later the line-width argument is linewidth; the older spelling size = 1 still works but triggers a deprecation warning.
#scale_x_continuous(name = "GDP per capita", labels = scales::comma): Sets the x-axis title to "GDP per capita" and formats the tick labels with thousands separators using scales::comma.
#scale_y_continuous(name = "Life expectancy"): Sets the y-axis title to "Life expectancy".
#theme_classic(): Applies the classic theme, which gives a clean, minimal look by removing the background grid lines and using a plain white background with axis lines.
#print(p): Prints the ggplot object p, displaying the plot.

The resulting plot shows the relationship between GDP per capita and life expectancy. Each point represents one country in one year and is drawn at 30% opacity. The blue line is the linear trend between GDP per capita and life expectancy, with its confidence interval shown as a shaded band. The x-axis values are formatted with commas for readability, and the plot uses the classic theme.

17. Limit the Figure Size in R Markdown:
Limiting the figure size in R Markdown helps keep plots clear and readable within the document. By setting the fig.width and fig.height chunk options, we control the dimensions of our plots (in inches), making them consistent and well fitted to the page layout. This keeps the report uniform and improves its visual quality, especially when several plots are displayed together.
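One way to confirm that these chunk options behave as described is to set them and read them back. The sketch below assumes only that the knitr package is installed; the variable names are illustrative and not part of the original report.

```r
library(knitr)

# Remember the current defaults so they can be put back afterwards.
old_width  <- opts_chunk$get("fig.width")
old_height <- opts_chunk$get("fig.height")

# Set document-wide defaults for all later chunks (values are in inches).
opts_chunk$set(fig.width = 8, fig.height = 5)

# Read the options back to confirm they took effect.
new_width  <- opts_chunk$get("fig.width")
new_height <- opts_chunk$get("fig.height")
c(width = new_width, height = new_height)

# Put the earlier defaults back.
opts_chunk$set(fig.width = old_width, fig.height = old_height)
```

Values set this way are defaults: a single chunk can still override them by giving fig.width or fig.height in its own chunk header.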
    # Limit the figure size in R Markdown to 8 x 5 inches
    knitr::opts_chunk$set(fig.width = 8, fig.height = 5)

#knitr::opts_chunk$set: opts_chunk is an object in the knitr package that holds the default chunk options, and its set() function changes those defaults for the R code chunks that follow.
#fig.width = 8, fig.height = 5: This specifies the width and height of the figures generated by R code chunks. Here, fig.width = 8 sets the width to 8 inches, and fig.height = 5 sets the height to 5 inches.

With these options set, every plot generated by a later R code chunk in the document defaults to 8 inches wide and 5 inches high, unless a chunk overrides the values in its own header. This keeps plot dimensions consistent throughout the document, so the figures fit the layout and are easier to read.

18. Save one of the plots in its own file:
This step uses the ggsave() function to save a ggplot object as a standalone image file, such as PNG, JPG, or PDF. We specify the file name (whose extension determines the format) and the dimensions of the plot. A saved file can be shared and placed in other documents and presentations, and regenerating it from code keeps the figure reproducible.
    # Create the initial ggplot object and add blue points
    p <- ggplot(data = gapminder,
                mapping = aes(x = gdpPercap, y = lifeExp)) +
      geom_point(color = "blue")  # color is set outside aes(), so all points are blue

    # Add a loess smoother and a log10 x-axis to the plot object
    p_out <- p +
      geom_smooth(method = "loess") +
      scale_x_log10()

    # Save the plot with specified dimensions as a JPG file
    ggsave("my_last_plot.jpg", plot = p_out, width = 8, height = 5)

    ## `geom_smooth()` using formula = 'y ~ x'

#ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)): This line initializes a ggplot object named p, using the gapminder dataset. It maps gdpPercap to the x-axis and lifeExp to the y-axis.
#geom_point(color = "blue"): This line adds a layer of points to the plot. Because color = "blue" is set outside aes(), it is applied as a literal color to every point and produces no legend.
#geom_smooth(method = "loess"): This adds a smooth line to the plot using the loess method.
#scale_x_log10(): This changes the x-axis to a logarithmic (base 10) scale.
#ggsave("my_last_plot.jpg", plot = p_out, width = 8, height = 5): This line saves the modified plot (p_out) as a JPG file named "my_last_plot.jpg" in the current working directory, with a width of 8 inches and a height of 5 inches.

The output is a scatter plot of the relationship between GDP per capita and life expectancy. Each point represents one country in one year and is drawn in blue. A smooth curve fitted by the loess method is overlaid to show the underlying trend, and the x-axis is on a logarithmic scale.
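The difference between setting a color outside aes(), as geom_point(color = "blue") does, and mapping the same string inside aes() can be checked directly. The following is a small sketch under the assumption that the gapminder package is installed; base, p_set, and p_map are hypothetical names used only for this illustration.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data frame

base <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  scale_x_log10()

# Setting: a literal color applied to every point, no legend.
p_set <- base + geom_point(color = "blue")

# Mapping: "blue" is treated as a data value with one level, so ggplot2
# assigns a default palette color and draws a legend keyed "blue".
p_map <- base + geom_point(mapping = aes(color = "blue"))

# layer_data() returns the values ggplot2 actually draws with.
set_colour <- unique(layer_data(p_set)$colour)
map_colour <- unique(layer_data(p_map)$colour)
set_colour
map_colour
```

The first value is the literal "blue", while the second is a color chosen by the default discrete scale, which is why a mapped "blue" does not look blue.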
19. Saving the Plot in Different Formats and Locations:
In this step, we save the plot in another format so that it can be used for different purposes and platforms. The ggsave() function can export a plot as PDF, PNG, JPG, and other formats, choosing the format from the file extension. A file name without a directory is written to the current working directory; to save somewhere else, include a directory in the file name or pass it through the path argument. Keeping saved figures in well-organized locations makes them easy to retrieve and share across projects or with collaborators, and saving them from code supports the reproducibility of the analysis.

    # Save the plot with specified dimensions as a PDF file
    ggsave("my_last_plot.pdf", plot = p_out, width = 8, height = 5)

    ## `geom_smooth()` using formula = 'y ~ x'

#ggsave(): A function provided by the ggplot2 package that saves a ggplot object as an external file.
#"my_last_plot.pdf": This is the name of the file to which the plot will be saved. The .pdf extension tells ggsave() to write a PDF file. PDF is a vector format, so the plot stays sharp at any zoom level and displays consistently across devices.
#plot = p_out: This argument specifies the ggplot object to save. p_out is the initial plot p with the additional layers and scale adjustments from the previous step.
#width = 8 and height = 5: These parameters specify the width and height of the saved image, in inches by default. Here, the width is 8 inches and the height is 5 inches. Adjusting these values controls the aspect ratio and size of the saved plot.

The result is a PDF file named "my_last_plot.pdf" containing the plot stored in p_out, 8 inches wide and 5 inches high.
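As an illustration of saving to more than one format and to a chosen location, the sketch below writes the same plot as a PDF and a PNG into a subfolder. It is not part of the original report: the plot p_demo, the folder name "figures", and the use of a temporary directory are assumptions made to keep the example self-contained, and the gapminder package is assumed to be installed.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data frame

p_demo <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.3) +
  scale_x_log10()

# Hypothetical output folder; a temporary directory keeps the sketch tidy.
out_dir <- file.path(tempdir(), "figures")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# ggsave() picks the graphics device from the file extension.
pdf_file <- file.path(out_dir, "my_last_plot.pdf")
png_file <- file.path(out_dir, "my_last_plot.png")

ggsave(pdf_file, plot = p_demo, width = 8, height = 5)             # vector format
ggsave(png_file, plot = p_demo, width = 8, height = 5, dpi = 300)  # raster format

# Confirm that both files were written.
file.exists(c(pdf_file, png_file))
```

In a real project, out_dir would normally be a folder inside the project directory rather than tempdir(), so that the saved figures persist between sessions.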
20. Mapping different attributes from the Gapminder dataset:
This step explores the relationships between several variables visually. By assigning variables to aesthetic mappings such as the x-axis, y-axis, color, shape, and size, we can see how the attributes relate to each other and how they are distributed across the plot, which helps reveal patterns or trends in the data.

    # Create a ggplot object with different attributes from the Gapminder dataset
    p <- ggplot(data = gapminder,
                mapping = aes(x = year, y = gdpPercap, color = continent, shape = continent)) +
      geom_point() +  # Add points
      # Add linear fits in blue; continent is mapped in ggplot(), so one line is fitted per continent
      geom_smooth(method = "lm", color = "blue") +
      # Add labels for the title, subtitle, and caption
      labs(title = "Relationship between Year and GDP per Capita",
           subtitle = "Data from Gapminder dataset",
           caption = "Source: Gapminder")

    # Print the plot
    print(p)

    ## `geom_smooth()` using formula = 'y ~ x'
[Figure: "Relationship between Year and GDP per Capita", subtitle "Data from Gapminder dataset", caption "Source: Gapminder". x-axis: year (ticks from 1950 to 2000); y-axis: gdpPercap (ticks at 0, 30000, 60000, 90000); legend: continent (Africa, Americas, Asia, Europe, Oceania).]

#p <- ggplot(data = gapminder, mapping = aes(x = year, y = gdpPercap, color = continent, shape = continent)): This line initializes a ggplot object p with the Gapminder dataset, mapping year to the x-axis, GDP per capita to the y-axis, and continent to both the color and shape aesthetics.
#geom_point(): This adds points to the plot, one for each observation in the Gapminder dataset.
#geom_smooth(method = "lm", color = "blue"): This adds linear regression fits using the lm (linear model) method, drawn in blue. Because continent is mapped in ggplot(), the smoother inherits that mapping and the data are grouped by continent, so a separate line is fitted for each continent rather than one overall line; color = "blue" only sets the color of those lines.
#labs(title = "Relationship between Year and GDP per Capita", subtitle = "Data from Gapminder dataset", caption = "Source: Gapminder"): This line adds a title, subtitle, and caption. The title describes the relationship being visualized, and the subtitle and caption both identify the data source.
#print(p): This prints the plot p to the current graphics device so that it can be viewed.

The result is a scatter plot of GDP per capita against year for the Gapminder dataset. Each continent is distinguished by both color and shape. Blue linear regression lines, one per continent, are overlaid to show how GDP per capita changes over time within each continent. The title, subtitle, and caption identify what is plotted and where the data come from.
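How the placement of the continent mapping affects the regression layer can be sketched as follows. This is an illustration, not the author's code: it assumes the gapminder package is installed, and p_grouped, p_overall, n_grouped, and n_overall are hypothetical names. It contrasts the layout used in this step with an alternative that yields a single overall trend line.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data frame

# Layout used in this step: continent is mapped in ggplot(), so every
# layer inherits it and geom_smooth() fits one line per continent.
p_grouped <- ggplot(gapminder,
                    aes(x = year, y = gdpPercap,
                        color = continent, shape = continent)) +
  geom_point() +
  geom_smooth(method = "lm", color = "blue")

# Alternative: map continent only in geom_point(), so the smoother treats
# all observations as one group and fits a single overall line.
p_overall <- ggplot(gapminder, aes(x = year, y = gdpPercap)) +
  geom_point(aes(color = continent, shape = continent)) +
  geom_smooth(method = "lm", color = "blue")

# Count the groups that the smoother layer (layer 2) was fitted on.
n_grouped <- length(unique(layer_data(p_grouped, 2)$group))
n_overall <- length(unique(layer_data(p_overall, 2)$group))
c(grouped = n_grouped, overall = n_overall)
```

Which version is appropriate depends on the question: per-continent lines compare trends between continents, while a single line summarizes the trend across all observations.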