Lab 11: Introduction to the R statistical computing language

Lab objectives:
I. Learn the basics of R and RStudio
   a. Setting up RStudio
   b. Objects and functions
II. Learn how to import and export files in R
   a. Directories and paths
   b. Reading and writing .csv files
   c. Basic plotting
III. Learn how to manipulate and plot data using the tidyverse
   a. Downloading and loading packages
   b. Using the tidyverse to explore data
   c. Plotting with ggplot2

Background
R is an open-source programming language designed for statistical analysis and plotting data. This language comes equipped with many tools for conducting exploratory data analyses, standard statistical tests, and data visualization. A major benefit to using R is the wide array of functionality and customization available to its users through shareable code written by R's user community. These freely available, specialized code collections are called packages. Packages can be freely downloaded from The Comprehensive R Archive Network (CRAN) package repository, which currently features >18,000 packages with various functions. R is a powerful tool because it provides users with a programming environment for statistics and visualization that is customizable to different fields of research by using these specialized packages.

Working directly in R can be challenging, especially when first learning the language. Using an integrated development environment (IDE) like RStudio helps with programming by providing interfaces that allow programmers to edit, interpret, and access code through a single graphical user interface (GUI). In today's lab, you will learn the basics of how to use the R programming language in RStudio by working with a Pokémon dataset.

Requirements:
You will need to have R and RStudio installed on your computer. R can be downloaded here and RStudio can be downloaded here. Select the correct installer for your operating system. Download the Lab_11 folder here.
Sources:
https://jcoliver.github.io/learn-r/002-intro-stats.html#solution-to-challenge-2
https://kirstenmorehouse.wordpress.com/354-2/topic-1-crash-course-in-r/
https://r4ds.had.co.nz/wrangle-intro.html
https://bookdown.org/ndphillips/YaRrr/the-four-rstudio-windows.html

I. The basics of R and RStudio

a. Setting up: Opening RStudio and the RStudio interface

Begin the lab by opening RStudio on your device. Navigate to the top left of the screen and go to File > New Project. A new window will appear. In this new window, select the second option, Existing Directory, to associate a project with an existing working directory. Click the Browse button and navigate to the location on your computer where the Lab_11 folder has been extracted. Click on Lab_11, click Open, and then click Create Project. Navigate to File and click New File > R Script. You should now see four panels displayed in the RStudio application. Your screen should look similar to the image below.

Panel 1 (top left) - Source: This panel acts like a basic text editor, and it is where you can write, annotate, and save commands. Text that includes commands can be called code or a script. You can also run commands from this panel by highlighting the relevant script and clicking the Run button on the top right corner of this panel. You can have multiple scripts open at the same time, and each script will have its own tab. Scripts in this panel can be saved as text files by clicking the Save (floppy disk) icon. Writing, running, and saving the code in this panel provides a shareable record of the commands used to perform an analysis, which facilitates replicability and troubleshooting.

Panel 2 (top right) - Environment/History: This panel includes four tabs, but only the Environment and History tabs will be used in these laboratory exercises. The Environment tab contains information about the objects that are in your working space. The R Environment can also be saved like a text document. This can be done by clicking the Save (floppy disk) icon on the toolbar of the Environment tab. When all objects in your Environment are saved to an .RData file, they can later be loaded as a single unit without having to import all of the data again. The History tab lists all the commands that have been executed, and this can also be saved if needed.
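
The working directory and the Environment tab's save and load buttons also have command equivalents that can be typed in a script or in the console. The following is a minimal sketch, not part of the original lab: the object names lab_number and lab_topic are hypothetical, and a temporary file stands in for a file in your Lab_11 folder so that nothing is written into the project.

```r
# Show the current working directory (the Lab_11 folder once the project is open)
# and the files it contains, as the Files tab does.
getwd()
list.files()

# Hypothetical objects, created only so there is something to save.
lab_number <- 11
lab_topic <- "Introduction to R"

# List the objects in the workspace, as the Environment tab does.
ls()

# Assumption: a temporary file is used instead of a file in the project folder.
workspace_file <- tempfile(fileext = ".RData")

# Save every object in the workspace to a single file.
save(list = ls(), file = workspace_file)

# Remove the two objects, then restore them from the saved file.
rm(lab_number, lab_topic)
load(workspace_file)

# The restored object is available again.
lab_number
```

In an interactive session, save.image() and savehistory() do the same for the whole workspace and the command history, respectively.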

Panel 3 (bottom left) - Console: There are three tabs in this panel, but only the Console tab will be used in these laboratory exercises. This is where your R code is executed after being run from the Source panel. This is also the location in the RStudio interface where you will see outputs and any errors or warnings from your commands. If you were to run R without an IDE like RStudio, you would only see a console screen. You can run commands directly through the console by typing a command after the “>” character and pressing Enter. The “>” character is referred to as a prompt. When the “>” character is visible, it means that R is “waiting” for a command. If a command is currently running, the “>” character will disappear, and a stop sign (red octagon) will become visible in the top right of this panel. You can terminate a command by clicking this stop sign.

Panel 4 (bottom right) - Files, plots, packages, and help: This panel includes six tabs, but you likely will only use the Files, Plots, Packages, and Help tabs. The Files tab acts like a file explorer, and it shows what files are currently within your working directory. The Plots tab displays any graphics that are created using commands. The Packages tab lists all the packages currently installed in R on your device. Packages can be loaded or unloaded by clicking the checkbox next to the package name. You can install packages from CRAN in the Packages tab by clicking the Install button on the top left portion of the panel. The Help tab includes detailed information about packages and their functions.

b. Objects and functions

Data structures, data types, and creating an object:
Programming in R involves at least one object and one function. The most conventional way to think about objects is to imagine them as a shortcut to store data so they can easily be used later. Objects can be created in R by using the arrow-like assignment operator “<-” (a less-than sign “<” and a dash “-”), which assigns the value on the right of the operator to the object name on the left of the operator. Operators are characters that perform specific tasks with a piece of code or an object. There are many operators included in R that are used in object assignment, arithmetic calculations, and coding logic. For example:

> pi <- 3.1415927

…creates an object called “pi” that stands for the numerical value of pi to 7 decimal places. The direction of the arrow can be swapped as long as it points to the object name. The command…

> 3.1415927 -> pi

…is equivalent.

Object names cannot begin with a number, cannot be a reserved word (such as if, for, or TRUE), and should not reuse the name of an existing function, because doing so invites confusion. Assigning a value to an object name that is already in use overwrites the old object without warning. Create object names that are concise and descriptive.

Navigate to the Source panel (Panel 1) in RStudio. Create an object named “one” with a value of 1 by first writing the word one on line 1, followed by the assignment operator <-, and the value 1. Highlight this line by clicking and dragging your cursor, and then click the Run button on the top right corner of the Source panel.
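
The assignment rules described above can be tried out with a short script. This is a minimal sketch rather than part of the lab exercise: only the object “one” comes from the instructions, while “two” and “three” are hypothetical names added for illustration.

```r
# Assign the value on the right to the object name on the left.
one <- 1

# The arrow can be reversed as long as it still points at the object name.
2 -> two

# Running an object's name on its own prints the stored value.
one

# Objects can be used in place of the values they store.
three <- one + two

# Assigning to a name that is already in use overwrites the old value
# without any warning.
one <- 10
one
```

After the last line runs, “one” holds 10 while “three” still holds 3, because “three” stored the result of the addition at the time it was created.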
