Introductory Statistics: R Basics and Vector Operations

Introductory Statistics - Discussion #06 ILRST2100/STSCI2100 Spring 2024

Discussion 1. Introduction

Points to Cover
Intro to R, RStudio, Posit Cloud, and Gradescope

Review

Basic Arithmetic
# Addition
7 + 2
[1] 9
# Subtraction
7 - 2
[1] 5
# Division
7 / 2
[1] 3.5
# Multiplication
7 * 2
[1] 14
# Negation
-7
[1] -7
# Exponents
7^2
[1] 49
# Square Roots
sqrt(7)
[1] 2.645751

Quotients and Remainders
# Remainders
7 %% 2
[1] 1
# Quotients
7 %/% 2
[1] 3

Trig Functions
sin(7)
[1] 0.6569866
cos(7)
[1] 0.7539023
tan(7)
[1] 0.871448

Logarithm and Exponential Function
# Natural Log
log(7)
[1] 1.94591
# Exponential
exp(7)
[1] 1096.633

Discussion 2. Numerical Vectors

Review

Vector Assignments
c(1, 4, 3.2) # Combine elements into a vector
[1] 1.0 4.0 3.2
A1 <- c(5, 7, 3, -1) # Assign a vector to variable A1
# Variable names can only contain letters, numbers, underscores, and periods, are case-sensitive, and must start with a letter or a period followed by a letter.
A1
[1]  5  7  3 -1

Sequential Vectors
4:9 # increments of 1
[1] 4 5 6 7 8 9
seq(from = 4, to = 6.9, by = 0.5) # increments of .5 starting at 4. Ends at 6.5, the last value that does not exceed 6.9.
[1] 4.0 4.5 5.0 5.5 6.0 6.5
seq(from = 4, to = 6, length = 9) # 9 equally spaced elements starting at 4 and ending at 6
[1] 4.00 4.25 4.50 4.75 5.00 5.25 5.50 5.75 6.00

Replicating Vectors
rep(c(-4.9, -7, -4.5), times = 256) # replicate the vector c(-4.9, -7.0, -4.5) 256 times

Combine Vectors
x <- 7:4 # sequence vector
y <- rep(9, times = 3) # replicated vector
c(x, y, 10, x) # combining x, y, and 10
[1]  7  6  5  4  9  9  9 10  7  6  5  4

Filtering Elements
x <- 11:20
x
[1] 11 12 13 14 15 16 17 18 19 20
x[c(9, 1)] # filter the 9th and then the 1st element
[1] 19 11
x[-c(9, 1)] # filter everything except the 9th and 1st element
[1] 12 13 14 15 16 17 18 20

Vector Length
x <- c(12, 122, 122, 14, 15, 15, 15, 15, 12, 122)
length(x) # number of elements in x
[1] 10

Unique Elements
x <- c(12, 122, 122, 14, 15, 15, 15, 15, 12, 122)
unique(x) # unique elements in x
[1]  12 122  14  15

Table Elements
x <- c(12, 122, 122, 14, 15, 15, 15, 15, 12, 122)
table(x) # frequency (count) of each element in x
x
 12  14  15 122
  2   1   4   3

Vectorized Operations & Recycling
x <- c(2, 3)
y <- c(1, 2, 3, 4)
y^x # c(1^2, 2^3, 3^2, 4^3)
[1]  1  8  9 64
log(y) # c(log(1), log(2), log(3), log(4))
[1] 0.0000000 0.6931472 1.0986123 1.3862944
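The recycling in y^x can be made explicit. The sketch below is only an illustration: the names x_expanded, recycled, explicit, z, and partial are not part of the original review. It expands the shorter vector by hand and checks that the result matches what R does automatically.

```r
# Recycling: the shorter vector is repeated until it matches the longer one.
x <- c(2, 3)
y <- c(1, 2, 3, 4)

# y^x behaves as if x had first been expanded to c(2, 3, 2, 3).
x_expanded <- rep(x, length.out = length(y))
recycled <- y^x
explicit <- y^x_expanded
all.equal(recycled, explicit)

# When the longer length is not a multiple of the shorter length, R still
# recycles but issues a warning. The warning is suppressed here on purpose.
z <- c(1, 2, 3)
partial <- suppressWarnings(y + z) # c(1 + 1, 2 + 2, 3 + 3, 4 + 1)
```

Recycling is convenient, but a length mismatch is often a sign of a mistake, so it is worth checking length() on both vectors when a result looks odd.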

Discussion 3. Some Logic

Review

Sampling a Vector
# Vector to select from
a <- c(1, 2, 3, 4)
# Without Replacement & Equal Probabilities
sample(a, size = 2, replace = FALSE)
[1] 3 1
# With Replacement & Equal Probabilities
sample(a, size = 12, replace = TRUE)
[1] 3 1 1 1 2 1 1 4 2 1 2 1
# With Replacement & UNEQUAL Probabilities
# (the prob weights need not sum to 1; R rescales them)
sample(a, size = 12, replace = TRUE, prob = c(0.05, 0.05, 0.1, 0.7))
[1] 4 4 4 4 4 4 4 4 4 4 3 3

Help Menu
Type ? before the name of the function or command.
With respect to the labs, the help menu is not a window to all functions in R. You are restricted to using only those functions that are listed in the review sections of each lab.

Logical Vectors
# Don't name a variable TRUE or FALSE
A <- c(TRUE, TRUE, FALSE, FALSE) # must be in caps
B <- c(T, F, T, F) # must be in caps

Logical Operators - Vectorized
# Using the vectors A and B from above
!A # NOT, NEGATION - NOT TRUE is FALSE, NOT FALSE is TRUE
[1] FALSE FALSE  TRUE  TRUE
A & B # Vector-AND - Both TRUE for TRUE, otherwise FALSE
[1]  TRUE FALSE FALSE FALSE
A | B # Vector-OR - At least one TRUE for TRUE, otherwise FALSE
[1]  TRUE  TRUE  TRUE FALSE
# && Scalar-AND -> Single Element Vectors Only: Both TRUE for TRUE, otherwise FALSE
# || Scalar-OR -> Single Element Vectors Only: At least one TRUE for TRUE, otherwise FALSE
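The scalar operators && and || are described above without an example. A minimal sketch follows; the names vec_and, scalar_and, scalar_or, and short_circuit are illustrative only. It contrasts them with the vectorized versions.

```r
A <- c(TRUE, TRUE, FALSE, FALSE)
B <- c(TRUE, FALSE, TRUE, FALSE)

# Vectorized AND compares element by element and returns a vector.
vec_and <- A & B # TRUE FALSE FALSE FALSE

# Scalar operators expect single-element operands, so index first.
scalar_and <- A[1] && B[1] # TRUE && TRUE is TRUE
scalar_or <- A[3] || B[4]  # FALSE || FALSE is FALSE

# && and || short-circuit: the right-hand side is not evaluated when the
# left-hand side already decides the answer, so stop() never runs here.
short_circuit <- FALSE && stop("never evaluated")
```

Recent versions of R raise an error when && or || is given a vector with more than one element, which is one reason the scalar forms are usually reserved for the conditions of if and while statements.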

Comparisons
# Element by element comparisons
x <- c(1, 2, 3, 4, 5)
y <- c(5, 4, 3, 2, 1)
x == y # equality
[1] FALSE FALSE  TRUE FALSE FALSE
x != y # inequality
[1]  TRUE  TRUE FALSE  TRUE  TRUE
x <= y
[1]  TRUE  TRUE  TRUE FALSE FALSE
x < y
[1]  TRUE  TRUE FALSE FALSE FALSE
x >= y
[1] FALSE FALSE  TRUE  TRUE  TRUE
x > y
[1] FALSE FALSE FALSE  TRUE  TRUE

Any & All
x <- c(TRUE, FALSE)
y <- c(TRUE, TRUE)
z <- c(FALSE, FALSE)
# Are any TRUE?
any(x)
[1] TRUE
any(y)
[1] TRUE
# Are all TRUE?
all(x)
[1] FALSE
all(y)
[1] TRUE

Filtering with Logic
x <- c(1, 2, 3, 4)
bool <- c(FALSE, TRUE, FALSE, TRUE)
x[bool]
[1] 2 4
x[x > 3]
[1] 4
x[x >= 3]
[1] 3 4

ifelse
x <- c(1, 2, 3, 4)
y <- c(2, 2, 2, 2)
# if the corresponding values are equal, produce 0, otherwise produce 1
ifelse(x == y, 0, 1)
[1] 1 0 1 1

if…else…
x <- c(1, 2, 3, 4)
# If a condition is true, complete these steps; otherwise complete the alternate steps.
# Here: if the first value is 1, output the second value; else if the first value is 2, output the third value; otherwise output the first value.
y <- sample(x, size = 3, replace = FALSE)
y
[1] 1 3 4
if (y[1] == 1) {
  y[2]
} else if (y[1] == 2) {
  y[3]
} else {
  y[1]
}
[1] 3

factor()
x <- c("a", "b", "b", "d")
# Tells R that the information within should be treated as being collected from specific categories. The levels argument identifies the categories that were possible.
fact.x <- factor(x, levels = c("a", "b", "c", "d"))
x
[1] "a" "b" "b" "d"
fact.x
[1] a b b d
Levels: a b c d
table(x)
x
a b d
1 2 1
table(fact.x)
fact.x
a b c d
1 2 0 1

Discussion 4. For Loops & Strings

Review

Special "Numbers"
NA # Not Available. Missing Value
[1] NA
NaN # Not a Number - Undefined
[1] NaN
NULL # Just a placeholder.
NULL
Inf # Bigger than all big
[1] Inf

Character Vectors
# I will also refer to them as String Vectors
c("String", "Elements", "Are", "Wrapped", "In", "Quotes")
[1] "String"   "Elements" "Are"      "Wrapped"  "In"       "Quotes"
# If one character element is in a vector, the remaining elements will be coerced (turned) into character elements.
var <- c("var") # var is a variable that contains a character/string element 'var'

Paste & Print
x <- "Hello"
y <- "World!"
z <- paste(x, y, sep = "<|>") # One string created: 'Hello<|>World!'
print(z)
[1] "Hello<|>World!"

as.numeric
c(9, "9", "nine") # a character vector
[1] "9"    "9"    "nine"
as.numeric(c(9, "9", "nine")) # converts elements to numeric values, if possible
Warning: NAs introduced by coercion
[1]  9  9 NA

for Loop
givenVector <- c("orange", "red", "gold")
for (eachElement in givenVector) {
  # perform these steps
}
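The for loop template above leaves the body empty. As one possible way to fill it in (the names messages, i, and total_chars are illustrative, and nchar() is a base R function that is not listed in the review above), the sketch below loops once over positions and once over the elements themselves.

```r
givenVector <- c("orange", "red", "gold")

# Loop over positions when the index is needed to store a result.
messages <- character(length(givenVector))
for (i in seq_along(givenVector)) {
  messages[i] <- paste("Color", i, "is", givenVector[i])
}
print(messages)

# Loop over the elements directly when only the values matter.
total_chars <- 0
for (eachElement in givenVector) {
  total_chars <- total_chars + nchar(eachElement) # 6, then 3, then 4
}
total_chars
```

Creating the result vector before the loop and filling it by position avoids growing a vector one element at a time.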

Discussion 5. While Loops & Strings

Review

while Loop
thisValue <- TRUE
while (thisValue == TRUE) {
  performTheseSteps <- FALSE
  thisValue <- performTheseSteps
  # Update thisValue so that it eventually becomes FALSE, or insert a break statement that will kick in at a certain point. Otherwise, the loop continues forever.
}
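A concrete version of the while template may help. The sketch below uses a made-up task (doubling a number until it exceeds 100; the names value and steps are illustrative) and includes a break statement as a safety stop, as the comment in the template suggests.

```r
# Double a value until it exceeds 100, counting the doublings.
value <- 1
steps <- 0
while (value <= 100) {
  value <- value * 2 # this update is what eventually makes the condition FALSE
  steps <- steps + 1
  if (steps >= 1000) {
    break # safety stop in case the condition never becomes FALSE
  }
}
value
steps
```

The loop ends the first time the condition value <= 100 is checked and found to be FALSE, so value finishes above 100 rather than at it.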

Discussion 6. Functions & Histograms

Review

Functions
functionName <- function(arguments = defaults) {
  # steps to be executed
  return(vector to be returned)
}

Probability Distributions
?distributions # A list of distributions built into R
# Binomial Distribution example
dbinom # probability mass function
pbinom # cumulative distribution function
qbinom # quantile function
rbinom # random number generator
# Normal Distribution example
dnorm # probability density function
pnorm # cumulative distribution function
qnorm # quantile function
rnorm # random number generator
# Uniform Distribution example
dunif # probability density function
punif # cumulative distribution function
qunif # quantile function
runif # random number generator
# t Distribution example
dt # probability density function
pt # cumulative distribution function
qt # quantile function
rt # random number generator
# Chi-Square Distribution example
dchisq # probability density function
pchisq # cumulative distribution function
qchisq # quantile function
rchisq # random number generator
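To make the function template and the d/p/q/r naming pattern concrete, here is a small sketch. The function square.and.shift and all variable names are hypothetical and chosen only for illustration; the distribution calls are the base R functions listed above.

```r
# A hypothetical function with one required argument and one default.
square.and.shift <- function(x, shift = 0) {
  result <- x^2 + shift
  return(result)
}
default.call <- square.and.shift(3)            # shift falls back to 0
shifted.call <- square.and.shift(3, shift = 1) # default overridden

# Binomial with 4 trials and success probability 0.5.
p.exactly.2 <- dbinom(2, size = 4, prob = 0.5) # P(X = 2) = 6/16
p.at.most.2 <- pbinom(2, size = 4, prob = 0.5) # P(X <= 2) = 11/16

# Standard normal: the quantile function undoes the cumulative function.
area.below <- pnorm(1.5)
back.again <- qnorm(area.below) # recovers 1.5 up to rounding

# Five random draws from a standard normal; values change on every run.
random.draws <- rnorm(5)
```

The same prefix pattern applies to every distribution in the list: d for the density or mass function, p for cumulative probability, q for quantiles, and r for random values.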

Matrices
x <- 1:20
xmat <- matrix(x, nrow = 4, ncol = 5, byrow = TRUE) # TRUE fills across rows first, FALSE fills down columns first
xmat
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10
[3,]   11   12   13   14   15
[4,]   16   17   18   19   20

Filter Matrices
# Index vector is row first, column second
# Single Element
xmat[2, 3]
[1] 8
# Single Row
xmat[2, ]
[1]  6  7  8  9 10
# Single Column
xmat[, 3]
[1]  3  8 13 18
# SubMatrix
xmat[3:4, 2:5]
     [,1] [,2] [,3] [,4]
[1,]   12   13   14   15
[2,]   17   18   19   20

Matrix Operations
y <- 21:40
ymat <- matrix(y, nrow = 4, ncol = 5, byrow = TRUE)
ymat
     [,1] [,2] [,3] [,4] [,5]
[1,]   21   22   23   24   25
[2,]   26   27   28   29   30
[3,]   31   32   33   34   35
[4,]   36   37   38   39   40
# Matrix Addition
xmat + ymat
     [,1] [,2] [,3] [,4] [,5]
[1,]   22   24   26   28   30
[2,]   32   34   36   38   40
[3,]   42   44   46   48   50
[4,]   52   54   56   58   60
# Transpose
t(ymat)
     [,1] [,2] [,3] [,4]
[1,]   21   26   31   36
[2,]   22   27   32   37
[3,]   23   28   33   38
[4,]   24   29   34   39
[5,]   25   30   35   40
# Matrix Multiplication
xmat %*% t(ymat)
     [,1] [,2] [,3] [,4]
[1,]  355  430  505  580
[2,]  930 1130 1330 1530
[3,] 1505 1830 2155 2480
[4,] 2080 2530 2980 3430

Histogram
# 1000 random values used to make a histogram. In real life these would be data values.
x <- runif(1000)
hist(x)
hist(x, main = "TITLE", xlab = "XXX", ylab = "yyy", xlim = c(0.2, 0.8), breaks = 20)

Multiple Plots
par(mfrow = c(1, 3))
hist(x)
hist(x)
hist(x)

Examples

1. Create a function called add.m.up. add.m.up will add together the elements of a vector. add.m.up should take a single argument x with a default value of 0.

2. Create a function called simulate.rolling.a.die.matrix. simulate.rolling.a.die.matrix simulates rolling a fair, many-sided die many times. The sides on the die will be equally likely. This function will:
   a. have three arguments: sides.on.die, number.of.batches, and rolls.per.batch.
      • sides.on.die represents the number of sides on the simulated die.
      • The simulated die will be rolled repeatedly in batches.
      • Each batch will consist of rolls.per.batch number of outcomes.
      • There will be number.of.batches batches.
      • Example: Six batches of five rolls apiece.
   b. produce a matrix with number.of.batches x rolls.per.batch values.
      • Each row will represent a batch of rolls.
      • Each row will hold the results of rolling the simulated die rolls.per.batch number of times.
      • There will be number.of.batches rows. One row for each batch.
      • Since each row represents a batch, it needs rolls.per.batch elements to record all the rolls.
      • Therefore, the matrix will have rolls.per.batch columns.

3. Create a function called row.totals. row.totals is a function that takes each row of a matrix, sums the elements, and returns a vector with these sums. The function takes 2 arguments: mat and number.of.rows. mat is the matrix whose rows need to be added together. number.of.rows is the number of rows in mat.
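One possible way to write the three example functions is sketched below. This is not necessarily the version worked in discussion: the internal names (total, value, rolls, totals, i) and the choice to build the matrix from a single call to sample() are assumptions, and seq_len() is a base R function not listed in the review above.

```r
# Example 1: add the elements of a vector with a for loop.
add.m.up <- function(x = 0) {
  total <- 0
  for (value in x) {
    total <- total + value
  }
  return(total)
}

# Example 2: one row per batch, one column per roll within a batch.
simulate.rolling.a.die.matrix <- function(sides.on.die, number.of.batches,
                                          rolls.per.batch) {
  rolls <- sample(1:sides.on.die,
                  size = number.of.batches * rolls.per.batch,
                  replace = TRUE) # equal probabilities by default
  return(matrix(rolls, nrow = number.of.batches, ncol = rolls.per.batch,
                byrow = TRUE))
}

# Example 3: apply add.m.up to each row and collect the sums.
row.totals <- function(mat, number.of.rows) {
  totals <- rep(0, times = number.of.rows)
  for (i in seq_len(number.of.rows)) {
    totals[i] <- add.m.up(mat[i, ])
  }
  return(totals)
}

# Usage: six batches of five rolls of a six-sided die.
batches <- simulate.rolling.a.die.matrix(sides.on.die = 6,
                                         number.of.batches = 6,
                                         rolls.per.batch = 5)
batch.sums <- row.totals(batches, number.of.rows = 6)
```

Because the rolls are random, batches and batch.sums change on every run; calling set.seed() first would make a run reproducible.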

Problems

The first three problems are modifications of the examples.

4. Create a function called average.it. It should take one argument x. x will be a numeric vector. The function should compute the average of the values in x. The function should return the average.
   – Your function should be able to account for the fact that not all vectors will have the same length.
   – You should take advantage of the function add.m.up created in the examples. Use add.m.up in your code, do not recreate it within your code.
   – Set x to default to the zero vector.

5. Create a function called sim.uniform.samples. It should take two arguments: number.of.batches and selected.per.batch. The function should return a matrix with number.of.batches rows and selected.per.batch columns.
   – Each row should be filled with selected.per.batch values randomly generated from a Uniform distribution with minimum value zero and maximum value ten.
   – Set the default values for number.of.batches and selected.per.batch to 1.

6. Create a function called row.averages. The function takes each row of a matrix, averages the elements, and returns a vector with these averages. The function takes 2 arguments: mat and number.of.rows.
   – mat is the matrix whose rows need to be averaged.
   – number.of.rows is the number of rows in mat.
   – You should use the function average.it created in the previous problems. Use average.it in your code, do not recreate it within your code.

7. Make 4 histograms. All 4 should be plotted together in a 2 x 2 grid. For comparison and consistency, use the same bins for each. Inside each use of the hist function, add these arguments: breaks = seq(from = 0, to = 10, by = .25) and ylim = c(0, 1000). Use the functions sim.uniform.samples and row.averages to create the values used in these graphs.
   a. The first histogram should be made from 10,000 randomly generated values from a uniform distribution with minimum zero and maximum ten.
   b. The second histogram should be made from 10,000 randomly generated values. Each one of these 10,000 values should be the average of two values that were randomly generated from a uniform distribution with minimum zero and maximum ten.
   c. The third histogram should be made from 10,000 randomly generated values. Each one of these 10,000 values should be the average of five values that were randomly generated from a uniform distribution with minimum zero and maximum ten.
   d. The fourth histogram should be made from 10,000 randomly generated values. Each one of these 10,000 values should be the average of ten values that were randomly generated from a uniform distribution with minimum zero and maximum ten.
