STAT 405 Homework_1

txt

School

University of Pennsylvania

Course

405

Subject

Statistics

Date

Feb 20, 2024

Type

txt

Pages

7

Uploaded by BaronNarwhal3261

#### STAT 4050/7050 Homework 1, Spring 2023
#### Name: SOLUTIONS
#### Instructions:
#### 1. If a question refers to a function not seen in class,
####    use the help facility in R to learn about it.
#### 2. Insert your answers and the R code you used to generate them, beneath each question.
####    Even though for some questions you could find the answer "by inspection", for this homework you need
####    to write code to get the answers, unless it is explicitly stated that no code is required.
#### 3. When you are asked to print an answer, it is NOT ENOUGH to simply write a print statement,
####    for example print(2*3) as an answer, you also need to
####    cut and paste into this document the value(s) actually printed (i.e., 6). On the other hand, pasting
####    long lines of code from the console when you are not asked to do so is unnecessary. Please ask if you have questions on this.
#### 4. When submitting to Canvas, please change the file extension to .txt, otherwise it cannot be submitted.

## ------------------------------------------------------------------------

#### Q1 (35 points)

#a 5pts. Create the following 6x6 matrix in R and call it "mat.6"
#### (4 1 7 6 9 7
####  1 6 0 8 7 2
####  5 7 9 3 3 2
####  7 8 5 1 5 6
####  5 9 9 7 5 0
####  3 8 3 1 3 2)
#### Label the rows as Alpha, Bravo, Charlie, Delta, Echo, and Foxtrot and the columns as A, B, C, D, E, and F. Print the result.

# matrix() fills column by column, so the data vector lists the columns of the target matrix in order
my.matrix <- matrix(
  data = c(4,1,5,7,5,3,1,6,7,8,9,8,7,0,9,5,9,3,6,8,3,1,7,1,9,7,3,5,5,3,7,2,2,6,0,2),
  nrow = 6, ncol = 6)
print(my.matrix)

#assigning names
colnames(my.matrix) <- c("A","B","C","D","E","F")
rownames(my.matrix) <- c("Alpha","Bravo","Charlie","Delta","Echo","Foxtrot")
print(my.matrix)

# > print(my.matrix)
#         A B C D E F
# Alpha   4 1 7 6 9 7
# Bravo   1 6 0 8 7 2
# Charlie 5 7 9 3 3 2
# Delta   7 8 5 1 5 6
# Echo    5 9 9 7 5 0
# Foxtrot 3 8 3 1 3 2

#b 5pts. Extract and print the sub-matrix corresponding to the second column,
# and the third and fourth rows.
my.matrix[c(3,4),c(2)]

# > my.matrix[c(3,4),c(2)]
# Charlie   Delta
#       7       8

#c 5pts. Find the transpose of the matrix entered in part a. Print the solution
# (The transpose swaps the rows and columns)
t(my.matrix)

# > t(my.matrix)
#   Alpha Bravo Charlie Delta Echo Foxtrot
# A     4     1       5     7    5       3
# B     1     6       7     8    9       8
# C     7     0       9     5    9       3
# D     6     8       3     1    7       1
# E     9     7       3     5    5       3
# F     7     2       2     6    0       2

#d 5pts. Find the inverse of the matrix entered in part a. Print the solution.
answer <- solve(my.matrix)
print(answer)

# > answer <- solve(my.matrix)
# > print(answer)
#         Alpha       Bravo    Charlie        Delta        Echo    Foxtrot
# A -0.06958862 -0.12495194 -0.4192618  0.370434448  0.37543253 -0.3235294
# B -0.06382161  0.09534794  0.1845444 -0.058054594 -0.14186851  0.1176471
# C  0.07343329 -0.06151480  0.1551326 -0.156093810 -0.05363322  0.1176471
# D -0.18262207  0.35717032  0.5461361  0.068819685 -0.28546713 -0.4705882
# E  0.30795848 -0.43598616 -0.9567474 -0.081314879  0.57612457  0.5588235
# F -0.12110727  0.37370242  0.8200692 -0.001730104 -0.63667820 -0.2647059

#e 5pts. Multiply the original matrix (on the left) by its transpose (on the right)
# (use matrix multiplication, not element-wise) and store the result.
mymat <- my.matrix%*%t(my.matrix)
print(mymat)

# > mymat <- my.matrix%*%t(my.matrix)
# > print(mymat)
#         Alpha Bravo Charlie Delta Echo Foxtrot
# Alpha     232   135     149   164  179      88
# Bravo     135   154      96   110  150      84
# Charlie   149    96     177   166  205     114
# Delta     164   110     166   200  184     128
# Echo      179   150     205   184  261     136
# Foxtrot    88    84     114   128  136      96

#f 5pts. Find and print the inverse of the matrix created in part e.
solve(mymat)

# > solve(mymat)
#               Alpha       Bravo     Charlie       Delta        Echo    Foxtrot
# Alpha    0.15716444 -0.24665816 -0.46490149 -0.07093544  0.28565138  0.3137368
# Bravo   -0.24665816  0.48579636  0.97909463  0.01716616 -0.64820957 -0.4662347
# Charlie -0.46490149  0.97909463  2.12004699 -0.07627401 -1.42113560 -0.8331298
# Delta   -0.07093544  0.01716616 -0.07627401  0.17630855  0.09028934 -0.2224088
# Echo     0.28565138 -0.64820957 -1.42113560  0.09028934  0.98272291  0.4803582
# Foxtrot  0.31373680 -0.46623471 -0.83312979 -0.22240880  0.48035823  0.7361592

#g 5pts. Find and print the sum of the elements of the leading (top left to bottom right)
# diagonal of the inverse matrix. (The command "diag" is very useful here).
mymat2 <- solve(mymat)
diagonal <- diag(mymat2)
diagonalsum <- sum(diagonal)
diagonalsum

# > diagonalsum
# [1] 4.658198

#### Q2 (30 points)

#a 5pts. What is the key difference between the list extraction operators [ and $?
# The [ operator returns a new list (or vector) containing the selected elements, so the result is still a container,
# while the $ operator returns the single element with the given name, without the enclosing list.

#b Reset the random number seed to 2023.
set.seed(2023)
# Paste the following code into R
nest.list <- list(Omega = c(sample(x=100,size=100,replace=TRUE)),
                  Theta = matrix(data=rnorm(36),ncol=6),
                  Delta = list(GAMMA = matrix(data = sample(x=10,size=64,replace=TRUE),ncol=8),
                               EPSILON = matrix(data = rnorm(64),ncol=8)))
nest.list

# > #b Reset the random number seed to 2023.
# > set.seed(2023)
# > nest.list <- list(Omega = c(sample(x=100,size=100,replace=TRUE)),
# +                   Theta = matrix(data=rnorm(36),ncol=6),
# +                   Delta = list(GAMMA = matrix(data = sample(x=10,size=64,replace=TRUE),ncol=8),
# +                                EPSILON = matrix(data = rnorm(64),ncol=8)))
# > nest.list
# $Omega
#  [1]  80  47  72  26  44  98  65  29  49  81   5  72  63  98   3  79  45   4  96   5  24   9
# [23]  57  98  38  85  41  30  39  95  49  44  48  30  33  70  68  70  70  82  34  99  40  52
# [45]  24  71  32  46  33  44 100  17  32  37  42  83  96  92  90  79  21  76  12  31  28  43
# [67]  73  86  27  60  62  80  84  32  64  67  45  95  62  84  20  81  54  50  10  53  29  43
# [89]  73  70  74  32  38  81  96 100 100  73  57  18
#
# $Theta
#             [,1]       [,2]       [,3]        [,4]       [,5]       [,6]
# [1,]  0.91980757 -0.6899573 -0.9538035  0.30310787  1.4209584 -1.1151520
# [2,]  1.18297818 -0.5350626  0.1315696  0.62199075 -0.7959284  1.4217720
# [3,] -0.06661768 -1.3957565 -0.1048885 -0.07572521 -0.3782428 -0.3747833
# [4,]  0.69121516 -0.5824434 -1.2991418  0.68143284  0.0720771 -1.2167619
# [5,] -1.28738212 -0.7449056 -1.8107273 -0.07796042  1.0514441 -1.6879300
# [6,]  0.89816897 -1.5087501  0.3461719 -0.26435962  1.1745123 -0.8430539
#
# $Delta
# $Delta$GAMMA
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,]    4    9    8    9    4    8    3    7
# [2,]    5   10    9    6    4   10    4    5
# [3,]    2   10    7    3    9    5    6    2
# [4,]    5    3    7    5    4    7    4    6
# [5,]    8    5    2    9    3    1    2    4
# [6,]    3    1    2    1    8    1    4   10
# [7,]    5    4    2    3    1   10    2    1
# [8,]    8    5   10    1    5    3    5    4
#
# $Delta$EPSILON
#             [,1]       [,2]       [,3]        [,4]       [,5]       [,6]        [,7]
# [1,]  0.88918466 -2.1602654  2.0216039  0.04022616  0.7010316  0.6452307 -0.45751085
# [2,]  1.11031334 -1.2160521 -0.1143415  0.77666776 -2.4329114 -0.7723815 -0.21085099
# [3,] -0.01906351 -1.8143246  1.1478031  1.10162700 -2.5201181 -0.5632796 -0.03789005
# [4,]  0.24877479 -0.6173035  0.3101706  1.36710625 -0.1462613  1.2353488 -0.83658016
# [5,]  2.21455431 -0.1723222 -0.7520078  1.00884406  0.1736895  0.7654678 -0.13154933
# [6,] -0.23567090 -1.2736024 -0.4642222 -0.24691557  2.2936313 -1.2080468  0.56069987
# [7,]  1.17311486  0.4453596  0.2489604  0.67429567  0.7743053 -0.8932955 -1.39466333
# [8,]  0.37387865 -0.3418995 -1.2292926 -1.42378106 -1.0738868 -0.6847716 -0.36389287
#              [,8]
# [1,] -0.807941750
# [2,] -0.563753696
# [3,]  0.434264299
# [4,]  0.077642164
# [5,] -0.894238568
# [6,]  1.219824020
# [7,] -0.757064268
# [8,]  0.006967099

# Hint: This creates a nested list. That is, the elements of nest.list are Omega, Theta and Delta
# where Omega is a vector, Theta is a matrix, and Delta is a list itself with 2 elements

#c 5pts. Write code to extract and print the value of the (2,3) element in the Theta matrix
nest.list$Theta[2,3]

# > nest.list$Theta[2,3]
# [1] 0.1315696

#d 5pts. Write code to extract and find the sum of all of the elements in the inverse of the
# matrix product (%*%) of the two matrices GAMMA and EPSILON.
product <- nest.list$Delta$GAMMA%*%nest.list$Delta$EPSILON
product2 <- solve(product)
sum <- sum(product2)
sum

# > sum
# [1] 0.9962438

#e 10pts. Write one line of code to check whether the answer obtained in d is greater than the reciprocal of the sum of the
# elements of the vector (Omega). This line should return a TRUE or FALSE
sum > (1/sum(nest.list$Omega))
# > sum > (1/sum(nest.list$Omega))
# [1] TRUE

#f 5pts. Rename the element currently named "Theta", as "Beta", and paste both the command you used
# and the output (i.e. print the result) from the names() command to show that you did it successfully.
# Hint: ?names
names(nest.list)
names(nest.list)[names(nest.list) == "Theta"] <- "Beta"
names(nest.list)

# > names(nest.list)
# [1] "Omega" "Theta" "Delta"
# > names(nest.list)[names(nest.list) == "Theta"] <- "Beta"
# > names(nest.list)
# [1] "Omega" "Beta"  "Delta"

#### Q3 (15 points)

#a 5pts. Set a random number seed to 2023. Then create a 12000 x 600 standard normal matrix (mean = 0, sd = 1).
# (No need to print the matrix)
set.seed(2023)
x <- matrix(rnorm(7200000),ncol=600, nrow=12000)

#b 5pts. Using the apply function, store in a vector the mean of each row of the matrix. (No need to print the vector)
vector1 <- apply(x,1,mean)

#c 5pts. Find and print the standard deviation (R command "sd") of these 12000 row means.
# What you have just done is empirically estimate the standard error of the sample mean,
# when the sample mean is based on a sample of size n = 600.
sd(vector1)

# > sd(vector1)
# [1] 0.0404957

#### Q4 (20 points)

# A teacher wants to know whether students prefer online or in-person learning. Students answered on a 1 through 5 numeric
# scale, where:
#### 1: Strongly prefer in-person
#### 2: Slightly prefer in-person
#### 3: No preference
#### 4: Slightly prefer online
#### 5: Strongly prefer online
# The teacher wants to recode this numeric variable into a two-level ordered factor, with levels: “In-person preferred” if they
# had answered with a 1, 2 or 3, and “Online preferred” if they had answered with a 4 or 5.
# Here’s the raw data:
raw.scores <- c(4, 2, 3, 5, 1, 2, 1, 3, 2, 3, 3, 1, 5, 1, 1, 1, 1, 4, 5, 1,
                1, 2, 2, 2, 1, 3, 4, 1, 4, 5)

#a 5pts. Write code to create and store a logical vector that takes on the value TRUE if the raw score was either a 4 or a 5 and
# FALSE otherwise.
online <- ((raw.scores == 4) | (raw.scores == 5))
print(online)

#  [1]  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
# [16] FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE

#b 5pts. Using the logical vector from part a, replace the values 1, 2 and 3 in raw.scores with the value 1 and the
# values 4 and 5 with the value 5. You can do this in multiple steps if you want. Call this new numeric vector “mod_scores”. At this
# point mod_scores should only contain the values 1 and 5.
mod_scores <- raw.scores
# index with the logical vector "online" stored in part a
mod_scores[!online] <- 1
mod_scores[online] <- 5
mod_scores

# > mod_scores
#  [1] 5 1 1 5 1 1 1 1 1 1 1 1 5 1 1 1 1 5 5 1 1 1 1 1 1 1 5 1 5 5

#c 5pts. Create and store from “mod_scores” an ordered factor variable from the modified data with the labels “In-person preferred”,
# and “Online preferred”. “In-person preferred” should be the lower of the two levels in the ordered factor.
mod_factor <- as.factor(mod_scores)
levels(mod_factor) <- c("In-person preferred", "Online preferred")
mod_factor <- ordered(mod_factor, levels = c("In-person preferred", "Online preferred"))
mod_factor

# > mod_factor
#  [1] Online preferred    In-person preferred In-person preferred Online preferred
#  [5] In-person preferred In-person preferred In-person preferred In-person preferred
#  [9] In-person preferred In-person preferred In-person preferred In-person preferred
# [13] Online preferred    In-person preferred In-person preferred In-person preferred
# [17] In-person preferred Online preferred    Online preferred    In-person preferred
# [21] In-person preferred In-person preferred In-person preferred In-person preferred
# [25] In-person preferred In-person preferred Online preferred    In-person preferred
# [29] Online preferred    Online preferred
# Levels: In-person preferred < Online preferred

#d 5pts. Write code to identify and print which of the two new levels occurred most frequently and how often it occurred.
# Hint: you may find it useful to first summarize the new data with the “table” command
table(mod_factor)

# In-person preferred    Online preferred
#                  22                   8

sort(table(mod_factor), decreasing = TRUE)[1]
# In-person preferred
#                  22
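
# Supplementary sketch for Q1 (not part of the original solutions). It rebuilds the same matrix row by row
# with byrow = TRUE, checks that solve() really returns an inverse, and shows how drop = FALSE keeps the
# part b result as a matrix instead of a named vector. The names mat.check, inv.check, is.identity and
# sub.mat are hypothetical and chosen only for this illustration.

```r
# Enter the data in reading order; byrow = TRUE tells matrix() to fill rows first
mat.check <- matrix(
  c(4, 1, 7, 6, 9, 7,
    1, 6, 0, 8, 7, 2,
    5, 7, 9, 3, 3, 2,
    7, 8, 5, 1, 5, 6,
    5, 9, 9, 7, 5, 0,
    3, 8, 3, 1, 3, 2),
  nrow = 6, byrow = TRUE,
  dimnames = list(c("Alpha", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot"),
                  c("A", "B", "C", "D", "E", "F")))

# A matrix times its inverse should be the identity, up to floating-point error
inv.check <- solve(mat.check)
is.identity <- isTRUE(all.equal(mat.check %*% inv.check, diag(6),
                                check.attributes = FALSE))
print(is.identity)

# drop = FALSE keeps the 2 x 1 matrix shape when a single column is selected
sub.mat <- mat.check[c(3, 4), 2, drop = FALSE]
print(dim(sub.mat))  # 2 1
```

# Without drop = FALSE, selecting a single column simplifies the result to a named vector,
# which is what the part b output above shows.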
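
# Supplementary sketch for Q2a (not part of the original solutions). It illustrates the difference between
# [ and $ on a small list. demo.list and the other variable names are hypothetical; the [[ operator is
# included only for comparison.

```r
# A small stand-in list; the element names mirror nest.list but the values are made up
demo.list <- list(Omega = c(10, 20, 30), Theta = matrix(1:4, ncol = 2))

single.bracket <- demo.list["Omega"]    # still a list, of length 1
dollar         <- demo.list$Omega       # the numeric vector itself
double.bracket <- demo.list[["Omega"]]  # same element as $, selected by name

print(class(single.bracket))              # "list"
print(class(dollar))                      # "numeric"
print(identical(dollar, double.bracket))  # TRUE
```

# One practical consequence: sum(demo.list$Omega) works, while sum(demo.list["Omega"]) fails
# because sum() does not accept a list.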
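
# Supplementary sketch for Q3 (not part of the original solutions). For independent standard normal draws,
# the standard error of the mean of n observations is sd/sqrt(n) = 1/sqrt(600), roughly 0.0408, which is
# close to the simulated value 0.0404957 reported in Q3c. The sketch repeats the simulation on a smaller,
# assumed matrix of 2000 rows so that it runs quickly; n, theoretical.se, small.x and empirical.se are
# hypothetical names.

```r
n <- 600

# Theoretical standard error of the sample mean for N(0, 1) data
theoretical.se <- 1 / sqrt(n)
print(theoretical.se)

# Smaller version of the Q3 simulation (2000 rows is an assumption made for speed)
set.seed(2023)
small.x <- matrix(rnorm(2000 * n), nrow = 2000, ncol = n)

# rowMeans(x) computes the same row means as apply(x, 1, mean)
empirical.se <- sd(rowMeans(small.x))
print(empirical.se)

# Relative gap between the simulated and theoretical values
print(abs(empirical.se - theoretical.se) / theoretical.se)
```

# With more rows the simulated value should settle closer to 1/sqrt(n); the 12000 rows used in Q3
# give a tighter estimate than this reduced example.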
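
# Supplementary sketch for Q4 (not part of the original solutions, and not the stepwise approach the
# question asks for). It builds the ordered factor in one step with ifelse() and factor(), then picks the
# most frequent level with which.max(). The names pref, counts and top are hypothetical.

```r
raw.scores <- c(4, 2, 3, 5, 1, 2, 1, 3, 2, 3, 3, 1, 5, 1, 1, 1, 1, 4, 5, 1,
                1, 2, 2, 2, 1, 3, 4, 1, 4, 5)

# Scores are whole numbers from 1 to 5, so >= 4 selects exactly the 4s and 5s
pref <- factor(ifelse(raw.scores >= 4, "Online preferred", "In-person preferred"),
               levels = c("In-person preferred", "Online preferred"),
               ordered = TRUE)

# table() counts each level; which.max() returns the position of the largest count
counts <- table(pref)
top <- counts[which.max(counts)]
print(top)
```

# Listing the levels explicitly fixes their order, so "In-person preferred" is the lower level
# regardless of alphabetical sorting.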