531 HW 1 solution

School

Cleveland State University

Course

531

Subject

Statistics

Date

Feb 20, 2024

STA 431/531 Homework 1
Due: Thursday 9/7
Name:

Directions: You may work with others on the homework, but you must write and turn in your own copy. This does not mean that you can simply copy someone else's work! Come and see me if you have any questions. Show your work and carry your answers to 3 decimal places. Write your answers on separate sheets of paper and clearly label your problems. Submit your R code along with the R output.

1. Given the following data set:

      X      Y
   44.0   13.2
   41.8   17.7
   42.6   16.4
   33.2    9.7
   56.2   13.8
   53.7   18.8
   34.0   19.6
   34.3   11.3
   39.0   21.8
   58.8   15.7
   47.3   11.1
   30.0   12.6
   47.8   12.0
   52.6   14.1
   29.1   10.7

Use R to calculate the following quantities WITHOUT using R's built-in functions to calculate them directly. That is, you may not use R functions such as "mean", "sd", "cor", or "lm", which would calculate these directly (except to check your answers). You may use other R functions such as "sum" or "length" in your program.

(a) Calculate the (sample) mean of Y.

> x <- c(44.0, 41.8, 42.6, 33.2, 56.2, 53.7, 34.0, 34.3, 39.0, 58.8, 47.3, 30.0, 47.8, 52.6, 29.1)
> y <- c(13.2, 17.7, 16.4, 9.7, 13.8, 18.8, 19.6, 11.3, 21.8, 15.7, 11.1, 12.6, 12.0, 14.1, 10.7)
> mean.y <- sum(y)/length(y)
> mean.y
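[1] 14.56667

(b) Calculate the (sample) standard deviation of Y.

> var.y <- sum((y - mean.y)^2)/(length(y) - 1)
> sd.y <- sqrt(var.y)
> sd.y
[1] 3.630165

(c) Calculate the correlation coefficient between X and Y.

The sample correlation is the sample covariance divided by the product of the two sample standard deviations, so the variance of X must be computed from x, and the cross-product sum must be divided by n - 1.

> mean.x <- sum(x)/length(x)
> var.x <- sum((x - mean.x)^2)/(length(x) - 1)
> sd.x <- sqrt(var.x)
> cov.xy <- sum((x - mean.x)*(y - mean.y))/(length(x) - 1)
> rho <- cov.xy/(sd.x*sd.y)
> rho

This evaluates to approximately 0.197.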
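
The problem statement allows the built-in functions for checking answers. A minimal sketch of such a check follows, assuming the same x and y vectors as above. The helper names my.mean, my.sd, and my.cor are hypothetical and were introduced only for this illustration.

```r
# Data from Problem 1
x <- c(44.0, 41.8, 42.6, 33.2, 56.2, 53.7, 34.0, 34.3, 39.0, 58.8,
       47.3, 30.0, 47.8, 52.6, 29.1)
y <- c(13.2, 17.7, 16.4, 9.7, 13.8, 18.8, 19.6, 11.3, 21.8, 15.7,
       11.1, 12.6, 12.0, 14.1, 10.7)

# Hand-written statistics using only sum(), length(), and sqrt()
my.mean <- function(v) sum(v) / length(v)
my.sd   <- function(v) sqrt(sum((v - my.mean(v))^2) / (length(v) - 1))
my.cor  <- function(u, v) {
  # sample covariance divided by the product of the sample standard deviations
  cov.uv <- sum((u - my.mean(u)) * (v - my.mean(v))) / (length(u) - 1)
  cov.uv / (my.sd(u) * my.sd(v))
}

# Compare against the built-ins (used here for checking only)
all.equal(my.mean(y), mean(y))
all.equal(my.sd(y), sd(y))
all.equal(my.cor(x, y), cor(x, y))
```

Each all.equal() call should return TRUE when the hand-written formula agrees with the built-in up to floating-point tolerance.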

2. Look at the file called tennis.txt (consisting of data for a number of professional tennis players) on Blackboard. After saving this file into an appropriate directory, write an R program that will do the following:

(a) Read in the data and create an R data frame named tennis.dfr that has the following names for its columns: first.name, last.name, major.match.wins, major.match.losses, overall.match.wins, overall.match.losses, major.titles, overall.titles. (Note that the data file has several explanatory lines before the real data begin; these should be skipped when reading in the data lines.)

> tennis.dfr <- read.table("tennis.txt", header=F, skip=7)
> names(tennis.dfr) <- c("first.name", "last.name", "major.match.wins", "major.match.losses", "overall.match.wins", "overall.match.losses", "major.titles", "overall.titles")

(b) Create and add two more columns called major.winning.pct and overall.winning.pct (showing winning percentage in the "major" and "overall" categories, respectively) to this data frame. Note that "winning percentage" is defined as (match wins)/(match wins + match losses).

> tennis.dfr$major.winning.pct <- tennis.dfr$major.match.wins/(tennis.dfr$major.match.wins + tennis.dfr$major.match.losses)
> tennis.dfr$overall.winning.pct <- tennis.dfr$overall.match.wins/(tennis.dfr$overall.match.wins + tennis.dfr$overall.match.losses)

(c) Sort the data frame by major titles, from most to least. Have your program print the sorted data frame.

> ord <- order(tennis.dfr$major.titles, decreasing=T)
> tennis.dfr[ord,]

(d) Perform a nested sort, sorting the data frame first by major titles (from most to least), and then by major winning percentage (from most to least) within major-title levels. Have your program print this sorted data frame.

> ord <- order(tennis.dfr$major.titles, tennis.dfr$major.winning.pct, decreasing=T)
> tennis.dfr[ord,]

(e) Have R extract the subset of the data frame consisting of players with at least 6 major titles. Call this new data frame greatest.dfr. Have your program print this new data frame.

> index <- which(tennis.dfr$major.titles >= 6)
> greatest.dfr <- tennis.dfr[index,]
> greatest.dfr

(f) In the most efficient way possible, have R calculate the sample means for each of the numeric variables in the tennis.dfr data set. (Hint: Extract the appropriate subset of the data frame first.)

> tennis.dfr2 <- tennis.dfr[,c("major.match.wins", "major.match.losses", "overall.match.wins", "overall.match.losses", "major.titles", "overall.titles", "major.winning.pct", "overall.winning.pct")]
> apply(tennis.dfr2, 2, mean)
    major.match.wins   major.match.losses   overall.match.wins overall.match.losses
         159.9666667           41.7000000          700.9000000          225.8333333
        major.titles       overall.titles    major.winning.pct  overall.winning.pct
           6.3666667           50.7000000            0.7791792            0.7511338

3. Use the write.table() function to write the data set tennis.dfr to an external file. Make sure the external file includes the column names. Also, make sure the players' names are NOT surrounded by quotes in the external file.

> write.table(tennis.dfr, "tennis2.txt", quote=FALSE, col.names=TRUE, row.names=F)
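
The sorting, column-mean, and file-writing steps in Problems 2 and 3 can be tried without tennis.txt. The following is a minimal sketch, assuming a made-up four-row data frame: toy.dfr, its player names, and its numbers are hypothetical and are not taken from the real file. It uses colMeans() as an alternative to apply(..., 2, mean), and negated keys in order() as an alternative to decreasing=T.

```r
# Toy data frame standing in for tennis.dfr (all values are made up)
toy.dfr <- data.frame(
  last.name          = c("Able", "Baker", "Cole", "Dunn"),
  major.match.wins   = c(90, 120, 60, 100),
  major.match.losses = c(30, 30, 40, 25),
  major.titles       = c(6, 8, 2, 6),
  stringsAsFactors   = FALSE
)
toy.dfr$major.winning.pct <- toy.dfr$major.match.wins /
  (toy.dfr$major.match.wins + toy.dfr$major.match.losses)

# Nested sort: negate each numeric key so both sort from most to least
ord <- order(-toy.dfr$major.titles, -toy.dfr$major.winning.pct)
sorted.dfr <- toy.dfr[ord, ]

# Means of every numeric column, with the columns selected programmatically
num.cols <- sapply(toy.dfr, is.numeric)
col.means <- colMeans(toy.dfr[, num.cols])

# Write without quotes or row names, then read back to confirm the round trip
out.file <- tempfile(fileext = ".txt")
write.table(toy.dfr, out.file, quote = FALSE, col.names = TRUE, row.names = FALSE)
back.dfr <- read.table(out.file, header = TRUE)
```

Selecting the numeric columns with sapply(..., is.numeric) avoids typing the column names by hand, which is one way to read the hint in part (f).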

4. The Fibonacci numbers are the sequence of numbers defined by the linear recurrence equation $F_n = F_{n-1} + F_{n-2}$, where $F_1 = F_2 = 1$ and by convention $F_0 = 0$. For example, the first 8 Fibonacci numbers are 1, 1, 2, 3, 5, 8, 13, 21.

(a) For a given n, compute the nth Fibonacci number using a for loop.

> Fibonacci1 <- function(n){
+   r <- rep(NA, n)
+   for(i in 1:n){
+     if(i==1 || i==2){
+       r[i] <- 1
+     }
+     else{
+       r[i] <- r[i-1] + r[i-2]
+     }
+   }
+   return(r[n])
+ }
> Fibonacci1(15)
[1] 610

(b) For a given n, compute the nth Fibonacci number using a while loop.

> Fibonacci2 <- function(n){
+   r <- rep(NA, n)
+   counter <- 1
+   while(counter <= n){
+     if(counter==1 || counter==2){r[counter] <- 1}
+     else{r[counter] <- r[counter-1] + r[counter-2]}
+     counter <- counter + 1
+   }
+   return(r[n])
+ }
> Fibonacci2(15)
[1] 610

You can also calculate Fibonacci numbers recursively.

> Fibonacci3 <- function(n){
+   r <- NULL
+   if(n==1 || n==2){r <- 1}
+   else{r <- Fibonacci3(n-1) + Fibonacci3(n-2)}
+   r
+ }
> Fibonacci3(15)
[1] 610

(c) Print the 15th Fibonacci number obtained from each piece of code written above. Hint: You can create a function taking n as its argument. Alternatively, write the code for n=15.
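
The recursive version recomputes the same smaller Fibonacci numbers many times, and the loop versions store the whole sequence even though only the last two values are needed at each step. As a minimal sketch of a variant that keeps just those two values (fib.iter is a hypothetical name, not part of the assignment):

```r
# Iterative Fibonacci that stores only the two most recent values
fib.iter <- function(n) {
  stopifnot(n >= 1)
  prev <- 0   # F_0
  curr <- 1   # F_1
  # Each pass advances the pair (F_{i-1}, F_i) to (F_i, F_{i+1})
  for (i in seq_len(n - 1)) {
    nxt  <- prev + curr
    prev <- curr
    curr <- nxt
  }
  curr
}

fib.iter(15)           # 610
sapply(1:8, fib.iter)  # 1 1 2 3 5 8 13 21
```

Using seq_len(n - 1) rather than 1:(n - 1) means the loop body is skipped entirely when n is 1, so no special case is needed for the first term.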