Lab A quiz notes

School

Curtin University

Course

307529

Subject

Statistics

Date

Feb 20, 2024

4.1 Simple Commands – Checkpoint Exercises: Use the console to calculate the following:

1. $\left(\tfrac{2}{3} \times 8 - 1\right)^{2/3}$
    [1] 2.657958
2. $\log_2(4096)$, using the log2() function
    [1] 12
3. $e^{2 + \cos(\pi/2)}$, using exp(2+cos(pi/2))
    [1] 7.389056
4. $\left(2.3^2 + 5.4^2 - 2 \times 2.3 \times 4.5 \times \cos(\pi/8)\right)^{1/2}$, using sqrt(2.3^2+5.4^2-2*2.3*4.5*cos(pi/8))
    [1] 3.914804

A basic concept in programming is the variable. A variable lets you store a value (e.g. 4) or an object (e.g. a function description) in R, giving our programs named storage that they can manipulate. A valid variable name (one that R will accept) consists of letters, numbers, and the dot or underscore characters. The name must start with a letter, or with a dot that is not followed by a number.

    Variable Name   Validity   Reason
    var_name2.      valid      Has letters, numbers, dot and underscore.
    var_name%       invalid    Has the character '%'. Only dot (.) and underscore (_) are allowed.
    2var_name       invalid    Starts with a number.
    .var_name       valid      Can start with a dot (.), but the dot must not be followed by a number.
    var.name        valid      Starts with a letter, and the dot (.) is not followed by a number.
    .2var_name      invalid    The starting dot is followed by a number.
    _var_name       invalid    Starts with an underscore (_), which is not allowed.

Variable names are case-sensitive, so Name, name, and NAME can all be different variables (although R allows this, for clarity in your code it is better not to use all three).

The assignment operator is <-, typed as two separate keystrokes (< and -); in RStudio you can instead press the Alt key and the minus key together. If you want to write two or more R commands on the same line of the console, separate them with a semicolon (;). Typing the name of a variable and then pressing Enter displays the contents of the variable. For example:

    > A <- 15; B <- 3
    > C <- A/B
    > C
    [1] 5
    > print(C)
    [1] 5

To store more than one element, one way is to use the c() function, which combines the elements into a vector.
    > apple <- c('red','green',"yellow")
    > print(apple)
    [1] "red"    "green"  "yellow"
    > x <- c(0,pi/4,pi/2,3*pi/4,pi,5*pi/4,3*pi/2,7*pi/4,2*pi)
    > x
    [1] 0.0000000 0.7853982 1.5707963 2.3561945 3.1415927
    [6] 3.9269908 4.7123890 5.4977871 6.2831853

We saw above that functions can take scalar arguments, but they can also take vector arguments, in which case the function operates on each element of the vector, as follows:

    > y <- sin(x)
    > y
    [1]  0.000000e+00  7.071068e-01  1.000000e+00  7.071068e-01
    [5]  1.224606e-16 -7.071068e-01 -1.000000e+00 -7.071068e-01
    [9] -2.449213e-16

Another way of creating a vector is to use the operator : (a colon), which creates a sequence of values:

    > z <- 1:10
    > z
    [1]  1  2  3  4  5  6  7  8  9 10

We can also select a subset of the elements of a vector by using the indexing operator []. So, if we wanted to select the first three elements of x above, we could do this in the following two ways:

    > x[1:3]
    [1] 0.0000000 0.7853982 1.5707963
    > x[c(1,2,3)]
    [1] 0.0000000 0.7853982 1.5707963

The second form can be used to select any subset of elements in any order.

    > BMI <- data.frame(
        gender = c("Male", "Male", "Female"),
        height = c(152, 171.5, 165),
        weight = c(81, 93, 78),
        Age = c(42, 38, 26)
      )
    > print(BMI)

    > plot(x,y)
    > plot(x,y,type="b")

We can also add a title, some colour and informative labels:

    > plot(x,y, type="b", xlab = "time(hours)", ylab = "scaled temperature difference", main="Temperature anomaly", col="blue")
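As a small follow-up to the indexing discussion above, the sketch below shows three ways of picking elements from the same x: positions in an arbitrary order, a negative index that drops an element, and a logical condition. The names picked, dropped and pos, and the tolerance 1e-8, are illustrative assumptions and do not come from the lab sheet.

```r
# Same vectors as in the notes above
x <- c(0, pi/4, pi/2, 3*pi/4, pi, 5*pi/4, 3*pi/2, 7*pi/4, 2*pi)
y <- sin(x)

# Positions can be listed in any order (here: last, first, fifth)
picked <- x[c(9, 1, 5)]

# A negative index drops that position instead of selecting it
dropped <- x[-1]

# A logical vector keeps the elements where the condition is TRUE.
# The small tolerance (an assumption) treats sin(pi) ~ 1.2e-16 as zero.
pos <- x[y > 1e-8]

print(picked)
print(length(dropped))
print(pos)
```

The logical form is often the most convenient when the subset depends on the data rather than on known positions.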
Next, we will do a simple plot using the dummy data set "mtcars", available in the R environment, to create a basic scatterplot. Let's use the columns "wt" and "mpg" in mtcars (wt being weight, and mpg being miles per gallon). Ensure you understand what this line of code is actually doing; if not, ask your tutor.

    > input <- mtcars[c('wt','mpg')]
    > plot(x = input$wt,y = input$mpg)

Construct a plot of wt against mpg for cars with weight between 2.5 and 5 and mileage between 15 and 30, and label it appropriately. The title of the plot should be "Weight vs Mileage".

    > plot(x = input$wt,y = input$mpg, xlab="Weight", ylab="Mileage", main="Weight vs Mileage", xlim=c(2.5,5), ylim=c(15,30))

http://www1.appstate.edu/~arnholta/PASWR/CD/data/Bodyfat

    > site <- "http://www1.appstate.edu/~arnholta/PASWR/CD/data/Bodyfat"
    > FAT <- read.table(file=site, header=TRUE, sep="\t")
    > head(FAT)
      age  fat sex
    1  23  9.5   M
    2  23 27.9   F
    3  27  7.8   M
    4  27 17.8   M
    5  39 31.4   F
    6  41 25.9   F
    > write.table(FAT,file="FAT.txt")
    > write.table(FAT,file="FAT.txt", sep="\t")

Lab 2

    > taxis <- c(34400,45500,36700,32000,48400,32800,38100,30100)
    > mean(taxis)
    [1] 37250
    > min(taxis)
    [1] 30100
    > max(taxis)
    [1] 48400
    > median(taxis)
    [1] 35550
    > var(taxis)
    [1] 42860000
    > sd(taxis)
    [1] 6546.755
    > summary(taxis)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      30100   32600   35550   37250   39950   48400
    > fivenum(taxis)
    [1] 30100 32400 35550 41800 48400

Enter the following data into R and call it mydata: 54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22.
Now type stem(mydata):

    > mydata <- c(54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22)
    > stem(mydata)

      The decimal point is 1 digit(s) to the right of the |

      2 | 25
      3 | 45
      4 | 1166679
      5 | 449
      6 | 0

    > library(MASS)   # Cars93 is provided by the MASS package
    > boxplot(Cars93$Min.Price)
    > boxplot(Cars93$Min.Price, col="orange", horizontal=TRUE)
    > boxplot(Cars93$MPG.city~Cars93$Type)

    > concrete <- read.csv(file.choose())
    > dim(concrete)
    > fivenum(concrete$Concrete_compressive_strength)
    > boxplot(concrete$Concrete_compressive_strength, main="Compressive Strength of Concrete", ylab="MPa", col="yellow")

1. Simulate tossing a coin 1000 times. Are the results what you would expect?

    > rbinom(1,1000,0.5)
    [1] 503

2. Suppose that n1 items are to be inspected from one production line and n2 items are to be inspected from another production line. Let p1 = the probability of a defective from line 1 and p2 = the probability of a defective from line 2. Let X be a binomial random variable with parameters n1 and p1. Let Y be a binomial random variable with parameters n2 and p2. A variable of interest is W, the total number of defective items observed in both production lines, so W = X + Y. Use simulation to see how the distribution of W behaves. Useful information can be obtained by looking at the histogram of the generated values of W and by considering the sample mean and the sample variance. In your simulation use the following random variables X and Y: X is binomial with n1=7, p1=0.2; and Y is binomial with n2=8, p2=0.6.

    X <- rbinom(1000,7,0.2)
    Y <- rbinom(1000,8,0.6)
    mean(X)
    var(X)
    mean(Y)
    var(Y)
    mean(X+Y)
    var(X+Y)
    hist(X,freq = FALSE)
    hist(Y,freq = FALSE)
    hist(X+Y,freq = FALSE)

The output is as follows:

    > mean(X)
    [1] 1.425
    > var(X)
    [1] 1.129505
    > mean(Y)
    [1] 4.862
    > var(Y)
    [1] 1.952909
    > mean(X+Y)
    [1] 6.287
    > var(X+Y)
    [1] 3.111743

Let X be a normally distributed random variable with $\mu = 20$ and $\sigma = 5$. Calculate the following using R:
(i) $P(X < 15)$
(ii) $P(14 < X < 23)$
(iii) Find k such that $P(X < k) = 0.9345$.
(i)
    > pnorm(15,20,5)
    [1] 0.1586553
(ii)
    > pnorm(23,20,5)-pnorm(14,20,5)
    [1] 0.6106772
(iii)
    > qnorm(0.9345,20,5)
    [1] 27.55085

Sampling Distribution of the Mean

    > samples_size_2 <- replicate(100,mean(sample(0:9,2, replace=TRUE)))
    > samples_size_3 <- replicate(100,mean(sample(0:9,3, replace=TRUE)))
    > samples_size_4 <- replicate(100,mean(sample(0:9,4, replace=TRUE)))
    > samples_size_5 <- replicate(100,mean(sample(0:9,5, replace=TRUE)))
    > samples_size_10 <- replicate(100,mean(sample(0:9,10, replace=TRUE)))
    > hist(samples_size_2,freq=F,main = "100 means of samples of size 2")
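As a rough numerical check on the replicate() calls above, the sketch below compares the simulated sample means with what theory suggests: the digits 0:9 have mean 4.5 and population variance 8.25, so means of samples of size n should centre near 4.5 with variance near 8.25/n. The seed and the names population, sizes and results are assumptions added for illustration; they are not part of the lab sheet.

```r
set.seed(1)  # hypothetical seed, only so the sketch is reproducible

population <- 0:9
pop_mean <- mean(population)                   # 4.5
pop_var  <- mean((population - pop_mean)^2)    # 8.25 (population variance, divides by N)

sizes <- c(2, 3, 4, 5, 10)

# For each sample size, draw 100 sample means and summarise them
results <- sapply(sizes, function(n) {
  means <- replicate(100, mean(sample(population, n, replace = TRUE)))
  c(n = n,
    mean_of_means   = mean(means),
    var_of_means    = var(means),
    theoretical_var = pop_var / n)
})

# One row per sample size
print(round(t(results), 3))
```

With only 100 replicates the simulated variances will wobble around the theoretical column, but the shrinking spread as n grows should be visible, which is what the histograms illustrate.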
    > hist(samples_size_2,freq=F,main = "100 means of samples of size 2", xlab="mean")
    > hist(samples_size_3,freq=F,main = "100 means of samples of size 3", xlab="mean")
    > hist(samples_size_4,freq=F,main = "100 means of samples of size 4", xlab="mean")
    > hist(samples_size_5,freq=F,main = "100 means of samples of size 5", xlab="mean")
    > hist(samples_size_10,freq=F,main = "100 means of samples of size 10", xlab="mean")

    > set.seed(1000)
    > weight <- rnorm(30, 24, 1.2)
    > hist(weight)
    > fivenum(weight)
    > IQR(weight)
    > boxplot(weight, main="Weight Data", ylab="Weight")

    > mean(weight)
    [1] 23.8181
    > var(weight)
    [1] 1.372586

    > qnorm(1-.035)
    [1] 1.811911

The 93% C.I. is given by $23.8181 \pm 1.8119 \times \frac{1.2}{\sqrt{30}}$ = (23.421, 24…

Find the smallest positive integer k such that $P(X \le k) \ge 0.5$.

    > qbinom(.5,50,1/6)
    [1] 8

a) What are the mean and standard deviation of X?

E(X) = np = 50(1/6) = 8.33.
SD(X) = $\sqrt{50 \left(\tfrac{1}{6}\right)\left(\tfrac{5}{6}\right)}$ = 2.635

Simulate this experiment by randomly generating the corresponding data 1000 times. Compute the mean and standard deviation of the generated data and compare with your answers to (c).

    x <- rbinom(1000,50,1/6)
    mean(x)
    sd(x)

The values of the mean and standard deviation are close to the theoretical values.
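Returning to the 93% confidence interval for the weight data earlier in this section, the arithmetic can be done in R rather than by hand. This is a minimal sketch that assumes the population standard deviation is known to be 1.2 (the value passed to rnorm()) and reuses the sample mean 23.8181 reported in the notes. The variable names xbar, sigma, conf, half_width and ci are illustrative.

```r
xbar  <- 23.8181   # sample mean reported in the notes
sigma <- 1.2       # assumed known population sd (the value used in rnorm())
n     <- 30        # sample size
conf  <- 0.93      # confidence level

# Two-sided critical value: leaves (1 - conf)/2 = 0.035 in each tail
z <- qnorm(1 - (1 - conf) / 2)

# Half-width of the z-interval: z * sigma / sqrt(n)
half_width <- z * sigma / sqrt(n)

ci <- c(lower = xbar - half_width, upper = xbar + half_width)
print(round(ci, 3))
```

If sigma were not treated as known, a t-interval based on sd(weight) and qt() would be the usual alternative. The notes use the z-interval, so the sketch follows that choice.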