ST 314 Data Analysis 2

School

Oregon State University, Corvallis

Course

314

Subject

Statistics

Date

Apr 3, 2024

Uploaded by lucekimb

Intellectual Property of Kelsi Espinoza ©

Question 1. Identify the distribution

For each random variable:

State the distribution that will best model the random variable. Choose from the common distributions:
o Uniform
o Exponential
o Normal
Briefly explain your reasoning. Each distribution should be selected one time.

State the parameter(s) and their value(s) that describe the distribution.

Give the probability density function. Be sure to include the sample space.

a. Random Variable 1. In a board game, individuals must attempt to guess a phrase based on clues from their teammate. If they successfully guess the phrase before a buzzer sounds, their teammate may give clues for another phrase. Each phrase guessed correctly before the buzzer sounds earns a point in the game. The buzzer is set to a random time increment anywhere between 35 and 90 seconds. Consider the time until the buzzer sounds to be a random variable where any time between 35 and 90 seconds has an equal likelihood of occurring.

b. Random Variable 2. An industrial process yields a large number of steel cylinders. The length of a cylinder is a random variable with an average of 4.25 centimeters and a standard deviation of 0.012 centimeters. The distribution of cylinder lengths is symmetrical, where lengths are more likely to be close to the mean than further away from it.
(iii) $$f(x) = \frac{1}{0.012\sqrt{2\pi}}\, e^{-\frac{(x-4.25)^2}{2(0.012)^2}}, \quad -\infty < x < \infty$$

c. Random Variable 3. A statistics student has a part-time job as a coffee shop barista. They realize the time between customer orders is a random variable. During an eight-hour shift, they measure the time between consecutive customer orders and find that it is, on average, 42 seconds. They also discover that times are more likely to be close to 0 and become less likely as they get further away from 0 (as values get higher).

Question 2. Normal Distributions

Some companies "grade on a bell curve" to compare the performance of their managers and professional workers. This forces the use of some low performance ratings so that not all workers are listed as "above average." Kia Motor Company's "performance management process" for this year assigned 11% A grades, 78% B grades, and 11% C grades to the company's managers. Suppose Kia's performance scores really are Normally distributed. This year, managers with scores less than 150 received C grades and those with scores of at least 377 received A grades.

a. What is the z-score associated with the 11th percentile from the standard normal distribution?

The 11th percentile corresponds to a cumulative probability of 0.11. The z-score (z) associated with this percentile is approximately -1.227.

b. What is the mean and standard deviation of the performance scores? Show work.
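The arithmetic behind parts (a) and (b) can be checked with a short script. This is only a sketch in Python (the assignment itself uses R), and the variable names are illustrative. It assumes the model stated in the question: scores are normal, the C cutoff of 150 sits at the 11th percentile, and the A cutoff of 377 sits at the 89th percentile, so the two cutoffs are symmetric about the mean.

```python
from statistics import NormalDist

# Standard normal distribution, used only to look up a quantile.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Part (a): z-score whose cumulative probability is 0.11.
z_low = std_normal.inv_cdf(0.11)

# Part (b): cutoffs taken from the question text.
c_cutoff = 150.0   # scores below this received a C (bottom 11%)
a_cutoff = 377.0   # scores at or above this received an A (top 11%)

# Equal tail areas imply the mean is the midpoint of the two cutoffs.
mean_score = (c_cutoff + a_cutoff) / 2

# The A cutoff lies |z_low| standard deviations above the mean.
sd_score = (a_cutoff - mean_score) / abs(z_low)

print(f"z for the 11th percentile: {z_low:.4f}")
print(f"mean = {mean_score:.1f}, standard deviation = {sd_score:.2f}")
```

Under these assumptions the mean is the midpoint 263.5, and the standard deviation is the distance 113.5 divided by the magnitude of the z-score.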
c. Suppose the company adds grades D and F, so now there are 5 categories to grade performance. If they want to give As only to those in the top 7%, what performance score must a manager exceed to get an A?

Question 3. Simulation of Gamma Random Variables

Background: When we use the probability density function to find probabilities for a random variable, we are using the density function as a model. This is a smooth curve, based on the shape of observed outcomes for the random variable. The observed distribution will be rough and may not follow the model exactly. The probability density curve, or function, is still just a model for what is actually happening with the random variable. In other words, there can be some discrepancies between the actual proportion of values above x and the proportion of area under the curve above the same value x. Our expectation is that as the number of observations increases, literally or theoretically, the observed distribution will align more closely with the density curve. Over the long run, the differences are negligible, and the model is sufficient and more convenient for finding the desired information.

Simulation: Use R to simulate 1000 observations from a gamma distribution. To begin, alpha = 3 and beta = 5. Highlight and run the parameters and observation values. Run the simulation code to plot the observations and fit the probability density function over the observations. You don't need to change anything. You may run the whole section at once by highlighting it and clicking the run button at the top of the script window.

a. Given the values are from a gamma distribution with alpha = 3 and beta = 5,
i. What is the expression for the probability density function?

ii. What is the average and standard deviation of the random variable? Show work.

iii. What is the probability X is less than 4? Show work.

b. Run the simulation and paste your plot. Comment on the general shape of the distribution. How well does the density curve fit the observations?
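The quantities asked for in part (a) can be sketched numerically as follows. This is a Python illustration rather than the course's R script, and the function names are made up for the example. It assumes beta is a scale parameter, so the density is x^(alpha-1) e^(-x/beta) / (Gamma(alpha) beta^alpha) with mean alpha*beta; if the course treats beta as a rate, the formulas change accordingly. The closed-form CDF used here is valid only for positive integer alpha.

```python
import math

ALPHA = 3     # shape, from the assignment
BETA = 5.0    # ASSUMPTION: scale parameter (mean = ALPHA * BETA)


def gamma_pdf(x, alpha=ALPHA, beta=BETA):
    """Gamma density with shape alpha and scale beta; zero outside x > 0."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)


def gamma_cdf_integer_shape(x, alpha=ALPHA, beta=BETA):
    """P(X < x) for a positive integer shape, using the Erlang closed form."""
    if x <= 0:
        return 0.0
    t = x / beta
    partial_sum = sum(t ** k / math.factorial(k) for k in range(alpha))
    return 1.0 - math.exp(-t) * partial_sum


mean = ALPHA * BETA                  # E[X] = alpha * beta
sd = math.sqrt(ALPHA) * BETA         # SD[X] = sqrt(alpha) * beta
p_below_4 = gamma_cdf_integer_shape(4.0)

print(f"mean = {mean:.2f}, sd = {sd:.3f}")
print(f"P(X < 4) = {p_below_4:.4f}")
```

With the scale interpretation, only a small share of the area lies below 4, because the mean is 15.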
c. What is the exact proportion of values below 4? How does the actual proportion compare to the probability from the density curve in Q3(a-iii)?

d. Increase the number of observations to 10000 and rerun the simulation. Paste your plot. How does increasing the number of observations affect the fit of the density curve?

e. What is the exact proportion of values below 4? How does increasing the number of observations affect the accuracy of the model? Make a comparison between this proportion and 3-a-iii and 2c.
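The comparison in parts (c) through (e) can be sketched with a small simulation. This is a Python stand-in for the course's R script, which is not reproduced in the source; the seed, the helper name `proportion_below`, and the scale interpretation of beta are all assumptions made for illustration.

```python
import math
import random

ALPHA = 3     # shape, from the assignment
BETA = 5.0    # ASSUMPTION: scale parameter, matching random.gammavariate
CUTOFF = 4.0

# Model probability P(X < 4) from the closed-form CDF for integer shape 3.
t = CUTOFF / BETA
MODEL_PROB = 1.0 - math.exp(-t) * (1 + t + t ** 2 / 2)


def proportion_below(cutoff, n, seed):
    """Simulate n gamma observations and return the share below cutoff."""
    rng = random.Random(seed)  # fixed seed so a rerun gives the same draws
    draws = [rng.gammavariate(ALPHA, BETA) for _ in range(n)]
    return sum(d < cutoff for d in draws) / n


for n in (1000, 10000):
    observed = proportion_below(CUTOFF, n, seed=314)
    gap = abs(observed - MODEL_PROB)
    print(f"n = {n:>5}: observed = {observed:.4f}, "
          f"model = {MODEL_PROB:.4f}, |difference| = {gap:.4f}")
```

The observed proportion varies from run to run. Its typical distance from the model probability shrinks roughly like 1/sqrt(n), which is the behavior the Background paragraph describes, although any single pair of runs need not show a smaller gap at the larger sample size.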
f. Rerun the simulation with alpha = 1, beta = 5, and observations = 10000. Paste your plot.

g. Comment on the general shape of the distribution.

The distribution is strongly right-skewed: the density is highest near 0 and falls off steadily, leaving a long tail to the right.

h. This model is a special case of the gamma distribution. Which distribution is it, specifically? What is the expression for the probability density function?

i. Optional: Change the parameter values and take note of the effect of increasing or decreasing parameter values.

As alpha increases, the distribution becomes more symmetric and increasingly resembles a normal distribution. The peak also gets lower as alpha increases.
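Parts (h) and (i) can be illustrated numerically. The sketch below is in Python rather than R, uses hypothetical helper names, and again assumes beta is a scale parameter. It checks that the gamma density with alpha = 1 matches an exponential density with mean beta, and it prints the gamma skewness 2/sqrt(alpha), which shrinks toward 0 (the skewness of a normal distribution) as alpha grows.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta; zero outside x > 0."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)


def exponential_pdf(x, mean):
    """Exponential density with the given mean; zero outside x > 0."""
    if x <= 0:
        return 0.0
    return math.exp(-x / mean) / mean


def gamma_skewness(alpha):
    """Skewness of a gamma distribution; it depends only on the shape."""
    return 2.0 / math.sqrt(alpha)


BETA = 5.0  # ASSUMPTION: scale parameter

# Part (h): with alpha = 1 the two densities agree at every test point.
for x in (0.5, 4.0, 20.0):
    g = gamma_pdf(x, 1, BETA)
    e = exponential_pdf(x, BETA)
    print(f"x = {x:>4}: gamma(1, 5) = {g:.5f}, exponential(mean 5) = {e:.5f}")

# Part (i): skewness falls as alpha rises, so the shape becomes more symmetric.
for alpha in (1, 3, 10, 50):
    print(f"alpha = {alpha:>2}: skewness = {gamma_skewness(alpha):.3f}")
```

The alpha values in the second loop are arbitrary choices for the demonstration, not values prescribed by the assignment.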