hw3_sol (pdf, 7 pages)
School: University of California, Irvine
Course: Statistics 120B
Date: Feb 20, 2024
Stats 120B Homework 3: Solutions 1. Let X 1 , . . . , X n be a random sample from Geometric distribution with pmf P ( X = x ) = (1 - p ) x p. The mean of this distribution is (1 - p ) /p . (a) Find the estimator for p using method of moments. Match (1 - p ) /p with ¯ X n , we get ˆ p = 1 ¯ X n + 1 . (b) Write down the likelihood function of X 1 , . . . , X n . Find the MLE of p . The likelihood function is L ( p ) = n Y i =1 (1 - p ) x i p = (1 - p ) i x i p n . Taking log and the derivative, we get log L ( p ) = log(1 - p ) X i x i + n log p. d dp log L ( p ) = i x i p - 1 + n p = 0 . ˆ p = 1 ¯ X n + 1 . (c) Consider a Beta prior on p , i.e., p Beta( α, β ). Find the posterior distribu- tion of p . Take the product of prior and the likelihood, we get ξ ( β | x ) p α - 1 (1 - p ) β - 1 × (1 - p ) i x i p n = p n + α - 1 (1 - p ) i x i + β - 1 . So the posterior distribution is Beta( n + α, i X i + β ). (d) What is the Bayes estimator of p under squared error loss? Denote it by ˆ p B . The Bayes estimator under squared error loss is the posterior mean of Beta( n + α, i X i + β ), which is ˆ p B = n + α n + α + i X i + β 1
(e) What happens to $\hat{p}_B$ if both $\alpha$ and $\beta$ go to 0?

As $\alpha, \beta \to 0$,
$$\hat{p}_B = \frac{n+\alpha}{n+\alpha+\sum_i X_i + \beta} \to \frac{n}{n + \sum_i X_i} = \frac{1}{\bar{X}_n + 1},$$
which is the MLE/MOM estimator. This result can be viewed as the prior contributing no weight, so the data likelihood solely determines the posterior distribution.

2. The mgf of $\chi^2(m)$ is $(1-2t)^{-m/2}$ for any $t < 1/2$.

(a) In the textbook, Theorem 8.2.2 states that if the random variables $X_1, \ldots, X_k$ are independent with $X_i \sim \chi^2(m_i)$ for $i = 1, \ldots, k$, then their sum $\sum_{i=1}^k X_i \sim \chi^2(\sum_{i=1}^k m_i)$. Prove this theorem using mgfs.

By independence, the mgf of $\sum_{i=1}^k X_i$ is
$$\prod_{i=1}^k M_{X_i}(t) = \prod_{i=1}^k (1-2t)^{-m_i/2} = (1-2t)^{-\sum_{i=1}^k m_i/2},$$
which is the mgf of $\chi^2(\sum_{i=1}^k m_i)$.

(b) Theorem 8.2.3 states that if $Z \sim N(0,1)$, then $Y = Z^2 \sim \chi^2(1)$. Prove this theorem using mgfs. (Hint: make the integral look like a known pdf.)

For $t < 1/2$,
$$M_Y(t) = E(e^{tY}) = E(e^{tZ^2}) = \int_{-\infty}^{\infty} e^{tz^2} \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\, dz = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-(1-2t)z^2/2}\, dz.$$
Substituting $y = \sqrt{1-2t}\, z$ gives
$$M_Y(t) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \frac{1}{\sqrt{1-2t}}\, dy = (1-2t)^{-1/2},$$
since the remaining integrand is the $N(0,1)$ pdf in $y$, which integrates to 1. This is the mgf of $\chi^2(1)$.

(c) If $Y \sim \chi^2(1)$, does $X = Y^{1/2}$ follow the normal distribution? Explain.

No. The support of $X$ is $[0, \infty)$, whereas a normal random variable has support on the whole real line.

(d) Prove the following result using the CLT: let $X \sim \chi^2(n)$; then
$$\frac{X - n}{\sqrt{2n}} \xrightarrow{d} N(0,1) \quad \text{as } n \to \infty.$$

We can write $X = Y_1 + \cdots + Y_n$ with $Y_1, \ldots, Y_n$ i.i.d. $\chi^2(1)$, so $E(Y_i) = 1$ and $\text{Var}(Y_i) = 2$. By the CLT,
$$\frac{\bar{Y}_n - 1}{\sqrt{2/n}} \xrightarrow{d} N(0,1).$$
Multiplying both numerator and denominator by $n$, the result follows since $X = n\bar{Y}_n$:
$$\frac{n\bar{Y}_n - n}{\sqrt{2n}} = \frac{X - n}{\sqrt{2n}} \xrightarrow{d} N(0,1).$$

3. Define the following random variables:
$$X_1, X_2, X_3, X_4 \overset{\text{iid}}{\sim} N(0,1), \qquad W \sim N(5, 9), \qquad V_1, V_2 \overset{\text{iid}}{\sim} \chi^2(4).$$
Assume $X_1, \ldots, X_4, W, V_1, V_2$ are all independent of each other. Also define
$$\bar{X} = \tfrac{1}{4}(X_1 + X_2 + X_3 + X_4).$$
Each of the following random variables, $Y$, has either a normal distribution, $\chi^2$ distribution, or $t$-distribution. For each random variable, state the name of the distribution and the value(s) of its parameter(s). Note: you will not be deriving these distributions (e.g., using the cdf method for transformations or finding mgfs). Instead, study carefully the properties of the three distributions mentioned above, and justify your answer by rearranging each random variable to fit these properties.

(a) $Y = X_1 + X_2$. By Theorem 5.6.7, $Y \sim N(0, 2)$.

(b) $Y = X_1^2 + X_2^2$. By Corollary 8.2.1, $Y \sim \chi^2(2)$.

(c) $Y = \left(\frac{W-5}{3}\right)^2$. Since $(W-5)/3 \sim N(0,1)$, by Theorem 8.2.3, $Y \sim \chi^2(1)$.

(d) $Y = 4\bar{X}^2$. By Theorem 8.3.1, $\bar{X} \sim N(0, 1/4)$, so $(\bar{X}-0)/\sqrt{1/4} = 2\bar{X} \sim N(0,1)$. Thus, by Theorem 8.2.3,
$$Y = \left(\frac{\bar{X} - 0}{1/2}\right)^2 = (2\bar{X})^2 \sim \chi^2(1).$$
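A small simulation sketch (an assumed check, not part of the solutions; Python is used here for a self-contained example) can confirm part (d): $Y = 4\bar{X}^2$ should have the $\chi^2(1)$ moments $E(Y) = 1$ and $\text{Var}(Y) = 2$.

```python
# Illustrative check (assumed setup): simulate Y = 4 * Xbar^2 and compare
# its sample mean and variance with the chi^2(1) values E[Y] = 1, Var[Y] = 2.
import random

random.seed(2)
reps = 40_000
ys = []
for _ in range(reps):
    xbar = sum(random.gauss(0, 1) for _ in range(4)) / 4   # Xbar ~ N(0, 1/4)
    ys.append(4 * xbar ** 2)                               # Y = (2 * Xbar)^2

mean_y = sum(ys) / reps
var_y = sum(y * y for y in ys) / reps - mean_y ** 2
print(round(mean_y, 2), round(var_y, 2))
```

Both sample moments land close to the $\chi^2(1)$ values, as the rearrangement $Y = (2\bar{X})^2$ predicts.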
(e) $Y = \frac{1}{2}(X_1+X_2)^2 + \frac{1}{2}(X_3+X_4)^2$. By Theorem 5.6.7, $X_1 + X_2 \sim N(0,2)$ and $X_3 + X_4 \sim N(0,2)$. Thus $(X_1+X_2)/\sqrt{2} \sim N(0,1)$ and $(X_3+X_4)/\sqrt{2} \sim N(0,1)$. By Corollary 8.2.1,
$$Y = \left(\frac{X_1+X_2}{\sqrt{2}}\right)^2 + \left(\frac{X_3+X_4}{\sqrt{2}}\right)^2 \sim \chi^2(2).$$

(f) $Y = \frac{2X_1}{\sqrt{V_1}}$. By Definition 8.4.1, and the fact that $X_1$ is independent of $V_1$,
$$Y = \frac{X_1}{\sqrt{V_1/4}} \sim t(4).$$

(g) $Y = \bar{X}\sqrt{\dfrac{32}{V_1+V_2}}$. As shown in (d), $2\bar{X} \sim N(0,1)$. Also, by Theorem 8.2.2, $V_1 + V_2 \sim \chi^2(8)$, and $\bar{X}$ is independent of $V_1 + V_2$. Thus, by Definition 8.4.1,
$$Y = \frac{2\bar{X}}{\sqrt{(V_1+V_2)/8}} \sim t(8).$$

(h) $Y = \frac{V_1}{V_2}$. Note that $V_1$ and $V_2$ are independent, and
$$Y = \frac{V_1/4}{V_2/4} \sim F(4,4).$$

(i) $Y = \frac{X_1^2}{X_2^2}$. Since $X_1^2 \sim \chi^2(1)$ and $X_2^2 \sim \chi^2(1)$, and $X_1^2$ is independent of $X_2^2$,
$$Y = \frac{X_1^2/1}{X_2^2/1} \sim F(1,1).$$

(j) $Y = \frac{W-5}{3X_1}$. Since $(W-5)/3 \sim N(0,1)$ and $X_1 \sim N(0,1)$, and they are independent, $Y$ follows the standard Cauchy distribution, which is also the $t(1)$ distribution.

(k) $Y = \frac{32\bar{X}^2}{V_1+V_2}$. From (g) we know
$$T = \bar{X}\sqrt{\frac{32}{V_1+V_2}} \sim t(8).$$
Since $Y = T^2$, $Y \sim F(1, 8)$.

4. Suppose that a random sample of eight observations is taken from the normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$, and that the observed values are 3.1, 3.5, 2.6, 3.4, 3.8, 3.0, 2.9, and 2.2. Find the shortest confidence interval for $\mu$ with each of the following three confidence coefficients: (a) 0.90, (b) 0.95, and (c) 0.99.

R code and output:

> dat = c(3.1,3.5,2.6,3.4,3.8,3.0,2.9,2.2)
> m = mean(dat)
> m
[1] 3.0625
> s = sd(dat)
> s
[1] 0.5125218
> n = 8
>
> # (a) 90% CI for mu:
> q = qt(.95,7)
> q
[1] 1.894579
> m + c(-1,1)*q*s/sqrt(n)
[1] 2.719195 3.405805
>
> # (b) 95% CI for mu:
> q = qt(.975,7)
> q
[1] 2.364624
> m + c(-1,1)*q*s/sqrt(n)
[1] 2.634021 3.490979
>
> # (c) 99% CI for mu:
> q = qt(.995,7)
> q
[1] 3.499483
> m + c(-1,1)*q*s/sqrt(n)
[1] 2.42838 3.69662

(a) $3.0625 \pm (1.8946)\dfrac{0.5125}{\sqrt{8}} = (2.719, 3.406)$

(b) $3.0625 \pm (2.3646)\dfrac{0.5125}{\sqrt{8}} = (2.634, 3.491)$
(c) $3.0625 \pm (3.4995)\dfrac{0.5125}{\sqrt{8}} = (2.428, 3.697)$

5. Let $X_1, \ldots, X_n$ be a random sample from a Bernoulli($p$) distribution. Define the sample proportion
$$\hat{p} = \frac{X_1 + \cdots + X_n}{n}.$$

(a) According to the Central Limit Theorem, what is the approximate probability distribution of $\hat{p}$ for large $n$?
$$\hat{p} \overset{\cdot}{\sim} N\left(p, \frac{p(1-p)}{n}\right)$$

(b) Give an expression for the standardized sample proportion, and argue why it is a pivotal quantity.

The standardized sample proportion is
$$\frac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}.$$
This is a pivotal quantity since it has an approximate standard normal distribution, which does not depend on any unknown parameters.

(c) Use your result in part (b) to derive an approximate large-sample $(1-\alpha)\times 100\%$ confidence interval for the population proportion $p$. Hint: we cannot know the true variance of $\hat{p}$ from sample data, so you will need to replace $p$ by $\hat{p}$ in the expression for the variance of $\hat{p}$. Note that this will not change the approximate distribution of the pivotal quantity.

Let $z_\beta$ be the $\beta$-quantile of the standard normal distribution. Then for large $n$,
$$1 - \alpha \approx P\left(z_{\alpha/2} < \frac{\hat{p}-p}{\sqrt{p(1-p)/n}} < z_{1-\alpha/2}\right) = P\left(\hat{p} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} < p < \hat{p} + z_{1-\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right).$$
Since $p$ is unknown, we need to estimate the standard deviation of $\hat{p}$, which results in the approximate $(1-\alpha)\times 100\%$ confidence interval
$$\hat{p} \pm z_{1-\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}.$$
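The interval from part (c) is easy to package as a small helper; this is a hedged Python sketch (the function name `wald_ci` is our own, not from the text), shown with the survey numbers that appear later in this problem:

```python
# Sketch of the approximate (1 - alpha) interval from part (c):
# p_hat ± z_{1 - alpha/2} * sqrt(p_hat * (1 - p_hat) / n).
import math
from statistics import NormalDist

def wald_ci(p_hat, n, alpha=0.05):
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{1 - alpha/2}
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - half, p_hat + half)

# Example: p_hat = 0.66 with n = 2000 (the Gallup poll in part (d)).
lo, hi = wald_ci(0.66, 2000)
print(round(lo, 4), round(hi, 4))   # -> 0.6392 0.6808
```

The endpoints match the R output `qnorm`-based computation used in part (d) below.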
(d) A Jan. 31, 2014, Gallup article reported that "Russians See Gold in Sochi Olympic Games Yet Many Concerned About Corruption" (http://www.gallup.com/poll/167138/russians-gold-sochi-olympic-games.aspx). The article reports on results of a poll of a random sample of 2,000 adults in Russia. In the sample, 66% responded that they thought the Olympic games would increase corruption.

i. Using your formula from part (c), construct an approximate 95% confidence interval for the true proportion of all adult Russians that believe the Olympic games will increase corruption.

R code and output:

> .66+c(-1,1)*qnorm(.975)*sqrt(.66*(1-.66)/2000)
[1] 0.6392392 0.6807608

An approximate 95% confidence interval for $p$ is
$$.66 \pm 1.96\sqrt{\frac{.66(1-.66)}{2000}} = .66 \pm .0208 = (.6392, .6808).$$

ii. Give an interpretation of your interval in context, i.e., "We are 95% confident that ______."

We are 95% confident that the true proportion of adults in Russia that believe the Olympic games will increase corruption (at the time of the survey) is between .6392 and .6808.

iii. Explain what the phrase "95% confident" means in your interpretation.

By "95% confident", we mean that if we were to repeat this study many times, then in 95% of random samples of 2,000 adults in Russia, the confidence interval produced by the sample would capture the true proportion of adult Russians that believe the Olympic games will increase corruption.
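The repeated-sampling interpretation in iii can be illustrated by simulation; this Python sketch (an assumed setup with an illustrative true $p$, not part of the solutions) draws many samples of 2,000 and counts how often the part (c) interval captures the truth:

```python
# Illustration of "95% confident": repeat the sampling many times and count
# how often the approximate interval p_hat ± 1.96 * sqrt(p_hat(1-p_hat)/n)
# captures the true proportion p.
import math
import random
from statistics import NormalDist

random.seed(4)
p_true, n, reps = 0.66, 2000, 2_000
z = NormalDist().inv_cdf(0.975)

hits = 0
for _ in range(reps):
    p_hat = sum(random.random() < p_true for _ in range(n)) / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    if p_hat - half < p_true < p_hat + half:
        hits += 1

coverage = hits / reps
print(round(coverage, 3))
```

With $n = 2{,}000$ and $p$ well away from 0 and 1, the empirical coverage comes out close to the nominal 95%.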