School

University of California, Santa Barbara

Course

108C

Subject

Statistics

Date

Jun 11, 2024

Pages

11

Name: ______________________________
Student’s perm #: ________________________
Student’s signature: ________________________
Grade = ______________ /80 pts.

MCDB 108C Spring 2024
MIDTERM EXAM #2

DO NOT OPEN THE EXAM UNTIL YOU ARE INSTRUCTED TO DO SO

This exam should have 11 pages. Two blank pages are collated to the exam for you to draft your answers.

Please put your name on every page.

You are allowed to use two pages of your own hand-written notes. While you may not use a computer or smart phone during the exam, you’re welcome to use a basic scientific calculator.

In your answers, you should show how you arrived at your final conclusion(s).

Each point of this exam represents 1 point of the final grade of the class (total: 500 points).

Part A: The following set of questions focuses on a problem introduced during the sections of Week 4.

A “mother” cell contains 5 green fluorescent protein (GFP) molecules. When the mother cell divides, the GFP molecules segregate at random and with equal probability between the two daughter cells, α and β.

Note: In your computation of probabilities, you can leave numerical fractions without calculating their exact values.

Question A1 [8 points]: What is the probability that all GFP molecules end in daughter cell α? Briefly justify the probabilistic model you’re using, including its assumptions.

Assumptions: the segregation of each molecule between cells α and β is a Bernoulli trial. This model is justified because (1) each molecule ends in either cell α or cell β (2 possible outcomes) and (2) the probability of each outcome is fixed and equal to 0.5. Finally, the application of the Bernoulli trial implies that (3) each molecule segregates independently of the others. The outcome of a collection of Bernoulli trials is described by the Binomial distribution. If k denotes the number of GFP molecules in cell α, the probability that all 5 molecules are found in cell α is:

P(k=5; n=5; p=0.5) = (1/2)^5 = 1/32 ≈ 0.0313

Rubric:
+4 pts for listing the 3 assumptions of a Bernoulli trial and the use of a Binomial distribution. In principle the students should explicitly mention the Bernoulli trial in their justification, but this might not be necessary if they justify the use of the Binomial distribution.
+4 pts for computing the probability. Note that this value can be obtained without using the Binomial distribution.

Question A2 [10 points]: What is the probability of observing 3 or 4 GFP molecules in cell α?

Let k be the stochastic variable representing the number of GFP molecules in cell α. The outcomes k=3 and k=4 are mutually exclusive, so the probability of observing 3 or 4 GFP molecules in cell α is computed by applying the addition rule of probabilities for mutually exclusive events:

P(k=3 or k=4; n=5; p=0.5) = P(k=3; n=5; p=0.5) + P(k=4; n=5; p=0.5)
= 5!/(3! 2!) (1/2)^5 + 5!/(4! 1!) (1/2)^5
= 10 (1/2)^5 + 5 (1/2)^5
= 15/32 ≈ 0.4688

Rubric:
+2 pts for associating the number of GFP molecules in cell α with a random variable
+4 pts for applying and justifying the use of the additive rule for mutually exclusive events
+4 pts for using the correct formula of the Binomial distribution for k=3 and k=4

Question A3 [8 points]: Consider the segregation of many GFP molecules between cells α and β. This time, the segregation is asymmetrical.
The probability that a GFP molecule gets transmitted to cell β is very low: 10^-3. The total number of GFP molecules is large: 2,000. What is the probability of observing at least 1 GFP molecule in the daughter cell β? Please briefly justify the probabilistic model you are using to answer this question.

As discussed above, the segregation of each GFP molecule can be viewed as a Bernoulli trial. In this problem, a successful outcome is the transmission of a given molecule to cell β. Since the number of Bernoulli trials is very large (n = 2,000) and the probability of observing a successful outcome is very small (p = 0.001), the Binomial distribution can be approximated by the Poisson distribution. Let k be a stochastic variable describing the final number of molecules in cell β. The expected (or average) number of GFP molecules in cell β is given by μ = n * p = 2.

P(k≥1; μ=2) = 1 - P(k=0; μ=2) = 1 - (2^0/0!) * e^(-2) = 1 - 1/e^2 ≈ 0.8647

Rubric:
+3 pts for justification of the application of the Poisson distribution
+2 pts for using the complementary probability (1-P)
+3 pts for making proper use of the formula of the Poisson distribution
-3 pts if a student failed to use the complementary probability (1-P) to calculate the final probability

Part B: Imagine a small pond from which mosquito larvae hatch. During the middle of the night, 100 mosquitoes are released at the same time from the pond. Humans sleep in a house located near the pond. To estimate the likelihood that humans will get bitten by the mosquitoes, you are asked to model the dispersion of the mosquitoes as a 1D random walk.

Facts about the mosquitoes’ dispersion:
Assume that all 100 mosquitoes are released from the same site x = 0 m at the same time.
Positions to the right side of the pond are positive.
During intervals of 1 hour, the mosquitoes move with equal probability either to the left or to the right by steps of 10 meters.

Question B1 [8 points]: What is the fraction of mosquitoes found at position x = 0 m after 4 hours (equivalent to 4 time steps)? Hint: Use the “law of motion.”
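
The three Part A results (A1 to A3) can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `binomial_pmf` and `poisson_pmf` are illustrative and not part of the exam. It also compares the Poisson approximation used in A3 with the exact Binomial value.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """Probability of exactly k events for a Poisson distribution of mean mu."""
    return mu**k * exp(-mu) / factorial(k)


# A1: all 5 molecules end in cell alpha (equals 1/32).
p_all_in_alpha = binomial_pmf(5, 5, 0.5)

# A2: 3 or 4 molecules in cell alpha (mutually exclusive events, equals 15/32).
p_three_or_four = binomial_pmf(3, 5, 0.5) + binomial_pmf(4, 5, 0.5)

# A3: at least 1 molecule in cell beta, with n = 2000 and p = 1e-3.
n, p = 2000, 1e-3
p_poisson = 1 - poisson_pmf(0, n * p)   # Poisson approximation, mu = 2
p_exact = 1 - binomial_pmf(0, n, p)     # exact Binomial value

print(f"A1: {p_all_in_alpha:.5f}")
print(f"A2: {p_three_or_four:.5f}")
print(f"A3: Poisson {p_poisson:.4f}, exact Binomial {p_exact:.4f}")
```

The Poisson and exact Binomial values for A3 agree to about three decimal places, which is consistent with using the approximation when n is large and p is small.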

Let k be the number of steps that a mosquito travels toward the right (positive) side and n be the total number of steps. Here n = 4. The probability of moving to the right side is p = 0.5. The law of motion of the mosquito is given by:

x(k) = distance traveled to the right - distance traveled to the left
x(k) = k * 10 m - (n-k) * 10 m = (2k-n) * 10 m

Since x(k) = 0 m for k = 2, the probability of finding a mosquito at position x = 0 m is

P(k=2; n=4; p=0.5) = 4!/(2! 2!) (1/2)^4 = 6 (1/2)^4 = 6/16 = 0.375

The fraction of mosquitoes found at position x = 0 m will be 100 * P(k=2; n=4; p=0.5) = 37.5%. Note that this result can be seen as the expected (average) number of successes (mosquito being at x = 0 m) for 100 trials with a probability of success of 0.375 for the individual trials.

Rubric:
+3 pts for writing the proper law of motion and deriving the link between k=2 and x=0 m
+3 pts for computing the probability of observing k=2 with the Binomial distribution
+2 pts for computing the fraction of mosquitoes at x=0 m. There is no need to justify the formula 100 * P(k=2; n=4; p=0.5) as long as it is correct. Since the question asks for the “fraction” of mosquitoes, answering 37.5 mosquitoes wouldn’t be fully correct. I would remove 1 pt for this type of answer.

Question B2 [8 points]: Imagine that the mosquitoes are attracted to CO2 emitted by humans sleeping in the house. The detection of CO2 increases the probability that a mosquito moves to the right side to ¾ instead of ½. What is the average position of the mosquitoes after 4 hours? Please briefly justify your answer. Hint: The “law of motion” of the mosquitoes should be linear.

In this case, the same law of motion can be used: x(k) = (2k-n) * 10 m. The final position of a mosquito is a stochastic variable. Because the law of motion is linear in k, the average position after 4 hours is given by <x> = (2<k> - n) * 10 m. The average value <k> is equal to n * p = 4 * ¾ = 3. Therefore the average final position of the mosquitoes is (2 * 3 - 4) * 10 m = 20 m.

Rubric:
+3 pts for properly applying the average to x(k) (showing the formula with the sum is worth 2 pts)
+3 pts for properly calculating the value of <k>
+2 pts for the correct final value of <x> (including units)
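
The B1 and B2 answers can be reproduced from the law of motion x(k) = (2k-n) * 10 m. The sketch below is an illustration only; the function names and the `STEP_M` constant are assumptions introduced here. It builds the full distribution of positions after n steps and computes the mean position by summing over all values of k, which is the "formula with the sum" mentioned in the rubric.

```python
from math import comb

STEP_M = 10  # step length in meters, as stated in Part B


def position(k, n, step=STEP_M):
    """Law of motion: position after k steps to the right out of n steps."""
    return (2 * k - n) * step


def position_distribution(n, p):
    """Map each reachable position (in m) to its Binomial probability."""
    return {
        position(k, n): comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    }


def mean_position(n, p):
    """Average position, computed as the sum of x(k) * P(k) over all k."""
    return sum(x * prob for x, prob in position_distribution(n, p).items())


# B1: unbiased walk, fraction of mosquitoes at x = 0 m after 4 steps.
print(position_distribution(4, 0.5)[0])   # 0.375

# B2: biased walk (p = 3/4), average position after 4 steps, in meters.
print(mean_position(4, 0.75))             # 20.0
```

Summing over the distribution gives the same value as the shortcut (2 * n * p - n) * 10 m, because the law of motion is linear in k.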

Question B3 [10 points]: At a given position in the house, the probability of being bitten by a mosquito is low when 0-10 mosquitoes are found at this position. The probability of being bitten is moderate for 11-20 mosquitoes, and high for >20 mosquitoes at a given position. Given the attraction of the mosquitoes to CO2, what is the probability that an individual sleeping at position x = +40 m will be bitten by a mosquito after 4 hours?

After n = 4 steps (or hours), x(k) = 40 m for k = 4. With the attraction to CO2, p = ¾, so the probability of finding a mosquito at x = 40 m is P(k=4; n=4; p=3/4) = (3/4)^4 = 81/256 ≈ 0.3164. The average number of mosquitoes found at position x = 40 m is 100 * 0.3164 ≈ 31.6 mosquitoes, which is greater than 20. Therefore the probability of being bitten by a mosquito at this position is high.

Rubric:
+3 pts for establishing a proper link between x=40 m and k=4
+2 pts for referring to the use of P(4;4;3/4)
+2 pts for properly computing the numerical value of P(4;4;3/4)
+2 pts for deriving the right conclusion (or a conclusion consistent with the probability computed in the rest of the answer)
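
The B3 reasoning can be extended to every reachable position. The sketch below is illustrative: the function names are hypothetical, and the handling of non-integer expected counts (at most 10 is "low", at most 20 is "moderate") is an assumption, since the question only defines the categories for whole numbers of mosquitoes.

```python
from math import comb

STEP_M = 10  # step length in meters, as stated in Part B


def expected_counts(n_mosquitoes, n_steps, p):
    """Expected number of mosquitoes at each reachable position (in m)."""
    counts = {}
    for k in range(n_steps + 1):
        x = (2 * k - n_steps) * STEP_M
        prob = comb(n_steps, k) * p**k * (1 - p) ** (n_steps - k)
        counts[x] = n_mosquitoes * prob
    return counts


def risk_level(expected_count):
    """Bite risk category from B3 (boundary handling is an assumption)."""
    if expected_count <= 10:
        return "low"
    if expected_count <= 20:
        return "moderate"
    return "high"


counts = expected_counts(n_mosquitoes=100, n_steps=4, p=0.75)
for x in sorted(counts):
    print(f"x = {x:+4d} m: {counts[x]:6.2f} mosquitoes, risk {risk_level(counts[x])}")
```

At x = +40 m the expected count is 100 * (3/4)^4 ≈ 31.6, which falls in the "high" category, in agreement with the answer above.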

Part C: In coding assignment #4, you studied the movement of a Brownian particle in 2D. The main routine is copied below after minor modifications to simulate the behavior of a single particle.

Question C1 [8 points]: Imagine that you rewrite the for loop of line 7 as:

for i = 1:(number_steps-1)

How should the rest of the script be updated so that it runs as its original version? Please refer to specific line numbers.

The proposed change amounts to shifting the indexing within the for loop. Since the new loop starts with i=1, the first position computed along the x and y dimensions must be stored at index i+1. Therefore the new position at i+1 should be defined as a function of the position at i. The last index of the loop cannot be larger than (number_steps-1), so that the last position computed corresponds to index number_steps. Four edits should be made for the updated loop to run as the original:

Line 9: pos_x(i) = pos_x(i-1) + 1; should become pos_x(i+1) = pos_x(i) + 1;
Line 11: pos_x(i) = pos_x(i-1) - 1; should become pos_x(i+1) = pos_x(i) - 1;
Line 15: pos_y(i) = pos_y(i-1) + 1; should become pos_y(i+1) = pos_y(i) + 1;
Line 17: pos_y(i) = pos_y(i-1) - 1; should become pos_y(i+1) = pos_y(i) - 1;

Rubric:
+2 pts for each of the 4 correct changes
Providing an explanation of the change in indexing is worth 5 pts on its own in case the edits are incorrect.
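
The equivalence of the two indexing schemes can be illustrated in one dimension. The sketch below is written in Python rather than the MATLAB of the assignment, so indices are 0-based and every loop bound is shifted by one relative to the exam. It also assumes that the original loop looked back from index i to i-1, which the answer implies but this copy of the exam does not show. The function names are hypothetical, and the steps are drawn in advance so that both versions consume the same random moves.

```python
import random


def walk_lookback(steps):
    """Original style: position i is computed from position i-1."""
    n = len(steps) + 1
    pos = [0] * n
    for i in range(1, n):            # MATLAB analogue: for i = 2:number_steps
        pos[i] = pos[i - 1] + steps[i - 1]
    return pos


def walk_lookahead(steps):
    """Rewritten style: position i+1 is computed from position i."""
    n = len(steps) + 1
    pos = [0] * n
    for i in range(0, n - 1):        # MATLAB analogue: for i = 1:(number_steps-1)
        pos[i + 1] = pos[i] + steps[i]
    return pos


rng = random.Random(0)
steps = [1 if rng.random() < 0.5 else -1 for _ in range(20)]
print(walk_lookback(steps) == walk_lookahead(steps))   # True
```

Both loops run the same number of iterations and fill the same entries; only the name given to the "current" index changes.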

Question C2 [10 points]: You run the script 5 times. Each random walk consists of 500 steps starting from position (x=0, y=0). The 5 trajectories are plotted in the figure below. You notice that all particles ended at positions where y is equal to 500. Among the following 4 options, what is the set of probability parameters that might have produced the trajectories of the figure? Please briefly justify your choice. Hint: Explain what allows you to exclude options.

OPTION A: prob_right = 0.5; prob_up = 0.5;
OPTION B: prob_right = 0.75; prob_up = 0.75;
OPTION C: prob_right = 0.5; prob_up = 1.0;
OPTION D: prob_right = 1.0; prob_up = 1.0;

The key giveaway is that the final positions of the particles all have y=500 after 500 steps. This outcome is only possible if each particle moves upward in a deterministic way, with prob_up = 1.0. The only options compatible with a deterministic upward motion are (C) and (D). Hence (A) and (B) can be ruled out. We observe that the particles can move to the left and to the right. This excludes option (D), since this option would produce a deterministic movement upward and to the right, which isn’t the case for the trajectories shown in the figure. Therefore the only option compatible with the figure is (C).

Rubric:
+4 pts for concluding that prob_up must be equal to 1 given that y=500 for all particles.
+4 pts for concluding that prob_right cannot be equal to 1 or even 0.75, given that the trajectories show no apparent bias toward the right side.
+2 pts for correct final conclusion
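
The elimination argument of C2 can be reproduced with a small simulation. This is a sketch under stated assumptions, not the assignment's script: it assumes that at every step the particle moves by +1 or -1 along x (right with probability prob_right) and by +1 or -1 along y (up with probability prob_up), as suggested by the lines quoted in C1. The `simulate` function and the `OPTIONS` table are names introduced here.

```python
import random


def simulate(prob_right, prob_up, n_steps, rng):
    """Return the final (x, y) of a 2D walk with independent x and y moves."""
    x = y = 0
    for _ in range(n_steps):
        x += 1 if rng.random() < prob_right else -1
        y += 1 if rng.random() < prob_up else -1
    return x, y


OPTIONS = {
    "A": (0.5, 0.5),
    "B": (0.75, 0.75),
    "C": (0.5, 1.0),
    "D": (1.0, 1.0),
}

rng = random.Random(2024)
for name, (prob_right, prob_up) in OPTIONS.items():
    finals = [simulate(prob_right, prob_up, 500, rng) for _ in range(5)]
    all_y_500 = all(y == 500 for _, y in finals)
    distinct_x = len({x for x, _ in finals})
    print(f"Option {name}: all y == 500? {all_y_500}; distinct final x values: {distinct_x}")
```

Only options C and D give y = 500 for every run, and option D always ends at x = 500 as well, so only option C combines a deterministic upward motion with an unbiased spread along x.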

(extra space for C2)

Part D [10 points]: In the Luria-Delbrück experiment, briefly explain how the jackpot effect differentiates the predictions of the models proposed by Lamarck and Darwin. In your answer, please define what the “jackpot effect” is.

Key elements to be included in the answer:
1) In the Lamarckian model, the number of mutants is a random variable that follows a Poisson distribution (rare event), calculated from the population of non-resistant cells undergoing the antibiotic treatment at the last generation.
2) In the Darwinian model, the jackpot effect is associated with the exponential amplification of mutants born spontaneously in generations preceding the application of the antibiotic. The spontaneous emergence of a mutant follows a Poisson distribution for rare events, and becomes more likely as the number of non-resistant cells grows exponentially across generations.
3) The jackpot effect can result in very large numbers of mutants that are virtually impossible in the Lamarckian model. Thus the jackpot effect differentiates the statistical outcomes of the Darwinian and the Lamarckian models.

Edited answer from the forum of Week 5: In the Darwinian model, the jackpot effect statistically refers to the relatively rare occurrence of mutants at generations preceding the antibiotic challenge. These early-generation mutants are amplified exponentially by continued rounds of division until the last generation with the antibiotic treatment. The exponential amplification gives rise to the “long-tailed” nature of the distribution of mutants, in which the existence and frequency of large numbers of resistant colonies per plate is incompatible with what would be predicted by a Poisson distribution that considers only the final round of replications at the time of the antibiotic treatment (Lamarckian model). While plates with a large number of resistant colonies might not be frequent, they “stand out” in the distribution and stretch the “tail” so that it is longer than the Lamarckian model would predict.

Rubric:
+4 pts for definition of jackpot effect, its link with the Poisson distribution at every generation
+3 pts for explanation of the use of the Poisson distribution in the Lamarckian model
+3 pts for overall conclusion
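
The statistical signature of the jackpot effect can be illustrated with a toy simulation. Everything in this sketch is an assumption made for illustration: the parameter values (10 generations, mutation probability 1e-3 per daughter cell, 500 cultures), the function names, and the use of Binomial sampling as a stand-in for the Poisson distribution of rare events. The induced probability in the Lamarckian model is chosen so that both models have the same expected number of resistant cells, which leaves the variance as the distinguishing feature.

```python
import random
import statistics


def darwinian_culture(generations, mu, rng):
    """Mutants arise spontaneously at any generation and are then amplified."""
    sensitive, resistant = 1, 0
    for _ in range(generations):
        resistant *= 2                      # existing mutants divide
        daughters = 2 * sensitive           # sensitive cells divide
        new_mutants = sum(1 for _ in range(daughters) if rng.random() < mu)
        resistant += new_mutants
        sensitive = daughters - new_mutants
    return resistant


def lamarckian_culture(generations, mu, rng):
    """Resistance is acquired only at the final generation, upon exposure."""
    n_final = 2**generations
    # Same expected fraction of resistant cells as in the Darwinian model.
    p_induced = 1 - (1 - mu) ** generations
    return sum(1 for _ in range(n_final) if rng.random() < p_induced)


def fano_factor(counts):
    """Variance-to-mean ratio; close to 1 for a Poisson-like distribution."""
    return statistics.variance(counts) / statistics.mean(counts)


rng = random.Random(5)
darwin = [darwinian_culture(10, 1e-3, rng) for _ in range(500)]
lamarck = [lamarckian_culture(10, 1e-3, rng) for _ in range(500)]

print(f"Darwinian:  max = {max(darwin)}, variance/mean = {fano_factor(darwin):.1f}")
print(f"Lamarckian: max = {max(lamarck)}, variance/mean = {fano_factor(lamarck):.1f}")
```

In this toy model the Lamarckian counts keep a variance-to-mean ratio close to 1, while the Darwinian counts show a much larger ratio driven by a few cultures in which an early mutant was amplified, which is the long tail described above.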