STAT244NF_HW6_Fcn_CompMod_RN

Rmd

School

Mount Holyoke College *

*We aren’t endorsed by this school

Course

244

Subject

Statistics

Date

Apr 3, 2024

Type

Rmd

Pages

8

Uploaded by SargentArmadillo1825

Report
--- title: 'Homework 6: Functions for Compartmental Models and Reproductive Numbers' subtitle: "STAT 244NF: Infectious Disease Modeling" author: "DAISY DOAN" date: "`r format(Sys.time(), '%d %B, %Y')`" output: pdf_document --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) library(ggplot2) library(dplyr) ``` ## Instructions: 1. Please replace "YOUR NAME HERE" under author above with your name before you knit your final document for submission. 2. All of your homework needs to be completed in this document, whether it requires R or just typed responses. As we incorporate more statistical computing into the course, it will be important that you are comfortable with R Markdown, so start now. Remember that R Markdown gives us a convenient framework for reproducible statistical reports because it contains a complete record of our analyses and conclusions. 3. You may knit to PDF, HTML, or Word (click on the drop-down menu for Knit to choose the file output type). 4. Before submitting your work, please make sure the knitted file is well organized, legible, and looks the way you expect! 5. Please include the names of any classmates with whom you worked on this homework, as well as any external resources that you might have used. 6. This homework assignment is **due on Monday, March 25, 2024 and should be submitted to Gradescope**. - *Collaborators:* - *External resources:* ## Reproductive Numbers **RN 1. Write a function to calculate the reproductive number for an SI model, where the growth rate and infectious period are variables in the function and the function returns the value of the reproductive number. Call the function RN_SI. To test your function, verify that for $\Lambda=0.5$ cases/day and $D=2$ days, the resulting reproductive number is 2. You must show this to get full credit.** ```{r} RN_SI <- function(Lambda, D){ 1+Lambda*D } RN_SI(0.5, 2) ``` **RN 2. Write a function to calculate the reproductive number for an SEIR model, where the growth rate and serial interval are variables in the function, and the function returns the value of the reproductive number. Call the function RN_SEIR. To test your function, verify that for a growth rate of 1.2 cases/day and a serial interval of 2.2 days, the resulting reproductive number is 3.64. You must show this to get full credit.**
```{r} RN_SEIR <- function(Lambda, T_s){ 1+Lambda*T_s } RN_SEIR(1.2,2.2) ``` **RN 3. The herd immunity threshold, which is defined as the proportion of the population that need to be immune for the effective reproductive number to be 1 (endemic state - not growing or shrinking), is calculated as $1-\frac{1}{R_0}$. What is the the herd immunity threshold for the infection described in (b)?** ```{r} HI <- function(Lambda, T_s){ 1-(1/(1+Lambda*T_s)) } HI(1.2, 2.2) ``` **RN 4. Suppose for the infection described in (b), 75% of the population are immune through vaccination. Will the reproductive number be greater than 1 (meaning it will continue to spread in the population), exactly 1 (meaning it will remain steady in the population, neither growing nor shrinking), or less than 1 (meaning we have interrupted the chain of transmission and the infection will subside in the population)?** **Response:** ```{r} R0 <- function(HI){ 1/(1-HI) } R0(.75) ``` - $R_0>1$ (meaning it will continue to spread in the population) since herd immunity (HI) is $75\%, 0<HI<1, 1>1-HI>0, \frac{1}{1-HI}>1$ and not whole the population is susceptible. - When the population is not fully suceptible (ussually), the basic $R_0$ is a less appropriate choice for describing the infection period. Thus, other reproductive number formulations have been developed, including $$R_{eff} = R_0*s$$ where s is the fraction of the host population that is susceptible (25% in this case) $ $R_{eff}=R_0*25\%$$ ## Examining the relationship between risk and rate. When we introduced one of the potential formulations for $\lambda_t$, it was claimed in the notes that $$\underbrace{c_e\times \frac{I_{t-1}}{N}}_{\text{rate}}\approx \underbrace{1-e^{\ overbrace{-c_e\times \frac{I_{t-1}}{N}}}}^{\text{rate}}_{\text{risk}}.$$ If there is a claim made that makes you feel skeptical, you can always investigate it yourself. Is it true? If so, is it true for all relevant values, or is it true for a range. Let's briefly explore the relationship between a rate and risk by running the code below. ```{r, fig.align='center'} rate_versus_risk <- function(rate){ risk <- 1-exp(-rate)
return(data.frame(rate=rate, risk=risk)) } rate_risk_df <- rate_versus_risk(rate=seq(0,1,by=0.1)) ggplot(data=rate_risk_df, aes(x=rate, y=risk)) + geom_line(linewidth=1.1, color="cornflowerblue") + geom_vline(xintercept=0.1, linetype="dashed") + geom_hline(yintercept=0.1, linetype="dashed") + geom_vline(xintercept=0.2, linetype="dashed") + geom_hline(yintercept=0.2, linetype="dashed") + theme_bw() ``` **RR 1. Based on your examination of the plot, for what range of $c_e\frac{I_{t-1}} {N}$ do you feel comfortable with the claim that ** $$c_e\times \frac{I_{t-1}}{N}\approx 1-e^{-c_e\times \frac{I_{t-1}}{N}}?$$ **Response:** In the range of [0, 0.1], it's comfortable to make this claim since based on the plot, we can see that in this range, the line is almost a linear line with the slope of 0.5. It means that the rate is approximate to the risk and we more ussually use risk than rate in calculation since it is simpler to compute and stabler over time (in the range we consider). Otherwise, outside this range, the slope decreased and the rate get far from the risk exponentially (not obviously visualized but non-linear.) In practice, modelers tend to work with risks rather than rates (Vynnycky and White, Chapter 2) because they are easier to calculate, and tend to be approximately equal over the range that is relevant in infectious disease modeling. In the following modeling exercise for mumps, we are going to exclusively use risks for our parameters, so please pay attention to the details. ## Mumps Mumps is a viral illness characterized by fever, headache, muscle pain, lethargy, and loss of appetite, as well as swelling of facial glands. While children in the United States are vaccinated for mumps as part of the standard course of childhood vaccinations, there have still been large outbreaks of mumps on college campuses including Harvard University (2016), The Ohio State University (2014), and the University of Iowa (2006, 2016). (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8790351/). According to the CDC, the average incubation period for mumps is 16-18 days, although it can range from 12 to 25 days at the extremes. Individuals become infectious 2-3 days before facial swelling begins, and remain infectious for about 5 days after swelling begins. Since the incubation period is the time from exposure to symptom onset, and we are told that individuals are infectious 2-3 days before swelling begins (before symptom onset), this suggests a pre-infectious period of 13-16 days, with a larger range of 9 to 23 days at the extremes. The presence of this pre-infectious period suggests that an appropriate model for mumps is an SEIR model, which we can capture with the following. We are also assuming this is a fulling immunizing infection. The basic reproductive number ranges from 4 to 7. For this modeling exercise, we will use the following model. SEIR model differencing equations:
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
$$S_t=S_{t-1}-\lambda_tS_{t-1}$$ $$E_t=E_{t-1}+\lambda_tS_{t-1}-\pi E_{t-1}$$ $$I_t=I_{t-1}+\pi E_{t-1}-\rho I_{t-1}$$ $$R_t=R_{t-1}+\rho I_{t-1}$$ where $\lambda_t=1-e^{-c_e\frac{I_{t-1}}{N}}$, $\pi=1-e^{-\frac{1}{D'}}$, and $\ rho=1-e^{-\frac{1}{D}}$, where $D'$ and $D$ are the pre-infectious and infectious periods, respectively. **SEIR 1. Write a function to run a simulation of a mumps outbreak. Call your function `SEIR_simulation`. In order to receive full credit, your function must do the following:** - use the basic reproductive number (let's call it RN), N, S0, E0, I0, pD (this is $D'$), D, and TotalTime as variables. Note, if you prefer other names, you can use any informative variable names that are valid in R, but they have to be variables in the model - they cannot be fixed quantities, i.e. numbers. Also, note that N is the same at all time points, so you can get an initial value for the R compartment if you know N, S0, E0, and I0; - return a data frame with three columns, time, compartment, and count. ```{r} lambda_t_fcn <- function(RN, D, I_i, N){ c_e <- RN/D return(1-exp(-c_e*I_i/N)) } pi_t_fcn <- function(pD){ return(1-exp(-1/pD)) } rho_t_fcn <- function(D){ return(1-exp(-1/D)) } SEIR_simulation <- function(N, S0, E0, I0, RN, pD, D, Time){ SEIR_df <- data.frame(time=0:Time, S=rep(NA, Time+1), E=rep(NA,Time+1), I=rep(NA, Time+1), R=rep(NA, Time+1), E_SE = rep(NA,Time+1), I_EI = rep(NA, Time+1), R_IR = rep(NA, Time+1), lambda_t=rep(NA, Time+1)) SEIR_df$S[1] <- S0 SEIR_df$E[1] <- E0 SEIR_df$I[1] <- I0 SEIR_df$R[1] <- N-S0-E0-I0 for (t in 2:(Time+1)){ SEIR_df$lambda_t[t] <- lambda_t_fcn(RN=RN, D=D, I_i=SEIR_df$I[t-1], N=N) SEIR_df$E_SE[t] <- rbinom(n=1, SEIR_df$S[t-1],SEIR_df$lambda_t[t]) SEIR_df$I_EI[t] <- rbinom(n=1,SEIR_df$E[t-1],pi_t_fcn(pD)) SEIR_df$R_IR[t] <- rbinom(n=1, SEIR_df$I[t-1],rho_t_fcn(D)) SEIR_df$S[t] <- SEIR_df$S[t-1]- SEIR_df$E_SE[t] SEIR_df$E[t] <- SEIR_df$E[t-1]+SEIR_df$E_SE[t]-SEIR_df$I_EI[t]
SEIR_df$I[t] <- SEIR_df$I[t-1]+SEIR_df$I_EI[t]-SEIR_df$R_IR[t] SEIR_df$R[t] <- SEIR_df$R[t-1]+SEIR_df$R_IR[t] } return(data.frame(time=rep(0:Time, 4), compartment=rep(c("S","E","I", "R"), each=(Time+1)), count=c(SEIR_df$S, SEIR_df$E, SEIR_df$I, SEIR_df$R))) } ``` **SEIR 2. Run your function with the following arguments. Then, plot each result using ggplot. Note, for each combination of arguments, you will need to save your function as something so that you will be able to use that function (a data frame) for plotting.** (a) N=1000, S0=999, E0=0, I0=1, RN=4, pD=13 days, D=7 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim1 <- SEIR_simulation(1000, 999, 0, 1, 4, 13, 7, 365) ggplot(data = SEIR_sim1, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (b) N=1000, S0=999, E0=0, I0=1, RN=5.5, pD=13 days, D=7 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim2 <- SEIR_simulation(1000, 999, 0, 1, 5.5, 13, 7, 365) ggplot(data = SEIR_sim2, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (c) N=1000, S0=999, E0=0, I0=1, RN=7, pD=13 days, D=7 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim3 <- SEIR_simulation(1000, 999, 0, 1, 7, 13, 7, 365) ggplot(data = SEIR_sim3, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (d) N=1000, S0=999, E0=0, I0=1, RN=4, pD=16 days, D=7 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim4 <- SEIR_simulation(1000, 999, 0, 1, 4, 16, 7, 365) ggplot(data = SEIR_sim4, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (e) N=1000, S0=999, E0=0, I0=1, RN=5.5, pD=16 days, D=7 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim5 <- SEIR_simulation(1000, 999, 0, 1, 5.5, 16, 7, 365) ggplot(data = SEIR_sim5, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R"))
``` (f) N=1000, S0=999, E0=0, I0=1, RN=7, pD=16 days, D=7 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim6 <- SEIR_simulation(1000, 999, 0, 1, 7, 16, 7, 365) ggplot(data = SEIR_sim6, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (g) N=1000, S0=999, E0=0, I0=1, RN=4, pD=13 days, D=8 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim7 <- SEIR_simulation(1000, 999, 0, 1, 4, 13, 8, 365) ggplot(data = SEIR_sim7, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (h) N=1000, S0=999, E0=0, I0=1, RN=5.5, pD=13 days, D=8 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim8 <- SEIR_simulation(1000, 999, 0, 1, 5.5, 13, 8, 365) ggplot(data = SEIR_sim8, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (i) N=1000, S0=999, E0=0, I0=1, RN=7, pD=13 days, D=8 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim9 <- SEIR_simulation(1000, 999, 0, 1, 7, 13, 8, 365) ggplot(data = SEIR_sim9, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (j) N=1000, S0=999, E0=0, I0=1, RN=4, pD=16 days, D=8 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim10 <- SEIR_simulation(1000, 999, 0, 1, 4, 16, 8, 365) ggplot(data = SEIR_sim10, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (k) N=1000, S0=999, E0=0, I0=1, RN=5.5, pD=16 days, D=8 days, TotalTime=365 days ```{r fig.align='center'} SEIR_sim11 <- SEIR_simulation(1000, 999, 0, 1, 5.5, 16, 8, 365) ggplot(data = SEIR_sim11, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` (l) N=1000, S0=999, E0=0, I0=1, RN=7, pD=16 days, D=8 days, TotalTime=365 days ```{r fig.align='center'}
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help
SEIR_sim12 <- SEIR_simulation(1000, 999, 0, 1, 7, 16, 8, 365) ggplot(data= SEIR_sim12, aes(x=time, y=count)) + geom_line(aes(color=compartment), size=1.1) + scale_colour_discrete(limits = c("S","E","I","R")) ``` **SEIR 3. Consider your simulations in SEIR 2. If you want to understand the impact of changing a variable in a simulation, how can you do this in a way that allows you to accurately assess that impact? Specifically:** (a) Fin a group of simulations that allow you to assess the impact of changing the basic reproductive number on the simulation results? What set are you would you choose, and why? Note, there are multiple correct answers. The general idea here is when we would like to assess the impact of a variable on the simulation results, we just have to choose a group of simulations that have the same value for every other variable and they are only different in the value of the variable we would like to assess it impact. We have the list: - (a) N=1000, S0=999, E0=0, I0=1, RN=4, pD=13 days, D=7 days, TotalTime=365 days - (b) N=1000, S0=999, E0=0, I0=1, RN=5.5, pD=13 days, D=7 days, TotalTime=365 days - (c) N=1000, S0=999, E0=0, I0=1, RN=7, pD=13 days, D=7 days, TotalTime=365 days - (d) N=1000, S0=999, E0=0, I0=1, RN=4, pD=16 days, D=7 days, TotalTime=365 days - (e) N=1000, S0=999, E0=0, I0=1, RN=5.5, pD=16 days, D=7 days, TotalTime=365 days - (f) N=1000, S0=999, E0=0, I0=1, RN=7, pD=16 days, D=7 days, TotalTime=365 days - (g) N=1000, S0=999, E0=0, I0=1, RN=4, pD=13 days, D=8 days, TotalTime=365 days - (h) N=1000, S0=999, E0=0, I0=1, RN=5.5, pD=13 days, D=8 days, TotalTime=365 days - (i) N=1000, S0=999, E0=0, I0=1, RN=7, pD=13 days, D=8 days, TotalTime=365 days - (j) N=1000, S0=999, E0=0, I0=1, RN=4, pD=16 days, D=8 days, TotalTime=365 days - (k) N=1000, S0=999, E0=0, I0=1, RN=5.5, pD=16 days, D=8 days, TotalTime=365 days - (l) N=1000, S0=999, E0=0, I0=1, RN=7, pD=16 days, D=8 days, TotalTime=365 days - Group 1: (a), (b), (c) - Group 2: (d), (e), (f) - Group 3: (g), (h), (i) - Group 4: (i), (k), (l) - Since every simulations in a group is only different in RN, all other variables are the same. Therefore, if we divide 12 simulations into 4 groups of 3 like that, we can assess the impact of changing the basic reproductive number on the simulation results. (b) Find two simulations that allow you to assess the impact of changing the pre- infectious period from 13 to 16 days. Which two simulations are you examining and why? What is the impact on the resulting epidemic curves? You can comment on the slopes of the curves, the width of the E curve, etc. - Group 1: (a), (d) - Group 2: (b), (e) - Group 3: (c), (f) - Group 4: (g), (j) - Group 5: (h), (k) - Group 6: (i), (l) - Since every simulations in a group is only different in pD, all other variables are the same. Therefore, if we divide 12 simulations into 6 groups of 2 like that, we can assess the impact of changing the pre-infectious period from 13 to 16 days on the simulation results. - As the pre-infectious period increases from 13 to 16 days, the slope od the
curves is higher (the curve is steeper) and the with of the E curve is larger (E curve is wider.)