This tutorial documents how to use EpiModel to solve new deterministic compartmental models (DCMs). New model types incorporate model compartments, parameters, and structures different from the built-in SI/SIR/SIS model types in EpiModel. This extension tutorial assumes a solid familiarity with both R programming and epidemic model parameterization the other tutorials. If you are not familiar with DCMs or running this model class in EpiModel, consult the Basic DCMs with EpiModel tutorial.
First load the EpiModel package:
library(EpiModel)
To start let’s examine the mathematical syntax for our basic SI model in a closed population (no births or deaths). We first run this model in EpiModel to see the normal syntax that avoids any specific reference to the model function specifying the mathematical structure. As detailed in the basic tutorial, EpiModel automatically chooses the correct model function as a result of the input parameters, initial conditions, and control settings.
param <- param.dcm(inf.prob = 0.5, act.rate = 0.25)
init <- init.dcm(s.num = 500, i.num = 1)
control <- control.dcm(type = "SI", nsteps = 100)
mod <- dcm(param, init, control)
plot(mod)
Within EpiModel, DCMs are solved using the deSolve package,
using this framework to define a model as a function that includes
calculations for dynamic rates and the system of differential equations
for each compartment in the model. To see what one of these looks like,
we can use the control.dcm
argument
print.mod = TRUE
. When the dcm
function is run
with this, only the model structure is printed to the console.
control <- control.dcm(type = "SI", nsteps = 100, print.mod = TRUE)
mod <- dcm(param, init, control)
function(t, t0, parms) {
with(as.list(c(t0, parms)), {
# Derivations
num <- s.num + i.num
# Parameters
lambda <- inf.prob * act.rate * i.num / num
if (!is.null(parms$inter.eff) && t >= inter.start) {
lambda <- lambda * (1 - inter.eff)
}
# Flows
si.flow <- lambda * s.num
# ODEs
dS <- -si.flow
dI <- si.flow
# Output
list(c(dS, dI,
si.flow),
num = num)
})
}
<bytecode: 0x10e9ba208>
<environment: namespace:EpiModel>
The standard specification for model functions is with the top two
lines to wrap the input and output into a list structure. The
mathematical calculation of main derivatives, dS
and
dI
, are an arithmetic function of the force of infection,
lambda
and the size of the susceptible population,
s.num
. The base model functions include a modifier on the
lambda to simulate an intervention if those parameters are specified in
the model.
An alternative method for accessing the built-in model functions is to consult the internal help page for those functions.
?dcm.mods
There are currently a total of 12 model types: three disease types (SI, SIR, and SIS), two group number specifications (one versus two groups), and two demographic settings (open versus closed populations). An equivalent method to access the same model function as above is by printing the internal model function.
print(mod_SI_1g_cl)
function(t, t0, parms) {
with(as.list(c(t0, parms)), {
# Derivations
num <- s.num + i.num
# Parameters
lambda <- inf.prob * act.rate * i.num / num
if (!is.null(parms$inter.eff) && t >= inter.start) {
lambda <- lambda * (1 - inter.eff)
}
# Flows
si.flow <- lambda * s.num
# ODEs
dS <- -si.flow
dI <- si.flow
# Output
list(c(dS, dI,
si.flow),
num = num)
})
}
<bytecode: 0x10e9ba208>
<environment: namespace:EpiModel>
Each mathematical model solved using EpiModel must have its own
R
function. The core function structure should be the the
same for each model:
lambda
(the force of infection), which
changes as a function of infected population size. Here, we include the
calculation for num
, the total population size. This is not
dynamic in this particular model because of the closed population, but
it allows for a shorthand formula for lambda
. The values
for s.num
and i.num
, the number susceptible
and infected, are passed into this function via the initial conditions
we set in init.dcm
. Dynamic calculations also include
composite statistics that you would like to output with the model (e.g.,
prevalence <- i.num/num
).dS
below) but must be consistent with the output list. The derivatives for
the model not only include the compartment sizes for the disease states,
but also any flows in the model (si.flow
in this model);
that is because the flow sizes are calculated using the same method of
numerical integration as the compartment sizes.c(dS, dI, si.flow)
. Always list the derivative
objects first, in this combined vector format, and in the same order
that they are entered in the initial conditions list. After writing
these equations, one can also output dynamic statistics from the model.
These should be named in the manner below to ensure that the output
object has appropriate column names.To define the parameters, initial state sizes, and control settings
for the model specifications use the the param.dcm
,
init.dcm
, and control.dcm
helper
functions.
param.dcm
and the variables called within your
new model function..flow
: this tells
EpiModel to calculate the flow sizes at each time step as a lagged
difference rather than a cumulative size. If the flows are not named to
end with .flow
, errors will be the result.type
parameter, and instead uses the
new.mod
parameter to specify the model function.Details of these steps will be further described in the two examples below.
EpiModel includes a built-in SIR model, but here we show how to model an SEIR disease like Ebola. The E compartment in this disease is an exposed state in which the person is not infectious to others. Following some basic parameters for Ebola in the popular science to date, we model this disease using parameters for \(R_0\), the average durations spent in the exposed and infected phases, and the case fatality rate. This model will use a simplifying assumption that the only deaths in the population are due to Ebola.
Persons infected then either die from infection or recovery at an equal rate, defined by the average time spent in the infected state. Other simplifying assumptions here are that dead persons do not transmit disease (many infections have occurred this way) and that the rate of contacts remains the same over the infected phase (it probably declines, but there is not good data on this).
We use four parameters in the model: R0
is the initial
reproductive number, e.dur
is the duration of the exposed
state, i.dur
is the duration of the infectious state, and
cfr
is the case fatality rate expressed as a proportion of
those who will die among those infected. Given estimates on \(R_0\) and the duration of infection, we
infer a simplistic lambda
by mathematical
rearrangement.
The four differential equations are defined as a function of state sizes and parameters. For the infected state, in-flows are from the exposed stage and out-flows are either to the recovered state (1 - the CFR will recover) or death (CFR will die).
SEIR <- function(t, t0, parms) {
with(as.list(c(t0, parms)), {
# Population size
num <- s.num + e.num + i.num + r.num
# Effective contact rate and FOI from a rearrangement of Beta * c * D
ce <- R0 / i.dur
lambda <- ce * i.num/num
dS <- -lambda*s.num
dE <- lambda*s.num - (1/e.dur)*e.num
dI <- (1/e.dur)*e.num - (1 - cfr)*(1/i.dur)*i.num - cfr*(1/i.dur)*i.num
dR <- (1 - cfr)*(1/i.dur)*i.num
# Compartments and flows are part of the derivative vector
# Other calculations to be output are outside the vector, but within the containing list
list(c(dS, dE, dI, dR,
se.flow = lambda * s.num,
ei.flow = (1/e.dur) * e.num,
ir.flow = (1 - cfr)*(1/i.dur) * i.num,
d.flow = cfr*(1/i.dur)*i.num),
num = num,
i.prev = i.num / num,
ei.prev = (e.num + i.num)/num)
})
}
In the model output, we include the four derivatives, but also several other calculated time-series to analyze later. These are all included as named elements of the list, with compartments and flows specified as derivatives in the named vector and other summary statistics included in the output list.
For model parameters, we specify the following values. Our
sensitivity analysis will vary the CFR, given the potential for
variability of this across health systems. The starting population is
roughly 1 million persons in which 10 have been exposed at the outset.
Note that we specify the initial value for all the flows named in the
model, with a starting value of 0 (i.e., there are no transitions at the
outset of the model simulation). We simulate this model over 500 days,
specifying that our model function is SEIR
as defined
above.
param <- param.dcm(R0 = 1.9, e.dur = 10, i.dur = 14, cfr = c(0.5, 0.7, 0.9))
init <- init.dcm(s.num = 1e6, e.num = 10, i.num = 0, r.num = 0,
se.flow = 0, ei.flow = 0, ir.flow = 0, d.flow = 0)
control <- control.dcm(nsteps = 500, dt = 1, new.mod = SEIR)
mod <- dcm(param, init, control)
The model shows that three models were run, where the CFR was varied. All the output in the model matches our model function specs.
mod
EpiModel Simulation
=======================
Model class: dcm
Simulation Summary
-----------------------
No. runs: 3
No. time steps: 500
Model Parameters
-----------------------
R0 = 1.9
e.dur = 10
i.dur = 14
cfr = 0.5 0.7 0.9
Model Output
-----------------------
Variables: s.num e.num i.num r.num se.flow ei.flow ir.flow
d.flow num i.prev ei.prev
Let’s first ignore the sensitivity analysis and examine the model with the middle CFR of 70%. Here is what the prevalence and incidence are from the model. In the initial epidemic, the prevalence curve is approximated by an exponential growth model. The peak incidence occurs roughly one year into the epidemic.
par(mfrow = c(1, 2))
plot(mod, y = "i.num", run = 2, main = "Prevalence")
plot(mod, y = "se.flow", run = 2, main = "Incidence")