9. Discrete time Kermack-McKendrick Model#
Why does the SIR assume an exponential distribution for the infectious period?#
The majority of work with compartmental models assumes that the amound of time an individual spends in a disease state is exponentially distributed. For example, for the SIR model
after an individual becomes infectious (moves from S to I) they spend on average \(1/\gamma\) time units in this infectious state. However, we may wonder what the density of infectious periods (time spend in the I state) is. The SIR model implicitly assumes that the density of infectious periods is exponential. To see this, for a small increment of time \(\tau\), we assume an individual leaves the infectious state with probability \(\gamma \tau\). This means that an individual remains infectious for this small period of time \(\tau\) with probability \(1-\gamma \tau\). Then the probability that an individual is still infectious in the interval \(t + \tau\) is the probability that they remain infectious in the interval \([0,\tau],[\tau,2\tau],[2\tau,3\tau],\cdots,[t-\tau ,t]\) equals \(p(\text{still infectious}) = (1-\gamma \tau)^{t/\tau}\) and so the probability that they are no longer infectious in the interval \([t,t+\tau]\) is
For continuous time, when \(\tau\) moves towards infinity, from calculus we know that
we also know that in continuous time, as we shrink this interval \(\tau\), that probabilities turn into probability densities. In other words
where \(f\) is the probability density that an individual is no longer infectious in an interval from \(t\) to \(t+\tau\). Then as \(\tau\) shrinks we find that
The density above \(f(t)\) is an exponential distribution with paramter \(\gamma\).
The derivative of the above is
This is the key. We assume in the SIR model that infectious individuals leave the infectious state like
This can only mean that the infectious period is assumed to have an exponential density.
The discrete-time Kermack McKendrick model#
If an infectious diseases did have an exponential period then it would mean that the probability you remain infectious decreases tremendously with time and that there is no ‘typical’ time in which an individual is infecitous. For most infectious diseases, an exponential distributed infectious period is unrealistic because there is a typical amount of time an individual remains infectious.
Our goal will be to study the discrete-time Kermack McKendrick model as a means of incorporating any type of infectious period. We will follow closely the paper: The discrete-time Kermack–McKendrick model: A versatile and computationally attractive framework for modeling epidemics.
For the typical SIR, we assume that the number of susceptibles is depleted like
We can think of \(\beta \frac{I(t)}{N}\) as a function of the number of infectors in the system. This function is often called the force of infection and, for the SIR model, is defined as the function \(\Lambda(t)\) of \(t\)
We can substitute in the above function to obtain
However the function that we chose, \(\Lambda(t) = \beta \frac{I(t)}{N}\) , is just one of an infinite number of force of infection functions that we can use to describe the loss of susceptibles to infections. For now, lets continue with the general function \(\Lambda(t)\). Suppose we wish to solve the above differential equation over a small time interval from \(t\) to \(t+\tau\). The fundemental theorem of calculus tell us to integrate
Lets simplify the above equation by defining the cumulative force of infection
Then our final equation above simplifies to
Note: that the number of susceptibles can only stay the same or decrease because the expression \(\exp \left[- \hat{\Lambda} (t)\right]\) is at most the value one. That is, for this formulation, we do not consider individuals moving back to the susceptible disease state.
Though we can use any function \(\hat{\Lambda} (t)\), a common function supposes that previously infected individuals can infect others in the system. This is the defining characteristic of an infectious diseases—that the risk of disease is related to the number of infections in the system.
The original 1927 paper that specifies the evolution of an infectious disease chose as a function
where we say that \(l\) is the lag or a number that describes the number of time units in the past (relative to \(t\)). This proposed function has two components. The function \(\mathcal{A}(l)\) needs to be set by the modeler as is called the contribution to the force of infection. More on this later. The term \(\Lambda(t-l) S(t-l)\) is the change in the number of susceptibles \(l\) time units ago. For the Kermack-McKendrick model, all the susceptibles move to the infected state. That is, the change in the number of susceptibles is the number of incident infections at time \(t-l\).
Define incident infections at time \(t\). Then
The contribution to the force of infection should be interpreted as the probability that an infected individual makes effective contact, contact resulting in infection, when they come into contact with a susceptible.
An equation for the basic reproduction number#
Suppose that we introduce a single infected individual at time zero into a population where the remaining \(N-1 \approx N\) individuals are susceptible. Then at time zero this incident, infected individual is expected to contact all \(N\) susceptibles and infect on average \(N \mathcal{A(0)}\) of them. Lets continue to follow the path of this incident infector. At time one, we would expect that they infect \(N \mathcal{A(1)}\) susceptibles (assuming the population is approximately still \(N\)). At time two, three, and four we would expect this infector to produce \(N \mathcal{A(2)}\),\(N \mathcal{A(3)}\),\(N \mathcal{A(4)}\) infections.
That is, the expected number of infections generated from a single infected individual in an entirely susceptible population is
Discretization to make things a bit easier#
We typically do not, and cannot, observe outbreaks on a continuous timescale. Public health resources are limited and so we typically observe incident, reported infections at the day, week, month, or even only the annual time scale. Because of this, it makes more sense to look at a discretized version of the above. That is, to replace integrals with summations.
Our discretized KM model is
where we introduce the parameter \(C\) as a limit to how far back we “look” when assuming infectors in the past are generating new infections. That is, we assume an individual who was infected \(C\) time units ago is no longer producing new infections.
The generation interval#
Let \(\sum_{l=0}^{\infty} \mathcal{A}(l)\) be the total number of average infections (the basic reproduction number) for a single infector. Then \(g(l) = A(l) / \sum_{l=0}^{\infty} \mathcal{A}(l)\) is the probability that an individual was infected \(l\) time units ago. The time between an infector contacting a susceptible and when that susceptible becomes infectious is called the generation interval.
To see this, suppose
or we expect a single infector to produce a total of 2.08 infections. Lets normalize the above \(\mathcal{A}\). Then we find
and say that there is a (for example) 0.48 probability that 2 time units ago a susceptible individual made effective contact with an infector and is now infectious.
The above argument suggests a relationship between the basic reproduction number, contribution to force of infection, and the generation interval.
Coding up the discrete time KM model#
def g(t):
if t==0:
return 0.24
elif t==1:
return 0.12
elif t==2:
return 0.48
elif t==3:
return 0.16
else:
return 0
def A(R0,g,t):
return R0*g(t)
start,end = 0,100
S0 = 1000
N = 1001
R0 = 1.50/N
S = [S0]
i = [0,0,0,1]
for t in np.arange(start+3,end):
#--compute lambda
L = 0
for l in np.arange(0,4):
L = L+A(R0,g,l)*i[t-l]
St = S[-1]*(np.exp(-L))
i.append( S[-1]-St )
S.append( St )
i_g1 = i
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[1], line 26
23 S = [S0]
24 i = [0,0,0,1]
---> 26 for t in np.arange(start+3,end):
27 #--compute lambda
28 L = 0
29 for l in np.arange(0,4):
NameError: name 'np' is not defined
plt.plot(i_g1)
[<matplotlib.lines.Line2D at 0x140aeb0d0>]

def g(t):
if t==0:
return 0.48
elif t==1:
return 0.24
elif t==2:
return 0.12
elif t==3:
return 0.16
else:
return 0
def A(R0,g,t):
return R0*g(t)
start,end = 0,100
S0 = 1000
N = 1001
R0 = 1.50/N
S = [S0]
i = [0,0,0,1]
for t in np.arange(start+3,end):
#--compute lambda
L = 0
for l in np.arange(0,4):
L = L+A(R0,g,l)*i[t-l]
St = S[-1]*(np.exp(-L))
i.append( S[-1]-St )
S.append( St )
i_g2 = i
plt.plot(i_g1,color="blue")
plt.plot(i_g2,color="red")
plt.show()

Homework#
A manuscript in 2009 estimated the generation interval for influenza to be on average 3.6 days with a standard deviation of 1.6 days. Lets assume that the generation interval (a probability mass function) takes the form of a Poisson distribution.
Define a function g that takes as input a lag \(t\) and parameter \(p\) and returns the the generation interval value at that lag computed as Pois(l;\(p\)) where Pois(l;\(p\)) is the Poisson distribution evaluated at l with parameter value \(p\).
A typical reproduction number for influenza is 1.5. Use the generation interval above with parameter \(3.6\) and plot the number of incident infection over time from 0 to 100 time units.
Change the generation interval parameter from \(3.6\) to \(4.6\) and overlay the number of incident infections for (2) as well as for this new parameter value.
Discuss the implications of estimating the generation interval precisely.
For the Reed-Frost model, we found that the basic reproduction number was \(Np\). Please use the relationship between the reproduction number and contribution to force of infection to identify the contribution to the force of infection for the Reed-frost model.
Lets assume a simple model for how infections propogate, and using this model, understand the relationship between how fast an infecitous disease grows and the reproduction number.
Lets assume that when an individual is infected that they become infectious after \(T_{1}\) time units and stay infectius until \(T_{2}\) time units. For example, we might assume that a susceptible individual who comes into contact with an infector becomes infected 3 days later and stays infected until 8 days after that initial contact (the infectious period is 8-3= 5 days).
Further lets assume that the probability an infector passes the pathogen to a susceptible is \(p\) and that each individual comes into contact with \(c\) individuals. Then at the beginning of an outbreak (everyone they contact is susceptible) the basic reproduction number is
Then we can assume that the number of new infectors \(i(t)\), early in the spread of disease, follows
This can be interpreted as follows: At time \(t\) we will look back at the number of infectors who were infected: 1 day, knowing that they will produce \(pc\) infections each for a total of \(pc \cdot i(t-1)\) infections; 2 days, knowing that each of the \(i(t-2)\) infecors will produce \(pc\) infections for a total of \(pc \cdot i(t-2)\) and so on.
In the beginning of an outbreak, its safe to assume that the growth in the number of infections follows
for some parameter \(r\)—called the growth rate (or Malthusian parameter).
Lets substitute this assumed exponential growth rate into the above number of incident infection over time.
The above equation is dense, but can tell us a few interesting things. To be clear, the above equation says that if we assume an exponential growth in infections then the growth rate \(r\) is that value such that \(\int_{\tau=T_{1}}^{\tau=T_{2}} pc \times e^{-r\tau} d\tau\) is equal to one.
Lets consider the right hand side a function \(f(r)\) of the grow rate \(r\)—how fast infections are spreading.
where we defined the function
It may be easier to multiply (and divide) by \((T_{2}-T_{1})\) to specify this function in terms of the reproduction number.
To estimate r, the growth rate, we know that given the parameters R_{0}, T_{2}, T_{1} the growth rate r is that value such that f(r)=1 We also know that f(0) = \(R_{0}\).
Question 3a: Please show that \(f(0) = R_{0}\)
Question 3b: Please plot \(f\) values for growth rates from -1 to 1 for the following parameters:
(i) \(R_{0}=0.45; T_{1}=1; T_{2}=5\)
(ii) \(R_{0}=1; T_{1}=1; T_{2}=5\)
(iii) \(R_{0}=2; T_{1}=1; T_{2}=5\)
(iii) \(R_{0}=3; T_{1}=1; T_{2}=5\)
Set the xlimits of your plot to (-1,1) and the ylimits (0,5).
Add a horizontal line at the value 1 and a vertical line at 0.
For the above, what is the relationship between \(R_{0}\) and \(r\)?
Hint: Use the quad
function from scipy
to help you compute that integral.
Question 3c: Please plot \(f\) values for growth rates from -1 to 1 for the following parameters:
(i) \(R_{0}=2; T_{1}=1; T_{2}=5\)
(ii) \(R_{0}=2; T_{1}=1; T_{2}=10\)
(iii) \(R_{0}=2; T_{1}=1; T_{2}=50\)
Set the xlimits of your plot to (-1,1) and the ylimits (0,5).
Add a horizontal line at the value 1 and a vertical line at 0.
For the above, what is the relationship between \(R_{0}\) and the infectious period ?
Hint: Use the quad
function from scipy
to help you compute that integral.