| Title: | Understand Individual-Level Variation in Infectious Disease Transmission |
|---|---|
| Description: | Estimate and understand individual-level variation in transmission. Implements density and cumulative compound Poisson discrete distribution functions (Kremer et al. (2021) <doi:10.1038/s41598-021-93578-x>), as well as functions to calculate infectious disease outbreak statistics given epidemiological parameters on individual-level transmission; including the probability of an outbreak becoming an epidemic/extinct (Kucharski et al. (2020) <doi:10.1016/S1473-3099(20)30144-4>), or the cluster size statistics, e.g. what proportion of cases cause X\% of transmission (Lloyd-Smith et al. (2005) <doi:10.1038/nature04153>). |
| Authors: | Joshua W. Lambert [aut, cre, cph] (ORCID: <https://orcid.org/0000-0001-5218-3046>), Adam Kucharski [aut, cph] (ORCID: <https://orcid.org/0000-0001-8814-9421>), Dillon C. Adam [aut] (ORCID: <https://orcid.org/0000-0002-7485-9905>), Sebastian Funk [ctb, cph] (ORCID: <https://orcid.org/0000-0002-2842-3406>, .chain_sim uses code from bpmodels::chain_sim), Pratik Gupte [rev] (ORCID: <https://orcid.org/0000-0001-5294-7819>), Hugo Gruson [rev] (ORCID: <https://orcid.org/0000-0002-4094-1476>), James M. Azam [rev, ctb] (ORCID: <https://orcid.org/0000-0001-5782-7330>), Chris Hartgerink [rev] (ORCID: <https://orcid.org/0000-0003-1050-6809>), London School of Hygiene and Tropical Medicine, LSHTM [cph] (ROR: <https://ror.org/00a0jsq62>) |
| Maintainer: | Joshua W. Lambert <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.4.0.9000 |
| Built: | 2026-05-13 09:25:59 UTC |
| Source: | https://github.com/epiverse-trace/superspreading |
) for a (heterogeneous)
networkThe calculation of the reproduction number adjusting for heterogeneity in number of contacts.
calc_network_R( mean_num_contact, sd_num_contact, infect_duration, prob_transmission, age_range )calc_network_R( mean_num_contact, sd_num_contact, infect_duration, prob_transmission, age_range )
mean_num_contact |
A |
sd_num_contact |
A |
infect_duration |
A |
prob_transmission |
A |
age_range |
A |
A named numeric vector of length 2, the unadjusted (R)
and network adjusted (R_net) estimates of .
# example using NATSAL data calc_network_R( mean_num_contact = 14.1, sd_num_contact = 69.6, infect_duration = 1, prob_transmission = 1, age_range = c(16, 74) )# example using NATSAL data calc_network_R( mean_num_contact = 14.1, sd_num_contact = 69.6, infect_duration = 1, prob_transmission = 1, age_range = c(16, 74) )
FINITE_INF is a large finite number used to approximate Inf.
NSIM is the number of simulations run when generating random samples or
branching process simulation replicates.
FINITE_INF NSIMFINITE_INF NSIM
An object of class numeric of length 1.
An object of class numeric of length 1.
Density of the Poisson-lognormal compound distribution
dpoislnorm(x, meanlog, sdlog)dpoislnorm(x, meanlog, sdlog)
x |
A |
meanlog |
A |
sdlog |
A |
The function is vectorised so a vector of quantiles can be input and the output will have an equal length.
A numeric vector of the density of the Poisson-lognormal
distribution.
dpoislnorm(x = 10, meanlog = 1, sdlog = 2) dpoislnorm(x = 1:10, meanlog = 1, sdlog = 2)dpoislnorm(x = 10, meanlog = 1, sdlog = 2) dpoislnorm(x = 1:10, meanlog = 1, sdlog = 2)
Density of the Poisson-Weibull compound distribution
dpoisweibull(x, shape, scale)dpoisweibull(x, shape, scale)
x |
A |
shape |
A |
scale |
A |
The function is vectorised so a vector of quantiles can be input and the output will have an equal length.
A numeric vector of the density of the Poisson-Weibull
distribution.
dpoisweibull(x = 10, shape = 1, scale = 2) dpoisweibull(x = 1:10, shape = 1, scale = 2)dpoisweibull(x = 10, shape = 1, scale = 2) dpoisweibull(x = 1:10, shape = 1, scale = 2)
This is a helper function for creating a model comparison <data.frame>
primarily for use in the superspreading vignettes. It is designed
specifically for handling fitdistrplus::fitdist() output and not a
generalised function. See bbmle::ICtab() for a more general use function
to create information criteria tables.
ic_tbl(..., sort_by = c("AIC", "BIC", "none"))ic_tbl(..., sort_by = c("AIC", "BIC", "none"))
... |
dots One or more model fit results from
|
sort_by |
A |
A <data.frame>.
if (requireNamespace("fitdistrplus", quietly = TRUE)) { cases <- rnbinom(n = 100, mu = 5, size = 0.7) pois_fit <- fitdistrplus::fitdist(data = cases, distr = "pois") geom_fit <- fitdistrplus::fitdist(data = cases, distr = "geom") nbinom_fit <- fitdistrplus::fitdist(data = cases, distr = "nbinom") ic_tbl(pois_fit, geom_fit, nbinom_fit) }if (requireNamespace("fitdistrplus", quietly = TRUE)) { cases <- rnbinom(n = 100, mu = 5, size = 0.7) pois_fit <- fitdistrplus::fitdist(data = cases, distr = "pois") geom_fit <- fitdistrplus::fitdist(data = cases, distr = "geom") nbinom_fit <- fitdistrplus::fitdist(data = cases, distr = "nbinom") ic_tbl(pois_fit, geom_fit, nbinom_fit) }
Cumulative distribution function of the Poisson-lognormal compound distribution
ppoislnorm(q, meanlog, sdlog)ppoislnorm(q, meanlog, sdlog)
q |
A |
meanlog |
A |
sdlog |
A |
The function is vectorised so a vector of quantiles can be input and the output will have an equal length.
A numeric vector of the distribution function.
ppoislnorm(q = 10, meanlog = 1, sdlog = 2) ppoislnorm(q = 1:10, meanlog = 1, sdlog = 2)ppoislnorm(q = 10, meanlog = 1, sdlog = 2) ppoislnorm(q = 1:10, meanlog = 1, sdlog = 2)
Cumulative distribution function of the Poisson-Weibull compound distribution
ppoisweibull(q, shape, scale)ppoisweibull(q, shape, scale)
q |
A |
shape |
A |
scale |
A |
The function is vectorised so a vector of quantiles can be input and the output will have an equal length.
A numeric vector of the distribution function.
ppoisweibull(q = 10, shape = 1, scale = 2) ppoisweibull(q = 1:10, shape = 1, scale = 2)ppoisweibull(q = 10, shape = 1, scale = 2) ppoisweibull(q = 1:10, shape = 1, scale = 2)
Outbreak containment is defined as outbreak extinction when
simulate = FALSE. When simulate = FALSE, probability_contain() is
equivalent to calling probability_extinct().
When simulate = TRUE, outbreak containment is defined by the
case_threshold (default = 100) and outbreak_time arguments.
Firstly, case_threshold sets the size of the transmission chain below
which the outbreak is considered contained. Secondly, outbreak_time sets
the time duration from the start of the outbreak within which the outbreak
is contained if there is no more onwards transmission beyond this time.
When setting an outbreak_time, a generation_time is also required.
case_threshold and outbreak_time can be jointly set.
Overall, when simulate = TRUE, containment is defined as the size and
time duration of a transmission chain not reaching the case_threshold
and outbreak_time, respectively.
probability_contain( R, k, num_init_infect, ind_control = 0, pop_control = 0, simulate = FALSE, ..., case_threshold = 100, outbreak_time = Inf, generation_time = NULL, offspring_dist )probability_contain( R, k, num_init_infect, ind_control = 0, pop_control = 0, simulate = FALSE, ..., case_threshold = 100, outbreak_time = Inf, generation_time = NULL, offspring_dist )
R |
A |
k |
A |
num_init_infect |
An |
ind_control |
A |
pop_control |
A |
simulate |
A |
... |
< |
case_threshold |
A number for the threshold of the number of cases below
which the epidemic is considered contained. |
outbreak_time |
A number for the time since the start of
the outbreak to determine if outbreaks are contained within a given period
of time. |
generation_time |
A |
offspring_dist |
An |
When using simulate = TRUE, the default arguments to simulate the
transmission chains with .chain_sim() are 105 replicates,
a negative binomial (nbinom) offspring distribution, parameterised with
R (and pop_control if > 0) and k.
When setting the outbreak_time argument, the generation_time argument is
also required. The generation_time argument requires a random number
generator function. For example, if we assume the generation time is
lognormally distributed with meanlog = 1 and sdlog = 1.5, then we can
define the function to pass to generation_time as:
function(x) rlnorm(x, meanlog = 1, sdlog = 1.5)
A number for the probability of containment.
Lloyd-Smith, J. O., Schreiber, S. J., Kopp, P. E., & Getz, W. M. (2005) Superspreading and the effect of individual variation on disease emergence. Nature, 438(7066), 355-359. doi:10.1038/nature04153
# population-level control measures probability_contain(R = 1.5, k = 0.5, num_init_infect = 1, pop_control = 0.1) # individual-level control measures probability_contain(R = 1.5, k = 0.5, num_init_infect = 1, ind_control = 0.1) # both levels of control measures probability_contain( R = 1.5, k = 0.5, num_init_infect = 1, ind_control = 0.1, pop_control = 0.1 ) # multi initial infections with population-level control measures probability_contain(R = 1.5, k = 0.5, num_init_infect = 5, pop_control = 0.1) # probability of containment within a certain amount of time # this requires parameterising a generation time gt <- function(n) { rlnorm(n, meanlog = 1, sdlog = 1.5) } probability_contain( R = 1.2, k = 0.5, num_init_infect = 1, simulate = TRUE, case_threshold = 50, outbreak_time = 20, generation_time = gt )# population-level control measures probability_contain(R = 1.5, k = 0.5, num_init_infect = 1, pop_control = 0.1) # individual-level control measures probability_contain(R = 1.5, k = 0.5, num_init_infect = 1, ind_control = 0.1) # both levels of control measures probability_contain( R = 1.5, k = 0.5, num_init_infect = 1, ind_control = 0.1, pop_control = 0.1 ) # multi initial infections with population-level control measures probability_contain(R = 1.5, k = 0.5, num_init_infect = 5, pop_control = 0.1) # probability of containment within a certain amount of time # this requires parameterising a generation time gt <- function(n) { rlnorm(n, meanlog = 1, sdlog = 1.5) } probability_contain( R = 1.2, k = 0.5, num_init_infect = 1, simulate = TRUE, case_threshold = 50, outbreak_time = 20, generation_time = gt )
) based on R and initial casesThe method for the evolution of pathogen emergence described in
Antia et al. (2003) (doi:10.1038/nature02104). The model is a multi-type
branching process model with an initial (wild-type) reproduction number,
usually below 1, and a evolved reproduction number that is
greater than 1, and thus can cause a sustained human-to-human epidemic.
The reproduction number for a pathogen changes at the mutation_rate.
probability_emergence( R_wild, R_mutant, mutation_rate, num_init_infect, tol = 1e-10, max_iter = 1000 )probability_emergence( R_wild, R_mutant, mutation_rate, num_init_infect, tol = 1e-10, max_iter = 1000 )
R_wild |
A |
R_mutant |
A |
mutation_rate |
A |
num_init_infect |
An |
tol |
A |
max_iter |
A |
Following Antia et al. (2003), we assume that the mutation rate for all variants is the same.
A value with the probability of a disease emerging and causing an outbreak.
Antia, R., Regoes, R., Koella, J. & Bergstrom, C. T. (2003) The role of evolution in the emergence of infectious diseases. Nature 426, 658–661. doi:10.1038/nature02104
probability_epidemic(), probability_extinct()
probability_emergence( R_wild = 0.5, R_mutant = 1.5, mutation_rate = 0.5, num_init_infect = 1 )probability_emergence( R_wild = 0.5, R_mutant = 1.5, mutation_rate = 0.5, num_init_infect = 1 )
Calculates the probability a branching process will cause an epidemic (i.e. probability will fail to go extinct) based on R, k and initial cases.
probability_epidemic( R, k, num_init_infect, ind_control = 0, pop_control = 0, ..., offspring_dist )probability_epidemic( R, k, num_init_infect, ind_control = 0, pop_control = 0, ..., offspring_dist )
R |
A |
k |
A |
num_init_infect |
An |
ind_control |
A |
pop_control |
A |
... |
< |
offspring_dist |
An |
A value with the probability of a large epidemic.
Lloyd-Smith, J. O., Schreiber, S. J., Kopp, P. E., & Getz, W. M. (2005) Superspreading and the effect of individual variation on disease emergence. Nature, 438(7066), 355-359. doi:10.1038/nature04153
Kucharski, A. J., Russell, T. W., Diamond, C., Liu, Y., Edmunds, J., Funk, S. & Eggo, R. M. (2020) Early dynamics of transmission and control of COVID-19: a mathematical modelling study. The Lancet Infectious Diseases, 20(5), 553-558. doi:10.1016/S1473-3099(20)30144-4
probability_epidemic(R = 1.5, k = 0.1, num_init_infect = 10)probability_epidemic(R = 1.5, k = 0.1, num_init_infect = 10)
Calculates the probability a branching process will not causes an epidemic
and will go extinct. This is the complement of the probability of a disease
causing an epidemic (probability_epidemic()).
probability_extinct( R, k, num_init_infect, ind_control = 0, pop_control = 0, ..., offspring_dist )probability_extinct( R, k, num_init_infect, ind_control = 0, pop_control = 0, ..., offspring_dist )
R |
A |
k |
A |
num_init_infect |
An |
ind_control |
A |
pop_control |
A |
... |
< |
offspring_dist |
An |
A value with the probability of going extinct.
Lloyd-Smith, J. O., Schreiber, S. J., Kopp, P. E., & Getz, W. M. (2005) Superspreading and the effect of individual variation on disease emergence. Nature, 438(7066), 355-359. doi:10.1038/nature04153
probability_extinct(R = 1.5, k = 0.1, num_init_infect = 10)probability_extinct(R = 1.5, k = 0.1, num_init_infect = 10)
Calculates the proportion of new cases that originated with a transmission event of a given size. It can be useful to inform backwards contact tracing efforts, i.e. how many cases are associated with large clusters. Here we define a cluster to as a transmission of a primary case to at least one secondary case.
proportion_cluster_size( R, k, cluster_size, ..., offspring_dist, format_prop = TRUE )proportion_cluster_size( R, k, cluster_size, ..., offspring_dist, format_prop = TRUE )
R |
A |
k |
A |
cluster_size |
A |
... |
dots not used, extra arguments supplied will cause a warning. |
offspring_dist |
An |
format_prop |
A |
This function calculates the proportion of secondary cases that are caused by transmission events of a certain size. It does not calculate the proportion of transmission events that cause a cluster of secondary cases of a certain size. In other words it is the number of cases above a threshold divided by the total number of cases, not the number of transmission events above a certain threshold divided by the number of transmission events.
A <data.frame> with the value for the proportion of new cases
that are part of a transmission event above a threshold for a given value
of R and k.
R <- 2 k <- 0.1 cluster_size <- 10 proportion_cluster_size(R = R, k = k, cluster_size = cluster_size) # example with a vector of k k <- c(0.1, 0.2, 0.3, 0.4, 0.5) proportion_cluster_size(R = R, k = k, cluster_size = cluster_size) # example with a vector of cluster sizes cluster_size <- c(5, 10, 25) proportion_cluster_size(R = R, k = k, cluster_size = cluster_size)R <- 2 k <- 0.1 cluster_size <- 10 proportion_cluster_size(R = R, k = k, cluster_size = cluster_size) # example with a vector of k k <- c(0.1, 0.2, 0.3, 0.4, 0.5) proportion_cluster_size(R = R, k = k, cluster_size = cluster_size) # example with a vector of cluster sizes cluster_size <- c(5, 10, 25) proportion_cluster_size(R = R, k = k, cluster_size = cluster_size)
Calculates the proportion of cases that cause a certain percentage of transmission.
It is commonly estimated what proportion of cases cause 80% of transmission
(i.e. secondary cases).
This can be calculated using proportion_transmission() at varying values of
and for different values of percentage transmission.
There are two methods for calculating the proportion of transmission,
(default) and , see method argument or details for
more information.
proportion_transmission( R, k, prop_transmission, method = c("p_80", "t_20"), simulate = FALSE, ..., offspring_dist, format_prop = TRUE )proportion_transmission( R, k, prop_transmission, method = c("p_80", "t_20"), simulate = FALSE, ..., offspring_dist, format_prop = TRUE )
R |
A |
k |
A |
prop_transmission |
A |
method |
A |
simulate |
A |
... |
dots not used, extra arguments supplied will cause a warning. |
offspring_dist |
An |
format_prop |
A |
Calculates the expected proportion of transmission from a given
proportion of infectious cases. There are two methods to calculate this with
distinct formulations, and these can be specified
by the method argument.
method = p_80 calculates relative transmission heterogeneity
from the offspring distribution of secondary cases, , where the upper
proportion of the distribution comprise of total number of cases
given R0 and k, where is typically defined as 0.8 or 80%. e.g. 80%
of all transmissions are generated by the upper 20% of cases, or
p_80 = 0.2, per the 80/20 rule. In this formulation, changes in R can
have a significant effect on the estimate of even when
is constant. Importantly, this formulation does not allow for true
homogeneity when k = Inf i.e. .
method = t_20 calculates a similar ratio, instead in terms of
the theoretical individual reproductive number and infectiousness given
and . The individual reproductive number, , is
described in Lloyd-Smith et al. (2005), "as a random variable representing
the expected number of secondary cases caused by a particular infected
individual. Values for are drawn from a continuous gamma
probability distribution with population mean and dispersion
parameter , which encodes all variation in infectious histories of
individuals, including properties of the host and pathogen and environmental
circumstances." The value of corresponds to the shape parameters of
the gamma distribution which encodes the variation in the gamma-Poisson
mixture a.k.a. the negative binomial.
For method = t_20, we define the upper proportion of infectiousness,
which is typically 0.2 i.e. the upper 20% most infectious
cases, again per the 80/20 rule. e.g. the most infectious 20% of cases,
are expected to produce 80% of all infections, or t_20 = 0.8. Unlike
method = p_80, changes in R have no effect on the estimate
of t_80 when k is constant, but R is still required for the underlying
calculation. This formulation does allow for true homogeneity when
k = Inf i.e. t_20 = 0.2, or t_80 = 0.8.
Multiple values of R and k can be supplied and a <data.frame> of
every combination of these will be returned.
The numerical calculation for method = p_80 uses random number generation
to simulate secondary contacts so the answers may minimally vary between
calls. The number of simulation replicates is fixed to 105.
A <data.frame> with the value for the proportion of cases for a
given value of R and k.
The analytical calculation is from:
Endo, A., Abbott, S., Kucharski, A. J., & Funk, S. (2020) Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China. Wellcome Open Research, 5. doi:10.12688/wellcomeopenres.15842.3
The method follows the formula defined in section 2.2.5 of the
supplementary material for:
Lloyd-Smith, J. O., Schreiber, S. J., Kopp, P. E., & Getz, W. M. (2005) Superspreading and the effect of individual variation on disease emergence. Nature. Nov;438(7066):355–9. doi:10.1038/nature04153
The original code for the method is from ongoing work
originating from https://github.com/dcadam/kt and:
Adam, D., Gostic, K., Tsang, T., Wu, P., Lim, W. W., Yeung, A., Wong, J., Lau, E., Du, Z., Chen, D., Ho, L.-M., Martín-Sánchez, M., Cauchemez, S., Cobey, S., Leung, G., & Cowling, B. (2022) Time-varying transmission heterogeneity of SARS and COVID-19 in Hong Kong. doi:10.21203/rs.3.rs-1407962/v1
# example of single values of R and k prop_transmission <- 0.8 # 80% of transmission R <- 2 k <- 0.5 proportion_transmission( R = R, k = k, prop_transmission = prop_transmission ) # example with multiple values of k k <- c(0.1, 0.2, 0.3, 0.4, 0.5, 1) proportion_transmission( R = R, k = k, prop_transmission = prop_transmission ) # example with vectors of R and k R <- c(1, 2, 3) proportion_transmission( R = R, k = k, prop_transmission = prop_transmission )# example of single values of R and k prop_transmission <- 0.8 # 80% of transmission R <- 2 k <- 0.5 proportion_transmission( R = R, k = k, prop_transmission = prop_transmission ) # example with multiple values of k k <- c(0.1, 0.2, 0.3, 0.4, 0.5, 1) proportion_transmission( R = R, k = k, prop_transmission = prop_transmission ) # example with vectors of R and k R <- c(1, 2, 3) proportion_transmission( R = R, k = k, prop_transmission = prop_transmission )