Epidemic final size calculations
are sensitive to input data such as the R0 of the infection. Such
values can often be uncertain in the early stages of an outbreak. This
uncertainty can be included in final size calculations by running
final_size()
for values drawn from a distribution, and
summarising the outcomes.
New to finalsize? It may help to read the “Get started”, “Modelling heterogeneous contacts”, or “Modelling heterogeneous susceptibility” vignettes first!
The infection parameter (R0) is uncertain. We want to know how much variation this could cause in the estimated final size of the epidemic.
This example uses social contact data from the POLYMOD project (Mossong et al.
2008) to estimate the final size of an epidemic in the U.K.
These data are provided with the socialmixr
package.
These data are handled just as in the “Get started” vignette. This example also considers an infection with an R0 of 1.5.
# get UK polymod data from socialmixr
polymod <- socialmixr::polymod
contact_data <- socialmixr::contact_matrix(
polymod,
countries = "United Kingdom",
age.limits = c(0, 5, 18, 40, 65),
symmetric = TRUE
)
# get the contact matrix and demography data
contact_matrix <- t(contact_data$matrix)
demography_vector <- contact_data$demography$population
# scale the contact matrix so the largest eigenvalue is 1.0
contact_matrix <- contact_matrix / max(Re(eigen(contact_matrix)$values))
# divide each row of the contact matrix by the corresponding demography
contact_matrix <- contact_matrix / demography_vector
n_demo_grps <- length(demography_vector)
For simplicity, this example considers a scenario in which susceptibility to infection does not vary.
final_size
over R0 samplesThe basic reproduction number R0 of an infection might
be uncertain in the early stages of an epidemic. This uncertainty can be
modelled by running final_size()
multiple times for the
same contact, demography, and susceptibility data, while sampling R0 values from a
distribution.
This example assumes that the R0 estimate, and the uncertainty around that estimate, is provided as the mean and standard deviation of a normal distribution.
This example considers a normal distribution N(μ = 1.5, σ = 0.1),
for an R0 of 1.5.
We can draw 1,000 R0 samples from this
distribution and run final_size()
on the contact data and
demography data for each sample.
This is quick, as finalsize
is an Rcpp package with a
C++ backend.
# run final size on each sample with the same data
final_size_data <- Map(
r0_samples, seq_along(r0_samples),
f = function(r0, i) {
# the i-th final size estimate
fs <- final_size(
r0 = r0,
contact_matrix = contact_matrix,
demography_vector = demography_vector,
susceptibility = susc_uniform,
p_susceptibility = p_susc_uniform
)
fs$replicate <- i
fs$r0_estimate <- r0
fs
}
)
# combine data
final_size_data <- Reduce(x = final_size_data, f = rbind)
# order age groups
final_size_data$demo_grp <- factor(
final_size_data$demo_grp,
levels = contact_data$demography$age.group
)
# examine some replicates in the data
head(final_size_data, 15)
#> demo_grp susc_grp susceptibility group_size p_infected n_infected
#> 1 [0,5) susc_grp_1 1 3453670 0.3429556 1184455
#> 2 [5,18) susc_grp_1 1 9761554 0.6064155 5919557
#> 3 [18,40) susc_grp_1 1 18110368 0.4524943 8194838
#> 4 [40,65) susc_grp_1 1 19288101 0.3839559 7405780
#> 5 65+ susc_grp_1 1 9673058 0.2443132 2363256
#> 6 [0,5) susc_grp_1 1 3453670 0.4764501 1645501
#> 7 [5,18) susc_grp_1 1 9761554 0.7483620 7305176
#> 8 [18,40) susc_grp_1 1 18110368 0.6050303 10957322
#> 9 [40,65) susc_grp_1 1 19288101 0.5290010 10203424
#> 10 65+ susc_grp_1 1 9673058 0.3561888 3445435
#> 11 [0,5) susc_grp_1 1 3453670 0.4267289 1473781
#> 12 [5,18) susc_grp_1 1 9761554 0.6999138 6832247
#> 13 [18,40) susc_grp_1 1 18110368 0.5500188 9961043
#> 14 [40,65) susc_grp_1 1 19288101 0.4755078 9171643
#> 15 65+ susc_grp_1 1 9673058 0.3132654 3030235
#> replicate r0_estimate
#> 1 1 1.387710
#> 2 1 1.387710
#> 3 1 1.387710
#> 4 1 1.387710
#> 5 1 1.387710
#> 6 2 1.606263
#> 7 2 1.606263
#> 8 2 1.606263
#> 9 2 1.606263
#> 10 2 1.606263
#> 11 3 1.517773
#> 12 3 1.517773
#> 13 3 1.517773
#> 14 3 1.517773
#> 15 3 1.517773
library(tibble)
library(dplyr)
library(tidyr)
library(purrr)
library(forcats)
final_size_data <-
# create a dataframe with values from a vector
tibble(r0 = r0_samples) %>%
rownames_to_column() %>%
# map the function final_size() to all the r0 values
# with the same set of arguments
# with {purrr}
mutate(
temp = map(
.x = r0,
.f = final_size,
contact_matrix = contact_matrix,
demography_vector = demography_vector,
susceptibility = susc_uniform,
p_susceptibility = p_susc_uniform
)
) %>%
# unnest all the dataframe outputs in temp
unnest(temp) %>%
# relevel the factor variable
mutate(
demo_grp = fct_relevel(
demo_grp,
contact_data %>%
pluck("demography") %>%
pluck("age.group")
)
)
head(final_size_data, 15)
#> # A tibble: 15 × 8
#> rowname r0 demo_grp susc_grp susceptibility group_size p_infected
#> <chr> <dbl> <fct> <chr> <dbl> <dbl> <dbl>
#> 1 1 1.39 [0,5) susc_grp_1 1 3453670 0.343
#> 2 1 1.39 [5,18) susc_grp_1 1 9761554 0.606
#> 3 1 1.39 [18,40) susc_grp_1 1 18110368 0.452
#> 4 1 1.39 [40,65) susc_grp_1 1 19288101 0.384
#> 5 1 1.39 65+ susc_grp_1 1 9673058 0.244
#> 6 2 1.61 [0,5) susc_grp_1 1 3453670 0.476
#> 7 2 1.61 [5,18) susc_grp_1 1 9761554 0.748
#> 8 2 1.61 [18,40) susc_grp_1 1 18110368 0.605
#> 9 2 1.61 [40,65) susc_grp_1 1 19288101 0.529
#> 10 2 1.61 65+ susc_grp_1 1 9673058 0.356
#> 11 3 1.52 [0,5) susc_grp_1 1 3453670 0.427
#> 12 3 1.52 [5,18) susc_grp_1 1 9761554 0.700
#> 13 3 1.52 [18,40) susc_grp_1 1 18110368 0.550
#> 14 3 1.52 [40,65) susc_grp_1 1 19288101 0.476
#> 15 3 1.52 65+ susc_grp_1 1 9673058 0.313
#> # ℹ 1 more variable: n_infected <dbl>
ggplot(final_size_data) +
stat_summary(
aes(
demo_grp, p_infected
),
fun = mean,
fun.min = function(x) {
quantile(x, 0.05)
},
fun.max = function(x) {
quantile(x, 0.95)
}
) +
scale_y_continuous(
labels = scales::percent,
limits = c(0.25, 1)
) +
theme_classic() +
theme(
legend.position = "top",
legend.key.height = unit(2, "mm"),
legend.title = ggtext::element_markdown(
vjust = 1
)
) +
coord_cartesian(
expand = TRUE
) +
labs(
x = "Age group",
y = "% Infected"
)