If you are unfamiliar with the {simulist} package or the
sim_linelist()
function Get Started
vignette is a great place to start.
This vignette will describe how to simulate a line list with age-stratified (or age-dependent) hospitalisation and death risks.
There are three risks in {simulist}, specifically in the
sim_linelist()
and sim_outbreak()
functions,
that relate to hospitalisations and deaths:
hosp_risk
)hosp_death_risk
)non_hosp_death_risk
)The default for sim_linelist()
is a 0.2
(or
20%) hospitalisation risk for individuals infected, 0.5
(or
50%) death risk for hospitalised individuals, and 0.05
(or
5%) death risk for infected individuals outside of hospitals. These
default risks are applied to all age groups in the populations.
However, it may be unrealistic to assume that the probability an infected person is admitted to hospital or dies is independent of their age. For many diseases, young children or elderly individuals are at higher risk of being hospitalised and having adverse outcomes.
The sim_linelist()
function can accommodate
age-stratified risks by accepting a <data.frame>
instead of a single risk for the entire population.
Here is an example that uses the default hospitalisation and death risks (inside and outside of hospital). First we load the requested delay distributions using the {epiparameter} package.
contact_distribution <- epiparameter(
disease = "COVID-19",
epi_name = "contact distribution",
prob_distribution = create_prob_distribution(
prob_distribution = "pois",
prob_distribution_params = c(mean = 2)
)
)
#> Citation cannot be created as author, year, journal or title is missing
infectious_period <- epiparameter(
disease = "COVID-19",
epi_name = "infectious period",
prob_distribution = create_prob_distribution(
prob_distribution = "gamma",
prob_distribution_params = c(shape = 1, scale = 1)
)
)
#> Citation cannot be created as author, year, journal or title is missing
# get onset to hospital admission from {epiparameter} database
onset_to_hosp <- epiparameter_db(
disease = "COVID-19",
epi_name = "onset to hospitalisation",
single_epiparameter = TRUE
)
#> Using Linton N, Kobayashi T, Yang Y, Hayashi K, Akhmetzhanov A, Jung S, Yuan
#> B, Kinoshita R, Nishiura H (2020). "Incubation Period and Other
#> Epidemiological Characteristics of 2019 Novel Coronavirus Infections
#> with Right Truncation: A Statistical Analysis of Publicly Available
#> Case Data." _Journal of Clinical Medicine_. doi:10.3390/jcm9020538
#> <https://doi.org/10.3390/jcm9020538>..
#> To retrieve the citation use the 'get_citation' function
# get onset to death from {epiparameter} database
onset_to_death <- epiparameter_db(
disease = "COVID-19",
epi_name = "onset to death",
single_epiparameter = TRUE
)
#> Using Linton N, Kobayashi T, Yang Y, Hayashi K, Akhmetzhanov A, Jung S, Yuan
#> B, Kinoshita R, Nishiura H (2020). "Incubation Period and Other
#> Epidemiological Characteristics of 2019 Novel Coronavirus Infections
#> with Right Truncation: A Statistical Analysis of Publicly Available
#> Case Data." _Journal of Clinical Medicine_. doi:10.3390/jcm9020538
#> <https://doi.org/10.3390/jcm9020538>..
#> To retrieve the citation use the 'get_citation' function
We set the seed to ensure we have the same output each time the vignette is rendered. When using {simulist}, setting the seed is not required unless you need to simulate the same line list multiple times.
Simulate a line list with population-wide default risks:
0.2
0.5
0.05
linelist <- sim_linelist(
contact_distribution = contact_distribution,
infectious_period = infectious_period,
prob_infection = 0.5,
onset_to_hosp = onset_to_hosp,
onset_to_death = onset_to_death
)
# first 6 rows of linelist
head(linelist)
#> id case_name case_type sex age date_onset date_admission outcome
#> 1 1 Wajdi al-Demian probable m 35 2023-01-01 <NA> recovered
#> 2 2 Raaid el-Diab confirmed m 43 2023-01-01 2023-01-07 recovered
#> 3 3 Nickolas Nault suspected m 1 2023-01-01 <NA> recovered
#> 4 5 Hee Kennedy confirmed m 78 2023-01-01 2023-01-03 died
#> 5 6 Hope Arshad suspected f 22 2023-01-01 2023-01-28 died
#> 6 8 Shanta Holiday probable f 28 2023-01-01 <NA> recovered
#> date_outcome date_first_contact date_last_contact ct_value
#> 1 <NA> <NA> <NA> NA
#> 2 <NA> 2022-12-30 2023-01-05 23.2
#> 3 <NA> 2022-12-30 2023-01-02 NA
#> 4 2023-01-21 2022-12-29 2023-01-02 25.2
#> 5 2023-01-10 2023-01-01 2023-01-03 NA
#> 6 <NA> 2023-01-03 2023-01-04 NA
We can run another simulation and change the hospitalisation and death risks, inside and outside the hospital, still applied to the entire population. In this scenario the probability of being hospitalised if infected is higher, but the mortality risk for both hospitalised and non-hospitalised groups is lower.
linelist <- sim_linelist(
contact_distribution = contact_distribution,
infectious_period = infectious_period,
prob_infection = 0.5,
onset_to_hosp = onset_to_hosp,
onset_to_death = onset_to_death,
hosp_risk = 0.4,
hosp_death_risk = 0.2,
non_hosp_death_risk = 0.01
)
head(linelist)
#> id case_name case_type sex age date_onset date_admission outcome
#> 1 1 Robert Wanzek suspected m 80 2023-01-01 2023-01-08 recovered
#> 2 2 Kacy Kim probable f 85 2023-01-01 <NA> recovered
#> 3 4 Jade Goldsberry probable f 76 2023-01-01 <NA> recovered
#> 4 8 Brittany Brooks confirmed f 12 2023-01-01 2023-01-01 died
#> 5 11 Fat'hiyaa al-Zafar suspected f 50 2023-01-01 <NA> recovered
#> 6 14 Desirae Carranza probable f 54 2023-01-01 2023-01-03 recovered
#> date_outcome date_first_contact date_last_contact ct_value
#> 1 <NA> <NA> <NA> NA
#> 2 <NA> 2023-01-04 2023-01-06 NA
#> 3 <NA> 2022-12-30 2023-01-03 NA
#> 4 2023-01-18 2023-01-05 2023-01-05 26.4
#> 5 <NA> 2023-01-03 2023-01-05 NA
#> 6 <NA> 2022-12-31 2023-01-02 NA
To define age-stratified risks, we must create a table
(<data.frame>
) which contains the lower limits of the
age groups and their respective risks.
In this example the hospitalisation risk will be:
0.2
(or 20%) for those aged 80 years or older0.1
(or 10%) for those younger than 5 years0.05
(or 5%) for the remaining age groupThe oldest age group stops at the upper age range given in the
population_age
argument. The default upper age range in 90,
so in this example the oldest age bracket will be 80-90 (inclusive). The
minimum age of each age group is inclusive, and the maximum age of each
age group is exclusive, except the oldest age group which is inclusive
of the minimum and maximum age. In this example the first age group is
the first element of each vector, so the minimum age is 1, maximum age
is four (as the next age group starts at five), and the hospitalisation
risk for that group is 0.1. Each age group forms a row in the table.
age_dep_hosp_risk <- data.frame(
age_limit = c(1, 5, 80),
risk = c(0.1, 0.05, 0.2)
)
age_dep_hosp_risk
#> age_limit risk
#> 1 1 0.10
#> 2 5 0.05
#> 3 80 0.20
linelist <- sim_linelist(
contact_distribution = contact_distribution,
infectious_period = infectious_period,
prob_infection = 0.5,
onset_to_hosp = onset_to_hosp,
onset_to_death = onset_to_death,
hosp_risk = age_dep_hosp_risk
)
head(linelist)
#> id case_name case_type sex age date_onset date_admission outcome
#> 1 1 Ignacio Albo probable m 88 2023-01-01 <NA> recovered
#> 2 2 Desire Aguayo confirmed f 90 2023-01-01 <NA> recovered
#> 3 4 Miska al-Ramadan confirmed f 30 2023-01-01 <NA> recovered
#> 4 5 Frankie Garcia confirmed f 37 2023-01-01 <NA> recovered
#> 5 6 Tahma Mitchell probable m 88 2023-01-01 <NA> recovered
#> 6 7 Tan Antonio probable m 56 2023-01-02 <NA> recovered
#> date_outcome date_first_contact date_last_contact ct_value
#> 1 <NA> <NA> <NA> NA
#> 2 <NA> 2023-01-02 2023-01-04 23.5
#> 3 <NA> 2023-01-05 2023-01-07 22.9
#> 4 <NA> 2023-01-05 2023-01-06 24.9
#> 5 <NA> 2022-12-29 2023-01-03 NA
#> 6 <NA> 2023-01-01 2023-01-05 NA
The minimum age of the youngest age group must match the age range
specified in the population_age
argument of
sim_linelist()
, and the largest age limit in the risk
<data.frame>
must not be older than the upper age
range. If these conditions are not met the function will error.
If the age-stratified risk table does not match the default
(c(1, 90)
), the population_age
argument will
need to be set to match.
For example, the default age range of the population is 1 to 90
(inclusive). In our example above, the lowest age group started at 1 and
the oldest age group stopped at 90. This matches the default
population_age = c(1, 90)
. However, see here that if the
lower age limit exceeds the age range the function will not run.
age_dep_hosp_risk <- data.frame(
age_limit = c(1, 5, 95),
risk = c(0.1, 0.05, 0.2)
)
age_dep_hosp_risk
#> age_limit risk
#> 1 1 0.10
#> 2 5 0.05
#> 3 95 0.20
linelist <- sim_linelist(
contact_distribution = contact_distribution,
infectious_period = infectious_period,
prob_infection = 0.5,
onset_to_hosp = onset_to_hosp,
onset_to_death = onset_to_death,
hosp_risk = age_dep_hosp_risk
)
#> Error in .check_risk_df(hosp_risk, age_range = age_range): Lower bound of oldest age group must be lower than highest age range
In order to make this code run with the age-stratified
hospitalisation risk given, the population_age
can be
adjusted. Here the oldest age bracket is now 95 to 100
([95, 100]
).
age_dep_hosp_risk <- data.frame(
age_limit = c(1, 5, 95),
risk = c(0.1, 0.05, 0.2)
)
linelist <- sim_linelist(
contact_distribution = contact_distribution,
infectious_period = infectious_period,
prob_infection = 0.5,
onset_to_hosp = onset_to_hosp,
onset_to_death = onset_to_death,
hosp_risk = age_dep_hosp_risk,
population_age = c(1, 100)
)
head(linelist)
#> id case_name case_type sex age date_onset date_admission outcome
#> 1 1 Devin Soloman probable m 80 2023-01-01 <NA> recovered
#> 2 2 Hazm el-Vaziri confirmed m 28 2023-01-01 <NA> recovered
#> 3 3 Grace Vue probable f 83 2023-01-01 <NA> recovered
#> 4 4 Shajee'a al-Sadek suspected f 16 2023-01-01 <NA> recovered
#> 5 5 Azzaam el-Yusuf confirmed m 36 2023-01-01 <NA> recovered
#> 6 7 Cordero Manheimer confirmed m 5 2023-01-01 <NA> died
#> date_outcome date_first_contact date_last_contact ct_value
#> 1 <NA> <NA> <NA> NA
#> 2 <NA> 2022-12-29 2023-01-04 27.2
#> 3 <NA> 2023-01-03 2023-01-05 NA
#> 4 <NA> 2022-12-31 2023-01-03 NA
#> 5 <NA> 2022-12-30 2023-01-03 29.8
#> 6 2023-01-19 2023-01-03 2023-01-08 24.2
Exactly the same method of age-stratified risks applies to death
risks. First create the <data.frame>
with the age
groups and their respective, in this case, death risks, and then supply
this to either the hosp_death_risk
or
non_hosp_death_risk
arguments, to define the death risks in
and outside the hospital, respectively, or both.
Here are a couple of examples:
age_dep_hosp_death_risk <- data.frame(
age_limit = c(1, 5, 80),
risk = c(0.3, 0.1, 0.6)
)
age_dep_hosp_death_risk
#> age_limit risk
#> 1 1 0.3
#> 2 5 0.1
#> 3 80 0.6
linelist <- sim_linelist(
contact_distribution = contact_distribution,
infectious_period = infectious_period,
prob_infection = 0.5,
onset_to_hosp = onset_to_hosp,
onset_to_death = onset_to_death,
hosp_death_risk = age_dep_hosp_death_risk
)
age_dep_non_hosp_death_risk <- data.frame(
age_limit = c(1, 5, 80),
risk = c(0.1, 0.05, 0.1)
)
age_dep_non_hosp_death_risk
#> age_limit risk
#> 1 1 0.10
#> 2 5 0.05
#> 3 80 0.10
linelist <- sim_linelist(
contact_distribution = contact_distribution,
infectious_period = infectious_period,
prob_infection = 0.5,
onset_to_hosp = onset_to_hosp,
onset_to_death = onset_to_death,
non_hosp_death_risk = age_dep_non_hosp_death_risk
)
Up until now these age-stratified tables have only been supplied to one risk. However, these can be supplied in the same simulation. In this case the hospitalisation risk, and death risks inside and outside of hospital, are all specified.
age_dep_hosp_risk <- data.frame(
age_limit = c(1, 5, 80),
risk = c(0.1, 0.05, 0.2)
)
age_dep_hosp_death_risk <- data.frame(
age_limit = c(1, 5, 80),
risk = c(0.3, 0.1, 0.6)
)
age_dep_non_hosp_death_risk <- data.frame(
age_limit = c(1, 5, 80),
risk = c(0.1, 0.05, 0.1)
)
linelist <- sim_linelist(
contact_distribution = contact_distribution,
infectious_period = infectious_period,
prob_infection = 0.5,
onset_to_hosp = onset_to_hosp,
onset_to_death = onset_to_death,
hosp_risk = age_dep_hosp_risk,
hosp_death_risk = age_dep_hosp_death_risk,
non_hosp_death_risk = age_dep_non_hosp_death_risk
)
head(linelist)
#> id case_name case_type sex age date_onset date_admission outcome
#> 1 1 Afeef al-Saber suspected m 54 2023-01-01 <NA> recovered
#> 2 3 Duncan Kleinman suspected m 39 2023-01-02 2023-01-03 recovered
#> 3 4 Montica Gallegos probable f 37 2023-01-01 <NA> recovered
#> 4 5 Son Ahuna suspected m 25 2023-01-03 <NA> recovered
#> 5 6 Alyssa Gonzalez confirmed f 10 2023-01-02 <NA> recovered
#> 6 7 Ashley Palomo confirmed f 13 2023-01-02 <NA> recovered
#> date_outcome date_first_contact date_last_contact ct_value
#> 1 <NA> <NA> <NA> NA
#> 2 <NA> 2022-12-27 2023-01-01 NA
#> 3 <NA> 2022-12-31 2023-01-06 NA
#> 4 <NA> 2023-01-02 2023-01-06 NA
#> 5 <NA> 2023-01-02 2023-01-06 25.8
#> 6 <NA> 2023-01-02 2023-01-03 27.3