Title: | Bayesian Estimation of the Force of Infection from Serological Data |
---|---|
Description: | Estimating the force of infection from time-varying, age-varying, or constant serocatalytic models using population-based seroprevalence studies within a Bayesian framework, including data simulation functions that generate serological surveys based on these models. This tool also provides a flexible prior specification syntax for the force of infection and the seroreversion rate, as well as methods to assess model convergence, comparison criteria, and useful visualisation functions. |
Authors: | Zulma M. Cucunubá [aut, cre] |
Maintainer: | Zulma M. Cucunubá <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.3 |
Built: | 2025-03-28 19:24:40 UTC |
Source: | https://github.com/epiverse-trace/serofoi |
A DESCRIPTION OF THE PACKAGE
Maintainer: Zulma M. Cucunubá [email protected] (ORCID)
Authors:
Nicolás T. Domínguez [email protected] (ORCID)
Ben Lambert
Pierre Nouvellet
Other contributors:
Geraldine Gómez (ORCID) [contributor]
Jaime A. Pavlich-Mariscal (ORCID) [contributor]
Hugo Gruson (ORCID) [contributor]
David Santiago Quevedo (ORCID) [contributor]
Miguel Gámez [contributor]
Sumali Bajaj [contributor]
Everlyn Kamau [contributor]
Richard Creswell [contributor]
International Development Research Center (IDRC) [funder]
Pontificia Universidad Javeriana [copyright holder]
Stan Development Team (NA). RStan: the R interface to Stan. R package version 2.26.22. https://mc-stan.org
Useful links:
Report bugs at https://github.com/epiverse-trace/serofoi/issues
Adds age group marker to serosurvey
add_age_group_to_serosurvey(serosurvey)
serosurvey |
|
serosurvey with an additional column specifying the age group marker, defined as the floor of the mean of age_min and age_max
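As a rough illustration of the marker described above, the following base R sketch computes the floor of the midpoint of each age interval. The data frame `toy_survey` and the column name `age_group` are assumptions made for this example; the package function may name or compute the column differently.

```r
# Hypothetical toy serosurvey; only age_min and age_max are needed here
toy_survey <- data.frame(
  age_min = c(1, 6, 11),
  age_max = c(5, 10, 20)
)

# Age group marker: floor of the midpoint of each age interval
# (the column name `age_group` is an assumption)
toy_survey$age_group <- floor((toy_survey$age_min + toy_survey$age_max) / 2)

toy_survey
```

For the interval 11 to 20 the midpoint is 15.5, so the marker is rounded down to 15.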
Builds stan data for sampling depending on the selected model
build_stan_data( serosurvey, model_type = "constant", foi_prior = sf_uniform(), foi_index = NULL, is_log_foi = FALSE, foi_sigma_rw = sf_none(), is_seroreversion = FALSE, seroreversion_prior = sf_none() )
serosurvey |
|
model_type |
Type of the model. Either "constant", "age" or "time" |
foi_prior |
Force-of-infection distribution specified by means of the helper functions. Currently available options are:
|
foi_index |
Integer vector specifying the age-groups for which Force-of-Infection values will be estimated. It can be specified by means of get_foi_index |
is_log_foi |
Boolean to set logarithmic scale in the FoI |
foi_sigma_rw |
Prior distribution for the standard deviation of the Force-of-Infection. Currently available options are: |
is_seroreversion |
Boolean specifying whether to include seroreversion rate estimation in the model |
seroreversion_prior |
seroreversion distribution specified by means of the helper functions. Currently available options are:
|
List with necessary data for sampling the specified model
Datasets that measure the seroprevalence of IgG antibodies against Trypanosoma cruzi infection in rural areas of Colombia corresponding to a serosurvey conducted in 2012 for a rural indigenous community known to have long-term endemic transmission, where some control interventions have taken place over the years.
data(chagas2012)
chagas2012
A <data.frame>
with 4 rows and 5 columns:
Year in which the serosurvey was conducted
Number of collected samples per age group
Number of positive samples per age group
Age group minimal age
Age group maximal age
data(chagas2012)
Dataset that measures the seroprevalence of IgG antibodies against the Chikungunya virus, from a serosurvey conducted in Bahia, Brazil in October-December 2015 by Dias et al. (2018). The survey was conducted immediately after a large Chikungunya epidemic in the area.
data(chik2015)
chik2015
A <data.frame>
with 4 rows and 5 columns:
Year in which the serosurvey was conducted
Number of collected samples per age group
Number of positive samples per age group
Age group minimal age
Age group maximal age
data(chik2015)
Extracts central estimates from stan_fit object for specified parameter
extract_central_estimates( seromodel, serosurvey, alpha = 0.05, par_name = "foi_vector" )
seromodel |
stan_fit object obtained from sampling a model with fit_seromodel |
serosurvey |
|
alpha |
1 - alpha indicates the credibility level to be used |
par_name |
String specifying the parameter to be extracted from seromodel |
A dataframe with the following columns:
median: Median of the samples computed as the 0.5 quantile
lower: Lower quantile alpha
upper: Upper quantile 1 - alpha
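The summary described by these columns can be sketched in base R on simulated draws. The matrix `draws` and its column names are hypothetical stand-ins for posterior samples; no Stan fit is involved, and this is only an illustration of the quantile computation, not the package's internal code.

```r
set.seed(1)

# Hypothetical posterior draws: 1000 samples for each of 3 FoI values
draws <- matrix(
  rexp(3000, rate = 10),
  ncol = 3,
  dimnames = list(NULL, c("foi[1]", "foi[2]", "foi[3]"))
)

alpha <- 0.05

# One row per parameter: median plus the alpha and 1 - alpha quantiles
central_estimates <- data.frame(
  median = apply(draws, 2, quantile, probs = 0.5),
  lower = apply(draws, 2, quantile, probs = alpha),
  upper = apply(draws, 2, quantile, probs = 1 - alpha)
)

central_estimates
```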
data(veev2012)
seromodel <- fit_seromodel(veev2012, iter = 100)
central_estimates <- extract_central_estimates(
  seromodel,
  veev2012,
  par_name = "foi"
)
Runs specified stan model for the Force-of-Infection (FoI)
fit_seromodel( serosurvey, model_type = "constant", is_log_foi = FALSE, foi_prior = sf_normal(), foi_sigma_rw = sf_none(), foi_index = NULL, foi_init = NULL, is_seroreversion = FALSE, seroreversion_prior = sf_normal(), ... )
serosurvey |
|
model_type |
Type of the model. Either "constant", "age" or "time" |
is_log_foi |
Boolean to set logarithmic scale in the FoI |
foi_prior |
Force-of-infection distribution specified by means of the helper functions. Currently available options are:
|
foi_sigma_rw |
Prior distribution for the standard deviation of the Force-of-Infection. Currently available options are: |
foi_index |
Integer vector specifying the age-groups for which Force-of-Infection values will be estimated. It can be specified by means of get_foi_index |
foi_init |
Initialization function for sampling. If null, default is chosen depending on the foi-scale of the model |
is_seroreversion |
Boolean specifying whether to include seroreversion rate estimation in the model |
seroreversion_prior |
seroreversion distribution specified by means of the helper functions. Currently available options are:
|
... |
Additional parameters for rstan |
stan_fit object with Force-of-Infection and seroreversion (when applicable) samples
data(chagas2012)
seromodel <- fit_seromodel(
  serosurvey = chagas2012,
  model_type = "time",
  foi_index = data.frame(
    year = 1935:2011,
    foi_index = c(rep(1, 46), rep(2, 31))
  ),
  iter = 100
)
Generates a list of integers indexing together the time/age intervals for which FoI values will be estimated in fit_seromodel. The max value in foi_index corresponds to the number of FoI values to be estimated when sampling. The serofoi approach to fitting serological data currently supposes that FoI is piecewise-constant across either groups of years or ages, and this function creates a data frame that communicates this grouping to the Stan model.
get_foi_index(serosurvey, group_size, model_type)
serosurvey |
|
group_size |
Age groups size |
model_type |
Type of the model. Either "age" or "time" |
A Data Frame which describes the grouping of years or ages (dependent on model) into pieces within which the FoI is assumed constant when performing model fitting. A single FoI value will be estimated for ages/years assigned with the same index
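The grouping idea can be sketched in base R as follows. The helper `make_foi_index_sketch` is hypothetical and not part of the package; in particular, how the package treats a leftover group smaller than `group_size` is not specified here, so the handling below is an assumption.

```r
# Hypothetical helper (not part of serofoi): assign consecutive years to
# groups of `group_size`, so each group shares one FoI value
make_foi_index_sketch <- function(years, group_size) {
  data.frame(
    year = years,
    foi_index = ceiling(seq_along(years) / group_size)
  )
}

foi_index <- make_foi_index_sketch(1935:2011, group_size = 25)

# Number of FoI values that would be estimated
max(foi_index$foi_index)

# Number of years sharing each FoI value
table(foi_index$foi_index)
```

With 77 years and groups of 25, this sketch yields four indices, the last covering only two years.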
data(chagas2012)
foi_index <- get_foi_index(chagas2012, group_size = 25, model_type = "time")
Plots Force-of-Infection central estimates
plot_foi_estimates( seromodel, serosurvey, alpha = 0.05, foi_df = NULL, foi_max = NULL, size_text = 11, plot_constant = FALSE, x_axis = NA )
seromodel |
stan_fit object obtained from sampling a model with fit_seromodel |
serosurvey |
|
alpha |
1 - alpha indicates the credibility level to be used |
foi_df |
Dataframe with columns
|
foi_max |
Max FoI value for plotting |
size_text |
Size of text for plotting ( |
plot_constant |
boolean specifying whether to plot single
Force-of-Infection estimate and its corresponding rhat value instead
of showing this information in the summary.
Only relevant when |
x_axis |
either |
ggplot object with estimated FoI
data(chagas2012)
seromodel <- fit_seromodel(
  serosurvey = chagas2012,
  model_type = "time",
  foi_index = data.frame(
    year = 1935:2011,
    foi_index = c(rep(1, 46), rep(2, 31))
  ),
  iter = 100,
  chains = 2
)
plot_foi_estimates(seromodel, chagas2012)
Plot r-hats convergence criteria for the specified model
plot_rhats( seromodel, serosurvey, size_text = 11, plot_constant = FALSE, x_axis = NA )
seromodel |
stan_fit object obtained from sampling a model with fit_seromodel |
serosurvey |
|
size_text |
Size of text for plotting ( |
plot_constant |
boolean specifying whether to plot single
Force-of-Infection estimate and its corresponding rhat value instead
of showing this information in the summary.
Only relevant when |
x_axis |
either |
ggplot object showing the r-hats of the model to be compared with the convergence criteria (horizontal dashed line)
data(chagas2012)
seromodel <- fit_seromodel(
  serosurvey = chagas2012,
  model_type = "time",
  foi_index = data.frame(
    year = 1935:2011,
    foi_index = c(rep(1, 46), rep(2, 31))
  ),
  iter = 100,
  chains = 2
)
plot_rhats(seromodel, chagas2012)
Visualise results of the provided model
plot_seromodel( seromodel, serosurvey, alpha = 0.05, bin_serosurvey = FALSE, bin_step = 5, foi_df = NULL, foi_max = NULL, loo_estimate_digits = 1, central_estimate_digits = 2, seroreversion_digits = 2, rhat_digits = 2, size_text = 11, plot_constant = FALSE, x_axis = NA )
seromodel |
stan_fit object obtained from sampling a model with fit_seromodel |
serosurvey |
|
alpha |
1 - alpha indicates the credibility level to be used |
bin_serosurvey |
If |
bin_step |
Integer specifying the age groups bin size to be used when
|
foi_df |
Dataframe with columns
|
foi_max |
Max FoI value for plotting |
loo_estimate_digits |
Number of loo estimate digits |
central_estimate_digits |
Number of central estimate digits |
seroreversion_digits |
Number of seroreversion rate digits |
rhat_digits |
Number of rhat estimate digits |
size_text |
Size of text for plotting ( |
plot_constant |
boolean specifying whether to plot single
Force-of-Infection estimate and its corresponding rhat value instead
of showing this information in the summary.
Only relevant when |
x_axis |
either |
seromodel summary plot
data(veev2012)
seromodel <- fit_seromodel(veev2012, iter = 100)
plot_seromodel(seromodel, veev2012)
Plot seroprevalence estimates on top of the serosurvey
plot_seroprev_estimates( seromodel, serosurvey, alpha = 0.05, size_text = 11, bin_serosurvey = FALSE, bin_step = 5 )
seromodel |
stan_fit object obtained from sampling a model with fit_seromodel |
serosurvey |
|
alpha |
1 - alpha indicates the credibility level to be used |
size_text |
Size of text for plotting ( |
bin_serosurvey |
If |
bin_step |
Integer specifying the age groups bin size to be used when
|
ggplot object with seroprevalence estimates and serosurveys plots
data(veev2012)
seromodel <- fit_seromodel(veev2012, iter = 100)
plot_seroprev_estimates(seromodel, veev2012)
Plots seroprevalence from the given serosurvey
plot_serosurvey( serosurvey, size_text = 11, bin_serosurvey = FALSE, bin_step = 5 )
serosurvey |
|
size_text |
Size of text for plotting ( |
bin_serosurvey |
If |
bin_step |
Integer specifying the age groups bin size to be used when
|
ggplot object with seroprevalence plot
# Chikungunya example serosurvey
data(chik2015)
plot_serosurvey(chik2015)

# VEEV example serosurvey
data(veev2012)
plot_serosurvey(veev2012)
Plots model summary
plot_summary( seromodel, serosurvey, loo_estimate_digits = 1, central_estimate_digits = 2, rhat_digits = 2, size_text = 11, plot_constant = FALSE )
seromodel |
stan_fit object obtained from sampling a model with fit_seromodel |
serosurvey |
|
loo_estimate_digits |
Number of loo estimate digits |
central_estimate_digits |
Number of central estimate digits |
rhat_digits |
Number of rhat estimate digits |
size_text |
Size of text for plotting ( |
plot_constant |
boolean specifying whether to plot single
Force-of-Infection estimate and its corresponding rhat value instead
of showing this information in the summary.
Only relevant when |
ggplot object with a summary of the specified model
data(veev2012)
seromodel <- fit_seromodel(veev2012, iter = 100)
plot_summary(seromodel, veev2012)
Adds seroprevalence values with corresponding binomial confidence interval
prepare_serosurvey_for_plot(serosurvey, alpha = 0.05)
serosurvey |
|
alpha |
1 - alpha indicates the confidence level to be used |
serosurvey with additional columns:
Seroprevalence computed as the proportion of positive cases n_seropositive in the number of samples n_sample for each age group
Lower limit of the binomial confidence interval of seroprev
Upper limit of the binomial confidence interval of seroprev
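A minimal base R sketch of this computation is given below. The interval method (exact Clopper-Pearson via `binom.test`) and the column names `seroprev_lower` and `seroprev_upper` are assumptions; the package may use a different binomial interval and different names. The data frame `toy_survey` is hypothetical.

```r
# Hypothetical serosurvey with the columns named in the text
toy_survey <- data.frame(
  age_min = c(1, 6, 11),
  age_max = c(5, 10, 20),
  n_sample = c(40, 50, 60),
  n_seropositive = c(4, 15, 30)
)

alpha <- 0.05

# Observed seroprevalence per age group
toy_survey$seroprev <- toy_survey$n_seropositive / toy_survey$n_sample

# Exact (Clopper-Pearson) interval per row; the interval method is an
# assumption, the package may use a different one
ci <- mapply(
  function(x, n) binom.test(x, n, conf.level = 1 - alpha)$conf.int,
  toy_survey$n_seropositive,
  toy_survey$n_sample
)
toy_survey$seroprev_lower <- ci[1, ]
toy_survey$seroprev_upper <- ci[2, ]

toy_survey
```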
This function calculates the probabilities of seropositivity by age based on an age-varying FoI model. It takes into account the FoI and the rate of seroreversion.
prob_seroprev_age_by_age(foi, seroreversion_rate)
foi |
A dataframe containing the FoI values for different ages. It should have two columns: 'age' and 'foi'. |
seroreversion_rate |
A non-negative numeric value representing the rate of seroreversion. |
A dataframe with columns 'age' and 'seropositivity'.
This function calculates the probabilities of seropositivity by age based on an age-and-time-varying FoI model. It takes into account the FoI and the rate of seroreversion.
prob_seroprev_age_time_by_age(foi, seroreversion_rate)
foi |
A dataframe containing the FoI values for different years and ages. It should have three columns: 'year', 'age' and 'foi'. |
seroreversion_rate |
A non-negative numeric value representing the rate of seroreversion. |
A dataframe with columns 'age' and 'seropositivity'.
This function generates seropositivity probabilities based on either a time-varying Force-of-Infection (FoI) model, an age-varying FoI model, or an age-and-time-varying FoI model. In all cases, it is possible to optionally include seroreversion.
prob_seroprev_by_age(model, foi, seroreversion_rate = 0)
model |
A string specifying the model type, which can be "age", "time" or "age-time". |
foi |
A dataframe containing the FoI values. For time-varying models the columns should be: 'year' and 'foi'.
For age-varying models the columns should be: 'age' and 'foi'.
For age-and-time-varying models the columns should be: 'year', 'age' and 'foi'.
|
seroreversion_rate |
A non-negative value determining the rate of seroreversion (per year). Default is 0. |
A dataframe with columns 'age' and 'seropositivity'.
prob_seroprev_by_age( model = "age", foi = data.frame( age = 1:80, foi = rep(0.01, 80) ) )
This function calculates the probabilities of seropositivity by age based on an abstract model of the serocatalytic system.
prob_seroprev_gen_by_age( construct_A_fun, calculate_seroprev_fun, initial_conditions, max_age, ... )
construct_A_fun |
A function that constructs a matrix that defines the multiplier term in the linear ODE system. |
calculate_seroprev_fun |
A function which takes the state vector and returns the seropositive fraction. |
initial_conditions |
The initial state vector proportions for each birth cohort. |
max_age |
The maximum age to simulate seropositivity for. |
... |
Additional parameters for construct_A_fun |
A dataframe with columns 'age' and 'seropositivity'.
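To illustrate what solving a linear ODE system for one birth cohort involves, the sketch below advances a two-state system (seronegative, seropositive) one year at a time with a matrix exponential. It is a simplified stand-in under stated assumptions: the matrix is held constant over all ages, and `mat_exp_sketch` is a hypothetical helper using a truncated Taylor series, not the solver used by the package.

```r
# Hypothetical helper: matrix exponential by scaling and squaring with a
# truncated Taylor series (adequate for small, well-scaled matrices)
mat_exp_sketch <- function(A, n_terms = 20) {
  n_squarings <- max(0, ceiling(log2(max(1, norm(A, "1")))))
  A_scaled <- A / 2^n_squarings
  result <- diag(nrow(A))
  term <- diag(nrow(A))
  for (k in seq_len(n_terms)) {
    term <- term %*% A_scaled / k
    result <- result + term
  }
  for (i in seq_len(n_squarings)) {
    result <- result %*% result
  }
  result
}

# Two-state system (seronegative, seropositive) with FoI and seroreversion
foi <- 0.1
mu <- 0.02
A <- matrix(c(-foi, foi, mu, -mu), nrow = 2)

# Advance one birth cohort year by year, A held constant within each year
state <- c(1, 0) # everyone starts seronegative
max_age <- 30
seropositivity <- numeric(max_age)
for (age in seq_len(max_age)) {
  state <- as.vector(mat_exp_sketch(A) %*% state)
  seropositivity[age] <- state[2]
}

data.frame(age = seq_len(max_age), seropositivity = seropositivity)
```

For this constant two-state case the result can be checked against the closed form foi / (foi + mu) * (1 - exp(-(foi + mu) * age)).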
# define age- and time-specific multipliers
foi_df_time <- data.frame(
  year = seq(1946, 2025, 1),
  foi = c(rep(0, 40), rep(1, 40))
)
foi_df_age <- data.frame(
  age = 1:80,
  foi = 2 * dlnorm(1:80, meanlog = 3.5, sdlog = 0.5)
)
u <- foi_df_age$foi
v <- foi_df_time$foi

# function to construct A matrix for one piece
construct_A <- function(t, tau, u, v) {
  u_bar <- u[t - tau]
  v_bar <- v[t]
  A <- diag(-1, ncol = 12, nrow = 12)
  A[row(A) == (col(A) + 1)] <- 1
  A[1, 1] <- -u_bar * v_bar
  A[2, 1] <- u_bar * v_bar
  A[12, 12] <- 0
  A
}

# determines the sum of seropositive compartments of those still alive
calculate_seropositivity_fn <- function(Y) {
  sum(Y[2:11]) / (1 - Y[12])
}

# initial conditions in 12D state vector
initial_conditions <- rep(0, 12)
initial_conditions[1] <- 1

# calculate probability
seropositive_hiv <- prob_seroprev_gen_by_age(
  construct_A,
  calculate_seropositivity_fn,
  initial_conditions,
  max_age = 80,
  u,
  v
)
This function calculates the probabilities of seropositivity by age based on a time-varying FoI model. It takes into account the FoI and the rate of seroreversion.
prob_seroprev_time_by_age(foi, seroreversion_rate)
foi |
A dataframe containing the FoI values for different years. It should have two columns: 'year' and 'foi'. |
seroreversion_rate |
A non-negative numeric value representing the rate of seroreversion. |
A dataframe with columns 'age' and 'seropositivity'.
Computes the probability of being seropositive when Forces-of-Infection (FoIs) vary by age
probability_exact_age_varying(ages, fois, seroreversion_rate = 0)
ages |
Integer indicating the ages of the exposed cohorts |
fois |
Numeric atomic vector corresponding to the age-varying Force-of-Infection to simulate from |
seroreversion_rate |
Non-negative seroreversion rate. Default is 0. |
vector of probabilities of being seropositive for age-varying FoI including seroreversion (ordered from youngest to oldest individuals)
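The calculation can be sketched with the standard serocatalytic equation dP/dt = foi * (1 - P) - mu * P, solved exactly over each year of age. The helper `seroprev_by_age_sketch` is hypothetical and assumes that the FoI is constant within each year of age and that everyone is seronegative at birth; it is not the package's implementation.

```r
# Hypothetical helper (not part of serofoi): seropositivity at the end of
# each year of age, for FoI values that are constant within each year
seroprev_by_age_sketch <- function(fois, seroreversion_rate = 0) {
  prob <- numeric(length(fois))
  previous <- 0 # everyone is assumed seronegative at birth
  for (i in seq_along(fois)) {
    total_rate <- fois[i] + seroreversion_rate
    if (total_rate > 0) {
      # Exact one-year solution of dP/dt = foi * (1 - P) - mu * P
      equilibrium <- fois[i] / total_rate
      previous <- equilibrium + (previous - equilibrium) * exp(-total_rate)
    }
    prob[i] <- previous
  }
  prob
}

# Constant FoI of 0.01 per year for ages 1 to 80, no seroreversion
p_no_reversion <- seroprev_by_age_sketch(rep(0.01, 80))

# Same FoI with a seroreversion rate of 0.02 per year
p_reversion <- seroprev_by_age_sketch(rep(0.01, 80), seroreversion_rate = 0.02)
```

Without seroreversion and with a constant FoI, the result reduces to 1 - exp(-foi * age); seroreversion lowers seropositivity at every age.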
Computes the probability of being seropositive when Forces-of-Infection (FoIs) vary by time
probability_exact_time_varying(years, fois, seroreversion_rate = 0)
years |
Integer indicating the years covering the birth ages of the sample |
fois |
Numeric atomic vector corresponding to the time-varying FoI to simulate from |
seroreversion_rate |
Non-negative seroreversion rate. Default is 0. |
vector of probabilities of being seropositive for time-varying FoI including seroreversion (ordered from youngest to oldest individuals)
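For the simplest case without seroreversion, the time-varying calculation can be sketched in base R. The sketch assumes the survey takes place right after the last listed year, so a person aged a has been exposed to the a most recent FoI values; the vectors `years` and `fois` are hypothetical, and the seroreversion case is not covered here.

```r
# Hypothetical yearly FoI values, oldest year first
years <- 2000:2009
fois <- c(rep(0.02, 5), rep(0.05, 5))

# A person aged a at the survey has lived through the last a years, so the
# cumulative exposure is the sum of the a most recent FoI values
cumulative_foi <- cumsum(rev(fois))

# Seropositivity by age without seroreversion, youngest first
seroprev <- data.frame(
  age = seq_along(years),
  seropositivity = 1 - exp(-cumulative_foi)
)

seroprev
```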
Sets initialization function for sampling
set_foi_init(foi_init, is_log_foi, foi_index)
foi_init |
Initialization function for sampling. If null, default is chosen depending on the foi-scale of the model |
is_log_foi |
Boolean to set logarithmic scale in the FoI |
foi_index |
Integer vector specifying the age-groups for which Force-of-Infection values will be estimated. It can be specified by means of get_foi_index |
Function specifying initialization vector for the Force-of-Infection
data(chagas2012)
foi_index <- get_foi_index(chagas2012, group_size = 5, model_type = "age")
foi_init <- set_foi_init(
  foi_init = NULL,
  is_log_foi = FALSE,
  foi_index = foi_index
)
Set stan data defaults for sampling
set_stan_data_defaults(stan_data, is_log_foi = FALSE, is_seroreversion = FALSE)
stan_data |
List to be passed to rstan |
is_log_foi |
Boolean to set logarithmic scale in the FoI |
is_seroreversion |
Boolean specifying whether to include seroreversion rate estimation in the model |
List with default values of stan data for sampling
Sets Cauchy distribution parameters for sampling
sf_cauchy(location = 0, scale = 1)
location |
Location of the Cauchy distribution |
scale |
Scale of the Cauchy distribution |
List with specified statistics and name of the distribution
my_prior <- sf_cauchy()
Sets empty prior distribution
sf_none()
List with the name of the empty distribution
Sets normal distribution parameters for sampling
sf_normal(mean = 0, sd = 1)
mean |
Mean of the normal distribution |
sd |
Standard deviation of the normal distribution |
List with specified statistics and name of the distribution
my_prior <- sf_normal()
Sets uniform distribution parameters for sampling
sf_uniform(min = 0, max = 10)
min |
Minimum value of the random variable of the uniform distribution |
max |
Maximum value of the random variable of the uniform distribution |
List with specified statistics and name of the distribution
my_prior <- sf_uniform()
This function generates binned serosurvey data based on either a time-varying FoI model, an age-varying FoI model, or an age-and-time-varying FoI model. In all cases, it is possible to optionally include seroreversion. This function allows construction of serosurveys with binned age groups, and it generates uncertainty in the distribution of a sample size within an age bin through multinomial sampling.
simulate_serosurvey(model, foi, survey_features, seroreversion_rate = 0)
model |
A string specifying the model type, which can be "age", "time" or "age-time". |
foi |
A dataframe containing the FoI values. For time-varying models the columns should be: 'year' and 'foi'.
For age-varying models the columns should be: 'age' and 'foi'.
For age-and-time-varying models the columns should be: 'year', 'age' and 'foi'.
|
survey_features |
A dataframe containing information about the binned age groups and sample sizes for each. It should contain columns: 'age_min', 'age_max' and 'n_sample'.
The resulting age intervals are closed to the left |
seroreversion_rate |
A non-negative value determining the rate of seroreversion (per year). Default is 0. |
A dataframe with simulated serosurvey data, including age group information, overall sample sizes, the number of seropositive individuals, and other survey features.
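The two sampling steps described above (spreading a bin's sample size over its ages, then drawing seropositives) can be sketched in base R. The helper `simulate_bin`, the equal multinomial weights across ages within a bin, and the seropositivity curve `prob_by_age` are assumptions made for illustration, not the package's implementation.

```r
set.seed(42)

# Hypothetical seropositivity by single year of age (ages 1 to 20)
prob_by_age <- 1 - exp(-0.1 * (1:20))

survey_features <- data.frame(
  age_min = c(1, 3, 15),
  age_max = c(2, 14, 20),
  n_sample = c(1000, 2000, 1500)
)

simulate_bin <- function(age_min, age_max, n_sample) {
  ages <- age_min:age_max
  # Spread the bin's sample size over its ages (equal weights are an
  # assumption), then draw seropositives for each age
  n_by_age <- as.vector(rmultinom(1, n_sample, rep(1, length(ages))))
  sum(rbinom(length(ages), size = n_by_age, prob = prob_by_age[ages]))
}

survey_features$n_seropositive <- mapply(
  simulate_bin,
  survey_features$age_min,
  survey_features$age_max,
  survey_features$n_sample
)

survey_features
```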
# time-varying model
foi_df <- data.frame(
  year = seq(1990, 2009, 1),
  foi = rnorm(20, 0.1, 0.01)
)
survey_features <- data.frame(
  age_min = c(1, 3, 15),
  age_max = c(2, 14, 20),
  n_sample = c(1000, 2000, 1500))
serosurvey <- simulate_serosurvey(
  model = "time",
  foi = foi_df,
  survey_features = survey_features)

# age-varying model
foi_df <- data.frame(
  age = seq(1, 20, 1),
  foi = rnorm(20, 0.1, 0.01)
)
survey_features <- data.frame(
  age_min = c(1, 3, 15),
  age_max = c(2, 14, 20),
  n_sample = c(1000, 2000, 1500))
serosurvey <- simulate_serosurvey(
  model = "age",
  foi = foi_df,
  survey_features = survey_features)

# age-and-time varying model
foi_df <- expand.grid(
  year = seq(1990, 2009, 1),
  age = seq(1, 20, 1)
)
foi_df$foi <- rnorm(20 * 20, 0.1, 0.01)
survey_features <- data.frame(
  age_min = c(1, 3, 15),
  age_max = c(2, 14, 20),
  n_sample = c(1000, 2000, 1500))
serosurvey <- simulate_serosurvey(
  model = "age-time",
  foi = foi_df,
  survey_features = survey_features)
This function generates binned serosurvey data based on an age-varying FoI model, optionally including seroreversion. This function allows construction of serosurveys with binned age groups, and it generates uncertainty in the distribution of a sample size within an age bin through multinomial sampling.
simulate_serosurvey_age(foi, survey_features, seroreversion_rate = 0)
foi |
A dataframe containing the FoI values. For time-varying models the columns should be: 'year' and 'foi'.
For age-varying models the columns should be: 'age' and 'foi'.
For age-and-time-varying models the columns should be: 'year', 'age' and 'foi'.
|
survey_features |
A dataframe containing information about the binned age groups and sample sizes for each. It should contain columns: 'age_min', 'age_max' and 'n_sample'.
The resulting age intervals are closed to the left |
seroreversion_rate |
A non-negative value determining the rate of seroreversion (per year). Default is 0. |
A dataframe with simulated serosurvey data, including age group information, overall sample sizes, the number of seropositive individuals, and other survey features.
# specify FOIs for each age
foi_df <- data.frame(
  age = seq(1, 20, 1),
  foi = rnorm(20, 0.1, 0.01)
)
survey_features <- data.frame(
  age_min = c(1, 3, 15),
  age_max = c(2, 14, 20),
  n_sample = c(1000, 2000, 1500))
serosurvey <- simulate_serosurvey_age(
  foi_df, survey_features)
This function generates binned serosurvey data based on an age-and-time-varying FoI model, optionally including seroreversion. This function allows construction of serosurveys with binned age groups, and it generates uncertainty in the distribution of a sample size within an age bin through multinomial sampling.
simulate_serosurvey_age_time(foi, survey_features, seroreversion_rate = 0)
foi |
A dataframe containing the FoI values. For time-varying models the columns should be: 'year' and 'foi'.
For age-varying models the columns should be: 'age' and 'foi'.
For age-and-time-varying models the columns should be: 'year', 'age' and 'foi'.
|
survey_features |
A dataframe containing information about the binned age groups and sample sizes for each. It should contain columns: 'age_min', 'age_max' and 'n_sample'.
The resulting age intervals are closed to the left |
seroreversion_rate |
A non-negative value determining the rate of seroreversion (per year). Default is 0. |
A dataframe with simulated serosurvey data, including age group information, overall sample sizes, the number of seropositive individuals, and other survey features.
# specify FOIs for each year and age
foi_df <- expand.grid(
  year = seq(1990, 2009, 1),
  age = seq(1, 20, 1)
)
foi_df$foi <- rnorm(20 * 20, 0.1, 0.01)
survey_features <- data.frame(
  age_min = c(1, 3, 15),
  age_max = c(2, 14, 20),
  n_sample = c(1000, 2000, 1500))
serosurvey <- simulate_serosurvey_age_time(
  foi_df, survey_features)
This simulation method assumes only that the model system can be written as a piecewise-linear ordinary differential equation system.
simulate_serosurvey_general( construct_A_fun, calculate_seroprev_fun, initial_conditions, survey_features, ... )
construct_A_fun |
A function that constructs a matrix that defines the multiplier term in the linear ODE system. |
calculate_seroprev_fun |
A function which takes the state vector and returns the seropositive fraction. |
initial_conditions |
The initial state vector proportions for each birth cohort. |
survey_features |
A dataframe containing information about the binned age groups and sample sizes for each. It should contain columns: 'age_min', 'age_max' and 'n_sample'.
The resulting age intervals are closed to the left |
... |
Additional parameters for construct_A_fun |
A dataframe with simulated serosurvey data, including age group information, overall sample sizes, the number of seropositive individuals, and other survey features.
foi_df_time <- data.frame(
  year = seq(1946, 2025, 1),
  foi = c(rep(0, 40), rep(1, 40))
)

foi_df_age <- data.frame(
  age = 1:80,
  foi = 2 * dlnorm(1:80, meanlog = 3.5, sdlog = 0.5)
)

# generate age and time dependent FoI from multipliers
foi_age_time <- expand.grid(
  year = foi_df_time$year,
  age = foi_df_age$age
) |>
  dplyr::left_join(foi_df_age, by = "age") |>
  dplyr::rename(foi_age = foi) |>
  dplyr::left_join(foi_df_time, by = "year") |>
  dplyr::rename(foi_time = foi) |>
  dplyr::mutate(foi = foi_age * foi_time) |>
  dplyr::select(-c("foi_age", "foi_time"))

# create survey features for simulating
max_age <- 80
n_sample <- 50
survey_features <- data.frame(
  age_min = seq(1, max_age, 5),
  age_max = seq(5, max_age, 5)
) |>
  dplyr::mutate(n_sample = rep(n_sample, length(age_min)))

# simulate survey from age and time FoI
serosurvey <- simulate_serosurvey(
  model = "age-time",
  foi = foi_age_time,
  survey_features = survey_features
)
This function generates binned serosurvey data based on a time-varying FoI model, optionally including seroreversion. It allows serosurveys to be constructed with binned age groups, and it uses multinomial sampling to generate uncertainty in how the sample size is distributed across ages within an age bin.
simulate_serosurvey_time(foi, survey_features, seroreversion_rate = 0)
foi |
A dataframe containing the FoI values. For time-varying models the columns should be: year, foi.
For age-varying models the columns should be: age, foi.
For age-and-time-varying models the columns should be: age, year, foi. |
survey_features |
A dataframe containing information about the binned age groups and sample sizes for each. It should contain columns: age_min, age_max, n_sample.
The resulting age intervals are closed to the left |
seroreversion_rate |
A non-negative value determining the rate of seroreversion (per year). Default is 0. |
A dataframe with simulated serosurvey data, including age group information, overall sample sizes, the number of seropositive individuals, and other survey features.
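The two sampling steps described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: it assumes no seroreversion, a survey conducted right after the last FoI year, and equal probability for every single-year age within a bin.

```r
set.seed(1)

# Time-varying FoI, one value per calendar year
foi <- data.frame(year = 1990:2009, foi = rep(0.1, 20))

# Someone aged `a` at the survey has lived through the last `a` years,
# so their probability of being seropositive depends on the summed FoI
ages <- 1:20
prob_seropositive <- sapply(ages, function(a) {
  1 - exp(-sum(tail(foi$foi, a)))
})

# One age bin covering ages 3 to 14 with 2000 samples:
# first spread the samples over single-year ages (multinomial step) ...
bin_ages <- 3:14
n_by_age <- as.vector(
  rmultinom(1, size = 2000, prob = rep(1, length(bin_ages)))
)

# ... then draw the number of seropositives at each age (binomial step)
n_pos_by_age <- rbinom(
  length(bin_ages),
  size = n_by_age,
  prob = prob_seropositive[bin_ages]
)

# Aggregate back to the bin level, as reported in a binned serosurvey
n_seropositive <- sum(n_pos_by_age)
```

The multinomial step is what makes two simulated surveys with the same bin and sample size differ in their age composition, and hence in their expected seroprevalence.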
# specify FOIs for each year
foi_df <- data.frame(
  year = seq(1990, 2009, 1),
  foi = rnorm(20, 0.1, 0.01)
)

survey_features <- data.frame(
  age_min = c(1, 3, 15),
  age_max = c(2, 14, 20),
  n_sample = c(1000, 2000, 1500)
)

serosurvey <- simulate_serosurvey_time(
  foi_df,
  survey_features
)
Summarise central estimate
summarise_central_estimate(
  seromodel,
  serosurvey,
  alpha,
  par_name = "seroreversion_rate",
  central_estimate_digits = 2
)
seromodel |
stan_fit object obtained from sampling a model with fit_seromodel |
serosurvey |
|
alpha |
1 - alpha indicates the credibility level to be used |
par_name |
String specifying the parameter to be extracted
from |
central_estimate_digits |
Number of central estimate digits |
Text summarising specified central estimate
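As a hedged sketch of what such a text summary involves, the snippet below turns a vector of posterior draws into a central estimate with a `1 - alpha` credible interval. The draws are simulated stand-ins rather than model output, and both the use of the median as the central estimate and the layout of the resulting text are assumptions made for illustration.

```r
set.seed(42)

# Stand-in for posterior draws of a single parameter (not real model output)
draws <- rbeta(4000, 2, 50)

alpha <- 0.05
central_estimate_digits <- 2

# Median and equal-tailed (1 - alpha) credible interval
qs <- quantile(
  draws,
  probs = c(0.5, alpha / 2, 1 - alpha / 2),
  names = FALSE
)

# Round and format to a fixed number of digits
qs_text <- format(
  round(qs, central_estimate_digits),
  nsmall = central_estimate_digits
)

# Assumed layout: "estimate (lower - upper)"
summary_text <- sprintf("%s (%s - %s)", qs_text[1], qs_text[2], qs_text[3])
```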
data(veev2012)
seromodel <- fit_seromodel(veev2012, iter = 100)
summarise_central_estimate(
  seromodel,
  veev2012,
  alpha = 0.05,
  par_name = "foi"
)
Extract specified loo estimate
summarise_loo_estimate(
  seromodel,
  par_loo_estimate = "elpd_loo",
  loo_estimate_digits = 2
)
seromodel |
stan_fit object obtained from sampling a model with fit_seromodel |
par_loo_estimate |
Name of the loo estimate to be extracted. Available options are:
For additional information refer to loo. |
loo_estimate_digits |
Number of loo estimate digits |
Text summarising specified loo estimate
data(veev2012)
seromodel <- fit_seromodel(veev2012, iter = 100)
summarise_loo_estimate(seromodel)
Summarise specified model
summarise_seromodel(
  seromodel,
  serosurvey,
  alpha = 0.05,
  par_loo_estimate = "elpd_loo",
  loo_estimate_digits = 1,
  central_estimate_digits = 2,
  rhat_digits = 2
)
seromodel |
stan_fit object obtained from sampling a model with fit_seromodel |
serosurvey |
|
alpha |
1 - alpha indicates the credibility level to be used |
par_loo_estimate |
Name of the loo estimate to be extracted. Available options are:
For additional information refer to loo. |
loo_estimate_digits |
Number of loo estimate digits |
central_estimate_digits |
Number of central estimate digits |
rhat_digits |
Number of rhat estimate digits |
A list summarising the specified model
model_name
Name of the model
elpd
elpd and its standard deviation
foi
Estimated foi with credible interval (for 'constant' model)
foi_rhat
foi rhat value (for 'constant' model)
seroreversion_rate
Estimated seroreversion rate
seroreversion_rate_rhat
Seroreversion rate rhat value
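For readers unfamiliar with the rhat values reported in this summary, the sketch below computes the classic potential scale reduction factor from a matrix of draws with one column per chain. It is only an illustration: the draws are simulated, `rhat_classic` is a hypothetical helper, and Stan computes a more refined variant, so the values it reports may differ slightly.

```r
set.seed(123)

# Stand-in for posterior draws: 500 iterations for each of 4 chains
chains <- matrix(rnorm(4 * 500, mean = 0.1, sd = 0.01), ncol = 4)

# Classic Gelman-Rubin potential scale reduction factor
rhat_classic <- function(chains) {
  n <- nrow(chains)
  chain_means <- colMeans(chains)
  B <- n * var(chain_means)           # between-chain variance
  W <- mean(apply(chains, 2, var))    # within-chain variance
  var_plus <- (n - 1) / n * W + B / n # pooled variance estimate
  sqrt(var_plus / W)
}

rhat_digits <- 2
rhat <- round(rhat_classic(chains), rhat_digits)
```

Values close to 1 suggest that the chains agree with each other, while noticeably larger values indicate that they have not mixed.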
data(veev2012)
seromodel <- fit_seromodel(veev2012, iter = 100)
summarise_seromodel(seromodel, veev2012)
Dataset measuring the seroprevalence of IgG antibodies against VEEV in a rural village in Panamá in 2012 [Carrera2020].
data(veev2012)
veev2012
A <data.frame> with 4 rows and 5 columns:
Year in which the serosurvey was conducted
Number of collected samples per age group
Number of positive samples per age group
Age group minimal age
Age group maximal age
data(veev2012)