The prevalence package

The prevalence package provides tools for prevalence assessment studies. Currently, two functions are available to estimate true prevalence from apparent prevalence in a Bayesian framework: truePrev() for individual samples and truePrevPools() for pooled samples.

The Bayesian method for estimating true prevalence from individual samples is also available as an online R/Shiny application.


Download and installation

Version 0.1.0 of the prevalence package is available on CRAN. Source code of the development version is on GitHub.

To install the prevalence package in R, follow these steps:
  1. download and install the latest version of R via cran.r-project.org
  2. download and install JAGS via mcmc-jags.sourceforge.net
  3. download and install package coda: install.packages("coda")
  4. download and install package rjags: install.packages("rjags")
  5. download and install package prevalence: install.packages("prevalence")
Finally, to load prevalence in your R session, simply run library(prevalence)
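
Before loading the package, it can help to confirm that the R-side dependencies from steps 3 to 5 are in place. The following is a minimal sketch in base R; the variable names are illustrative, and it only reports what is missing rather than installing anything. Because rjags can only be loaded once JAGS itself (step 2) is installed, a missing JAGS installation will typically show up here as rjags being reported as unavailable.

```r
## Packages named in the installation steps above
required <- c("coda", "rjags", "prevalence")

## requireNamespace() returns TRUE/FALSE instead of stopping with an error
available <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!available]

if (length(missing_pkgs) > 0) {
  message("Not yet usable: ", paste(missing_pkgs, collapse = ", "))
  ## install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All dependencies found; library(prevalence) should work.")
}
```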

Individual samples: function truePrev()

Implementation

truePrev(x, n, SE = 1, SP = 1, prior = c(1, 1), conf.level = 0.95,
         nchains = 2, burnin = 5000, update = 10000,
         verbose = FALSE, plot = FALSE)


Different distributions are available to specify test sensitivity SE and specificity SP. Distribution parameters must be specified in a named list() as follows:

  Fixed: list(dist = "fixed", par)
    This is the default distribution used when a single numeric value is specified for SE or SP.
  Uniform: list(dist = "uniform", min, max)
  Beta: list(dist = "beta", alpha, beta)
  Beta-PERT: list(dist = "pert", method, a, m, b, k)
    method must be "Classic" or "Vose";
    a denotes the pessimistic (minimum) estimate, m the most likely estimate, and b the optimistic (maximum) estimate;
    k denotes the scale parameter.
    Type ?betaPERT in the R console for more information.
  Beta-Expert: list(dist = "beta-expert", mode, mean, lower, upper, p)
    mode denotes the most likely estimate, mean the mean estimate;
    lower denotes the lower bound, upper the upper bound;
    p denotes the confidence level of the expert.
    Specify either mode or mean, not both; lower and upper can be specified together or alone.
    Type ?betaExpert in the R console for more information.
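
As a concrete illustration of the specifications listed above, each distribution can be written as a plain named list. The numeric values below are placeholders chosen for illustration, not recommendations, and the object names are hypothetical. The value k = 4 is an assumption (the conventional PERT scale), and method = "Classic" follows the spelling given in the text above.

```r
## One example list per supported distribution (illustrative values only)
se_fixed   <- list(dist = "fixed", par = 0.90)
se_uniform <- list(dist = "uniform", min = 0.60, max = 1.00)
se_beta    <- list(dist = "beta", alpha = 9, beta = 1)
se_pert    <- list(dist = "pert", method = "Classic",
                   a = 0.60, m = 0.90, b = 1.00, k = 4)
se_expert  <- list(dist = "beta-expert", mode = 0.90, lower = 0.75, p = 0.95)

## Collect the specifications and list which distribution each one names
specs <- list(se_fixed, se_uniform, se_beta, se_pert, se_expert)
dist_names <- vapply(specs, function(s) s$dist, character(1))
print(dist_names)
```

Any of these lists can be supplied as the SE or SP argument; a bare number such as SE = 0.90 is treated as the fixed case.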

Examples

## Taenia solium cysticercosis in Nepal
## 142 positives out of 742 pigs sampled

## Model SE and SP based on literature data
## Sensitivity ranges uniformly between 60% and 100%
## Specificity ranges uniformly between 75% and 100%
SE <- list(dist = "uniform", min = 0.60, max = 1.00)
SP <- list(dist = "uniform", min = 0.75, max = 1.00)
truePrev(x = 142, n = 742, SE = SE, SP = SP)

## Model SE and SP based on expert opinions
## Sensitivity lies in between 60% and 100%; most likely value is 90%
## Specificity is with 95% confidence larger than 75%; most likely value is 90%
SE <- list(dist = "pert", a = 0.60, m = 0.90, b = 1.00)
SP <- list(dist = "beta-expert", mode = 0.90, lower = 0.75, p = 0.95)
truePrev(x = 142, n = 742, SE = SE, SP = SP)

## Model SE and SP as fixed values (each 90%)
truePrev(x = 142, n = 742, SE = 0.90, SP = 0.90)
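
For comparison with the Bayesian output in the fixed-value case, the classical frequentist correction (the Rogan-Gladen estimator) can be computed in a few lines of base R. This is a sketch for orientation only: it is not what truePrev() computes, the function name rogan_gladen is hypothetical, and the formula assumes SE + SP > 1.

```r
## Rogan-Gladen point estimate of true prevalence
## x: number of positives, n: sample size, se/sp: fixed test characteristics
rogan_gladen <- function(x, n, se, sp) {
  stopifnot(se + sp > 1)
  ap <- x / n                           # apparent prevalence
  tp <- (ap + sp - 1) / (se + sp - 1)   # corrected for misclassification
  min(max(tp, 0), 1)                    # truncate to the [0, 1] range
}

## Same data as the fixed-value example: 142 positives out of 742 pigs
tp_hat <- rogan_gladen(x = 142, n = 742, se = 0.90, sp = 0.90)
round(tp_hat, 3)   # approximately 0.114, versus an apparent prevalence of 0.191
```

Unlike the Bayesian approach, this point estimate does not carry any uncertainty about SE and SP, and it can fall outside [0, 1] before truncation.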


More information

For more information on this function, type ?truePrev in the R console.

An introduction to Frequentist and Bayesian methods for assessing true prevalence from individual samples has been published as a Hints & Kinks paper in the International Journal of Public Health:

Misclassification errors in prevalence estimation: Bayesian handling with care.
Speybroeck N, Devleesschauwer B, Joseph L, Berkvens D


Pooled samples: function truePrevPools()

Implementation

truePrevPools(x, n, SE = 1, SP = 1, prior = c(1, 1), conf.level = 0.95,
              nchains = 2, burnin = 5000, update = 10000,
              verbose = FALSE, plot = FALSE)


Different distributions are available to specify test sensitivity SE and specificity SP. See above for more details.

Note that SE and SP must correspond to the test characteristics for testing individual samples; truePrevPools() will calculate SEpool and SPpool, the sensitivity and specificity for testing pooled samples, based on Boelaert et al., 2000.

Example

## Sandflies in Aurabani, Nepal, 2007
pool_results <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0)
pool_sizes <- c(2, 1, 6, 10, 1, 7, 1, 4, 1, 3)

## Sensitivity ranges uniformly between 60% and 95%
## Specificity is considered to be 100%
SE <- list(dist = "uniform", min = 0.60, max = 0.95)
truePrevPools(x = pool_results, n = pool_sizes, SE = SE, SP = 1)
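
As a rough frequentist cross-check on the same data, a maximum likelihood estimate can be obtained in base R under the simplifying assumption of a perfect test (SE = SP = 1). This is a sketch, not the model fitted by truePrevPools(); the function name pool_loglik is hypothetical. Under this assumption, a pool of size n is positive with probability 1 - (1 - p)^n.

```r
## Sandfly pool data from the example above
pool_results <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0)
pool_sizes   <- c(2, 1, 6, 10, 1, 7, 1, 4, 1, 3)

## Log-likelihood of prevalence p, assuming a perfect test
pool_loglik <- function(p, x, n) {
  p_pos <- 1 - (1 - p)^n   # probability that a pool of size n is positive
  sum(dbinom(x, size = 1, prob = p_pos, log = TRUE))
}

## Maximise over p in (0, 1)
fit <- optimize(function(p) pool_loglik(p, pool_results, pool_sizes),
                interval = c(1e-6, 1 - 1e-6), maximum = TRUE)
p_hat <- fit$maximum
round(p_hat, 3)   # close to 1/36, i.e. about 0.028
```

Here the only positive pool has size 1 and the negative pools contain 35 individuals in total, so the likelihood reduces to p(1 - p)^35 and the estimate equals 1/36. With imperfect sensitivity, as in the example above, the true prevalence compatible with the data would be higher.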


More information

For more information on this function, type ?truePrevPools in the R console.

An overview of Frequentist and Bayesian methods for estimating population prevalence based on pooled samples has been published in Medical and Veterinary Entomology:

Estimating the prevalence of infections in vector populations using pools of samples
Speybroeck N, Williams CJ, Lafia KB, Devleesschauwer B, Berkvens D


This paper provides the following R code: