Lab Repositories

Reproductive Number Estimation

We have recently created a new R package (WhiteLabRt) that improves on the performance of the methods proposed in Zhou et al. and Li et al. The package is available on CRAN and uses C and Stan to reduce computation time.

These functions perform the following tasks:

mlTransEpi: Using sequencing data to infer transmission dynamics

We created an R package, which is under development here. Its key features are described below.

Estimation of pairwise transmission probabilities

Using a machine learning approach, we make use of SNP distances and other metadata (e.g. individual-level attributes, spatial distance between potential infectors and infectees, differences in diagnosis dates, etc.) to estimate pairwise transmission probabilities. The method does not require a strict SNP cut-off and allows for quantification of uncertainty. The method was published in the International Journal of Epidemiology.

Estimation of odds ratios

There is interest in understanding which factors are most or least associated with transmission of an infectious disease. We describe a method for estimating the corresponding odds ratios in a paper published in Epidemiology.
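For readers less familiar with the quantity being estimated, the following is a minimal sketch of a plain odds ratio with a Wald (Woolf) confidence interval from a 2x2 table. It is an illustration only and is not the method from the paper; the function name and the example counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    Hypothetical layout, for illustration only:
        a = pairs with the factor that are linked by transmission
        b = pairs with the factor that are not linked
        c = pairs without the factor that are linked
        d = pairs without the factor that are not linked
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's formula).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts, not data from any study.
estimate, lower, upper = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {estimate:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

An odds ratio above 1 indicates that the factor is more common among linked pairs. This sketch assumes that transmission links are known, which is rarely the case in practice.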

Statistical inference with data from respondent driven samples

Respondent-driven sampling is a method that samples through social networks to reach hard-to-reach populations. Data collected in this way are highly correlated, so typical statistical methods do not apply. We have developed a bivariate test of association for these data that was published in the Journal of the Royal Statistical Society, with code for implementing this method available here.

Tuberculosis methods

Below are some of our methods developed to assist in understanding TB epidemiology.

TB-STATIS: Measures of tuberculosis disease severity

Tuberculosis is the leading cause of death due to infectious disease globally, and antibiotic regimens can take months to years to treat the disease successfully. There is growing interest in treatment-shortening regimens for individuals with less severe disease. With that in mind, we developed a rigorous statistical approach to estimate TB disease severity using data available at diagnosis and event-based modeling (popular in the study of cognitive decline). Our software is available here and the paper is under review.

imputeTBculture: methods to handle missing data in serially collected culture samples

In clinical trials and cohort studies of TB, it is common to collect culture samples at multiple time points during follow-up. Missing data are common in these studies, and there has been no well-established approach to handling them. We have published a paper in BMC Medical Research Methodology describing best practices for this problem and created code for practitioners to use.

Substance Use Disorder Research

Building on our initial effort to understand the prevalence of Opioid Use Disorder using capture-recapture methods, described in a highly cited paper in the American Journal of Public Health, we have worked to improve methods for prevalence estimation in this hard-to-reach population. We use multiple systems data methods to estimate prevalence, with the aim of understanding the size of the population not captured by any data source.
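To make the capture-recapture idea concrete, here is a minimal sketch of the simplest two-source case, using Chapman's bias-corrected form of the Lincoln-Petersen estimator. It assumes two independent data sources and a closed population. The multiple systems methods referred to above relax these assumptions and are not reproduced here; the function names and counts are hypothetical.

```python
def chapman_estimate(n1, n2, m):
    """Chapman's two-source capture-recapture estimate of population size.

    n1 -- number of individuals appearing in source 1
    n2 -- number of individuals appearing in source 2
    m  -- number of individuals appearing in both sources
    Assumes independent sources and a closed population (illustration only).
    """
    if m < 0 or m > min(n1, n2):
        raise ValueError("overlap m must lie between 0 and min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def unobserved_count(n1, n2, m):
    """Estimated number of individuals not captured by either source."""
    observed = n1 + n2 - m  # distinct individuals seen at least once
    return chapman_estimate(n1, n2, m) - observed


# Made-up counts, not data from any study.
total = chapman_estimate(200, 150, 30)
hidden = unobserved_count(200, 150, 30)
print(f"estimated population: {total:.0f}, not captured: {hidden:.0f}")
```

The smaller the overlap between sources relative to their sizes, the larger the estimated uncaptured population.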

Spatially granular prevalence estimates

We have developed an approach inspired by capture-recapture methods to perform small area estimation of prevalence. This fully Bayesian approach can estimate prevalence in the presence of sparse data by leveraging spatial correlation between areas. Code for this method is found here and the paper describing it is under review.