
nimbleEcology package: ecological statistics driven by NIMBLE

perrydv edited this page Mar 17, 2017 · 10 revisions

** Background

A great deal of statistical analysis in ecology uses hierarchical models written in BUGS code, from which MCMC is run in WinBUGS or JAGS. These include models for capture-recapture, species occupancy of sites, spatial abundance accounting for imperfect detection and specific sampling methods such as distance sampling, and spatial capture-recapture from camera trap arrays. Most of these models have latent states, often discrete latent states.

The R package nimble provides a newer option for running MCMC and other algorithms on models written in BUGS code. NIMBLE provides a general hierarchical modeling programming system within R, including a compiler that generates and compiles model- and algorithm-specific C++. NIMBLE's MCMC system allows users to write new samplers and to customize the sampler configuration for specific models. One can also program methods for model selection, model validation, or other goals. Finally, NIMBLE extends the BUGS language in ways that allow more flexible model construction, including definition of multiple alternative models from the same code. Thus NIMBLE provides an appealing engine for analyses of the kinds of hierarchical models used widely in ecology.

A paper by Turek et al. (available on arXiv) illustrates the flexibility of NIMBLE for capture-recapture models by embedding hidden Markov models (HMMs) explicitly in BUGS code. This relies on a NIMBLE extension to the BUGS language: new distributions and functions can be written in R (as nimbleFunctions), used in BUGS code, and automatically compiled via C++. The paper reports performance improvements over JAGS ranging from minor to multiple orders of magnitude.
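To make the idea of embedding an HMM in BUGS code concrete, here is a minimal sketch of a user-defined distribution written as a nimbleFunction. It is not the code from the paper. The name `dCJSsketch` and its parameterization (constant survival `phi`, constant detection `p`, first capture at occasion 1, at least two occasions) are assumptions chosen for illustration. The function sums over the latent alive/dead state with a forward recursion, so the model needs no discrete latent nodes.

```r
library(nimble)

# Hypothetical distribution (illustration only): probability of one capture
# history x (1 = seen, 0 = not seen), conditional on first capture at
# occasion 1, with constant survival phi and constant detection p.
dCJSsketch <- nimbleFunction(
  run = function(x = double(1), phi = double(0), p = double(0),
                 log = integer(0, default = 0)) {
    returnType(double(0))
    nOcc <- length(x)
    pAlive <- 1   # P(alive | observations so far); alive at first capture
    logLik <- 0
    for (t in 2:nOcc) {
      pAliveNow <- phi * pAlive          # advance the latent state one occasion
      if (x[t] == 1) {
        pObs <- pAliveNow * p            # seen: alive and detected
        pAlive <- 1
      } else {
        # not seen: either alive but missed, or dead
        pObs <- pAliveNow * (1 - p) + (1 - pAliveNow)
        pAlive <- pAliveNow * (1 - p) / pObs
      }
      logLik <- logLik + log(pObs)
    }
    if (log) return(logLik)
    return(exp(logLik))
  }
)

# How such a distribution might appear in BUGS code (nInd and nOcc are
# assumed constants; y is an assumed nInd x nOcc matrix of capture histories).
cjsCode <- nimbleCode({
  phi ~ dunif(0, 1)
  p ~ dunif(0, 1)
  for (i in 1:nInd) {
    y[i, 1:nOcc] ~ dCJSsketch(phi, p)
  }
})

# The density can be checked directly in R before any model is built.
probSeenMissedSeen <- dCJSsketch(c(1, 0, 1), phi = 0.8, p = 0.5)
```

For the history `c(1, 0, 1)` the only possible path is survive, be missed, survive, be detected, so the value should equal `0.8 * 0.5 * 0.8 * 0.5`. Depending on the NIMBLE version, a matching simulation function (here it would be `rCJSsketch`) may also be needed before the distribution can be used in a model.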

This project would build a new R package that provides high-level user interfaces to many kinds of ecological models and implements the analyses internally using NIMBLE.

** Related work

There are numerous related packages, but in various ways they lack the flexibility and generality we envision. For capture-recapture, related packages include MARK (with R interface RMark), R package marked, Capture, and E-SURGE. For occupancy and abundance modeling, related packages include PRESENCE and R package unmarked. For spatial capture-recapture, related packages include oSCR and R package secr.

The limitations of these packages are illustrated by the fact that textbooks nevertheless introduce ecologists to writing these models in BUGS code for its flexibility (e.g. King et al. 2009, Royle and Dorazio 2008, Kery and Schaub 2011, [Royle and Kery 2015](https://www.mbr-pwrc.usgs.gov/pubanalysis/keryroylebook/)). This motivates our goal of providing more efficient and more extensive tools for automating analysis of these models based on BUGS code. NIMBLE is more flexible for writing models and algorithms and generally faster for MCMC than other packages.

** Details of your coding project

As currently envisioned, the coding project would involve roughly the following steps for each kind of model:

  1. Writing a function to process high-level model formulae customized to each kind of model (similar in principle to lm or glm, etc.). Such a function would have to process the code of a formula and expand variables into format(s) such as a model matrix for use in models created by NIMBLE from BUGS code. It would manage the construction of a NIMBLE model object. It would return the model object or, optionally, pass it directly on to an analysis function.

  2. Writing one or more quite general BUGS code files to be used by step 1. The BUGS code must accommodate the full range of models available from model formulae in the high-level specifications of step 1. NIMBLE's extensions to the BUGS language make this easier by allowing conditional statements that are evaluated each time the BUGS code is processed to make a new model.

  3. Writing functions to drive general analyses, such as MCMC, for models created by the functions of step 1. This would use NIMBLE's MCMC configuration and customization system to assign good sampler choices to particular kinds of models.

  4. Writing appropriate classes and utility functions to manage models and analysis results.

  5. Writing documentation, unit tests, and a vignette on the whole package.
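Steps 1 to 3 above can be sketched on a deliberately simple stand-in model, a Poisson count regression. This is not the planned package design. The function names `buildCountModel` and `fitCountModel`, the `overdispersed` option, and the priors are all hypothetical, and the formula is assumed to contain an intercept plus at least one covariate. The sketch shows a formula expanded to a model matrix, one BUGS code object whose `if` is resolved when the model is built, and a customized sampler assignment.

```r
library(nimble)

# Step 2 (sketch): one general BUGS code object. The `if` is evaluated when
# nimbleModel() processes the code, so it yields two alternative models.
countCode <- nimbleCode({
  for (k in 1:nBeta) {
    beta[k] ~ dnorm(0, sd = 10)
  }
  if (overdispersed) {
    sigma ~ dunif(0, 5)
    for (i in 1:n) {
      eps[i] ~ dnorm(0, sd = sigma)
      log(lambda[i]) <- inprod(X[i, 1:nBeta], beta[1:nBeta]) + eps[i]
    }
  } else {
    for (i in 1:n) {
      log(lambda[i]) <- inprod(X[i, 1:nBeta], beta[1:nBeta])
    }
  }
  for (i in 1:n) {
    y[i] ~ dpois(lambda[i])
  }
})

# Step 1 (sketch): hypothetical formula front end that builds a NIMBLE model.
buildCountModel <- function(formula, data, overdispersed = FALSE) {
  mf <- model.frame(formula, data)
  mm <- model.matrix(formula, mf)
  X <- matrix(as.vector(mm), nrow = nrow(mm))   # plain matrix, no attributes
  y <- as.numeric(model.response(mf))
  inits <- list(beta = rep(0, ncol(X)))
  if (overdispersed) {
    inits$sigma <- 1
    inits$eps <- rep(0, nrow(X))
  }
  # `overdispersed` is found in this calling frame when the code is processed
  nimbleModel(countCode,
              constants = list(n = nrow(X), nBeta = ncol(X), X = X),
              data = list(y = y), inits = inits)
}

# Step 3 (sketch): hypothetical analysis driver with a customized sampler.
fitCountModel <- function(model, niter = 2000, nburnin = 500) {
  conf <- configureMCMC(model, monitors = "beta")
  conf$removeSamplers("beta")
  conf$addSampler(target = "beta", type = "RW_block")  # sample coefficients jointly
  mcmc <- buildMCMC(conf)
  compileNimble(model)
  cMcmc <- compileNimble(mcmc, project = model)
  runMCMC(cMcmc, niter = niter, nburnin = nburnin)
}

# Usage on simulated data (values are arbitrary).
set.seed(1)
sites <- data.frame(elev = rnorm(50))
sites$count <- rpois(50, exp(0.5 + 0.8 * sites$elev))
model <- buildCountModel(count ~ elev, sites)
samples <- fitCountModel(model)
colMeans(samples)
```

Keeping the formula handling, the BUGS code, and the MCMC driver in separate pieces mirrors the step structure above: the same `countCode` serves both model variants, and the sampler choice can change without touching the model code.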

** Expected impact

The new package will make it easier for researchers to run MCMC and related methods for many ecological statistical models, and to do so more efficiently than in other packages (in terms of both human time and computer time).

** Mentors

Perry de Valpine and Daniel Turek

** Tests

Since we have identified a student for this project, we do not invite completion of tests by others.
