nimbleEcology package: ecological statistics driven by NIMBLE
A great deal of statistical analysis in ecology uses hierarchical models written in BUGS code, from which MCMC is run in WinBUGS or JAGS. These include models for capture-recapture, species occupancy of sites, spatial abundance accounting for imperfect detection and specific sampling methods such as distance sampling, and spatial capture-recapture from camera trap arrays. Most of these models have latent states, often discrete latent states.
The R package [nimble](https://cran.r-project.org/web/packages/nimble/index.html) provides a newer option for running MCMC and other algorithms on models written in BUGS code. [NIMBLE](http://r-nimble.org) provides a general hierarchical modeling programming system within R, including a compiler that generates and compiles model- and algorithm-specific C++. NIMBLE’s MCMC system allows writing new samplers and customizing the sampler configuration for specific models. One can also program methods for model selection, model validation, or other goals. Finally, NIMBLE extends the BUGS language in ways that allow more flexible model construction, including definition of multiple alternative models from the same code. Thus NIMBLE provides an appealing engine for analyses of the kinds of hierarchical models used widely in ecology.
A paper by [Turek et al.](https://link.springer.com/article/10.1007/s10651-016-0353-z) (available on arXiv [here](https://arxiv.org/abs/1601.02698)) illustrates the flexibility of NIMBLE for capture-recapture models by embedding hidden Markov models (HMMs) explicitly in BUGS code. This relies on NIMBLE’s extension to the BUGS language that allows new distributions and functions to be written in R (as nimbleFunctions), used in BUGS code, and automatically compiled via C++. The paper reports performance improvements over JAGS ranging from minor to multiple orders of magnitude.
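To make the HMM idea concrete, here is a minimal sketch in plain R of the forward algorithm for a two-state (alive/dead) Cormack-Jolly-Seber capture history, which sums over the latent states instead of sampling them. It is not code from the paper: the function name `cjs_loglik`, the constant survival probability `phi`, and the constant detection probability `p` are assumptions made for illustration, and in NIMBLE the same calculation would be written as a nimbleFunction so it can be used as a distribution in BUGS code.

```r
# Hypothetical helper (not from the paper or from nimble): log-likelihood of
# one capture history under a CJS model, marginalizing over alive/dead states.
# y:   0/1 vector of detections, conditioned on first capture (y[1] must be 1)
# phi: survival probability between occasions (assumed constant)
# p:   detection probability given alive (assumed constant)
cjs_loglik <- function(y, phi, p) {
  stopifnot(length(y) >= 2, y[1] == 1)

  # State probabilities (alive, dead) at the first occasion: known alive.
  alpha <- c(1, 0)

  # Transition matrix; rows are the current state, columns the next state.
  trans <- matrix(c(phi, 1 - phi,
                    0,   1),
                  nrow = 2, byrow = TRUE)

  loglik <- 0
  for (t in 2:length(y)) {
    # Propagate the state probabilities one occasion forward.
    pred <- as.vector(alpha %*% trans)

    # Probability of the observation at occasion t given each state.
    obs <- if (y[t] == 1) c(p, 0) else c(1 - p, 1)

    # Joint probability of state and observation, then its marginal.
    joint <- pred * obs
    lik_t <- sum(joint)

    loglik <- loglik + log(lik_t)
    alpha <- joint / lik_t  # renormalize to avoid numerical underflow
  }
  loglik
}

# Example: seen, missed, seen again over three occasions.
ll <- cjs_loglik(c(1, 0, 1), phi = 0.8, p = 0.5)
```

Because the discrete latent states are integrated out, an MCMC built on such a likelihood only has to sample `phi` and `p`, which is one plausible source of the speedups described above.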
This project would build a new R package that provides high-level user interfaces to many kinds of ecological models and implements the analyses internally using NIMBLE.
There are numerous related packages, but in various ways they lack the flexibility and generality we envision. For capture-recapture, related packages include [Mark](http://warnercnr.colostate.edu/~gwhite/mark/mark.htm) (with R interface [RMark](https://cran.r-project.org/web/packages/RMark/index.html)), R package [marked](https://cran.r-project.org/web/packages/marked/index.html), [Capture](https://www.mbr-pwrc.usgs.gov/software/capture.html), and [E-SURGE](http://www.cefe.cnrs.fr/fr/recherche/bc/bbp/1045-desc/264-logiciels). For occupancy and abundance modeling, related packages include [PRESENCE](https://www.mbr-pwrc.usgs.gov/software/presence.shtml) and R package [unmarked](https://cran.r-project.org/web/packages/unmarked/index.html). For spatial capture-recapture, related packages include [oSCR](https://sites.google.com/site/spatialcapturerecapture/oscr-package) and R package [secr](https://cran.r-project.org/web/packages/secr/index.html).
The limitations of these packages are illustrated by the fact that textbooks nevertheless introduce ecologists to writing these models in BUGS code for its flexibility (e.g. [King et al. 2009](https://www.crcpress.com/Bayesian-Analysis-for-Population-Ecology/King-Morgan-Gimenez-Brooks/p/book/9781439811870), [Royle and Dorazio 2008](http://www.sciencedirect.com/science/book/9780123740977), [Kery and Schaub 2011](http://www.sciencedirect.com/science/book/9780123870209), [Royle and Kery 2015](https://www.mbr-pwrc.usgs.gov/pubanalysis/keryroylebook/)). This motivates our goal of providing more efficient and more extensive tools for automating analysis of these models based on BUGS code. NIMBLE is more flexible for writing models and algorithms and generally faster for MCMC than other packages.
As currently envisioned, the coding project would involve roughly the following steps for each kind of model:
1. Writing a function to process high-level model formulae customized to each kind of model (similar in principle to lm or glm, etc.). Such a function would have to process the code of a formula and expand variables into format(s) such as a model matrix for use in models created by NIMBLE from BUGS code. It would manage the construction of a NIMBLE model object. It would return the model object or, optionally, pass it directly on to an analysis function.
2. Writing one or more quite general BUGS code files to be used by step 1. The BUGS code must accommodate the full range of models available from model formulae in the high-level specifications of step 1. NIMBLE’s extensions to the BUGS language make this easier by allowing conditional statements that are evaluated each time the BUGS code is processed to make a new model.
3. Writing functions to drive general analyses, such as MCMC, for models created by the functions of step 1. This would use NIMBLE’s MCMC configuration and customization system to assign good sampler choices to particular kinds of models.
4. Writing appropriate classes and utility functions to manage models and analysis results.
5. Writing documentation, unit tests, and a vignette on the whole package.
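As a rough illustration of the formula-processing step in the list above, the following base R sketch expands a one-sided formula into a model matrix and bundles it with its dimensions. The helper name `build_design`, the example covariates, and the shape of the returned list are all hypothetical; the idea is only that such a list could later be supplied as the `constants` of a NIMBLE model built from general BUGS code.

```r
# Hypothetical helper (not part of nimble): turn a one-sided formula and a
# data frame of covariates into a design matrix plus its dimensions.
build_design <- function(formula, data) {
  stopifnot(inherits(formula, "formula"), is.data.frame(data))

  # Same pattern lm() uses internally: build a model frame, then expand its
  # terms (factors, interactions, transformations) into a numeric matrix.
  mf <- model.frame(formula, data = data)
  X <- model.matrix(attr(mf, "terms"), data = mf)

  list(
    constants  = list(X = X, n = nrow(X), k = ncol(X)),
    coef_names = colnames(X)
  )
}

# Illustrative site-level covariates (made-up values).
sites <- data.frame(
  elevation = c(120, 340, 560, 780),
  habitat   = factor(c("forest", "meadow", "forest", "meadow"))
)

design <- build_design(~ elevation + habitat, sites)
```

Keeping the number of columns `k` alongside the matrix lets a single generic BUGS file loop over however many coefficients a given formula produces.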
The new package will make it easier for researchers to run MCMC and related methods for many ecological statistical models, and to do so more efficiently than in other packages (in terms of both human time and computer time).
Perry de Valpine and Daniel Turek
Since we have identified a student for this project, we do not invite completion of tests by others.