Projects

Orange tree growth

The orange tree growth data, originally taken from Draper & Smith (1981) and reproduced in Draper & Smith (1998), p. 559, were used by Pinheiro & Bates (2000, Ch. 8.2) to illustrate how a logistic growth curve model with random effects can be fitted with the S-Plus function nlme. The data contain measurements of trunk circumference (mm) made on seven occasions for each of five orange trees. The data are available within R as "datasets::Orange".

The errors within trees are assumed to be normally distributed and independent; the data can be straightforwardly analyzed either by standard nonlinear regression (assuming each tree follows an independent growth curve) or by nonlinear mixed-effects models (allowing the growth parameters to be random variables from an underlying population distribution).
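
As a rough sketch of the growth curve discussed above, the snippet below writes a three-parameter logistic in asymptote / midpoint / scale form and shows one way a tree-specific random effect could enter. The parameter names, the choice to put the tree effect on the asymptote only, and all numeric values are assumptions made for illustration; they are not estimates from the Orange data.

```python
import math

def logistic_growth(age, asym, xmid, scal):
    """Three-parameter logistic curve: asym / (1 + exp(-(age - xmid) / scal))."""
    return asym / (1.0 + math.exp(-(age - xmid) / scal))

def tree_curve(age, asym_pop, xmid, scal, tree_effect=0.0):
    """Mixed-effects flavour (assumed form): each tree's asymptote is the
    population value plus a tree-specific deviation."""
    return logistic_growth(age, asym_pop + tree_effect, xmid, scal)

# Hypothetical parameter values, chosen only to exercise the functions.
half = logistic_growth(700.0, asym=200.0, xmid=700.0, scal=350.0)
shifted = tree_curve(700.0, 200.0, 700.0, 350.0, tree_effect=-20.0)
```

At the midpoint age the curve sits at half its asymptote, so setting tree_effect to zero for every tree recovers the single-curve nonlinear regression, while letting it vary among trees gives the mixed-effects version.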

Project report:: OrangeTree.pdf; sim-bias.pdf; sim-ci.pdf

[Source code]

Mineralization of terbuthylazine

Terbuthylazine is a herbicide used in agriculture. It is a so-called s-triazine, like atrazine, which has been banned in Denmark on suspicion of causing cancer.
Terbuthylazine can be bound to the soil, but free terbuthylazine can be washed into the drinking water. Some bacteria can mineralize it. These data are part of a larger experiment to determine the ability of certain bacteria to mineralize terbuthylazine, and to estimate the mineralization rate.

This is a fairly straightforward nonlinear least-squares problem, with normally distributed residuals and no random effects or latent variables. The deterministic part of the model is the solution to a set of coupled ordinary differential equations (ODEs) for the concentrations in different compartments. Because the ODEs are linear, the deterministic solution can be found directly in terms of a matrix exponential, for which functions exist in all three of ADMB, BUGS, and R. From there it is simply a matter of defining a normal likelihood, or equivalently a least-squares expression, and minimizing it. The main differences appear in the speed and robustness of the matrix exponential formulations in different software tools.
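
The matrix-exponential step can be sketched as follows. This is a pure-Python illustration using a truncated Taylor series with scaling and squaring, not the routine used by ADMB, BUGS, or R; the two-compartment rate matrix and the rate k are hypothetical and are not the model from the project report.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=20):
    """exp(A) by scaling and squaring around a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a)
    # Halve A until its norm is at most 0.5, then square the result back up.
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[x / 2 ** squarings for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def solve_linear_ode(a, x0, t):
    """x(t) = exp(A t) x0 for the linear system dx/dt = A x."""
    e = mat_exp([[x * t for x in row] for row in a])
    return [sum(e[i][j] * x0[j] for j in range(len(x0))) for i in range(len(x0))]

# Hypothetical two-compartment chain: parent compound -> mineralized, rate k.
k = 0.3
A = [[-k, 0.0], [k, 0.0]]
parent, mineralized = solve_linear_ode(A, [1.0, 0.0], t=5.0)
```

For this toy chain the exact solution is known (the parent decays as exp(-k t)), which makes it a convenient check before plugging the solver into a least-squares objective.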

author: Anders Nielsen

Project report:: min.pdf

[Source code]

Fitting N-mixture models with random observer effects

Most protocols for estimating abundance while taking detectability into account require that animals can be individually identified, a condition which often requires capturing and marking them. This is costly, and therefore the N-mixture, or binomial mixture, model of Royle (2004) is an appealing alternative: this model yields estimates of abundance from spatially and temporally replicated counts of unmarked animals alone. Typical applications of the N-mixture model require treating some effects as random, for instance to account for intrinsic differences in the ability of field ornithologists to detect and identify birds.
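
To make the structure of the model concrete, here is a minimal sketch of the marginal likelihood for one site, assuming Poisson-distributed abundance and binomial detection on each visit. The random observer effects are left out, and the truncation bound n_max, the function names, and the example counts are assumptions made for illustration.

```python
import math

def poisson_pmf(n, lam):
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def binomial_pmf(y, n, p):
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)

def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of repeated counts at one site, summing the
    latent abundance N from max(counts) up to a truncation bound n_max."""
    total = 0.0
    for n in range(max(counts), n_max + 1):
        detection = 1.0
        for y in counts:
            detection *= binomial_pmf(y, n, p)
        total += poisson_pmf(n, lam) * detection
    return total

# Hypothetical counts and parameter values, for illustration only.
lik_one_visit = site_likelihood([3], lam=5.0, p=0.4)
lik_three_visits = site_likelihood([3, 2, 4], lam=5.0, p=0.4)
```

With a single visit the sum collapses to a Poisson with mean lam * p, which is why replicated counts are needed to separate abundance from detection probability.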

This example shows worked BUGS and ADMB solutions to estimating an N-mixture model and demonstrates their use on simulated (pseudo-)data. (1) The BUGS and ADMB estimates of the fixed effects and variances are very similar. (2) Differences are more pronounced in the estimates of the random effects. The posterior means from WinBUGS were more accurate than the ADMB estimates (RMSE = 0.22 vs 0.43). (3) The BUGS estimates of the random effects were more precise than the ADMB estimates. On average, the posterior standard deviations obtained by BUGS were 47% smaller than the standard deviations estimated by ADMB. (4) ADMB is much faster (4 vs. 12 minutes per estimate).

authors: Richard Chandler, Marc Kery, and Hans Skaug

Project report:: nmix.pdf

[Source code]

Owl nestling negotiation

The data for this example, taken from Zuur et al. (2009) and ultimately from Roulin and Bersier (2007), quantify the number of vocalizations (sibling negotiations) by owl chicks in different nests as a function of food treatment (deprived or satiated), the sex of the parent, and arrival time of the parent at the nest.

This problem is basically a zero-inflated generalized linear mixed model, where numbers of negotiations are the response variable, food treatment/arrival time/parental sex are the fixed-effect predictors, and sites are a random effect. The presence of zero-inflation puts the problem beyond standard GLMM implementations. In R, the MCMCglmm package allows for zero-inflation, or one can implement an expectation-maximization function. The problem is relatively straightforward in JAGS, or in ADMB, and one can also use the glmmADMB package in R.
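
The zero-inflation component can be sketched as a mixture of a point mass at zero and a count distribution. The snippet below assumes a Poisson count distribution, which the text above does not specify, and omits the fixed and random effects; the names mu and pi_zero and the numeric values are illustrative.

```python
import math

def zip_pmf(y, mu, pi_zero):
    """Zero-inflated Poisson: with probability pi_zero the count is a
    structural zero, otherwise it is Poisson with mean mu."""
    poisson = math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))
    if y == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson

def zip_negloglik(counts, mu, pi_zero):
    """Negative log-likelihood of independent counts with a common mean."""
    return -sum(math.log(zip_pmf(y, mu, pi_zero)) for y in counts)

# Hypothetical parameter values, for illustration only.
p_zero = zip_pmf(0, mu=4.0, pi_zero=0.25)
total_mass = sum(zip_pmf(y, 4.0, 0.25) for y in range(100))
```

In a mixed model, mu would be replaced by the exponential of a linear predictor containing the fixed effects plus a random effect, which is the part that standard GLMM software handles and the mixture does not.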

authors: Ben Bolker, Mollie Brooks, Beth Gardner, Cleridy Lennert, Mihoko Minami

Project report:: owls.pdf

[Source code]

Skate mortality: Bayesian state-space model

The goal of the model was to obtain decadal mortality estimates of three different size classes of winter skates (Leucoraja ocellata) on the eastern Scotian Shelf. The time series are largely non-informative for several of the model parameters (catchability, recruitment rate, and stage transition probability), so informative Bayesian priors are used.

The model described here is a Bayesian state-space model implemented in both JAGS and AD Model Builder. The model and alternative formulations are fully described in Swain et al. (2009).

authors: Trevor Davies and Steve Martell

Project report:: skate.pdf

[Source code]

Tadpole mortality as a function of size

The data are originally from Vonesh and Bolker (2005), describing the numbers of reed frog (Hyperolius spinigularis) tadpoles killed by predators as a function of size in a small-scale field trial. Our main interest is in a quantitative description of the "window of vulnerability", defined as the unimodal pattern of proportion killed as a function of size. In various contexts, we can use this description either to describe and test differences among treatments (e.g., does the window of vulnerability differ by predator size, or with tadpoles exposed to different predator cues?) or to project the effects of growth and mortality rates through a life stage. See the reference above and McCoy et al. (2011) for more details and examples.

This basic example is essentially a maximum likelihood estimation problem with a binomial response variable. The data set is small, there are no random effects or latent variables, and the problem is low-dimensional, with only a single predictor and a single response variable and only three parameters in the statistical model used.
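
A sketch of what such a likelihood can look like is given below. The three-parameter power-Ricker curve is an assumed functional form for the unimodal window of vulnerability, chosen here because it has a single peak; the counts are made up to exercise the code and are not the field data.

```python
import math

def prob_killed(size, c, d, g):
    """Assumed power-Ricker curve: rises to a peak of height c at
    size == d and declines on either side, more sharply for larger g."""
    return c * ((size / d) * math.exp(1.0 - size / d)) ** g

def binomial_negloglik(params, sizes, killed, exposed):
    """Negative log-likelihood of binomial kill counts given the curve."""
    c, d, g = params
    nll = 0.0
    for s, k, n in zip(sizes, killed, exposed):
        p = prob_killed(s, c, d, g)
        nll -= (math.log(math.comb(n, k)) + k * math.log(p)
                + (n - k) * math.log(1.0 - p))
    return nll

# Made-up counts for illustration only; these are not the field data.
sizes = [5.0, 10.0, 15.0, 20.0]
killed = [1, 4, 3, 1]
exposed = [10, 10, 10, 10]
nll = binomial_negloglik((0.4, 12.0, 5.0), sizes, killed, exposed)
```

Passing binomial_negloglik to any general-purpose minimizer over (c, d, g) would give maximum likelihood estimates, provided c stays strictly between 0 and 1.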

author: Ben Bolker

Project report:: tadpole.pdf

[Source code]

Theta-logistic population growth model

The example is a theta-logistic nonlinear state-space population model. The population size is modelled as a nonlinear function of its previous size, with a discrete-time theta-logistic process model: N(t+1)=theta-logistic(N(t)) plus a normally distributed process error, and the observation error is also normally distributed. This example uses simulated data from the same model to test it. More details are available in Pedersen et al. (2011).
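
The simulation step might be sketched as follows. The process equation is written on the log scale, which is an assumption about the exact form; the function names, parameter values, and seed are likewise illustrative.

```python
import math
import random

def theta_logistic_step(log_n, r0, k, theta):
    """Deterministic one-step update of log population size (assumed form)."""
    return log_n + r0 * (1.0 - (math.exp(log_n) / k) ** theta)

def simulate(n_steps, log_n0, r0, k, theta, sd_process, sd_obs, seed=1):
    """Simulate latent log abundances and noisy observations of them."""
    rng = random.Random(seed)
    states, observations = [], []
    log_n = log_n0
    for _ in range(n_steps):
        # Process model: theta-logistic update plus normal process error.
        log_n = theta_logistic_step(log_n, r0, k, theta) + rng.gauss(0.0, sd_process)
        states.append(log_n)
        # Observation model: latent state plus normal observation error.
        observations.append(log_n + rng.gauss(0.0, sd_obs))
    return states, observations

# Hypothetical settings, for illustration only.
states, observations = simulate(200, math.log(100.0), r0=0.5, k=1000.0,
                                theta=0.5, sd_process=0.05, sd_obs=0.1)
equilibrium = theta_logistic_step(math.log(1000.0), 0.5, 1000.0, 0.5)
```

Only the observations would be passed to the fitting software; the latent states are what the state-space model has to integrate out or sample.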

AD Model Builder is fastest, but requires the most code.
JAGS is slower, but not too bad for this relatively simple problem (and produces much wider credible/confidence intervals).
A hidden Markov model can be implemented in R, but takes some effort and is quite slow.

author: Casper W. Berg

Project report:: theta.pdf

[Source code]

Weeds: Modeling weed density over time

The goal of this problem is to model weed density from 12 years of data in the form of an S-shaped curve. The data are simply 12 densities at equispaced index times. The suggested model was a three-parameter logistic function, though an extension to estimate the variance around the model is also of interest.

The problem is relatively difficult, especially in its original presentation, as it is badly scaled and there are nearly flat areas of the sum of squares or likelihood surface. Moreover, the Hessian at the solution is effectively singular, so methods based on Newton's iteration do rather badly, while crude approaches such as Nelder-Mead may do better if they can be scaled appropriately.
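
The scaling issue can be illustrated with a small sketch. It assumes the three-parameter logistic is written as b1 / (1 + b2 * exp(-b3 * t)) and that the three parameters differ by roughly an order of magnitude each; the scale factors and the parameter values used for the noise-free check are hypothetical, and no real weed densities appear here.

```python
import math

def logistic3(t, b1, b2, b3):
    """Three-parameter logistic in the assumed b1 / (1 + b2 exp(-b3 t)) form."""
    return b1 / (1.0 + b2 * math.exp(-b3 * t))

def rss(params, times, densities):
    """Residual sum of squares in the original parameterization."""
    b1, b2, b3 = params
    return sum((y - logistic3(t, b1, b2, b3)) ** 2
               for t, y in zip(times, densities))

SCALE = (100.0, 10.0, 0.1)  # assumed typical magnitudes of b1, b2, b3

def rss_scaled(scaled_params, times, densities):
    """Same objective, with each parameter expressed in units of SCALE so
    that an optimizer works with values of order one."""
    params = tuple(s * c for s, c in zip(SCALE, scaled_params))
    return rss(params, times, densities)

times = list(range(1, 13))        # 12 equispaced index times
true_params = (200.0, 50.0, 0.3)  # hypothetical, for a noise-free check
densities = [logistic3(t, *true_params) for t in times]
check = rss_scaled((2.0, 5.0, 3.0), times, densities)
```

Handing rss_scaled rather than rss to a derivative-free method such as Nelder-Mead means its initial simplex steps are comparable in every direction, which is the kind of rescaling the paragraph above refers to.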

authors: John C. Nash, Anders Nielsen, and Ben Bolker

Project report:: weeds.pdf

[Source code]

Wildflowers

These data are from E. Crone and colleagues' long-term study of stages, flowering, and seed pod production of Astragalus scaphoides. The model looks at individual flowering as a function of the previous year's stage and seed production.

This is a binomial generalized linear mixed model for flowering probability with three random effects: an intercept and a size effect that vary among individuals, and an intercept that varies among years.
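
The way the three random effects enter the flowering probability can be sketched as below. The logit link, the use of a single size covariate, and all names and values are assumptions made for illustration; the other predictors are omitted.

```python
import math

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def flowering_prob(size, beta0, beta_size,
                   plant_intercept, plant_slope, year_intercept):
    """Conditional flowering probability given the three random effects:
    a plant-level intercept, a plant-level size slope, a year intercept."""
    eta = ((beta0 + plant_intercept + year_intercept)
           + (beta_size + plant_slope) * size)
    return inv_logit(eta)

# Hypothetical values: an average plant in an average year, then the same
# plant in a year with a positive year effect.
p_typical = flowering_prob(0.0, 0.0, 1.0, 0.0, 0.0, 0.0)
p_good_year = flowering_prob(0.0, 0.0, 1.0, 0.0, 0.0, 0.5)
```

Fitting the model then amounts to integrating this conditional probability over the distributions of the plant and year effects.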

authors: Elizabeth Crone, Mollie Brooks, and Perry de Valpine

Project report:: lme4_confidence_intervals.pdf; wildflower.pdf; wildflower_summary_figures.pdf

[Source code]
