Our JAGS preferences

At the start of the recent "Bayes with JAGS" workshop, we wrote down some ideas to try to make the code we give to participants more consistent: a sort of style guide for JAGS. Once you stop writing WinBUGS-compatible code, a lot of options open up.

See also the later post here.

A key point is that JAGS can use vectors and matrices (WinBUGS can't), which makes your code run faster.

= We want to use the nice features of JAGS; we don't care if our models won't run with WinBUGS or OpenBUGS.

- JAGS works with vectors and matrices. So if Beta is a vector of coefficients and X is a matrix of covariates (with a column of ones to take care of the intercept), you can produce a whole vector of mu's with:
mu <- X %*% Beta  #:)
The link functions still need to be done one by one in a loop
logit(phi[i]) <- mu[i]  #:|
so it may be easier to just use vector multiplication
logit(phi[i]) <- sum(Beta * X[i, ])

Either way is easier than
logit(phi[i]) <- Beta[1] * X[i, 1] + Beta[2] * X[i, 2] + Beta[3] * X[i, 3] + etc  #:(
and about twice as fast. You don't even need to know how many covariates will be in the model when you write the model code.

JAGS runs faster with matrices or vectors! The models I tried gave time savings up to 50%.
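
Putting this together, a minimal sketch of a logistic model in matrix form might look like the block below; the names nSites, nCov and the dbern likelihood are just placeholders for illustration.

model{
  mu <- X %*% Beta                # X is nSites x nCov (first column all 1's), Beta has length nCov
  for(i in 1:nSites) {
    logit(phi[i]) <- mu[i]
    y[i] ~ dbern(phi[i])
  }
  for(j in 1:nCov) {
    Beta[j] ~ dnorm(0, 0.1)       # placeholder prior for each coefficient
  }
}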

Click here to download a ZIP file with example code.

- JAGS allows you to do arithmetic inside the parentheses of functions and distributions. A common case in occupancy models is
y[i] ~ dbin(p * z[i], n[i])  #:)
which would throw an error in WinBUGS or OpenBUGS. For those you'd have to do something like
pz[i] <- p * z[i]  #:(
y[i] ~ dbin(pz[i], n[i])
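
So a whole (minimal) occupancy model can be written like this in JAGS; nSites, y and n[i] (the number of visits to site i) are hypothetical data names.

model{
  for(i in 1:nSites) {
    z[i] ~ dbern(psi)              # true occupancy state of site i
    y[i] ~ dbin(p * z[i], n[i])    # detections out of n[i] visits
  }
  psi ~ dbeta(1, 1)
  p ~ dbeta(1, 1)
}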

- The same applies to square brackets for indexing. Suppose you have a 0/1 parameter for sex coming from sex[i] ~ dbern(sexRatio) and you want different values of p for males and females: in JAGS you can use p[sex[i] + 1]. In Win/OpenBUGS you'd have to create a 1/2 parameter first:
sex12[i] <- sex[i] + 1  #:(
then use p[sex12[i]].
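
As a sketch, the JAGS version looks like this; nInd and nOcc are hypothetical names for the number of individuals and occasions.

model{
  for(i in 1:nInd) {
    sex[i] ~ dbern(sexRatio)            # 0/1 sex indicator
    y[i] ~ dbin(p[sex[i] + 1], nOcc)    # sex-specific detection probability
  }
  sexRatio ~ dbeta(1, 1)
  p[1] ~ dbeta(1, 1)                    # detection when sex = 0
  p[2] ~ dbeta(1, 1)                    # detection when sex = 1
}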

- JAGS has a length function for vectors, which you can use for looping:
for(i in 1:length(y)) {
  y[i] ~ blabla
}
There is also a dim function for arrays, but that has to be placed in a separate data block before the model code (see the JAGS User Manual 2.6.3 p.11; the code on p.33 doesn't work). So it will generally be easier to specify dimensions in the data list.
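
For reference, a minimal sketch of the data-block approach, assuming y is a sites-by-occasions detection matrix supplied in the data list (all other names are invented):

data{
  dims <- dim(y)                   # dims[1] = number of rows, dims[2] = number of columns
}
model{
  for(i in 1:dims[1]) {
    for(j in 1:dims[2]) {
      y[i, j] ~ dbern(p)
    }
  }
  p ~ dbeta(1, 1)
}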

= We prefer the extension .jags for JAGS model files instead of .txt.

- Then there's no need to include 'model' in the file name, and if you sort files by type in the file manager, all the model files are grouped together.

- We can add .jags to the file extensions in Notepad++ and get R-style syntax highlighting.

= We prefer to put the likelihood before the priors in the model.

This is generally the way we build up models, and here we agree with John Kruschke. Though we like to build the biological model first with what Link & Barker (2010) call "the data we wish we had", and then do the observation model with the actual data.

= We prefer dbeta(1,1) for uniform priors for probabilities, instead of dunif(0,1).

It's then clear that we are dealing with a probability and it's easily adapted for informative priors. And it runs slightly faster in JAGS.
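
For example, an informative prior is just a change to the two shape parameters (the numbers here are invented for illustration):

p ~ dbeta(1, 1)    # 'flat' prior for a probability
p ~ dbeta(6, 2)    # informative alternative: mean 0.75, most of the mass above 0.5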

= We prefer to put priors on biologically meaningful things.

- For example: p, not logit(p); SD, not variance or precision (see the sketch below).

- We like Marc Kéry's prior for the intercept of a logistic model with standardized predictors:
p0 ~ dbeta(1, 1)
beta0 <- logit(p0)

- We don't like commonly used priors that are biological nonsense, such as Gamma(0.001, 0.001). Sensible Gamma priors are ok. (And on this point we are in agreement with Andy Gelman and the Stan gang.)
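
A sketch of the first point for a normally distributed response, putting the prior on the SD and deriving the precision that dnorm needs (the dunif upper bound of 5 is just a placeholder; pick something biologically sensible):

model{
  for(i in 1:length(y)) {
    y[i] ~ dnorm(mu, tau)
  }
  mu ~ dnorm(0, 0.01)              # placeholder prior for the mean
  sigma ~ dunif(0, 5)              # prior on the SD, which we can think about directly
  tau <- 1/sigma^2                 # precision, derived, not given a prior
}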

= We prefer meaningful names for things instead of letters.

For example: nTraps instead of J; nOcc instead of K; AC (activity centre) instead of s; bForest instead of beta2. (Though if there are lots of predictors, a vector of coefficients - beta[1], beta[2], etc - is convenient.)

= We don't use 'superpopulation' to refer to the augmented population when using data augmentation.

'Superpopulation' has a specific meaning in Crosbie-Manly-Schwarz-Arnason (CMSA) open population models, and using the same term invites confusion. Just wait until you want to run a CMSA model with an augmented superpopulation!

= We prefer omega for the augmentation parameter and w for the existence flag.

For occupancy modelling, we use psi for probability of occupancy and z for the 'occupied' flag (z[i] = 1 if site i is occupied, 0 otherwise). Although data augmentation is analogous to occupancy, we prefer to use different notation, because when we do multispecies occupancy models we have both.

= We like Bill Link's prior for omega.

Back in 2013, I did thousands of simulations of SECR analyses with small populations; I encountered a few cases where augmentation was not sufficient no matter how big I made M, and population estimates got more and more ridiculous as I increased M. Then came Bill Link's (2013) paper explaining the problem: with the usual discrete uniform prior, there's no guarantee that the posterior converges as you increase M.

So I prefer his omega ~ dbeta(0.001, 1) prior, which says that prior probability goes down as N goes up, to the usual omega ~ dbeta(1, 1), which says all values - way up into the billions - are equally probable.
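
Putting the last two points together, a minimal sketch of data augmentation with this notation and prior (M, nOcc and the constant-p likelihood are just placeholders):

model{
  for(i in 1:M) {
    w[i] ~ dbern(omega)                # existence flag for (real or augmented) individual i
    y[i] ~ dbin(p * w[i], nOcc)        # detections out of nOcc occasions
  }
  omega ~ dbeta(0.001, 1)              # Link-style prior; dbeta(1, 1) is the usual 'flat' choice
  p ~ dbeta(1, 1)
  N <- sum(w)                          # population size
}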

= We prefer ^ notation or sqrt instead of pow.

It's what we see every day in R!
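
For example, deriving a precision from a variance via the SD (sigma2, sigma and tau are generic names for the variance, SD and precision):

sigma <- sqrt(sigma2)    # rather than sigma <- pow(sigma2, 0.5)
tau <- 1/sigma^2         # rather than tau <- pow(sigma, -2)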

Not part of the JAGS model definition, but related to it:

= We supply initial values only when necessary.

If no initial values are supplied for a parameter, JAGS draws random values from the corresponding prior. If the prior is something like sigma ~ dunif(0, 20), there's no point in including sigma=runif(1, 0, 20) in the init list; it's just unnecessary clutter.

Though if you want to avoid the chains starting near zero or near 20, you could have sigma=runif(1, 2, 15) in there.

Sometimes random values don't work (we get a "node inconsistent with parents" error), and we need to provide appropriate values.
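
So the inits function usually only needs the awkward nodes. A sketch in R, assuming an occupancy-type model where z must not start at 0 for sites with detections (nSites and the starting values are invented for illustration):

inits <- function() list(
  z = rep(1, nSites),         # start every site 'occupied' so the data can't conflict with z = 0
  sigma = runif(1, 2, 15)     # keep chains away from the ends of the dunif(0, 20) prior
)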

Added 31 Dec 2019:

= We refer to MCMC output as "draws", not "samples".

Strictly speaking, the whole output is one sample - or perhaps each chain is one sample - from the posterior; you do not have 10,000 samples! It also causes needless confusion with the actual set of observed data.

And "effective sample size" becomes "effective chain length". You only caught 29 squirrels, but the effective sample size is not 9,984!

Updated 31 Dec 2019 by Mike Meredith