In a previous post we looked at GOF tests for occupancy models.
Here we'll look at similar tests for spatially-explicit
capture-recapture (SCR) data, and go on to other posterior
predictive checks. This post uses the results of an analysis of
data provided by Rajan Amin of the Zoological Society of London (ZSL).
The study deployed 44 detectors for 130 days and detected 22
animals.
The 'p0(sex) sigma(sex)' model
Initial modelling showed a clear difference in p0 and sigma
for males and females, but the fit was not good even after
allowing for this.
We used the approach of
Royle et al (2014, section 8.4.2),
calculating three fit statistics, all based on Freeman-Tukey
discrepancies:
- T1 : individual x detector frequencies, aggregating the
binary data by animals and detectors
- T2 : individual encounter frequencies, aggregating for
each animal
- T3 : detector frequencies, aggregating for each detector
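The three statistics differ only in how the counts are aggregated before the Freeman-Tukey discrepancy, sum((sqrt(observed) - sqrt(expected))^2), is computed. A minimal sketch in Python (the post's code is in R/JAGS; the toy matrices here are made up purely for illustration):

```python
import numpy as np

def freeman_tukey(obs, exp):
    """Freeman-Tukey discrepancy: sum of (sqrt(obs) - sqrt(exp))^2."""
    return np.sum((np.sqrt(obs) - np.sqrt(exp)) ** 2)

# Toy animals x detectors matrices (hypothetical values)
Y = np.array([[2, 0, 1],
              [0, 3, 0]])            # observed detection counts
Yexp = np.array([[1.5, 0.5, 0.8],
                 [0.4, 2.0, 0.6]])   # expected detection counts

T1 = freeman_tukey(Y, Yexp)                          # cell-level (animal x detector)
T2 = freeman_tukey(Y.sum(axis=1), Yexp.sum(axis=1))  # aggregated per animal
T3 = freeman_tukey(Y.sum(axis=0), Yexp.sum(axis=0))  # aggregated per detector
```

The same aggregations appear in the JAGS code below, where the sums run over the MCMC node values rather than fixed matrices.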
The JAGS code for this model is shown below. The raw data are
binary, 1/0 for each occasion, and we aggregate these so that Y
is an animals x detectors matrix of detection counts,
augmented with extra rows of zeros to give M rows
in total; nOcc is a vector with the number of
occasions each detector was active; sex is a vector
of length M with the sex of detected animals and NA
for undetected animals. The habitat mask and related constants
are derived from makeJAGSmask. More explanations
and comments follow the code.
model {
  omega ~ dbeta(1,1)              # existence parameter
  pi ~ dbeta(1,1)                 # sex ratio
  for(t in 1:2) {                 # separate priors for male and female
    p0[t] ~ dbeta(1,1)
    sigma[t] ~ dunif(0, 30)       # in pixel units
    sigma2[t] <- 1/(2*sigma[t]^2)
  }
  for(i in 1:M) {
    w[i] ~ dbern(omega)           # w = 1 if the individual is available for detection
    sex[i] ~ dbern(pi)
    AC[i, 1] ~ dunif(1, upperLimit[1])
    AC[i, 2] ~ dunif(1, upperLimit[2])
    pOK[i] <- habMat[trunc(AC[i, 1]), trunc(AC[i, 2])]  # habitat check
    OK[i] ~ dbern(pOK[i])         # OK[i] = 1, the ones trick
    for(j in 1:nTraps) {
      d2[i,j] <- (AC[i,1] - trapMat[j,1])^2 + (AC[i,2] - trapMat[j,2])^2
      p[i,j] <- p0[sex[i]+1] * exp(-sigma2[sex[i]+1]*d2[i,j])
      Y[i,j] ~ dbin(p[i,j] * w[i], nOcc[j])
      # components for fit diagnostics
      Ysim[i,j] ~ dbin(p[i,j] * w[i], nOcc[j])  # simulated
      Yexp[i,j] <- p[i,j] * w[i] * nOcc[j]      # expected
      # components for T1
      err1obs[i,j] <- (sqrt(Y[i,j]) - sqrt(Yexp[i,j]))^2
      err1sim[i,j] <- (sqrt(Ysim[i,j]) - sqrt(Yexp[i,j]))^2
    }
    # components for T2
    err2obs[i] <- (sqrt(sum(Y[i,])) - sqrt(sum(Yexp[i,])))^2
    err2sim[i] <- (sqrt(sum(Ysim[i,])) - sqrt(sum(Yexp[i,])))^2
  }
  # components for T3
  for(j in 1:nTraps) {
    err3obs[j] <- (sqrt(sum(Y[,j])) - sqrt(sum(Yexp[,j])))^2
    err3sim[j] <- (sqrt(sum(Ysim[,j])) - sqrt(sum(Yexp[,j])))^2
  }
  # Fit diagnostics totals
  T1obs <- sum(err1obs)
  T1sim <- sum(err1sim)
  T2obs <- sum(err2obs)
  T2sim <- sum(err2sim)
  T3obs <- sum(err3obs)
  T3sim <- sum(err3sim)
  # Derived parameters
  N[1] <- sum(w * (1-sex))        # number of females...
  N[2] <- sum(w * sex)            # ... males
  D <- N / area * 1e8             # convert per-m2 to per-100-km2
  p0pc <- p0 * 100                # p0 as a percentage
  sigmakm <- sigma * pixelWidth   # sigma in km
}
The model is a standard SCR model with a half-normal
detection function and a binomial likelihood for the aggregated detection
data. To this we have added
code to calculate Freeman-Tukey discrepancies between the
expected number of detections and (a) the observed detections (the
err*obs nodes) and (b) detections simulated from the current
values of the parameters at each step of the MCMC chain (the
err*sim nodes). We ran the model with jagsUI::jags and assigned
the output to out.
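As a concrete illustration of the data preparation described above, here is a minimal sketch of aggregating the binary occasion-level detections into the Y count matrix and augmenting it to M rows. It's in Python with made-up dimensions, so the numbers are purely illustrative:

```python
import numpy as np

# Hypothetical binary detections: animals x detectors x occasions
ydet = np.zeros((2, 3, 5), dtype=int)   # 2 animals, 3 detectors, 5 occasions
ydet[0, 1, [0, 2]] = 1                  # animal 1 detected at detector 2 on 2 occasions
ydet[1, 0, 4] = 1                       # animal 2 detected at detector 1 once

Y = ydet.sum(axis=2)                    # animals x detectors count matrix
M = 6                                   # size of the augmented population
Y = np.vstack([Y, np.zeros((M - Y.shape[0], Y.shape[1]), dtype=int)])

nOcc = np.array([5, 5, 5])              # occasions each detector was active
sex = np.array([0, 1] + [np.nan] * (M - 2))  # NA (here NaN) for augmented rows
```

In the real analysis Y has 22 observed rows augmented to M = 80, and 44 detector columns.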
GOF p-value plots
We used jagsUI::pp.check
to plot the discrepancies and produce the p-values:
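The p-value in each of these plots is simply the proportion of MCMC draws in which the simulated discrepancy exceeds the observed one; values near 0 or 1 signal lack of fit, values near 0.5 a good fit. A minimal numpy sketch of that calculation, with random stand-in draws instead of real MCMC output:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-ins for the monitored T1obs and T1sim chains (1000 saved draws);
# real values would come from the jagsUI output, e.g. out$sims.list$T1obs
T1obs = rng.gamma(5, 2, size=1000)
T1sim = rng.gamma(5, 2, size=1000)

# Bayesian p-value: close to 0.5 here, since both sets of draws
# come from the same distribution
p_value = np.mean(T1sim > T1obs)
```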
All three GOF plots are bad, but it's not clear from them
what we should do to improve the model fit. So next we tried
comparing the simulated data, Ysim in the model
code, with the observed data using a number of biologically
relevant criteria.
Observed values and posterior predictive values
We extract the MCMC chains for Ysim with the
following code:
Ysim <- out$sims.list$Ysim
dim(Ysim)
[1] 1000 80 44
Here we had done some fairly brutal thinning, reducing the
number of draws saved to 1000 in order to keep object sizes small. Each
step of the MCMC process generates simulated detection data with
80 rows for the individual animals (including the augmented
all-zero detection histories) and 44 columns for the traps. This
corresponds to the Yobs matrix, which has 22 rows for the animals
actually detected and 44 columns for the traps.
We'll begin by plotting the observed and posterior
predictions for the total number of detections and the number of
animals detected:
# 1. total detections
( obs <- sum(Yobs) )
147
sims <- apply(Ysim, 1, sum)
hist(sims, xlim=range(sims, obs), main="Total detections")
abline(v=obs, col='red', lwd=3)
legend('topright', legend=paste("p =", round(mean(sims > obs),2)), bty='n')
# 2. animals detected
( obs <- dim(Yobs)[1] )
22
getNcaps <- function(x) sum(rowSums(x) > 0)
sims <- apply(Ysim, 1, getNcaps)
hist(sims, xlim=range(sims, obs), main="Animals detected")
abline(v=obs, col='red', lwd=3)
legend('topright', legend=paste("p =", round(mean(sims > obs),2)), bty='n')
Nothing notable here. Next we'll look at some statistics
related to detector performance. The code to plot the histograms
is identical to the above and is not shown.
# 3. total detectors visited
getTrapsVisited <- function(x) {
visitsPerTrap <- colSums(x)
sum(visitsPerTrap > 0) }
( obs <- getTrapsVisited(Yobs) )
31
sims <- apply(Ysim, 1, getTrapsVisited)
... [plotting code omitted]
# 4. max visits per detector
getMaxVisitsperTrap <- function(x) {
visitsPerTrap <- colSums(x)
max(visitsPerTrap) }
( obs <- getMaxVisitsperTrap(Yobs) )
20
sims <- apply(Ysim, 1, getMaxVisitsperTrap)
... [plotting code omitted]
# 5. max detections for 1 animal at 1 detector
( obs <- max(Yobs) )
13
sims <- apply(Ysim, 1, max)
... [plotting code omitted]
Now here are some good clues: many detectors are not
picking up animals at all, while those that do record more detections
than the model predicts. This points to heterogeneity among
detectors. One last plot:
# 6. max detections for 1 animal overall
getMaxCaps <- function(x) {
capsPerAnimal <- rowSums(x)
max(capsPerAnimal)}
( obs <- getMaxCaps(Yobs) )
23
sims <- apply(Ysim, 1, getMaxCaps)
... [plotting code omitted]
Again this is consistent with trap heterogeneity.
The 'p0(sex+het) sigma(sex)' model
We'll try a model with a random effect for the probability of
detection, p0. We modify the code so that p0 takes a
different value for each detector, with the male and female values
offset above and below the detector average. The priors for p0 and
sigma now look like this:
for(t in 1:2) {                 # separate priors for male and female
  sigma[t] ~ dunif(0, 30)       # in pixel units
  sigma2[t] <- 1/(2*sigma[t]^2)
}
for(j in 1:nTraps) {
  lp0.trap[j] ~ dnorm(mu, tau)
  logit(p0[j,1]) <- lp0.trap[j] + lp0.sex
  logit(p0[j,2]) <- lp0.trap[j] - lp0.sex  # average sex effect is zero
}
mean.p0 ~ dbeta(1,1)
mu <- logit(mean.p0)
sig.p0 ~ dunif(0, 5)
tau <- 1/sig.p0^2
lp0.sex ~ dnorm(0, 0.1)
And the line for p[i, j] becomes:
p[i,j] <- p0[j, sex[i]+1] * exp(-sigma2[sex[i]+1]*d2[i,j])
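To see what this hierarchical prior does, here is a sketch simulating the detector-level baseline detection probabilities it implies, in Python with made-up hyperparameter values (in the model these get priors). Note that JAGS's dnorm takes a precision, tau = 1/sig.p0^2, while numpy's normal takes the standard deviation directly:

```python
import numpy as np

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(1)
nTraps = 44

# Made-up hyperparameter values for illustration
mean_p0, sig_p0, lp0_sex = 0.05, 1.0, 0.4
mu = np.log(mean_p0 / (1 - mean_p0))     # logit of the mean baseline p0

# one logit-normal random effect per detector (numpy uses SD, not precision)
lp0_trap = rng.normal(mu, sig_p0, size=nTraps)
p0_sex1 = inv_logit(lp0_trap + lp0_sex)  # p0[j,1] in the JAGS code
p0_sex2 = inv_logit(lp0_trap - lp0_sex)  # p0[j,2]
```

The spread of the p0 values across detectors is what lets the model absorb the trap heterogeneity diagnosed above.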
GOF p-value plots
The GOF p-values are now fine, so we would conclude that the
model is a good fit to the data. But let's also look at the other
statistics we plotted before.
Observed values and posterior predictive values
These plots show no evidence of lack of fit. Our diagnosis of
detector heterogeneity seems to have been correct, and no other
problems are evident.