Essentials of R and JAGS coding style

I wanted to refer to Google's original style guide for R, but that is now just a few disagreements with the Tidyverse Style Guide. The latter seems to be overly prescriptive or only relevant if you are creating a package, so here's my own "just the essentials" style guide for JAGS code as well as R.

Code is meant to be read by humans. "Any fool can write code that a computer can understand. Good programmers write code that humans can understand." (Martin Fowler) And you should be able to run the code line by line, even if it is inside a loop or a function definition.

Object names

My only real rule is no dots in function names, as that risks confusion with R's S3 methods.

I tend to avoid dots everywhere and prefer camelCase for all names, though snake_case is fine.
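
To see how a dot can cause trouble, here is a minimal sketch. The function name summary.survey and the class "survey" are made up for illustration. R treats a function named generic.class as a candidate S3 method, so a helper with a dotted name can be picked up by summary() without your asking:

```r
# A helper intended to be called directly; the dotted name is the problem
# (summary.survey and the "survey" class are hypothetical)
summary.survey <- function(object, ...) {
  "my helper was called"
}

# Elsewhere, an object happens to be given the class "survey"
counts <- structure(list(nSites=3), class="survey")

# summary() dispatches on the class and finds the helper
whatRan <- summary(counts)
whatRan
```

Calling the helper summarySurvey or summary_survey would avoid the accidental dispatch.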

Names should be self-explanatory: you should not need to add a comment to explain what the object is or what the function does. (Ok, getting good, short names is difficult, so maybe a comment can help too.) Nor should you need to scroll up to find where the object was created to remind yourself what it is.

Good: nSites <- nrow(covariates)
Bad: K <- nrow(covariates)  # Number of sites

Be careful with generic names like temp, data, out, result, index: they are ok if you only want to use them in the next couple of lines, but don't expect them to have the same value 200 lines down the page - and don't assume your code will be run in sequence.

Avoid using the names of common R functions for your objects: mean <- mean(x) works but can cause confusion later.
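
A small sketch of the kind of confusion I mean, using a made-up vector x. After the assignment, the same name refers to a number or to a function depending on how it is used:

```r
x <- c(2, 4, 9)

mean <- mean(x)     # allowed: 'mean' is now also the name of a number
mean                # the number, not the function
mean(x)             # still works: R skips non-functions when it sees a call
is.function(mean)   # FALSE: used as a plain name, 'mean' is the number

rm(mean)            # remove the masking object
xMean <- mean(x)    # a distinct name avoids the ambiguity
```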

Spacing

Code without spaces will work, but is unreadable! Put spaces around <- and most operators (+, -, *) except those with highest priority (:, ^, $). Put spaces after commas. Don't put spaces inside parentheses.

Good: distance <- sqrt(x^2 + y^2)

Bad: distance<-sqrt ( x ^ 2+y ^ 2 )

Exception to the last point: I often put parentheses around a whole line so that the result is displayed in the Console. Then I do put in extra spaces:

( distance <- sqrt(x^2 + y^2) )

With indexing, don't put a space between the object name and the opening "[": foo[3] not foo [3]. And never have a line break in there! Avoid spaces between the name of a function and the opening parenthesis: mean(x) not mean (x).

Curly braces {}

This section looks a bit finicky, but it really really helps to make code readable, especially with nested loops (common in JAGS):

  • Opening brace at the end of a line
  • Closing brace at the beginning of a line, aligned with the initial if or for
  • Code inside indented by 2 spaces (including comments)

positive <- negative <- rep(NA, 10)
for(i in 1:10) {
  xbar <- mean(rnorm(20, 0, 5))
  # keep positive and negative values separate
  if(xbar > 0) {
    positive[i] <- xbar
  } else {
    negative[i] <- xbar
  }
}

Indenting makes it easy to look down the page and see where each code block (or sub-block) begins and ends.

If your block has just one line, you can dispense with the braces, but do use a new line and indent:

for(i in 1:10)
  print(i^2)

else must go on the same line as the closing brace for the if block - and don't skip the braces if you are using else. (If you try to run a line beginning with else, you will see Error: unexpected 'else' in "else".)
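
The placement of else can be checked without running anything. The sketch below stores two versions of an if/else as text (xbar and msg are just placeholder names) and asks parse() whether each is valid when read from the top, as happens when you run a script line by line:

```r
# 'else' at the start of its own line: the 'if' statement is already complete
badCode <- "if(xbar > 0) {
  msg <- 'positive'
}
else {
  msg <- 'negative'
}"

# 'else' on the same line as the closing brace
goodCode <- "if(xbar > 0) {
  msg <- 'positive'
} else {
  msg <- 'negative'
}"

# parse() only checks the syntax, it does not run the code
badParses <- !inherits(try(parse(text=badCode), silent=TRUE), "try-error")
goodParses <- !inherits(try(parse(text=goodCode), silent=TRUE), "try-error")
c(bad=badParses, good=goodParses)
```

Inside a function body or a loop the parser is more forgiving about else on a new line, but then the code can no longer be run line by line.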

 

Long lines

Break them up. Horizontal scrolling is a pain and word-wrap doesn't understand code. The usual recommendation is no more than 80 characters per line. Indent continuation lines by at least 2 spaces (I use 4), more if it allows pretty alignment, as in the examples below.

In a function call, put line breaks between arguments. In calls to jagsUI::jags, I like to put the substantive stuff on the first line, the MCMC quantities on the second line, and stuff such as modules or samplers on a third line:

out1 <- jags(jagsData, inits, wanted, "wt_siteCovs.jags", DIC=FALSE, 
             n.chains=3, n.iter=11000, n.burn=1000, parallel=TRUE)

In the example above, I have not put spaces around "="; that keeps function calls compact.

If you need to split a long line of algebra, put the breaks after operators, so that R or JAGS know that the line is incomplete:

logit(psi[i]) <- b0 + bFor * fst[i] + bElev * ele[i] + 
                      bElev2 * ele2[i]
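
Here is a sketch in plain R (not JAGS) of what goes wrong if the break comes before the operator. The values of b0, bElev and ele are made up. R gives no error, just a different answer:

```r
b0 <- 1
bElev <- 2
ele <- 3

# Break after the operator: R sees the line is incomplete and reads on
psiGood <- b0 +
             bElev * ele

# Break before the operator: the first line is complete in itself, and
#   the second line is evaluated separately and its value is not stored
psiBad <- b0
            + bElev * ele

c(psiGood, psiBad)
```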

 

At the top of the script

You probably want the script to be usable by colleagues or at least by future you. So a comment with a brief description of what it does is good. You may need to add author and copyright information, and that wretched don't-sue-me-if-it-all-goes-wrong thing.

Some folks like a comment with the date of the last update. I tend to modify files then forget to update the update date; better to just look at the date stamp in File Manager.

The script should work when run in a fresh, clean R instance. You don't need rm(); setwd() may work on your machine today, but it will fail on anyone else's or once the folder is moved. See here for discussion.

Load all packages and data at the start of the script, so that you don't have to break off to install packages or hunt for a data file in the middle of working through the analysis.
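
One possible way to start a script, as a sketch rather than a recipe. The file name is hypothetical, and "stats" and "utils" are stand-ins for whatever packages your analysis needs; they are used here only because they come with every R installation:

```r
# 01_data_prep.R (hypothetical file name)
# A brief description of what the script does goes here.

# Check that everything needed is installed before doing any work
needed <- c("stats", "utils")   # stand-ins for your own package list
isInstalled <- sapply(needed, requireNamespace, quietly=TRUE)
if(!all(isInstalled)) {
  stop("Please install: ", paste(needed[!isInstalled], collapse=", "))
}

# Then load them all in one place
library(stats)
library(utils)
```

With the check at the top, a missing package stops the script in the first few seconds instead of halfway through the analysis.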

 

General layout

Use blank lines, headings, and commented lines of ----- or ==== to break up the code into logical chunks. But don't go overboard: one blank line is enough and headings don't really need to be in boxes. On my laptop screen I can comfortably display 25 lines of code, and I get lost if too many of those lines are blank.

I also get lost if the script is too long. (The code file for AHM1 runs to >12,000 lines!) Once you get beyond a few hundred lines it's time to look for ways of breaking it up into multiple files. If files should be run in sequence, start file names with numbers, e.g., "01_data_prep.R", "02_data_explo.R", ..., "11_JAGSanalysis_augmentation.R".
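
The leading zero in those numbers matters once you have ten or more files. A quick sketch with made-up file names, deliberately listed out of order:

```r
padded <- c("11_JAGSanalysis.R", "02_data_explo.R", "01_data_prep.R")
unpadded <- c("11_JAGSanalysis.R", "2_data_explo.R", "1_data_prep.R")

# method="radix" sorts character vectors in the C locale, so the result
#   does not depend on the language settings of the machine
sortedPadded <- sort(padded, method="radix")
sortedUnpadded <- sort(unpadded, method="radix")

sortedPadded     # 01, 02, 11: the intended sequence
sortedUnpadded   # 11 comes before 2
```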

Posted 24 December 2019, last updated 10 April 2020 by Mike Meredith