Bayz Manual

Reading bayz output in R

A set of R functions is supplied with bayz that allows to read the samples from bayz out-file. To load the set of R function supplied with bayz, use once in every R session the R command:

source("http://www.bayz.biz/bayzRfunc.R")

Samples are read for one parameter-vector at a time, and the name of the parameter-vector is needed to retrieve parameters. Within R the list of parameters (and their number of levels, the length of each vector) can be retrieved using:

bayz.param("jobname")
This should also match all names that appear on the left-hand-sides of the model lines in the bayz script, or all names that appear in the .mod or .lab output files.

The R function to read output is:

bayz.read("jobname","parameter-name")
or for linux and unix system a faster version is:
bayz.read.fast("jobname","parameter-name")
The "jobname" should be the name of the job file that was run, without ".bayz" extension. The parameter name is the name as explained above. For some examples see below.

The bayz.read function returns a matrix with all the MCMC samples for all levels of the parameter-vector, with MCMC cycles on the rows, and the parameter-levels on the columns. When many MCMC samples are saved, and when the parameter vector has many levels (e.g., vector of breeding values, SNP effects, etc), this can be a quite large matrix.

Parameter summaries with R

Posterior mean and Posterior SD

The posterior means and posterior SDs can be computed as follows. Note that this should produce the same as pbayz, and pbayz will be faster. The example assumes that a model was run with a bayz script called "fatbv.bayz" that estimated breeding values (polygenic effects) in a pedigree model, and the vector of breeding values was called "polyg.anim.fatpct":

source("http://www.bayz.biz/bayzRfunc.R") # only needed once
bv_samp <- bayz.read.fast("fatbv","polyg.anim.fatpct") # all MCMC samples
bv_pm <- apply(bv_samp,2,mean) # posterior means
bv_psd <- apply(bv_samp,2,sd) # posterior SDs

Median and quantiles

Can be obtained using the R quantile() function. See R help.

Mode

As far as I know there is not a standard mode-function in R. One way to make an estimate of the mode is to use the density() function and to determine the value where it is maximal, something like:

  d <- density(x)
  d$x[which.max(d$y)]

Highest Posterior Density (HPD) intervals

An HPD interval is (slightly) different from taking quantiles, i.e. a 95% HPD is not the same is taking the 2.5 and 97.5% quantiles. The HPD needs to find the smallest interval that contains a determined part of the density (e.g., 95%), this can result in taking different quantiles in the left and the right tail when the distribution is skewed. The R coda package (see the page on Using R coda) contains a function HPDinterval to compute the HPD:

  library(coda)
  HPDinterval(as.mcmc(x),prob=0.95) 
However, the coda HPDinterval function produces wrong intervals when a parameter is near a boundary, for instance a variance that is close to zero. The bayzRfunc.R file contains an hpdpos function, with the same interface as HPDinterval, and that correctly computes HPD intervals for parameters that need to be positive (e.g., variances). It can also be used for proportions or heritabilities that are close to zero.