Bayz Manual
Basics for reading bayz output in R and convergence checks using R coda

Reading bayz output in R

A set of R functions is supplied with bayz that allows to read the samples from bayz out-file.

To load the set of R function supplied with bayz use the R command:

source("http://www.bayz.biz/bayzRfunc.R")

Output samples can be retrieved for one parameter at a time. The R function to read output is:

bayz.read("jobname","parameter-name")

or for linux and unix system a faster version is:

bayz.read.fast("jobname","parameter-name")

The "jobname" should be the name of the job file that was run, without ".bayz" extension. The parameter name to be given should be a name as on the left hand sides of the model lines. Hence, if uncertain about the parameter names, open the bayz job file that was run, and copy the desired name from the left part of the model lines. The R functions also needs the lab-file (labels) for some functions.

It is also possible to get a list of the parameters in a job file with the R comnand:

bayz.param("jobname")

When paramerters are retrieved, the labels are not automatically attached. If you want to have those, retrieve them using:

mypar_labels <- bayz.labels("jobname","parameter-name")

You can for instance attach these labels as the column names of for the parameter samples read with bayz.read.

Using R Coda

It is recommended to study well convergence and Monte Carlo (MC) error of the Bayesian MCMC based procedures. The MC error is often reported as 'standard error' (SE), not to be confused with the posterior standard deviation (SD). Bayz does not include features to check convergence, because this is quite a special topic with many personal 'flavours'. And there are many software tools available for convergence diagnosis, for instance in the R Coda package. Bayz output read in R can be converted to 'mcmc' objects for the Coda package using the supplied function 'bayz2mcmc'.

The simplest two convergence checks that can be done using R Coda are computation of the MC error, which is available using summary (reported as 'time series SE'), and computation of effective sample size, which is available using effectiveSize. The effective size computation is based on 'batching'. The two approaches are usually producing quite similar results, i.e. SE is about SD/sqrt(effective size). A small R session follows:

>source("http://www.bayz.biz/bayzRfunc.R")
   >library(coda)
   >varE <- bayz.read.fast("bivarmodel","varE")
   >summary(bayz2mcmc(varE))
                         Mean       SD  Naive SE Time-series SE
     var.bodyw          12.1322  0.59893 0.0189399      0.0206664
   var.feedint       239.3156 14.69610 0.4647314      0.5797327
   cov.feedint.bodyw  38.3988  2.62449 0.0829935      0.0900629
   cor.bodyw.feedint   0.7124  0.02287 0.0007231      0.0006093
   >
   effectiveSize(bayz2mcmc(varE))
           var.bodyw       var.feedint cov.feedint.bodyw cor.bodyw.feedint 
            839.8992          642.6114          849.1739         1408.4700 

In this case, the Time Series SE on the the correlation estimate (cor.bodyw.feedint) is 0.0006 which means it would be very OK to report the correlation estimation with 2 decimals, and somewhat dubious to report the correlation with 3 decimals (the SE is only just below 0.001).