**In this post we examine the statistical software available for conducting Bayesian network meta-analyses (NMA). The first two sections provide some background information and a review of the available software – we hope this section will be of interest to all of our readers. The final section provides an introduction to conducting NMA in Stan – Stan is a relatively new program for conducting Bayesian analyses – this section will be of interest to readers who conduct their own NMAs.**

**NMA and WinBUGS**

Historically, Bayesian network meta-analyses have most often been conducted using WinBUGS. Lu and Ades proposed the first Bayesian network meta-analysis models in 2004 – their paper was accompanied by WinBUGS code [1]. Since then, numerous network meta-analysis publications have provided WinBUGS code for a whole range of applications including survival outcomes [2], modelling outcomes by time [3] and synthesising aggregated data with individual patient data [4]. The NICE Decision Support Unit (DSU) Technical Support Document (TSD) 2 also provides WinBUGS code for eight standard network meta-analysis models.

BUGS was first released in 1993 and was followed by WinBUGS in 1997. For many years, WinBUGS was the only program available that would allow users to fit Bayesian models easily. Since its inception, WinBUGS has facilitated the growth of Bayesian statistics by making Bayesian methods more accessible [5]. However, anyone who has used WinBUGS regularly will know that this program comes with some frustrations. It expects data in a custom format, it produces uninformative error messages (see Figure 1), it can be slow to converge and it doesn’t produce great graphics. Some of these frustrations can be addressed by using WinBUGS in conjunction with other software, such as R. The R2WinBUGS package allows users to call WinBUGS from within R – hence data formatting and plotting can be handled within R. However, other WinBUGS frustrations are not so easy to avoid.

**Figure 1: The frustrations of WinBUGS – unintuitive error messages!**

Whilst experienced users of WinBUGS may know that a ‘trap 66’ error is often associated with a poor choice of prior distribution for a variance parameter – ‘trap 66’ is not helpful to most. |

** **

**Alternatives to WinBUGS**

Since Lu and Ades key paper in 2004, many alternatives to WinBUGS have been released. Quantics conducted a review to identify and evaluate open source alternatives to WinBUGS for Bayesian network meta-analysis. Our goal was to identify software that could be used to fit a wide range of network meta-analysis models efficiently. The results of our review are summarised in the table below.

**Table 1: Alternatives to WinBUGS**

Out of the three potential alternatives highlighted above we felt that Stan was the best alternative because:

- It uses the latest available MCMC methodology: Hamiltonian Monte Carlo (HMC) and no-U-turn samplers,
- It has a flexible modelling language, including if/then statements and vector operations,
- It provides useful error messages,
- It is being actively developed and updated,
- It is clearly documented with a 500+ page manual, and
- It integrates with R for producing publication quality graphics.

For those readers interested in exploring Stan further and understanding how it can be applied to a network-meta analysis, the following section will guide you through the application of Stan to a WinBUGS example from the NICE DSU TSD 2. For those who are not already familiar with the NICE DSU TSD 2, it might be helpful to review examples 5 and 7 before reading this section.

** ****Get all our latest news delivered straight to your inbox.**

**Stan for Network Meta-Analysis**

In order to explore the use of Stan for network meta-analysis, we re-programmed the WinBUGS models from the NICE DSU TSD 2 in Stan. Here we focus on one example, providing the equivalent Stan code for program number 7 (b) in the NICE DSU TSD 2 (pg 95), which consists of a fixed effect model for a normally distributed continuous outcome.

Stan can be operated from the command line or called from within a number of other packages including R, Python, MATLAB, Julia, Stata and Mathematica. In this example, we have used R to call Stan. Instructions for setting up R and Stan are available from the Stan website. All analysis was conducted using R version 3.4.1 and rstan version 2.16.2.

First we take a look at how to set-up the data. We have used the version of the Parkinson’s dataset from the NICE DSU TSD 2 that is based on the treatment differences within study. For each study, there is one control treatment and either one or two experimental treatments; the data consists of the differences in the reduction in mean off-time for each experimental treatment relative to the control treatment. In the NICE DSU TSD 2 example, the dataset is structured with one row per study, however, we have restructured the data so that there is one row per experimental treatment (see Table 2).

**Table 2: Parkinson’s dataset. **

s | t | b | y | se | v |

1 | 3 | 1 | -0.31 | 0.668 | 0 |

2 | 2 | 1 | -1.70 | 0.383 | 0 |

3 | 4 | 3 | -0.35 | 0.442 | 0 |

4 | 4 | 3 | 0.55 | 0.555 | 0 |

5 | 5 | 4 | -0.30 | 0.274 | 0 |

6 | 5 | 4 | -0.30 | 0.320 | 0 |

7 | 2 | 1 | -2.30 | 0.718 | 0.255 |

7 | 4 | 1 | -0.90 | 0.695 | 0.255 |

In this dataset:

**‘s’**is the study ID,**‘t’**is the experimental treatment ID,**‘b’**is the control treatment ID,**‘y’**is the difference in the reduction for treatment ‘t’ relative to treatment ‘b’ in study ‘s’,**‘se’**is the standard error of ‘y’, and**‘v’**is the variance of the reduction for the control treatment (note that this is only required for multi-arm studies).

The columns se and v are presented to 3 decimal places. A CSV version of this dataset is available here.

The dataset includes one three-arm trial: study 7 (s = 7) compares control treatment 1 (b=1) to experimental treatments 2 and 4 (t = 2, 4). For multi-arm trials, it is necessary to account for the correlation in the treatment differences. Since both treatment 2 and treatment 4 have been compared to treatment 1, these treatment differences are correlated. It can be shown that the covariance of the treatment differences is equal to the variance of the reduction for the control arm (see NICE DSU TSD 2, pg 37). There are many ways to approach this covariance in Stan; the neatest solution we have found is to model all of the treatment differences (across all of the studies) using a single multivariate normal distribution. Most of the covariances are zero, since treatment differences from different studies will not be correlated, but the covariances for treatment differences within the same study are set to variance of the reduction for the control arm. This approach is demonstrated in the R code and Stan code provided below.

**Stan model **

Next, we take a look at the Stan code for the model (see Appendix, Figure A1). The first thing that you may notice is that, compared to WinBUGS code, the Stan code is much more structured. Stan models are set out in program blocks, each of which has a specific purpose.

- The first block in our code is the
**data**block. This defines the data that is read into the Stan model. We need to tell Stan the type of each variable. For example, the data block specifies that ‘no’ (the number of observations) is an integer, that ‘t’ (the experimental treatment ID variable) is an integer variable of length ‘no’ and that ‘varcov’ (the variance-covariance matrix for the treatment differences) is a matrix of size ‘no’ by ‘no’. If you’d like to learn more about defining data in Stan, then the Stan language manual is very comprehensive. - The second block in our code is the
**parameters**block. This defines the parameters that are sampled in the Markov chain Monte Carlo (MCMC) chains. Following the NICE DSU TSD 2 example, we want to evaluate the difference between each treatment and treatment 1 (‘d’), and the reduction due to treatment 1 (‘A’). In the NICE DSU TSD 2 WinBUGS code, ‘d’ is a vector of length ‘nt’, where ‘nt’ is the number of treatments. The*n*^{th}element of ‘d’ represents the difference between treatment*n*and treatment 1. Hence the 1^{st}element of ‘d’ represents the difference between treatment 1 and itself, and is therefore set to 0. Whilst WinBUGS allows ‘d’ to include a mix of random and non-random values, Stan parameters can only include random values. Hence, in the Stan code, we define a variable called ‘d_param’ which is equivalent to the last ‘nt’-1 elements of ‘d’. - The third block in our code is the
**transformed parameters**block. As you might expect from its name, this block defines transformations of the parameters. We have defined two transformed parameters: ‘d’ and ‘delta’. As described above, the first element of ‘d’ is set to 0; the remaining elements are derived from ‘d_param’. As per the NICE DSU TSD 2 WinBUGS code, ‘delta’ represents the trial specific treatment differences. - The fourth block in our code is the
**model**block. The model block defines the likelihoods. The observed treatment differences (‘y’) have a multivariate normal distribution with mean vector ‘delta’ and covariance matrix ‘varcov’. The reduction due to treatment 1 (‘A’) has a normal distribution with mean ‘meanA’ and standard deviation ‘sdA’. Note that Stan parameterises the variance of the multivariate and univariate normal distributions differently from WinBUGS – using the covariance matrix and standard deviation, rather than the inverse covariance matrix and inverse variance, respectively. The model block can also include prior distributions. In the WinBUGS code, the last ‘nt’-1 elements of ‘d’ are given vague normal priors. However, in Stan, it is not necessary to specify vague priors. If no priors are explicitly specified, Stan assumes a flat prior, over the range of the parameter. Hence for ‘d_param’ Stan will automatically assume that each element has a uniform prior on (-∞, ∞). - The final block in our code is the
**generated quantities**block. The generated quantities block defines functions of the parameters and the data. Following the NICE DSU TSD 2 WinBUGS code, we have calculated the total residual deviance (‘totresdev’), the reduction due to each treatment (‘T’), the differences between each pair of treatments (‘mean_diff’) and the probability that each treatment is best (‘best’).

**R code **

Now that we have seen how the Stan code is set up, we also take a quick look at how to run the Stan code from within R (see Appendix, Figure A2).

Stan works in two stages, compiling and sampling. The model is translated into C++ (programming language) code and compiled, followed by sampling from the posterior distribution. To remain consistent with the WinBUGS example, this model has been run for 3 chains of 100,000 iterations each, after a warmup (or burn-in as referred to in WinBugs) of 50,000. The function returns a ‘stanfit’ object called res.7b.

From the ‘stanfit’ object, the fit statistics specified in the model are printed in R and additional model fit parameters are calculated. These results are summarised in Table 3. The results are nearly the same as those produced by WinBUGS (see Example 7 in Table A9 of the NICE DSU TSD 2), with all differences no greater than 0.01. These analyses are based on random sampling and therefore small differences are expected.

Refer to the NICE DSU TSD 2 for further interpretation of these results.

**Table 3: Summary statistics for the fixed effects model for treatment effects relative to placebo (d _{12} to d_{15}), absolute effects of placebo and each treatment (T_{1 }to T_{5}) and model fit statistics for the Parkinson’s dataset. **

Mean | SD | Median | Credible Interval | |

d_{12} | -1.81 | 0.33 | -0.811 | (-2.47, -1.16) |

d_{13} | -0.48 | 0.49 | -0.48 | (-1.44, 0.47) |

d_{14} | -0.53 | 0.48 | -0.52 | (-1.47, 0.42) |

d_{15} | -0.83 | 0.52 | -0.82 | (-1.85, 0.20) |

T_{1} | -0.73 | 0.22 | -0.73 | (-1.16, -0.30) |

T_{2} | -2.54 | 0.40 | -2.54 | (-3.32, -1.76) |

T_{3} | -1.21 | 0.53 | -1.21 | (-2.26, -0.17) |

T_{4} | -1.26 | 0.53 | -1.25 | (-2.29, -0.22) |

T_{5} | -1.56 | 0.56 | -1.55 | (-2.67, -0.45) |

_{Dres} | 6.3 | |||

pD | 3.8 | |||

DIC | 10.1 |

Dres = Posterior mean of the residual deviance; pD = Leverage; DIC = Deviance Information Criteria

**Conclusions**

In summary, we recommend Stan as an alternative to WinBUGS for network meta-analyses. Its advantages include that it is based on the latest available MCMC methodology, it has a flexible modelling language and it provides useful error messages. Whilst it’s modelling language is somewhat different from that of WinBUGS (and OpenBUGS and JAGS), we believe that it is worth the time to learn it, especially for anyone fitting complex network meta-analysis models. For those willing to make the jump, Stan comes with excellent documentation and an active online community to provide assistance.

**Further information **

We hope you’ve found this post informative. If there are any particular topics you’d like to see covered in the future then please let us know. For further information, please see our poster on Alternatives to WinBUGS for Network Meta-Analysis

**Acknowledgments**

This blog is based on a project conducted by Quantics, in collaboration with the University of Guelph, Canada. Quantics would like to thank Matthew Stephenson at the University of Guelph for his contribution to this project.

**References**

[1] Lu G, Ades AE. Combination of direct and indirect evidence in mixed treatment comparisons. Stat Med 2004; 23: 3105–3124. Available from: http://onlinelibrary.wiley.com/doi/10.1002/sim.1875/full

[2] Woods BS, Hawkins N, Scott DA. Network meta-analysis on the log-hazard scale, combining count and hazard ratio statistics accounting for multi-arm trials: a tutorial. BMC medical research methodology. 2010; 10: 54. Available from: https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-10-54

[3] Dakin HA, Welton NJ, Ades AE, Collins S, Orme M, Kelly S. Mixed treatment comparison of repeated measurements of a continuous endpoint: an example using topical treatments for primary open‐angle glaucoma and ocular hypertension. Statistics in medicine. 2011; 30: 2511-35. Available from: http://onlinelibrary.wiley.com/doi/10.1002/sim.4284/full

[4] Saramago P, Sutton AJ, Cooper NJ, Manca A. Mixed treatment comparisons using aggregate and individual participant level data. Statistics in medicine. 2012; 31: 3516-36. Available from: http://onlinelibrary.wiley.com/doi/10.1002/sim.5442/full

[5] Lunn D, Spiegelhalter D, Thomas A, Best N. The BUGS project: evolution, critique and future directions. Statistics in medicine. 2009; 28: 3049-67. Available from: http://onlinelibrary.wiley.com/doi/10.1002/sim.3680/abstract