ML-NMR: the Future of Population Adjustment?

Matt Chapman-Rounds

This is the third blog in our trilogy on techniques for population adjustment in network meta-analysis (NMA). Here we will focus on an approach called Multi-Level Network Meta-Regression (ML-NMR), which improves on several aspects of the methods we described in our first two blogs – Matching Adjusted Indirect Comparison (MAIC), and Simulated Treatment Comparison (STC).

NMAs are an important part of HTA submissions. HTA submissions often involve demonstrating the relative efficacy or safety of one treatment compared to one or more competitor treatments. Standard NMAs, which use only aggregate data, are forced to rely on the assumption that the covariate characteristics of the various trial populations are the same, in all the ways that would affect the relative benefits of the treatments for the outcome of interest.

This assumption is very strong, and may be very wrong. However, standard NMAs have the advantage of providing a general probabilistic framework which allows us to integrate data from as many trials and treatments as we can connect to a network.

We often have access to individual patient data (IPD) for one or more trials. If this is the case, we can use these IPD to try to adjust for differences in the covariates between populations.

Key Takeaways

Multi-Level Network Meta-Regression (ML-NMR) combines individual patient data (IPD) and aggregate trial data in a single probabilistic framework, improving population adjustment over standard NMA, MAIC and STC.
ML-NMR uses a network meta-regression structure: a regression model for treatment–covariate relationships is learned from IPD and embedded inside the Bayesian NMA model, allowing covariate adjustment across the whole evidence network.
Aggregate-only trials are incorporated by modelling “imaginary” individuals whose covariates are drawn from reported summaries (e.g. means, standard deviations), so more of the available evidence contributes to treatment comparisons.
The method relies on assumptions (e.g. about covariate distributions, linear relationships, shared variance structures) and is suited to anchored comparisons; these assumptions must be checked and interpreted carefully.

MAIC and STC do this in a limited fashion; they estimate relative treatment effects between pairs of trials, where we have IPD for one of the trials.

ML-NMR instead embeds a probabilistic approach to population adjustment into the general probabilistic model used for standard aggregate NMAs. It is consistent with those approaches, whilst allowing them to now directly include IPD from trials which have it, and covariate information from all trials. This makes it a substantial improvement over other population adjustment approaches, and indeed over the standard NMA approaches, whenever IPD are available.

We recommend reading some of our earlier blogs for context. In them we have described standard NMAs, population adjustment (and why we prefer it), MAIC, and STC.

Blog overview and caveats

ML-NMR is a maths-heavy modelling approach, which relies on Bayesian statistics and inference in figures. Out of consideration for my readers’ time, I’ll be skipping most of the maths. This means that this blog is not a technical document – those who wish to understand how to implement ML-NMR should read the paper, and try the multinma R package. For a deeper dive, David Phillippo’s PhD thesis is a good place to start.

Instead, this blog will try to describe the main insights which lie at the centre of the approach. These correspond quite naturally to the two halves of the ML-NMR acronym. The method is ‘multi-level’ because it combines both aggregate data and individual data into a single probabilistic model. The method is a ‘network meta-regression’ because (much like STC) it learns a linear model using available IPD. Unlike STC, this linear model is embedded inside the probabilistic model.

Setting

Let’s imagine a simple scenario where we have two trials; a trial comparing two treatments A and B (trial AB) and a trial comparing two treatments A and C (trial AC), where we have IPD for trial AB, but only aggregate data for the AC trial. We can imagine that A is placebo, B is our company’s treatment (hence why we have IPD for the AB trial – we ran it ourselves), and C is a competitor treatment (hence we only have published aggregate data). The figure below shows the relationship between the treatments (nodes) and the trials (edges).

We wish to compare treatment B to treatment C in terms of some outcome. Comparisons between treatments always depend on the population in which the comparison is made. In a head-to-head trial, we randomise patients between arms to balance the populations, and so we can assume that the trial population is homogeneous. For an indirect comparison we don’t have this guarantee; the AC population could be substantially different from the AB population.

For clarity in this blog, we’re only going to focus on the relationship between age and the biomarker we are using to measure effect. The figure below shows (in green) the histogram of ages from the participants in the AB trial, for which we have IPD. The vertical dotted line (in brown) is the reported mean of the ages of participants in the AC trial. We also know the standard deviation of the ages of participants in the AC trial.

It seems clear that there are substantial differences between the trial populations in terms of age; the AC trial has been conducted on an older population, with less variance in age. Mean age in the AB trial is 35.8, with a standard deviation of 10.0. Minimum age is 18.

Naturally, a one-dimensional model is not particularly powerful; ideally we would fit a multivariate linear model to as many covariates as we have available (and this is also what NICE and other regulatory agencies would require). However, we can plot what is happening in one dimension much more easily than in higher dimensions.

Receive every Quantics blog as soon as it’s released

Subscribe to the Quantics Blog

ML-NMA

As we stated in the introduction, our goal is to adjust for differences in trial populations so that our comparison of the outcomes of interest is not biased by an imbalance in patient characteristics that are treatment effect modifiers (see our blog on the difference between prognostic variables and treatment effects). The advantage of IPD is that we can use them to directly learn the relationship between each characteristic of the population and the outcome of interest. The simplest way to do this is to fit a linear model; just as we do in STC.

The figure below shows an example of this approach, where we have fit a straight line to the population of trial AB to model the relationship between age and the outcome of interest (biomarker change from baseline).

This regression model has two parameters – fitting the model involves finding the most likely values of these parameters, given the data. They are the slope, which governs the relationship between the covariate and the outcome (here it takes the value 0.04), and the bias or intercept, which shifts the model so it is at a meaningful location (here with the value -2.18).

If we had IPD for both trials, we would estimate these parameters using all the data in both trials. For data from either trial, the slope parameter would be shared. This reflects our assumption that if a relationship between age and outcome exists, it will be the same irrespective of trial.

The bias parameter allows the model to account for variation between trials which affects the outcome, but is not due to age – in practice this bias parameter would have multiple components accounting for things like trial specific effects and treatment specific effects.

Standard NMA, when we do not have any IPD, just estimates the bias parameter. The network structure is formed by trials having multiple treatment arms, and treatments appearing in multiple trials. If we have IPD, adding in the ability to model covariates gives our analyses a whole extra informative dimension.

ML-NMR takes the following approach: if we have IPD we can estimate the slope parameter corresponding to a particular covariate (here, age) from these IPD. We can then model the non-IPD trials as though their aggregate outcomes were being generated by individual data. As we know the aggregate outcomes, we can then infer things about these imaginary individuals.

To model the likely contribution from an individual, we use the regression model we have learned from the IPD elsewhere (i.e. the figure above). To capture the aggregate outcome that this would lead to in the modelled trial, we use the AC trial statistics we have.

Imagine we only have the mean age from the AC trial. We could simply plug the mean into the regression model, and get a point estimate for the outcome. The figure below shows what this would look like.

If you read our STC blog, this figure should look familiar. This a model much like STC, albeit a probabilistic version of STC able to integrate the data from multiple trials simultaneously. However, only relying on a point estimate like this is problematic. As a single point would not account for any variance in the trial (which is certainly there) we would run the risk of overconfidence in the conclusions of our analysis.

However, more information means we can do more. We actually have both mean and standard deviation for how age was distributed in the AC trial. We can now build a distribution which is representative of how age might look in the AC trial. With only mean and standard deviation available, our most conservative estimate is to use a Normal distribution.

On the figure below I have plotted the distribution (in brown) of 1000 imaginary participant ages, drawn from the normal distribution parameterised with the mean and variance of the AC trial. To make its relationship to the model of age-biomarker change learned from the AB trial clear, I have plotted it on the age axis of the regression figure; the black curve is the normal distribution from which I have drawn the samples.

Finally, we can now estimate a distribution of biomarker change in the AC trial. You can think of this as effectively doing the mean estimate, but for each of the 1000 imaginary participants, and then plotting the resultant histogram on the biomarker change axis. Once we have this sample from the distribution over biomarker change, we can compute anything that we need for the other parts of the probabilistic model; mean, median, standard deviation, and so on.

This allows us to improve the standard NMA model by including the details of covariate distributions in the populations of the different trials for which we have aggregate information. A general principle is that if you can add more data without biasing your analysis, your results will be more precise.

By using ML-NMA we can include all the subject data which has until now been languishing unused in aggregate published studies. In addition, we can integrate the IPD for trials where it is available directly into the probabilistic network model. The potential improvement to our ability to perform network analyses should not be understated.

Furthermore, if we know the covariate distribution of a particular target population, we can estimate outcomes directly on that target population, using the above approach. This gives us flexibility when performing subsequent analyses (such as economic analyses) where there are certain populations of interest.

Caveats

Whilst ML-NMR has allowed us to drop one particularly strong assumption – that the population covariates are identically distributed between different trials – we have had to make several new assumptions.

Firstly, covariates may not be Gaussian distributed. We make this assumption because it is the most conservative estimate we have; it may be quite wrong. However, ML-NMR is appealing because if we do have more information than mean and standard deviation, we can integrate it into the model – the approach is extremely flexible.

Secondly, I have not talked at all about how to make the probabilistic model consistent between the individual level and the aggregate level. This requires some careful choosing of likelihood functions.

Thirdly, in the multivariate setting, we use the IPD to model the between-covariate variance, and assume this is the same for all trials. As ever, this is an assumption we have to make due to lack of data: it may well be wrong.

All of the cautions which come with using linear models still apply; see the caveats section of the STC blog for more details.

ML-NMR is not suitable for unanchored comparisons as MAIC and STC may be. This isn’t the largest downside – unanchored comparisons are a tool of last resort anyway – but is something to be aware of.

Lastly, this blog is not a technical document! Hopefully it has given a reasonable no-maths summary of what ML-NMR does, and how it improves the current state-of-the-art in model adjustment.

Conclusion

ML-NMR embeds a probabilistic approach to population adjustment into the general probabilistic model used for standard aggregate NMAs. It is consistent with those approaches, whilst allowing them to now directly include IPD from trials which have it, and covariate information from all trials. This makes it a substantial improvement over other population adjustment approaches, and indeed over the standard NMA approaches, whenever IPD are available.

Useful Links

ML-NMR paper: Phillippo, David M et al. “Multilevel network meta-regression for population-adjusted treatment comparisons.” Journal of the Royal Statistical Society. Series A, (Statistics in Society) vol. 183,3 (2020): 1189-1210. doi:10.1111/rssa.12579

R package multinma

David Phillippo’s PhD thesis Calibration of Treatment Effects in Network Meta-Analysis using Individual Patient Data

NICE guidelines to population adjustment: TSD 18

About the Author

Matt Chapman-Rounds

Matt is writing up a PhD at the University of Edinburgh. At Quantics, he works as a statistician with a focus on NMA. He has experience with machine learning, data science and experiment design, with additional interests in Model Explainability, inference in graphs, and bayesian models of cognition.

View all posts

ML-NMR: the Future of Population Adjustment?