Post Covid 19 - Handling Missing Data in Clinical Trials

Daniel James

COVID-19 disruption of clinical trials is likely to result in missing data, disrupted timelines, and perhaps patient population changes. We thought we would dedicate a blog post to the actions that can be considered when a study is facing the issue of missing data.

The COVID-19 situation is still developing and regulators are still clarifying their thinking. However, the FDA has issued guidance [1], and the EMA Committee for Medicinal Products for Human Use (CHMP) has issued a “points to consider” document for public consultation [2].

The main statistical analysis considerations include:

Act quickly: outcomes and results may be safeguarded if appropriate action is taken early.
Preserve the database and keep clear records of how and why data is missing, and of the reasons for subjects discontinuing.
Revise the statistical analysis plan (SAP) where needed to address changes required to the reporting of results.
Consider the wider impacts of COVID-19 on data gathering and patient outcomes, for example social or economic pressures on subjects.

Key Takeaways

Timely actions such as preserving the database, recording clear reasons for missing data or discontinuation, revising the statistical analysis plan, and continuing to collect data where possible are essential to maintain trial integrity and meet ethical and regulatory expectations.
The nature of missingness (missing completely at random, at random, or not at random) determines which analysis methods are appropriate, from complete-case analysis to single imputation, model-based methods, or multiple imputation.
A predefined plan for handling missing data should be included in the protocol and SAP to avoid bias, and sensitivity analyses (e.g. pattern mixture models) are recommended to assess robustness when a substantial proportion of data is missing.

Missing data is common

Missing data is a common problem in all clinical trials, but is normally expected to be a small proportion of the total data set. When only a little data is missing, it is often acceptable to analyse the observed data without special adjustment. If a substantial proportion of the data is affected, as may occur during and after COVID-19, protocol amendments may be required to introduce more sophisticated statistical ways of managing the problem.

This is particularly important if the value of the primary outcome for a patient is missing. This may result in a reduction in the power of the study, and therefore a need to revisit the sample size.
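
As a rough illustration of revisiting the sample size, the sketch below inflates a planned sample size so that the expected number of participants with an observed primary outcome still matches the original target. The function name and the 15% missing rate are hypothetical, and the calculation assumes the data are missing completely at random and that a complete-case analysis is used. It addresses loss of power only, not bias.

```python
import math


def inflate_sample_size(n_planned: int, missing_rate: float) -> int:
    """Number to randomise so that about n_planned participants
    still have an observed primary outcome.

    Assumes (for illustration only) that outcomes are missing
    completely at random at the given rate.
    """
    if not 0 <= missing_rate < 1:
        raise ValueError("missing_rate must be in [0, 1)")
    return math.ceil(n_planned / (1 - missing_rate))


# Hypothetical trial: 200 evaluable participants needed, 15% expected missing.
print(inflate_sample_size(200, 0.15))  # 236
```

Any real recalculation would need to be agreed with the trial statistician and documented in a protocol amendment.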

Minimise the impact of missing data

Some steps can be taken to try to minimise the impact:

Some data is always better than none, so when and where possible continue to collect outcome data, even if it is out of sync or at the wrong time points. It is also an ethical mandate, as pointed out in the EMA paper [2], to proceed with a trial where possible.
Ensure that you have a mechanism for identifying missing data and record the reasons for any subjects discontinuing the trial.
If the trial involves patient reported outcomes, or diaries, consider contacting patients and encouraging them to continue recording even if clinic visits have had to be stopped or delayed.
If a patient discontinues a trial, efforts should be made to obtain the participant’s consent for the use of data on treatments and outcomes. This preserves the ability to analyse endpoints for all participants who underwent randomization and thus to make intention-to-treat inferences grounded in randomization.
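
One possible mechanism for identifying missing data is a routine tabulation of expected versus observed values by arm and visit, together with the recorded reasons. The sketch below is a minimal example; the record layout, field names and values are all hypothetical and would be replaced by the trial's own data structures.

```python
from collections import Counter

# Hypothetical visit records; "outcome" is None when the value was not collected.
records = [
    {"subject": 1, "arm": "A", "visit": 1, "outcome": 5.1, "reason": None},
    {"subject": 1, "arm": "A", "visit": 2, "outcome": 4.8, "reason": None},
    {"subject": 2, "arm": "A", "visit": 1, "outcome": 6.0, "reason": None},
    {"subject": 2, "arm": "A", "visit": 2, "outcome": None,
     "reason": "clinic closed (COVID-19)"},
    {"subject": 3, "arm": "B", "visit": 1, "outcome": 5.5, "reason": None},
    {"subject": 3, "arm": "B", "visit": 2, "outcome": None, "reason": None},
    {"subject": 4, "arm": "B", "visit": 1, "outcome": 5.9, "reason": None},
    {"subject": 4, "arm": "B", "visit": 2, "outcome": 5.2, "reason": None},
]


def missing_summary(rows):
    """Return ({(arm, visit): (n_expected, n_missing)}, Counter of reasons)."""
    totals, missing, reasons = Counter(), Counter(), Counter()
    for row in rows:
        key = (row["arm"], row["visit"])
        totals[key] += 1
        if row["outcome"] is None:
            missing[key] += 1
            # Flag missing values that have no documented reason.
            reasons[row["reason"] or "reason not recorded"] += 1
    by_cell = {key: (totals[key], missing[key]) for key in sorted(totals)}
    return by_cell, reasons


by_cell, reasons = missing_summary(records)
for (arm, visit), (n, n_missing) in by_cell.items():
    print(f"arm {arm}, visit {visit}: {n_missing}/{n} missing")
print(dict(reasons))
```

Entries counted under "reason not recorded" are the ones to follow up while the information can still be obtained.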

Take stock of the accumulated data – Data Monitoring Committee (DMC) and interim analysis

To assess the impact of missing data and changes in patient populations, a blinded review of data by an independent DMC should be considered. The review will be able to advise on changes to trial design, sample size, formal missing data management and the potential for an interim efficacy or futility analysis, if the trial progress was reasonably advanced when the lockdown hit.

An interim analysis that was not originally planned can affect both the type I error rate and the power of the study, and MUST be carefully planned with statistical support to maintain scientific validity. The proposed analysis must be fully documented and, if unblinding is considered, a separate statistical team may be required to maintain blinding for the main clinical teams and CRO staff.
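
To illustrate why an unplanned efficacy look needs statistical planning, the simulation below estimates the false-positive rate when a trial with no true treatment effect is tested once at the end, compared with testing at the halfway point and again at the end, each at an unadjusted two-sided 5% level. This is a simplified sketch: it assumes a z-test, an interim look at exactly half of the information, and no alpha-spending adjustment. The function name and settings are hypothetical.

```python
import random
from statistics import NormalDist


def type_one_error(n_sims=100_000, interim=True, alpha=0.05, seed=1):
    """Simulated probability of a false-positive result under the null."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        z_first = rng.gauss(0, 1)   # z-statistic from the first half of the data
        z_second = rng.gauss(0, 1)  # z-statistic from the second half alone
        z_final = (z_first + z_second) / 2 ** 0.5  # z-statistic from all the data
        if (interim and abs(z_first) > crit) or abs(z_final) > crit:
            rejections += 1
    return rejections / n_sims


print("final analysis only:    ", type_one_error(interim=False))
print("unadjusted interim look:", type_one_error(interim=True))
```

Under these assumptions the unadjusted extra look pushes the false-positive rate noticeably above the nominal 5%. Correcting for this (for example with a group-sequential boundary) makes the threshold at each look stricter, which is one way an added interim analysis costs power.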

Plan for management of missing data

Assuming the trial is to continue, now is the time to define in advance the plan for dealing with the increase in missing data. Formal planning is necessary to avoid introducing bias into the comparison of treatments, which could happen if the methods were chosen after seeing the data.

There is no single correct analysis when data are missing: all methods require assumptions which are usually unverifiable. The best approach in this post COVID-19 situation (where the missing data is related to external circumstances beyond the control of the trial) will depend on what is missing.

There are three fundamentally different scenarios for missing data:

Missing completely at random (MCAR): when all participants are equally likely to have any given variable missing. This means that the complete cases are representative of all the original cases as randomized, so they can be used for inferences about the treatments.

Missing at random (MAR): when a recorded characteristic of the participant can account (or partially account) for differences between the observed and the missing cases.

As an example, suppose that the trial requires diary data on exercise. In the current situation, it may be that older patients, considered more at risk of COVID-19, are less willing to exercise outside than younger patients and so record fewer diary entries. The missing entries are related only to being older and more concerned about COVID-19, and age is a recorded characteristic. They are not related to the patients' actual level of fitness.

Missing not at random (MNAR): when the data is missing because of a factor related to the unobserved values themselves, and hence to the primary analysis. For example, MNAR would occur if older patients stopped recording exercise because they had become unfit due to COVID-19. In this case, any analysis of the study endpoints has the potential to be biased by the missing data.
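
The practical difference between the three scenarios can be seen in a small simulation. The sketch below uses entirely hypothetical numbers: an "older" and a "younger" group with different average weekly exercise, and three invented missingness rules. It compares the true mean with the complete-case mean and with a mean adjusted for the recorded age group.

```python
import random
from statistics import mean


def simulate(mechanism, n=20_000, seed=0):
    """Return (older, exercise_minutes, observed) tuples for one mechanism."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        older = rng.random() < 0.5
        y = rng.gauss(40 if older else 60, 10)  # hypothetical exercise minutes
        if mechanism == "MCAR":
            p_missing = 0.3                       # same for everyone
        elif mechanism == "MAR":
            p_missing = 0.5 if older else 0.1     # depends on recorded age only
        elif mechanism == "MNAR":
            p_missing = 0.5 if y < 50 else 0.1    # depends on the value itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        rows.append((older, y, rng.random() >= p_missing))
    return rows


def estimates(rows):
    """Return (true mean, complete-case mean, age-adjusted mean)."""
    true_mean = mean(y for _, y, _ in rows)
    complete_case = mean(y for _, y, obs in rows if obs)
    adjusted = 0.0
    for group in (False, True):
        in_group = [r for r in rows if r[0] == group]
        observed_mean = mean(y for _, y, obs in in_group if obs)
        # Weight each age group by its share of all randomised participants.
        adjusted += observed_mean * len(in_group) / len(rows)
    return true_mean, complete_case, adjusted


for mechanism in ("MCAR", "MAR", "MNAR"):
    true_mean, cc, adj = estimates(simulate(mechanism))
    print(f"{mechanism}: true={true_mean:.1f} complete-case={cc:.1f} adjusted={adj:.1f}")
```

In this set-up the complete-case mean is close to the truth only under MCAR, adjusting for the recorded characteristic repairs the estimate under MAR, and neither estimate is reliable under MNAR.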

Possible analysis approaches include the following:

Complete-case analysis: participants with missing data are simply excluded from the analysis. This would be the basis of the interim analysis suggested above. This approach is ONLY valid if the data are missing completely at random, so the complete cases are representative of all the original cases as randomized.

Single imputation methods: a single value is filled in for each missing data point by means of, for example:
1. Last observation carried forward (LOCF).
2. Baseline observation carried forward (BOCF).
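These methods rest on strong assumptions, for example that a participant's outcome would have stayed at its last observed value, which may not hold even if data are missing at random. They also treat imputed values as if they had been observed, so uncertainty is understated. Clinical input (blinded to the data) is generally needed to inform and support the choice of imputation method.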
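
For readers unfamiliar with the two single imputation methods, a minimal sketch of each is shown below for one participant's sequence of visit scores. The function names and the scores are hypothetical, and the first value in the list is assumed to be an observed baseline.

```python
def locf(values):
    """Last observation carried forward: replace each None with the
    most recent observed value."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def bocf(values):
    """Baseline observation carried forward: replace each missing
    post-baseline value with the baseline (first) value."""
    baseline = values[0]
    return [baseline if value is None else value for value in values]


# Hypothetical scores at baseline and five follow-up visits.
scores = [52.0, 48.0, None, 45.0, None, None]
print(locf(scores))  # [52.0, 48.0, 48.0, 45.0, 45.0, 45.0]
print(bocf(scores))  # [52.0, 48.0, 52.0, 45.0, 52.0, 52.0]
```

The example shows how different the filled-in values can be for the same participant, which is why the choice needs to be justified in advance.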

Methods based on statistical models, including:
1. Repeated measures methods, in which observations are assumed to have a normal distribution with a specified form of mean and covariance matrix.
2. Bayesian methods, in which inferences are based on a statistical model that includes an assumed prior distribution for the measurements.
3. Multiple imputation, in which multiple sets of plausible values for missing data are created from their model-based predictive distribution and estimates and standard errors are obtained with the use of multiple-imputation combining rules.
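
The combining rules mentioned for multiple imputation are usually Rubin's rules: the pooled estimate is the average of the per-imputation estimates, and the pooled variance adds the average within-imputation variance to an inflated between-imputation variance. The sketch below is a minimal version; the function name and the input numbers are hypothetical, and it omits the degrees-of-freedom calculation needed for confidence intervals.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their squared standard errors.

    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical treatment-effect estimates from five imputed data sets,
# each with a standard error of 0.5 (variance 0.25).
effect, se = pool_rubin([2.0, 2.4, 1.8, 2.2, 2.1], [0.25] * 5)
print(f"pooled effect = {effect:.2f}, pooled SE = {se:.3f}")
```

The pooled standard error is larger than any single imputed data set would suggest, which is how multiple imputation reflects the uncertainty about the missing values.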

The choice of strategy should be described and justified in the statistical section of the protocol and the assumptions underlying any mathematical models employed should be clearly explained. Analysis methods that are based on plausible scientific assumptions should be used.

Sensitivity analysis

Whichever analysis method is chosen, a sensitivity analysis should be conducted for the method of handling missing values, especially if the number of missing values is substantial. One approach is to use pattern mixture models, which examine subgroups of participants with different patterns of drop-out.
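
One simple sensitivity analysis in the pattern mixture family is a delta adjustment: values imputed for drop-outs in the treated arm are shifted by an amount delta, and the analysis is repeated over a range of deltas to find the point at which the conclusion changes. The sketch below is a deliberately simplified illustration, not a full pattern mixture model. It uses single mean imputation, so it gives point estimates only. The function name and data are hypothetical, and higher scores are assumed to be better.

```python
from statistics import mean


def delta_adjusted_difference(treated, control, delta):
    """Difference in means (treated - control) after imputation.

    Missing values (None) are imputed with the observed mean of their
    own arm. Imputed values in the treated arm are then shifted by
    `delta` (negative = drop-outs did worse than observed participants).
    """
    def fill(values, shift):
        observed_mean = mean(v for v in values if v is not None)
        return [observed_mean + shift if v is None else v for v in values]

    return mean(fill(treated, delta)) - mean(fill(control, 0.0))


# Hypothetical outcome scores; None marks a missing value.
treated = [10, 12, None, 14, None]
control = [8, 9, 10, None, 9]

for delta in (0.0, -2.5, -5.0, -7.5):
    diff = delta_adjusted_difference(treated, control, delta)
    print(f"delta = {delta:5.1f}  estimated difference = {diff:.2f}")
```

If the delta needed to remove the treatment effect is clinically implausible, the result can be considered robust to this departure from the missing at random assumption. In practice the shift would normally be applied within a multiple imputation procedure so that standard errors are also valid.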

References:

[1] FDA. Guidance on Conduct of Clinical Trials of Medical Products during COVID-19 Pandemic. https://www.fda.gov/media/136238/download
[2] Committee for Medicinal Products for Human Use. Points to consider on implications of Coronavirus disease (COVID-19) on methodological aspects of ongoing clinical trials. European Medicines Agency, 25 March 2020, EMA/158330/2020. https://www.ema.europa.eu/en/implications-coronavirus-disease-covid-19-methodological-aspects-ongoing-clinical-trials

About the Author

Daniel James

Daniel joined Quantics in 2015. He has a Masters in Applied Statistics and Datamining from the University of St Andrews in Scotland. Since joining Quantics, Daniel has been part of our HTA team. He has used R and WinBUGS to conduct network meta-analyses for urology, ophthalmology and respiratory indications. He has also been involved in the reporting of these analyses.
