Sep 17

Challenges of Clinical Trials for Vaccines: Sample Sizes

To perform any experiment, it is vital to ensure that enough data is collected. Too little data not only means imprecise results, but also makes it difficult to distinguish between reality and statistical fluke. This concept of statistical power – and, inherent within it, sample size – is a key component of trial design: a trial’s statistical power must be shown to be sufficient well before the trial can begin. Trials used for vaccine testing are no different. Indeed, vaccine trials often require larger sample sizes than similar trials for other drug products. Here, we’re going to examine why this arises from the nature of vaccine targets, and how the requirements of each stage of the clinical trial process are reflected in typical sample sizes.

The Importance of Sample Size

Sample size calculations are a vital stage of any clinical trial. So vital, in fact, that we’ve covered them in detail in a previous blog. Nevertheless, we will go over the key concepts of why we care about sample size here.

Key Takeaways

  • Vaccine trials often require larger sample sizes than other drug trials due to several unique challenges: the need to observe natural infection rates in healthy participants, the requirement to demonstrate meaningful efficacy above a non-zero threshold, and heightened safety expectations given their use in healthy populations.

  • Statistical power is central to clinical trial design, with sample size being a controllable factor among those that influence it. Vaccine trials must ensure sufficient power to detect efficacy while also being large enough to observe rare adverse events.

  • Sample size increases across clinical phases, starting from small phase 1 studies (~100 participants) focused on safety and immunogenicity, to phase 3 trials that may require tens of thousands for robust efficacy and safety data, and post-marketing surveillance (phase 4) that monitors millions for extremely rare side effects.

The purpose of a clinical trial is, at the most fundamental level, to draw a conclusion about the world based on the evidence presented by the trial. Either the evidence collected shows that the vaccine is effective, or the trial does not provide such evidence. We can reduce this to the form of a hypothesis test. We define a null hypothesis, H_0, and assume it’s true. Evidence from the trial is then assessed to determine whether it is sufficient to reject the null and instead accept the alternative hypothesis, H_a. A simplified hypothesis test might be:

H_0: The vaccine is no better than placebo at preventing infection

H_a: The vaccine is better than placebo at preventing infection

Now, it is possible for a trial to provide evidence which leads to us drawing an incorrect conclusion by random chance. These are known as errors, and they come in two types:

Type I Error: Reject the null based on provided evidence, even though the null is true. The probability of a Type I error is represented by \alpha.

Type II Error: Fail to reject the null based on provided evidence, even though the null is false. The probability of a Type II error is represented by \beta.

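[Table: possible outcomes of a hypothesis test. Rejecting H_0 when it is true is a Type I error (probability \alpha); failing to reject H_0 when it is false is a Type II error (probability \beta); correctly rejecting H_0 when it is false has probability 1-\beta.]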

From the table above, we can see that the probability of the trial providing sufficient evidence to reject the null correctly is 1-\beta. This is an important quantity – known as the statistical power – as we usually set up hypothesis tests so that the alternative hypothesis is the outcome we’re interested in. In our example, H_a is that the vaccine is effective compared to placebo, which is the result we want as a vaccine developer. As a result, we want to ensure that the probability of our trial coming to this conclusion correctly is well understood.

Several factors influence the statistical power of a trial, including the effect size (the difference in efficacy between the vaccine and placebo), the variability of the effect, and the sample size of the trial. Of these, only the sample size is under our control. As a result, the sample size of a trial is carefully calculated under assumptions made about the other factors to ensure that the statistical power is sufficiently high while minimising the cost of the trial.
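To make the link between sample size and power concrete, here is a minimal sketch using a normal approximation for a one-sided comparison of two infection proportions. The function name, the infection rates (2% on placebo, 1% on vaccine), and the significance level are illustrative assumptions rather than values from any particular trial.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_placebo, p_vaccine, n_per_arm, alpha=0.025):
    """Approximate power of a one-sided two-proportion z-test.

    Uses the normal approximation; all inputs are hypothetical planning values.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    # Standard error under the null (pooled proportion) and under the alternative.
    p_bar = (p_placebo + p_vaccine) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = sqrt(
        p_placebo * (1 - p_placebo) / n_per_arm
        + p_vaccine * (1 - p_vaccine) / n_per_arm
    )
    diff = p_placebo - p_vaccine
    return z.cdf((diff - z_alpha * se_null) / se_alt)


# Power rises as the number of participants per arm increases.
for n in (500, 2000, 8000):
    print(n, round(power_two_proportions(0.02, 0.01, n), 3))
```

With the effect size and variability held fixed by assumption, the only lever left in this calculation is n_per_arm, which is why sample size is the quantity that gets tuned at the design stage.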

Sample Sizes for Vaccine Trials

Vaccines for many diseases are tested using clinical trials in the human population. This is often very powerful as it is a direct test of the vaccine’s safety and efficacy on the stage on which it is expected to perform. There are, however, some challenges associated with such clinical trials which require vaccine trials to have larger sample sizes than comparable trials for other products.

One such consideration is the occurrence rate of the disease in the population. In most clinical trials, the subjects already have a disease, and the trial is testing a method of curing or managing that condition. Every subject you enrol is guaranteed to provide a datapoint in your study. In a population study for a vaccine, this is not the case. The subjects are healthy when they are vaccinated, and there is no guarantee of how many will be exposed to the disease in question going about their daily lives.

As such, when considering the sample size for such a trial, one must take into account the prevalence of the disease in the population in which the vaccine is being tested. If the number of infections is too low, the trial will not have sufficient statistical power to adequately determine the efficacy of the vaccine against placebo.
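As a rough illustration of how the infection rate limits the data a trial produces, the sketch below estimates the expected number of cases in each arm. The function name and all inputs (attack rate, efficacy, allocation) are hypothetical planning assumptions.

```python
def expected_cases(n_total, attack_rate, vaccine_efficacy, vaccine_fraction=0.5):
    """Expected cases in (placebo, vaccine) arms over the follow-up period.

    attack_rate is the assumed probability that an unvaccinated participant
    becomes a case during follow-up.
    """
    n_vaccine = n_total * vaccine_fraction
    n_placebo = n_total - n_vaccine
    cases_placebo = n_placebo * attack_rate
    # Vaccination scales the attack rate by (1 - efficacy).
    cases_vaccine = n_vaccine * attack_rate * (1 - vaccine_efficacy)
    return cases_placebo, cases_vaccine


# Same enrolment, two assumed attack rates: 1% versus 0.1%.
for rate in (0.01, 0.001):
    placebo, vaccine = expected_cases(10_000, rate, vaccine_efficacy=0.6)
    print(f"attack rate {rate}: ~{placebo:.0f} placebo cases, ~{vaccine:.0f} vaccine cases")
```

Under these assumptions, a tenfold drop in the attack rate leaves the same 10,000 participants contributing only a tenth as many cases, and it is the cases that carry the information about efficacy.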

This is often exacerbated by the requirement for a vaccine to surpass a non-zero efficacy margin. We expect vaccines to be a substantial improvement over placebo – a vaccine which only showed a 10% efficacy, say, would only be useful against the most severe diseases. So, we might set the null hypothesis to be that the vaccine is no more than 30% effective compared to placebo. This increases the required sample size to achieve a desired statistical power.
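One way to see the cost of a non-zero margin is to count the cases needed rather than the participants. The sketch below assumes 1:1 randomisation with equal follow-up, so that each case falls in the vaccine arm with probability (1 - VE) / (2 - VE), and applies a standard normal-approximation formula for a one-sided binomial test. The true efficacy of 60%, the 90% power, and the function names are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def vaccine_case_fraction(ve):
    """Probability that a case is in the vaccine arm (1:1 allocation assumed)."""
    return (1 - ve) / (2 - ve)


def events_required(ve_true, ve_null, alpha=0.025, power=0.9):
    """Approximate total cases needed to reject 'efficacy <= ve_null'."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    theta0 = vaccine_case_fraction(ve_null)  # case split under the null
    theta1 = vaccine_case_fraction(ve_true)  # case split under the alternative
    numerator = (
        z_alpha * sqrt(theta0 * (1 - theta0))
        + z_beta * sqrt(theta1 * (1 - theta1))
    )
    return ceil((numerator / (theta0 - theta1)) ** 2)


print("margin 0%: ", events_required(ve_true=0.6, ve_null=0.0))
print("margin 30%:", events_required(ve_true=0.6, ve_null=0.3))
```

Raising the margin moves the null hypothesis closer to the assumed true efficacy, so more cases, and therefore more participants or longer follow-up, are needed to tell the two apart.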

For diseases with low prevalence, an alternative approach is the challenge trial, in which subjects are deliberately exposed to the disease after vaccination. This guarantee of exposure means the infection rate is no longer a factor in the statistical power of the trial, and can allow a smaller sample size. Human challenge trials, however, pose a greater risk to participants due to the disease exposure, and are usually inappropriate for diseases for which severe outcomes are likely.

A further factor that leads to vaccine trials requiring larger sample sizes than for other products is a greater emphasis on safety. Vaccines are one of the few medical interventions which are typically performed on healthy individuals – they are intended to prevent disease rather than treat or cure. This means that there are special ethical considerations which must be made when testing a vaccine. In particular, safety plays an even more prominent role in the assessment of vaccine performance than it does for other medicinal products. There is no guarantee that patients given a vaccine would ever have been exposed to the disease they are now protected against, so even rare side effects can shift the cost-benefit analysis of a particular vaccine.

That means it is important that late-stage clinical trials for vaccines are sufficiently large that rare side effects can be detected. The FDA recommend a minimum of 3000 participants be included in a pre-licensure safety database. This number ensures that, if no serious side effects are observed among those 3000 subjects, one can infer an approximate occurrence rate of at most 1 in 1000 with 95% confidence.
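The 1 in 1000 figure can be checked with a short calculation. If no events are seen in n subjects, the one-sided 95% upper bound on the true event rate solves (1 - p)^n = 0.05, which is closely approximated by the "rule of three", 3/n. The function name below is our own; the sketch assumes independent subjects with a common event probability.

```python
def upper_bound_zero_events(n, confidence=0.95):
    """Exact one-sided upper confidence bound on the event rate
    when zero events are observed among n independent subjects."""
    return 1 - (1 - confidence) ** (1 / n)


n = 3000
exact = upper_bound_zero_events(n)
rule_of_three = 3 / n

print(f"exact upper bound: {exact:.6f} (about 1 in {1 / exact:.0f})")
print(f"rule of three:     {rule_of_three:.6f} (1 in {1 / rule_of_three:.0f})")
```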

Sample Sizes through Trial Phases

While exact sample size choices are made with the specific requirements of a certain vaccine in mind – key outcomes, safety profile, response variability, etc – there are general trends in the typical sample sizes used at each stage in the vaccine testing process. These phases have different goals, approaches, and regulatory imperatives, meaning the number of subjects enrolled in each will vary noticeably. Let’s examine these sample size trends for each phase and discuss the reasoning behind these choices.

Phase 1

A phase 1 trial is usually the first time a vaccine candidate has been used in human subjects. The goal of a phase 1 trial is to assess the vaccine for short-term side effects – injection site reactions, for example – as well as the ability of the vaccine to generate an immune response: its immunogenicity.

The sample size for a phase 1 trial is usually small – often no more than 100 subjects. The objectives of the trial are a key reason for its size. A phase 1 trial is the first dip of a toe into the water of proving the vaccine is safe and effective in humans: we are looking to show that there are no dramatic side effects which went undetected in preclinical trials and that the vaccine at least generates an immune response. We are not worried at this stage about demonstrating the efficacy of the vaccine against the disease in question.

As such, there is typically no formal hypothesis testing for efficacy involved in a phase 1 trial. This means we are not concerned with the statistical power of the study, meaning we can be content with a smaller sample size. From an ethical perspective, this also has the benefit of minimising the number of volunteer subjects who are exposed to a vaccine candidate which is to date untested in humans. The downside, however, is that any statistical analysis which is required for assessing the goals of the study may be hampered by the small sample size, and results should often be viewed with caution. Nevertheless, this is seen as a reasonable trade-off given that these outcomes will be confirmed by further trials should the candidate be deemed suitable to pass phase 1.

Phase 2

Phase 2 trials are an extended version of phase 1, with the goal once again to understand the safety of the vaccine before more extensive efficacy trials and demonstrate its immunogenicity. In this case, however, the trial population is expanded to better match the expected target population of the vaccine candidate. This may include participants from higher risk groups than participated in phase 1.

The resulting sample size is typically in the low hundreds – often between 100 and 300 subjects. This is driven by the need to demonstrate a statistically significant difference in immunogenicity between study arms, which requires the study to have sufficient statistical power. This often stops short of powering the study such that efficacy can be definitively proven, but efficacy information can sometimes be gathered if the target disease is prevalent enough. Indeed, phase 2 studies can sometimes be used as a “proof of concept” efficacy trial to establish feasibility before more extensive phase 3 trials. More commonly, the opposite approach is taken, and phase 2 is combined with phase 3 trials or skipped altogether. This occurs most frequently when the immune response to the vaccine was very strong in phase 1 and/or the need for the vaccine is especially urgent, as was the case for the COVID-19 vaccines.

Phase 3

Phase 3 is often known as “pivotal” for a reason: it is where the efficacy of the vaccine candidate is finally proven once and for all. Key performance endpoints are evaluated in the treatment group vs the control group and the results used in formal statistical testing to determine the efficacy of the vaccine. This requires a large sample size such that the study is sufficiently powered.

As we’ve discussed previously, there are many factors which affect the sample size required for a trial to achieve a high enough statistical power. These include the prevalence of the disease in the population and the target efficacy. This leads to a wide range of sample sizes, from the low thousands to several tens of thousands of participants. One method is to take an event-driven approach: the trial period or enrolment continues until a set number of key events – such as infection, hospitalisation, or death – occur among the trial population. This can ensure the statistical power of the trial is sufficient, but does lead to uncertainty about the eventual cost and duration of the trial.
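The uncertainty in duration for an event-driven design can be illustrated with a back-of-the-envelope estimate. The sketch below assumes a constant incidence rate, 1:1 allocation, everyone enrolled at the start, and no dropout, all of which are simplifications; the function name and the numbers are hypothetical.

```python
def expected_years_to_target(target_events, n_total, placebo_rate_per_year, vaccine_efficacy):
    """Rough expected follow-up (years) until target_events cases accrue.

    Ignores staggered enrolment, dropout, and the removal of cases from
    the at-risk population.
    """
    n_per_arm = n_total / 2
    events_per_year = (
        n_per_arm * placebo_rate_per_year
        + n_per_arm * placebo_rate_per_year * (1 - vaccine_efficacy)
    )
    return target_events / events_per_year


# The same event target under three assumed placebo incidence rates.
for rate in (0.02, 0.01, 0.005):
    years = expected_years_to_target(150, 30_000, rate, vaccine_efficacy=0.6)
    print(f"incidence {rate:.3f}/year: ~{years:.2f} years")
```

Because the incidence rate is not known in advance, and can change during the trial, the time taken to reach the event target can differ substantially from the planning estimate.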

Phase 3 is also the first opportunity to assess the vaccine candidate for any rare adverse events. The sample sizes in phase 1 and 2 will be far too small to expect to observe many of these side effects. Generally, the sample size required to power efficacy conclusions is such that the trial population will be large enough to provide a good test for rare side effects.

In cases where there is reason to be extra careful, however, the sample size may need to be further expanded. For example, rotavirus vaccines have been known to be associated with an increased risk of a type of bowel obstruction known as intussusception. When a novel rotavirus vaccine was developed, therefore, consideration was given to the required sample size to establish the risk of intussusception associated with the new vaccine. This turned out to be more than 60,000!

Phase 4

Once the efficacy of a vaccine has been demonstrated in a phase 3 trial, it can then be licensed for general use. In most cases, however, this is not the end of the story. There may be extremely rare side effects of the vaccine which would require infeasibly large phase 3 populations to be detected, but which may nevertheless prove serious or even deadly.

As a result, there are typically post-marketing surveillance requirements for vaccines which are often considered a fourth phase of the testing process. These studies don’t require formal sample size justification per se, but one could consider their study populations to be in the millions or even billions for the widest-used vaccines. Any adverse events which arise after the clinical use of the vaccine are reported to safety surveillance bodies, such as VAERS in the US. This allows extremely rare vaccine side effects to be detected and adjusted for where required.

AstraZeneca’s COVID-19 vaccine proved to be an example of this process in action. In 2021, the EMA announced that post-marketing surveillance of the vaccine in real-world use had provided evidence of unusual blood clotting associated with it. At the time of that announcement, 222 events had been detected out of approximately 34 million people vaccinated in Europe, giving an incidence rate on the order of 1 in 150,000. This is well below the threshold detectable in even a large phase 3 trial. As a result of this evidence, some regulatory authorities altered their deployment of the vaccine. In the UK, for instance, regulators advised that alternatives to the AstraZeneca vaccine be provided to those under the age of 30 where possible, as it was deemed the risk of blood clotting events was highest amongst the younger population.
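The arithmetic behind these figures, and the reason a phase 3 trial would be unlikely to pick up such an event, can be sketched as follows. The 222 events and 34 million vaccinated come from the text above; the 20,000 vaccinated trial participants is a hypothetical figure chosen for illustration, and the calculation assumes independent participants with a common event probability.

```python
events = 222
vaccinated = 34_000_000

rate = events / vaccinated
print(f"incidence: about 1 in {vaccinated / events:,.0f}")


def prob_at_least_one(event_rate, n):
    """Probability of seeing one or more events among n independent subjects."""
    return 1 - (1 - event_rate) ** n


# Hypothetical phase 3 trial with 20,000 participants receiving the vaccine.
p_detect = prob_at_least_one(rate, 20_000)
print(f"chance of at least one event in the trial: {p_detect:.1%}")
```

Even a single observed event would not be enough to establish a link to the vaccine, so the practical chance of detecting the signal in a trial of this size is lower still.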

Planning is Key

Sample size sits at the heart of vaccine development. From the modest numbers of early safety studies to the vast global efforts of pivotal phase 3 trials and beyond, every stage of the process is shaped by the need to collect enough data to draw confident, reliable conclusions. The specific demands of vaccine testing, including their prophylactic use in healthy populations, the reliance on natural exposure to pathogens, and the need to rule out even rare adverse events, mean that their trials often reach far larger scales than those seen for many other medical products. While this creates substantial scientific, logistical, and ethical challenges, it also ensures that the vaccines which reach licensure have been tested with a rigour proportionate to their importance in public health.


About the Authors

  • Holly is a key member of our stats team, leading our bioequivalence work and providing her expertise to the clinical and bioassay groups. Before her time at Quantics, she completed an MMath in Pure Mathematics at the University of St Andrews, and completed a masters and a DPhil in statistical genetics.

  • Jason joined the marketing team at Quantics in 2022. He holds master's degrees in Theoretical Physics and Science Communication, and has several years of experience in online science communication and blogging.

