The earliest stages of the development of any vaccine involve laying out the roadmap for its design and testing. A vast network of interlinked paths from a new idea to a needle in a patient’s arm stretches ahead into the future, each path with unique regulatory and ethical implications. Should the vaccine target a certain strain of a disease, or all of them? Should the vaccine be given to children, or adults? Should we test the vaccine in human volunteers, or animal models? To find answers to questions such as these, we require a deep understanding of the disease we wish to vaccinate against and how it behaves when left to its own devices. That’s the role of a natural history study: a study which examines the pathology, immunology, and epidemiology of a target disease when it is allowed to progress naturally. Here, we’re going to examine the natural history study and how it feeds into deciding the future path of the vaccine.
Why are Natural History Studies necessary?
The idea behind a natural history study is to build strong foundational knowledge about a disease before undertaking vaccine development and testing. This is of particular importance for rare or emerging diseases, as there may be little prior research into the natural progression of the disease.
Key Takeaways
- Natural history studies provide critical foundational knowledge of a disease’s pathology, immunology, and epidemiology. This understanding informs vaccine design, development, and ethical trial strategies, particularly for rare or emerging diseases.
- Decisions between retrospective versus prospective and cross-sectional versus longitudinal designs affect study outcomes and costs. Prospective and longitudinal designs offer more comprehensive insights but are more resource-intensive.
- Natural history data guide the selection of trial endpoints and ethical considerations for trial formats. Disease prevalence and severity influence whether trials use challenge models, population studies, or animal testing, aiming to balance scientific rigour with participant safety.
This foundational knowledge is crucial to setting the eventual goals of the vaccine. For example, if a disease is highly transmissible but only poses serious consequences in a minority of cases, the priority might be for the vaccine to prevent severe illness rather than preventing the development of the disease entirely. COVID-19 is a good example of a disease for which the prevention of severe disease takes priority. By contrast, if the disease almost always poses serious consequences for the individual but is unlikely to be transmitted, then it is crucial to prevent the disease taking hold in the first place. The rabies vaccine follows this archetype: rabies is essentially 100% fatal for those who become symptomatic, but human-to-human transmission is all but unheard of.
Moreover, an understanding of how the disease behaves in the wild is crucial to informing and justifying a development plan for the vaccine. Alongside information about the severity of the disease, factors such as its incidence rate are important in deciding how the efficacy of a vaccine can best be assessed.
What does a Natural History Study examine?
So, we have established that the role of a natural history study is to provide a strong knowledge base to underpin further development and testing of a vaccine. There are three key components to this foundation which are typically examined in a natural history study: pathology, immunology, and epidemiology. Let’s investigate each in turn.
Pathology
Pathology is the study of a disease, its causes, and its effects. In the case of natural history studies, we are often specifically interested in pathogenesis, or how a microbe – or pathogen – causes the disease with which it is associated. In very poorly understood diseases, such as in the early stages of the COVID-19 pandemic, this might even involve identification of the pathogen in the first instance. More likely, however, we are interested in how the pathogen behaves in the body. Understanding the behaviour and, where applicable, the life cycle of a pathogen and how it causes disease can reveal key weaknesses that can be targeted in the design of a vaccine. It is important to understand whether the symptoms associated with the disease are a result of direct cell damage or a consequence of the body’s immune response.
Determining how the pathogen moves through the body is also key. The rabies virus, for instance, travels through the body’s nervous system before eventually attacking the brain, after which the classic symptoms appear. This process can take several weeks or even months, which means that the rabies vaccine is one of the few which can be administered after infection and still be effective.
Immunology
Immunology is the study of how the human body – and specifically the immune system – responds to attack by a pathogen. Performing a natural history study can help researchers understand which parts of the highly complex immune system are triggered by the pathogen, as well as how these affect outcomes from the disease. Particularly important among these are correlates of protection: measurable aspects of a subject’s physiology that are indicative of the level of protection a subject currently has against a pathogen.
A natural history study of the immunology associated with a disease can help researchers find ways of effectively attacking a pathogen. For instance, early studies of recovered COVID-19 patients showed they had elevated levels of antibodies which attacked the SARS-CoV-2 virus’ spike protein, indicating that neutralising that protein might be key to preventing infection with the virus. This guided researchers towards a design which targeted the spike protein, resulting in the highly successful COVID vaccines we see today.
Conversely, a natural history study can also reveal the holes in the immune system which are exploited by a pathogen and which can be plugged by a vaccine. Thanks to natural history studies of HIV, we know that the virus is highly effective at evading antibody protection through mutation. However, these studies also uncovered that it is far more susceptible to attack by certain types of T cells, which has steered the search for an effective HIV vaccine towards boosting cellular immunity.
Epidemiology
Epidemiology is the study of the spread of the disease. This includes key factors such as whether and how the disease typically spreads from person to person, along with the rate at which this spread naturally occurs. The spread of the disease is usually quantified by its R0: the number of people a single infected individual is expected to pass the disease on to. The R0 is key to the deployment strategy of the vaccine. Tuberculosis, for example, has an R0 of below one in developed countries, meaning there is sufficient time to contain outbreaks by vaccinating only those who have come into contact with an infected individual. For the measles virus, with an R0 as high as 18, outbreaks grow far too quickly for this approach. We instead vaccinate everyone possible, employing a strategy of herd immunity to prevent outbreaks.
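As a quick illustration of why R0 drives deployment strategy, the classic herd immunity threshold under a simple homogeneous-mixing model is 1 - 1/R0. Here is a minimal sketch; the measles figure comes from above, while the second disease and its R0 of 3 are purely illustrative:

```python
def herd_immunity_threshold(r0: float) -> float:
    """Fraction of the population that must be immune to halt sustained
    spread, under the simple homogeneous-mixing model: 1 - 1/R0."""
    return 1 - 1 / r0

# Measles (R0 up to ~18) demands near-universal coverage, while a disease
# with R0 below 1 (such as tuberculosis in developed countries) dies out
# without mass vaccination.
for disease, r0 in [("measles", 18.0), ("illustrative disease", 3.0)]:
    print(f"{disease}: threshold = {herd_immunity_threshold(r0):.0%}")
    # measles: threshold = 94%
    # illustrative disease: threshold = 67%
```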
Epidemiological studies can also inform researchers about the ideal target population for the vaccine. If the disease is typically associated with childhood, such as measles or polio, then the vaccine might form part of the early childhood vaccine programme. If, however, the disease primarily affects the elderly, then it might form part of an annual vaccination drive, as we see for the flu and COVID-19.
How can a Natural History Study affect vaccine testing?
As well as informing the overall strategy of vaccine development, a natural history study directs the vaccine testing and licensing process. The choice of trial endpoints forms part of these decisions. For example, if the target disease is often deadly, then a reduction in the number of deaths might be the aim of the vaccine. Or, if hospitalisation due to severe illness is a likely outcome, then a reduction in hospitalisation or in severe illness might be an appropriate endpoint. If a disease is unlikely to cause these severe outcomes, then an endpoint which captures how often the disease is contracted might be appropriate, such as a reduction in reported symptoms or in positive tests.
Correlates of protection can also be determined. As noted above, these are markers of the subject’s immune system response to exposure that correlate with severity of outcome. Often this will be an antibody titre.
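A common way to evaluate a candidate correlate of protection is to model infection outcomes as a function of the marker. The sketch below is illustrative only: it simulates data in which higher log antibody titres reduce infection risk (an assumed relationship, not real study data) and fits a logistic regression with statsmodels:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated natural history data: higher log titres lower infection risk.
n = 500
log_titre = rng.normal(loc=2.0, scale=1.0, size=n)
p_infect = 1 / (1 + np.exp(-(1.5 - 1.2 * log_titre)))  # assumed truth
infected = rng.binomial(1, p_infect)

# Logistic regression: does the titre predict protection?
X = sm.add_constant(log_titre)
fit = sm.Logit(infected, X).fit(disp=False)
print(fit.summary())

# A strongly negative titre coefficient supports the titre as a correlate
# of protection, and can suggest thresholds for later trial endpoints.
```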
On a macro-scale, a natural history study might affect the ethics of and regulatory view on a vaccine trial. Ethically, the format of any vaccine trial will be a trade-off between scientific rigour, the prevalence of the disease in the population, and exposure of subjects to severe outcomes, such as hospitalisation or death.
While such decisions will often have many competing and complex factors, we can examine a simplified picture using the decision matrix below:

| | Low Severity | High Severity |
| --- | --- | --- |
| **Low Prevalence** | Human challenge trial | Challenge trial in animal models |
| **High Prevalence** | Placebo-controlled population study | Population study without a placebo arm |
Let’s look at each of these combinations in turn.
Low Prevalence, Low Severity: In many cases, diseases which are uncommon and rarely cause severe consequences are not subject to vaccine development at all, as it is seen as unnecessary. Norovirus, however, is a case where a vaccine is worth the development cost: it is uncommon outside of outbreaks and rarely has severe outcomes among the healthy population, but it is so unpleasant that a vaccine is of interest to developers.
Vaccines for such diseases might be tested using a human challenge trial where willing (or paid, in many cases) volunteers are deliberately exposed to the disease. These volunteers are split into a trial group, who are given the vaccine, and a placebo group, who might be injected with saline, say. This allows a direct comparison between contraction rates and/or symptom severity among vaccinated and unvaccinated groups with a far smaller number of subjects than would be required in a population study, and challenge studies are thus a very powerful tool when available. It is ethical to use a challenge study for a disease such as norovirus as, while they might face an unpleasant experience, it is highly unlikely that any subjects – whether from the trial group or the placebo – will face serious consequences as a result of contracting norovirus.
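As a sketch of how such a trial might be analysed (the counts below are entirely invented), one can compare attack rates between the two arms, estimate vaccine efficacy as one minus the relative risk, and test the difference with Fisher's exact test:

```python
from scipy.stats import fisher_exact

# Hypothetical challenge trial counts: every subject is deliberately exposed.
vacc_infected, vacc_total = 8, 50
plac_infected, plac_total = 30, 50

attack_vacc = vacc_infected / vacc_total
attack_plac = plac_infected / plac_total

# Vaccine efficacy = 1 - relative risk of infection.
efficacy = 1 - attack_vacc / attack_plac
print(f"Vaccine efficacy: {efficacy:.0%}")  # 73%

# Fisher's exact test for a difference in contraction rates between arms.
table = [[vacc_infected, vacc_total - vacc_infected],
         [plac_infected, plac_total - plac_infected]]
_, p_value = fisher_exact(table)
print(f"p-value: {p_value:.5f}")
```

Note how informative even 100 subjects can be when every one of them is exposed; a population study would need vastly more participants to observe the same number of infections.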
Low Prevalence, High Severity: Unlike the previous case, it would be highly unethical to test a vaccine for any disease with a high risk of serious outcomes using a challenge trial in humans. While such methods have been used historically (Edward Jenner, for example, tested his first smallpox vaccine by deliberately exposing a vaccinated child to the disease), they are seen as far too dangerous for participants by modern medicine.
In their place, vaccines for such diseases, which include the serious respiratory virus Nipah, are often tested using challenge trials in non-human subjects. This allows endpoints which involve serious outcomes, such as death of the subject, to be tracked without endangering human participants. Correlates of protection are key in these trials. A vaccinated animal is likely to generate a measurable immune response when challenged with the disease. If that response is shown to protect against the endpoint of interest (death or severe illness, for example), then we can assume the same level of response in a human subject would confer a similar degree of protection. This is known as the Animal Rule. It is clearly quite a big assumption, which is why such studies should be conducted in animal models that are as close to human as possible.
High Prevalence, Low Severity: When the prevalence of a disease in the general population is high, it is possible to achieve an appropriate sample size for a trial without deliberately exposing subjects to the disease. This has ethical advantages over a challenge trial as subjects catch the disease naturally, rather than through deliberate exposure, meaning one can argue that only the subjects who would have caught the disease anyway do so during the trial. When the severity of the disease is low, it is ethical to use a placebo in the trial as those who receive the placebo and catch the disease are unlikely to suffer severe outcomes.
This population study method was used in the final stages of testing for the COVID-19 vaccines. One group of trial participants was given the vaccine, and another a placebo, and then the groups returned to daily life. As the prevalence of COVID was high during the pandemic, these subjects were exposed to COVID naturally, meaning the developers were able to track differential rates of infection, hospitalisation, and deaths in the trial and placebo groups.
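The feasibility of such a trial rests directly on the incidence estimates a natural history study provides. A back-of-the-envelope sketch (all numbers illustrative, not from any real trial) shows why high prevalence lets a trial accrue endpoint events quickly, while the same design for a rare disease would yield almost none:

```python
def expected_cases(n_participants: int, annual_incidence: float,
                   follow_up_years: float) -> float:
    """Expected infections in one trial arm, assuming a constant incidence
    rate and no vaccine effect (i.e. the placebo arm)."""
    return n_participants * annual_incidence * follow_up_years

# Illustrative numbers only: pandemic-level incidence vs a rare disease.
for label, incidence in [("high-prevalence disease", 0.05),
                         ("rare disease", 0.0005)]:
    cases = expected_cases(n_participants=15_000,
                           annual_incidence=incidence,
                           follow_up_years=0.5)
    print(f"{label}: ~{cases:.0f} expected placebo-arm cases in 6 months")
    # high-prevalence disease: ~375 cases -> efficacy can be estimated
    # rare disease: ~4 cases -> far too few events to read out efficacy
```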
High Prevalence, High Severity: In cases where a highly prevalent disease is also highly severe, such as the Ebola outbreak in West Africa in the 2010s, the use of a placebo in a population study becomes unethical. If there is enough evidence of the efficacy of a vaccine to use it in a clinical trial, then it is problematic to make it available to only a portion of the participants in the face of a potentially deadly disease. In such trials, designs are used in which all participants eventually receive the vaccine, or data from those vaccinated are compared to historical data from similar cohorts.
Now, as previously mentioned, these distinctions serve only as a rough guide to the methods used for different types of diseases. Most vaccine development involves several stages of trials: efficacy and, perhaps more importantly, safety are usually tested in non-human subjects before any testing in humans can take place, for example. The matrix does, however, demonstrate how the information from a natural history study can be used to conduct a scientifically appropriate and ethical trial for the vaccine.
Conducting a Natural History Study
There are several key decision points when planning a natural history study. As with any statistical investigation, the goal of the study is to obtain representative information about the disease of interest, meaning we must consider the design of the study carefully. In this case, avoiding bias is of particular importance: biased information about, say, the severity of a disease could lead to unnecessary compromises in the design of future clinical studies, or, worse, subjects being exposed to undue risk.
One such decision concerns whether a study is to be retrospective or prospective. A retrospective study is one which uses existing research and previously collected data to assess a disease. This information can include scientific literature, medical records and patient charts, and interviews with disease experts. A major benefit of this approach is its speed and cost: as there is little or no data collection involved, a retrospective study can often be cheaper and swifter than a prospective study.
However, there are several ways bias can arise in a retrospective study which may be difficult to account for, particularly in cases where existing information about a disease is limited. For example, rare or emerging diseases may only be noticed or reported on by medical professionals in the most severe cases. This might make the disease appear more severe than it truly is. Similarly, the true prevalence of a disease may be suppressed in the medical record if its symptoms appear similar to a more well-known disease or if testing capacity is limited.
These factors notably combined in the early stages of the COVID-19 pandemic to obscure the rapid spread of the disease. Many cases of COVID-19 are asymptomatic, and mild cases could easily be mistaken for the common cold. Meanwhile a lack of testing capacity meant that few cases were confirmed. The result was a perception that COVID-19 was far less prevalent than it actually was at the start of the pandemic.
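A short simulation (with invented parameters) makes this reporting bias concrete: if the probability that a case enters the medical record rises with its severity, then severity estimated from recorded cases alone will be inflated.

```python
import numpy as np

rng = np.random.default_rng(1)

# True severity of 100,000 cases on an arbitrary 0-10 scale.
severity = rng.gamma(shape=2.0, scale=1.5, size=100_000)

# Assume the chance a case is reported rises steeply with severity.
p_report = np.clip((severity / 10) ** 2, 0, 1)
reported = rng.random(100_000) < p_report

print(f"True mean severity:     {severity.mean():.2f}")
print(f"Reported mean severity: {severity[reported].mean():.2f}")
print(f"Fraction of cases ever recorded: {reported.mean():.1%}")
```

A retrospective study built on the reported cases alone would conclude the disease is both rarer and more severe than it truly is, exactly the distortion described above.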
A prospective study – one in which new data about the disease is collected according to a pre-defined scheme – can help prevent biases such as these being introduced. For example, regular checks on individual subjects or cohorts of subjects can detect the presence of a disease even when it is mild or even asymptomatic when adequate testing is available, giving a clearer picture of its prevalence and severity. This can also give a better understanding of the pathology and immunology of a disease than a retrospective study where such investigation may be incomplete.
In some cases, a prospective natural history study might even be operated as a mini challenge study, in which (typically non-human) subjects are deliberately exposed to the disease in question. This allows researchers to closely observe the progression of the disease and appreciate its possible outcomes. However, such studies are expensive: indeed, the major drawback of a prospective study is that it is usually significantly costlier than a retrospective one. Conclusions for humans will also always be subject to assumptions about how well the animal model reflects human disease.
In many cases, natural history studies will employ both retrospective and prospective components. A retrospective study, for example, may be used to highlight gaps in knowledge about a disease which can be filled using a prospective study.
Another key decision to be taken about the format of a natural history study is whether it is to be cross-sectional or longitudinal. A cross-sectional study takes data from a large cohort of subjects at a single timepoint. For example, you might perform a COVID-19 test on a class of college students as a way of understanding the prevalence of the disease among that population. A longitudinal study, by contrast, follows fewer subjects, but over a series of time points. A longitudinal study of COVID-19 prevalence among college students might follow only 10 or 20 students but have them take a test every two weeks for a year, say.
Cross-sectional studies have the benefit of being quicker to perform than longitudinal studies. As they include a greater number of participants, a cross-sectional study is also more likely to encompass a greater range of possible manifestations of the disease in the population, including the frequency of different degrees of severity. However, a cross-sectional study is, naturally, only a snapshot of the progression of the disease. The benefit of a longitudinal study is that it gives a more comprehensive picture of the onset and progression of the disease, albeit in a smaller subsection of the population. Repeated follow-ups also allow correlation of progression and outcomes with covariate factors, such as age and sex. This comes at the cost of a longer and more resource-intensive study than a cross-sectional approach.
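As a simple illustration of the cross-sectional case, a single round of testing gives a point prevalence estimate whose precision is governed by the sample size. The sketch below (with hypothetical counts) computes a Wilson confidence interval using statsmodels:

```python
from statsmodels.stats.proportion import proportion_confint

# Hypothetical snapshot: one COVID-19 test per student, one timepoint.
positives, tested = 12, 400

prevalence = positives / tested
low, high = proportion_confint(positives, tested,
                               alpha=0.05, method="wilson")
print(f"Point prevalence: {prevalence:.1%} (95% CI {low:.1%} to {high:.1%})")
```

A longitudinal design would instead repeat such measurements on the same smaller cohort over time, trading breadth for a view of how infection status and outcomes evolve.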
Natural history studies are a key component of vaccine development. By building an understanding of a disease’s behaviour, immune responses, and patterns of spread, they lay the scientific and ethical groundwork needed to design safe and effective vaccines. Whether shaping the goals of vaccine development, informing trial design, or guiding regulatory decisions, these studies ensure that each step along the vaccine pathway is based on solid evidence. As we continue to face emerging and evolving health threats, robust natural history research will remain essential to choosing the right path toward new, life-saving vaccines. Early statistical involvement in a vaccine development programme can help identify potential pitfalls in research, and highlight optimisations to get the most out of every datapoint.