Statistics is divided between two camps: the frequentists and the Bayesians. The latter of these has begun to attract more popular attention in recent years with the rise of AI and other similar technologies. This has extended to the life sciences, in which there are several fields where Bayesian approaches are taking centre stage. So, how do Bayesian statistics differ from the statistical techniques typically encountered in the classroom, and how might we put them to use?

## Frequentist vs Bayesian Statistics

When we think about statistical inference, we often envision hypothesis tests, p-values, and confidence intervals – indeed, this is the vast majority of the statistics we’ve covered in this blog over the years. This is the realm of *frequentist* statistics.

While frequentism is the more prevalent approach today, Bayesian inference actually came about first. It was developed by Thomas Bayes in the mid-18th century, and later popularised by Pierre-Simon Laplace. Bayesian inference centres on Bayes’ Theorem, which is often stated in the form of the equation:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Let’s break this down. A probability in the form $P(A \mid B)$ gives the chance that we’d observe A *given* B. For example, the probability of hearing thunder in the next 10 minutes *given* that it’s raining would be a lot higher than the probability of hearing thunder with no other information.

In our equation, $B$ is the data we’ve collected in our experiment, and $A$ is an outcome we’re interested in. So, Bayes’ Theorem gives the probability of observing the outcome *given* the collected data we observe – we call this quantity the *posterior*. The posterior depends on:

- $P(B \mid A)$: The probability of observing our data *given* a certain outcome (the *likelihood*).
- $P(A)$: The probability of a certain outcome based on our knowledge of the situation alone (the *prior*).
- $P(B)$: The probability of observing the collected data (the *marginal likelihood*).
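
As a minimal sketch, the three ingredients above can be combined in a one-line function. The function name `posterior` and the thunder-and-rain numbers are purely illustrative assumptions, not measured values.

```python
def posterior(likelihood: float, prior: float, marginal: float) -> float:
    """Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    if marginal <= 0:
        raise ValueError("marginal likelihood must be positive")
    return likelihood * prior / marginal


# Hypothetical numbers, chosen only for illustration.
# A = thunder in the next 10 minutes, B = it is raining.
p_rain_given_thunder = 0.9   # likelihood, P(B|A)
p_thunder = 0.02             # prior, P(A)
p_rain = 0.1                 # marginal likelihood, P(B)

p_thunder_given_rain = posterior(p_rain_given_thunder, p_thunder, p_rain)
print(f"P(thunder | rain) = {p_thunder_given_rain:.2f}")
```

With these made-up inputs, knowing that it is raining lifts the probability of thunder well above its prior of 2%.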

## Thinking Bayesian

In a sense, Bayesian statistics is a formalised way of how we ourselves tend to naturally analyse evidence and form conclusions. We begin with our prior knowledge of the problem, consider new evidence, and use it to update our views.

For example, imagine you were a bookmaker setting odds for a football team every week. How would you formulate a representative betting line? At the very start of the season, you would start with your priors – the strength of the squad, the results from the previous season, etc. This is the information which you would use to set the odds of the team winning their first game. In this case, the posterior is identical to the prior because we have no evidence yet with which to update our probabilities.

Once the result of the first match is in the books, we now have *evidence* of the performance of the team. This means we can set the odds – that is, find a posterior – for the second game based on *both* the priors and the evidence.

Importantly, this posterior is then used as a prior to set the odds for the third game, which are then used as a prior for the fourth game, and so on. This means that, as the season progresses, the odds for each subsequent game are based more and more on the evidence of the results of previous games and less on the priors.
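
One way to sketch this "posterior becomes the next prior" loop is with a Beta distribution over the team's win probability. This is an assumed model chosen for simplicity, not a description of how bookmakers actually set odds; the prior parameters and match results are hypothetical, and draws are counted as "not a win".

```python
def update(alpha: float, beta: float, won: bool) -> tuple[float, float]:
    """Conjugate Beta-Bernoulli update: a win adds 1 to alpha, otherwise 1 to beta."""
    return (alpha + 1, beta) if won else (alpha, beta + 1)


def win_probability(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical pre-season prior: equivalent to having seen 6 wins in 10 games.
alpha, beta = 6.0, 4.0
# Hypothetical results for the first four matches (True = win).
results = [False, False, True, False]

history = [win_probability(alpha, beta)]  # odds for game 1 use the prior alone
for won in results:
    # The posterior after this match is the prior for the next one.
    alpha, beta = update(alpha, beta, won)
    history.append(win_probability(alpha, beta))

print([round(p, 3) for p in history])
```

Each pass through the loop adds one more observation to the running totals, so the fixed contribution of the pre-season prior shrinks relative to the accumulated results.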

## A Medical Example

One area of the life sciences where Bayesian inference is vital is diagnostic testing. Let’s consider some examples from that field to see how Bayes’ theorem behaves under some different circumstances.

Imagine we were developing an at-home test for the flu. As we would expect, this test returns one of two results, positive or negative. There are, however, *four* possible scenarios from the results of the test:

- Positive, and the patient has the flu (true positive)
- Positive, but the patient doesn’t have the flu (false positive)
- Negative, and the patient doesn’t have the flu (true negative)
- Negative, and the patient does have the flu (false negative)

That means to effectively use the diagnostic test, we need to understand the probability of a patient having the flu *given* a positive test. Our outcome is whether the patient has the flu or not, which we’ll designate $F$ (flu) or $\bar{F}$ (no flu), and our data is the result of the test (+ or -). In the language of Bayes’ Theorem, that means we are interested in the quantities $P(F \mid +)$ and $P(F \mid -)$.

Let’s imagine that we’ve already done some thorough testing on our diagnostic, and we’ve established that:

- $P(+ \mid F) = 1$. The test will always be positive *given* the patient has the flu.
- $P(- \mid \bar{F}) = 0.999$. The test will be negative *given* the patient does not have the flu 99.9% of the time.
- $P(+ \mid \bar{F}) = 0.001$. The false positive rate, here 0.1% – the probability of the test being positive *given* the patient doesn’t have the flu.

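For now, we’ll consider the first of our interesting probabilities. Using Bayes’ Theorem, we can say that:

$$P(F \mid +) = \frac{P(+ \mid F)\,P(F)}{P(+)}$$

Which, since $P(+ \mid F) = 1$, reduces to:

$$P(F \mid +) = \frac{P(F)}{P(+)}$$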

Now we must determine our prior. A sensible choice here would be the prevalence of flu – the percentage of the population who have the flu. Let’s say this is 0.5%, which means that $P(F) = 0.005$.

To determine $P(+)$, imagine we test 1 million patients. Based on the probabilities we’ve already outlined, 5000 of these will have the flu, all of whom will return a positive test. Of the remaining 995,000, we would see a false positive result from 995 subjects. So, we would see 5995 positive tests from our sample altogether. This means that $P(+) = 5995 / 1{,}000{,}000 = 0.005995$.

Putting it all together:

$$P(F \mid +) = \frac{0.005}{0.005995} \approx 0.834$$

So, we can say that the chance that a patient who has returned a positive test actually has the flu is about 83%. In this case, the priors are very powerful – if the prevalence of the flu had instead been 0.1%, then our final probability would have been just 50%!
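
The calculation above can be reproduced in a few lines. This is a minimal sketch using the figures quoted in the text (100% sensitivity, a 0.1% false positive rate, and prevalences of 0.5% and 0.1%); the function and parameter names are illustrative assumptions.

```python
def p_flu_given_positive(
    prevalence: float,
    sensitivity: float = 1.0,
    false_positive_rate: float = 0.001,
) -> float:
    """Posterior probability of flu given a positive test, via Bayes' Theorem."""
    # Marginal likelihood: true positives plus false positives.
    p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
    return sensitivity * prevalence / p_positive


# Compare the two prevalences discussed in the text.
for prevalence in (0.005, 0.001):
    result = p_flu_given_positive(prevalence)
    print(f"prevalence {prevalence:.1%}: P(flu | +) = {result:.3f}")
```

Changing only the `prevalence` argument shows how strongly the prior drives the answer, even though the test itself is unchanged.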

## Why Bayesian Statistics?

The influence of priors on the outcome of Bayesian probabilities is often held as a negative, as one could envision using the priors to manipulate a result to suit one’s purposes. To counter this, the probabilities are often calculated using optimistic, realistic, and sceptical priors to demonstrate how the priors affect the overall results. It is also common to use an uninformative prior which provides no additional information beyond the collected data if little is known about the situation.
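
A prior sensitivity check of this kind might look like the sketch below. It assumes a simple Beta prior on a response rate, which is an illustrative modelling choice rather than a prescribed method; the prior parameters and trial counts are hypothetical, and the flat Beta(1, 1) prior is just one common choice of "uninformative" prior.

```python
# Hypothetical Beta(alpha, beta) priors on a response rate.
priors = {
    "sceptical": (2.0, 8.0),      # prior mean 0.2
    "realistic": (5.0, 5.0),      # prior mean 0.5
    "optimistic": (8.0, 2.0),     # prior mean 0.8
    "uninformative": (1.0, 1.0),  # flat over [0, 1]
}

# Hypothetical trial data: 30 responders, 20 non-responders.
successes, failures = 30, 20


def posterior_mean(alpha: float, beta: float, s: int, f: int) -> float:
    """Mean of the Beta(alpha + s, beta + f) posterior."""
    return (alpha + s) / (alpha + beta + s + f)


means = {
    name: posterior_mean(a, b, successes, failures)
    for name, (a, b) in priors.items()
}

for name, mean in means.items():
    print(f"{name:>13}: posterior mean = {mean:.3f}")
```

If the posterior means stay close together across the priors, the conclusion is being driven mainly by the data; a wide spread signals that the choice of prior matters and should be justified.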

However, priors can also be an extremely powerful tool as they ensure that existing evidence is properly accounted for. If a football team has an elite squad, it would take them losing a lot of games before the odds might predict them losing games regularly. It would rightly take a lot of evidence – more than a slow start to the season – to overcome those priors. Similarly, if a new drug shows a very large effect in clinical trials, then it would take a lot of evidence to overturn the prior that the drug is effective down the line. A frequentist interpretation would only account for the data in these situations, which could lead to unrepresentative results.

Bayesian reasoning often provides a more intuitive interpretation of probability than frequentist reasoning. So why has its prevalence only grown recently? Many Bayesian methods require more computational power than frequentist methods, and that power has only become readily available in recent years. Combined with the development of methods such as Markov Chain Monte Carlo algorithms, this means that Bayesian analysis has become more computationally accessible.

Going forward, then, where could Bayesian statistics be used where they are not currently common? In part 2 of our series on Bayesian methods, we’ll outline recent work at Quantics into Bayesian approaches for sample size calculations.
