This blog explains the difference between true replicates and pseudo-replicates, and discusses the problems that may arise if this difference is ignored or misunderstood in bioassay statistical analysis.
Quality Control by sampling
Consider a manufacturing process that makes widgets. To check the quality of a batch, a number of widgets randomly selected from the batch can be examined in detail to draw conclusions about the batch as a whole. You might select 100 from each batch of 1,000. This is standard QC manufacturing practice. In general, the more widgets you sample, the greater the certainty that the batch is OK.
But this only works if the samples are independent. In this example, because they are sampled randomly, there is no reason to think that the next widget tested is in any way related to the previous one, except that they come from the same batch.
But, if you selected only 10 widgets and tested each one 10 times, this would be the same number of results (100) but clearly less reliable as a test of the batch. This is pseudo-sampling or pseudo-replication. The samples tested are not independent, and the result from the next test is expected to be almost identical to the previous test because it is a test on the same widget. Replicates like this are also known as technical replicates.
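This difference can be made concrete with a small simulation. The numbers below are hypothetical: widget values are assumed to vary batch-wide with a standard deviation of 1.0, while repeat tests of the same widget are assumed to vary with a standard deviation of only 0.1. Both schemes produce 100 results per run, but the pseudo-sampled estimate of the batch mean wanders far more from run to run:

```python
import random
import statistics

random.seed(1)

# Hypothetical values: batch-wide widget variation vs repeat-test variation
BATCH_MEAN, BATCH_SD, TEST_SD = 50.0, 1.0, 0.1

def estimate_true(n=100):
    """Batch mean estimated from n independent widgets, one test each."""
    return statistics.mean(random.gauss(BATCH_MEAN, BATCH_SD) for _ in range(n))

def estimate_pseudo(n_widgets=10, tests=10):
    """Batch mean estimated from n_widgets widgets, each tested `tests` times."""
    results = []
    for _ in range(n_widgets):
        widget = random.gauss(BATCH_MEAN, BATCH_SD)  # this widget's true value
        results += [random.gauss(widget, TEST_SD) for _ in range(tests)]
    return statistics.mean(results)

# Repeat each sampling scheme many times: both use 100 results per run,
# but only 10 of the pseudo-sampled results are independent.
spread_true = statistics.stdev(estimate_true() for _ in range(2000))
spread_pseudo = statistics.stdev(estimate_pseudo() for _ in range(2000))
print(f"run-to-run spread, 100 independent widgets: {spread_true:.3f}")
print(f"run-to-run spread, 10 widgets x 10 tests:   {spread_pseudo:.3f}")
```

The pseudo-sampled estimate is roughly three times less precise, matching the theoretical standard errors (1.0/√100 = 0.1 versus roughly 1.0/√10 ≈ 0.32): the 90 extra technical tests add almost no information about the batch.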
Pseudo-replicates in bioassay
In the field of bioassay this is a common issue. Assays are designed to test samples of a batch for QC purposes. Testing more samples (replicates) increases the confidence in, and precision of, the result. How this is done experimentally is critical.
If a single sample is taken from the batch, used to make up a dilution, and then pipetted into adjacent cells, this is the same as testing the same sample several times. The results are expected to be identical. Differences tell you something about the plate variability, but not the sample variability.
Why is this a problem?
The problem comes when statistics are used in the data analysis. In general, all statistical tests used in bioassays assume that samples are independent. Using standard statistics on pseudo-replicates will give wrong, and usually falsely narrow, confidence intervals.
The variability of the underlying biological sample is underestimated, which can lead to assay suitability criteria failures, or to passing a sample that should have failed on precision. Many scientists do not actually calculate confidence intervals, so they may not notice directly, but the issue will affect any suitability criterion that evaluates variability.
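A minimal sketch of the falsely narrow standard error, using hypothetical readings: four aliquots of one dilution, each pipetted into three adjacent wells (technical replicates). Treating all twelve wells as independent shrinks the standard error; averaging the technical replicates first and analysing the four independent aliquot means gives the honest answer:

```python
import statistics

# Hypothetical readings: 4 independent aliquots, 3 wells each
wells = {
    "aliquot_1": [98.1, 98.3, 98.2],
    "aliquot_2": [101.0, 101.2, 100.9],
    "aliquot_3": [99.5, 99.4, 99.6],
    "aliquot_4": [100.4, 100.6, 100.5],
}

# Wrong: treat all 12 wells as independent measurements (n = 12)
flat = [y for reps in wells.values() for y in reps]
se_wrong = statistics.stdev(flat) / len(flat) ** 0.5

# Better: average the technical replicates first, then use the
# 4 independent aliquot means as the unit of analysis (n = 4)
means = [statistics.mean(reps) for reps in wells.values()]
se_right = statistics.stdev(means) / len(means) ** 0.5

print(f"SE treating wells as independent: {se_wrong:.3f}")
print(f"SE from independent aliquots:     {se_right:.3f}")
```

The naive calculation reports roughly half the true standard error here, so a confidence interval built from it would be falsely narrow in the same proportion.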
Here is a simple (and common) example.
The F test for goodness of fit
This is a common test which examines whether the lack of fit error (the difference between the mean value of the replicates and the fitted line) is greater than the pure error (the variability of each replicate from the mean of the replicates).
Pure Error measures the variability of the replicates around the mean of the replicates at each dose.
Lack of fit Error measures the variability of the mean of the replicates around the fitted model.
If the lack of fit error is greater than the pure error, then the goodness of fit is poor and the test may fail, depending on the significance (“p”) level set.
When using pseudo-replicates, the pure error is expected to be very small because the assay is repeatedly measuring the same sample. Since pure error is the denominator of the F ratio, a small value inflates the statistic, so in this situation the F test for goodness of fit often “fails” even though the data looks good.
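The sums of squares behind this test can be sketched in a few lines. Everything below is hypothetical: three doses with three replicates each, and a simple stand-in fitted model in place of a real dose-response fit, purely to show how pure error and lack-of-fit error are assembled into the F statistic:

```python
# Hypothetical responses: {dose: [replicate readings]}
data = {
    1.0: [2.1, 2.0, 2.2],
    2.0: [3.9, 4.1, 4.0],
    4.0: [8.2, 7.9, 8.1],
}

# Hypothetical fitted model: response = 2 * dose (a stand-in for a real fit)
fitted = {dose: 2.0 * dose for dose in data}

ss_pure = 0.0   # variability of replicates around their own mean at each dose
df_pure = 0
ss_lof = 0.0    # variability of the replicate means around the fitted model
for dose, reps in data.items():
    mean = sum(reps) / len(reps)
    ss_pure += sum((y - mean) ** 2 for y in reps)
    df_pure += len(reps) - 1
    ss_lof += len(reps) * (mean - fitted[dose]) ** 2
df_lof = len(data) - 2  # dose groups minus fitted parameters (2 for a line)

f_stat = (ss_lof / df_lof) / (ss_pure / df_pure)
print(f"pure error MS = {ss_pure / df_pure:.4f}, "
      f"lack-of-fit MS = {ss_lof / df_lof:.4f}, F = {f_stat:.2f}")
```

With truly independent replicates the pure error mean square is a fair yardstick; with pseudo-replicates it shrinks toward the pipetting noise alone, and the same lack-of-fit error then produces a much larger F.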
So why use pseudo-replicates at all?
Pseudo-replicates are useful in development to determine the within-plate variability. In this situation the replicates are usually treated as individual single-replicate samples to assess variability across the plate. Other than this, pseudo-replicates waste “plate real estate”.
As assays move from development to commercial use, and throughput and laboratory efficiency become important, it may be worth considering carefully whether you really need pseudo-replicates in your production assay.