There are two primary ways that the number of animals in a study – the “n” value – is used.  First, it is simply how we track animal consumption, our facility census, accounting measures and any limits set by the IACUC / Ethics Committee.  The 3Rs principle of Reduction sets a goal to reduce this number: in the interest of animal welfare, use the minimum number of animals that still gives the study adequate statistical power. Care and Use Committees that review and approve research are keen on this point, and are tasked with ensuring that the number of animals approved for a particular study is not exceeded.  If Dr Jones says she can get significant results out of her study using only 12 mice, then she should not be ordering 20, right?

This brings us to the second use of the n value: determining the statistical significance of the data. There are numerous statistical methods (t tests, p values, etc.) depending on the nature of the experimental design, and I don’t want to get into those details here, because one factor is true of all of them – other things being equal, the higher the n, the greater the statistical power to detect a real effect. This is not a linear relationship, and thus a fine balance must be struck: enough animals to reach significance, but not so many as to be wasteful.
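
To make the non-linear relationship concrete, here is a minimal sketch using a normal approximation for a two-sided comparison of two equal-sized groups. The function name `approx_power`, the standardized effect size of 1.0 and the alpha of 0.05 are my own illustrative assumptions, not values from any particular study, and the approximation runs slightly optimistic at small n compared with a proper t-test power calculation.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group: int, effect_size: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison.

    Uses the normal approximation; effect_size is the difference in
    means divided by the (assumed common) standard deviation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # critical value for the test
    shift = effect_size * sqrt(n_per_group / 2)  # how far the true effect moves the statistic
    # Probability of landing beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical effect size of 1.0: doubling n helps a lot at first, then barely at all.
for n in (4, 8, 16, 32, 64):
    print(n, round(approx_power(n, effect_size=1.0), 3))
```

Under these assumptions, going from 8 to 16 animals per group adds far more power than going from 32 to 64, which is the balance between significance and waste described above.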

All fair and good, except for one crucial mistake.

The assumption that the number of mice enrolled in a study equals the number used for statistical calculations is FALSE.  It is very convenient to say 100 mice were ordered, 100 mice were enrolled in the study, and the statistics on these 100 mice show significant results. Convenient yes, true no. We need to understand that the Experimental Unit is defined as “The smallest division of the experimental material such that any two experimental units can receive different treatments.” This doesn’t just mean being put into different dose groups; it means we need to be able to track the animals as true individuals, recording their dosing, measurements, outcomes, and any other potential variables in the study.

For help understanding this we can look to Michael Festing’s work and online tools. Specifically, the n we use for determining power is “the smallest uniquely characterized dataset”. If each of the animals in a study is uniquely identified and followed, then the total n and statistical n can indeed be the same. But if I have a cage of 5 mice that are all part of my study, and I do not have any way to track them individually, then statistically my n = 1 cage of mice, not 5 individual mice.

This is counter to the 3Rs objectives, since the researcher is using 5 animals but only getting the statistical power of 1. The smaller the n, the lower the statistical power.
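
The counting rule can be sketched in a few lines. This is only an illustration of the tracking rule as I have described it: the `Animal` record and the `statistical_n` function are hypothetical names of my own, and the sketch does not cover the case where a treatment is applied to a whole cage at once.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Animal:
    cage: str
    animal_id: Optional[str] = None  # None means the animal is not individually identified


def statistical_n(animals: Iterable[Animal]) -> int:
    """Count the smallest uniquely characterized datasets.

    An identified animal counts as its own unit; unidentified animals
    collapse into a single unit per cage.
    """
    units = set()
    for animal in animals:
        if animal.animal_id is not None:
            units.add(("animal", animal.animal_id))
        else:
            units.add(("cage", animal.cage))
    return len(units)


# Five untracked mice in one cage versus five individually identified mice.
untracked = [Animal(cage="A") for _ in range(5)]
tracked = [Animal(cage="A", animal_id=f"M{i}") for i in range(1, 6)]
print(statistical_n(untracked), statistical_n(tracked))
```

The same five mice contribute one unit when they cannot be told apart and five units when each one carries its own ID.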

Michael Festing gives us guidance on this in the ILAR Journal (amongst many other sources). Specifically: “The experimental unit should also be the unit of statistical analysis. It must be possible, in principle, to assign any two experimental units to different treatments. For this reason, if the treatment is given in the diet and all animals in the same cage therefore have the same diet, the cage of animals (not the individual animals within the cage) is the experimental unit.”
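
When the cage is the experimental unit, as in the diet example, one common way to keep the unit of analysis aligned with it is to reduce each cage to a single summary value before running any test. The sketch below assumes that approach; the `cage_means` function, the treatment labels and the body weights are all made up for illustration and do not come from Festing's paper.

```python
from collections import defaultdict
from statistics import mean


def cage_means(records):
    """Collapse (treatment, cage, value) records to one mean per cage.

    Returns a dict mapping each treatment to its list of cage means,
    so that each cage contributes exactly one data point.
    """
    by_cage = defaultdict(list)
    for treatment, cage, value in records:
        by_cage[(treatment, cage)].append(value)

    by_treatment = defaultdict(list)
    for (treatment, _cage), values in by_cage.items():
        by_treatment[treatment].append(mean(values))
    return dict(by_treatment)


# Hypothetical body weights (g): two cages per diet, two mice per cage.
records = [
    ("control", "C1", 20.0), ("control", "C1", 22.0),
    ("control", "C2", 24.0), ("control", "C2", 26.0),
    ("diet", "D1", 27.0), ("diet", "D1", 29.0),
    ("diet", "D2", 30.0), ("diet", "D2", 32.0),
]
summary = cage_means(records)
print({treatment: len(values) for treatment, values in summary.items()})
```

Eight mice went in, but each diet group is analysed with n = 2, because two cages are all that could have been assigned to different diets.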

There is a simple solution to this, and that is to know the unique identity of all animals in your study, so that you can collect data (body weight, calculated dose given, material prep, sample collection, etc.) relative to that specific individual. Individual ID of the animal – be it a tattoo, a tag, or some digital identifier (RFID) – allows you to know each individual on study. (It even allows you to house mice from different treatment groups in the same cage to standardize environmental factors!)

I often hear the retort “but all the animals in a cage get the same treatment.  They are part of the same group, they got the same dose, so I don’t need to know one from another within that cage.”   With this justification not to ID the mice, you are saving a small identification cost at the expense of a reduction in statistical power.  It’s not worth it.  IACUCs, peer reviewers, journals, even granting officials need to ask a simple question: Will the study be conducted in such a way that the identity of each individual will be known throughout the study, to assure the integrity of the Experimental Unit and proper statistical calculations?  The use of incorrect statistical formulae leads to uncertainty in the results, and that is how we enter the Reproducibility Crisis.

Author: Eric Arlund, April 2021 – please contact Eric if you would like to discuss any of the topics raised in this article.