Stratified Randomization

The guidance below on stratified randomization was developed by the CCTS Biostatistics Core. For more information on this topic, including advice about how to apply it in your research, consider scheduling a consultation with a biostatistician.

  1. The purpose of stratified randomization is to achieve greater balance (closer to equal numbers across chosen categories) between experimental groups than might occur by chance alone under unstratified simple randomization. When the groups are balanced on a category or factor, that factor is ruled out as a potential competing explanation (confounder), so the primary hypothesis of interest is protected.

  2. One needs to be able to reliably classify units (subjects) into mutually exclusive categories of a factor, such as major categories of disability, age bands, or race/ethnicity. In principle the number of categories can be large (a dozen, say), but in practice one usually limits the setup to two or three categories that will each have substantial numbers, perhaps at least 20% of the total N.

  3. Simple randomization without stratification is quite effective in achieving balance. If one happened to have a sample containing 90 A’s and 10 B’s, and applied 1:1 simple (coin flip) randomization into two groups, on average one would end up with 45 A’s and 5 B’s in each group. Large imbalances are fairly improbable: a split of B’s as uneven as 8 vs. 2 occurs only about 11% of the time, and although a split of A’s at least as uneven as 50 vs. 40 is more likely (roughly one time in three), it is a modest imbalance relative to 90 A’s. So the bother of stratified randomization has to be weighed against the threat posed by the competing explanation to the primary hypothesis.

  4. Stratified randomization improves balance but does not by itself guarantee perfect balance. Permuted blocks constrain departures from perfect balance. With blocks of size 2, if the first A in a block is assigned to Group 1, then the next A sampled must be assigned to Group 2. With blocks of size 4, of every 4 A’s sampled, two must be assigned to Group 1 and two to Group 2. (Similarly for B’s.) One benefit of permuted blocks is that they keep the group sizes close to equal, so there will be balance between groups even if the study is stopped early. A benefit of larger blocks is that the pattern of assignments looks more haphazard or “random” to an observer, so that a person wanting to manipulate the randomization (assign a favored subject to a certain group) faces a greater challenge. A refinement of permuted-block randomization is to vary the block size randomly between two or more sizes; for example, 4, 4, 8, 4, 8 … (With randomization now often carried out by a computer data capture program, e.g., REDCap, as part of baseline intake, the chances of manipulation are greatly reduced.)

  5. There is an alternative to the foregoing approach called minimization, which assigns each new subject to whichever group best reduces the current imbalance across several (3-7, say) confounding factors simultaneously. Missing data on those factors can be a non-trivial complication. Some critics think the mixing between groups achieved by minimization is less than ideal. Also, minimization is difficult to implement in a data capture program.
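
To put rough numbers on point 3, the chance of an uneven split under 1:1 coin-flip randomization can be computed directly from the binomial distribution. The sketch below is illustrative only: the helper name `prob_split_at_least` is hypothetical, and it assumes each subject is assigned independently with probability 1/2.

```python
from math import comb

def prob_split_at_least(n, k):
    """Probability that n fair coin-flip assignments put k or more
    subjects into one of the two groups (either one).
    Assumes k > n / 2, so the two tails do not overlap."""
    # Count outcomes with at least k subjects in Group 1 ...
    one_sided = sum(comb(n, j) for j in range(k, n + 1))
    # ... and double it, since Group 2 is equally likely to be the large one.
    return 2 * one_sided / 2 ** n

# 10 B's: a split at least as uneven as 8 vs. 2
p_b = prob_split_at_least(10, 8)
# 90 A's: a split at least as uneven as 50 vs. 40
p_a = prob_split_at_least(90, 50)

print(f"B's, 8 vs. 2 or more uneven:   {p_b:.3f}")
print(f"A's, 50 vs. 40 or more uneven: {p_a:.3f}")
```

A calculation of this kind can help one judge whether the expected imbalance in a small category is large enough to justify stratifying on it.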
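
The permuted-block scheme in point 4 can be sketched as follows. This is a minimal illustration, not the procedure used by REDCap or any particular data capture program: the function name `permuted_block_schedule`, the block sizes (4 and 8), the seed, and the stratum sizes are all assumptions chosen to mirror the examples in the text. One schedule is generated per stratum, and each block contains equal numbers of Group 1 and Group 2 in random order.

```python
import random

def permuted_block_schedule(n, block_sizes=(4, 8), groups=(1, 2), rng=None):
    """Return a list of n group assignments built from permuted blocks.
    Each block size must be a multiple of the number of groups; the size
    of each successive block is drawn at random from block_sizes."""
    rng = rng or random.Random()
    schedule = []
    while len(schedule) < n:
        size = rng.choice(block_sizes)
        # Equal numbers of each group within the block, then shuffle.
        block = list(groups) * (size // len(groups))
        rng.shuffle(block)
        schedule.extend(block)
    # The last block may be only partly used if n is reached mid-block.
    return schedule[:n]

# Stratified randomization: an independent schedule for each stratum.
rng = random.Random(2024)           # fixed seed so the schedule is reproducible
strata = {"A": 90, "B": 10}         # hypothetical stratum sizes
schedules = {name: permuted_block_schedule(n, rng=rng)
             for name, n in strata.items()}

for name, sched in schedules.items():
    print(name, "Group 1:", sched.count(1), "Group 2:", sched.count(2))
```

Because every completed block is balanced, the difference between group sizes within a stratum can never exceed half the largest block size (here, 4), even if enrollment stops partway through a block.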
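
For point 5, the sketch below shows one common formulation of minimization: a marginal-balance rule in the style of Pocock and Simon, combined with a biased coin. It is offered as an assumption about what such a procedure might look like, not as the variant the authors have in mind. The function name `minimization_assign`, the factor names, and the probability `p_best` are hypothetical, and the sketch assumes every subject has complete data on every factor.

```python
import random

def minimization_assign(new_subject, assigned, factors,
                        groups=(1, 2), p_best=0.8, rng=None):
    """Choose a group for new_subject.
    assigned: list of (subject_dict, group) pairs already randomized.
    For each candidate group, sum over factors the imbalance (max minus
    min group count) that would result among subjects sharing the new
    subject's level of that factor."""
    rng = rng or random.Random()
    scores = {}
    for g in groups:
        total = 0
        for f in factors:
            counts = {h: 0 for h in groups}
            for subj, h in assigned:
                if subj[f] == new_subject[f]:
                    counts[h] += 1
            counts[g] += 1  # tentatively place the new subject in group g
            total += max(counts.values()) - min(counts.values())
        scores[g] = total
    best = min(scores.values())
    candidates = [g for g in groups if scores[g] == best]
    if len(candidates) == len(groups):
        return rng.choice(groups)        # complete tie: plain randomization
    if rng.random() < p_best:
        return rng.choice(candidates)    # usually pick the balancing group
    return rng.choice([g for g in groups if g not in candidates])

# Simulated enrollment of 60 subjects with three hypothetical factors.
rng = random.Random(7)
factors = ["sex", "age_band", "site"]
assigned = []
for _ in range(60):
    subj = {"sex": rng.choice("MF"),
            "age_band": rng.choice(["<50", "50+"]),
            "site": rng.choice(["X", "Y", "Z"])}
    assigned.append((subj, minimization_assign(subj, assigned, factors, rng=rng)))

print("Group 1:", sum(g == 1 for _, g in assigned),
      "Group 2:", sum(g == 2 for _, g in assigned))
```

Setting `p_best` below 1 keeps a random element in each assignment; setting it to 1 makes the assignment fully determined whenever one group gives better balance.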
