Here is a general strategy for figuring out the sample size for your study. It presupposes that you have specified your statistical analysis at least down to the design and the exact statistical test you are going to use.
- Figure out the size of the signal that is important to you to detect.
- Determine the amount of variation to expect in the samples.
- Set the false positive rate (nearly always the traditional 5% level).
- Set the power. A good starting level is 90%, but 80% is kind of traditional, too.
- Determine the sample size based on these quantities and the proposed statistical test (see the sketch just after this list).
- Adjust the sample size upward if needed to allow for expected drop-out, non-completion, or similar issues.
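To make the last two steps concrete, here is a minimal sketch of the calculation in Python, assuming a two-group design analyzed with a two-sample t-test and using the statsmodels package (an assumed dependency). Every number in it (the signal, the standard deviation, the drop-out rate) is an illustrative placeholder, not a recommendation.

```python
# Minimal sample-size sketch for a two-sample t-test, using
# statsmodels. All numeric inputs are illustrative placeholders.
from statsmodels.stats.power import TTestIndPower

signal = 15.0              # smallest difference worth detecting (step 1)
sd = 40.0                  # expected standard deviation (step 2)
effect_size = signal / sd  # standardized effect (Cohen's d)

n_per_group = TTestIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,            # false positive rate (step 3)
    power=0.90,            # power (step 4)
    alternative="two-sided",
)
print(f"Required n per group: {n_per_group:.1f}")  # step 5

# Step 6: inflate for an assumed 10% drop-out rate
dropout = 0.10
print(f"Adjusted for drop-out: {n_per_group / (1 - dropout):.1f} per group")
```

With these particular placeholder numbers the calculation lands at roughly 150 per group; change any input and the answer changes, which is exactly why the earlier steps matter.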
If the total sample size you obtain is unrealistic in terms of time, cost, availability, or physical reality, then you need to make changes to your proposed design. Or you may have some serious soul-searching to do and may need to rethink your research plan. Be happy you figured that out now!
Let’s talk about each point in turn. First, you need to figure out the size of the signal that is important to you to detect. This will be based on practical or academic considerations. Examples of such considerations could be “Demonstrate a 15 lb improvement in yield per acre over control”, or “Show a 50% increase in mortality”, or “Show an increase of at least 5 points on the scale due to intervention”. A goal like “Show a positive association of behavior with treatment” is going to require some sort of quantification before you can move forward.
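As one sketch of what that quantification can look like, suppose “a 50% increase in mortality” is the goal and the baseline mortality rate is assumed to be 10% (a made-up figure). That pins down two proportions, from which a standardized effect size follows:

```python
# Turning "show a 50% increase in mortality" into a concrete effect
# size. The 10% baseline mortality is an assumed, illustrative figure.
import math

p_control = 0.10             # assumed baseline mortality
p_treated = 1.5 * p_control  # a 50% relative increase -> 0.15

# Cohen's h, a standard effect size for comparing two proportions
h = 2 * math.asin(math.sqrt(p_treated)) - 2 * math.asin(math.sqrt(p_control))
print(f"Cohen's h: {h:.3f}")  # about 0.15, a smallish effect
```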
As mentioned above, determining the amount of variation to expect in the samples can be done in a few ways. One very good way is to mine the available literature for similar studies and collect their estimates of variability. Another is to conduct a pilot study with the express purpose of estimating the variability. A third is to elicit estimates from subject-matter experts; there are some handy tricks for this that statisticians know.
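If you go the literature route, the individual estimates still have to be combined somehow. One common choice is a pooled standard deviation, weighting each study's variance by its degrees of freedom; the sketch below uses invented study numbers.

```python
# Pooling variability estimates mined from the literature. The
# (sample size, SD) pairs are invented for illustration.
import math

studies = [(30, 38.0), (45, 42.5), (25, 36.0)]  # (n, reported SD)

# Pooled variance: weight each study's variance by its degrees of freedom
pooled_var = sum((n - 1) * s**2 for n, s in studies) / sum(n - 1 for n, s in studies)
print(f"Pooled SD estimate: {math.sqrt(pooled_var):.1f}")
```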
The false positive rate is nearly always going to be set at the 5% level. (That is, the Type I error rate, or α, is going to be 0.05.) You could set it lower, but trying to set it higher will most likely just cause you problems.
Most people never consider the so-called “power” of their study. In simple language, the power is

> The probability that my study will actually give me a statistically significant result if there really is something there to see.
In other words, if what you think is happening really is happening, how likely is your study to actually show it?
Now you can see why setting the power to 90% might be a good idea! If there is really something to see in the study, don’t you want the study to have a good chance of seeing it?
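One way to make this definition concrete is by simulation: pretend the effect really is there, generate fake studies over and over, and count how often the test comes out significant. The sketch below reuses the placeholder numbers from earlier (a true difference of 15, SD of 40, 150 subjects per group) and assumes numpy and scipy are available.

```python
# Estimating power by simulation: the fraction of repeated fake
# studies that reach p < 0.05 when the effect is truly there.
# All numbers are the illustrative placeholders used above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, true_diff, sd, n_sims = 150, 15.0, 40.0, 10_000

hits = 0
for _ in range(n_sims):
    control = rng.normal(0.0, sd, size=n)
    treated = rng.normal(true_diff, sd, size=n)
    _, p = stats.ttest_ind(treated, control)
    hits += p < 0.05

print(f"Estimated power: {hits / n_sims:.2f}")  # lands near 0.90
```

If that fraction came out near 50% instead, the study would be little better than a coin flip at detecting a real effect, which is exactly the situation the power calculation is meant to prevent.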