Repeated Measures

What are “repeated measures”?  Do I need to do MANOVA?  What is MANOVA?

Repeated measures is a bit of a slippery term that is actually not well-defined.

The classic use of the term repeated measures refers to measurements of the same quantity repeated across time.  For example, if we measure blood levels of glucose in a single subject on a daily basis, we have repeated measures of glucose.

Nowadays, the term has also come to encompass measurements of the same quantity repeated across space.  For example, if we analyze soil community composition in terms of a single parameter at different depths at the same location, we have repeated measures of the parameter.

A slightly stretched version of repeated measures encompasses measurements of different quantities at the same time and place.  An example of this would be measurement of gene expression profiles on a set of samples.  In this case the different probes represent the expression levels of different genes.  This is an abuse of terminology, mentioned only so you will be aware of the possibility.  The usual term for measurements of different quantities at the same time and place is multivariate.

For the following discussion, assume that we have a classic repeated measures situation.  That is, we have individual subjects with measurements of the same quantity taken across time.  The times are assumed to be the same for all subjects, and preferably equally spaced.

The conflation of repeated measures and multivariate analysis probably owes more to the limited computing power of days gone by than to anything else.  With limited computational power, it was necessary to create statistical methods that could be computed in reasonable time on reasonably-sized data sets.

A multivariate analysis of variance (MANOVA) treats all of the repeated measures from each subject as a single vector of responses.  A MANOVA is appropriate for the case where the responses covary arbitrarily, but can still be thought of as a (vector-valued) linear function of the independent variables.  Since the covariance can be general, it certainly encompasses more restrictive situations such as this one, where you might reasonably expect values from the same subject to be correlated.

With a little imagination, you can see how one could devise various tests for whether the repeated measures differ from each other.  For example, one could form a new data set consisting of the differences between adjacent time points.  Then, a formal statistical test of whether the mean differences were all equal to zero would do the job.
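
As a minimal sketch of the adjacent-differences idea, here is one way it could be done in Python with NumPy.  The test used is a one-sample Hotelling's T-squared test on the differences, which is my choice for illustration rather than the only option.  The function name and the simulated data are made up for the example, and the data are assumed to be complete (no missing values) with more subjects than time points.

```python
import numpy as np

def adjacent_difference_test(y):
    """Hotelling's T^2 test that all mean adjacent differences are zero.

    y: (n_subjects, n_times) array with no missing values.
    Returns (T2, F, df1, df2); F is referred to an F(df1, df2) distribution.
    """
    y = np.asarray(y, dtype=float)
    n, k = y.shape
    d = np.diff(y, axis=1)                      # differences between adjacent times
    p = k - 1                                   # number of difference variables
    dbar = d.mean(axis=0)                       # mean difference vector
    S = np.atleast_2d(np.cov(d, rowvar=False))  # sample covariance of the differences
    T2 = n * dbar @ np.linalg.solve(S, dbar)    # n * dbar' S^{-1} dbar
    F = (n - p) / (p * (n - 1)) * T2            # exact F transformation of T^2
    return T2, F, p, n - p

# Simulated (hypothetical) data: 20 subjects, 4 time points, with an upward trend.
rng = np.random.default_rng(0)
n, k = 20, 4
subject_effect = rng.normal(size=(n, 1))        # shared within each subject
trend = np.array([0.0, 0.5, 1.0, 1.5])
y = 10 + subject_effect + trend + rng.normal(scale=0.5, size=(n, k))

T2, F, df1, df2 = adjacent_difference_test(y)
print(f"T2 = {T2:.2f}, F = {F:.2f} on ({df1}, {df2}) df")
```

A p-value would come from the upper tail of the F distribution with those degrees of freedom, for example via `scipy.stats.f.sf(F, df1, df2)` if SciPy is available.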

If we continue the statistical bad habit of making assumptions, then we may as well make some more.  Suppose that the covariance amongst the repeated measures takes a form known as compound symmetry: every time point has the same variance, and every pair of time points has the same correlation.  Without going into details, compound symmetry implies a somewhat more general condition known as sphericity.  The bottom line on sphericity is that the variance of the difference between any two time points is the same.
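
To see why compound symmetry leads to equal variances of differences, a small numerical check may help.  The sketch below builds a compound-symmetric covariance matrix and, for contrast, an AR(1) matrix, then computes Var(Y_i - Y_j) = Var(Y_i) + Var(Y_j) - 2 Cov(Y_i, Y_j) for every pair of time points.  The function names and the parameter values (variance 2.0, correlation 0.6, four time points) are arbitrary choices for illustration.

```python
import numpy as np

def compound_symmetry(k, sigma2, rho):
    """k x k covariance: common variance sigma2, common correlation rho."""
    return sigma2 * ((1 - rho) * np.eye(k) + rho * np.ones((k, k)))

def difference_variances(cov):
    """Var(Y_i - Y_j) for every pair i < j, computed from a covariance matrix."""
    k = cov.shape[0]
    return {(i, j): cov[i, i] + cov[j, j] - 2 * cov[i, j]
            for i in range(k) for j in range(i + 1, k)}

# Compound symmetry: every pair gives 2 * sigma2 * (1 - rho).
cs = compound_symmetry(k=4, sigma2=2.0, rho=0.6)
cs_vars = difference_variances(cs)

# AR(1) for contrast: correlation decays with the lag, so sphericity fails.
lags = np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
ar1 = 2.0 * 0.6 ** lags
ar1_vars = difference_variances(ar1)

print("compound symmetry:", {pair: round(v, 3) for pair, v in cs_vars.items()})
print("AR(1):            ", {pair: round(v, 3) for pair, v in ar1_vars.items()})
```

Under compound symmetry every pair has the same difference variance, while under AR(1) the variance grows with the distance between the two time points.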

You may ask what this heroic assumption brings.  Well, in this case, it brings a big computational simplification for the analysis.  A so-called univariate analysis may be run instead.  That analysis depends heavily on the assumption of sphericity being true, so for cases where sphericity might be violated, corrections such as the Greenhouse-Geisser and Huynh-Feldt adjustments have been proposed.
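
One widely used correction of this kind is the Greenhouse-Geisser epsilon, which I use here purely as an illustration.  The sketch below computes epsilon from a covariance matrix by double-centering it and comparing the squared sum of its eigenvalues to the sum of their squares.  Epsilon equals 1 when sphericity holds and cannot fall below 1/(k - 1) for k time points.  The function name and the example matrices are assumptions made for the example; in practice the covariance would be estimated from the data.

```python
import numpy as np

def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon for a k x k covariance matrix of repeated measures."""
    cov = np.asarray(cov, dtype=float)
    k = cov.shape[0]
    C = np.eye(k) - np.ones((k, k)) / k   # centering matrix
    dc = C @ cov @ C                      # double-centered covariance
    # trace(dc) is the sum of eigenvalues; trace(dc @ dc) is the sum of their squares.
    return np.trace(dc) ** 2 / ((k - 1) * np.trace(dc @ dc))

k = 4
# Compound symmetry (variance 2.0, correlation 0.6): sphericity holds.
cs = 2.0 * (0.4 * np.eye(k) + 0.6 * np.ones((k, k)))
# AR(1) with the same variance and lag-one correlation: sphericity is violated.
ar1 = 2.0 * 0.6 ** np.abs(np.subtract.outer(np.arange(k), np.arange(k)))

eps_cs = greenhouse_geisser_epsilon(cs)
eps_ar1 = greenhouse_geisser_epsilon(ar1)
print(f"epsilon (compound symmetry) = {eps_cs:.3f}")
print(f"epsilon (AR(1))             = {eps_ar1:.3f}")
```

The correction is applied by multiplying both the numerator and the denominator degrees of freedom of the univariate F test by epsilon, which makes the test more conservative the further the data stray from sphericity.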

An unfortunate limitation of the old-school methods of MANOVA and univariate repeated measures is that they rely on having complete data for each subject.  Using these methodologies essentially requires you to toss out all of the data from any subject with a missing value.  Also, the MANOVA approach requires many subjects relative to the number of time points in order to provide a decent estimate of the covariance structure of the data.
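
To get a feel for how costly this can be, here is a small sketch of complete-case deletion.  The numbers are invented for the example: five subjects, four time points, and three scattered missing values coded as NaN.

```python
import numpy as np

# Hypothetical data: rows are subjects, columns are time points, NaN is missing.
y = np.array([
    [5.1, 5.4,    5.9,    6.2],
    [4.8, np.nan, 5.5,    5.9],
    [5.3, 5.6,    6.0,    np.nan],
    [5.0, 5.2,    5.6,    6.0],
    [4.9, 5.3,    np.nan, 6.1],
])

# Keep only subjects with no missing values, as the classical methods require.
complete = ~np.isnan(y).any(axis=1)
y_complete = y[complete]

n_obs_total = int((~np.isnan(y)).sum())   # measurements actually observed
n_obs_kept = int(y_complete.size)         # measurements surviving deletion

print(f"subjects kept: {int(complete.sum())} of {y.shape[0]}")
print(f"observed measurements kept: {n_obs_kept} of {n_obs_total}")
```

Here three missing values out of twenty cost us three of the five subjects, and with them nine measurements that were actually observed.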

With the advent of linear mixed effects models, we can for the most part dispense with MANOVA and the old-school univariate analysis for repeated measures.  The linear mixed effects framework provides close to equivalent modeling capability with much greater flexibility.  Also, since this framework is likelihood-based, it can use all of the observed data from subjects with some missing values (under the usual assumption that the data are missing at random), which is a huge improvement.

