Abstract
1 Introduction
Longitudinal studies are ideal for investigating age-related developmental change. When age is the time metric, different types of longitudinal designs can be distinguished according to the distribution of ages at recruitment. In a single cohort design all participants start out at the same age, whereas a study that recruits all available individuals with initial age in a specified range can be regarded as an ‘unstructured multicohort longitudinal design’. 1 An accelerated longitudinal design (ALD) is a more structured multiple cohort design that takes multiple single cohorts, each one starting at a different age.
Figure 1 represents an ALD covering ages 0–7 with three cohorts, four annual measurements per subject and an overlap of two measurements between cohorts. Collection of measurements for this design would take 3 years (ignoring recruitment lags), compared to 7 years for a single cohort longitudinal study covering the same age range. This illustrates the main advantage of an ALD: its ability to span the age range of interest in a shorter period of time than would be possible with a single cohort longitudinal design. An additional advantage of a shorter study is that it should be less affected by dropout (where a participant leaves the study prematurely, so that no further measurements are possible). The trade-off for this shorter duration is the inherent missing data: by design, each subject’s measurement schedule covers only part of the age range of interest. This can be a problem when there is an age cohort effect, that is a systematic difference between people born at different times.
A three cohort accelerated longitudinal design: recruitment ages are 0, 2 and 4 for cohort 1 (squares), cohort 2 (circles) and cohort 3 (triangles), respectively. Four annual measurements are taken, and there is an overlap of two measurements between successive cohorts: cohorts 1 and 2 both have measurements at ages 2 and 3, and cohorts 2 and 3 both have measurements at ages 4 and 5.
Design of an accelerated longitudinal study requires consideration of a number of parameters. Specific to this type of study are the number of cohorts and the extent of overlap between cohorts, whereas common to any longitudinal study, the frequency and timing of measurements also needs to be set. Varying these parameters may produce a large collection of candidate designs, so the question of how to choose the best design arises. In addition, the study may be constrained to a maximum duration, number of participants or number of measurements, and the relative costs of implementing different ALDs will play an important role in choosing between them.
Moerbeek
2
considered the effect of number of cohorts, extent of overlap and frequency of measurement on power to detect a linear trend, for some specific ALDs. Tekle et al.
3
considered
The aim of this paper is to provide a comprehensive discussion of the issues involved in the design and analysis of accelerated longitudinal studies. We provide a systematic investigation of characteristics of ALDs that affect power to detect a linear trend. We incorporate a model for study costs and identify the most cost-efficient designs. We also consider the impact of dropout and how to deal with cohort effects. Based on our results, we recommend some general guidelines for designing accelerated longitudinal studies.
2 Methods
2.1 Models and assumptions
2.1.1 Linear mixed model for responses
We adopt a polynomial linear mixed model for the responses
The form of the fixed effects design matrix
The covariance matrix for the responses from subject
If only cohort-specific covariates such as age are included in the model, and measurements are taken at a common set of ages for individuals from the same cohort, then we can drop the
2.1.2 Linear trend with age
The model for a linear trend with age, and random effects for both the intercept and slope, is
Where there is a linear trend, interest usually lies in estimation and inference regarding the slope parameter
As described by Galbraith and Marschner,
7
for a single cohort longitudinal design with
2.1.3 Locally D-optimal designs
2.1.4 Effect of centring age
As discussed by Fitzmaurice et al.,
4
in a longitudinal study it is often desirable to centre the ages by subtracting some common age for all individuals in the study. This can avoid problems of collinearity when the model for the mean is a polynomial trend. For example, in an accelerated longitudinal study, the initial age
From an analysis point of view, centring age is straightforward and does not require any further adjustments to the model. However, at the design stage of an accelerated longitudinal study, assumptions regarding the parameters of the assumed model will be required. In particular, assumptions regarding the elements of the random effects covariance matrix
As an example, consider designing a study that adopts a linear model for the trend with age, and suppose
2.1.5 Model for costs
We assume that the costs of undertaking an accelerated longitudinal study can be split into the following four components: overheads, costs of recruiting subjects, costs of taking measurements, and ongoing costs related to the duration of the study. Hence
Expressing measurement and duration-related costs as multiples of the cost of recruiting a subject allows us to compare designs according to
2.1.6 Cohort effects
By ‘cohort effect’ in this paper, we mean a
When cohort effects exist, estimates of age-related change obtained from longitudinal and cross-sectional studies will differ. This is because a cross-sectional study samples people of different ages at a fixed time, so that age-related change is estimated from a mixture of different cohorts. For example, a cross-sectional study of respiratory function could be affected by the changes in smoking habits referred to above: the older people in the study may tend to have worse lung function not just because they are older, but also because of their smoking habits. In this case the cross-sectional estimate of change in lung function would indicate a steeper decline than the estimate obtained from a longitudinal study. Ware et al. 10 analyse a study of pulmonary function in never-smoking adults which finds evidence for a cohort effect not related to smoking.
Cohort effects can also have an impact on accelerated longitudinal studies. An ALD ‘pieces together’ trajectories from different cohorts, which may not be a valid representation of the whole age range when there are cohort effects. Since a single cohort longitudinal study consists of only one age cohort, it will produce an unbiased estimate of within-subject change
Some methods for modelling cohort effects in an ALD will be considered in this section. Section 2.1.7 considers methods that treat cohort effects as fixed, whereas Section 2.1.8 discusses a model with random cohort effects.
2.1.7 Fixed cohort effects
The most general model incorporating fixed cohort effects allows a completely different trend for each cohort. This approach is discussed by Miyazaki and Raudenbush.
5
When there is a linear trend with age, it is equivalent to allowing a different fixed intercept and slope for each cohort, so that the mean trend is
A simpler model, discussed by Fitzmaurice et al.,
4
adds terms involving initial age to the model. For example, the linear trend model would include a single extra term in
A model intermediate between models (5) and (6) can be conceptualised by allowing the fixed effects parameters to vary linearly with initial age. For the linear trend model, the intercept and slope for cohort
More general versions of model (7) can be obtained by allowing the fixed effects parameters to vary smoothly, but not necessarily linearly, with initial age. For example, if we assume a quadratic relationship such that
Implied longitudinal and cross-sectional models
For models (6), (7) and (8), it is possible to write down the implied longitudinal and cross-sectional models. We assume here that the spacing between measurement times is the same for all individuals in the study, and let
When the longitudinal and cross-sectional models coincide there is no cohort effect, and therefore the parameter constraints producing equality of these two models also define the null hypothesis of no cohort effect. The null hypothesis for each model is given in the final column of Table 1.
For model (6), the longitudinal model implies linear trends for the different cohorts that are parallel but shifted by the intercept term. This results in parallel linear trends at each time for the cross-sectional model. In model (7) the linear longitudinal trends are no longer parallel, leading to quadratic cross-sectional trends, whereas model (8) gives rise to cubic cross-sectional trends.
Comparison in region of overlap
If the curves for different cohorts are similar in regions of overlap, it may not be of concern that hypothesis tests indicate a significant difference in fixed effects parameters. This could be the case particularly when the regions of overlap are small, so that the information on the age range covered by a particular cohort comes largely from that cohort.
In this situation, fixed effects tests for differences between cohorts could be derived by integrating the absolute value of the difference between the estimated curves over the age range of overlap. Standard errors could be obtained via the delta method.
2.1.8 Random cohort effects
Another method of allowing for differences between cohorts would be to add a third level to the hierarchical model by including cohort-specific random effects. Under this approach, the model (1) becomes
2.1.9 Comparing designs when cohort effects are present
If cohort effects are anticipated at the design stage, two criteria of interest for comparing different designs might be:
Power to detect cohort effects; or The determinant of the generalised variance corresponding to the vector of fixed effects.
Power to detect cohort effects
If cohort effects are modelled using model (6) then the null hypothesis of no cohort effect corresponds to equating a single parameter to zero:
For model (7) the test for no cohort effect involves two parameters:
For the case of random cohort effects, testing the null hypothesis of no cohort effect corresponds to testing that some variance components are equal to zero. This is a test on the boundary of the parameter space, such that the null distribution is in general unknown. In this situation, power and sample size requirements could be obtained via simulation.
The determinant of the generalised variance corresponding to the vector of fixed effects
For comparing different designs (under the same model) with respect to how precisely the entire vector of fixed effects can be estimated, the
2.1.10. Dropout
The relative impact of dropout on different ALDs depends on the dropout mechanism. Whilst this mechanism is in general unknown, it seems reasonable to assume that the overall level of dropout, as measured by the proportion
In studying the impact of dropout on single cohort longitudinal designs, Galbraith and Marschner
7
adopted a Weibull model for the hazard of dropout: the rate at which individuals drop out at time
The approach of Verbeke and Lesaffre
13
can then be used to simulate dropout patterns from a multinomial distribution with probabilities given by
In Section 4.2 we examine the pattern of dropout observed in a real study and investigate the impact of dropout for some ALDs.
3 Results
3.1 Identifying possible designs
With a completely general number and schedule of measurements for each cohort, the collection of candidate designs may be so large that it would be infeasible to investigate the properties of each one. Here we concentrate on ‘balanced’ ALDs, taken to mean ALDs with equally spaced measurements, the same number of measurements per subject, the same extent of overlap (number of common ages) between successive cohorts, and where the initial age for each cohort is an integer multiple of the interval between measurements. Under these conditions the measurement ages are
ALDs covering ages 11–18 with a 12-month interval between measurements.
Overlap for single cohort design taken to be 8 since there is complete overlap between all subjects.
In the next section, the designs listed in Table 2 are used to illustrate how different ALDs can be compared.
3.2 Comparing designs
In this section we consider how to compare designs on the basis of power and the cost function (4) introduced in Section 2.1.5. Section 3.2.1 considers comparisons in the absence of dropout and cohort effects. In Section 3.2.2 the impact of dropout is considered, while Section 3.2.3 deals with cohort effects.
3.2.1 No dropout, no cohort effects
Power to detect a linear trend
We examined the effect of number of measurements per person, number of cohorts and extent of overlap on the power properties of the designs in Table 2.
Figure 2 plots the total number of subjects and total number of measurements required to achieve 90% power (3) to detect a linear trend in model (2) against the three attributes: measurements per person ( Total number of subjects (top) and total number of measurements (bottom) required to achieve 90% power to detect a linear trend, plotted as a function of number of measurements per person (left), number of cohorts (middle) and overlap (right), for the designs in Table 2 (identified by letter in the plot). Design parameters were 
Whilst Effect of number of cohorts/overlap on number of subjects/measurements for 90% power. The 
Figure 4 gives an alternative presentation of the results, showing power as a function of the three determinants of cost: total number of subjects, total number of measurements and duration, for the designs in Table 2.
Power to detect a linear trend, as a function of (a) total number of subjects; (b) total number of measurements; (c) duration for fixed total number of subjects (50); and (d) duration for fixed total number of measurements (112). Different curves/points represent different numbers of measurements per subject (
Figure 4(a) plots power as a function of total number of subjects. The power curves increase in height as
We also examined the effect of the interval between measurements, by comparing the 12-month interval designs listed in Table 2 with the 6-month interval designs of the same duration. Figure 5 shows the total numbers of subjects and measurements required to achieve 90% power for the 44 designs, plotted against duration. It can be seen that the slightly smaller number of subjects often required for the 6-month interval designs is more than offset by the larger number of measurements taken for each subject, leading to a larger total number of measurements for all of the 6-month designs.
Total numbers of subjects (left) and measurements (right) to achieve 90% power, for the designs in Table 2 and the corresponding 6-month interval designs, plotted against duration.
The results displayed in this section assume the design parameters If the aim is to minimise the total number of subjects, then designs with larger values for If the aim is to minimise the total number of measurements, then designs with smaller values for For two designs with the same value of
In addition, decreasing the interval between measurements while keeping duration fixed (and hence increasing the number of measurements per person) led to an increase in the total number of measurements. The effect on total number of subjects appears to be small but could be in either direction.
It should be noted that these conclusions apply to choosing amongst balanced ALDs and rely on the assumptions of a linear trend and no cohort effects. In reality, other considerations may come into play when choosing a design, such as the need to check for linearity or for cohort effects, and the need to estimate variance components.
Costs
To illustrate how costs vary with design parameters, we use the total numbers of subjects and the resulting total number of measurements required to achieve 90% power for each design in Table 2. These are shown in the leftmost plots of Figure 2, for
We considered a grid of 36 values for ( Costs required to achieve 90% power, for the designs in Table 1, with 
It can be seen that for a fixed value of
We also considered the effect of varying the ratio of within to between-subject variability,
We examined the effect of the interval between measurements in two different ways. First, the 12-month interval designs listed in Table 2 were compared with the 6-month interval designs with the same duration. Costs for the 44 designs were calculated using the total numbers of subjects and measurements required to achieve 90% power shown in Figure 5. For the purpose of comparing costs, without loss of generality we can set Costs to achieve 90% power, for 6- and 12-month interval designs with duration 3, as a function of 
The second investigation compared four different designs to cover ages 0–2 years, all using two cohorts with an overlap of one measurement. The interval between measurements was either 12, 6, 4, or 3 months, corresponding to Costs to achieve 90% power, for two-cohort designs to cover ages 0–2 with an overlap of 1 and 3, 4, 6, or 12 months between measurements.
3.2.2 Dropout
Dropout in the Longitudinal Study of Australian Children (LSAC)
Growing Up in Australia: The LSAC 15 recruited over 10,000 Australian children in 2004 and is following them up with the aim of addressing research questions related to child development and well-being. LSAC is a two-cohort ALD: the B cohort aged 0–1 initially and the K cohort aged 4–5 initially. As an initial investigation of the Weibull dropout model described in Section 2.1.10, we examined the pattern of dropout observed in LSAC.
Figure 9 shows the actual proportions of each cohort remaining in the study, as a function of study time, compared to a Weibull model with Dropout pattern in LSAC, showing the proportion remaining in the study as a function of study time. The letters represent observed proportions for the two cohorts (B and K), and the solid line is the fitted Weibull model with 
Extent of power loss for different ALDs
Using the approach described in Section 2.1.10, we examined the impact of dropout for the designs listed in Table 2. Figure 10 shows the ratio of the power under 30% dropout to the power under no dropout, for Extent of power loss for 30% dropout, for the designs in Table 2, identified by letter, for 
Figure 10 shows how, under the assumed models, the extent of power loss increases with duration. Hence the single cohort longitudinal design, with the longest duration, suffers the most power loss. Figure 10 also shows that the power loss is greater when dropout is concentrated more towards the start of the study. For the single cohort longitudinal design it can be seen that when 30% of participants drop out over the course of the study, the power is about 4% lower than if there were no dropout when
3.2.3 Cohort effects
Power to detect cohort effects
In this section we compare the ability of 11 of the designs listed in Table 2 to detect a cohort effect, assuming either model (6) or (7) holds. Designs A (cross-sectional) and M (single cohort longitudinal) are excluded from the comparison since they cannot distinguish cross-sectional from longitudinal effects and are therefore unable to detect cohort effects.
Figure 11 shows the total number of subjects and total number of measurements required to achieve 90% power to detect a cohort effect under model (6) with Total number of subjects (top) and total number of measurements (bottom) required to achieve 90% power to detect a cohort effect under model (6) (left) and (7) (right), for designs B–L in Table 2 (identified by letter on the horizontal axis). Different coloured lines represent different durations (different 
Overall precision with which the vector of fixed effects can be estimated
This section compares designs according to the
Figure 12 shows the determinant of the fixed effects covariance matrix under model (2) (no cohort effects) and with fixed cohort effects under models (6) and (7), assuming Determinant of fixed effects covariance matrix under models (2), (6) and (7), assuming 
Comparing designs for a fixed 120 subjects shows that while the single cohort longitudinal design is best in the absence of cohort effects, designs with fewer measurements per person can be better when there are cohort effects. For the chosen design parameters, design H, with five measurements per subject, is optimal for model (6), and design E, with four measurements per subject, is optimal for model (7).
For a fixed 840 measurements, the design with the lowest possible number of measurements per person is best for models (2) and (7). For model (6), design E, with four measurements per person, is optimal, although design B, with two measurements per person, is almost as good.
Under the assumption of fixed cohort effects, we again see that for designs with the same value for
Figure 13 shows some results for the case where cohort effects are assumed to be random rather than fixed. The three sets of results show the determinant for the case of no cohort effects, assuming a random cohort intercept only, and assuming a random cohort intercept and slope. The determinants for a fixed 120 subjects, and for a fixed 840 measurements, are shown in separate plots. We assume Determinant of fixed effects covariance matrix with no cohort effects, a cohort random intercept, and both random intercept and slope, assuming 
We see that when random cohort effects are present, the best design for a fixed number subjects need not be the single cohort longitudinal design: design K performs best under the assumption of a random cohort intercept only, and design I performs best when random cohort intercept and slope are both present.
For a fixed 840 measurements, the cross-sectional design is best when there are no cohort effects or just a random intercept, but design D is best with both random intercept and slope.
Unlike the no cohort effects case and the fixed cohort effects case, where there are designs with the same value for
4 Discussion and conclusions
ALDs are an attractive alternative to a single cohort longitudinal design when it is important to limit the duration of a study. They can also be preferable to an unstructured longitudinal study, where all individuals with initial age in a specified range are recruited, since they can be designed more efficiently. However, design of an accelerated longitudinal study can be a more complex task because additional design features, such as the number of cohorts and extent of overlap, require consideration.
In this paper we have discussed the issues that need to be considered when designing an accelerated longitudinal study, starting with a description of the linear mixed model framework for age-related trends in general and a linear trend in particular. We have considered criteria against which designs can usefully be judged (including cost considerations), the issue of cohort effects and the impact of dropout. We have shown how to generate all possible designs from the practically useful class of ‘balanced’ ALDs, and used this approach to illustrate how the design issues we have identified can be explored, and optimal designs according to the criteria mentioned can be chosen. In particular, we have investigated the impact of varying design parameters on these criteria.
Whilst we have tried to provide a comprehensive summary of design issues for accelerated longitudinal studies, in this paper we have not attempted to consider all possible designs or all possible trends with age. The motivation for considering balanced ALDs was that these designs represent a practically useful class that would be routinely considered when contemplating such a study. We are currently considering more general designs, including the problem of finding designs that either minimise cost for fixed variance or minimise variance for fixed cost. Similarly, the linear trend assumption represents a simple model that is often adopted at the design stage and is a useful starting point for exploring the effect of varying design parameters. We have also obtained some results for quadratic models that are broadly consistent with the linear trend, and the approaches we have described could be adapted to more complex trends. In addition, we have not looked at an exhaustive range of parameters (for example for the variance components), but rather have sought to describe general methods that can be used for different parameters.
Despite these limitations, our results suggest some broad guidelines for designing accelerated longitudinal studies.
In the absence of cohort effects, and for a fixed power to detect a linear trend, we found that the number of measurements per subject,
For two designs with the same value for
Finally, increasing the frequency of measurement for a fixed duration appears to have little effect on the required number of subjects, but increases the required total number of measurements.
To compare designs with respect to cost, we have used a three-component cost model incorporating recruitment, measurement and duration-related costs. For a fixed power to detect a linear trend, assuming no cohort effects, we found that as measurement costs increase relative to recruitment costs, the best design shifts towards smaller values for
The model we adopted for dropout assumes that the proportion of subjects who drop out by the end of the study increases with study duration. Under this model, studies with shorter duration will perform better. For the designs and models we considered, the maximum power loss for 30% dropout was about 7%, suggesting an increase in target power by this amount would be sufficient to allow for that level of dropout.
We have also presented some results for comparing designs when cohort effects are present. These results suggest that when the aim is either to detect cohort effects or to achieve a desired level of precision for estimating the entire vector of fixed effects estimates, there may be an advantage in increasing the number of measurements per subject. For designs with the same value of
Finally, it should be mentioned that whilst this paper has focussed on design, there are issues surrounding the analysis of accelerated longitudinal studies that also need to be considered. For example, convergence problems can be encountered when fitting hierarchical linear mixed models in general, usually when the number of higher level units is small. For ALDs, there may be designs for which the combination of
