Abstract
Introduction
In recent years, mixture models have gained increasing attention among practitioners and statisticians (McLachlan & Peel, 2000). Finite mixture models (FMMs) underpin a number of statistical techniques, one of which is growth mixture modeling (GMM), a technique becoming increasingly popular in longitudinal studies due to its flexible analysis framework combining continuous and categorical latent variables (Bauer & Curran, 2004; B. O. Muthén, 2004; B. O. Muthén & Shedden, 1999). In a recent publication, “Handbook for Advanced Multilevel Analysis,” several researchers (B. O. Muthén & Asparouhov, 2011; Vermunt, 2011) have pointed out the importance of combining multilevel modeling with mixture models. Despite “the richness of detail that a multilevel growth mixture model can extract from the data” (B. O. Muthén & Asparouhov, 2011, p. 38), “many issues have not yet been fully resolved” due to the fact that “multilevel mixture modeling is a rather new area of statistical methodology” (Vermunt, 2011, p. 78). This article examines the impact of ignoring the higher level nesting structure in multilevel mixture models (MMMs) and thereby contributes to the body of knowledge in multilevel mixture modeling.
Despite the flexibility provided by FMM, researchers analyzing their data with FMM have generally assumed that the participants were independent of one another, even though this may not always be true. For example, in educational settings, the data structure is very likely to contain two or more levels (e.g., students nested within schools). Nevertheless, researchers have often ignored the higher level nesting structure (i.e., schools) and analyzed the model assuming that the students were independent of one another (e.g., D’Angiulli, Siegel, & Maggi, 2004). In a literature search we conducted in PsycINFO (from year 2000 to 2011) for empirical studies applying mixture modeling in different substantive areas, we found only one recent study using MMM (Van Horn et al., 2008). Some of these studies did not need MMM because their data had no higher organizational level. However, other studies used mixture modeling when they should have used MMM: they ignored the highest level of nesting (e.g., the school level) and mistakenly assumed that individuals were independent of one another (reasons for doing so include lack of a cluster ID, MMM’s model complexity, and/or model convergence issues). In a simulation study, Chen, Kwok, Luo, and Willson (2010) found that when modeling latent growth trajectories, ignoring the highest level results in the redistribution of the variance from the ignored level (i.e., the organization/school level) to the adjacent level (i.e., the individual/student level). The effects of ignoring clustering have not been studied in the finite mixture modeling setting. It is therefore important to examine this impact, to make applied researchers more aware of the consequences of not considering the higher organizational level, and to encourage caution in interpreting statistical results when a higher level must be ignored.
Purpose of the Study
The purpose of this article is to examine the impact of ignoring a higher nesting structure in MMM on the accuracy of classification of individuals, and on the accuracy and statistical inference (i.e., Type I error rate and statistical power) of the parameter estimates for the model of each subpopulation.
A data structure in which students are nested within schools is considered. Two latent classes with known group memberships were generated and then analyzed under the true model (MMM, accounting for the higher level structure) and the misspecified model (FMM, ignoring the higher level structure). Two simulation studies were conducted: in Study 1, the two latent classes were balanced in both size and variance, whereas in Study 2, they were unbalanced in size and variance. Results show how the hit rate and the relative biases (RBs) of the group mean estimates and their standard errors were influenced by ignoring the higher level nesting structure.
Brief Review of Multilevel Mixture Models (MMMs)
In this section, key concepts related to multilevel finite (normal) mixture models with continuous indicators are presented. The development of MMMs drew upon two lines of research. One component of MMM is finite mixture modeling (FMM), which assumes that the data under analysis are composed of a discrete number of components. FMM can handle situations where a single parametric family is unable to provide a satisfactory model for local variations in the observed data (McLachlan & Peel, 2000). FMM is similar to multiple group analysis; an important difference, however, is that in mixture modeling the group membership is not observed (i.e., it is latent; B. O. Muthén, 2001; Vermunt & Magidson, 2005). This is why some researchers refer to FMM in these terms, although statisticians often reserve the term
FMM can model unknown heterogeneous subpopulations as well as the random variation of the response variables within latent classes. However, FMM does not consider multilevel data in which individuals are nested within organizations; hence, it cannot handle the nonindependence of individuals due to cluster sampling. As an extension of FMM, MMMs take the nonindependence of individuals into consideration by specifying a model for each level of the multilevel data. The model for each level can differ, depending on whether we assume heterogeneity and/or model the random effects at the individual and organizational levels. For example, at the individual level, we can specify a mixture model that models individuals’ response patterns and classifies individuals into different subpopulations, whereas at the organization level, we can specify a model that captures only the variance across organizations without classifying them into subpopulations. It is also possible to specify a mixture model at the organization level. However, this article addresses only the more common MMM with classification at the individual level (e.g., students classified into different subgroups within schools; patients classified into different subtypes within clinics).
Study 1
Method
Data generation
In Study 1, data with two known subpopulations under a two-level model were first generated with equal population sizes and variances. Then, the data were analyzed as a two-level model (i.e., true model) using multilevel mixture model (MMM) and as a single-level model (i.e., misspecified model) using FMM. The two-level model for data generation is shown below:
Level 1:
with
Level 2:
with
where subpopulation
Suppose Level 1 is the student level and Level 2 is the school level. There were 40 schools, and within each school there were 20 students. The number of students in each subpopulation was 400, as the mixing proportion was set to be 50% versus 50%. Within each school, there were 20 students coming from two subpopulations, 10 at-risk versus 10 non-at-risk. Altogether, there were 800 students within each replication for data generation. The number of higher level units was set to be 40 given that the recommended minimum number of higher level units for MMMs is 30 (L. K. Muthén, 2003; B. O. Muthén, 2005).
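The design just described can be sketched in code. Because Equations (1a) and (1b) are not reproduced in this version of the text, the sketch below assumes a standard two-level random-intercept model in which a binary class indicator shifts the mean; the variance values (`sigma2`, `tau00`) are hypothetical, while the design sizes and the class means 1 and 2.095 come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Design from the text: 40 schools, 20 students each (10 per latent class),
# giving 800 students per replication.  Assumed model (the article's
# equations are truncated in this version):
#   y_ij = gamma00 + gamma01 * G_ij + u_j + r_ij,
#   u_j ~ N(0, tau00), r_ij ~ N(0, sigma2), G_ij in {0, 1}.
n_schools, n_per_class = 40, 10
gamma00, gamma01 = 1.0, 1.095        # class means 1.000 and 2.095 (from the text)
sigma2, tau00 = 1.0, 0.25            # hypothetical variances; ICC = 0.25/1.25 = 0.2

rows = []
for j in range(n_schools):
    u_j = rng.normal(0.0, np.sqrt(tau00))          # school random effect
    for g in (0, 1):                               # the two latent classes
        r = rng.normal(0.0, np.sqrt(sigma2), n_per_class)
        for y_ij in gamma00 + gamma01 * g + u_j + r:
            rows.append((j, g, y_ij))

school, klass, y = (np.array(col) for col in zip(*rows))
print(y.size)        # 800 students in one generated replication
```

In the actual study, each such replication would then be fitted twice in Mplus: once as TYPE = TWOLEVEL MIXTURE (true model) and once as TYPE = MIXTURE (misspecified model).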
In this two-level model, a total of four parameters needed to be specified: two fixed effect coefficients (i.e., γ00 and γ01) and two variances of the random effects (i.e., σ2 and τ00). Before specifying the population parameters in the conditional model, a random intercept model in which there are no subpopulations is presented as follows:
Level 1:
with
Level 2:
with
The variance of the random effect at Level 1 was specified following Raudenbush and Liu’s (2001) criteria, namely,
For τ00*, the intraclass correlation (ICC) formula
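The sentence above is cut off in this version of the text. For reference, the standard intraclass correlation for a two-level random-intercept model, presumably the formula referred to, is

```latex
\mathrm{ICC} \;=\; \frac{\tau_{00}}{\tau_{00} + \sigma^{2}}
```

that is, the proportion of the total variance attributable to the Level 2 (school) units.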
According to Snijders and Bosker (1999), adding a predictor (i.e., subpopulation
Using these formulae for calculation, a small (0.161) and a medium (0.300)
Using the
After fixing γ00 to 1, the mean for Subpopulation A and the mean for Subpopulation B were calculated using Equation (1a). The mean of Subpopulation A was 1 in all conditions, and the means for Subpopulation B were 1.632, 2.095, and 2.414 at different levels of
In summary, by specifying
The simulation used a 3 (effect sizes—amount of variance explained by group membership) × 2 (magnitude of ICC) factorial design to generate the data. A total of 500 replications were generated for each condition using SAS 9.1, yielding a total of 3,000 data sets. Each data set was then analyzed by a true model (MMM considering the higher/cluster level, type = two-level mixture) and a misspecified model (FMM ignoring the higher/cluster level, type = mixture) using Mplus 4.2 Mixture routine (L. K. Muthén & Muthén, 2006-2007).
Analysis
For each condition, valid replications were selected for data analysis because, among the replications with converged results, some latent classes contained very few students (i.e., 1 or 2). A valid replication was defined as one in which each of the two subpopulations (or classes) had a class size of at least 6% of the total sample size (i.e., 48 out of 800). This 6% criterion was based on the average percentage of sample size for the smallest class in published studies using FMM found in the PsycINFO database.
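The 6% screening rule can be restated as a one-line check (a minimal sketch; the function name is ours, not from the article):

```python
def is_valid_replication(class_sizes, total_n=800, min_prop=0.06):
    """A replication is valid if every latent class holds at least
    min_prop of the total sample (here, 6% of 800 = 48 students)."""
    return min(class_sizes) >= min_prop * total_n

print(is_valid_replication([48, 752]))   # True: smallest class meets the cutoff
print(is_valid_replication([40, 760]))   # False: smallest class falls below 48
```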
The accuracy of classification of individuals, and the accuracy as well as the test of significance (i.e., Type I error rate and statistical power) of the parameter estimates of the model for each subpopulation were then evaluated.
Hit rate is the percentage of at-risk/non-at-risk students correctly classified as at-risk/non-at-risk. The true and misspecified models were evaluated by comparing the hit rate difference between the two models.
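The hit-rate computation can be sketched as follows. Mixture-model class labels are arbitrary, so the estimated labels must be aligned with the true labels before counting agreements; the article does not describe its alignment step, so this sketch simply takes the better of the two possible label orientations for a two-class problem.

```python
import numpy as np

def hit_rate(true_class, assigned_class):
    """Proportion of individuals assigned to their true latent class,
    resolving two-class label switching by taking the better of the
    two label orientations (an assumption; not described in the article)."""
    t = np.asarray(true_class)
    a = np.asarray(assigned_class)
    return max(np.mean(t == a), np.mean(t == 1 - a))

print(hit_rate([0, 0, 1, 1], [1, 1, 0, 1]))  # 0.75
```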
The group mean parameter estimates from the true and misspecified models were summarized across the valid replications for each of the six conditions. The RB for each parameter estimate was calculated using the following equation:
where
The RB of estimated standard errors was computed using the following equation:
where
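The two equations referenced above are missing from this version of the text. The forms conventionally used in simulation studies, which the truncated passages presumably match, are

```latex
\mathrm{RB}(\hat{\theta}) \;=\; \frac{\bar{\hat{\theta}} - \theta}{\theta} \times 100\%,
\qquad
\mathrm{RB}(\widehat{\mathrm{SE}}) \;=\; \frac{\overline{\widehat{\mathrm{SE}}} - \mathrm{SD}(\hat{\theta})}{\mathrm{SD}(\hat{\theta})} \times 100\%,
```

where θ is the population parameter, the averaged θ̂ is the mean estimate across valid replications, the averaged ŜE is the mean estimated standard error, and SD(θ̂) is the empirical standard deviation of the estimates across replications, which serves as the criterion for the standard errors.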
ANOVAs were conducted to determine the contribution of the two design factors (i.e.,
Results
Hit rate
Table 1 presents the number of valid replications in Study 1 and the average hit rate under the true and misspecified models across valid replications. The results show that as the group difference increased, the hit rate increased for both the true and misspecified models. Moreover, within the same design condition, the hit rate under the true model was always higher than that under the misspecified model. As the ICC increased, the difference in hit rate between the true and misspecified models increased.
Hit Rate of True and False Models in Study 1
Note: ICC = intraclass correlation; Differ = true model hit rate – false model hit rate.
ANOVA results indicate that only
Relative Bias (RB) of group mean estimates
Table 2 presents the mean RB of group mean estimates across valid replications under the true and misspecified models. There was an underestimation of the Class 1 mean (the smaller mean) and an overestimation of the Class 2 mean (the larger mean) under both the true and misspecified models when
Relative Bias of Group Mean Estimates in Study 1
Note: ICC = intraclass correlation.
ANOVA results showed that only
Relative Bias (RB) of variance estimates
Table 3 presents the mean RBs of variance estimates under the true and misspecified models. For the true model, the mean RBs of most Level 1 and Level 2 variance estimates were within ±10%, whereas for the misspecified model, there was a trend of overestimation in the Level 1 variance estimates.
Relative Bias of Variance Estimates in Study 1
Note: ICC = intraclass correlation.
ANOVA results indicated that
Relative Bias (RB) of standard errors of group mean estimates
Table 4 presents the mean RBs of standard errors for group mean estimates under the misspecified model. There was an inflation of standard errors for group mean estimates under the misspecified model. ANOVA results show that
Relative Bias of Standard Errors of Group Mean Estimates in Study 1
Note: ICC = intraclass correlation.
Study 2
Method
Data generation
To extend the findings from Study 1, which was based on a balanced design (i.e., the two classes had exactly the same number of observations and the same variance across clusters), Study 2 took unbalanced sample sizes and variances (i.e., unequal class sizes and variances for the two subpopulations) into account along with the other design factors considered in Study 1. There were two imbalance types, Imbalance Type 1 and Imbalance Type 2. Under Imbalance Type 1, the large size was associated with the large variance in Group 1 and the small size with the small variance in Group 2; under Imbalance Type 2, the large size was associated with the small variance in Group 1 and the small size with the large variance in Group 2. The group size and variance varied at Level 1 for the two latent classes. A large group consisted of 15 students, whereas a small group consisted of 5 students. The larger variance group had a variance 3 times that of the smaller variance group, so that the variance between the two latent groups was distinguishable. Equation (5) was used to calculate the variances of each individual group based on the size of each group. The value of
The simulation used a 3 (amount of variance explained by group membership) × 2 (magnitude of ICC) × 2 (imbalance type) factorial design to generate the data. A total of 500 replications were generated for each condition using SAS 9.1, yielding a total of 6,000 data sets. Each data set was then analyzed by a true model (MMM considering the higher/cluster level) and a misspecified model (FMM ignoring the higher/cluster level) using Mplus 4.2 Mixture routine (L. K. Muthén & Muthén, 2006-2007).
Analysis
Similar to Study 1, valid replications were selected, with hit rates and RBs of parameter estimates under the 12 conditions for both true and misspecified models calculated and examined. ANOVAs were conducted to determine the contribution of the design factors and all possible interactions.
Results
Hit rate
Table 5 presents the number of valid replications for Study 2 and the average hit rate under the true and misspecified models. Similar to the results of Study 1, as the group difference increased, the hit rate increased for both the true and misspecified models. Moreover, the hit rate under the true model was always higher than that under the misspecified model within the same condition. As the ICC increased, the difference in hit rate between the true and misspecified models increased. In addition, Imbalance Type 2 (i.e., large variance associated with the small class) always had higher hit rates than Imbalance Type 1 (i.e., large variance associated with the large class) when all other conditions remained the same.
Hit Rate of True and False Models in Study 2
Note: ICC = intraclass correlation; Differ = true model hit rate – false model hit rate. Imbalance Type 1: Class 1—large size large variance, Class 2—small size small variance. Imbalance Type 2: Class 1—large size small variance, Class 2—small size large variance.
ANOVA results indicated that there was an interaction effect between the magnitude of
There was an interaction effect between the magnitude of
Relative Bias (RB) of group mean estimates
Table 6 presents the mean RBs of group mean estimates under true and misspecified models. There was bias outside the range of ±10% for both the true and misspecified models. ANOVA results indicated that there was an interaction effect between
Relative Bias of Group Mean Estimates in Study 2
Note: ICC = intraclass correlation. Imbalance Type 1: Class 1—large size large variance, Class 2—small size small variance. Imbalance Type 2: Class 1—large size small variance, Class 2—small size large variance.
Relative Bias (RB) of variance estimates
Table 7 presents the mean RBs of variance estimates under the true and misspecified models. Because the Level 1 variances for the two groups were estimated separately in the true and the misspecified models, there were two σ2s for each model. For the true model, the mean RBs for Level 2 variance estimates were within or close to ±10%, and there was no
Relative Bias of Variance Estimates in Study 2
Note: ICC = intraclass correlation. Imbalance Type 1: Class 1—large size large variance, Class 2—small size small variance. Imbalance Type 2: Class 1—large size small variance, Class 2—small size large variance.
For the misspecified model, there was a trend of overestimation in σ22 under both imbalance types, whereas there was both underestimation and overestimation of σ21 only under Imbalance Type 1. ANOVA results indicated that there was an interaction effect between
Relative Bias (RB) of standard errors of group mean estimates
Because the Level 1 variances were estimated separately, there were two RBs of standard errors under each model. RBs of
Relative Bias of Standard Errors of Group Mean Estimates in Study 2
Note: ICC = intraclass correlation. Imbalance Type 1: Class 1—large size large variance, Class 2—small size small variance. Imbalance Type 2: Class 1—large size small variance, Class 2—small size large variance.
Discussion
Study 1
When a higher level structure in cross-sectional data is ignored, the variance at the higher level is redistributed to the lower level, thus affecting the hit rate and group mean and standard error estimates.
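The redistribution mechanism can be illustrated numerically (a minimal sketch with hypothetical variance values): when clustered data are analyzed as a single level, the school-level variance τ00 is absorbed into the student-level variance, so the estimated total variance approaches σ² + τ00 rather than σ².

```python
import numpy as np

rng = np.random.default_rng(0)

sigma2, tau00 = 1.0, 0.43            # hypothetical; ICC = 0.43 / 1.43 ≈ 0.30
n_schools, n_per_school = 400, 20    # large sample so the point shows clearly

# Two-level data with no mean structure: y_ij = u_j + r_ij.
u = np.repeat(rng.normal(0.0, np.sqrt(tau00), n_schools), n_per_school)
r = rng.normal(0.0, np.sqrt(sigma2), n_schools * n_per_school)
y = u + r

# A single-level analysis sees only the total variance, which is close to
# sigma2 + tau00 = 1.43, not the student-level sigma2 = 1.0.
print(round(float(np.var(y)), 2))
```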
Hit rate
The difference between the true and misspecified models is that, for the true model, the magnitude of the ICC has little effect on the hit rate within the same design, whereas for the misspecified model, the ICC magnitude does affect the hit rate, which is higher when the ICC is smaller. Under the misspecified model, the Level 2 variance is ignored in model estimation, and more variance is ignored at higher ICCs. Ignoring variance at Level 2 decreases classification accuracy, and the more variance is ignored, the less accurate the classification.
Relative Bias (RB) in group mean estimates
The differences in RB for group mean estimates between the true and misspecified models are all within ±5%, which indicates that the two models do not differ substantially in their estimates of the group means.
Relative Bias (RB) in standard error estimates
There is an inflation of standard errors for group mean estimates when a higher level nesting structure is ignored. This inflation under the misspecified model is due to the redistribution of Level 2 variance to Level 1. When the ICC is larger, the misspecified model shows greater inflation of standard errors, all other conditions being equal.
Study 2
After adding one more design factor—imbalance type—the findings in
Hit rate
When all other conditions stay the same, the hit rate under Imbalance Type 2 is higher than that under Imbalance Type 1. In addition, the difference in hit rate between the true and misspecified models is smaller for Imbalance Type 2, in which the large group size is associated with the smaller variance and the small group size with the larger variance. This means that under Imbalance Type 2, the misspecified model performs relatively better than it does under Imbalance Type 1. This result is not surprising: when a group has a smaller variance, it is easier to identify its members as coming from the same group. Under Imbalance Type 2, where the large size is associated with the smaller variance, the participants within this group have a higher chance of being classified into the same group. Under Imbalance Type 1, where the smaller group size is associated with the smaller variance, the participants within that group also have a higher chance of being classified into the same group, but they make up a smaller percentage of all participants than in Imbalance Type 2. This is why, in general, Imbalance Type 2 has higher hit rates than Imbalance Type 1.
Relative Bias (RB) in group mean estimates
In general, the RBs under Imbalance Type 2 are smaller than those under Imbalance Type 1. For the reason mentioned above, under Imbalance Type 2 it is easier for both the true and misspecified models to classify the participants into the correct group, resulting in more accurate estimates of the group means, whereas under Imbalance Type 1, there are larger RBs under different levels of
Relative Bias (RB) in standard error estimates
When a higher level nesting structure is ignored, the standard errors of the fixed effects (i.e., the means of the two latent classes) tend to be inflated under Imbalance Type 1 but have less bias or underestimation under Imbalance Type 2. This may result from either the misclassification of participants, or the inflation of Level 1 variance, or both.
Conclusion
Summary of Findings
This simulation study investigated the impact of ignoring a higher level nesting structure in multilevel mixture modeling on hit rates, the estimated latent class means, and the corresponding standard errors. We examined the impact of three potential factors: the magnitude of the latent class difference, the ICC between the lower and higher levels of the data, and the imbalance type, under both the true and misspecified models.
Our results indicate, first, that ignoring a higher level structure may result in less accurate classification of individuals into their correct classes. When the variances and sizes of the two classes in the generated samples are balanced, the true model has higher hit rates than the misspecified model, and the difference between the two models is affected by the group difference and the ICC. When group size and variance are unbalanced, the true model still has higher hit rates than the misspecified model; in addition, the hit rate is higher when the larger size is associated with the smaller variance and the smaller size with the larger variance than when the larger size is associated with the larger variance and the smaller size with the smaller variance.
Second, ignoring a higher level structure results in bias in the group mean estimates for both the true and misspecified models, but the difference in bias between the two models is not large. The difference is especially small when the group difference is small, when the ICC is lower, or when the smaller variance is associated with the larger size.
Third, ignoring a higher level structure will cause the variance at the higher level structure to be redistributed to the lower level and result in the inflation of standard errors for estimated group means, which in turn, results in an inflated Type I error rate. The inflation of standard errors is especially obvious when ICC is at a higher level or when larger variance is associated with larger size and smaller variance is associated with smaller size.
Recommendations
These findings have practical implications for researchers. According to the findings of the study, when ICC is higher, or when large variance is associated with large size and small variance is associated with small size, or when
Limitations and Suggestions for Future Research
In this study, we examined the impact of ignoring a higher level structure only in mixture models, and only a two-level data structure was considered. In longitudinal studies, the data usually contain three or more levels (i.e., repeated measures nested within students nested within schools). In addition, the total sample size in the simulation studies was set to 800 and the cluster size to 20. Future work could vary the cluster size and latent class size to see how sample size affects the hit rate and the bias of parameter estimates. Another limitation is that, in reality, some data structures are not strictly hierarchical but cross-classified, in the sense that students come from varied combinations of higher level nesting factors such as schools and neighborhoods. Researchers have found that ignoring a cross-classified structure results in biased standard error estimates, although the fixed effect estimates are not affected (Luo & Kwok, 2006; Meyers & Beretvas, 2006; Van Landeghem, De Fraine, & Van Damme, 2005). However, no software in the area of latent variable modeling currently takes the cross-classified structure into account in multilevel mixture modeling. More research and software advances are needed in this area.
