Abstract
This article discusses estimation of average treatment effects for randomized controlled trials (RCTs) using grouped administrative data to help improve data access. The focus is on design-based estimators, derived using the building blocks of experiments, that are conducive to grouped data for a wide range of RCT designs, including clustered and blocked designs, and models with weights and covariates. Because of the linearity of the regression model underlying RCTs, the asymptotic properties of design-based estimators using group-level averages—formed randomly or by covariates for nonclustered designs and as cluster-level averages for clustered designs—match those using individual data. Furthermore, design effects from aggregation are tolerable with moderate numbers of groups and few covariates, suggesting little information is lost in these cases. Ecological inference methods for subgroup analyses, however, yield large design effects. Several empirical examples using real-world education RCT data demonstrate the theory.
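The equivalence between grouped and individual data can be illustrated in the simplest case: a nonclustered design with no covariates and randomly formed groups. The sketch below is not the article's estimator. The simulated outcomes, the effect size of 0.3, the number of groups, and the function names are all assumptions made for illustration. It shows only that a size-weighted difference in group-level means reproduces the individual-level difference in means. It does not address variance estimation, covariates, or the design effects discussed above.

```python
import random


def diff_in_means(y, t):
    """Individual-level difference in means: treatment (t=1) minus control (t=0)."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def grouped_diff_in_means(y, t, n_groups, rng):
    """Difference in means computed only from group-level (mean, size) pairs.

    Within each arm, units are randomly partitioned into n_groups groups,
    and only each group's average outcome and size are retained.
    """
    arm_means = {}
    for arm in (0, 1):
        outcomes = [yi for yi, ti in zip(y, t) if ti == arm]
        rng.shuffle(outcomes)  # random group formation
        groups = [outcomes[g::n_groups] for g in range(n_groups)]
        cells = [(sum(g) / len(g), len(g)) for g in groups if g]
        total = sum(size for _, size in cells)
        # Size-weighted average of group means equals the arm mean.
        arm_means[arm] = sum(mean * size for mean, size in cells) / total
    return arm_means[1] - arm_means[0]


# Hypothetical simulated RCT: 200 units, half treated, assumed effect of 0.3.
rng = random.Random(0)
n = 200
t = [1] * (n // 2) + [0] * (n // 2)
rng.shuffle(t)
y = [rng.gauss(0.0, 1.0) + 0.3 * ti for ti in t]

individual = diff_in_means(y, t)
grouped = grouped_diff_in_means(y, t, n_groups=10, rng=rng)
print(individual, grouped)
```

In this no-covariate setting the two point estimates agree up to floating-point error, because weighting each group mean by its group size recovers the arm-level mean. Once covariates or subgroup analyses enter, aggregation can lose information.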
