Sage Journals: Discover world-class research

Abstract

Social scientists often rely on survey data to examine group differences. A problem with survey data is the potential misclassification of group membership due to poorly trained interviewers, inconsistent responses, or errors in marking questions. In data containing unequal subsample sizes, the consequences of misclassification can be considerable, especially for groups with small sample sizes. In this study, the authors develop a new mixture model that allows researchers to address the problem using the data they have. The two-stage model draws on additional information supplied from the data and is estimated using a Bayesian method. The method is illustrated with the Early Childhood Longitudinal Study data. As anticipated, the more information supplied to adjust for group membership, the better the model performs. Even when small amounts of information are supplied, the model produces reasonably robust estimates and improves the fit compared to the no-adjustment model. Sensitivity analyses are conducted on choices of priors.

Keywords

misclassification, Bayesian, mismeasured discrete variable, mixture model

