Abstract

The value-added method of Haberman is arguably one of the most popular methods for evaluating the quality of subscores. According to the method, a subscore has added value if its reliability is larger than a quantity referred to as the proportional reduction in mean squared error (PRMSE) of the total score. This article shows how well-known statistical tests can be used to determine the added value of subscores and augmented subscores. The usefulness of the suggested tests is demonstrated using two operational data sets.
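The decision rule stated in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the function name, argument names, and example numbers are hypothetical, and the expression for the PRMSE of the total score is the usual classical-test-theory form, which assumes that the measurement error of the subscore is uncorrelated with the rest of the test.

```python
def added_value(rel_sub, var_sub, var_total, cov_sub_total):
    """Compare the PRMSE of a subscore with the PRMSE of the total score.

    All names are illustrative assumptions, not the article's notation.
    rel_sub        -- reliability of the observed subscore
    var_sub        -- variance of the observed subscore
    var_total      -- variance of the observed total score
    cov_sub_total  -- covariance of the observed subscore and total score
    """
    # Variance of the true subscore under classical test theory.
    var_true_sub = rel_sub * var_sub

    # Covariance of the true subscore with the observed total score,
    # assuming subscore error is uncorrelated with the remaining items.
    cov_true_sub_total = var_true_sub + cov_sub_total - var_sub

    # PRMSE of the subscore equals its reliability.
    prmse_sub = rel_sub

    # PRMSE of the total score: squared correlation between the
    # true subscore and the observed total score.
    prmse_total = cov_true_sub_total ** 2 / (var_true_sub * var_total)

    return {
        "prmse_sub": prmse_sub,
        "prmse_total": prmse_total,
        "has_added_value": prmse_sub > prmse_total,
    }


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    print(added_value(rel_sub=0.8, var_sub=25.0, var_total=100.0, cov_sub_total=40.0))
```

With these made-up inputs the subscore reliability (0.8) exceeds the PRMSE of the total score (roughly 0.61), so the subscore would be flagged as having added value. The sketch only compares point estimates; the article's contribution is to replace this descriptive comparison with a statistical test.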

Keywords

multiple correlation, PRMSE, value-added method

References

1. Benjamini & Hochberg (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological), 57, 289–300.

2. Boyer, J. E., Palachek, A. D., & Schucany, W. R. (1983). An empirical study of related correlation coefficients. Journal of Educational Statistics, 8, 75.

3. Brennan, R. L. (2012). Utility indexes for decisions about subscores (CASMA Research Report No. 33). Iowa City, IA: Center for Advanced Studies in Measurement and Assessment.

4. Dai, Wang, & Svetina (2016). subscore: Computing subscores in classical test theory and item response theory (R package Version 2.0).

5. Diedenhofen & Musch (2015). cocor: A comprehensive solution for the statistical comparison of correlations. PLOS One, 10, e0121945.

6. Dunn, O. J., & Clark (1969). Correlation coefficients measured on the same individuals. Journal of the American Statistical Association, 64, 366.

7. Efron (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7, 1–26.

8. Efron & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York, NY: Chapman and Hall.

9. Feinberg & Jurich, D. P. (2017). Guidelines for interpreting and reporting subscores. Educational Measurement: Issues and Practice, 36, 5–13.

10. Haberman, S. J. (2008). When can subscores have value? Journal of Educational and Behavioral Statistics, 33, 204–229.

11. Hedges, L. V., & Olkin (1983). Joint distributions of some indices based on correlation coefficient. In Carlin, Amemiya, & Goodman, L. A. (Eds.), Studies in econometrics, time series, and multivariate statistics (pp. 437–454). New York, NY: Academic Press.

12. Hittner, J. B., May, & Silver, N. C. (2003). A Monte Carlo evaluation of tests for comparing dependent correlations. The Journal of General Psychology, 130, 149–168.

13. Liu, Robin, Yoo, & Manna (2018). Statistical properties of the GRE Psychology test subscores (Educational Testing Service Research Report No. 18–19). Princeton, NJ: Educational Testing Service.

14. Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison Wesley.

15. Lyren (2009). Reporting subscores from college admission tests. Practical Assessment, Research, and Evaluation, 14, 1–10.

16. Meijer, R. R., Boev, A. J., Tendeiro, J. N., Bosker, R. J., & Albers, C. J. (2017). The use of subscores in higher education: When is this useful? Frontiers in Psychology, 8. doi:10.3389/fpsyg.2017.00305

17. Olkin (1967). Correlations revisited. In Stanley, J. C. (Ed.), Improving experimental design and statistical analysis (pp. 102–128). Chicago, IL: Rand McNally.

18. Olkin & Finn, J. D. (1990). Testing correlated correlations. Psychological Bulletin, 108, 330–333.

19. Olkin & Finn, J. D. (1995). Correlations redux. Psychological Bulletin, 118, 155–164.

20. Olkin & Siotani (1976). Asymptotic distribution of functions of a correlation matrix. In Ikeda (Ed.), Essays in probability and statistics (pp. 235–251). Tokyo, Japan: Shinko Tsusho.

21. Pearson & Filon, L. N. G. (1898). Mathematical contributions to the theory of evolution. IV. On the probable errors of frequency constants and on the influence of random selection on variation and correlation. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 191, 229–311.

22. Puhan, Sinharay, Haberman, S. J., & Larkin (2010). The utility of augmented subscores in a licensure exam: An evaluation of methods using empirical data. Applied Measurement in Education, 23, 266–285.

23. Reise, S. P., Bonifay, W. E., & Haviland, M. G. (2013). Scoring and modeling psychological measures in the presence of multidimensionality. Journal of Personality Assessment, 95, 129–140.

24. Sawaki & Sinharay (2017). Do the TOEFL iBT section scores provide value-added information to stakeholders? Language Testing. Advance online publication. doi:10.1177/0265532217716731

25. Sinharay (2010). How often do subscores have added value? Results from operational and simulated data. Journal of Educational Measurement, 47, 150–174.

26. Sinharay (2013). A note on assessing the added value of subscores. Educational Measurement: Issues and Practice, 32, 38–42.

27. Sinharay (2018). An interpretation of augmented subscores and their added value in terms of parallel forms. Journal of Educational Measurement, 55, 177–193.

28. Sinharay & Haberman, S. J. (2008). Reporting subscores: A survey (ETS Research Memorandum No. 08–18). Princeton, NJ: ETS.

29. Sinharay & Haberman, S. J. (2014). An empirical investigation of population invariance in the value of subscores. International Journal of Testing, 14, 22–48.

30. Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87, 245–251.

31. Steiger, J. H., & Hakstian, A. R. (1982). The asymptotic distribution of elements of a correlation matrix: Theory and application. British Journal of Mathematical and Statistical Psychology, 35, 208–215.

32. Thissen (2013). Using the testlet response model as a shortcut to multidimensional item response theory subscore computation. In Millsap, van der Ark, Bolt, & Woods (Eds.), New developments in quantitative psychology: Presentations from the 77th Annual Psychometric Society Meeting (pp. 29–40). New York, NY: Springer.

33. Wasserstein, R. L., & Lazar, N. A. (2016). The ASA’s statement on p-values: Context, process, and purpose. The American Statistician, 70, 129–133.

34. Wedman & Lyren (2015). Methods for examining the psychometric quality of subscores: A review and application. Practical Assessment, Research, and Evaluation, 20, 1–14.

35. Williams, E. J. (1959). The comparison of regression variables. Journal of the Royal Statistical Society, Series B, 21, 396–399.

36. Yao (2010). Reporting valid and reliable overall scores and domain scores. Journal of Educational Measurement, 47, 339–360.