A recent trend in the psychological literature has been to include measures of effect size when reporting probability values. The several measures of effect size associated with the Student t test for two independent samples are appropriate only when the variances are homogeneous. In this paper, commonly used measures of effect size are considered and compared, using four data sets. A chance-corrected measure of effect size is provided for two or more treatment groups characterized by either homogeneous or heterogeneous variances.
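As a rough illustration of why the t-test-based measures depend on homogeneous variances, the sketch below computes the standardized mean difference commonly known as Cohen's d, which divides by a pooled standard deviation. This is the standard textbook definition and not necessarily the exact form used in the paper; the function names and the sample data are hypothetical, and the chance-corrected measure the paper provides is not reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(x, y):
    """Standardized mean difference using the pooled standard deviation.

    Pooling treats both groups as sharing one common variance, which is
    why this measure is only appropriate under homogeneity of variance.
    """
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


def variance_ratio(x, y):
    """Larger sample variance divided by the smaller (an informal check only)."""
    vx, vy = variance(x), variance(y)
    return max(vx, vy) / min(vx, vy)


# Hypothetical data: two small groups with equal sample variances.
group_a = [2, 4, 6]
group_b = [1, 3, 5]
d = cohens_d(group_a, group_b)
ratio = variance_ratio(group_a, group_b)
```

When the variance ratio is far from 1, the pooled standard deviation no longer describes either group well, so a d computed this way becomes hard to interpret.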