Sage Journals: Discover world-class research

Abstract

A factorial design examines the effects of two independent variables on a single, continuous dependent variable. The statistical test employed to analyze the data is a two-way analysis of variance (ANOVA). This test yields three results: a main effect for each of the independent variables and an interaction effect between the two independent variables. This article explains factorial designs and two-way ANOVA with the help of a worked example using hypothetical data in a spreadsheet provided as a supplementary file. The main effects and interaction effects are explained and illustrated using tables and figures. A short discussion provides general notes about the concepts explained in this article, along with brief notes on repeated measures ANOVA and higher order ANOVAs. Many additional examples, with figures and explanations, are provided in the supplementary materials, which the reader is strongly encouraged to view.

Keywords

Factorial design, two-way analysis of variance, main effects, interaction effect

A study design is said to be factorial in nature if participants are randomized into two or more groups and if participants in each of these groups are further randomized into two or more subgroups. As an example, we conduct a study in which 96 adults with major depressive disorder (MDD) are randomized to receive escitalopram or placebo, and patients in each of these two groups are randomized to receive cognitive behavioral therapy (CBT) or waitlisted CBT. Because drug (escitalopram vs. placebo) and therapy (CBT vs. waitlist) each have two categories, this is a 2 × 2 factorial design.

Table 1 presents endpoint Hamilton Rating Scale for Depression (HAM-D) scores in our hypothetical study. The data file from which Table 1 was generated is made available in the supplementary materials so that readers can run the analyses on their own, if they wish.

Table 1.

Endpoint Depression Ratings in a Hypothetical 2 × 2 Factorially Designed Study.

	Waitlist	CBT	Total
Placebo	19.9 (2.1), n = 24	18.8 (1.6), n = 24	19.3 (1.9), n = 48
Escitalopram	14.9 (1.8), n = 24	10.3 (1.3), n = 24	12.6 (2.8), n = 48
Total	17.4 (3.2), n = 48	14.5 (4.5), n = 48

Data in cells are mean (standard deviation) Hamilton Rating Scale for Depression scores and sample size (n) for the group.

Two-way ANOVA

Analysis of variance (ANOVA) is a statistical procedure used to compare the means of two or more groups. We analyze the Table 1 data using a statistical test known as two-way ANOVA. It is called “two-way” because, as the table shows, there are two factors. One factor is drug, presented in rows in the table, and the other factor is therapy, presented in columns in the table (the rows and the columns are the two “ways”). Each factor has two levels, making it, as already stated, a two row × two column (2 × 2) design with four groups in the study, represented by four boxes (cells) in the table.

The two-way ANOVA, performed using a hand calculator or a statistical program, gives us three results. These are a main effect for drug (F = 361.55; df = 1,92; p < .001), a main effect for therapy (F = 65.59; df = 1,92; p < .001), and a drug × therapy interaction (F = 24.30; df = 1,92; p < .001). We observe that all three results are statistically significant.
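For readers who prefer to see the arithmetic, the following is a minimal Python sketch that approximates the three F values from the rounded summary statistics in Table 1, not from the raw data in the supplementary file. It assumes a balanced design (n = 24 per cell) and uses the pooled cell variance as the error mean square. The function name and the dictionary layout are illustrative choices and are not part of the article. Because Table 1 is rounded to one decimal place, the results (approximately 368, 66, and 25) are close to, but not identical with, the F values reported above.

```python
# Approximate two-way ANOVA for a balanced 2 x 2 design, computed from the
# rounded cell means and SDs in Table 1 (an approximation, not the raw data).

def two_way_anova_from_summary(means, sds, n):
    """means, sds: dicts keyed by (drug, therapy); n: participants per cell."""
    drugs = sorted({d for d, _ in means})
    therapies = sorted({t for _, t in means})
    a, b = len(drugs), len(therapies)

    # Grand mean and marginal means (unweighted, because n is equal in all cells).
    grand = sum(means.values()) / (a * b)
    drug_mean = {d: sum(means[d, t] for t in therapies) / b for d in drugs}
    ther_mean = {t: sum(means[d, t] for d in drugs) / a for t in therapies}

    # Sums of squares for the two main effects, the interaction, and error.
    ss_drug = n * b * sum((drug_mean[d] - grand) ** 2 for d in drugs)
    ss_ther = n * a * sum((ther_mean[t] - grand) ** 2 for t in therapies)
    ss_inter = n * sum(
        (means[d, t] - drug_mean[d] - ther_mean[t] + grand) ** 2
        for d in drugs
        for t in therapies
    )
    ss_error = (n - 1) * sum(sd ** 2 for sd in sds.values())

    df_drug, df_ther = a - 1, b - 1
    df_inter = df_drug * df_ther
    df_error = a * b * (n - 1)
    ms_error = ss_error / df_error

    return {
        "F_drug": (ss_drug / df_drug) / ms_error,
        "F_therapy": (ss_ther / df_ther) / ms_error,
        "F_interaction": (ss_inter / df_inter) / ms_error,
        "df_error": df_error,
    }


cell_means = {
    ("placebo", "waitlist"): 19.9, ("placebo", "CBT"): 18.8,
    ("escitalopram", "waitlist"): 14.9, ("escitalopram", "CBT"): 10.3,
}
cell_sds = {
    ("placebo", "waitlist"): 2.1, ("placebo", "CBT"): 1.6,
    ("escitalopram", "waitlist"): 1.8, ("escitalopram", "CBT"): 1.3,
}

result = two_way_anova_from_summary(cell_means, cell_sds, n=24)
for name, value in result.items():
    print(name, round(value, 2))
```

To reproduce the published F values exactly, the analysis would need to be run on the unrounded data in the supplementary spreadsheet.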

Main Effects

The significant main effect for drug tells us that, regardless of what therapy the patients received, escitalopram was superior to placebo. This is evident from the last column: the treatment endpoint HAM-D means for escitalopram vs. placebo were 12.6 vs. 19.3, indicating that patients who received escitalopram were less depressed at endpoint than patients who received placebo.

Similarly, the significant main effect for therapy tells us that, regardless of what drug the patients received, CBT was superior to waitlist. This is evident from the last row: the treatment endpoint HAM-D means for CBT vs. waitlist were 14.5 vs. 17.4, indicating that patients who received CBT were less depressed at endpoint than patients who were waitlisted for CBT.
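The marginal means that the main effects compare can be recovered from the four cell means. The short Python sketch below assumes equal cell sizes, so that each marginal mean is the simple average of the relevant cells; the variable and function names are illustrative. The values it produces agree with the Total row and column of Table 1 to within rounding (for example, about 19.35 against the tabulated 19.3), because the table entries were presumably computed from unrounded data.

```python
# Cell means from Table 1, keyed by (drug, therapy).
cell_means = {
    ("placebo", "waitlist"): 19.9, ("placebo", "CBT"): 18.8,
    ("escitalopram", "waitlist"): 14.9, ("escitalopram", "CBT"): 10.3,
}


def marginal_mean(means, factor_index, level):
    """Unweighted mean of the cells at one level of one factor (equal n assumed)."""
    values = [m for key, m in means.items() if key[factor_index] == level]
    return sum(values) / len(values)


# Factor 0 is drug (rows of Table 1); factor 1 is therapy (columns).
drug_means = {
    level: marginal_mean(cell_means, 0, level)
    for level in ("placebo", "escitalopram")
}
therapy_means = {
    level: marginal_mean(cell_means, 1, level)
    for level in ("waitlist", "CBT")
}

print({k: round(v, 2) for k, v in drug_means.items()})
print({k: round(v, 2) for k, v in therapy_means.items()})
```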

Interaction Effect

The significant drug × therapy interaction tells us that the extent of improvement with drug depended on what therapy patients received. In the table, we see that placebo patients fared only marginally better with CBT relative to waitlist (endpoint HAM-D means, 18.8 vs. 19.9), whereas escitalopram patients fared noticeably better with CBT relative to waitlist (endpoint HAM-D means 10.3 vs. 14.9). The marked advantage for the escitalopram–CBT group is visually depicted in the supplementary materials; in the line diagram, the noticeable difference in the slopes of the placebo and escitalopram lines is due to the interaction (Supplementary Figure 1).
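One way to put a number on the interaction is as a difference of differences: the benefit of CBT over waitlist among escitalopram patients, minus the same benefit among placebo patients. The Python sketch below uses the rounded cell means from Table 1; the variable names are illustrative. A contrast of zero would correspond to parallel lines in the line diagram.

```python
# Cell means from Table 1, keyed by (drug, therapy).
cell_means = {
    ("placebo", "waitlist"): 19.9, ("placebo", "CBT"): 18.8,
    ("escitalopram", "waitlist"): 14.9, ("escitalopram", "CBT"): 10.3,
}

# Reduction in endpoint HAM-D with CBT relative to waitlist, within each drug arm.
cbt_gain_placebo = cell_means["placebo", "waitlist"] - cell_means["placebo", "CBT"]
cbt_gain_escitalopram = (
    cell_means["escitalopram", "waitlist"] - cell_means["escitalopram", "CBT"]
)

# Difference of differences: how much larger the CBT benefit is with escitalopram.
interaction_contrast = cbt_gain_escitalopram - cbt_gain_placebo

print(round(cbt_gain_placebo, 1))       # about 1.1 HAM-D points
print(round(cbt_gain_escitalopram, 1))  # about 4.6 HAM-D points
print(round(interaction_contrast, 1))   # about 3.5 HAM-D points
```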

Summary

The two-way ANOVA tells us that, in our study of patients with MDD, escitalopram was superior to placebo (main effect for drug), CBT was superior to waitlist (main effect for therapy), and CBT improved outcomes with escitalopram more than it improved outcomes with placebo (drug × therapy interaction).

Specific Notes

Instead of randomizing and then subrandomizing, as described in the opening paragraph of this article, patients can be directly randomized into the four groups shown in Table 1.

In the worked example in this article, the endpoint HAM-D score was the outcome variable. Endpoint scores of other rating instruments could be analyzed in the same way, using two-way ANOVA, provided that the ratings are continuous (measured along an interval or ratio scale) and not categorical.

If we actually conducted a study as described in this article, the method of analysis would be more elaborate than that presented here. Whereas the scenario and analysis presented here are technically correct, they are meant to explain concepts in the simplest possible way, and not to recommend a plan of analysis.

General Notes

Here is a technical point for geeks. The main effects are not merely the equivalent of t-tests or one-way ANOVAs for escitalopram vs. placebo (means 12.6 vs. 19.3) and for CBT vs. waitlist (means 14.5 vs. 17.4). Rather, the main effects are escitalopram vs. placebo after excluding the interaction effect and CBT vs. waitlist after excluding the interaction effect. In other words, escitalopram would outperform placebo, and CBT would outperform waitlist, even had there been no interaction. This can be mathematically understood from the way in which the sum of squares and the degrees of freedom are partitioned when calculating the F values for the main and interaction effects.
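The partitioning referred to here can be written out as follows. This is the standard textbook decomposition for a balanced design and is not taken from the article itself; the symbols are assumptions introduced for illustration, with a = 2 drug levels, b = 2 therapy levels, n = 24 patients per cell, and N = abn = 96 patients in all.

```latex
\begin{align*}
SS_{\text{total}} &= SS_{\text{drug}} + SS_{\text{therapy}}
                   + SS_{\text{drug} \times \text{therapy}} + SS_{\text{error}} \\
\underbrace{N - 1}_{95} &= \underbrace{(a - 1)}_{1} + \underbrace{(b - 1)}_{1}
                   + \underbrace{(a - 1)(b - 1)}_{1} + \underbrace{ab(n - 1)}_{92} \\
F_{\text{effect}} &= \frac{SS_{\text{effect}} / df_{\text{effect}}}
                          {SS_{\text{error}} / df_{\text{error}}}
\end{align*}
```

Each effect has its own sum of squares and its own degree of freedom, and all three are tested against the same error term, which is why each F value above is reported with df = 1,92.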

Here is the same message for non-geeks. How main effects are independent of the interaction effect can be visually understood from the line diagram in Supplementary Figure 1. As an example for the main effect for drug, the escitalopram line is wholly below the placebo line. As an example for the main effect for therapy, the CBT circles are below the waitlist circles. As a cautionary note, what appears likely from visual inspection needs to be confirmed in the statistical analysis.

Other visual examples of different combinations of significant and nonsignificant main and interaction effects are presented in Supplementary Figures 2–6. Readers are urged to view the supplementary materials to obtain a fuller understanding of what is explained in this article.

We can have a 3 × 2 factorial design if drug has three levels (e.g., escitalopram, bupropion, and placebo) and therapy has two levels (CBT and waitlist). We can have a 3 × 3 design if drug has three levels and therapy also has three levels (e.g., CBT, art therapy, and waitlist). However, the statistical test used to analyze the data is still a two-way ANOVA because there are still only two “ways”: drug (rows) and therapy (columns). We will still get only three results: a main effect for drug, a main effect for therapy, and a drug × therapy interaction. If any result is statistically significant and we want to know which drug level is better than which other drug level, or which therapy level is better than which other therapy level, and from where a significant interaction arises, we would need to do post hoc analyses. This is conceptually similar to performing post hoc analyses after a one-way ANOVA yields a significant F value when there are three or more groups being compared.
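The number of results stays at three, but the degrees of freedom attached to each result grow with the number of levels. The Python sketch below computes them for a balanced two-way design; the function name is illustrative, and the cell size of 16 used for the 3 × 2 case is a hypothetical figure chosen only to keep the total at 96 patients. An effect with more than one degree of freedom is an omnibus test, which is why post hoc analyses are needed to locate the differences.

```python
def two_way_anova_df(levels_a, levels_b, n_per_cell):
    """Degrees of freedom for a balanced two-way ANOVA (equal n in every cell)."""
    if levels_a < 2 or levels_b < 2 or n_per_cell < 2:
        raise ValueError("need at least 2 levels per factor and 2 subjects per cell")
    return {
        "factor_a": levels_a - 1,
        "factor_b": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        "error": levels_a * levels_b * (n_per_cell - 1),
    }


# The 2 x 2 worked example: 24 patients per cell.
df_2x2 = two_way_anova_df(2, 2, 24)

# A 3 x 2 design with a hypothetical 16 patients per cell (96 in total).
df_3x2 = two_way_anova_df(3, 2, 16)

print(df_2x2)
print(df_3x2)
```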

A two-way ANOVA can also be applied to nonrandomized designs, such as when we want to see whether there is a main effect for sex (men vs. women), a main effect for quantity of alcohol consumed (one drink vs. two drinks), and a sex × quantity of alcohol interaction on performance on various cognitive tasks. Whereas we can randomize subjects into one drink vs. two drinks groups, sex is fixed; we cannot randomize subjects to be men or women.¹

The concepts described in this article can be applied to analyses of longitudinal data. Consider a study in which MDD patients randomized to escitalopram or placebo are rated on the HAM-D at baseline, at 2 weeks, at 4 weeks, and at 6 weeks. The data are analyzed using two-way repeated measures ANOVA with two levels for drug and four levels for time. We get a main effect for drug, a main effect for time, and a drug × time interaction. An example is provided in Supplementary Figure 7; the significant drug × time interaction shows that patients treated with escitalopram showed greater improvement across time than patients treated with placebo.

Finally, more complex factorial designs are possible. For example, in a three-way ANOVA, we could examine treatment outcomes based on sex (male vs. female), drug (escitalopram vs. placebo), and therapy (CBT vs. waitlist). We would get three main effects: for sex, for drug, and for therapy. We would get three two-way interactions: for sex × drug, drug × therapy, and sex × therapy. And, we would get one three-way interaction: for sex × drug × therapy. Such higher order ANOVAs are seldom performed because interpretation of the different interactions is difficult.
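The count of main effects and interactions follows from listing every non-empty combination of factors. The Python sketch below does this for the three factors named above; the function name is illustrative. With k factors there are 2^k − 1 effects, which is one way to see why interpretation becomes harder as factors are added.

```python
from itertools import combinations


def anova_effects(factors):
    """List every main effect and interaction for a full factorial ANOVA."""
    return [
        " x ".join(combo)
        for order in range(1, len(factors) + 1)
        for combo in combinations(factors, order)
    ]


effects = anova_effects(["sex", "drug", "therapy"])
for effect in effects:
    print(effect)
```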

Supplemental Material

Supplemental material for this article is available online.

Footnotes

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

References

Norman and Streiner. Biostatistics: the bare essentials. 4th ed. Shelton, CT: People’s Medical Publishing House, 2014.


Understanding Factorial Designs, Main Effects, and Interaction Effects: Simply Explained with a Worked Example
