Virtual education has expanded rapidly over the past 15 years. K–12 students took an estimated 317,000 virtual courses nationwide in 2002–2003, compared with about 1.8 million enrollments in 2009–2010 (National Center for Education Statistics, 2012). More recent estimates suggest that this growth has continued apace, with an estimated 4.5 million K–12 course enrollments taken through online providers in 2014–2015 (Evergreen Education Group, 2015). Several states, including Florida, have established or are considering requirements for students to engage in at least one online learning experience before graduating, setting the stage for additional expansion of virtual course taking in coming years (Evergreen Education Group, 2015). Online learning advocates maintain that experience with online learning will be advantageous to students in an economy that rewards digital competency (Sheehy, 2012). Moreover, online course taking may reduce disparities in the quality of teaching across schools, as teachers are not tied to specific schools, and it may allow for pedagogical innovations.
Despite this steady growth in K–12 virtual course taking, we know little about how these courses affect student achievement (Means, Toyama, Murphy, & Bakia, 2013). On the one hand, proponents of virtual education point to several ways that virtual education could provide higher-quality education for students as compared with traditional classroom settings. For instance, virtual classes may allow students to work at a more individualized pace. This individualized pacing may help slower learners by allowing them to repeat confusing material until they master it, and it can help faster learners by allowing them to move on when they master material, without requiring them to sit through repetitious explanations (Berge & Clark, 2005; Tallent-Runnels et al., 2006). Virtual courses may also be well suited to provide immediate feedback on student performance to both students and teachers through intelligent tutoring systems, and they may provide for a uniquely interactive experience between students and the texts that they access (Means, Bakia, & Murphy, 2014). For instance, if students are able to click on links within lessons that provide them additional detail on a subject of interest, that will allow them to explore their interests interactively. Moreover, online courses allow students access to coursework, and potentially to high-quality teaching, that they may lack in their local school.
On the other hand, skeptics worry that online learning may be more difficult than learning in face-to-face environments. For instance, students who are inclined to procrastinate or who are not skilled in self-directed learning may suffer declines in performance if they lack a physically present teacher to direct their attention to the subject matter (Bork & Rucks-Ahidiana, 2013). Other students may be motivated but lack broadband or other technological resources that virtual courses rely on to enable smooth delivery. Even among students for whom technological resources do not pose a problem, some may have trouble in virtual courses if they lack the technological skills to make full use of the course content (Berge & Clark, 2005).
In this article, we examine virtual course taking in Florida—the state with the largest K–12 virtual sector in the nation—to determine the extent to which virtual course taking is associated with three types of outcomes: concurrent course performance, future course performance, and likelihood of persisting in high school through the final term of the 12th grade, which we use as a proxy for graduation. We distinguish first-time course takers from students who retake courses after failing on their initial attempt, and we further explore variation in virtual course-taking effects by student characteristics and course subjects. We focus on nine commonly taken academic courses (e.g., Geometry, English 1, World History) for which downstream course taking is common.
We find that among both first-time course takers and course retakers, those in virtual courses are more likely to pass the course (with a grade of C or higher) than are their face-to-face counterparts. These results are largely stable across different model specifications and for different subgroups of students and subject areas. The implications of this finding are not clear, however, since they could suggest that students learn more in virtual courses or that grading standards are more lenient in such courses.
When we look at downstream outcomes, we find a striking difference depending on whether a student is attempting the course for the first time or retaking it following an initial failure (credit recovery). For “first-time” course takers, virtual instruction is associated with moderately negative downstream outcomes. For example, first-time virtual course takers are 2.6 percentage points less likely to appear in public school records in the second term of their expected 12th-grade year, which we use as a proxy for expected high school graduation. Relative to the sample mean (77.8%), this represents a 3.3% decline in the likelihood of expected high school graduation. In contrast, we find that virtual course taking is positively associated with downstream outcomes for those students who were retaking the course. For example, virtual retakers are 6.5 percentage points more likely to appear in our data as a second-term 12th grader when compared with face-to-face students of the same course, even after we control for an extensive set of student and school characteristics. This translates to a 10% higher likelihood of expected graduation.
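The conversion between percentage-point differences and percent changes in the paragraph above can be checked with a short calculation. This is a minimal sketch: the function name is ours, and the retaker baseline is only the value implied by the rounded figures in the text, not a statistic reported in the article.

```python
def relative_change(pp_diff: float, baseline_pct: float) -> float:
    """Express a percentage-point difference as a percent of a baseline rate."""
    return 100.0 * pp_diff / baseline_pct


# First-time takers: 2.6 percentage points lower, sample mean of 77.8%.
first_time_change = relative_change(-2.6, 77.8)
print(round(first_time_change, 1))

# Retakers: 6.5 percentage points higher, described as roughly 10% higher.
# The implied baseline is approximate because the 10% figure is rounded.
implied_retaker_baseline = 100.0 * 6.5 / 10.0
print(implied_retaker_baseline)
```

The first value reproduces the 3.3% decline cited for first-time course takers; the second suggests a retaker baseline of roughly 65%.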
Our results are robust to alternative model specifications. Moreover, both the contemporaneous course outcomes and the downstream outcomes appear stable to assumptions about selection. Using methods developed by Oster (2017), we find that once we control for a rich set of observable student and school characteristics, the selection on unobservable characteristics would have to be extremely large to invalidate our main results.
What might explain the differences that we find in online course taking by attempt type (first-time or retake)? One possibility is that these differences could stem from differences in the characteristics of the students in the two types of classes. However, since we find comparable results across a variety of measured student characteristics, this explanation seems less likely. Another possibility is that the differences arise from differences in the counterfactual course offerings. For example, traditional options for students to retake a course in a face-to-face setting, as in summer school, may be lower quality in terms of instruction as compared with the standard version of the course that students take as a first attempt. Students may also not be motivated to work hard during the summer, when traditional retake opportunities are common. Alternatively, perhaps something about course retaking makes it more difficult to accomplish in a traditional face-to-face setting. For instance, students may be embarrassed to repeat the class in view of their peers (whether the course is retaken during the academic year with younger children or at summer school). Although we cannot test these mechanisms directly, our findings of differences in outcomes for first-time versus credit recovery attempts can help guide thinking through how best to use virtual courses for student benefit.
Background
While the use of virtual courses among K–12 students has grown rapidly, credible research on the effects of those classes on student performance remains scant (Barbour, 2013; Means et al., 2013). A larger body of literature has addressed the relationship between virtual course taking and academic performance in higher education settings. One set of studies that randomized students to online/hybrid or face-to-face versions of classes tended to find null to negative effects of virtual instruction (Bowen, Chingos, Lack, & Nygren, 2014; Figlio, Rush, & Yin, 2013; Joyce, Crockett, Jaeger, Altindag, & O’Connell, 2015). These studies, while internally valid, relied on samples from relatively selective institutions that often enroll unusually motivated and well-prepared student bodies. The generalizability of these results is unclear.
A second set of studies in higher education used quasi-experimental methods to explore the effects of virtual course taking, largely among broad access institutions such as community colleges or for-profit institutions. Studies conducted in Virginia (Xu & Jaggars, 2011), Washington (Xu & Jaggars, 2013), California (Hart, Friedmann, & Hill, 2018), and anonymized state systems (Streich, 2014) consistently found poorer performance for students who take classes virtually. Recent evidence suggests that online course taking is associated with negative downstream impacts on performance in follow-on courses as well (Krieg & Henson, 2016). Looking at a different type of broad access setting (a large for-profit university), Bettinger, Fox, Loeb, and Taylor (2017) found similar patterns, with students in online courses performing less well than their peers contemporaneously and facing reduced likelihood of continuing enrollment. 1
These same relationships may be at play in the K–12 sector, or the effects of online course taking may plausibly differ. While college instructors typically have broad latitude to set the terms of their own courses (determining course content, assignment structures, and so on), such latitude may be considerably reduced at the secondary level, where state standards place stricter limits on what content must be covered in courses and where organizational practices may require all teachers served by the virtual school to use a prescribed set of instructional materials (Friend & Johnson, 2005). Moreover, secondary students are in school full-time and likely have fewer competing responsibilities, such as work or family, that may crowd out time for school work.
Two studies at the high school level used randomized controlled trials to evaluate the learning effects of virtual or hybrid course taking. The first study randomized ninth-grade students taking Algebra 1 to face-to-face or hybrid models of course delivery (Cavalluzzo, Lowther, Mokher, & Fan, 2012). Hybrid versions of courses offered teachers access to customizable courseware provided by Kentucky Virtual Schools and the Kentucky Department of Education. Roughly 60% of instruction was delivered face-to-face, with 40% delivered through the virtual courseware. Researchers found no differences on test scores between groups randomized to the traditional and hybrid versions of the course, although treatment fidelity was an issue in the study (Cavalluzzo et al., 2012).
A second randomized controlled trial explored effects on student outcomes in Algebra 1 credit recovery classes offered over the summer months (Heppen et al., 2017). Researchers randomly assigned students who had failed second-semester Algebra 1 to face-to-face or online credit recovery courses (both of which took place in the summer). Students assigned to the online delivery mode rated the class as being more difficult, and credit recovery success rates and algebra posttest scores were both higher in the face-to-face condition. Longer-term outcomes for the first cohort of the study, however, suggested no lasting differences between online and face-to-face students on performance in downstream courses.
A handful of quasi-experimental studies came to mixed conclusions about the educational effects of virtual education among students who opt into those settings. A recent study by the Center for Research on Education Outcomes compared students in virtual charter schools with “virtual control records”—that is, records for students who look observationally similar to virtual charter students. The center found that students in virtual charters made substantially smaller gains on math and reading scores over a year than did their peers who remained in face-to-face schools (Woodworth et al., 2015). A study specific to fully online schools in Ohio found similarly negative effects of enrollment in virtual schools (Ahn & McEachin, 2017). Likewise, a quasi-experimental study examining virtual course taking in algebra found that students induced into virtual classrooms underperformed their peers in face-to-face environments (Heissel, 2016).
The study closest to our own is a working paper by Chingos and Schwerdt (2014). Their study examined how participation in Florida Virtual School (FLVS) courses affected student scores on high school math and English standardized tests. They found that although students who took FLVS courses constituted a positively selected group relative to their peers, even when controlling for student and school characteristics, students who took FLVS courses had null or slightly positive results as compared with their peers on standardized tests.
Our article extends their work, and other research on online course taking in K–12, in several ways. First, we explore a novel set of outcomes for the K–12 level: concurrent course performance, performance in downstream courses, and progression through school. Second, we explore a broader range of courses. Most previous studies focused primarily on math and English language arts. Our main results assess virtual course taking in commonly taken courses in English, math, science, and social studies, and in robustness checks, we extend the sample further to examine concurrent course performance in additional courses. Third, we distinguish the performance outcomes of first-time course takers from those of students retaking courses to recover credit, which turns out to be informative. We might expect the effects to differ between first-time takers and retakers because the latter are likely to need access to courses at nontraditional times and may retake their courses while fulfilling other traditional requirements. Thus, the counterfactual for online course taking may differ for the two groups. The credit recovery group is also important to study because a growing number of districts, particularly in Florida, are relying on online options to provide credit recovery instruction to supplement and supplant traditional options, such as summer school (Gonzalez, 2012).
Setting
We study online course taking in the context of Florida’s virtual education system. Florida is home to the most extensive K–12 virtual education system in the country, and the Florida education code requires students entering ninth grade since the 2011–2012 school year to take at least one course online. To facilitate this graduation requirement, the state also requires that each district offer every K–12 student multiple part- and full-time options for virtual course taking (Florida Statute 1002.321, 2018). Two such options come through part- and full-time programs operated by FLVS, the statewide virtual school freely available to public school students who meet a broad set of eligibility criteria. 2 FLVS is the largest K–12 virtual education provider in the nation, serving students through more than 400,000 enrollments in 2014–2015 (Evergreen Education Group, 2015). 3 Districts may operate their own virtual schools or district-based franchises of FLVS. Franchises use FLVS-developed courses and have access to FLVS-provided professional development, but content is delivered by district-employed teachers (Evergreen Education Group, 2015). In addition, other virtual providers increasingly serve Florida students, including K12, Connections Education, and others.
Our data include many virtual providers, and students participate in a range of virtual courses, which can vary in their materials, teachers, type and extent of interactions, and other features. FLVS is the predominant virtual provider in our data. FLVS courses are generally asynchronous; in other words, students may all be accessing course materials at different times. However, instructors are required to have regular phone check-ins with students to assess their understanding and address their questions. In this way, students have regular real-time instructor interactions as well as online asynchronous interactions (Jacob, Berger, Hart, & Loeb, 2016). FLVS teachers work with about 150 students on average, roughly equivalent to a high school teacher with six periods of 25 students each (FLVS, n.d.).
Methods
Data and Sample
We draw on data from the Florida Education Data Warehouse, maintained by the Florida Department of Education. Data include course enrollments taken through traditional public schools as well as credits received from online instructional institutions. Information on some additional school characteristics comes from the National Center for Education Statistics and from Florida School Indicator Reports.
Our main analytic sample includes students taking courses in Grades 9 and 10 in Florida high schools for the 2006–2007 through 2011–2012 school years. Throughout the article, we designate school years by referring to the calendar year of the spring term (e.g., 2012 for 2011–2012). We use data as early as 2005 to characterize students’ prior performance on standardized tests in middle school and to determine whether students had previously taken courses, and we use data as late as 2014 to determine whether students take follow-on courses. We focus on ninth and 10th graders so that we can assess effects on future coursework and because course selections vary less across students in those grades. In robustness checks, we expand our sample to include students in 11th and 12th grades.
Because one of our key outcomes of interest is performance in subsequent courses, we focus on a subset of courses that are commonly taken, are considered academic courses (as opposed to life skills courses, such as physical education or drivers’ education), and are likely to be taken in a specific sequence. We are able to identify a set of courses in which the majority of course takers subsequently appear within the next 2 years. To ensure that we have a consistent 2-year look-forward period in which to observe future course taking for all students, we focus on ninth- and 10th-grade students, and our target courses therefore include courses primarily taken in those grades—specifically, Algebra 1, Biology 1, Chemistry, Earth/Space Science, English 1, English 2, Geometry, Physical Science, and World History.
While these courses have advantages in that we are able to look at downstream course outcomes, these are not the courses most frequently taken online. As Table 1 shows, students take only about 1.3% of their first-attempt enrollments in “core” academic subjects, such as math, English language arts, social studies, and sciences online, while they take about 4.6% of enrollments virtually in life skills courses, such as physical education and driver’s education. Online enrollment is similarly less prevalent for credit recovery in these core academic subjects than for other types of courses. Moreover, our need to look forward for 2 years to capture outcomes in future courses limits us from examining outcomes in 11th and 12th grades, which are the grades where online course taking is more prevalent in Florida (Jacob et al., 2016). By using robustness checks, we address the external validity concerns associated with our sample limitations in terms of courses and grade levels.
Table 1. Share of Enrollments Taken Virtually
Because our analytic strategies, detailed later, rely on comparing virtual course takers’ outcomes with outcomes of peers in the same school, our study focuses on students enrolled in brick-and-mortar schools. We therefore exclude students enrolled in virtual schools full-time. The students who remain in our sample take virtual classes to supplement instruction provided at their brick-and-mortar schools and generally take a relatively small share of courses online. Among students in our sample who took at least one of the nine target courses virtually, more than 80% took only one course virtually.
We also distinguish between types of course attempts: first-time attempts and retake efforts. Our first-attempt sample comprises all ninth- and 10th-grade students in target courses who were not previously enrolled in a given course. Our course-retaking sample includes students who failed one of the target courses as ninth or 10th graders. Failure is defined by having a cumulative-course grade point average <1.0 across semesters, which equates to less than a D. A grade point performance <1.0 implies that a student failed at least one semester of the course. Our sample includes students who retook the course within the next academic year (including the summer after the academic year of initial failure; roughly 15% of retake enrollments occur during the summer). We exclude students who did not retake courses the next summer or academic year. 4 In robustness checks, we expand our sample to include course retake attempts within 2 years of failing the initial course.
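The sample rules above can be sketched as two small predicates. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, grade points are assumed to run from 0.0 (F) to 4.0 (A), school years are labeled by the spring calendar year as elsewhere in the article, and a summer retake is assumed to carry the label of the school year it follows.

```python
from typing import Optional, Sequence


def failed_course(semester_grade_points: Sequence[float]) -> bool:
    """Return True when the cumulative course GPA across semesters is below 1.0."""
    return sum(semester_grade_points) / len(semester_grade_points) < 1.0


def in_retake_sample(fail_year: int,
                     retake_year: Optional[int],
                     retake_term: Optional[str]) -> bool:
    """Keep retakes in the summer after the failing year or the next academic year.

    Assumed coding: `retake_term` is "summer" or "regular"; a summer retake is
    labeled with the school year it follows.
    """
    if retake_year is None or retake_term is None:
        return False  # the student never retook the course
    if retake_term == "summer":
        return retake_year == fail_year
    return retake_year == fail_year + 1


print(failed_course([1.0, 0.0]))               # D then F: cumulative GPA 0.5
print(failed_course([2.0, 0.0]))               # C then F: cumulative GPA 1.0
print(in_retake_sample(2010, 2010, "summer"))  # summer right after failing
print(in_retake_sample(2010, 2012, "regular")) # two years later
```

The second case shows why the threshold is one-directional: a cumulative GPA below 1.0 implies a failed semester, but a failed semester does not by itself put a student in the retake sample.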
Measures
Our main variables of interest are indicators for whether a class is taken through a virtual provider (vs. an omitted category of exclusively taken face-to-face). Course mode is derived per the instructional institution that the high school transcript files record for the course enrollment. Note that for credit recovery courses, course mode refers to the mode of the retake effort, not the mode of the initial course. In a small number of cases (<3%), instructional institution records are missing, but data are available on the enrollment institution (the institution where the student is enrolled). In these cases, we assume that instruction is provided in the enrollment institution; we show in robustness checks that our results are insensitive to the exclusion of these records.
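The coding rule for course mode, including the fallback for missing instructional institution records, might look like the following sketch. The provider identifiers and the function name are hypothetical placeholders; the transcript files' actual field names and institution codes are not described in the text.

```python
from typing import Optional

# Hypothetical identifiers for institutions classified as virtual providers.
VIRTUAL_PROVIDERS = frozenset({"FLVS", "DISTRICT_VIRTUAL_FRANCHISE"})


def is_virtual(instructional_institution: Optional[str],
               enrollment_institution: str) -> int:
    """Return 1 if the course is taken through a virtual provider, else 0.

    When the instructional institution is missing, assume instruction was
    provided by the enrollment institution.
    """
    provider = (instructional_institution
                if instructional_institution is not None
                else enrollment_institution)
    return int(provider in VIRTUAL_PROVIDERS)


print(is_virtual("FLVS", "LINCOLN_HIGH"))   # virtual provider recorded
print(is_virtual(None, "LINCOLN_HIGH"))     # missing record, falls back to home school
```

A robustness check of the kind described above would simply drop the rows where the first argument is missing and re-estimate.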
Outcomes
We explore three main outcomes of interest. Our first outcome of interest is concurrent course performance. Because grades are ordinal, we focus on a binary indicator for whether the student passes the class with a grade of C or better. This measure has substantive importance for students because students need a final grade point average ≥2.0 to graduate (Florida Department of Education, 2016a).
A concern with using contemporaneous course performance as an outcome measure is that grading standards may vary across teachers and schools and between the online and in-person modality. To address this concern, we look at enrollment and performance in follow-on courses (Carrell & West, 2010; Figlio, Schapiro, & Soter, 2015). Follow-on courses are defined as the next course taken in the same subject area. 5 Our main outcome of interest for follow-on coursework is a measure that captures whether a student takes and passes a follow-on course with a grade of C or better. This outcome captures whether a student goes on to any successful future course taking in the same subject area. In supplemental analyses, we decompose the future course-taking behavior by looking at (1) the decision of whether to enroll in follow-on courses in the same subject, (2) the characteristics of the follow-on courses taken, and (3) the likelihood that students pass follow-on courses, conditional on taking one and controlling for next-course characteristics; this decomposition allows us to explore the mechanisms through which virtual course taking may affect future course outcomes.
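The two course-level outcomes can be expressed as binary indicators. The sketch below assumes a 4-point grade scale in which a C corresponds to 2.0 grade points, and it represents a student who never takes a follow-on course with `None`; both choices, and the function names, are ours.

```python
from typing import Optional

C_GRADE_POINTS = 2.0  # assumed scale: A=4, B=3, C=2, D=1, F=0


def passed_c_or_better(grade_points: float) -> int:
    """Concurrent outcome: 1 if the course grade is C or better, else 0."""
    return int(grade_points >= C_GRADE_POINTS)


def took_and_passed_follow_on(follow_on_grade_points: Optional[float]) -> int:
    """Downstream outcome: 1 only if a follow-on course was taken and passed."""
    if follow_on_grade_points is None:
        return 0  # no follow-on course in the same subject area
    return passed_c_or_better(follow_on_grade_points)


print(passed_c_or_better(1.0))           # a D does not count as passing
print(took_and_passed_follow_on(None))   # never enrolled in a follow-on course
print(took_and_passed_follow_on(3.0))    # took the next course and earned a B
```

Coding non-enrollment as 0 is what makes this a joint "takes and passes" measure, so it is defined for every student rather than only for those who continue in the subject.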
A third outcome measures the likelihood that a student is observed in the second semester of his or her projected senior year. That is, we look at whether we observe students earning credits in the final term of their projected 12th-grade year or if we see them earning credit for a full-year class (implying that they were attending in the final term). We project that current ninth and 10th graders should be in 12th grade in 3 and 2 years, respectively. We lack graduation data and so cannot observe whether a student actually earns a degree, but we regard this as a strong proxy for students being on track to graduate. In supplemental analyses, we look at credits earned as an alternate measure and find similar results. 6
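The graduation proxy can be sketched as a projection step followed by a lookup in the student's credit records. The record layout (`year`, `term`, `credit`) and the term labels are hypothetical; the rule itself follows the description above, with school years labeled by the spring calendar year.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[int, float, str]]


def projected_senior_year(current_year: int, grade: int) -> int:
    """Ninth graders are projected to be seniors in 3 years, 10th graders in 2."""
    if grade not in (9, 10):
        raise ValueError("projection is defined here for Grades 9 and 10 only")
    return current_year + (12 - grade)


def observed_in_final_term(records: List[Record], senior_year: int) -> int:
    """1 if the student earns credit in the final term, or for a full-year
    class, during the projected 12th-grade year; otherwise 0."""
    return int(any(
        r["year"] == senior_year
        and r["credit"] > 0
        and r["term"] in ("final", "full_year")
        for r in records
    ))


senior = projected_senior_year(2010, 9)
transcript = [
    {"year": 2013, "term": "first", "credit": 0.5},
    {"year": 2013, "term": "full_year", "credit": 1.0},
]
print(senior, observed_in_final_term(transcript, senior))
```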
Student Controls
To compare observably similar students, we include a host of student controls. A vector of race indicators indexes whether students are White (omitted), Black, Hispanic, Asian, multiracial, or “other race”; gender is captured by an indicator for whether students are female. Student socioeconomic status is captured by an indicator recording whether a student is eligible for free or reduced-price lunch (FRPL) in a given year.
We include several variables that capture students’ prior academic performance. We capture prior test performance through grade-by-year standardized student scores on the eighth-grade math and English language arts sections of the Florida Comprehensive Assessment Test (FCAT), which Florida used for accountability purposes. We also include indicators for whether a student is enrolled in a gifted program, identified for special education programs, or classified as limited English proficient in a given year. We include the student’s attendance rate in the same year as the course is taken, as an additional control. We also include indicators for whether each student was a member of a cohort that was subject to an online course-taking requirement to graduate. Students entering ninth grade in fall 2011 or later were required to take at least one online course prior to graduation. 7
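Grade-by-year standardization of test scores means converting each score to a z-score within its grade and year cell. The sketch below uses the population standard deviation and a hypothetical row layout; the article does not say which standard deviation convention was applied.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def standardize_by_grade_year(rows: List[Dict[str, float]]) -> List[float]:
    """Return z-scores computed within each (grade, year) cell.

    Each row is assumed to have `grade`, `year`, and `score` keys. Cells with
    no score variation would divide by zero and need separate handling.
    """
    cells: Dict[Tuple[float, float], List[float]] = defaultdict(list)
    for r in rows:
        cells[(r["grade"], r["year"])].append(r["score"])
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in cells.items()}
    return [
        (r["score"] - stats[(r["grade"], r["year"])][0])
        / stats[(r["grade"], r["year"])][1]
        for r in rows
    ]


scores = [
    {"grade": 8, "year": 2006, "score": 300.0},
    {"grade": 8, "year": 2006, "score": 320.0},
    {"grade": 8, "year": 2006, "score": 340.0},
    {"grade": 8, "year": 2007, "score": 310.0},
    {"grade": 8, "year": 2007, "score": 330.0},
]
z_scores = standardize_by_grade_year(scores)
print([round(z, 3) for z in z_scores])
```

Standardizing within cells keeps scores comparable across cohorts even if the test scale or difficulty shifts from year to year.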
Our most saturated models include controls for ninth-grade academic measures (when observed for 10th-grade students) and a more complete set of middle school controls. Ninth-grade academic measures include FCAT scores, attendance rate, Grade 9 grade point average, and number of credits earned. The fuller set of middle school controls includes sixth- to eighth-grade attendance rates, sixth- and seventh-grade FCAT scores in math and reading, quadratic terms for the test scores, and interactions between the math and reading scores for each grade level.
Home Institution Characteristics
We control for a series of characteristics of students’ home institutions. Home institution refers to the brick-and-mortar school that a student attends, while instructional institution refers to the institution providing a specific course. School academic quality is captured by a series of indicators for the grade (A–F) received by the school under the state’s accountability plan in the current academic year. As a second measure of student achievement, we include the mean value of the eighth-grade FCAT scores of the incoming cohort of ninth graders. School demographic measures include the share of the student body that is Black, Hispanic, Asian, or other race (percentage White is omitted) and the share of students using subsidized lunch. We also include indicators for whether the home institution is a charter or magnet school, as well as a series of indicators capturing school urbanicity (city and suburb vs. rural and town).
Analytic Methods
In assessing the relationship between virtual course taking and student outcomes, we are concerned about several types of selection bias. First, courses disproportionately taken online may be harder or easier to pass than courses taken less frequently online. Table 1 shows the prevalence of virtual course taking among different class types. Note that many of the most popular classes are in life skills (e.g., physical education, driver’s education) or subject areas (e.g., foreign languages), while online course taking is less common in core courses (e.g., English and math). The difference in courses taken online and in-person may bias our estimates if pass rates are systematically higher or lower in courses that enroll larger shares of students virtually. To address this concern, we include course fixed effects. Year fixed effects and grade fixed effects similarly adjust for potential differences in course taking across time and grades.
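To examine the relationship between virtual course taking and academic performance, we estimate ordinary least squares regressions of outcome $Y$ for student $i$ in course $c$, home school $s$, and year $t$:
$$Y_{icst} = \beta_0 + \beta_1 \mathit{Virtual}_{icst} + \gamma_c + \delta_t + \theta_g + \varepsilon_{icst},$$
where $\mathit{Virtual}_{icst}$ is an indicator for whether the course is taken through a virtual provider; $\gamma_c$, $\delta_t$, and $\theta_g$ are course, year, and grade fixed effects; and $\varepsilon_{icst}$ is an error term.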
These simple regressions may be subject to bias from school and student factors. For instance, descriptive statistics (Table 2) suggest that virtual students were more likely than face-to-face students to come from schools that received state report card grades of A and B. If students in these relatively advantaged schools benefit from other resources that promote success in future course taking, our results could be positively biased. With respect to student factors, virtual classes enroll a higher share of female students and gifted students and a lower share of students eligible for FRPL, English learners (students designated as limited English proficient), and students using special education. Virtual course takers also have, on average, higher eighth-grade math and English language arts scores on the FCAT. These student factors are independently associated with virtual course taking (in the same direction) if we model virtual course-taking behaviors including all characteristics in the same regression, even when we control for school attended (results available on request). If more academically advantaged students disproportionately enroll in virtual courses, as these descriptive statistics suggest, estimates could be biased upward, unless the models adequately adjust for incoming differences.
Table 2. Summary Statistics
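We address these concerns by estimating highly saturated fixed effects models that include student and school controls as well as school fixed effects:
$$Y_{icst} = \beta_0 + \beta_1 \mathit{Virtual}_{icst} + X_{it}\Gamma + S_{st}\Lambda + \gamma_c + \delta_t + \theta_g + \phi_s + \varepsilon_{icst},$$
where $Y_{icst}$ is the outcome for student $i$ in course $c$, home school $s$, and year $t$; $\mathit{Virtual}_{icst}$ indicates that the course is taken through a virtual provider; $X_{it}$ and $S_{st}$ are vectors of student controls and home institution characteristics; $\gamma_c$, $\delta_t$, and $\theta_g$ are course, year, and grade fixed effects; and $\phi_s$ is a school fixed effect.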
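To make concrete what school fixed effects do, the sketch below estimates the coefficient on a virtual indicator using only within-school variation, by demeaning the indicator and the outcome within each school. It is a deliberately stripped-down illustration with a single regressor and invented toy data; the models described above also include student and school covariates and course, year, and grade fixed effects, which this sketch omits.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, float, float]  # (school id, virtual indicator, outcome)


def pooled_slope(rows: List[Row]) -> float:
    """OLS slope of the outcome on the virtual indicator, ignoring schools."""
    x_bar = sum(x for _, x, _ in rows) / len(rows)
    y_bar = sum(y for _, _, y in rows) / len(rows)
    num = sum((x - x_bar) * (y - y_bar) for _, x, y in rows)
    den = sum((x - x_bar) ** 2 for _, x, _ in rows)
    return num / den


def within_school_slope(rows: List[Row]) -> float:
    """OLS slope after demeaning both variables within each school, which is
    equivalent to including a dummy variable for every school."""
    by_school: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for school, x, y in rows:
        by_school[school].append((x, y))
    num = den = 0.0
    for obs in by_school.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        num += sum((x - x_bar) * (y - y_bar) for x, y in obs)
        den += sum((x - x_bar) ** 2 for x, _ in obs)
    return num / den


# Invented toy data: school A has more virtual enrollments and higher pass rates.
toy = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 1), ("A", 0, 0),
    ("B", 1, 1), ("B", 0, 0), ("B", 0, 1), ("B", 0, 0),
]
print(round(pooled_slope(toy), 3), round(within_school_slope(toy), 3))
```

In the toy data the pooled slope is larger than the within-school slope, because part of the raw association reflects which schools virtual students attend rather than differences among students in the same school.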
These results may still be subject to bias if students who take virtual courses differ in unobservable ways, such as self-motivation, from their same-school peers. This concern may be especially pronounced because our results are based on a relatively modest proportion of students who take courses online. To address this concern, we provide evidence on the degree of selection on unobservables, relative to selection on observables, that would be required to render the estimates from our school fixed effects models null, relying on procedures detailed in Oster (2017). As we describe in greater detail below, a substantial degree of selection on unobservables would be required to render our main results null.
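The logic of this sensitivity check can be illustrated with the simplified approximation that Oster discusses, which compares how the coefficient and the R-squared move between an uncontrolled and a controlled regression. The sketch below is not the exact estimator and may differ from the procedure the authors ran; the function names are ours, every number is invented for illustration, and setting the maximum R-squared to 1.3 times the controlled R-squared is only one commonly used convention.

```python
def bias_adjusted_beta(beta_uncontrolled: float, beta_controlled: float,
                       r2_uncontrolled: float, r2_controlled: float,
                       r2_max: float, delta: float = 1.0) -> float:
    """Approximate bias-adjusted coefficient:
    beta* ~ beta_c - delta * (beta_u - beta_c) * (R_max - R_c) / (R_c - R_u)."""
    movement = beta_uncontrolled - beta_controlled
    scale = (r2_max - r2_controlled) / (r2_controlled - r2_uncontrolled)
    return beta_controlled - delta * movement * scale


def delta_for_zero_effect(beta_uncontrolled: float, beta_controlled: float,
                          r2_uncontrolled: float, r2_controlled: float,
                          r2_max: float) -> float:
    """Relative degree of selection on unobservables (delta) at which the
    approximate bias-adjusted coefficient equals zero."""
    movement = beta_uncontrolled - beta_controlled
    return (beta_controlled * (r2_controlled - r2_uncontrolled)
            / (movement * (r2_max - r2_controlled)))


# Invented inputs, not estimates from the article.
delta_star = delta_for_zero_effect(
    beta_uncontrolled=0.10, beta_controlled=0.08,
    r2_uncontrolled=0.10, r2_controlled=0.30,
    r2_max=1.3 * 0.30,
)
print(round(delta_star, 2))
```

A value well above 1 would mean that unobservables need to be several times as important as the observed controls to explain away the estimate, which is the sense in which a result is described as requiring substantial selection on unobservables.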
Results
Main Results
First-time course taking
Table 3 presents the main results for our first-time course-taking sample. The five columns present model estimates, with each subsequent model adding controls—from column 1, which includes only course, year, and grade fixed effects, through column 5, which includes a host of student and school covariates, including student performance measures from middle school. The effects are mixed. We find positive effects on passing the contemporaneous course but negative effects on subsequent course performance as well as our graduation proxy. For example, the specification in column 3, which contains basic student and school controls, suggests that taking a course virtually increases the likelihood of passing it by 12.5 percentage points (roughly 18%). However, it reduces the likelihood of taking and passing a follow-on course in the same subject by 1.5 percentage points (roughly 2%) and lowers our proxy for high school graduation by 3.4 percentage points (4%).
Table 3. First-Time Course Takers: Main Outcomes
Because we remain concerned about bias, we add student-level controls to check for specification errors. In particular, we first add controls for ninth-grade performance, which are set to 0 for ninth-grade students, with grade fixed effects subsuming the missing variable indicators. Our downstream coefficients decline in magnitude across specifications—falling in magnitude by nearly half for the next-course outcome—but remain significant (column 4). As a final set of controls, we add in the richer set of middle school controls, including attendance rates and lagged test scores back to sixth grade, in addition to quadratic terms and interaction terms for the math and reading scores in each year. These added controls make little additional difference, although the significance of the next-course estimates diminishes to
Course retakes
We find a somewhat different pattern in the effects of virtual course taking on course retake attempts (Table 4). When our full set of controls and school fixed effects is included (column 5), students who repeat virtual courses are 4.7 percentage points more likely to pass their remedial course, 1.7 percentage points more likely to jointly take and pass future same-subject courses, and 6.5 percentage points more likely to be observed in a projected final term in senior year, as compared with peers who retake coursework in face-to-face settings (
Course Retakers: Main Outcomes
We run a variety of robustness tests to verify that our results are not an artifact of sample selections and analytic specifications (see online Appendix B). We find that the pattern and significance of our results are, for the most part, robust to different sample and specification decisions, although the magnitude of the coefficients shifts somewhat depending on these decisions. Specifically, we confirm that our positive concurrent course-passing results for first-time attempts and retakes hold—albeit with a notable reduction in magnitude for first-time course takers—when we expand our sample to include a broader range of grades and courses. We confirm that results are insensitive to excluding the 1% to 2% of records for which we made assumptions that the student’s home institution provided instruction in the absence of explicit records of instructional institution.
We also confirm that our results are generally insensitive to specification decisions. Our results are similar when we experiment with the inclusion of different patterns of fixed effects (including school by course, course by grade by year, and school by grade by year). We confirm that we find similar point estimates if we use a propensity score–matching analytic strategy. We further confirm that our remedial course-taking results are similar if we include courses that students took within 2 years of the initial failure; our main models include only courses retaken within 1 year. We also confirm that we see a similar set of results to our main graduation proxy if we use alternative (but potentially less reliable) measures of credits accumulated by a projected 12th-grade year. We discuss these results in detail in the online Appendix B.
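To make the role of the fixed effects concrete, the following sketch shows the within-group (demeaning) transformation that underlies a school fixed effects specification, using a single regressor and toy data. It is a minimal illustration and not the estimation code used for the analyses: the function name, the variable names, and all data values are hypothetical, and the actual models include many controls and several fixed-effect dimensions.

```python
from collections import defaultdict


def within_slope(y, x, groups):
    """Slope from regressing y on x after subtracting group means from both."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum of y, sum of x, count]
    for yi, xi, g in zip(y, x, groups):
        entry = sums[g]
        entry[0] += yi
        entry[1] += xi
        entry[2] += 1

    num = den = 0.0
    for yi, xi, g in zip(y, x, groups):
        sum_y, sum_x, n = sums[g]
        y_dm = yi - sum_y / n  # outcome relative to the group mean
        x_dm = xi - sum_x / n  # treatment relative to the group mean
        num += x_dm * y_dm
        den += x_dm * x_dm
    if den == 0:
        raise ValueError("x has no within-group variation")
    return num / den


# Hypothetical toy data: school B has a higher baseline outcome and a larger
# share of virtual enrollments; the true within-school difference is 0.10.
school = ["A", "A", "A", "A", "B", "B", "B", "B"]
virtual = [0, 0, 0, 1, 0, 1, 1, 1]
outcome = [0.6, 0.6, 0.6, 0.7, 0.8, 0.9, 0.9, 0.9]

pooled = within_slope(outcome, virtual, ["all"] * len(school))  # ignores schools
fixed_effects = within_slope(outcome, virtual, school)          # compares same-school peers

print(f"pooled slope: {pooled:.2f}, school fixed effects slope: {fixed_effects:.2f}")
```

In this toy example the pooled comparison overstates the association because virtual enrollment is concentrated in the higher-performing school, while the within-school comparison recovers the difference built into the data.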
Selection
Notwithstanding the general stability of the results established in our robustness tests, we may still be concerned that the results are an artifact of selection. Perhaps especially motivated or tech-savvy students opt into virtual courses. While we cannot conclusively speak to how seriously selection issues are affecting our results, we can speak to the extent to which selection would have to occur to render our results null. Building on the work of Altonji, Elder, and Taber (2005), Oster’s procedure (2017) uses the change in magnitude of “treatment” coefficients when controls are included, as compared with uncontrolled models, to quantify the extent to which the inclusion of observable characteristics reduces bias and the extent to which additional selection on unobservables would have to exist to render the “treatment” effect null. Specifically, Oster’s procedure determines the degree of selection on unobservables (δ) necessary to return a null coefficient, given a specified
A key decision in bounds analysis involves the choice of a reasonable maximum
Bounds Analysis Accounting for Selection on Unobservable Characteristics
Results suggest that our coefficients are fairly robust to selection. For all results, we see δ > 1, in most cases by a large factor. This provides evidence that our estimates are reasonably stable: selection on unobservables would have to be quite large, exceeding selection on observables, to overturn our main results.
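The logic of δ can be illustrated with a simplified approximation discussed by Oster, in which the bias-adjusted coefficient is roughly the controlled coefficient minus δ × (uncontrolled coefficient − controlled coefficient) × (R-squared maximum − controlled R-squared) / (controlled R-squared − uncontrolled R-squared). Setting the adjusted coefficient to zero and solving gives δ. The sketch below implements only this approximation, which is not necessarily the exact estimator behind the reported bounds. The function name and every numeric input are hypothetical, and the choice of 1.3 times the controlled R-squared as the maximum is a rule of thumb assumed here for illustration.

```python
def oster_delta_approx(beta_uncontrolled, r2_uncontrolled,
                       beta_controlled, r2_controlled,
                       r2_max, beta_target=0.0):
    """Approximate delta: the selection on unobservables, relative to selection
    on observables, needed to move the controlled estimate to beta_target.
    Simplified approximation only; not the full estimator."""
    movement = beta_uncontrolled - beta_controlled
    if movement == 0:
        raise ValueError("coefficient does not move when controls are added")
    if r2_controlled <= r2_uncontrolled:
        raise ValueError("controls must raise R-squared")
    if r2_max <= r2_controlled:
        raise ValueError("r2_max must exceed the controlled R-squared")
    return ((beta_controlled - beta_target) * (r2_controlled - r2_uncontrolled)
            / (movement * (r2_max - r2_controlled)))


# Hypothetical inputs, not taken from the tables in this article.
r2_controlled = 0.30
delta = oster_delta_approx(
    beta_uncontrolled=0.150, r2_uncontrolled=0.10,
    beta_controlled=0.125, r2_controlled=r2_controlled,
    r2_max=1.3 * r2_controlled,  # assumed rule-of-thumb maximum
)
print(f"approximate delta: {delta:.1f}")
```

A value above 1 means that unobservables would need to be more strongly related to treatment than the observed controls are in order to drive the estimate to zero. Smaller coefficient movement and a larger gain in R-squared when controls are added both push δ upward.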
Decomposition of mechanisms in future course taking
Virtual course taking may affect the joint likelihood of taking and passing future coursework in the same subject in multiple ways: by changing the likelihood of taking a follow-on course, by changing the characteristics (e.g., course difficulty or instructional model) of the next course attempted, or by changing actual performance based on prior learning. In the online Appendix C, we separate these channels.
We find that our positive results for jointly taking and passing the next course for credit recovery students are driven largely by the likelihood of taking a follow-on course: credit recovery students are 4 percentage points more likely to take one. In addition, for both attempt types, students who take courses virtually are more likely to select into follow-on courses that we would expect to have higher pass rates, both because they are more likely to take the follow-on course virtually and because of general course difficulty, as proxied by overall pass rates. Conditional on these course characteristics and controlling for next-course fixed effects, virtual students are effectively no more or less likely than face-to-face students to pass their follow-on courses after first-time course taking, while they are modestly more likely to pass follow-on courses after credit recovery efforts.
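The separation of channels can be illustrated with a simple accounting identity: the joint probability of taking and passing a follow-on course equals the probability of taking it multiplied by the probability of passing conditional on taking it. The sketch below splits a change in the joint probability into a taking channel and a passing channel. It is a stylized illustration rather than the regression-based decomposition in the online Appendix C, it does not separate the course-characteristics channel, and all rates are hypothetical apart from the 4 percentage point difference in taking, which mirrors the figure quoted above.

```python
def decompose_joint_change(take_f2f, pass_f2f, take_virtual, pass_virtual):
    """Split the change in P(take and pass) into two additive parts:
    a change in P(take) and a change in P(pass | take)."""
    joint_f2f = take_f2f * pass_f2f
    joint_virtual = take_virtual * pass_virtual
    # Change in taking, valued at the face-to-face conditional pass rate.
    taking_channel = (take_virtual - take_f2f) * pass_f2f
    # Change in conditional passing, valued at the virtual taking rate.
    passing_channel = take_virtual * (pass_virtual - pass_f2f)
    return {
        "total": joint_virtual - joint_f2f,
        "taking": taking_channel,
        "passing": passing_channel,
    }


# Hypothetical rates for retakers in face-to-face versus virtual settings.
parts = decompose_joint_change(
    take_f2f=0.80, pass_f2f=0.85,
    take_virtual=0.84, pass_virtual=0.86,
)
for channel, value in parts.items():
    print(f"{channel}: {100 * value:.2f} percentage points")
```

With these made-up inputs most of the total change comes from the taking channel, which is the qualitative pattern described for credit recovery students.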
Heterogeneity of Results by Student Characteristics
Our main pattern of results describes average effects, but different types of students and different types of courses could see different effects of virtual course taking. The positive association between virtual course taking and contemporaneous course performance for first-time course taking is largely consistent across students with different background characteristics, although the size of the coefficient varies somewhat, particularly for students with different prior achievement profiles (Table 6). The negative associations between virtual course taking and being positioned for graduation likewise hold across most student subgroups. By contrast, the negative association between virtual course taking and follow-on course outcomes is largely driven by more advantaged students (White, non–FRPL using, higher achieving). Lower-achieving students (i.e., those in the bottom quartile of a measure of average standardized eighth-grade FCAT scores) have a more positive, or at least less negative, pattern of results in virtual first-attempt classes than do higher-achieving students for all outcomes.
Heterogeneity by Student Subgroup
The subgroup patterns look quite different for course retakers: for remedial course taking, more advantaged students (non–FRPL using, higher achieving) have particularly positive outcomes for our contemporaneous and follow-on course measures. While each of these groups is underrepresented in the course remediation group relative to its numbers in the general first-time course-taker population (Table 2), our results suggest that they benefit more from retaking virtual courses than do their less advantaged peers.
In results presented in the online Appendix D, we show that results are quite consistent for students in different school types as well.
Heterogeneity of Results by Course Subject
We next explore whether the patterns of results that we see are driven by a particular subset of courses or whether they hold across course types. To examine course differences, we separated results by subject: math, science, and language arts (world history, the only social science course in our sample, is presented separately; Table 7).
Heterogeneity by Course Type
For first-time course taking, the broad pattern of results for concurrent course taking is similar for all subjects. Estimates of the increased likelihood of course passing associated with virtual course enrollment range from 10.7 percentage points for world history courses to 16.4 percentage points for English language arts courses. Although the contemporaneous coefficient for world history courses is the smallest, it is the only subject area for which virtual course taking is associated with a positive change in the likelihood of taking and passing a follow-on course (column 2,
As with the results for initial attempts, the results for students retaking courses show larger increases in the likelihood of contemporaneous course passing in English (
Discussion
Online course taking is expanding rapidly for high school students. Florida requires all high school students to take at least one course virtually before graduation, and other states have adopted or are considering similar policies. The online setting offers potential benefits. For students in schools with limited course offerings, the online setting can expand access to curricula. For students seeking additional course taking in the summer, whether to retake classes in which they performed poorly or to free up time during the school year for other activities, the online setting could again increase opportunities. Yet, prior evidence in K–12 and in higher education comparing students in virtual courses with those in in-person classrooms has tended to find that the online setting is currently less effective. Students, especially those with lower prior performance, tend to learn less in the online setting.
In this study, we use data from the state of Florida to compare students in online and in-person classes. In comparison with the prior literature, our study examined an unusually large and diverse set of students and online courses. Moreover, we were able to examine not only current course performance but also future course taking and performance in those courses, as well as proxies for graduation eligibility by Grade 12. To the extent that online course taking increases access to courses that help students progress through school, the effects of online courses could be positive even if learning opportunities are not as great.
Using a rich set of controls and a variety of fixed effects to reduce potential biases, we find that students tend to receive higher grades in online courses. These better grades could be due to better student performance or to more transitory factors, such as easier grading standards. To focus on the effects on performance, we also look at longer-term outcomes. Students taking courses for the first time tend to see less positive longer-term effects when taking courses virtually. When compared with their same-school peers, virtual students are less likely to persist in school through the final term of a projected Grade 12 year and are marginally less likely to take and pass the next same-subject course in the high school sequence. However, students retaking courses that they had previously failed see some benefits from online course taking, being more likely to pass the contemporaneous course, more likely to take and pass the follow-up course, and more likely to persist through 12th grade.
The differences in estimated effects between first-time takers and retakers could be due to differences in the counterfactual course offerings. First-time takers may have access to a similar course in their school at the time when they are taking it and with students in their cohort, while retakers might be limited in alternatives if they want to retake a given course and maintain their progress through high school—for example, by taking the course in the summer or during nonschool hours. In our data, about 15% of virtual retake efforts come during summer terms; first-time virtual course attempts during the summer are negligible. The access offered by online courses may enable retakers to progress in ways that are more difficult in brick-and-mortar schools.
Differences between first-time takers and retakers in the effects of online course taking could also be driven by differences in the characteristics of students taking the two types of courses if the effects of online courses are heterogeneous. In part to address this possibility, we estimated effects separately for groups of students, courses, and schools. Overall, the results were largely though not completely consistent across groups. For first-time course takers, virtual course effects were quite consistent across groups, with some evidence of more negative long-run effects for more affluent, higher-scoring, and White students. Regarding course retaking, the effects again are similar across groups, with evidence of more benefits for nonpoor students and students with higher prior achievement. Given that first-time course takers are not more likely to be poor or low achieving than the course retakers, the more positive results for course retakers relative to first-time takers are unlikely to be due to differences in the students served.
Our results differ from those of some past studies in important ways. At first blush, our credit recovery results seem considerably more positive than the results of online credit recovery in previous randomized controlled trials (Heppen et al., 2017). However, our results for math specifically—the set of results most closely related to those of Heppen et al. (2017)—show null effects of online remediation in math on the contemporaneous course-passing outcome and on the joint likelihood of taking and passing a same-subject follow-on course. Coupled with the fact that Heppen and colleagues’ trial focused on a lower-income population—a population for which we find modestly negative contemporaneous course passing results in subgroup analyses—the differences in our results seem likely to be driven by the differences in the breadth of subjects and the differences in populations studied. The focus on this broader set of subjects and students is one key contribution of our study.
Our estimates have several limitations that provide the impetus for additional research. We attempt to address selection bias in multiple ways, including by applying a highly saturated set of controls and confirming the robustness of our results to multiple econometric techniques to address selection. Moreover, we calculate that selection on unobservables would have to be substantially greater than on observables to render our results null: for our most conservative estimates, the selection on unobservables would have to exceed selection on observables by 2.5 times to nullify results. However, we know little about the generally unobserved factors that motivate online enrollment in high school, particularly in the core subject areas in our sample. Additional research could uncover the factors that prompt students to enroll online in core subjects, which would help contextualize whether characteristics unobserved in our data may explain away our estimates. More generally, our results should be interpreted with some caution given that the number of students taking online courses in these core subjects remains relatively small.
Finally, our estimates are based on a specific point in time—2006–2007 through 2011–2012—yet the effects of online course taking are unlikely to be static. As students develop greater comfort with the online setting, their performance could easily change. The nature of virtual courses may change over time as well: Artificial intelligence, which is just beginning to penetrate online courses, offers promise for improving online learning by responding to students’ abilities and personalizing instruction in ways that classroom teachers would struggle to replicate and by learning which types of information and assessments best facilitate student learning. To support our analytic strategy, our sample focused on courses commonly available in home institutions, but this required us to leave unexamined the potential for online courses to expand access to new courses that are less commonly offered; future researchers should look into this possibility to further evaluate claims that online courses offer the possibility that geography will cease to determine access to quality teaching and diverse course offerings. Nonetheless, we provide an important first set of evidence that these benefits are yet to be fully realized.
Supplemental Material
Supplemental material, DS_10.1177_2332858419832852 for Online Learning, Offline Outcomes: Online Course Taking and High School Student Performance by Cassandra M. D. Hart, Dan Berger, Brian Jacob, Susanna Loeb and Michael Hill in AERA Open