Abstract
Keywords
Introduction
The impact of school composition on students’ school performance has been at the center of academic and public policy debates in the past decades. Even after the introduction of various desegregation policies and initiatives to reduce the socio-economic achievement gap, most educational systems are still characterized by segregation across social lines (Gutiérrez et al., 2020; Reardon et al., 2019). In most countries, therefore, there is still an ongoing debate about how much of the socio-economic achievement gap can be attributed to school effects and how much of it results from the individual-level effect of social origin on educational outcomes. The answer to this question greatly depends on the specific educational context in which students and schools are embedded (Fekjær and Birkelund, 2007; Raitano and Vona, 2010).
The macro-level educational system can significantly influence the extent of educational inequalities by determining the available educational institutions and shaping the cost–benefit expectations of families of different social origins (Gross et al., 2016; Hadjar and Becker, 2016). Several studies show that a higher level of school segregation is associated with greater inequalities in educational outcomes (Card and Rothstein, 2007; Reardon et al., 2019). These papers treat segregation as a macro-level characteristic of local or national educational systems and investigate how achievement gaps along social or ethnic lines differ depending on the level of segregation.
However, it is also important to focus on how the specific school environment influences students’ achievement and attainment. Schools are the meso-level units that mediate the effects of the education system on the individual students’ outcomes (Esser, 2016; Gross et al., 2016). One of the most important characteristics of the school environment is student composition. An educational system is segregated if students from different social backgrounds are unevenly distributed across schools (Massey and Denton, 1988). Since most educational systems are characterized by school segregation to varying extents, socio-economically advantaged and disadvantaged students tend to attend schools with different resources, teachers, and student compositions. These differences in the learning environment can contribute to achievement differences between students on one hand and influence families’ educational decisions on the other. In this article, we focus on the segregation of disadvantaged students: whether attending a high-poverty school (HPS) negatively influences students’ achievement and educational attainment.
A growing number of studies focus on this meso-micro link and aim to investigate whether attending an HPS has a negative effect on student outcomes. These papers seek to answer the question of how much students would gain if they were transferred from an HPS to a low-poverty school. Several papers have shown a significant negative association between a more disadvantaged socio-economic school composition and student achievement, measured by test scores (see Van Ewijk and Sleegers, 2010 for a review).
The evidence on the effect of attending an HPS on educational attainment is more limited. Previous studies from the United States found that attending a high-poverty primary school has a cumulative negative influence on later achievement (Carbonaro et al., 2023; Langenkamp and Carbonaro, 2018). This suggests that attending a high-poverty primary school might negatively affect long-term attainment by lowering achievement. In addition, Palardy (2013) has shown that attending a high-poverty high school negatively affects high-school graduation and college enrollment. However, there is no direct evidence on the effect of attending an HPS at the early stages on long-term attainment outcomes. Moreover, whether attending an HPS is associated with attainment outcomes over and above influencing achievement is still an open question.
Attending an HPS can affect educational attainment through various mechanisms. Two main channels of these mechanisms are the well-established primary and secondary effects in educational choices (Boudon, 1974; Breen and Goldthorpe, 1997; Jackson et al., 2007). Primary effects emerge from differences in students’ academic performance due to uneven access to resources. Secondary effects capture educational choices conditional on performance and reflect anticipatory decisions based on cost–benefit evaluations (Breen and Goldthorpe, 1997; Karlson and Holm, 2011). Originally, sociological theories of educational inequalities distinguished primary and secondary effects based on students’ social origin. Later, they were also related to other student characteristics, such as gender or ethnicity (Dollmann, 2017; Hadjar et al., 2014; Kristen et al., 2008). We argue that schools as specific learning environments can also have primary effects (mediated by students’ achievement) and secondary effects (independent of achievement) on educational decisions by shaping the cost–benefit expectations of families.
The present study investigates whether attending a high-poverty general school is associated with lower student achievement and long-term educational attainment in the stratified and highly segregated Hungarian educational context. Our contribution to the literature is threefold. First, we investigate the effect of attending a high-poverty general school not only on short-term academic achievement but also on long-term attainment outcomes. Second, we provide suggestive evidence that the effect of attending a high-poverty general school on attainment is not entirely transmitted through students’ academic achievement; it seems to influence attainment directly. Third, we carry out the analysis in an educational context that has received less attention before and where the relationship between students’ socio-economic background and performance is particularly strong in international comparison (Schleicher, 2019).
Using large-scale administrative data from Hungary, we apply a matching approach to estimate the effect of being enrolled in an HPS as opposed to a low-poverty school on educational outcomes (for a similar approach, see Belfi et al., 2016). First, we estimate the association between attending an HPS in Grade 8 and math and reading achievement on one hand and long-term attainment measures: secondary school completion, graduation, and higher education entry on the other. Second, we investigate whether attending an HPS is associated with attainment outcomes only via influencing students’ school performance or whether it directly affects secondary school completion, graduation, and higher education entry. To minimize selection bias, we condition on a rich set of variables measuring students’ family background and cultural resources and combine exact and propensity score matching. As opposed to regression models, matching estimators do not hinge on functional form assumptions. Moreover, matching avoids extrapolating for units that are nonexistent in either the treatment or the control group, which is inherent in regression models (Imbens, 2015). Nevertheless, matching cannot entirely eliminate selection bias if there are unobserved student characteristics affecting both students’ sorting across schools and educational outcomes. Therefore, we conduct additional analyses to reveal whether selection bias is present.
School segregation, student achievement, and educational attainment
There is ample evidence on the effects of attending an HPS on individual students’ achievement. Most studies find that students who attend schools with a higher share of socio-economically disadvantaged peers have lower academic achievement, but the estimated effects differ. The meta-analysis of van Ewijk and Sleegers (2010), which included 188 estimates from 30 (mainly Western or OECD) countries, found that, on average, a one standard deviation increase in the average socio-economic status of a student’s peer group is associated with a 0.32 standard deviation increase in the student’s test score. However, the effect varied between 0.03 and 0.59 standard deviations across the studies. There is much less evidence on the negative effect of attending an HPS on later educational attainment (Palardy, 2013).
The results on achievement effects suggest that attending an HPS harms educational attainment through achievement (primary effect on attainment). HPSs lower students’ achievement, and lower achievement in turn has a negative effect on attainment. First, lower achievement reduces the likelihood of admission to high-quality, popular schools and to schools in the academic track that end with a final exam and provide access to tertiary education (see the following section). Second, extremely low achievement often results in grade retention and dropping out of secondary education.
At the same time, attending an HPS can also have a direct effect on later educational outcomes over and above influencing achievement (secondary effect on attainment). Students with similar achievement may make less ambitious schooling decisions in HPSs or may have a higher probability of dropping out of secondary school.
Several mechanisms can explain why attending an HPS negatively influences student achievement and later educational attainment. First, instructional quality might be lower in HPSs due to the unequal distribution of human and material resources (Condron et al., 2013; Mickelson and Heath, 1999). On one hand, more qualified teachers might self-select into schools with a higher share of high-status students where the circumstances of teaching are more attractive (Condron, 2009; Hanushek et al., 2004; Kertesi and Kézdi, 2005). Empirical research from Hungary shows that less qualified teachers are indeed more likely to teach in schools with a higher share of socio-economically disadvantaged students (Havas and Liskó, 2005; Varga, 2009). On the other hand, HPSs tend to be located in socio-economically more disadvantaged municipalities where school expenditures might be lower (Hermann and Semjén, 2021). Until 2013, Hungary had a decentralized school system in which local governments were responsible for the provision of public education. Therefore, disparities in school expenditures were substantial when the cohorts analyzed in this study were enrolled in Grade 8 (Hermann, 2008, 2010). Although a policy reform transferred the responsibility for school governance to a central agency in 2013, the reform did not decrease the inequalities in student achievement (Hermann and Semjén, 2021).
Second, teachers might hold lowered expectations toward low-status students and adjust the level of instruction accordingly (Fekjær and Birkelund, 2007; Langenkamp and Carbonaro, 2018; Thrupp et al., 2002). On one hand, a lower level of instruction might directly lead to lower student achievement. On the other hand, the literature on the Pygmalion-effect suggests that lowered expectations can have a detrimental effect on school performance by decreasing student effort, aspiration, and motivation (Jussim and Harber, 2005; Rosenthal and Jacobson, 1968).
Third, instructional quality might be lower in HPSs due to a higher concentration of disruptive classroom behavior if teachers are not prepared to facilitate the involvement of students from disadvantaged social background or if the choice of curriculum and the difficulty of instruction is not appropriate for these students (Banks, 2016). The more time needs to be allocated to disciplining students, the less time can be allocated to teaching the curriculum (Lazear, 2001; Triventi et al., 2021).
Fourth, peers can affect each other’s achievement through various channels. High-achieving peers are better able to help each other in mastering the curriculum, can act as role models for their peers, and transmit values and behaviors that favor education (Brännström, 2008; Palardy, 2013; Seuring et al., 2020). In contrast, a concentration of low-achieving peers might contribute to the formation of anti-achievement norms and an oppositional culture against education (Agirdag et al., 2012; Fordham and Ogbu, 1986; Kruse and Kroneberg, 2022; Willis, 1977). Furthermore, peer effects might not only operate within the school; the wider social context might also influence students’ educational outcomes since HPSs are usually located in high-poverty neighborhoods (Chetty et al., 2016; Chetty and Hendren, 2018; Rich and Owens, 2023). Prior empirical research from Hungary also documented peer effects on student achievement (Horn, 2013; Keller and Takács, 2019; Schiltz et al., 2019). As the share of high-achieving students is lower in HPSs, these peer effects might have adverse consequences on the educational outcomes of students attending these schools.
Lower instructional quality and peer effects in HPSs have a straightforward negative effect on student achievement. Moreover, these may also play a part in the secondary effect on attainment. First, lower teacher quality in these schools can negatively affect students’ noncognitive skills (Blazar and Kraft, 2017; Flèche, 2017; Jackson, 2012), which are important determinants of educational attainment (Heckman and Rubinstein, 2001). Second, students attending HPSs might see fewer peers who succeed academically or face less favorable teacher expectations. As a result, they might make less ambitious educational choices than their achievement would allow because they might perceive the chances of success to be lower (Morgan, 2012).
It is important to note that many of the results on HPS effects should be interpreted as strong suggestive evidence. These estimates are prone to selection bias: family background does not only influence how students are sorted across schools but also has a separate effect on students’ educational trajectories. Therefore, differences in student achievement between schools might reflect not only the effect of school characteristics but also that of unobserved student characteristics. Many previous studies could not overcome the limitations arising from this selection bias (Van Ewijk and Sleegers, 2010). Lauen and Gaddis (2013) have shown that methods that take into account selection by using more stringent identification strategies provide much smaller estimates than multilevel regression models that have been widely used in educational and sociological studies.
The Hungarian educational context from a comparative perspective
In contrast to the Anglo-Saxon context and similar to many other European countries, the Hungarian education system is characterized by a highly stratified between-school tracking system at the upper secondary level. For most students, general education lasts for 8 years (Grades 1–8), which is longer than in the most stratified systems such as Germany or Austria, but some general secondary schools provide 6- or 8-year-long programs, starting already in Grades 7 and 5, respectively. These highly selective programs attract the highest-achieving students from general schools (Horn, 2013; Schiltz et al., 2019).
In Hungary, education is now compulsory until the age of 16, but for the cohorts analyzed here, it was compulsory until the age of 18. After finishing general school, students are required to continue their studies at the upper secondary level. Upper secondary education consists of three different tracks: (1) General secondary school (4–5 years) is the academic track ending with a final exam that is used in university admission; (2) Vocational secondary school (4–5 years) ends with a final exam but provides vocational training as well; and (3) Vocational schools (3 years) offer vocational training and general education with a limited scope, with no access to tertiary education.
Students and parents can freely choose secondary schools to apply for, but admission to higher-prestige schools and tracks depends on an admission exam on one hand and students’ academic achievement in the past 2 years of the general school on the other. Merit-based selection into secondary schools generates a strong competitive pressure and increases the importance of perceived general school quality for students and parents.
Another key feature of Hungarian general education is the mixture of residence-based catchment areas and free school choice. That is, general schools are required to enroll all students living in their catchment area. Parents, however, are allowed to choose a different school for their children outside the catchment area of their residence. As parents perceive the differences in school quality and the consequences of school choice to be large (Berényi et al., 2008), commuting is quite common: about one-third of the cohorts included in our study commuted to a general school outside of the catchment area of their residence. Commuting has a strong association with socio-economic status: students from more affluent families are more likely to attend a different school than the designated one. Schools are allowed to enroll children from other catchment areas, provided there are free places after enrolling children from their designated admission area. Officially, schools are not allowed to base their enrollment decisions on students’ socio-economic status. If the number of applicants is too high, schools should randomly select, though this rarely happens in practice. Qualitative studies have shown that schools have different practices to select among students. For instance, though general schools are not allowed to organize entrance exams, they can still do so if they launch classrooms with special curricula. In these cases, enrollment can be based on students’ abilities that are correlated with socio-economic status. Schools can also discourage low socio-economic status students from applying by emphasizing the high academic standards and strict requirements of the school (Berényi et al., 2008).
Altogether, early selection, the structure of the three secondary school tracks, and merit-based admission in Hungary are akin to German-type school systems. At the same time, free school choice is a feature more prominent in some English-speaking and Scandinavian countries.
Comparative studies have found that educational inequalities are larger in countries with highly stratified education systems (Van de Werfhorst and Mijs, 2010). Moreover, choice-driven education systems, where parents have higher involvement in tracking decisions, are associated with a higher level of inequalities (Checchi and Flabbi, 2007; Gross et al., 2016; Stadelmann-Steffen, 2012). Parental decisions in track choices also leave more room for secondary effects to arise (Hadjar and Becker, 2016). These factors might explain why the relationship between students’ socio-economic background and academic performance is particularly strong in Hungary compared with other Organization for Economic Co-operation and Development (OECD) countries (Schleicher, 2019). Furthermore, due to residential segregation, free school choice at the primary level, and the highly stratified tracked system at the secondary level, school segregation by social status in Hungary is among the highest in Europe. This holds both at the age of 15, when most students are already in upper secondary school in Hungary (Holmlund and Öckert, 2021; Jenkins et al., 2008) and in earlier grades (Csapó et al., 2008; European Union Agency for Fundamental Rights (FRA), 2016). Socio-economic and ethnic segregation are highly correlated and have increased in the past decades (Hajdu et al., 2021, 2022; Kertesi and Kézdi, 2012). In this macro-level educational context, we expect that attending a high-poverty general school has a negative effect on students’ academic achievement and later educational attainment compared with attending a low-poverty school.
Data and variables
Data
We use a unique panel of linked administrative data (Admin3) compiled in 2019 by the Databank of the Centre for Economic and Regional Studies that contains anonymized individual-level labor market, education, and health data for the 2003–2017 period and covers 50 percent of the Hungarian population in 2003 (Sebők, 2019). Education data are available from 2008. The data contain individuals’ standardized test scores from multiple time points (6th grade, 8th grade, 10th grade), secondary school completion and higher education entry, as well as various information on family background, place of residence, and school characteristics. Data on test scores and family background come from the annually administered National Assessment of Basic Competencies (NABC). The NABC is a standardized, low-stakes blind test similar to PISA, measuring reading literacy and mathematics skills for the full population of 6th, 8th, and 10th-grade students in the country. Besides completing the test, students are also asked to complete a questionnaire with their parents, focusing on socio-economic background and cultural resources.
Sample
There are three cohorts in the Admin3 dataset for which both 8th-grade NABC data and later educational outcomes are available (
Some further restrictions on the sample have been made. First, students attending 6- and 8-year-long secondary schools are excluded from the analysis because we lack information on the characteristics of their general schools (
Variables
Outcome variables
We have five different outcome variables.
School segregation
Our key independent variable is the high-poverty status of the school. Schools’ poverty status is defined with respect to the share of disadvantaged students in the school. Socio-economically disadvantaged status is defined by the law: it indicates whether families are entitled to regular child protection allowance. Families are entitled to child protection allowance if at least two of the following conditions hold: (1) low educational level of the caregivers; (2) poor employment situation of the caregivers; and (3) inadequate living conditions. The number of students classified in this category is reported by the schools and is available in school statistics each year for each grade. For each school, we calculate the mean of the share of socio-economically disadvantaged students reported in 2008–2010; therefore, school status does not vary across the three cohorts.
We define two groups of HPSs, based on 15 percent and 35 percent cut-off values. Schools with less than 15 percent of disadvantaged students are defined as low-poverty schools. We classify schools with at least 15 percent and a maximum of 35 percent of disadvantaged students as HPSs and schools with more than 35 percent of disadvantaged students as extreme-poverty schools (EPS). While a substantial share of low-status students is enrolled in schools with a high poverty rate, the vast majority of high-status students are enrolled in low-poverty schools (see Supplemental Table S1).
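The classification rule can be written down compactly. The following is a minimal sketch, assuming that shares are expressed in percent and that the school-level measure is the simple mean of the yearly shares; the function names and the example figures are hypothetical and are not taken from the data.

```python
def mean_share(yearly_shares):
    """Mean share of disadvantaged students (in percent) over the reported years."""
    return sum(yearly_shares) / len(yearly_shares)


def classify_school(share_disadvantaged):
    """Assign a poverty category using the 15 and 35 percent cut-off values."""
    if share_disadvantaged < 15:
        return "low-poverty"
    if share_disadvantaged <= 35:
        return "HPS"
    return "EPS"


# Hypothetical yearly shares for three schools (2008, 2009, 2010).
example_schools = {
    "school_a": [10.0, 12.0, 14.0],
    "school_b": [20.0, 25.0, 30.0],
    "school_c": [40.0, 50.0, 60.0],
}
categories = {
    name: classify_school(mean_share(shares))
    for name, shares in example_schools.items()
}
```

Because the category is derived from the multi-year mean, a school keeps the same status for all three cohorts, as described above.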
Table 1 shows the number of students and schools belonging to the low-, high-, and extreme-poverty categories in the total and the analytical sample. Overall, 22 percent of students study in HPS and EPSs in the analytical sample, while in the total sample, this share is 20 percent. This is in line with the definition of Lauen and Gaddis (2013) where the top quartile of classroom poverty distribution was defined as high-poverty classrooms. Another anchor for these cut-off values is provided by qualitative research on the segregation of Roma students in Hungary (Ercse, 2018; Havas and Liskó, 2005). Ethnic and social segregation are closely related. Figure 1 suggests a linear relationship between the share of disadvantaged and Roma students. This is a strong association; the school-level correlation between these measures is 0.79 in our sample. Research on ethnic segregation usually considers a school segregated if the Roma share exceeds 25–30 percent and severely segregated if the Roma share is above 50 percent (Ercse, 2018; Havas and Liskó, 2005). As Figure 1 shows, the 15 percent and 35 percent cut-off values based on the share of disadvantaged students roughly match this classification.
Distribution of students across low-, high-, and extreme-poverty schools in the total sample and by municipality type.

The share of disadvantaged students and the estimated share of Roma students in percentiles of schools according to the Roma share (%).
Table 1 also shows massive differences by municipality size. More than half of the students living in villages are enrolled in HPS or EPSs, while this share is only about 20 percent and 5 percent in smaller and larger towns, respectively.
Table 2 shows descriptive statistics of the outcome variables by the three groups of schools and municipality size. Both the achievement and the educational attainment of HPS and EPS students lag far behind what we can observe in low-poverty schools.
Educational attainment and achievement in low-, high-, and extreme-poverty schools.
SD: standard deviation.
Supplemental Tables S2 to S4 display how sample selection affected low-poverty, HPS, and EPSs. Supplemental Table S2 shows the share of students excluded for various reasons and those belonging to the analytical sample for the three groups. The largest excluded group is students in an extended (6- or 8-year-long) academic secondary school track. These secondary schools are low-poverty schools, while the poverty status of the general school these students attended previously is not observed. The second most important source of exclusion is the lack of test scores. This group is substantially larger in HPS and EPSs. The lack of information on family background is also important, but the shares in the three groups are similar.
Supplemental Table S3 compares the attainment outcomes in the samples with and without restrictions due to nonresponse. In the analytical sample, differences in secondary schooling outcomes are somewhat muted: 2–3 percentage points smaller than in the unrestricted sample. The lower share of test-takers in HPS and EPSs and the smaller gaps in the secondary schooling outcomes suggest that sample selection might generate some bias in the results.
However, students not included in the analytical sample can be different in the three groups of schools. Supplemental Table S4 shows that the shares of SEN and disadvantaged students in the excluded group are disproportionally higher in HPS and EPSs. Therefore the differences in Supplemental Table S3 might be explained by student characteristics to some extent. Overall, the effect of sample selection on the results is ambiguous.
Conditioning variables
In the matching analysis, it is important to condition on the main confounding variables, which might be associated with the school type the students attend as well as their educational trajectories. Therefore, we use a rich set of student-level conditioning variables, most of which capture students’ socio-economic status and cultural resources. These include parental education (both mother’s and father’s); number of books at home; parents’ long-term unemployment; subjective affluence of the family; number of siblings; number of family members per room; number of bathrooms at home; number of cars; whether the student has own books; whether the student has own desk; whether the family receives regular child protection allowance; the disadvantaged status of the student; whether the student has special education needs status (e.g. autism spectrum disorder); and gender. We also condition on the type of municipality (villages, small towns with less than 30,000 inhabitants, large towns) to compare students who have access to similar schooling opportunities. Supplemental Table S5 of the Supplementary Material presents descriptive statistics of the conditioning variables in low-poverty, HPS, and EPSs.
Missing data imputation
In the case of item nonresponse, we use hot deck imputation to impute missing data (Andridge and Little, 2010). First, we define 40 strata altogether according to the size of the municipality where the student lives, the mean income in the municipality, and the share of disadvantaged students in the school. Mothers’ education is imputed with the value of a randomly selected student from the same stratum. Then, missing data in the other variables are imputed with values of a randomly selected student from the same stratum with the same value for the mother’s education. Outcome variables and test scores are not imputed.
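The two-stage logic of this procedure can be illustrated as follows. This is a minimal sketch under simplifying assumptions: records are plain dictionaries, missing values are coded as `None`, only observed (non-imputed) values serve as donors, and the variable names (`stratum`, `mother_edu`, `books`) are hypothetical stand-ins for the actual variables.

```python
import random


def hot_deck(records, var, cell_keys, rng):
    """Fill missing `var` with the value of a random donor from the same cell.

    A cell is defined by the values of `cell_keys`. Donor pools are built
    before any value is filled in, so imputed values are never reused.
    """
    donors = {}
    for rec in records:
        if rec[var] is not None:
            cell = tuple(rec[k] for k in cell_keys)
            donors.setdefault(cell, []).append(rec[var])
    for rec in records:
        if rec[var] is None:
            pool = donors.get(tuple(rec[k] for k in cell_keys))
            if pool:  # stays missing if the cell has no donor
                rec[var] = rng.choice(pool)
    return records


rng = random.Random(0)
students = [
    {"stratum": 1, "mother_edu": "tertiary", "books": "200+"},
    {"stratum": 1, "mother_edu": None, "books": None},
    {"stratum": 2, "mother_edu": "primary", "books": "0-50"},
    {"stratum": 2, "mother_edu": "primary", "books": None},
]

# Stage 1: mother's education is imputed within strata.
hot_deck(students, "mother_edu", ["stratum"], rng)
# Stage 2: other variables are imputed within stratum and mother's education.
hot_deck(students, "books", ["stratum", "mother_edu"], rng)
```

Imputing mother's education first means that the second stage can condition on it for every student, which keeps imputed family-background values consistent with parental education.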
Analytical strategy
Many papers employ a multilevel regression framework to study the effect of attending an HPS on educational outcomes (for a review, see Van Ewijk and Sleegers, 2010). Regression models that include student-level control variables can account for selection on observables, that is, for the possibility that observed student characteristics, such as socio-economic status, influence both the way students are sorted across schools and their later educational outcomes, irrespective of the school they attend. However, regression models cannot identify causal effects if selection is on unobserved characteristics.
Regression models provide less reliable results if students in HPSs (treatment group) are very different in terms of observed characteristics from students in low-poverty schools (control group). This is called the
A widely used approach to overcome these problems is matching, which ensures that characteristics in control and treatment groups are balanced (Imbens, 2015; for applications in the educational context, see, for instance, Becker et al., 2012; Belfi et al., 2016; Guill et al., 2017; Kainz and Pan, 2014). By assigning to each treated individual one or more control observations that are similar in observed characteristics, no parametric model is needed, and the extrapolation problem is avoided. Moreover, the matching approach provides a straightforward way to explore heterogeneity of the treatment effect.
Therefore, we use a matching approach to estimate the effect of being enrolled in an HPS or EPS as opposed to a low-poverty school on educational outcomes. It is important to note that matching approaches rely on the assumption that selection into treatment can be fully accounted for by observable characteristics (unconfoundedness). Selection bias cannot be eliminated insofar as selection is on unobservables, that is, if there are unobserved student characteristics (not fully correlated with observables) affecting both sorting and outcomes.
Matching
As the main model, we use nearest neighbor matching to estimate the effect of attending an HPS or EPS on educational outcomes. We create matched samples (separately for HPS and EPS students) consisting of pairs of students who attend different schools but have otherwise similar observable characteristics. This approach uses the observed outcome of the control student from a low-poverty school as the counterfactual outcome of the student in an HPS or EPS (treatment groups). The difference between the observed and counterfactual outcomes provides estimates of the treatment effects (attending a high- or extreme-poverty school): it shows how the outcomes of the treated students would have differed if they had attended a low-poverty school. However, these estimates are likely to be biased if unobserved student and family characteristics (e.g. unobserved skills, motivation, educational attainment goals) play a role in selection into HPSs. Therefore, the estimated effects are close to the true causal effect of attending an HPS or EPS only insofar as the observed characteristics account for selection into these schools.
We estimate the average treatment effect on the treated (ATT), that is, the extent to which the achievement and attainment of HPS and EPS students would change if they studied in low-poverty schools. The average treatment effect (ATE) cannot be credibly estimated because high-status students hardly ever attend HPSs.
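In potential-outcomes notation, the two estimands can be written as follows. The notation is an assumption introduced here for illustration: \(D_i = 1\) denotes attending an HPS (or EPS), \(D_i = 0\) denotes attending a low-poverty school, and \(Y_i(1)\) and \(Y_i(0)\) are the outcomes student \(i\) would have under each of the two conditions.

```latex
\tau_{\mathrm{ATT}} = \mathbb{E}\bigl[\, Y_i(1) - Y_i(0) \mid D_i = 1 \,\bigr],
\qquad
\tau_{\mathrm{ATE}} = \mathbb{E}\bigl[\, Y_i(1) - Y_i(0) \,\bigr]
```

The ATT averages only over students who actually attend an HPS or EPS, so it requires comparable low-poverty-school students for the treated group only, whereas the ATE would also require comparable HPS students for every high-status student.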
The first part of the analysis consists of the following steps. First, we estimate propensity scores (Rosenbaum and Rubin, 1983) that reflect the individual students’ probabilities of attending an HPS or EPS. Different propensity scores are estimated for HPSs and EPSs, using logistic regression. In this logistic regression, the dependent variable captures whether the student attends an HPS (EPS) or not. As explanatory variables, we use a rich set of student-level variables such as gender, disadvantaged and special educational needs status, and several family characteristics capturing students’ economic and cultural resources, such as parental education, number of books at home, parents’ long-term unemployment, subjective affluence of the family, and others (see Supplemental Table S5 of the Supplementary Material). The propensity scores are the predicted probabilities calculated based on these logistic regression models. The propensity scores are estimated separately in the case of the three different groups of municipalities (villages, small and large towns).
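The propensity score step can be illustrated with a minimal sketch. The toy data, the two covariates, and the hand-written gradient-ascent fit below are illustrative assumptions, not the authors' specification, which uses a much richer covariate set and separate models by type of municipality.

```python
import math

def predict_proba(beta, x):
    """Logistic prediction P(treated = 1 | x) for beta = [intercept, b1, ...]."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, w, n_iter=5000, lr=0.5):
    """Fit a logistic regression by plain gradient ascent on the mean log-likelihood."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for x, wi in zip(X, w):
            err = wi - predict_proba(beta, x)
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * x[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical toy data: columns are [disadvantaged, mother_low_education]
X = [[1, 1], [1, 1], [1, 1], [1, 0], [1, 0],
     [0, 1], [0, 1], [0, 0], [0, 0], [0, 0]]
w = [1, 1, 0, 1, 0, 1, 0, 0, 0, 1]  # 1 = attends an HPS, 0 = low-poverty school

beta = fit_logistic(X, w)
# The propensity score is the predicted probability of treatment for each student
pscores = [predict_proba(beta, x) for x in X]
```

In this toy sample, students with both risk factors receive a higher estimated propensity score than students with neither, which is the ordering the matching step later relies on.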
Second, we assess the overlap in the covariate distributions across the treatment and control groups. We trim the sample at extreme values of the propensity scores where common support is lacking (Imbens, 2015), that is, where there are few comparable control students.
Third, we combine exact matching on students’ gender, type of municipality, disadvantaged status, and mothers’ highest education with nearest neighbor matching on the estimated propensity scores. We assign a single control student to each treated observation, with replacement. That is, each HPS (EPS) student is compared with a control student of the same gender, type of municipality, disadvantaged status, and maternal education who is most similar in the estimated probability of attending an HPS (EPS). For every outcome, we calculate the ATT separately for HPSs and EPSs using the following matching estimator (Abadie and Imbens, 2006), written here in standard notation:
\[ \hat{\tau}_{ATT} = \frac{1}{N_T} \sum_{i:\, W_i = 1} \left( Y_i - Y_{j(i)} \right) \]
where N_T is the number of treated students, W_i = 1 indicates that student i attends an HPS (EPS), Y_i is the observed outcome of treated student i, and Y_{j(i)} is the observed outcome of the control student j(i) matched to student i.
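A minimal sketch of the combined exact and nearest neighbor matching may help fix ideas. The field names (`gender`, `pscore`, `reading`), the toy values, and the choice to skip treated students without any exact-match control are assumptions made for illustration only.

```python
def nearest_neighbor_att(treated, controls, exact_keys, outcome):
    """ATT from one-to-one matching with replacement.

    Each treated student is paired with the control student who shares the
    same values on exact_keys and has the closest propensity score.
    """
    diffs = []
    for t in treated:
        pool = [c for c in controls if all(c[k] == t[k] for k in exact_keys)]
        if not pool:
            continue  # assumption: treated students without an exact-match control are dropped
        match = min(pool, key=lambda c: abs(c["pscore"] - t["pscore"]))
        diffs.append(t[outcome] - match[outcome])
    return sum(diffs) / len(diffs)

# Hypothetical toy records (standardized reading scores)
treated = [
    {"gender": "F", "pscore": 0.60, "reading": -0.30},
    {"gender": "M", "pscore": 0.40, "reading": -0.10},
]
controls = [
    {"gender": "F", "pscore": 0.55, "reading": -0.10},
    {"gender": "F", "pscore": 0.20, "reading": 0.40},
    {"gender": "M", "pscore": 0.45, "reading": 0.10},
    {"gender": "M", "pscore": 0.41, "reading": 0.00},
]

att = nearest_neighbor_att(treated, controls, ["gender"], "reading")
```

Because matching is done with replacement, the same control student may serve as the counterfactual for several treated students.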
Besides calculating the simple matching estimator, we also use bias correction in subsequent analyses. If matching is not exact, differences might remain in some covariates within the matched pairs. Bias correction removes potential biases remaining after the matching by adjusting the differences within the matches using linear regression (for more details, see Abadie and Imbens, 2011). Bias correction is performed based on the covariates used for the matching.
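The idea of bias correction can be sketched with a single matching covariate. The one-covariate linear regression fitted on control students, the variable names, and the toy numbers are simplifying assumptions; the article adjusts on all covariates used for matching.

```python
def ols_slope(xs, ys):
    """Slope of a simple linear regression of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def bias_corrected_att(pairs, slope):
    """Average of matched differences, each adjusted for the covariate gap.

    pairs holds (y_treated, x_treated, y_control, x_control). The control
    outcome is shifted by slope * (x_treated - x_control), the predicted
    change in the control outcome due to the remaining covariate difference.
    """
    adjusted = [(yt - yc) - slope * (xt - xc) for yt, xt, yc, xc in pairs]
    return sum(adjusted) / len(adjusted)

# Hypothetical control sample lying on the line y = 1 - 2x
control_x = [0.50, 0.25, 0.00]
control_y = [0.00, 0.50, 1.00]
slope = ols_slope(control_x, control_y)

# Hypothetical matched pairs with small remaining covariate gaps
pairs = [(-0.5, 0.60, 0.0, 0.50), (0.1, 0.30, 0.5, 0.25)]
raw_att = sum(yt - yc for yt, _, yc, _ in pairs) / len(pairs)
corrected_att = bias_corrected_att(pairs, slope)
```

In this toy case the raw matched difference overstates the gap, because treated students have slightly higher covariate values than their matches and the outcome falls with the covariate among controls.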
We also estimated the effects taking into account regional differences. Here, we include counties in the set of covariates for exact matching, that is, assign control observations to treated students living in the same county.
In the second part of the analysis, we estimate the effect on the attainment outcomes conditioning on test scores to investigate whether attending an HPS or EPS has a direct association with attainment outcomes over and above the indirect association via achievement. Here, we provide suggestive evidence, as our data do not allow the identification of causal mediation effects (on the assumptions of causal mediation analysis, see Imai et al., 2010). Similar to the main analysis, we use a combination of exact matching and nearest neighbor matching. Here, nearest neighbors are chosen by taking into account both the estimated propensity scores and the math and reading scores in Grade 8.
After estimating the treatment effect for the full sample of students in HPS and EPSs, we explore the heterogeneity of the effect in the subsamples. We estimate treatment effects for urban and rural students and for students with a relatively high and low propensity to attend these schools.
Assessing the assumption of selection on observables
Whether we can consider matching results as good estimates of the true causal effect of attending an HPS or EPS hinges on the validity of the unconfoundedness assumption. This means that conditionally on observed characteristics, students in HPSs would perform similarly to control students in low-poverty schools if they also attended low-poverty schools. Unfortunately, this assumption cannot be tested directly.
Imbens (2015) suggests an indirect approach to assess the plausibility of this key assumption. He suggests estimating the treatment effect on pseudo-outcomes, that is, outcomes that (1) are correlated with the outcomes of the analysis, and (2) were measured before the treatment occurred. If the unconfoundedness assumption holds, conditional on observed covariates, one should find no correlation between the treatment and pretreatment or pseudo-outcomes, as treatment cannot have a causal effect on pretreatment outcomes by definition. In contrast, a significant association between the treatment and pretreatment outcomes indicates selection into treatment on unobservable characteristics. This implies that the estimates of the effects on the educational outcomes are likely to be biased. Here we calculate the ATT for two pseudo-outcomes, years spent in kindergarten and students’ age when starting school. If unconfoundedness holds, attending an HPS or EPS should not have a significant effect on either of these pseudo-outcomes.
Propensity score distribution and matching quality
First, we summarize the results of the preliminary steps of matching and matching quality. Our matching model combines propensity score and exact matching. The propensity score is a condensed measure of several student and family background characteristics. Figure 2 shows the distribution of the propensity scores for attending an HPS and an EPS. As expected, students attending low-poverty schools have lower propensity scores on average than HPS and EPS students. If we examine the right tail of the distribution (Figure 2), it can be observed that there are few control students above the 0.8 and 0.9 thresholds in the case of HPS and EPSs, respectively. Therefore, to ensure common support, we trim the sample at these values of the propensity scores for the matching estimates.

Figure 2. Propensity score distributions: (a) high-poverty schools, (b) extreme-poverty schools, (c) high-poverty schools, right tail, and (d) extreme-poverty schools, right tail.
As a next step, we investigate whether covariates are balanced between the control and treatment groups after the matching and trimming. A covariate is balanced when its distribution is similar in the two groups (Austin, 2011; Imbens, 2015). Similarity is measured by standardized differences and variance ratios. The standardized differences should be close to 0, and the variance ratios should be close to 1. Supplemental Tables S6 and S7 of the Supplementary Material show the standardized differences and variance ratios for the raw data and the matched samples. In the matched samples, all standardized differences are close to zero, and all variance ratios are close to one. As a rule of thumb, standardized differences below 0.1 in absolute value indicate a good balance between the treated and matched samples (Austin, 2011). In our case, standardized differences for all covariates are below this threshold. Nevertheless, in our preferred specification, we employ bias correction to remove the small biases remaining after the matching (Abadie and Imbens, 2011).
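The two balance diagnostics can be computed as in the following sketch. It assumes the common definitions (difference in means divided by the pooled standard deviation, and the ratio of treated to control variances) and ignores the weights that matching with replacement would introduce; the covariate values are hypothetical.

```python
import math
import statistics

def standardized_difference(treated, control):
    """Difference in means scaled by the pooled standard deviation."""
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2
    )
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd

def variance_ratio(treated, control):
    """Ratio of the treated-group variance to the control-group variance."""
    return statistics.variance(treated) / statistics.variance(control)

# Hypothetical covariate (e.g. a books-at-home category) before matching
treated_x = [1.0, 2.0, 3.0]
control_x = [2.0, 3.0, 4.0]

std_diff = standardized_difference(treated_x, control_x)
var_ratio = variance_ratio(treated_x, control_x)

# Rule of thumb from the text: |standardized difference| below 0.1 indicates good balance
balanced = abs(std_diff) < 0.1
```

Here the groups differ by one full pooled standard deviation, so the covariate would be flagged as unbalanced even though the variance ratio equals one.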
Finally, Table 3 shows the mean share of disadvantaged students in the matched samples. As the thresholds on which schools are categorized as low-poverty, HPS, or EPSs are set arbitrarily, it is important to examine whether student composition in the control group is indeed different, that is, control students do not bunch in schools just below the 15 percent cut-off. Table 3 confirms that HPS and low-poverty schools have indeed different compositions in the matched sample. The mean share of disadvantaged students in the matched control groups is 6–7 percent, well below the 23 percent and 48 percent in HPS and EPSs. However, this is higher than the 4 percent overall mean in low-poverty schools.
Table 3. Mean share of disadvantaged students in the schools of treated and matched control students.
Results
Main results
Columns 1–4 in Table 4 present the main results of the analysis. The rows contain the five outcomes. Matching estimates in columns 1 and 2 do not explicitly take into account regional differences in matching, while estimates in columns 3 and 4 use matched samples generated within counties. Estimates in columns 2 and 4 use bias adjustment to eliminate any remaining differences in covariates after matching. Column 4 is our preferred model, because this takes into account all important conditioning variables (including regional differences) and adjusts potential biases after matching.
Table 4. Matching estimates of the effect of high-poverty and extreme-poverty schools on educational outcomes.
The test score results show a uniform pattern in all estimates. Math scores of HPS and EPS students do not significantly differ from those of students attending low-poverty schools. At the same time, reading scores are significantly lower in HPS and EPSs in all model specifications. Students in HPSs perform 0.04 standard deviation below comparable students in low-poverty schools in the reading test, while EPSs incur a substantial 0.17 standard deviation loss (column 4).
Regarding educational attainment outcomes, there is a marked difference between matching estimates within counties and ignoring counties. If we use matching without taking into account counties, attending an HPS appears to have zero effect on completing upper secondary education and a weak positive effect on both obtaining the secondary school leaving certificate and enrollment in higher education. EPSs have a weak negative, positive, and zero effect on the three outcomes, respectively. At the same time, matching within counties suggests marked negative effects in the case of secondary school completion and school leaving certificate and a substantial but imprecise negative estimate for admission to higher education. Thus, each estimated treatment effect is negative when matching within counties.
Matching estimates ignoring counties provide counterintuitive results. It is possible that families differ in unobserved characteristics across counties, even after accounting for observable characteristics. Another explanation is that schooling opportunities are different for similar students in different counties, depending on both secondary school supply and the degree of poverty. In less developed counties, where the majority of HPS and EPS students live, the relative status of a student can be very different from the relative status of a control student in a rich county, even though their family background variables have identical values. That is, in less developed counties, where the share of socio-economically disadvantaged students is higher, students with low socio-economic status (SES) might have higher chances of being admitted to a higher secondary school track, because they mostly compete with other low-SES students. In contrast, in more developed counties, low-SES students compete with high-SES students for secondary school places.
We consider matching within counties as the preferred model. These estimates show that studying in an HPS involves a similar 2–3 percentage point decrease in the probabilities of both completing secondary education and obtaining the school leaving certificate. Compared with the baseline probabilities of 0.78 and 0.58 in HPSs (Table 2), these effects are modest. A similar student in a low-poverty school has 3 percent and 5 percent higher probabilities of completing secondary education and obtaining the school leaving certificate, respectively, than the average HPS student. Attending an EPS goes together with higher losses. The estimated effect is a 5 percentage point decrease in the case of both outcomes, while baseline probabilities are 0.64 and 0.42 in EPSs. This implies that a similar student in a low-poverty school has 8 percent and 11 percent higher probabilities of completing secondary education and obtaining the school leaving certificate, respectively, than the average EPS student.
Regarding enrollment in higher education, the preferred matching estimate is a marginally significant 1 percentage point decrease in the case of HPSs and a less precise but similar-sized decrease in the case of EPSs. Compared with the baseline probabilities of higher education enrollment in HPS and EPSs (0.22 and 0.13, respectively), the estimates are similar in magnitude to the estimates for obtaining the secondary school leaving certificate.
Altogether, bias adjustment has only a small, almost negligible impact on the estimates, whereas whether county of residence is taken into account makes a large difference. Attending an HPS or EPS involves lower reading scores but no difference in math, a lower probability of success in upper secondary education on both of its measures, and an imprecise but substantial negative estimate for the progression to higher education. Each effect is more pronounced in the case of EPSs.
For robustness checks, we first estimated multilevel regression models and then re-estimated the matching models using three nearest neighbors instead of one. The results are presented in Supplemental Tables S8 and S9 in the Supplementary Material, respectively. Both analyses provide results remarkably similar to those of the main analysis. The main difference is that regression models and matching using three nearest neighbors provide larger and statistically significant estimates for attending an HPS or EPS on higher education enrollment.
Assessing effect size
Which outcome is affected most by attending an HPS or EPS? The estimated effects on test scores and attainment cannot be compared directly. Test score effects cannot be compared with average baseline achievement either, as test scores do not have a natural ratio scale. To assess the effect sizes, we compare the magnitude of the matching estimates for test scores and attainment outcomes with existing socio-economic gaps in achievement and attainment. Besides comparability, these relative effect sizes reveal the importance of segregation in social inequalities in education and therefore have direct policy relevance.
Table 5 reports relative effect sizes that are calculated as the matching estimates divided by the differences in baseline outcomes between two groups of students with lower and higher maternal education. Columns 1–2 present the matching estimates from Table 4. Column 3 shows the observed gaps in the outcomes between students having a mother with general education versus a mother with upper secondary education. Columns 4–5 compare the estimates with the baseline differences between students having a mother with general education versus a mother with upper secondary education. The fourth column shows, for instance, that the effect of attending an HPS on secondary school completion is 6.8 percent of the difference between the baseline outcomes of students having mothers with general versus upper secondary education. Similarly, the fifth column shows that the effect of attending an EPS on secondary school completion is 18.2 percent of the difference between the baseline outcomes of students having mothers with general versus upper secondary education. The results show that attending an EPS has higher negative effects on the outcomes than attending an HPS. Furthermore, considering the three attainment outcomes, the effects are largest in the case of secondary school completion and smallest in the case of admission to higher education. The effect on reading score is comparable in size to the effect on secondary school completion.
Table 5. Effect sizes for matching estimates compared with the differences between baseline outcomes of students having a mother with general education versus a mother with upper secondary education.
HPS: high-poverty schools; EPS: extreme-poverty schools.
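The relative effect sizes described above can be written compactly as follows. The symbols are introduced here for illustration and are not the article's own notation: \hat{\tau} is the matching estimate for an outcome, and the two means are the baseline outcomes of students whose mothers have general and upper secondary education, respectively.

```latex
\[
\text{relative effect size (\%)} \;=\; 100 \times
\frac{\hat{\tau}}{\bar{Y}_{\text{general}} - \bar{Y}_{\text{upper secondary}}}
\]
```

Both the matching estimate and the maternal education gap are negative for the attainment outcomes, so the ratio is positive; a value of 6.8 percent, for instance, means that the estimated school effect amounts to 6.8 percent of the gap.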
In the case of test scores, the size of the estimated effects can also be evaluated relative to the average yearly growth in test scores. We calculated the average yearly test score growth between Grade 6 and Grade 8 for Grade 6 students between 2008 and 2012. The estimated effects of attending an HPS or EPS on reading, converted to raw, unstandardized test scores, are about −8 and −33 points, respectively. Average yearly reading test score growth is 40 points. By this simple back-of-the-envelope calculation, students in HPSs and EPSs lose about one-fifth and four-fifths of a year’s progress in literacy skills, respectively, over their studies in general school. However, note that this loss is accumulated over 8 years, while yearly growth can be different in lower grades. As coefficients for math scores are nonsignificant, we do not evaluate effect size for math achievement.
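The back-of-the-envelope calculation can be reproduced in a few lines. The numbers are those reported in the text; the variable and function names are illustrative.

```python
# Figures reported in the text (raw reading test score points)
YEARLY_READING_GROWTH = 40.0
effect_points = {"HPS": -8.0, "EPS": -33.0}

def years_of_progress_lost(effect, yearly_growth):
    """Express a test score loss as a fraction of one year's average growth."""
    return -effect / yearly_growth

loss_in_years = {
    school: years_of_progress_lost(effect, YEARLY_READING_GROWTH)
    for school, effect in effect_points.items()
}
# HPS: 8 / 40 = 0.2 of a year; EPS: 33 / 40 = 0.825 of a year
```

The EPS figure of 0.825 is what the text rounds to about four-fifths of a year.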
Matching estimates for educational attainment conditioning on test scores
Attending an HPS or EPS can have a negative effect on later educational outcomes directly or indirectly by decreasing test scores. To investigate the direct association with attainment outcomes over and above the indirect association via test scores, we present matching estimates in which we also condition on test scores. Similar to the main analysis, we use a combination of exact matching and nearest neighbor matching. Nearest neighbor matching is carried out by taking into account not only the estimated propensity scores, but also math and reading scores in Grade 8.
Column 5 in Table 4 presents the matching estimates for educational attainment conditioning on students’ test scores. With some exceptions, the estimates are very similar to the ones without conditioning on test scores. This suggests that attending an HPS or EPS has a direct effect on educational attainment that is not transmitted via students’ lower academic achievement. One exception is the effect of attending an EPS on obtaining the secondary school leaving certificate. This effect is substantially smaller and no longer significant after conditioning on test scores.
Heterogeneity analysis
We conducted heterogeneity analyses by the type of settlement (town vs village) and students’ propensity scores (below and above the median). The median propensity scores were calculated based on the propensity scores of the treated students, separately for those attending HPS and EPSs. Students with above-median propensity scores are typical students attending HPS or EPSs, whereas students with below-median propensity scores are more similar in their characteristics to the typical students studying in low-poverty schools. Table 6 presents the matching estimates of the subgroup analysis. We present the results of our preferred specification, with exact matching based on county and bias adjustment.
Table 6. Matching estimates for urban and rural students and for students with high and low propensity scores.
PS: propensity scores.
The results show that attending an HPS or EPS is associated with a lower likelihood of secondary school completion and obtaining the secondary school leaving certificate in every subgroup. However, admission to higher education is significantly associated with attending an HPS or EPS only in towns and among students with below-median propensity scores.
With regard to test scores, attending an EPS is negatively related to math scores in towns. The estimates for reading scores are negative in all cases but not significant in the case of HPSs in towns and among students with a high propensity score.
Assessing the unconfoundedness assumption
We estimated ATTs of attending an HPS or EPS on two pretreatment outcomes, the years spent in kindergarten and the age of starting primary school, using our preferred matching method (matching within counties, with bias adjustment). Table 7 shows the results. Both HPS and EPS students spent significantly less time in kindergarten than matched control students. In addition, the age of starting primary school is significantly related to attending an HPS. This raises doubts about the validity of the unconfoundedness assumption and suggests that matching estimates are prone to selection bias.
Table 7. Matching estimates of the effect of high-poverty and extreme-poverty schools on pseudo-outcomes.
Conclusion and discussion
In a country with a high level of stratification and school segregation, we have analyzed whether students attending an HPS or EPS in Grade 8 have lower educational achievement and attainment outcomes than students attending low-poverty schools.
Our first set of results shows that attending an HPS or EPS is associated with lower reading scores, a lower likelihood of completing secondary education, and a lower likelihood of obtaining the secondary school leaving certificate. Estimates are negative but imprecise in the case of higher education enrollment in the main model, but they are statistically significant in the robustness checks. These findings are in line with previous studies conducted in various Western countries that have found a negative association between socio-economic school composition and student achievement (Langenkamp and Carbonaro, 2018; Palardy, 2013; Van Ewijk and Sleegers, 2010). We advanced this research by showing that attending a high- or extreme-poverty general school is also associated with long-term educational attainment outcomes. At the same time, attending an HPS or EPS is not significantly associated with math scores in the Hungarian context.
We have found that attending an EPS has larger negative effects on educational outcomes than attending an HPS. Estimated effects on secondary school attainment outcomes are about twice as large in EPSs as in HPSs. Regarding reading test scores, students in HPSs perform 0.04 standard deviation below comparable students in low-poverty schools, while EPSs incur a substantial 0.17 standard deviation loss.
Considering the three attainment outcomes, the effects are largest in the case of secondary school completion and smallest in the case of admission to higher education. The effect on reading score is comparable in size to the effect on secondary school completion.
Our second set of results provides suggestive evidence that attending an HPS or EPS has a large direct effect on educational attainment over and above influencing achievement. In fact, lower student achievement in Grade 8 in HPS or EPSs seems to play only a minor role in lower attainment later on. Different mechanisms might explain this finding. First, a strong direct effect is in line with previous studies showing that teacher quality influences students’ noncognitive skills (Blazar and Kraft, 2017; Flèche, 2017; Jackson, 2012), which then affect later achievement growth, educational attainment, and labor market outcomes (Heckman et al., 2006; Heckman and Rubinstein, 2001). Second, attending an HPS or EPS might have a direct effect on students’ educational choices by influencing the perceived opportunity structure (Breen and Goldthorpe, 1997; Morgan, 2012). If students in HPS or EPSs see fewer peers who succeed academically or face less favorable teacher expectations, they might perceive the chances of success to be lower. Therefore, they might leave the educational system earlier or choose academically less demanding tracks than they were able to pursue based on their abilities.
Finally, we have found heterogeneity in the estimated effects only in the case of admission to higher education. This is significantly associated with attending an HPS or EPS only in towns and among students less likely to attend an HPS or EPS. Students from rural areas and more disadvantaged families are less likely to attain tertiary education, probably irrespective of the characteristics of the general school they attend.
As we presented earlier, the Hungarian educational context is characterized by a high level of school segregation and stratification. As school effects and individual decisions are embedded in the macro-level education systems (Gross et al., 2016), the associations we found in Hungary might be different in less stratified or less segregated educational contexts. Nevertheless, previous international research suggests that achievement effects are similarly present in many other countries. Whether long-term attainment effects can be found in other educational contexts is an empirical question for future research. Similarly, further work is needed to identify the exact mechanisms of attainment effects.
Our study is not without limitations. First, in Hungary, socio-economic and ethnic school segregation are closely related. While we concentrated on the analysis of segregation based on social status, it is important to note that schools with a high share of socio-economically disadvantaged students also enroll a high share of Roma minority students. Although studies that included measures of ethnic and socio-economic segregation in the same regression model have usually found that schools’ socio-economic composition was a more important predictor of academic achievement than racial or ethnic composition (Agirdag et al., 2012; Ryabov and Van Hook, 2007; Sykes and Kuyper, 2013; Yancey and Saporito, 1995), the two effects are hard to disentangle meaningfully in our empirical setting due to the high correlation between them. Therefore, our estimates for the effect of schools’ socio-economic composition probably also reflect the effect of ethnic composition.
Second, indirect evidence suggests that school choice is influenced by student characteristics we cannot observe. Therefore, it is very likely that our estimates are biased upward. Although we have taken into account a rich set of observed student characteristics such as socio-economic circumstances and cultural resources, unobserved factors such as ability, aspirations, and motivations might influence both students’ school choices and educational outcomes. Thus, we interpret our estimates as an upper bound for the effect of attending an HPS or EPS. Taking into account that selection bias is likely to be present, the true causal effects are probably smaller. This finding has important methodological implications. Although we controlled for a richer set of observable background characteristics than many other sociological studies, the results suggest that we were not able to eliminate selection bias completely (see also Lauen and Gaddis, 2013). Future studies should aim to focus on a more precise estimation of the causal effect of attending an HPS.
Supplemental Material
Supplemental material, sj-docx-1-cos-10.1177_00207152231198434 for School segregation, student achievement, and educational attainment in Hungary by Zoltán Hermann and Dorottya Kisfalusi in International Journal of Comparative Sociology
Footnotes
Acknowledgements
The authors thank Dániel Horn, Márton Medgyesi, Péter Róbert, participants at the seminar series of the TÁRKI Social Research Institute, members of the PIONEERED project, and the editors and reviewers for their helpful suggestions to earlier drafts of the manuscript.
Disclaimer
The content provided in this article reflects the authors’ views only. Neither the Research Executive Agency (REA) nor the European Commission is responsible for any use that may be made of the information it contains.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The PIONEERED project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement no. 101004392. The authors acknowledge the support of the National Research, Development and Innovation Office – NKFIH grant no. K124989 to Z.H.; FK137765 to D.K.
Supplemental material
Supplemental material for this article is available online.
