Abstract
Racism, discriminatory practices, institutional bias, and systematic exclusion can take lasting physical form in the built environment. Prominent examples include the segregationist mortgage lending policies, including redlining, that contributed to White suburbanization and Black ghettoization (Faber 2020; Jackson 1985; Rothstein 2017), and the interstate highway construction and urban renewal projects that razed neighborhoods, displaced Black residents, and systematized the physical divisions between racial groups (Connerly 2002; Schindler 2015). Such practices have lasting power, in part, because they take physical form: rather than requiring shared understandings among actors or the legal enforcement of covenants, racism and inequality can become embedded in the urban infrastructure. Their consequences can persist for generations, resulting in long-standing inequalities and enduring patterns of racial residential segregation (Faber 2020; Rothstein 2017).
In this article, we seek to understand the relationship between residential segregation—the extent to which social groups reside in distinct places—and a key yet often overlooked feature of the built environment, road network connectivity. Connectivity is an important consideration in understanding segregation because two residential areas may be spatially proximate but not well connected by roads, and connectivity may be more important than mere proximity in explaining racial segregation patterns (Grannis 1998). Road disconnectivity and physical barriers, such as fences, walls, railroad tracks, highways, and dead-end streets, have been used as mechanisms to reinforce or exacerbate segregation by facilitating greater separation between ethnoracial groups in nearby areas (Jackson 1985; Mohl 2008; Schindler 2015; Sugrue 2005). Examples of this phenomenon include the selection of routes for interstate highways built during the 1950s and 1960s in cities such as Chicago, Atlanta, and Houston (Feagin 1988; Mohl 2008), as well as more recent instances of using dead-end streets, bollards, and fences in cities such as Baltimore and Detroit (Armborst, D’Oca, and Theodore 2015).
To study the relationship between segregation and (dis)connectivity, we develop a novel computational approach, the counterfactual road networks method, that conceptualizes roads as spatial networks and identifies missing road segments that we would expect to exist in a city’s road network given the surrounding infrastructure. We demonstrate the application of our approach with three analyses in five U.S. cities.
First, we analyze the relationship between residential segregation and road network connectivity at the city level. We find that the missing road segments we would most expect to exist are associated with the largest differences in segregation.
Second, we examine the racial composition of nearby areas that would be connected by missing road segments. We find that unexpected disconnectivity in a city’s road network is associated with greater differences in the racial composition of nearby areas. Road segments we would expect to exist given the surrounding infrastructure are more likely to be missing between areas with different racial compositions.
Third, we construct counterfactual road networks that include missing road segments, and we use the spatial proximity and connectivity (SPC) method (Roberto 2018) to compare the segregation measured using the observed and counterfactual road networks. We find that unexpected disconnectivity is associated with significantly higher levels of segregation in local areas of missing road segments and at the city level. The results suggest these highly likely but nonetheless missing road segments facilitate both social and spatial disconnection.
Our approach does not establish causality or imply the precedence of the road network, nor does it adjudicate which came first: residential segregation that shaped the structure of the road network, or road network connectivity that shaped where people live. We use this approach not to make recommendations about road construction, but to examine the significance of unexpected missingness. Our results underscore the power of the built environment and suggest that infrastructure decisions can have long-term social consequences. Overall, our approach enables further research to uncover the inequalities embedded in urban infrastructure and examine the consequences of racist infrastructure.
Background
Scholarship on the built environment and residential segregation has tended to focus on historical and contemporary policies and processes related to housing. Studies have examined how segregation has changed alongside historical processes of redlining (Faber 2020; Rothstein 2017) and suburbanization (Fischer 2008; Hayden 2003; Logan et al. 2023; Massey and Denton 1988b), as well as contemporary processes of gentrification (Ding, Hwang, and Divringi 2016; Freeman 2009; Hwang 2020; Hwang and Sampson 2014; Lees, Slater, and Wyly 2008; Smith 1979, 1987; Zukin 1987). In particular, there has been growing attention to the consequences of housing-related inequalities related to housing values, housing location, and mortgage lending on social and economic outcomes, including residential segregation (Faber 2018, 2019; Hwang, Hankinson, and Brown 2015; Hyra et al. 2013; Korver-Glenn 2022; Owens 2019; Rugh, Albright, and Massey 2015; Spielman and Harrison 2013). 1
This scholarship has generated insights on housing policies and processes and their lasting effects on segregation, but other aspects of the built environment remain understudied and undertheorized. In particular, few scholars have considered how roads or road networks are related to segregation (for notable exceptions, see Archer 2020; Bayor 1988; Grannis 1998, 2005; Korver-Glenn et al. 2024; Roberto 2018; Roberto and Korver-Glenn 2021). And when they are considered, studies often focus on particular streets or highway segments, rather than conceptualizing roads as a spatial network. We seek to contribute to this nascent area of scholarship by developing a computational approach for understanding the relationship between residential segregation and road network connectivity.
Road Connectivity and Residential Segregation
In urban policies and planning, roads are typically regarded as connectors that facilitate the movement of goods and people between locations. From this perspective, roads may seem to be neutral elements of the urban landscape. But as Harvey (1973:61) noted, changes in urban form and functions (including building roads and highways) are not simply “natural” manifestations of changing urban demands.
Road networks have important social dimensions. First, the placement of roads may provide better connectivity between some areas and residents than others. Second, disconnected roads, such as dead-end streets, can facilitate separation and division between nearby areas. Third, constructing or expanding roads in urban areas often requires land that is already in use and decisions about which residents and businesses to displace or protect. Fourth, once roads are constructed, their change or removal can be slow and costly and may require institutional action, hence becoming one of the most static elements of the built environment. Finally, roads can carry symbolic meaning, such as being a commercial destination or a well-known boundary between neighborhoods.
Roads should thus be regarded as both a structural element of the built environment and part of the social fabric of a city. Moreover, roads are an important consideration in understanding segregation patterns and processes. In what follows, we first describe how previous scholarship has considered these social dimensions of roads by conceptualizing them as boundaries (also referred to as edges or dividing lines) or as sites. We then argue for the need to conceptualize roads as networks and explain how attention to road networks and connectivity can contribute to our understanding of the relationship between segregation and the built environment.
Roads as Boundaries, Roads as Sites
Scholarship on residential segregation has considered roads in two distinct ways: roads as boundaries and roads as sites. In conceptualizing roads as boundaries, scholars have pointed out how roads can both connect and divide. More specifically, scholars highlight how highways have cut through neighborhoods and communities, displacing some groups and isolating others (Mohl 2008). In doing so, some highways form the “edges” of neighborhoods, including defining the physical border of “ghettos” and separating them from the rest of the city, further isolating populations that were already among the most marginalized (Wacquant and Wilson 1989). In some cities, such as Birmingham and Atlanta, highways were constructed to mirror the boundaries of racial zoning (Archer 2020).
Although highways created great opportunities for commuting and being “connected” to the city for suburban residents (mostly middle- and upper-class White families), some city residents recognized their detrimental effects on neighborhoods and communities, including creating separation between nearby areas. By the 1960s, residents in many cities began to organize and resist the construction of planned highway routes. Starting in San Francisco and spreading to other cities, these highway (or freeway) revolts affected future highway plans, including the paths of highways and the cancellation of some projects (Mohl 2008).
The construction of highways had other spatial and functional consequences. Not only did highways cut through the social fabric of cities and neighborhoods, they were also part of the complex process of suburbanization and the emergence of malls as the main site of consumer shopping (Hayden 2003). This process destabilized and sometimes completely destroyed commercial streets within cities, turning “soulful” streets at the center of urban life to soulless spaces (Wacquant and Wilson 1989).
In the following decades, in some cities, the disinvestment and neglect of inner-city neighborhoods were somewhat reversed through gentrification and urban programs that sought to reinvent cities. In this process, some streets emerged as sites of investments through which urban governance and private capital tried to redefine land use and social space. In these efforts, the emphasis was not on the vehicular speed and movement efficiency of roads. Instead, a mixed-use conceptualization of roads and a return to the street as the site of commercial activity, social life, and human connection were highlighted. Commercial streets became sites to implement urban policies that hoped to attract new residents, businesses, and tourism (Deener 2007; Zukin 2009). But although some commercial streets became social hubs, others became the symbolic boundaries of residential neighborhoods, dividing residents of different races and classes or separating gentrifiers from old-timers (Anderson 1990; Lloyd 2006; Rabin 1987).
Roads as Networks: A New Approach
Prior work has studied how roads become physical manifestations of the social boundaries between groups. Scholars have considered how roads carry symbolic meaning and influence residents’ perceptions and social relations (Korver-Glenn et al. 2024). Scholars have also studied how roads, as symbolic boundaries, shape residents’ interactions (Anderson 1990; Deener 2007) and how rates of crime or conflict change in areas of transition between neighborhoods (Kim and Hipp 2016; Legewie 2018; Legewie and Schaeffer 2016). Such studies conceptualize roads as both boundaries (e.g., the borders of neighborhoods) and sites (e.g., of crime or conflict). This work recognizes both the physical and symbolic nature of roads and their capacity to influence social relations. Both methodologically and empirically, these studies link the social and physical elements of urban environments.
However, whether on the scale of residential streets or highways, roads are part of a broader network. We argue that analyzing road networks offers insights that go beyond studying roads as boundaries or sites. It allows us to foreground social processes and institutions that create road network connectivity and the mechanisms through which patterns of connection and disconnection are reproduced or reconfigured over time. We reject assumptions that conceive of the road network as neutral or in the background, and we examine, theoretically and empirically, how road connectivity matters in patterns of segregation.
We highlight the importance of road networks as connectors by examining the connectivity provided by a city’s road network. We also consider how road segments that may be missing from the network facilitate disconnection between nearby areas. We show that reduced road network connectivity is associated with higher levels of segregation. Through a study of roads as networks, rather than boundaries or sites, this article expands our understanding of the separating and connecting power of roads.
In the following section, we introduce our computational approach for identifying road segments that one would expect to exist in a city’s road network, given the surrounding infrastructure, but are nonetheless missing. We then demonstrate the application of our approach by analyzing the relationship between residential segregation and road network connectivity at the city level. We continue with two local analyses: first, we compare the racial composition of nearby, disconnected areas where there is a missing road segment; we then compare the segregation measured using the observed road network and counterfactual road networks that include missing road segments.
Counterfactual Road Networks
One of the most common problems in network topology inference is that of edge propensity or link prediction, that is, which among the missing edges is the most probable to materialize given the current structure of the network (Liben-Nowell and Kleinberg 2007; Popescul and Ungar 2003). To answer this question, one must rely on some modeling assumption about the growth of the underlying network and the observation of the current state of the network. There is an interesting technical challenge in the fact that commonly used edge propensity scores (e.g., the Jaccard index or the Adamic and Adar score [Adamic and Adar 2003]) are more relevant for social networks than for urban road networks. Such metrics are rooted in underlying social mechanisms of edge formation, such as homophily (McPherson, Smith-Lovin, and Cook 2001) or the strategic bridging of “structural holes” between unconnected nodes (Burt 2009).
In the absence of existing methods, we developed a novel approach for estimating edge propensity in road networks: the counterfactual road networks (CRN) method. Our goal is to identify road segments (or edges) that, although missing from the network, one would expect to be present given the surrounding infrastructure. A key challenge in this process is that the space of all possible road network configurations is vast, and exhaustively exploring the space would be computationally intractable. Because network growth problems, particularly those involving connectivity optimization, are often NP-hard, 2 one must use heuristics and approximations to constrain the space of candidate road segments. Our approach leverages structural properties of the observed network to guide this selection, allowing us to construct a counterfactual representation of the road network on the basis of plausibility, rather than a combinatorial enumeration of all possibilities.
To achieve this, we first characterize the existing road network using a variety of features, namely, the maximum straight-line distance (i.e., the maximum distance between two intersections directly connected by a road), shadow angle threshold (i.e., the typical level of collinearity between connected nodes), and neighbor angle threshold (i.e., the typical angles in connected V-shaped motifs). We use these features, along with road classifications, to constrain the set of new edge “candidates” (i.e., road segments that are not currently in the network but could plausibly be added). The nodes (or vertices) in the network are the intersections or end points of roads. We treat the nodes as a fixed attribute of the network—we do not add or remove nodes from the network.
We assess the utility of each candidate edge in terms of how much connectivity it contributes to the network. We measure this as the reduction in the shortest path length between nodes that are nearby if the road segment is added to the network. 3 This provides a heuristic for identifying counterfactual edges that are not merely plausible but also functionally impactful. We use this utility as our metric of edge propensity with the assumption that edges with higher utility have a higher likelihood (propensity) of being included in the network, absent additional considerations. We validate this assumption for the cities under study by comparing the utility of existing edges against that of potential candidates that could replace them.
We define three networks for use throughout the following sections:
Proportion of Road Segments in the Network,
Local neighborhood road, rural road, or city street.
Service drive usually along a limited access highway.
Observed Road Network
We construct the observed road network data using publicly available geographic data provided in TIGER/Line shapefiles (U.S. Census Bureau 2012). We use the TIGER/Line shapefiles for “edges” to define the path of roads, where we restrict our analysis to edge entries that represent road segments. Each edge is assigned a permanent unique identifier (UID) by the Census Bureau. The two end points of an edge are called nodes, and each node also has a UID. A single node may be associated with multiple edges, such as a node that joins together two road segments. The data record for each edge includes the edge’s UID, the two node UIDs for its end points, and the classification code for the type of road feature (e.g., primary road, local road, alley). (For more information on the data construction, see Roberto 2018.)
Given a city of interest, we represent its road network as a graph
To rule out outliers and avoid the effect of a few road segments disconnected from the main networks in our analysis, we keep the largest connected component of
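As a minimal sketch of this preprocessing step, the following builds an undirected graph from edge records and keeps only its largest connected component. The record layout (edge UID, two node UIDs, classification code) follows the description above; the function names and the dictionary-based graph representation are illustrative assumptions, not the authors' implementation.

```python
from collections import deque

def build_graph(edge_records):
    """Build an undirected adjacency map from (edge_uid, node_u, node_v, class_code) records.

    The tuple layout is an assumption based on the fields described in the text.
    """
    adj = {}
    for edge_uid, u, v, class_code in edge_records:
        attrs = {"uid": edge_uid, "mtfcc": class_code}
        adj.setdefault(u, {})[v] = attrs
        adj.setdefault(v, {})[u] = attrs
    return adj

def largest_connected_component(adj):
    """Return the subgraph induced by the largest connected component (breadth-first search)."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in component:
                    component.add(nbr)
                    queue.append(nbr)
        seen |= component
        if len(component) > len(best):
            best = component
    return {u: dict(nbrs) for u, nbrs in adj.items() if u in best}
```

Because a connected component is closed under adjacency, every neighbor of a retained node is itself retained, so no edge filtering is needed after selecting the component.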
Computation of Road Network Features
We compute global descriptors of the road network under study to establish characteristics of roads that are already present in a city. Previous research has used a variety of features to characterize road networks (Boeing 2021, 2022; Jiang and Claramunt 2016; Knaap and Rey 2023; Louf and Barthelemy 2014). Such studies have typically classified types of street networks, developed measures of street network design, or created indicators for related concepts, such as sprawl. Our aim is to identify new edge candidates that do not drastically differ from existing roads. As such, we now propose characteristics of road networks that we will later use to create minimum criteria that new edges must satisfy to be considered edge candidates.
We first remove the primary road (S1100) and ramp (S1630) edge types from
Count of Edges and Nodes in the Road Networks.
We compute the following three global features of
Maximum straight-line distance
Fifth-percentile shadow angle α: For any three (distinct and ordered) nodes (
Fifth-percentile neighbor angle β: For any three nodes (

Illustration of different angles considered, centered on
The maximum distance
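To make two of these global features concrete, the sketch below computes the maximum straight-line length of an existing edge and a fifth-percentile angle over V-shaped motifs (two edges meeting at a shared node). It assumes planar node coordinates in consistent units and assumes that the neighbor angle is the angle at the shared node; the exact definitions, and the shadow angle, are not reproduced here. The linear-interpolation percentile rule is also an assumption.

```python
import math
from itertools import combinations

def max_edge_length(coords, edges):
    """Largest straight-line distance between two directly connected nodes."""
    return max(math.dist(coords[u], coords[v]) for u, v in edges)

def angle_at(coords, center, a, b):
    """Angle in degrees (0 to 180) at `center` between the rays toward `a` and `b`."""
    cx, cy = coords[center]
    ta = math.atan2(coords[a][1] - cy, coords[a][0] - cx)
    tb = math.atan2(coords[b][1] - cy, coords[b][0] - cx)
    ang = abs(math.degrees(ta - tb)) % 360.0
    return min(ang, 360.0 - ang)

def neighbor_angles(coords, edges):
    """Angles of all V-shaped motifs: pairs of edges that share a node (assumed definition)."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return [angle_at(coords, c, a, b)
            for c, ns in nbrs.items()
            for a, b in combinations(sorted(ns), 2)]

def percentile(values, q):
    """q-th percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (q / 100.0) * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])
```

Under these assumptions, `percentile(neighbor_angles(coords, edges), 5)` would play the role of the threshold β, and `max_edge_length(coords, edges)` the role of the maximum distance.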
Selection of Edge Candidates
The vast majority of disconnected pairs of nodes in
These rules ensure the selected edge candidates behave similarly to existing edges. The first condition requires at least one of the nodes to be associated with a secondary or local road. These road types are associated with residential streets and are the main sources of connectivity within small geographic areas. The second condition encodes the fact that, in general, we are unlikely to see a road between two nodes that are far from each other. Instead, we tend to observe several road segments connecting these two nodes via some intermediate nodes. For the third rule, if there exists a node
Computation of Edge Utility Scores
Our next objective is to assign a propensity score to each of the candidate edges identified in our previous step. We develop a measure of the propensity, or utility, of a particular edge candidate by computing the (normalized) change of the average shortest path length in the candidate’s local environment when the candidate edge is included in the network. The local environment of each edge includes its end points, as well as all nodes within a particular distance of the end points (here we use a distance of 0.5 km). We use
For every edge candidate
Step 1: Identify a subset of nodes (denoted by
Step 2: Construct the subgraph
Step 3: Add
Step 4: Compute the shortest path lengths between every pair of nodes in
Step 5: Add
Step 6: If any two nodes
Step 7: Compute the average shortest path length for
Step 8: The utility score of the candidate
This procedure quantifies the average reduction in the shortest path length in a local environment achieved by including the candidate edge, normalized by the length of the edge. These steps can also be followed to determine the utility of an existing edge
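The utility computation can be sketched as follows, under several stated assumptions: the graph is a dictionary mapping each node to its neighbors and road lengths in meters; the local environment is taken to be all nodes within 0.5 km of either end point, measured along the observed network; and node pairs that are unreachable within the local subgraph before the candidate is added are excluded from the average. The treatment of such pairs in the procedure above is not reproduced here, so this is one plausible reading rather than the authors' implementation. The function names are hypothetical.

```python
import heapq
from itertools import combinations

def shortest_paths(adj, source, cutoff=None):
    """Dijkstra distances from `source`; `adj` maps node -> {neighbor: length}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if cutoff is not None and nd > cutoff:
                continue
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def edge_utility(adj, u, v, length, reach=500.0):
    """Mean reduction in local shortest path length from adding candidate (u, v),
    divided by the candidate's length. Node labels must be sortable."""
    # Local environment: nodes within `reach` of either end point (assumed network distance).
    local = set(shortest_paths(adj, u, reach)) | set(shortest_paths(adj, v, reach))
    sub = {a: {b: w for b, w in adj.get(a, {}).items() if b in local} for a in local}
    before = {a: shortest_paths(sub, a) for a in local}
    # Assumption: only pairs already connected in the local subgraph are averaged.
    pairs = [(a, b) for a, b in combinations(sorted(local), 2) if b in before[a]]
    if not pairs:
        return 0.0
    aug = {a: dict(nbrs) for a, nbrs in sub.items()}
    aug[u][v] = aug[v][u] = length
    after = {a: shortest_paths(aug, a) for a in local}
    mean_before = sum(before[a][b] for a, b in pairs) / len(pairs)
    mean_after = sum(after[a][b] for a, b in pairs) / len(pairs)
    return (mean_before - mean_after) / length
```

For example, on a four-node chain a, b, c, d with 100 m segments, a 100 m candidate between a and d shortens only the a to d path (from 300 m to 100 m), so the mean over the six pairs falls by 200/6 m and the utility is about 0.33.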
Figure 2 provides an example of the network produced through the procedure described in the prior sections for the city of Hartford, Connecticut. Edge candidates are sorted by utility: higher utility edges are the top 20 percent of the distribution with a mean of 1.328 (

The road network of Hartford, Connecticut.
Identifying Compatible Edge Candidates
Observing the edge candidates identified in Figure 2, we see that several edge candidates share one terminal node. Thus, when augmenting the network, we might include one of these candidates but not all of them, as the realization of one of these edges might invalidate the others as plausible edge candidates. More formally, we say that two edge candidates are compatible with each other if either of these two cases holds: (1) they do not share any end points; or (2) they share one end point
With this notion of compatibility defined, we detail our procedure for growing a road network
Step 1: We denote by
Step 2: Add the edge candidate in
Step 3: Remove the edge candidates from
Step 4: Repeat steps 2 and 3 until
Put differently, in the above procedure we sort the edge candidates by utility and add them in an ordered fashion to
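The greedy selection described above can be sketched as below. Candidates are represented as hypothetical `(u, v, utility)` tuples, and the compatibility test is passed in as a function so that both compatibility cases can be supplied. The example predicate implements only case (1), no shared end points; the condition attached to case (2) is not reproduced here, so it is deliberately left out rather than guessed. Stopping when no candidates remain is also an assumption.

```python
def grow_network(candidates, compatible, max_edges=None):
    """Add candidates in descending utility order, dropping incompatible ones.

    candidates: list of (u, v, utility) tuples (hypothetical layout).
    compatible: function(added_candidate, other_candidate) -> bool.
    """
    remaining = sorted(candidates, key=lambda c: c[2], reverse=True)
    added = []
    while remaining and (max_edges is None or len(added) < max_edges):
        best = remaining.pop(0)           # highest-utility candidate still available
        added.append(best)
        # Keep only candidates that remain compatible with the edge just added.
        remaining = [c for c in remaining if compatible(best, c)]
    return added

def no_shared_endpoint(c1, c2):
    """Compatibility case (1) only: the two candidates share no end point."""
    return not ({c1[0], c1[1]} & {c2[0], c2[1]})
```

With candidates `[("a", "b", 3.0), ("b", "c", 2.0), ("d", "e", 1.0)]` and `no_shared_endpoint`, the second candidate is dropped because it shares node `b` with the first, and the result contains the first and third candidates.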
Comparison between Existing Edges and Edge Candidates
Before implementing the procedure described in the prior sections, we validate our choice of edge utility score. To do this, we consider a measure of edge propensity to be reasonable if existing edges tend to receive higher scores than competing edge candidates. (Edge candidates are considered to be competing with an existing edge if they are nearby and provide similar connectivity, as defined formally below.) In other words, we deem the chosen utility to be meaningful if it can be used to significantly differentiate between the actual edges that exist in the network and competing edge candidates. Thus, in what follows we compare the utility scores of existing edges and edge candidates to see whether they meet such a requirement.
A simple way to evaluate our measure of edge propensity is to compare the average utility score of existing edges (in
For a more detailed validation procedure, we perform the following experiment. We denote by
Figure 3 shows the results of this validation process for both selection schemes for the competing edge candidates. For every existing edge, we have a histogram of the size of the set of competing candidate edges (blue) and a histogram of the cases in which the existing edge has a larger utility than all its competing edge candidates (orange). A large fraction of existing edges do not have alternative candidate edges (

Utility validation experiment for Hartford, Connecticut: (a) candidate edge shares one end point with
Racial Residential Segregation in Observed and Counterfactual Road Networks
To demonstrate the application of our proposed method, we analyze segregation in five cities in the rust belt region of the United States: the former “industrial belt that extended from New England across New York, Pennsylvania, and West Virginia, through the Midwest to the banks of the Mississippi” (Sugrue 2005:6). Rust belt cities have well-documented histories of residential segregation, which reached particularly high levels during the mid-twentieth century, when these cities peaked in urban growth, and has been more resistant to change there than in other regions (Logan 2000). Concurrent with these legacies of segregation, policy and planning decisions that shaped the built environments of these cities, such as where to build interstate highways or public housing, were often made in response to racial tensions (Mohl 2008; Schindler 2015; Sugrue 2005). We selected five cities that capture a range of sizes: Philadelphia, Pennsylvania, and Baltimore, Maryland, are among the largest rust belt cities (population > 500,000); Cincinnati, Ohio, is a medium-sized city (population between 250,000 and 500,000); and Hartford, Connecticut, and Rochester, New York, are smaller cities (population between 100,000 and 250,000). (Table A4 provides a summary description of each city.)
We use the SPC method (Roberto 2018) to measure and analyze segregation. The SPC method incorporates spatial features of the built environment, including the road connectivity and physical barriers between locations, into the measurement of segregation. We use publicly available population data from the 2010 decennial census (U.S. Census Bureau 2011) and the TIGER/Line shapefiles for blocks and roads (U.S. Census Bureau 2012). 5 Following the SPC method, we use the permanent UIDs assigned by the Census Bureau to establish the relationships between roads (and their associated nodes) and blocks. We then distribute the aggregate population of a block by assigning a portion of the block population to each of the nodes associated with the block. Following this procedure, we estimate the population count and composition for each node in the road network.
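The allocation of block populations to nodes can be illustrated with the sketch below. It assumes, purely for illustration, that each block's population is divided equally among its associated nodes; the text states only that a portion is assigned to each node, so the actual apportionment rule may differ. The input layouts and function name are hypothetical.

```python
from collections import defaultdict

def node_populations(block_counts, block_nodes):
    """Distribute block-level group counts to nodes.

    block_counts: {block_id: {group: count}}
    block_nodes:  {block_id: [node_id, ...]}
    Assumption: an equal share of each block goes to each of its nodes.
    """
    pops = defaultdict(lambda: defaultdict(float))
    for block, counts in block_counts.items():
        nodes = block_nodes.get(block, [])
        if not nodes:
            continue  # block with no associated node: population is not allocated
        share = 1.0 / len(nodes)
        for node in nodes:
            for group, count in counts.items():
                pops[node][group] += count * share
    return {node: dict(groups) for node, groups in pops.items()}
```

A node shared by several blocks accumulates a share from each, and the total population is preserved whenever every block has at least one associated node.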
We measure distance by calculating the shortest path length (weighted by the length of each road segment) along the road network between all pairs of nodes. We use the road distance measure to construct local environments, or “egocentric neighborhoods” (Lee et al. 2008), around each node. 6
The reach of local environments (i.e., the distance in each direction from a given node that we denote by
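A minimal sketch of constructing a node's local environment from road distance follows. It assumes a dictionary graph with road lengths in meters, a reach of 500 m, and equal weighting of all nodes within reach; any distance-decay weighting used in the SPC method is not represented. The function names are hypothetical.

```python
import heapq

def nodes_within_reach(adj, source, reach):
    """Nodes whose shortest road distance from `source` is at most `reach`."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd <= reach and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return set(dist)

def local_composition(adj, node_pops, source, reach=500.0):
    """Group proportions of the population living within `reach` of `source`.

    Assumption: every node within reach counts equally (no distance decay).
    """
    totals = {}
    for node in nodes_within_reach(adj, source, reach):
        for group, count in node_pops.get(node, {}).items():
            totals[group] = totals.get(group, 0.0) + count
    total = sum(totals.values())
    return {g: c / total for g, c in totals.items()} if total > 0 else {}
```

Because distance is measured along the network, two nodes that are close in straight-line terms but poorly connected by roads can fall outside each other's local environment.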
To measure the level of segregation, we use the divergence index (Roberto 2024), which measures the difference between the population composition of each local environment and the city’s overall composition. The values of the divergence index represent how surprising the composition of a local environment is, given the overall population composition of the city. The divergence index equals zero, its minimum value, when there is no difference between the local and overall population composition; greater differences produce higher values and indicate a greater degree of segregation. Local values of the divergence index will reach their maximum value when the smallest group in a city is 100 percent of the local population.
The divergence index for node
where π
where
The divergence index measures the same concept of segregation as the dissimilarity index. Both indexes measure the evenness dimension of segregation (Massey and Denton 1988a) by comparing the residential distribution of groups to an even distribution in which groups are distributed proportionally across residential environments (for more details about the divergence index, see Roberto 2024). In the following sections, we focus on the segregation of Black, Hispanic, and White residents in each city. 9
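The properties described above (zero when local and overall compositions match, larger values for more surprising local compositions, and a maximum when the smallest group makes up the entire local population) are consistent with a relative-entropy form. The sketch below assumes that form, with natural logarithms and a population-weighted mean for the city-level value; these are assumptions for illustration and should be checked against Roberto (2024). The function names are hypothetical.

```python
import math

def divergence(local, overall):
    """Assumed local divergence: sum over groups of p * ln(p / q),
    where p is the local proportion and q the citywide proportion.
    Terms with p == 0 contribute zero."""
    return sum(p * math.log(p / overall[g]) for g, p in local.items() if p > 0)

def city_divergence(local_comps, local_totals, overall):
    """Assumed city-level value: population-weighted mean of local divergences.

    local_comps:  {node: {group: proportion}}
    local_totals: {node: population count used as the weight}
    """
    total = sum(local_totals.values())
    return sum(local_totals[n] / total * divergence(local_comps[n], overall)
               for n in local_comps)
```

Under this form, a local environment composed entirely of a group that is 10 percent of the city yields ln(10), roughly 2.30, whereas one composed entirely of a group that is 90 percent of the city yields ln(1/0.9), roughly 0.11.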
In summary, the SPC method measures the distance between locations in a city along the road network, rather than using straight-line distance as in previous studies, to represent the connectivity of roads and the excess distance imposed by physical barriers. The SPC method systematically analyzes the prevalence of disconnectivity and physical barriers, their association with segregation, and their variation within and across cities. The SPC approach significantly improves on previous measures of spatial segregation by incorporating the connectivity between locations. To examine the relationship between road network disconnectivity and segregation levels, previous studies have compared segregation measures using road network distance and measures using straight-line distance (e.g., Roberto 2018). This is a reasonable comparison because prior spatial segregation measures have relied on straight-line distance to represent proximity. However, as a counterfactual, it would not be realistic for us to expect all intersections to be connected by straight-line roads. The SPC method also does not resolve questions about where in a city disconnection occurs or where it is most surprising. These observations motivated the development of the CRN method to be used in conjunction with the SPC method.
Global Analysis of Segregation
In this section we consider whether adding connectivity to a road network is associated with changes in a city’s level of segregation, and whether the disconnection created by higher-utility missing road segments may facilitate higher levels of segregation. We do this by measuring segregation with the counterfactual road network and comparing it to a control scenario that adds connectivity to the road network without regard for edge propensity.
We calculate segregation using the observed road network

Comparison of segregation for utility sorted and random edge candidates added to the road network: (a) Philadelphia, Pennsylvania; (b) Cincinnati, Ohio; (c) Baltimore, Maryland; (d) Rochester, New York; and (e) Hartford, Connecticut.
At the city level, the overall decreases in segregation may seem small (see Table A4); there are three reasons why this may be the case. First, the segregation values are for the whole city, including places where there are no counterfactual road segments and we would not expect any difference between observed and counterfactual segregation. Second, even in areas with counterfactual road segments, not all of them have differences in racial composition in the disconnected areas, so there would be little to no change in the composition of their local environments with or without the counterfactual road segment.
Third, depending on local compositions, there are cases when including a counterfactual road segment in the network increases segregation. This can occur if, hypothetically, area A has a racial composition similar to the city’s overall composition (low segregation), area B has a racial composition very different from the city’s overall composition (high segregation), and there is a counterfactual road segment between area A and area B. When we measure segregation with the counterfactual road segment included in the network, segregation in area B may decrease, but perhaps not by enough to offset the potential increase for area A. Although the city-level decreases may seem small, these results suggest an association between segregation values and unexpected disconnectivity in the road network.
Next, we would like to determine whether the decreases in segregation in Figure 4 are simply due to the increased connectivity of adding edges to the network, or if the decreases are related to our measure of edge propensity. To test this, we compare the segregation calculated for the counterfactual road network, in which sets of edge candidates are added in order of highest to lowest utility, to a control scenario in which edge candidates are added in sets of randomly selected edges. This control scheme is similar to the procedure in the section “Identifying Compatible Edge Candidates,” but in step 2 we add the candidates randomly rather than based on utility scores.
We consider 10 different random controls for each city and plot the results (average and standard deviation of segregation) in orange in Figure 4. The blue curves are significantly different from the orange ones, which indicates a significant difference in segregation when adding edge candidates from highest to lowest utility compared with randomly ordered edge candidates. Note that the blue and orange curves coincide when no edges are added, as the initial, observed city is the same for both cases. Also, once most edge candidates have been added, the curves tend to converge. However, for intermediate numbers of added edges, we see a significant difference between them, indicating that higher edge propensity is related to larger differences in segregation.
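The comparison between utility-ordered and randomly ordered additions can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the set size, the toy utilities, and especially `toy_segregation` (a made-up stand-in for the actual segregation measure computed on the road network) are all assumptions introduced here.

```python
import random
import statistics


def addition_orders(utilities, n_controls=10, seed=0):
    """Return the utility-ranked order plus n_controls random orders of candidate ids."""
    ranked = sorted(utilities, key=utilities.get, reverse=True)
    rng = random.Random(seed)  # fixed seed so the control orders are reproducible
    controls = []
    for _ in range(n_controls):
        shuffled = list(ranked)
        rng.shuffle(shuffled)
        controls.append(shuffled)
    return ranked, controls


def segregation_curve(order, set_size, measure_segregation):
    """Segregation after adding 0, set_size, 2 * set_size, ... candidates from `order`."""
    curve = [measure_segregation(frozenset())]
    for stop in range(set_size, len(order) + set_size, set_size):
        curve.append(measure_segregation(frozenset(order[:stop])))
    return curve


# Hypothetical utilities and a toy stand-in for the segregation measure.
toy_utilities = {"e1": 2.0, "e2": 0.5, "e3": 4.0, "e4": 0.1}


def toy_segregation(added):
    # Made-up rule: each added edge lowers segregation in proportion to its utility.
    return round(0.5 - 0.01 * sum(toy_utilities[e] for e in added), 6)


ranked, controls = addition_orders(toy_utilities)
ranked_curve = segregation_curve(ranked, 2, toy_segregation)
control_curves = [segregation_curve(c, 2, toy_segregation) for c in controls]

# Mean and standard deviation across the random controls at each step.
control_mean = [statistics.mean(step) for step in zip(*control_curves)]
control_sd = [statistics.stdev(step) for step in zip(*control_curves)]
```

In this toy setup the ranked and control curves start and end at the same values and can differ only at intermediate steps, mirroring the pattern described for Figure 4.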
Our computation of edge propensity in the section “Computation of Edge Utility Scores” does not include any information about racial composition. Thus, the significant differences in racial segregation shown in Figure 4 are particularly striking. This finding suggests the high propensity missing road segments—roads that are not present in the city but which one would expect—facilitate disconnection, and their absence is associated with higher levels of segregation.
Local Analysis of Racial Composition and Segregation
In the previous section, we considered city-level patterns and found that high propensity missing road segments facilitate unexpected disconnectivity in the road networks, which contributes to higher levels of segregation in each city. We now shift our focus to the local level and analyze differences in racial composition and segregation for local areas within the cities. We examine whether unexpected disconnectivity (because of the absence of an edge candidate) is associated with differences in racial composition between nearby areas and with higher levels of segregation in nodes’ local environments.
To conduct this analysis, we use the set of compatible edge candidates included in the counterfactual road network in the prior section. Taking each compatible edge candidate one at a time, we measure the racial composition and segregation for each of its end points, using local environments with a reach of 0.5 km. We measure the racial composition and segregation with and without the edge candidate included in the observed road network. Segregation values represent the average for nodes associated with each compatible edge candidate.
Local Analysis of Racial Composition
If road disconnectivity and physical barriers facilitate greater separation between ethnoracial groups in nearby areas, we should see bigger differences in the local environment composition of an edge candidate’s end points in the observed road network when the edge candidate is missing, compared with when the edge is included in a counterfactual road network. To examine this, we measure the difference in the racial composition of end points’ local environments with and without the candidate edge included in the road network. We then analyze whether there is a statistically significant difference between the observed and counterfactual differences in racial composition using a paired test.
We find statistically significant differences for all three ethnoracial groups in all five cities (see Table 3). On average, there are larger differences in the local environment composition of an edge candidate’s end points when the edge is missing from the road network. In other words, racial differences between nearby areas are larger when the areas are disconnected. This trend is particularly pronounced for the Black and White populations of Cincinnati and Baltimore and for the Hispanic and White populations of Hartford. Although some of the differences are relatively small, they represent the average over all edge candidates’ end points. The standard deviation of the difference is quite large, indicating sizable variation in the differences among edge candidates.
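A paired comparison of this kind can be illustrated with a short sketch. The text here specifies only that the test is paired, so the paired t statistic below is an assumption, as are the function name and the toy values; each pair holds the gap in one group's share between the two end points of a single edge candidate, measured without and with the edge.

```python
import math
import statistics


def paired_t_statistic(observed, counterfactual):
    """Paired t statistic for the mean of (observed - counterfactual) differences."""
    diffs = [o - c for o, c in zip(observed, counterfactual)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean_diff, mean_diff / se, n - 1


# Hypothetical gaps in one group's share between the two end points of each candidate.
observed_gap = [0.30, 0.45, 0.20, 0.50]        # edge candidate missing
counterfactual_gap = [0.20, 0.25, 0.15, 0.30]  # edge candidate included

mean_diff, t_stat, dof = paired_t_statistic(observed_gap, counterfactual_gap)
```

A positive mean difference corresponds to larger compositional gaps when the edge is missing, which is the direction reported in Table 3.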
Table 3. Average Differences in the Racial Composition of Local Environments for the End Points of Compatible Candidate Edges.
The association between disconnectivity and differences in the racial composition of nearby areas suggests that road segments we would expect to exist given the surrounding infrastructure may be more likely to be missing between areas with different racial compositions. To examine this further, we use a series of ordinary least squares models, summarized in Table 4. We analyze whether there is a relationship between edge propensity and differences in the local environment composition of an edge candidate’s end points, using the observed road network. In other words, are the high propensity missing road segments more likely to connect areas with different racial compositions, compared with the lower propensity edge candidates? The models include the end points of all edge candidates, and we run separate models for each city and each ethnoracial group.
Table 4. Difference in Observed Racial Composition for Nodes Associated with Each Compatible Edge Candidate.
We find a significant relationship, with higher utility values associated with larger differences in the racial composition of disconnected nodes’ local environments when the edge candidate is missing from the road network. The relationship holds for all cities and racial groups, except the Hispanic population of Philadelphia. This suggests these highly likely but nonetheless missing road segments facilitate both social and spatial disconnection. This is consistent with prior qualitative findings that physical barriers and disconnectivity have been used as mechanisms to reinforce or exacerbate segregation by facilitating greater separation between ethnoracial groups in nearby areas (Armborst et al. 2015; Feagin 1988; Jackson 1985; Mohl 2008; Schindler 2015; Sugrue 2005).
Figure 5 provides an example of the social and spatial division associated with high propensity missing road segments. The figure shows a map of the Black-Hispanic-White population composition in an area of Hartford. In the center of the map, the dotted line represents a high propensity edge candidate that is missing from the observed road network, and the asterisks represent its end points. The solid gray line bisecting the map from north to south represents railroad tracks that divide this area of Hartford, with residents who are predominantly White to the west of the tracks and predominantly Black residents to the east.

Figure 5. Black, Hispanic, and White populations in an area of Hartford, Connecticut, in 2010.
When the edge candidate is excluded from the network, the local environments of its end points are nearly monoracial. In contrast, the local environments are more diverse and representative of the city’s composition if the edge is included in the observed network. This difference in the racial composition of local environments corresponds to a difference in segregation. Segregation in the local environments is higher by 0.16 for the node to the east and by 0.21 for the node to the west when the edge is missing from the network.
Local Analysis of Segregation
To further consider the implications of these differences in racial composition, we examine the relationship between disconnectivity and segregation. We measure segregation in the local environment of each compatible candidate edge’s end points with and without the candidate edge included in the road network. We then compute the difference between segregation when the road network includes and does not include the edge candidate. We measure this separately for the local environment of each end point and calculate the mean value of the two end points for each edge candidate. If the nodes’ disconnectivity (because of the absence of the edge candidate) helps facilitate segregation in the local area, we should see higher segregation values when the edge candidate is missing from the road network.
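The per-candidate calculation can be written out as a small worked example. The function name and the segregation levels are hypothetical; the levels are chosen only so that the end-point gaps match the 0.16 and 0.21 reported for the Hartford example, and the sign convention (with the edge minus without it) is an assumption consistent with the negative differences reported later.

```python
def segregation_difference(end_points):
    """Mean over end points of (segregation with the edge - segregation without it)."""
    diffs = [with_edge - without_edge for without_edge, with_edge in end_points]
    return sum(diffs) / len(diffs)


# Hypothetical levels, chosen only so the gaps match the Hartford example.
hartford_example = [
    (0.50, 0.34),  # east end point: (without edge, with edge)
    (0.60, 0.39),  # west end point: (without edge, with edge)
]
diff = segregation_difference(hartford_example)  # about -0.185
```

A negative value means segregation is higher when the edge candidate is missing from the road network.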
On average, we find that local segregation is significantly higher when an edge candidate is missing from the road network, compared with when it is included (
Table 5. Summary Statistics for Compatible Edge Candidates.
To further understand the relationship between segregation and edge propensity, we run a series of ordinary least squares models regressing the difference in segregation with and without the edge candidate on the propensity for the edge to exist in the network (measured in terms of its utility). Table 6 presents the results. The values for the intercept correspond to the difference in segregation for edge candidates with a utility score of zero. The coefficient for edge utility indicates the change in the segregation difference associated with a one-unit increase in edge utility. Although we normalize our measure of utility to give it a consistent meaning (i.e., the average reduction in shortest path length per unit of candidate edge length; see the section “Computation of Edge Utility Scores”), the mean and range of values vary across cities. This is important to keep in mind, as the cities with the smallest coefficients for edge utility (Cincinnati and Philadelphia) also have the largest mean and maximum utility values (see Table A3).
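The interpretation of the intercept and the utility coefficient can be illustrated with a closed-form fit, assuming a single predictor. The function name and all data values are hypothetical and are not drawn from Table 6.

```python
def ols_fit(x, y):
    """Closed-form simple OLS: returns (intercept, slope) for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical edge utilities and segregation differences (counterfactual - observed).
utility = [0.0, 1.0, 2.0, 3.0]
seg_diff = [-0.01, -0.03, -0.05, -0.07]

intercept, slope = ols_fit(utility, seg_diff)

# Same toy data with utility stretched over a 10x wider range.
_, slope_wide = ols_fit([10 * u for u in utility], seg_diff)
```

In this toy case the intercept is the fitted difference at a utility of zero, and stretching the utility range by a factor of 10 shrinks the slope by the same factor. That is one reason a small coefficient in a city with a wide utility range need not indicate a weaker relationship.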
Table 6. Models of the Relationship between Edge Utility and Difference in Counterfactual and Observed Segregation.
We find a significant relationship between edge propensity and segregation differences (see Table 6). Higher propensity edge candidates are associated with larger (negative) differences in segregation. These results indicate that local segregation is higher when an edge candidate is missing from the road network, and segregation is even higher when the edge candidate is expected to exist but is missing. In other words, the more unexpected it is for a road segment to be missing, the higher the level of segregation when it is, in fact, missing. This suggests these unexpectedly missing road segments are the sources of disconnectivity that contribute the most to higher levels of segregation. This relationship is statistically significant in all five cities.
Conclusions
In this article, we developed and demonstrated a novel approach for measuring and analyzing residential segregation, the CRN method, which enables a deeper understanding of the relationship between road network connectivity and segregation patterns. Our method identifies missing road segments one would expect to exist in a city’s road network, given the surrounding infrastructure. We demonstrated the application of this approach with a global analysis of segregation and two local analyses in five U.S. cities.
First, at the city level, we compared changes in segregation when counterfactual road segments are added to the road network from highest to lowest edge propensity, to a control scenario with edges added without regard for their propensity. We found that the highest utility missing road segments are associated with the largest differences in segregation. The disconnection created by these missing road segments seems to facilitate higher levels of segregation.
Second, at the local level, we examined the racial composition of nearby areas that would be connected by missing road segments. We found that compositional differences are associated with unexpected patterns of disconnectivity: road segments one would expect to exist are more likely to be missing between areas with different racial compositions.
Third, we compared segregation measured using the observed road network and counterfactual road networks that include missing road segments. We found that unexpected disconnectivity was associated with significantly higher segregation in the local areas of missing road segments and at the city level. Missing road segments that are most likely to exist are the sources of disconnectivity that contribute the most to higher levels of segregation. Our findings suggest these highly likely but nonetheless missing road segments facilitate both social and spatial disconnection. This is consistent with prior qualitative findings that disconnectivity and physical barriers have been used as mechanisms to reinforce or exacerbate segregation by facilitating greater separation between ethnoracial groups in nearby areas (Armborst et al. 2015; Feagin 1988; Jackson 1985; Mohl 2008; Schindler 2015; Sugrue 2005).
Some of the differences in observed and counterfactual segregation may seem small, particularly at the city level (see Table A4). To highlight the range of differences, we can compare the differences in segregation for compatible candidate edges in the top 5 percent and bottom 95 percent of utility values (see Table A5). Three notable patterns emerge in this comparison. First, local differences in segregation are substantially larger than citywide differences. Second, the differences in segregation are larger for the top 5 percent of utility values than for the bottom 95 percent, as we would expect from our models (see Table 6). Third, there is considerable variation across and within cities. For example, in Hartford, the average difference for the top 5 percent of utility values is large: −0.183. In Philadelphia, the average difference is quite small, regardless of the utility value. In both cities, there are sizable standard deviations, especially for the top 5 percent. We would not expect all counterfactual road segments to be associated with patterns of racial segregation equally within or across cities. However, they are an important consideration in understanding the spatial structure of segregation and may help explain why segregation persists in some areas and not others.
In this article, we do not examine the historical processes that made some roads possible and prevented others. Future research using archival methods could extend our approach by shedding light on these decision-making processes. Moreover, our research suggests further questions about the social processes and institutions that create road networks, the mechanisms through which patterns of connection and disconnection are reproduced or reconfigured over time, and the consequences of such patterns. Future research can also explore how spatial patterns of connection and disconnection become a source of information for individuals and institutions and influence residential mobility decisions and housing market processes.
The CRN method does not establish causality or imply the precedence of the road network, nor does it adjudicate which came first: residential segregation that shaped the structure of the road network, or road network connectivity that shaped where people live. Nor is the CRN method a tool for suggesting where new road segments should be built. The method is designed to gain a deeper understanding of complex processes that may result from existing (or missing) infrastructural elements.
Indeed, because of the complex nature of networked systems, we use abstraction and approximation to construct a counterfactual representation of road networks on the basis of plausibility rather than a combinatorial enumeration of all possibilities. We do not account for all physical or geographic features, such as bodies of water or changes in elevation, that might affect decisions about the feasibility of roads. We use this approach not to make recommendations about road construction, but to examine the significance of unexpected missingness.
Overall, our development of the CRN method makes four key contributions. First, we draw attention to the possibilities that emerge by analyzing roads as networks. By situating roads within their networked context, our approach expands our understanding of the separating and connecting power of roads.
Second, rather than focusing only on the observed built environment, we consider what is missing. We developed a set of measures and criteria for identifying a set of plausible and compatible edge candidates for a given road network.
Third, we developed and validated a method to quantify the propensity of missing road segments, which evaluates individual edge candidates and how surprising it is that they are missing, on the basis of how much connectivity they would provide in the nearby area. This allows us to examine the significance of unexpected missingness.
Fourth, the CRN method foregrounds the role of the built environment in understanding residential segregation. Road networks are a major component of the built environment and a durable form of infrastructure that cannot be easily modified.
The CRN method enables further research to uncover the inequalities embedded in this infrastructure and examine the consequences of racist infrastructure. In these ways, this article contributes to understanding the interconnectedness of the spatial and social dimensions of cities.
Appendix
Table A5. Summary Statistics for Compatible Candidate Edges among the Top 5 Percent and Bottom 95 Percent of Utility Values.
| | Top 5 Percent of Utility Values | | | Bottom 95 Percent of Utility Values | | |
|---|---|---|---|---|---|---|
| | Mean | SD | Median | Mean | SD | Median |
| Hartford, CT | | | | | | |
| Edge utility | 3.969 | 2.366 | 2.876 | .174 | .304 | .049 |
| Observed segregation | .506 | .473 | .302 | .390 | .426 | .226 |
| Counterfactual segregation | .323 | .318 | .202 | .381 | .429 | .221 |
| Difference in segregation | −.183 | .323 | −.056 | −.015 | .084 | 0 |
| Rochester, NY | | | | | | |
| Edge utility | 2.442 | 2.255 | 1.456 | .101 | .154 | .037 |
| Observed segregation | .462 | .331 | .413 | .346 | .318 | .282 |
| Counterfactual segregation | .434 | .346 | .396 | .339 | .319 | .260 |
| Difference in segregation | −.030 | .081 | −.008 | −.007 | .039 | 0 |
| Cincinnati, OH | | | | | | |
| Edge utility | 6.102 | 11.323 | 2.650 | .186 | .284 | .061 |
| Observed segregation | .435 | .286 | .356 | .381 | .293 | .325 |
| Counterfactual segregation | .396 | .299 | .320 | .370 | .295 | .302 |
| Difference in segregation | −.044 | .083 | −.011 | −.011 | .051 | 0 |
| Baltimore, MD | | | | | | |
| Edge utility | 4.263 | 6.433 | 1.987 | .103 | .175 | .031 |
| Observed segregation | .704 | .528 | .520 | .554 | .468 | .438 |
| Counterfactual segregation | .643 | .535 | .449 | .545 | .465 | .433 |
| Difference in segregation | −.066 | .216 | 0 | −.011 | .088 | 0 |
| Philadelphia, PA | | | | | | |
| Edge utility | 10.921 | 34.797 | 1.931 | .096 | .169 | .025 |
| Observed segregation | .673 | .471 | .601 | .640 | .352 | .666 |
| Counterfactual segregation | .669 | .475 | .597 | .636 | .354 | .663 |
| Difference in segregation | −.006 | .101 | 0 | −.004 | .045 | 0 |
Acknowledgements
We are grateful for the feedback from seminar participants at the University of Washington Center for Studies in Demography and Ecology and the Institute for Analytical Sociology, and the attendees of sessions at the American Sociological Association and Population Association meetings. We also benefited from conversations with Maria Riolo and comments from the editors and anonymous reviewers.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported in part by a Rice University InterDisciplinary Excellence Award and resources from the Center for Research Computing at Rice University.

