Abstract
Introduction
Different NBA teams employ different lineup usage patterns. One group of players might not play together often because they play the same position, another because they are collectively weak on defense, or another because the team prefers to keep the starters and reserves in separate units, for example. These lineup usage patterns vary from team to team, and from year to year. We have data on NBA lineups from the website cleaningtheglass.com (Cle, 2023) – for each season, for each team, how many possessions each lineup played together – and we want to understand how these patterns vary and what it means for the teams.
The traditional way to analyze lineup data is to put it into a graph, or network: each vertex is a player, and two vertices are connected if the two players were ever on the court together. This approach has been very fruitful for lineup and other types of data in sports; see, for example, Ahmadalinezhad and Makrehchi (2020); Bai and Bai (2022); Brave et al. (2019); Clemente and Martins (2017); Clemente et al. (2016); Fewell et al. (2012); Hambrick (2019); Korte and Lames (2018); Korte et al. (2019); Mora-Cantallops and Sicilia (2019); Wäsche et al. (2017). However, the network approach considers only information about pairs of players, and thus we lose information about the other players in the lineup with them.
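As a minimal sketch of the network construction just described, the following Python builds the edge set of a lineup graph. The function name `lineup_graph` and the player labels are hypothetical placeholders, not taken from the data set.

```python
from itertools import combinations


def lineup_graph(lineups):
    """Return the set of edges: unordered pairs of players who shared the court."""
    edges = set()
    for lineup in lineups:
        # Every pair of players inside one lineup is joined by an edge.
        for pair in combinations(sorted(lineup), 2):
            edges.add(pair)
    return edges


# Hypothetical lineups; the labels are placeholders, not real players.
lineups = [
    ("A", "B", "C", "D", "E"),
    ("A", "B", "C", "D", "F"),
]
edges = lineup_graph(lineups)
print(len(edges))
print(("E", "F") in edges)
```

The graph records only that E and F never shared the court; it says nothing about which trios or quartets appeared together, which is the limitation noted above.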
Another approach is to include information about which groups of players have been in a lineup together, in the form of a
Team 1 has two wings,
Now suppose that Team 2 has wings

Lineup graphs – pairwise information. (a)
Since five players play together at a time, we should also consider information about which groups (not just pairs, but trios, quartets, and quintets) played together. We can do that with a simplicial complex (Moore et al., 2012; Ramanathan et al., 2011).
Consider again the teams from the previous example. We can build the simplicial complex as follows. Again the vertices correspond to players, and there’s an edge between two players if they were ever on the court together. But now we add information about trios: if three players were ever all on the court together, then we fill in the triangle that they form. So part of the simplicial complex for Team 1 looks like Figure 2(a), while Team 2’s looks like Figure 2(b). The hollow triangle in Figure 2(b) shows that the three players, who all played together pairwise, never played together as a trio. We will want to count holes like this. (Since the boundary of the hole consists of one-dimensional objects (line segments), we use

Lineup simplicial complexes. (a)
We add more information about Team 2’s lineups. Say that there is a lineup including players

Lineup simplicial complex for Example 3.
We go back to Team 1 and add more information. Say that there is another player,

As we add more players and lineups to the simplicial complex, we will get objects (and holes) in higher and higher dimensions. Since a lineup has five players, the complete lineup simplicial complex will reside in five dimensions. We can no longer visualize it, but fortunately our intuition from two and three dimensions is still essentially correct. We can use topology, a kind of generalization of geometry, to count the holes of different dimensions in the simplicial complex, and we can perform the actual calculations regardless of dimension.
We also want to add information about the number of possessions played together by each lineup. We can analyze this using persistent homology.
Possessions played by different lineups, Team 2.
Now, when we build our simplicial complex out of the lineups, we can filter it based on the number of possessions played. If we include only lineups playing 1000 or more possessions, we get Figure 5(a). Similarly, if we include only lineups playing 800 or more possessions, we get Figure 5(b). A cutoff of 500 possessions gives Figure 5(c), a cutoff of 300 possessions gives Figure 5(d), and a cutoff of 10 possessions gives Figure 5(e). (Note that the lengths and angles in the complex don’t matter, just the ways in which the vertices and edges are joined.) We see that a hole appears at possessions threshold 500, and another hole appears at 300 and disappears at 10.
Lineup simplicial complexes for varying possessions thresholds. (a) Threshold 1000 (b) Threshold 800 (c) Threshold 500 (d) Threshold 300 (e) Threshold 10.
For large data sets, it will be difficult to keep track of all the simplicial complexes this way (and impossible to draw them in higher dimensions), so we summarize the information about the holes, the homology (Section ‘Homology’), in the form of a barcode.
Homology barcodes. (a) Barcode for Example 5 (b) Barcode for Example 6.

Barcode summary for Example 5.
Possessions played by different lineups, Team 1.
So, to summarize our method of analysis of the lineups data:
We combine all the information about which players and groups of players were on the court at the same time into a simplicial complex. This is high-dimensional and can be very complicated, so:
We compute the homology of the simplicial complex to detect holes, roughly corresponding to groups that don’t play together.
Persistent homology adds information about how many possessions each lineup played together.
Simplicial complexes
We will give the bare minimum of background information necessary to understand the paper. There are many references for simplicial complexes and homology; see, for example, Edelsbrunner and Harer (2010); Fugacci et al. (2016, 2024); Nanda (2024). For a more intuitive and less formal explanation of homology, see Houston-Edwards (2021).
An
Every abstract simplicial complex can be realized geometrically as a simplicial complex in
The simplicial complex determined by an NBA team’s lineups has a particularly simple form. Since every lineup has exactly five players, the simplicial complex is generated by the 4-simplices, each representing a single lineup of five players. So, for example, the simplicial complex contains a given 3-simplex if and only if those four players played together in at least one five-player lineup.
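To illustrate how the complex is generated by its 4-simplices, here is a short Python sketch that closes a list of five-player lineups under taking subsets. The function name `simplicial_complex` and the lineups are hypothetical, and simplices are represented simply as frozensets of player labels.

```python
from itertools import combinations


def simplicial_complex(lineups):
    """Return every simplex (a frozenset of players) generated by the lineups."""
    simplices = set()
    for lineup in lineups:
        players = sorted(set(lineup))
        # A lineup contributes all of its nonempty subsets as faces.
        for size in range(1, len(players) + 1):
            for face in combinations(players, size):
                simplices.add(frozenset(face))
    return simplices


# Hypothetical lineups; the labels are placeholders, not real players.
lineups = [
    ("A", "B", "C", "D", "E"),
    ("A", "B", "C", "D", "F"),
]
K = simplicial_complex(lineups)
print(frozenset("ABCD") in K)   # these four shared a lineup
print(frozenset("AEF") in K)    # E and F never shared a lineup
```

A simplex belongs to the complex exactly when its players all appear in at least one common lineup, which is the membership rule stated above.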
Homology and persistent homology
Homology
We can put an algebraic structure on a simplicial complex by giving each simplex an orientation. Intuitively, an orientation for a line (1-simplex) corresponds to a choice of direction, an orientation for a triangle (2-simplex) corresponds to a choice of top and bottom, and analogously in higher dimensions. Each simplex has two possible orientations, and we can begin to do algebra by saying that a simplex with a given orientation is equal to minus the same simplex with the opposite orientation; then their algebraic sum would be zero. (An orientation for a point (0-simplex) is just a choice of positive or negative.)
An orientation for a

Lineup simplicial complexes. (a) A boundary (b) A cycle that is not a boundary.
In this example, we observe that
Intuitively, the cycles that are not boundaries correspond to ‘holes’ in the simplicial complex. More formally, we take the algebraic set of cycles in each dimension, and divide out the set of boundaries. What remains are the homology groups of the simplicial complex,
Similarly, a hollow tetrahedron would contribute a 1 to

A hollow cylinder with no top or bottom.
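The hole counts discussed in this section can be computed from ranks of boundary matrices. The sketch below is one way to do so in pure Python, assuming rational coefficients and the standard alternating-sign boundary formula; the names `rank` and `betti_numbers` are hypothetical, and this is not the software used for the results in the paper.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (given as a list of rows) over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def betti_numbers(maximal_simplices):
    """Betti numbers of the complex generated by the given simplices."""
    simplices = set()
    for s in maximal_simplices:
        verts = sorted(s)
        for k in range(1, len(verts) + 1):
            simplices.update(combinations(verts, k))
    top = max(len(s) for s in simplices) - 1
    by_dim = [sorted(s for s in simplices if len(s) == d + 1)
              for d in range(top + 1)]
    # ranks[d] is the rank of the boundary map from d-simplices to (d-1)-simplices.
    ranks = [0] * (top + 2)
    for d in range(1, top + 1):
        index = {s: i for i, s in enumerate(by_dim[d - 1])}
        rows = []
        for s in by_dim[d]:
            row = [0] * len(by_dim[d - 1])
            for i in range(len(s)):
                # Dropping the i-th vertex gives a face with sign (-1)^i.
                row[index[s[:i] + s[i + 1:]]] = (-1) ** i
            rows.append(row)
        ranks[d] = rank(rows)
    # Betti number = (number of d-simplices) - rank(d-th map) - rank((d+1)-th map).
    return [len(by_dim[d]) - ranks[d] - ranks[d + 1] for d in range(top + 1)]


hollow_triangle = betti_numbers([("A", "B"), ("B", "C"), ("A", "C")])
filled_triangle = betti_numbers([("A", "B", "C")])
hollow_tetrahedron = betti_numbers(list(combinations("ABCD", 3)))
print(hollow_triangle, filled_triangle, hollow_tetrahedron)
```

In this sketch the hollow triangle has one one-dimensional hole, the filled triangle has none, and the hollow tetrahedron has one two-dimensional hole, matching the intuition described above.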
Persistent homology
Again, we give only a superficial overview of persistent homology; for details see, for example, Edelsbrunner and Harer (2010); Fugacci et al. (2016, 2024); Nanda (2024). The basic idea is straightforward. We add an appropriately defined “weight” function to the simplices, then filter the simplicial complex to include only simplices at or above a given weight. As we decrease the threshold, the simplicial complex grows, and the homology can change, as in Examples 5 and 6 above. We record the changes in a barcode, which shows when each hole is created and when it disappears, as in Figure 6.
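The filtration can be sketched in a few lines of Python. The weight used here is an assumption chosen to agree with the lineup cutoffs in the Introduction: each simplex is given the largest possession count among the lineups that contain it, so it enters the complex as soon as one such lineup meets the threshold. The function names and possession counts are hypothetical.

```python
from itertools import combinations


def simplex_weights(possessions):
    """Map each simplex to the largest possession count of a lineup containing it."""
    weights = {}
    for lineup, count in possessions.items():
        players = sorted(lineup)
        for size in range(1, len(players) + 1):
            for face in combinations(players, size):
                key = frozenset(face)
                weights[key] = max(weights.get(key, 0), count)
    return weights


def complex_at(weights, threshold):
    """Simplices whose weight is at or above the possessions threshold."""
    return {s for s, w in weights.items() if w >= threshold}


# Hypothetical possession counts; not real data.
possessions = {
    ("A", "B", "C", "D", "E"): 1000,
    ("A", "B", "C", "D", "F"): 300,
}
weights = simplex_weights(possessions)
sizes = [len(complex_at(weights, t)) for t in (1000, 300)]
print(sizes)
```

Lowering the threshold can only add simplices, so the complexes are nested, which is what allows holes to be tracked from creation to disappearance.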
NBA lineups results
Using the data from cleaningtheglass.com, for each season we create simplicial complexes for each team in Python using
We see that the Timberwolves’ lineup data are more homologically complicated than the Spurs’, in two different ways. First, Minnesota has more holes, in more dimensions. Second, Minnesota’s holes persist longer, reflecting a more robust structure. The single hole for the Spurs, in

Betti numbers. (a) 2022-23 San Antonio Spurs (b) 2018-19 Minnesota Timberwolves.

Barcodes. (a) 2022-23 San Antonio Spurs (b) 2018-19 Minnesota Timberwolves.
In this setting, the ‘holes’ detected by homology tell us about the set of lineups. For example, if
A nonzero
The manner in which the homology algorithm works makes it difficult to determine exactly which lineups are creating the holes. This is not a real problem, since we are interested in the overall pattern rather than individual lineups, but we can get the information by analyzing the data by hand if necessary. For example, for the 2022-23 Spurs, we see that the

Each pair played together, but not the trio, creating a one-dimensional hole.
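The kind of direct examination described here can be partly automated. The sketch below lists every trio whose three pairs each shared a lineup while the trio itself never did; the function name `hollow_triangles` and the lineups are hypothetical placeholders.

```python
from itertools import combinations


def hollow_triangles(lineups):
    """Trios whose pairs all played together but who never played as a trio."""
    lineup_sets = [set(lineup) for lineup in lineups]
    players = sorted(set().union(*lineup_sets))

    def together(group):
        # True if some lineup contains every player in the group.
        return any(set(group) <= lineup for lineup in lineup_sets)

    return [
        trio
        for trio in combinations(players, 3)
        if all(together(pair) for pair in combinations(trio, 2))
        and not together(trio)
    ]


# Hypothetical lineups; the labels are placeholders, not real players.
lineups = [
    ("X", "Y", "a", "b", "c"),
    ("Y", "Z", "d", "e", "f"),
    ("X", "Z", "g", "h", "i"),
]
trios = hollow_triangles(lineups)
print(trios)
```

A trio found this way is only a candidate: other filled triangles in the complex may fill in the cycle it forms, so such a trio does not by itself guarantee a nontrivial homology class.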
(Direct examination of the data also shows that Bates-Diop, Vassell, and Jakob Poeltl played together pairwise, but not as a trio, so that hole can also explain the
The persistent homology tells us how the number of lineup ‘holes’ changes as we decrease the possessions threshold. In some cases, we can gain homology, as the newly added lineups create new cycles. In other cases, we can lose homology, as the existing holes are filled in by simplices created by the new lineups.
We have hundreds of individual barcodes, one for each team for each season going back to 2003-04. To analyze them collectively, we need to condense the data. There are many ways to do this. Here, as an example, we will examine trends over time by taking the maximum, over all thresholds, for each
The complete results are displayed in Figure 12.
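One way to carry out this kind of condensing is sketched below, assuming the Betti numbers at each possessions threshold have already been computed for every team. The function names, team labels, and values are hypothetical and for illustration only.

```python
def max_betti(betti_by_threshold):
    """Maximum of each Betti number over all thresholds, for one team-season."""
    lists = list(betti_by_threshold.values())
    dims = max(len(b) for b in lists)
    # Missing higher dimensions are treated as zero.
    return [max(b[d] if d < len(b) else 0 for b in lists) for d in range(dims)]


def average_max_betti(teams):
    """Average, over teams, of the per-team maximum Betti numbers."""
    maxima = [max_betti(t) for t in teams.values()]
    dims = max(len(m) for m in maxima)
    return [
        sum(m[d] if d < len(m) else 0 for m in maxima) / len(maxima)
        for d in range(dims)
    ]


# Hypothetical Betti numbers keyed by possessions threshold; not real results.
teams = {
    "Team 1": {1000: [1, 0], 500: [1, 1], 10: [1, 0]},
    "Team 2": {1000: [1, 0, 0], 300: [1, 2, 1]},
}
season_average = average_max_betti(teams)
print(season_average)
```

Taking the maximum discards the thresholds at which the holes occur, so this summary trades persistence information for a single number per dimension.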
The most interesting cases,

NBA lineups, average maximum Betti numbers, 2003–04 to 2022–23.

NBA lineups, average maximum
Interpretation and uses
Missing edges in the network reflect pairs of players who do not play together. Holes in the simplicial complex, as measured by homology, reflect groups of players who do not play together. If there were no positions in basketball (formal or informal), then everyone could play together, and we would not expect to see any lower-dimensional holes. The more restrictions there are on groups playing with each other, the more holes we would expect to see.
An obvious source of these restrictions is shooting ability. Over the last decade, teams have placed enormous importance on spacing. As in Example 1, where Team 2 doesn’t play two poor-shooting wings and the center at the same time, this can lead to certain lineup combinations receiving very few possessions.
Another obvious potential restriction is defensive ability. As shooting has become more important, so has the ability to defend it. For example, a team might be able to play two weak perimeter defenders if a strong defensive center is in the game to protect the rim, but not if a smaller player is in at center. Again, this creates restrictions on playable lineups.
This emphasis on spacing, and the resulting limitations on lineups, could explain the pattern we see in Figures 12 and 13. In both
Lineups persistent homology, before and after Thibodeau takes over.
Betti numbers for 2022-23 playoff teams in the regular season and playoffs.
In Figure 14, we plot average player age versus three-point attempts per game for teams from the 2022–23 season. We see that teams with nontrivial persistent homology (meaning those with maximum
2022–23 average age, three-point attempts per game, and homology, by team.
Teams from the 2022–23 season with nontrivial persistent homology (again, meaning those with maximum
Team winning percentages, 2022–23 vs. 2023–24, and homology.
There are many other ways to use the data. On the level of a single team, we could track the evolution of the simplicial complex throughout the season. As the team experiments initially and then begins to abandon lineups that don’t perform well, we would expect to see more holes form. When a player is injured, new lineups would be used, eliminating holes.
We can also compare teams, either different teams in the same season or the same team from year to year. Different substitution patterns (for example, resting most of the starters at once versus staggering their rest) would show up in the simplicial complex. Some teams might replace an injured starter by moving a top reserve into the starting lineup, while others might prefer to start a little-used player in order to keep their second unit intact; this would lead to different patterns in homology. Similarly, we could detect different approaches to load management and resting players.
The techniques could also be applied at the level of individual games. We would expect to see a wider variety of lineups in a blowout than in a close game, for example. Games with more fouls called would also be expected to involve more kinds of lineups. Some teams might use different types of lineups in home games versus away games. By including data on when in the games lineups appear, and not just the total possessions played, we could detect usage patterns at different times in the game, and relate them to the game situation at the time.
There are many other ways to generalize this approach. We could include information on wins and losses, or plus/minus, to analyze the relationship with lineup usage. Instead of looking at each team’s lineup separately, we could analyze the ten players on the floor at a time, to detect reactions to the other team’s lineup changes. (For example, a key advantage of having a good shooter in the lineup is that it forces the other team to defend him – how is this reflected in the lineups that are employed against that player?) Or, instead of using individual players as the vertices in the simplicial complex, we could use clusters of similar players – similar heights and weights, or playing the same position, or playing similar roles (as determined by clustering or other methods; Muniz and Flamand, 2022). And of course this technique can be applied to other team sports, many of which have very different player usage patterns.
Conclusion
We describe a technique to analyze NBA lineup usage by considering the persistent homology of a simplicial complex, based on information on which groups of players are in a lineup together, rather than pairs of players, as in traditional network analysis. Groups of players who aren’t on the floor together show up as holes in the simplicial complex and can be detected by homology. As is frequently the case in topological data analysis, the interpretation of the homology data can be somewhat challenging. Nonetheless, there are many possible applications in which the technique can offer an improvement over network analysis.
