Sage Journals: Discover world-class research

Abstract

Theoretical constructs, such as the information gap theory and compression progress theory, seek to explain how humans practice curiosity. According to the former, curiosity is the drive to acquire information missing from our understanding of the world. According to the latter, curiosity is the drive to construct parsimonious mental world models. To complement the densification and simplification processes inherent to these frameworks, we propose the conformational change theory, wherein we posit that curiosity builds mental models with marked conceptual flexibility. We formalize curiosity as a knowledge-network-building process to investigate each theoretical account for individuals and collectives. In knowledge networks, gaps can be identified as topological cavities, compression progress can be quantified using network compressibility, and flexibility can be measured as the number of conformational degrees of freedom. We find that curiosity fills gaps and constructs increasingly compressible and flexible knowledge networks. Across individuals and collectives, we determine the contexts in which each account is explanatory, clarifying their complementary and distinct contributions. Our findings offer a novel networks-based perspective that harmonizes with (and compels an expansion of) the traditional taxonomy of curiosity.

Keywords

Knowledge networks, curiosity, topological data analysis, information theory, mechanical networks

Significance statement

Curiosity is an intrinsically motivated search for information. It is enduring and open-ended, and may have evolved to help humans build accurate mental representations of our ever-changing environments. Due to the significant role that curiosity plays in our lives, several theoretical constructs have sought to explain how we engage in its practice. Yet, quantitative validation of these accounts has remained elusive due to the fundamental challenge of constructing formal models of mental representations of knowledge. We overcome this challenge by conceptualizing curiosity as the process of building a growing knowledge network. We find that different theoretical accounts may be explanatory in different contexts, thereby offering a pluralistic view of curiosity.

Introduction

Humans must manage uncertainty and embrace change to thrive in a complex and dynamic environment (Gottlieb et al., 2013). To this end, we continually consume information to construct and maintain accurate mental models of the world (Johnson-Laird, 2010; Valadao et al., 2015). Information-seeking behavior may be driven by a variety of intrinsic and extrinsic factors. Arising from the latter, information acquisition is an intermediate step towards attaining a specific goal—such as increased wealth or social recognition—that is ultimately rewarding (Dweck, 1986). By contrast, the intrinsic motivation to seek information is commonly conceptualized as curiosity (Gottlieb et al., 2013; Kidd and Hayden, 2015; Loewenstein, 1994). In silico work suggests that curiosity may have evolved to maximize long-term evolutionary fitness in rapidly changing environments (Singh et al., 2010). Additionally, studies have shown that humans are driven to know, even when information is costly to obtain (Clark et al., 2021; Hsee and Ruan, 2016) and may have no immediate tangible utility (Bennett et al., 2016; Brydevall et al., 2018). Curiosity-driven information gathering is, therefore, inherently rewarding.

Given the significant role that it plays in our daily behavior and decision-making, several theories have sought to explain how individuals practice curiosity. The information gap theory views curiosity as the drive to obtain information that is missing from a mental model of the world (Loewenstein, 1994). In this account, perception of a gap in one’s knowledge creates an aversive state of uncertainty that, in turn, motivates a search for information to close the gap (Daddaoua et al., 2016). In the complementary perspective of compression progress theory, curiosity is the drive to obtain information that improves the compression of a mental model and thereby lowers its cost of representation (Schmidhuber, 2008; Zhou et al., 2020). Both theories provide several important explanations for curiosity-driven information-seeking behavior. On the one hand, curiosity that modulates uncertainty levels by closing information gaps can facilitate lifelong learning. On the other hand, compression progress enables the extraction of essential latent structures of knowledge, offering greater capacity for generalization (Collins, 2017; Tenenbaum et al., 2011). Low-cost representations also allow for greater functional capacity by freeing cognitive resources that would otherwise be dedicated to information storage and retrieval (Amancio and Wolff, 2019). However, theoretical constructs for curiosity, such as the information gap theory and compression progress theory, are difficult to validate quantitatively. A fundamental challenge is constructing formal models of mental representations of knowledge. In the absence of such models and the conceptual frameworks that accompany them, it is unclear how to translate theoretical concepts such as knowledge gaps and compression progress to well-defined variables that can be measured in experiments.

One such model that has shown promise is a network model where knowledge is composed of discrete and yet interconnected concepts (Chrastil and Warren, 2014, 2015; Ericson and Warren, 2020; Peer et al., 2021; Schapiro et al., 2016; Stiso et al., 2022; Warren et al., 2017). In graph learning studies, volunteers are shown sequences of images on a screen, where, unbeknownst to the volunteers, each image corresponds to a node in an underlying network (Lynn and Bassett, 2020). Based solely on observed transitions, and despite being unaware of the underlying network’s structure, participants successfully infer statistical regularities from the temporal order in which images appear (Garvert et al., 2017; Kahn et al., 2018; Schapiro et al., 2013; Schapiro et al., 2016; Tompson et al., 2019). Crucially, the structure of the pre-defined experimental graph can be recovered from neural activity by decoding simultaneously acquired functional magnetic resonance imaging (fMRI) data (Garvert et al., 2017; Tompson et al., 2020). The sequential manner in which stimuli are presented in graph learning tasks can be conceived of as a walk prescribed by the experimenter in a limited knowledge space of objects, images, concepts, or movements. Curiosity, too, can be conceived of as a walk, but one that is largely self-directed and purposeful across the vast landscape of knowledge. To evaluate curious walks, recent work gathered browsing histories from individuals who freely explored the online encyclopedia Wikipedia. Structural features of the knowledge networks that participants walked upon (Figure 1(a)) were found to be associated with curiosity, as measured by an independent index of participants’ sensitivity to information deprivation (Lydon-Staley et al., 2021).

Figure 1.

Connectional approach to curiosity. (a) A participant constructs a growing knowledge network through curiosity-driven self-directed exploration of Wikipedia, a vast networked landscape of information. Nodes represent unique Wikipedia pages. Edges represent hyperlinks between nodes. Nodes are colored to denote the order in which they are visited. (b) Gaps in a knowledge network can be formalized using algebraic topology and tracked in several topological dimensions. The green and blue gaps represent a 0-dimensional and 1-dimensional cavity, respectively. (c) Compression progress aims to construct internal representations of the world that are both storage efficient and generalizable. In a knowledge network, all concepts that belong to the same cluster can be represented parsimoniously at a higher level of abstraction using their cluster identity. The unclustered network has 9 nodes and 12 edges, while the clustered network only has 3 nodes and 3 edges. (d) A mechanical network can possess several spatial configurations, any of which can be arrived at from any of the others through a series of conformational changes. We formalize and measure knowledge network flexibility as the number of available conformational degrees of freedom.

Here, we leverage this framework and cast curiosity as a network building process. This approach allows us to take qualitative explanations for information-seeking behavior, such as the information gap theory and compression progress theory, and operationalize them in quantitative statistics. Information gap theory posits that humans add information to regulate uncertainty by filling gaps (Gottlieb et al., 2013; Kidd and Hayden, 2015; Loewenstein, 1994). This theory can be operationalized by treating gaps in networks as topological cavities, and by tracking their evolution using techniques from applied algebraic topology (Bianconi, 2021; Ghrist, 2007; Hatcher, 2002) (Figure 1(b)). In contrast, compression progress theory posits that humans subtract or discard information (Lynn and Bassett, 2021; Schmidhuber, 2008; Zhou et al., 2020) due to limited cognitive capacity (Shiffrin and Schneider, 1977; Zhou et al., 2020, 2020). Compressing a network while maintaining meaningful latent structure requires that we discard some irrelevant information while maintaining important information about past experiences and present priorities (Lynn et al., 2020; Lynn and Bassett, 2021; Momennejad, 2020; Zhou et al., 2020a). This theory can be operationalized by measuring the compressibility of a network, an information-theoretic quantity that captures the ability of a network to be compressed (Lynn and Bassett, 2021) (Figure 1(c)). Via the network operationalization of these two theories, we come to see that curiosity is marked as a process by which networks of knowledge densify and simplify, raising the question of what alternative process might drive them to sprawl and become complex.

To address this question, we expand beyond historical accounts to operationalize our own conformational change theory of curiosity. The conformational change theory suggests that information-seeking behavior results in the creation of expansive knowledge networks (Zurn et al., 2021) embedded in a conceptual geometry (Figure 1(d)). The notion of a conceptual geometry is motivated by prior studies of neural population geometry and the fact that information can be embedded and processed in locally Euclidean geometric representations to solve complex tasks (Chung and Abbott, 2021). The geometry provides a key affordance for curiosity—conceptual flexibility—as the knowledge network can mechanically conform into different shapes. While some concepts are separated from other concepts by fixed distances of shared-versus-unshared meaning, other concept pairs can move closer together or farther apart as inter-concept relations shorten or lengthen depending on time and context (Kim et al., 2019). This flexibility allows us to draw from past experience, cohere the past with newly learned information, monitor conflict, and respond appropriately in different contexts (Botvinick and Braver, 2015; Karuza et al., 2016; Tenenbaum et al., 2011); it may also subserve the unexpected conceptual combinations that accompany imaginative thought and support serendipitous discoveries (Copeland, 2019; McAllister et al., 2012). Mechanically akin to conformational change in proteins, the flexible reshaping of the knowledge network can only occur if concepts are sparsely connected; densely connected linkage networks embedded in a Euclidean geometry are rigid. Hence, the conformational change theory of curiosity posits a drive for conceptual flexibility that leads networks to sprawl and become complex.

As is now evident, each of the three theories is motivated by a distinct and uniquely important psychological drive: to reduce uncertainty by learning a missing piece of information, to discover latent patterns by distilling fundamental epistemic elements, and to reshape information by flexibly reconfiguring knowledge networks. Here, we test each theory through parallel analyses of the growth of individual and collective knowledge networks derived from Wikipedia. At the individual scale, we construct knowledge networks for 149 individuals using their Wikipedia browsing histories (Lydon-Staley et al., 2021) (Figure 1(a)). At the collective scale, we extract Wikipedia networks to assess knowledge growth in 30 disciplines such as calculus, economics, and linguistics (Ju et al., 2020). We treat Wikipedia pages as nodes in both sets of networks and add edges between them according to the presence of hyperlinks between pages. For the data on individuals, we specify network growth using the order in which individuals visit pages; for the data on collectives, we use the years in which different concepts originate. To model the random growth of knowledge in both data sets, we create 25 degree-preserving edge-rewired versions of each network. We test the predictions of the three theories by comparing measurements of relevant features from empirically observed knowledge networks to those from the related null networks. First, considering the information gap theory, we expect to find fewer-than-chance topological cavities in growing empirical knowledge networks due to the hypothesized drive to close knowledge gaps when they are perceived. Second, considering compression progress theory, we hypothesize that growing knowledge networks will exhibit greater-than-chance compressibility due to the hypothesized drive to distill fundamental epistemic elements (Zhou et al., 2020a). Third, considering conformational change theory (Zurn et al., 2021), we hypothesize that knowledge networks will possess greater-than-chance capacity for conformational changes due to the hypothesized drive for conceptual flexibility. In testing these hypotheses, we demonstrate the utility of the network approach in quantitatively validating existing theoretical constructs of curiosity as well as in formulating new ones.
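To make the null model concrete, the following is a minimal sketch of degree-preserving edge rewiring by repeated double-edge swaps. It is an illustration under stated assumptions rather than the procedure used in the study: the function names `degree_preserving_rewire` and `degrees`, the number of swaps, the seeding scheme, and the example edge list are all hypothetical.

```python
import random

def degree_preserving_rewire(edges, n_swaps, seed=0):
    """Return a rewired copy of an undirected simple graph's edge list.

    Each accepted double-edge swap replaces (a, b), (c, d) with (a, d), (c, b),
    so every node keeps its original degree. n_swaps and seed are assumptions.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done, attempts = 0, 0
    max_attempts = 100 * n_swaps  # give up eventually on graphs with few valid swaps
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if len({a, b, c, d}) < 4:
            continue  # a shared endpoint would create a self-loop or a no-op
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # reject swaps that would create a duplicate edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.add(new1)
        edge_set.add(new2)
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list

def degrees(edges):
    """Degree of every node that appears in an edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

# Hypothetical toy network: a 6-node ring with two chords.
EXAMPLE_EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (1, 4)]
null_edges = degree_preserving_rewire(EXAMPLE_EDGES, n_swaps=20, seed=1)
```

Under these assumptions, an ensemble of 25 null networks could be drawn by calling the function with 25 different seeds on the same empirical edge list.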

Results

Network growth formalism

Before testing the predictions of the three theories, we clarify the network formalism upon which they are operationalized. Consider a graph $G = (V, E)$ with node set $V$ and edge set $E \subseteq V \times V$. We define a growing knowledge network with the tuple $(G, s)$, where $s$ denotes a map $s : V \to \mathbb{N}$ that specifies the rank order in which nodes are added to the network. For networks built by individuals, $s$ is determined by the order in which Wikipedia pages are first visited. For collective networks, $s$ is determined by the years in which different concepts originate. With $N$ nodes in a network, we construct a sequence of graphs $G_{0} \subset G_{1} \subset \dots \subset G_{N} = G,$ (1) where $G_{p}$ is a subgraph of $G$ composed of the first $p$ nodes in $s$ and all $q$ connections between them that exist in $E$. Such a sequence, in which each element is a subset of the next, is an example of a filtration. We index each subgraph in a filtration by the number of nodes in the network at that stage. We identify topological cavities, measure network compressibility, and compute conformational flexibility for all subgraphs in filtrations of individual and collective knowledge networks as well as in filtrations of related null model networks. We perform non-parametric permutation tests to examine differences between feature curves for empirical and null model data (see Sec. Statistical testing for details). Since networks in a given data set may have different sizes, we normalize filtration indices to span the range [0, 1], and align values of interest to be defined on the same points before computing the mean across all individuals or topics for a feature-of-interest (Christianson et al., 2020). For completeness, we also report results with unnormalized values in the Supplementary Materials.
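The filtration in equation (1) can be sketched in a few lines. The example below is illustrative only: the function name `build_filtration`, the four-node toy network, and the representation of $s$ as an ordered list are assumptions, not the study's code.

```python
def build_filtration(edges, order):
    """Return a list of (p, nodes_p, edges_p) for p = 0..N.

    `order` lists the nodes in the rank order given by s; G_p contains the
    first p nodes and every edge of E whose two endpoints are both present.
    """
    rank = {node: i for i, node in enumerate(order)}

    def arrival(edge):
        # An edge appears once its later-arriving endpoint has been added.
        return max(rank[edge[0]], rank[edge[1]])

    pending = sorted(edges, key=arrival)
    stages, current, k = [], [], 0
    for p in range(len(order) + 1):
        while k < len(pending) and arrival(pending[k]) < p:
            current.append(pending[k])
            k += 1
        stages.append((p, order[:p], list(current)))
    return stages

# Hypothetical toy network visited in the order A, B, C, D.
order = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
stages = build_filtration(edges, order)
edge_counts = [len(stage_edges) for _, _, stage_edges in stages]
# Normalized filtration index in [0, 1], as used to align networks of different sizes.
normalized_index = [p / len(order) for p, _, _ in stages]
```

Each stage carries its node count $p$ and edge count $q$, which are the only inputs needed for the flexibility measure introduced later.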

Information gap theory

The information gap theory posits that curiosity is the drive to collect units of knowledge that fill gaps in one’s internal representation of the world (Loewenstein, 1994). When we model internal representations as networks, missing information can be usefully operationalized as topological cavities, which can be tracked in a principled manner using tools from applied algebraic topology (see Sec. Detecting topological cavities for methodological details) (Ghrist, 2007; Hatcher, 2002). This operationalization follows prior work demonstrating that domains as diverse as language development in toddlers (Sizemore et al., 2018), the introduction of characters in Dostoyevsky’s novels (Gholizadeh et al., 2018), and the presentation of concepts in linear algebra textbooks (Christianson et al., 2020) exhibit a systematic creation and closing of such cavities. By employing this approach, we can determine whether curiosity-driven exploration is motivated by a preference for gap closure. In a network, a k-dimensional cavity with k ≥ 1, also known as a k-cycle, is identified as an empty enclosure formed from (k + 1)-cliques, where a (k + 1)-clique is an all-to-all connected subgraph of k + 1 nodes. The k-th Betti curve records the number of k-cycles present at each stage of a network’s growth. Cycles of dimension 0 represent disconnected network components (Figure 2(a)), whereas those of dimensions 1 and 2 represent loop-like holes (Figure 2(d)) and pocket-like voids (Figure 2(g)), respectively. As a robustness check, we also benchmark our results using a simpler formalization of gaps in the Supplementary Materials. In this alternative formulation, we define gaps as triads and quantify their presence with the average clustering coefficient measured at each stage of knowledge growth.
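As a rough illustration of how cavities can be counted, the sketch below computes Betti numbers in dimensions 0, 1, and 2 for the clique complex of a single graph, using boundary-matrix ranks over GF(2). It is a simplified stand-in and not the persistent homology pipeline referred to above: it treats one stage of growth at a time, and the function names (`cliques_up_to`, `gf2_rank`, `betti_numbers`) and test graphs are hypothetical.

```python
from itertools import combinations

def cliques_up_to(nodes, edges, max_size):
    """Group all cliques with 1..max_size nodes by size (as sorted tuples)."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    by_size = {1: [(v,) for v in sorted(nodes)]}
    for size in range(2, max_size + 1):
        found = []
        for c in by_size[size - 1]:
            common = set.intersection(*(adj[v] for v in c))
            # Extend only with larger labels so each clique is listed once.
            found.extend(c + (w,) for w in sorted(common) if w > c[-1])
        by_size[size] = found
    return by_size

def gf2_rank(rows):
    """Rank over GF(2) of a binary matrix whose rows are integer bitmasks."""
    pivots, rank = {}, 0
    for row in rows:
        while row:
            high = row.bit_length() - 1
            if high in pivots:
                row ^= pivots[high]
            else:
                pivots[high] = row
                rank += 1
                break
    return rank

def betti_numbers(nodes, edges, max_dim=2):
    """Betti numbers beta_0..beta_max_dim of the clique complex of a graph."""
    by_size = cliques_up_to(nodes, edges, max_dim + 2)
    ranks = {0: 0}  # the boundary of a vertex is zero
    for size in range(2, max_dim + 3):
        index = {face: i for i, face in enumerate(by_size[size - 1])}
        rows = []
        for c in by_size[size]:
            mask = 0
            for face in combinations(c, size - 1):
                mask |= 1 << index[face]
            rows.append(mask)
        ranks[size - 1] = gf2_rank(rows)
    # beta_k = (#k-simplices) - rank(boundary_k) - rank(boundary_{k+1})
    return [len(by_size[k + 1]) - ranks[k] - ranks[k + 1]
            for k in range(max_dim + 1)]

# A 4-node loop encloses one 1-dimensional cavity.
square = betti_numbers(range(4), [(0, 1), (1, 2), (2, 3), (0, 3)])
# An octahedron (each node linked to all but its antipode) encloses one void.
antipodes = {(0, 1), (2, 3), (4, 5)}
octa_edges = [e for e in combinations(range(6), 2) if e not in antipodes]
octahedron = betti_numbers(range(6), octa_edges)
```

Applying such a function to every subgraph of a filtration would give one Betti number per dimension per stage; tracking which specific cavities are born and closed requires persistent homology proper.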

Figure 2.

Probing information gaps as topological cavities in growing knowledge networks. We operationalize information gaps as topological cavities (also referred to as cycles) and track their evolution in growing individual and collective knowledge networks. We plot the number of cycles as a function of time. (a) Topological cavities in dimension 0, or 0-cycles, represent disconnected network components. (b), (c) Individual and collective knowledge networks tend to possess fewer disconnected components than expected in edge-rewired null model networks. (d) In dimension 1, a topological cavity represents an enclosed loop formed by edges. (e), (f) Growing individual and collective knowledge networks tend to possess fewer loops than expected in edge-rewired null model networks. (g) A topological cavity in dimension 2 constitutes a void enclosed by 3-cliques, or triangles of interconnected nodes. (h), (i) Growing individual and collective knowledge networks tend to possess more 2-dimensional cavities than expected in edge-rewired null model networks. Shaded regions in panels B, C, E, F, H, and I represent standard error. Purple curves denote the average number of cavities in edge-rewired null model networks.

Considering information gap theory, we hypothesized that empirical knowledge networks would contain fewer cavities than topologically similar edge-rewired null model networks. To test this hypothesis, we compute persistent homology for filtrations of individual and collective knowledge networks in dimensions 0, 1, and 2. We find that the number of 0-cycles, or disconnected network components, increases as individual knowledge networks grow, and does so at a steeper rate in null networks than in empirical networks (Figure 2(b)). For collective knowledge networks, we find that the number of disconnected components first increases and then decreases both in the empirical and in the null model networks, albeit with significantly different peak values (Figure 2(c)). In both data sets, for a significant duration of growth, Betti curves for observed networks are lower than those for null model networks. In dimensions 1 and 2, we find that the number of cycles increases as individual and collective knowledge networks grow (Figure 2(d)–(i)). This temporal trajectory could arise from the fact that filling gaps by forging new connections can open new gaps, making it prohibitively difficult to track (and fill) gaps among an increasingly large number of items. In support of information gap theory, the rate at which 1-cycles increase is lower for the empirical networks than for the null networks (Figure 2(e) and (f)). In contrast to information gap theory, the rate at which 2-cycles increase is higher in the empirical networks than in the null networks (Figure 2(h) and (i)). The marked growth of 2-dimensional cavities could reflect an alternative drive to expand and complexify knowledge networks. All empirical Betti curves are significantly different from the Betti curves for the null model data (p_perm < 0.001) as determined via permutation testing. In summary, across both individual and collective knowledge networks, our findings suggest that information gap theory explains how separate areas of interest (0-cycles) grow and then are subsequently linked together, and how loop-like holes (1-cycles) within specific areas of interest grow and are subsequently filled. However, the extent and longevity of larger pocket-like voids (2-cycles) remain unexplained by the information gap theory, motivating an assessment of alternative psychological drives.

Compression progress theory

Originally proposed as a general algorithmic framework for reinforcement learning, compression progress theory posits that curiosity is the drive to continually improve the compression of a learner’s mental model of the world (Schmidhuber, 2008). By conceptualizing mental models as knowledge networks, we can measure compressibility using recent advances at the intersection of information theory and network science (Lynn and Bassett, 2021). To compute the compressibility of a network, we begin by considering a random walk x = (x₁, x₂, ⋯), where x_t is the node that appears at step t (Figure 3(a)). The rate at which the sequence x generates information is given by its entropy H( x ). If we group the network’s nodes into clusters, we can re-write x = (x₁, x₂, ⋯ ) as y = (y₁, y₂, ⋯ ) by replacing each node x_t with its cluster identity y_t. The rate at which the clustered sequence y generates information about the original sequence x is given by the mutual information I( x , y ) = H( y ) − H( y | x ). The number of clusters that we use to compress the network defines a scale of its description. As we decrease the number of clusters—that is, as we increase the scale of description—the information rate I( x , y ) decreases. When each node belongs to its own cluster, the information rate I( x , y ) equals the original rate H( x ) (Figure 3(a)). By contrast, when all nodes are grouped together into one cluster, the information rate is zero (Figure 3(a)). At all scales in between, we can find the optimal clustering of nodes that minimizes the information rate (Figure 3(b)). We then define the compressibility of the network as the maximal reduction in the information rate, averaged across all scales of its description (Figure 3(b)) (Lynn and Bassett, 2021).
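The two limiting cases described above can be checked numerically. The sketch below computes the entropy rate of an unbiased random walk on an undirected network and, for a given clustering, a first-order Markov upper bound on the rate of the clustered sequence. Two assumptions are worth flagging: because each cluster label is a deterministic function of the node, we take H( y | x ) = 0 so that the rate reduces to H( y ); and we do not implement the search for optimal clusterings or the averaging across scales that define compressibility (Lynn and Bassett, 2021). Function names and the toy network are hypothetical.

```python
from math import log2

def entropy_rate(adj):
    """Entropy rate H(x), in bits per step, of an unbiased random walk.

    Uses the stationary distribution pi_i = k_i / 2m of an undirected graph.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())
    return sum(len(nbrs) / two_m * log2(len(nbrs))
               for nbrs in adj.values() if nbrs)

def clustered_rate_bound(adj, cluster_of):
    """First-order Markov bound H(y_{t+1} | y_t) on the clustered walk's rate."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    joint = {}     # stationary probability of a step from cluster a to cluster b
    marginal = {}  # stationary probability of being in cluster a
    for i, nbrs in adj.items():
        a = cluster_of[i]
        for j in nbrs:
            b = cluster_of[j]
            joint[(a, b)] = joint.get((a, b), 0.0) + 1.0 / two_m
            marginal[a] = marginal.get(a, 0.0) + 1.0 / two_m
    return -sum(p * log2(p / marginal[a]) for (a, _), p in joint.items())

# Hypothetical toy network: two triangles joined by a single bridge (2-3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}

h_x = entropy_rate(adj)
rate_singletons = clustered_rate_bound(adj, {i: i for i in adj})
rate_two_clusters = clustered_rate_bound(adj, {0: "L", 1: "L", 2: "L",
                                               3: "R", 4: "R", 5: "R"})
rate_one_cluster = clustered_rate_bound(adj, {i: "all" for i in adj})
```

In this toy network the two-triangle clustering yields a rate strictly between the two extremes, which is the kind of reduction that the compressibility measure aggregates across scales.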

Figure 3.

Quantifying compression progress using network compressibility. (a) A random walk x on a network is a sequence of nodes constructed by transitioning from a node x_t to one of its neighbors uniformly at random. Such a sequence generates information at a rate given by its entropy H( x ). Now suppose that we group the nodes into different clusters; the number of clusters defines the scale at which the network is described. The random walk x is compressed into a new sequence y , where y_t is the cluster that contains node x_t. The clustered sequence y generates information about the original sequence x at a rate given by the mutual information I( x , y ) = H( y ) − H( y | x ). Mutual information I( x , y ) is greatest—and equal to the entropy H( x )—when each node is assigned independently to its own cluster. By contrast, in the limit where the entire network is viewed as one large cluster, mutual information vanishes. (b) At each intermediate scale between these two extremes, we can find an optimal clustering that maximally lowers the information rate. Network compressibility is then defined as the maximal reduction in the information rate, averaged across all scales. (c) Growing individual knowledge networks are markedly more compressible than expected considering related edge-rewired null model networks. (d) Growing collective knowledge networks show only a slight tendency for greater-than-expected compressibility. Shaded regions in panels C and D represent standard error. Purple curves denote average compressibility values for edge-rewired null model networks.

Considering compression progress theory, we hypothesized that growing knowledge networks would be more compressible than topologically similar edge-rewired null model networks. We test our hypothesis by computing network compressibility for each subgraph in filtrations of individual and collective knowledge networks. We find that compressibility increases monotonically as knowledge networks grow. At all stages of growth, and in support of our hypothesis, networks for individuals exhibit greater-than-expected compressibility (Figure 3(c)). The same trend holds, but to a much weaker extent, in the collective knowledge networks. While the early stages of growth evince greater separation between empirical and null compressibility values, the two curves overlap in later stages of growth (Figure 3(d)). Based on non-parametric permutation testing, compressibility curves for individual and collective knowledge networks are significantly different from their null model counterparts (p_perm < 0.001). These data provide evidence that is critical for an evaluation of compression progress theory. Consistent with the theory, for individuals, a preference for greater compressibility indicates that curiosity is driven to construct parsimonious mental representations of knowledge. For collectives, a similar-to-expected compressibility could reflect (i) the group’s nature as constituted by the diverse voices and expertise that comprise it, which can enhance the relevance of details and preclude their compression, and (ii) the fact that groups may not be constrained by the same cognitive capacity limitations that constrain individuals.

Conformational change theory

A learner practising curiosity solely according to information gap theory strives for growth and completeness of knowledge. By contrast, a learner practising curiosity solely according to compression progress theory strives to uncover the latent organization of the world. In the process, neither individual can keep pace with the growing complexity of the environment, whose frontier of ignorance expands rapidly as new unknowns become accessible. Crucially, both theories suggest how we can usefully add or relinquish information but neither acknowledges the worth of what we already possess. Prior work has shown that curiosity-driven information acquisition is not only about growing or shedding knowledge, but also about retreading and reconsidering what one presently holds (Lydon-Staley et al., 2021; Zhou et al., 2020a). Following Zurn et al. (2021), we propose that such reflection entails moving concepts flexibly in relation to one another. Specifically, we define curiosity as the process of constructing knowledge networks with a finely arbitrated balance between local internal rigidity and global external flexibility. Rigidity and flexibility are mechanical notions that require an object of interest to be embedded in physical space. Therefore, drawing inspiration from a rich literature on cognitive maps (see Supplementary Materials for background), we assume that knowledge networks are embedded in Euclidean space where they possess several degrees of freedom. We then measure flexibility as a network’s ability to undergo conformational changes (Kim et al., 2019) and formalize our account as the conformational change theory of curiosity.

Before measuring the conformational flexibility of growing knowledge networks, we offer a brief introduction to mechanical networks. Consider a triangular network in two dimensions. Each of its nodes can be located with two coordinates (Figure 4(a)). This network has three available rigid-body motions: horizontal translation, vertical translation, and rotation. Next, consider a network composed of 4 nodes and 4 edges (Figure 4(b)). This network possesses the same rigid-body motions as are available to the triangle. Additionally, the quadrilateral possesses a conformational degree of freedom. A conformational change in a network alters the Euclidean distance between unconnected pairs of nodes. For instance, if a pair of adjacent nodes in the quadrilateral is held fixed in space, the remaining nodes can be moved freely while sweeping across an angle θ with respect to the fixed pair (Figure 4(b)). Through this process, this simple network exhibits a conformational change from a square to a diamond. Mechanical networks can exist in several configurations, each of which can be reached through a series of conformational changes from any of the others (Figure 4(c) and (d)). The number of independent motions available to a network with p nodes and q edges embedded in a d-dimensional space is dp − q. Among these dp − q degrees of freedom are d(d + 1)/2 rigid-body motions, which include translations and rotations. The rest, given by $\mathrm{DoF}_{C} = dp - q - \frac{d(d + 1)}{2},$ (2) are the available conformational degrees of freedom, or conformational motions.
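Equation (2) is simple enough to verify by direct arithmetic. The short sketch below evaluates it for the triangle and quadrilateral discussed above, and for the 338-node, 672-edge chain in Figure 4(d); the function name `conformational_dof` is hypothetical, and the count assumes linearly independent edges.

```python
def conformational_dof(d, p, q):
    """Conformational degrees of freedom: d*p - q - d*(d + 1)/2 (equation (2)).

    d: embedding dimension, p: number of nodes, q: number of edges.
    d*(d + 1) is always even, so integer division is exact.
    """
    return d * p - q - d * (d + 1) // 2

triangle = conformational_dof(d=2, p=3, q=3)        # rigid: no conformational motion
quadrilateral = conformational_dof(d=2, p=4, q=4)   # one motion (the angle theta)
chain = conformational_dof(d=2, p=338, q=672)       # the chain of Figure 4(d)
```

The triangle evaluates to 0 and the quadrilateral to 1, matching the description above; under the same counting, the chain also evaluates to a single conformational degree of freedom.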

Figure 4.

Conformational change in mechanical networks. (a) In two-dimensional space, a network with three nodes and three edges has three rigid-body degrees of freedom: horizontal translation, vertical translation, and rotation. (b) In addition to the three rigid-body motions, a quadrilateral frame also possesses a conformational degree of freedom, depicted here via the angle parameter θ, which allows it to change shape from a square to a diamond. (c) Rigid and flexible sub-units can be combined to construct networks capable of undergoing large-scale conformational changes. Different configurations of the same network can be reached by propagating conformational changes through its structure. (d) A network chain with 338 nodes and 672 edges folds to form a quadrifolium (panel D reproduced with permission from Kim et al. (2022)).

Crucially, equation (2) relies on the linear independence of edges. Linear independence entails that there are no redundant edges to over-constrain a set of nodes beyond the formation of a rigid cluster yielding a state of self-stress, and that the network does not exist in a rare and pathological geometry known as a kinematic bifurcation (Kim et al., 2019; Mao and Lubensky, 2018). States of self-stress imply that edges within a network bear internally balanced forces. A negative value for the number of conformational degrees of freedom would indicate that the network, when considered in its entirety, is over-constrained. In our framework, we assume that such states, wherein competing constraints between concepts cannot be resolved, are aversive to humans. This assumption could be tested in future work by correlating individual-level measurements of self-stress with the Need for Closure Scale (NFCS) (Roets and Van Hiel, 2007; Webster and Kruglanski, 1994). We alleviate the tension by incrementing the dimensionality by 1 when needed. Specifically, we increment d by 1 until DoF_C is no longer negative. This approach yields the minimum dimensionality required at each stage of growth to avoid over-constraining the network. Figure 5(a) depicts this process for a representative filtration of a growing network. When node 3 is added to the network, which is initially embedded in a 1-dimensional space (green), the number of conformational degrees of freedom evaluates to (1 × 3) − (3) − 1 = −1, indicating the presence of self-stress and requiring that the embedding dimensionality be incremented to 2 (orange). This process repeats when node 6 is added, resulting in an increase in dimensionality to 3 (red). To compute the number of conformational degrees of freedom for growing knowledge networks, we assume that they are initially embedded in a 1-dimensional space. Whenever the quantity in equation (2) becomes negative, we increment dimensionality by 1. We note that due to this initial assumption, values for embedding dimensionality, both for individuals and collectives, are best interpreted not in absolute terms but instead as differences or in comparison with values for the null model networks.

Figure 5.

Conformational change theory of curiosity. We propose that in the networked space of the mind, while some concepts and their relationships have fixed locations, others can move flexibly in a context-dependent manner. Such flexibility affords curious humans the ability to rethink and reconfigure what they already know in light of new information. We formalize flexibility as the number of conformational degrees of freedom (DoF_C). In a network residing in a d-dimensional space with p nodes and q edges, DoF_C = dp − q − d(d + 1)/2. Assuming networks are initially embedded in a 1-dimensional space, we compute DoF_C for filtrations of growing knowledge networks. A negative value for the number of conformational degrees of freedom indicates the presence of self-stress, which we resolve by incrementing dimensionality by 1. (a) In the example filtration, when nodes 3 and 6 are added, the network becomes over-constrained and develops self-stress. Consequently, dimensionality first increases from 1 (green) to 2 (orange) and then from 2 (orange) to 3 (red). (b), (c) Individual knowledge networks require greater dimensionality and possess greater flexibility than null model networks. (d), (e) Collective knowledge networks do not exhibit greater dimensionality or conformational flexibility than null model networks. Shaded regions in panels B-E represent the standard error. Purple curves denote average values for edge-rewired null model networks.

We hypothesized that knowledge networks would possess greater conformational flexibility than corresponding null model networks. We test this hypothesis by computing the number of conformational degrees of freedom in filtrations of individual and collective knowledge networks (Figure 5(c) and (e)). In parallel, we track the minimum embedding dimensionality required to prevent self-stress from developing in the growing networks (Figure 5(b) and (d)). We find that individual knowledge networks need greater dimensionality and possess greater conformational flexibility than null model networks (Figure 5(b) and (c)). By contrast, measurements of dimensionality and flexibility for collective networks cannot be as easily distinguished from their corresponding null model data (Figure 5(d) and (e)). However, for both data sets, the empirical curves for dimensionality and conformational flexibility are significantly different from the curves for the null model data (p_perm < 0.001). Our findings suggest that individuals value the ability to reconsider what they already know in light of newly acquired information. On the other hand, collective knowledge displays less capacity for global reconfiguration over the long time scales evaluated in this study; future work could investigate the existence and dynamics of internal sectors that change shape over different time scales or during paradigm shifts.

Discussion

In this work, we formalize curiosity as the process of constructing a growing knowledge network. We leverage tools from network science to quantitatively examine several theoretical constructs for curiosity such as the information gap theory and compression progress theory. Information gap theory suggests that curiosity is the drive to obtain units of knowledge that fill gaps in understanding (Loewenstein, 1994). Compression progress theory posits that curiosity is the drive to uncover the latent organization of the world (Schmidhuber, 2008). We probe information gaps as topological cavities in growing knowledge networks and quantify compression progress using network compressibility. The two theories offer complementary perspectives on curiosity; the information gap theory suggests that new information is acquired to fill knowledge gaps, whereas the compression progress theory suggests that new information is used to distill the essential epistemic elements of knowledge. While these perspectives describe how knowledge networks become denser and simpler through information acquisition, an alternative formulation is needed to explain how they become expansive and more complex. Therefore, we build upon a recently proposed conceptual framework (Zurn et al., 2021) to develop the conformational change theory of curiosity. We posit that knowledge networks are embedded in a Euclidean geometry, which allows concepts to move flexibly in relation to one another. We then view curiosity as the practice of constructing mechanically flexible knowledge networks. Formally, we measure conceptual flexibility as the number of conformational degrees of freedom available to a growing knowledge network. Throughout our investigations, we take a multi-scale view and probe evidence for each theory in individuals and in collectives. 
Across the two scales, we determine the precise contexts in which each theoretical account is explanatory, thereby clarifying their complementary and specific affordances.

Information gap theory and topological cavities in knowledge networks

Information gap theory suggests that humans tolerate a finite amount of uncertainty in their knowledge of the world (Loewenstein, 1994). Exposure to a small amount of previously unknown information brings into focus the presence of a knowledge gap, pushing the level of uncertainty past an acceptable threshold. This increased uncertainty prompts a search for information to fill the knowledge gap and resolve the unknown. In this work, we formalize gaps as topological cavities in growing knowledge networks and track their evolution in dimensions 0, 1, and 2 (Bianconi, 2021; Ju et al., 2020; Sizemore et al., 2018). Each dimension is characterized by a different kind of topological gap: 0-dimensional gaps correspond to disconnected network components, 1-dimensional gaps correspond to loop-like holes, and 2-dimensional gaps correspond to pocket-like voids. Across all dimensions, we find that the number of cavities increases as individual knowledge networks grow. Stated differently, associations between familiar concepts remain undiscovered even as we acquire more information. Hence, in addition to the common view of an expanding frontier of ignorance, knowledge growth is accompanied by an ever-expanding interior of ignorance (Ju et al., 2020). Except for the 0-th dimension, we report similar results for knowledge networks built collectively. Filling a 0-dimensional cavity entails adding an edge between two disconnected network components. Such edges may be easier for collectives to add than for individuals since interdisciplinary sub-fields within scientific domains are motivated to link disparate sub-areas of knowledge (Okamura, 2019). Importantly, and in support of the information gap theory, the number of 0- and 1-dimensional cavities is lower in observed individual and collective knowledge networks than in the corresponding null model data, reflecting a downward pressure on the number of gaps created, consistent with a gap-filling drive. 
Therefore, from a networks perspective, gaps—as envisioned by information gap theory, those that are prioritized for filling—may best correspond to topological cavities of dimensions 0 and 1. Stated differently, information gap theory provides an explanation for the markedly damped growth of lower dimensional cavities; however, a different account is needed to explain the contrasting proliferation of higher-dimensional cavities, both in individuals and in collectives.

Compression progress theory and efficient network representations of knowledge

To gain a deeper intuition, we turn to compression progress theory, which derives inspiration from resource limitations that underpin brain function (Schmidhuber, 2008). We represent knowledge as a network of concepts and their inter-relationships, and we compute network compressibility (Lynn and Bassett, 2021) to determine whether curiosity drives compression. We find that growing individual knowledge networks consistently exhibit greater-than-expected compressibility, consistent with the theory. This finding can be contextualized by considering that as we interact with the world, we encounter and consume large quantities of information. Constructing perfectly accurate mental models would entail storing each unit of acquired knowledge separately. However, finite resources constrain us to build compressed or efficient abstractions of observed data that can generalize across contexts (Tenenbaum et al., 2011). According to compression progress theory, information that—when acquired—facilitates such abstraction is more valuable (Schmidhuber, 2008). Our results support this proposition and suggest that individuals preferentially seek such information. By contrast, the compressibility curve for collective knowledge networks tends to align with the curve for the corresponding null model data in later stages of growth. This finding can be contextualized by considering the fact that collectives can store vast quantities of detailed information in a distributed manner and, hence, do not face the same resource limitations that individuals do. In summary, while compression progress theory is supported by our data from individual knowledge networks, the building of collective knowledge networks appears to require a different account.

Conformational change theory and the mechanical flexibility of knowledge networks

The conformational change theory of curiosity is an alternative account that is built on two assumptions. First, we assume that humans encode conceptual knowledge in cognitive networks. Second, we assume that knowledge networks are embedded in Euclidean space, where they possess several degrees of freedom. Both assumptions are predicated on how humans encode spatial and abstract knowledge (Garvert et al., 2017; Peer et al., 2021; Stiso et al., 2022; Warren, 2019). Evidence from spatial navigation studies demonstrates that mental representations of space take the form of labeled cognitive graphs. Each node represents a physical location and is accompanied by local metric information such as angles and Euclidean distances to its immediate neighbors (Chrastil and Warren, 2014; Peer et al., 2021; Warren, 2019). Furthermore, hexadirectional modulation, the telltale signature associated with an underlying map-like neural code, is observed in neural signals when individuals navigate discrete and continuous abstract concept spaces (Constantinescu et al., 2016; Park et al., 2021) (see Supplementary Materials for details on mental representations of spatial and non-spatial knowledge). Building on Euclidean cognitive graphs, we operationalize conceptual flexibility in knowledge networks as the number of conformational degrees of freedom. We find that growing individual knowledge networks have greater-than-expected embedding dimensionality and conformational flexibility. According to conformational change theory, embedding dimensionality increments when growing knowledge networks become over-constrained and develop self-stress. We find that such stress arises more frequently in individual knowledge networks than in null model data. This observation is consistent with the conformational change theory of curiosity, and suggests that individuals’ idiosyncratic acquisition of information leads to a frequent reshaping of concept relations based on context. 
By contrast, in knowledge networks built collectively we find that the evolution of mechanical features-of-interest cannot be distinguished from their evolution in null model data. Collective networks grow through a dynamic interplay of consensus and dissensus between large groups of individuals. Therefore, it is possible that due to the long time scales that we focus on in this study, dynamic events associated with collective knowledge growth, such as paradigm shifts, are simply concealed from view in local sectors of each field.

Implications for reinforcement learning

The computational metrics that we examine here are relevant not only for the study of human curiosity, but also potentially for that of artificial intelligence. Compressibility, for instance, was originally proposed as an intrinsic learning signal to guide reinforcement learning (Schmidhuber, 2008). In both single and multi-agent settings, the design of intrinsic (or curiosity-based) reward signals for reinforcement learning is an increasingly important area for further research (Aubret et al., 2019), and may benefit from computational insights into human behavior, such as those derived from our analyses here. Our work provides several candidate metrics—such as the number of topological cavities, network compressibility, and conformational flexibility—that can act as suitable curiosity-based signals for tasks where the environment can be modeled as a network. Information acquisition in reinforcement learning is a means to an end, where the end is a reward associated with the successful completion of a specific task (Sutton and Barto, 2018). An agent seeking to collect high total reward during interactions with its environment must strike a balance between exploitation and exploration. The agent must exploit, or productively use, those actions that are currently known to yield high reward but must also occasionally explore untested actions that may eventually turn out to be better. In many real-world settings, external rewards are highly infrequent or even completely absent and, thus, cannot reliably guide behavior. In such sparse reward environments, curiosity-like intrinsic motivations can lead to improved exploration and, by extension, improved task performance (Pathak et al., 2017; Savinov et al., 2018). At the collective level, models of intelligence tend to characterize the interactions between multiple agents. 
It remains to be seen, however, whether features such as coordination or cooperation emerge from prescriptive rules that describe individual agents’ motivations. Our work represents an initial step in this direction.

Conclusion

We conceptualize curiosity as the process of knowledge network building in order to examine three theoretical accounts: information gap theory, compression progress theory, and conformational change theory. Formalizing curiosity in terms of networks helps us to quantitatively operationalize predominantly qualitative theoretical constructs. Information gaps can be identified as topological cavities, compression progress can be quantified using network compressibility, and flexibility—premised on the conformational change theory—can be quantified as the number of conformational degrees of freedom. We use data acquired from Wikipedia to construct growing knowledge networks for individuals and for collectives. We find that as networks grow, knowledge gaps increase in number, suggesting an expanding interior of ignorance. Yet, in support of an aversion to gaps predicted by information gap theory, we also find fewer-than-expected disconnected network components (or 0-dimensional topological cavities) and fewer-than-expected loops of edges (or 1-dimensional topological cavities) in growing knowledge networks. This set of findings suggests that knowledge “gaps” as conceptualized by information gap theory may best translate, in network terms, to 0 and 1-dimensional cavities. We also find that growing individual knowledge networks possess greater-than-expected compressibility, indicating that information acquisition may be driven to construct parsimonious mental world models. In addition, we find that knowledge networks built by individuals become increasingly flexible with growth, foregrounding the longstanding relevance of conformational change in the mind. Our results lend support to a pluralistic view of curiosity, wherein intrinsically motivated information acquisition fills knowledge gaps and builds increasingly compressible and flexible mental representations of the world. 
Our findings offer a novel network theoretical perspective on intrinsically motivated information acquisition that may harmonize with or compel an expansion of the classical taxonomy of curiosity.

Methods

Data

Knowledge networks built by individuals

Knowledge networks for individuals are constructed with data obtained from the “Knowledge Networks Over Time” (KNOT) study (Lydon-Staley et al., 2020, 2020; Lydon-Staley et al., 2021). These data comprise the Wikipedia browsing histories of 149 individuals (121 women, 26 men, 2 other) collected between October 2017 and July 2018. At the beginning of the study, all interested participants attended a laboratory session, where they received training in a daily assessment protocol and were guided through the installation of tracking software called Timing to monitor their Wikipedia browsing sessions. Every evening for 21 days, participants were sent a survey questionnaire. After completing the survey, they were instructed to engage in 15 min of self-directed information search on wikipedia.org. At the end of each session, participants used the tracking software to export and upload their browsing histories. Participants were incentivized with Amazon gift cards at each phase of the study: $25 for the initial laboratory visit, and $10, $15, $20, $25, and $35 for completing three, four, five, six, and seven daily assessments, respectively. Data acquired from the daily assessment questionnaires are not used in this work. At the time of data acquisition, participants were aged between 18.21 and 65.24 years; 6.71% identified as African American/Black, 25.50% identified as Asian, 5.37% identified as Hispanic/Latino, 49.66% identified as White, 5.37% identified as Multiracial, 5.37% identified as Other, and 2.01% provided no racial or ethnic information.

We treat all pages visited by an individual as nodes in a knowledge network. Edges between nodes are specified based on the presence of hyperlinks. Prior work has found that pairs of pages connected by hyperlinks are significantly more similar to each other than pairs that are not (Lydon-Staley et al., 2021). Thus, we add an undirected and unweighted edge between Page 1 and Page 2 if either Page 1 links to Page 2 or Page 2 links to Page 1. Hyperlinks are not required to exist bidirectionally for an edge to exist between two nodes. We determine the presence of hyperlinks based on how Wikipedia appeared on August 1, 2019. Each node (or page) in the browsing data is accompanied by an index that denotes the temporal order in which it was visited. Each new session begins at the last visited page of the previous session. We stitch data acquired across 21 days to build a comprehensive browsing history. For every individual, the nodes and edges, as well as the order of node visitation, are used to specify a growing knowledge network.
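The construction described above can be sketched in a few lines of Python. This is a minimal illustration and not the study's pipeline: the function name `growing_network`, the page titles, and the hyperlink pairs are all hypothetical, and repeat visits are assumed to keep a page's first-visit position.

```python
def growing_network(visits, links):
    """Return one (nodes, edges) snapshot per newly visited page.

    visits: page titles in the order they were visited (repeats allowed).
    links:  set of directed (source, target) hyperlink pairs.
    """
    # Assumption: a revisited page keeps the index of its first visit.
    order = list(dict.fromkeys(visits))
    nodes, edges, snapshots = [], set(), []
    for page in order:
        for other in nodes:
            # Undirected, unweighted edge if a hyperlink exists in either direction.
            if (page, other) in links or (other, page) in links:
                edges.add(frozenset((page, other)))
        nodes.append(page)
        snapshots.append((list(nodes), set(edges)))
    return snapshots


# Hypothetical browsing history and hyperlinks, for illustration only.
visits = ["Curiosity", "Information", "Entropy", "Information", "Topology"]
links = {("Curiosity", "Information"), ("Entropy", "Information")}
snapshots = growing_network(visits, links)
```

Each snapshot corresponds to one stage of the growing graph, so the list as a whole plays the role of a filtration ordered by first visit.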

Knowledge networks built collectively

In its role as an encyclopedia, Wikipedia represents a large repository of knowledge acquired over thousands of years through collective human effort. Building on prior work, we construct domain-specific collective knowledge networks by taking subgraphs of the larger Wikipedia network (Ju et al., 2020). Information in Wikipedia is organized in a hierarchical manner, which makes it possible to identify articles that pertain to a particular domain of interest. We capitalize on this structure to construct knowledge networks for the following thirty topics: abstract algebra, accounting, biophysics, Boolean algebra, calculus, cognitive science, commutative algebra, dynamical systems and differential equations, dynamical systems, earth science, economics, education, energy, evolutionary biology, geology, geometry, group theory, immunology, linear algebra, linguistics, meteorology, molecular biology, number theory, optics, philosophy of language, philosophy of law, philosophy of mind, philosophy of science, sociology, and software engineering. All pages listed under a topic are treated as nodes in the topic’s network. For instance, the network for molecular biology contains pages for “allele,” “lymphocyte,” and “antibody” as nodes. As in knowledge networks for individuals, edges between nodes are determined by hyperlinks. Typically, articles also contain information about the year in which the concept they describe first became known; the year attribute is used as an index to specify node order in a growing graph. For instance, benzene was first isolated by Michael Faraday in 1825. Therefore, in the collective knowledge network for chemistry, the node for benzene is added before other nodes timestamped after the year 1825. More details on the network construction process (such as the procedure followed when a page has no year attribute) are available in Ju et al. (2020).

Detecting topological cavities

In order to identify cavities of various dimensions in a network, we construct a higher-order relational object known as a simplicial complex. While a graph is comprised of a set of nodes and a set of edges, a simplicial complex consists of simplices. A simplex represents a polyadic relationship among a finite set of nodes. Geometrically, a k-simplex is realized as the convex hull (enclosure) of k + 1 generally placed vertices. For 0 ≤ k ≤ 2, a node is a 0-simplex, an edge is a 1-simplex, and a filled triangle is a 2-simplex. Simplices follow the downward closure principle, which requires that any subset of vertices, known as a face, within a simplex also form a simplex. For instance, a 2-simplex (filled triangle) has three 1-simplices (edges) as its faces, each of which in turn is comprised of two 0-simplices (nodes). In graph terms, a k-simplex corresponds to a (k + 1)-clique, which is an all-to-all connected subgraph of k + 1 nodes. We can construct a simplicial complex by assigning a k-simplex to each (k + 1)-clique in a binary graph. Thus, the resulting combinatorial object is sometimes also referred to as the clique complex of the graph. We denote the clique complex of the graph $G_{p}$ as $X (G_{p})$.

In a clique complex, a k-dimensional topological cavity is identified as an empty enclosure formed by k-simplices. Whether a collection of simplices encloses a cavity is determined in part by its boundary. The boundary of a k-simplex σ is defined as the set ∂σ of its (k − 1)-faces. The boundary of a set of simplices K = {σ₁, σ₂, ⋯ , σ_m} is obtained by taking the symmetric difference Δ of the boundaries of its constituents: $\partial K = \partial \{σ_{1}, σ_{2}, \dots, σ_{m}\} = \partial σ_{1} Δ \partial σ_{2} Δ \dots Δ \partial σ_{m}.$

The symmetric difference is an associative operation that returns the union of two sets without their intersection. A set of k-simplices with an empty boundary is called a k-cycle. At first glance, it may seem adequate to identify cycles of various dimensions in a simplicial complex and treat them as topological cavities. However, note that any collection of (k + 1)-simplices has a k-cycle as its boundary. For example, the boundary of a 2-simplex is a 1-cycle that is “filled in” by the 2-simplex. Thus, it is necessary to distinguish non-trivial cycles that constitute true cavities from those that trivially belong to the boundaries of higher-dimensional simplices. Finally, we introduce the notion of equivalence. Two k-cycles K₁ and K₂ are equivalent if K₁ Δ K₂ is the boundary of a collection of (k + 1)-simplices. Homology refers to the counting of non-equivalent cycles of various dimensions in a clique complex. It is customary to refer to non-equivalent cycles simply as cycles for brevity.
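The boundary operation and the notion of a cycle can be made concrete with a short Python sketch. The representation of simplices as `frozenset` objects and the function name `boundary` are choices made here for illustration; they are not taken from the study's code.

```python
from itertools import combinations


def boundary(simplices):
    """Boundary of a collection of k-simplices, as a set of (k - 1)-faces.

    Each simplex is a frozenset of vertices. A face shared by an even number
    of simplices cancels out, which is exactly what the symmetric
    difference of the individual boundaries does.
    """
    result = set()
    for simplex in simplices:
        faces = {frozenset(f) for f in combinations(simplex, len(simplex) - 1)}
        result ^= faces  # symmetric difference
    return result


triangle = frozenset({1, 2, 3})
loop = boundary([triangle])      # the three edges of the filled triangle
empty = boundary(loop)           # the edges form a 1-cycle: no boundary left

# Two filled triangles glued along the edge {2, 3}: the shared edge cancels.
glued = boundary([frozenset({1, 2, 3}), frozenset({2, 3, 4})])
```

Here `loop` is a trivial 1-cycle, because it is the boundary of a 2-simplex, whereas four edges arranged in a square with no filled triangle would form a non-trivial cycle.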

The graph filtration from equation (1) induces a related filtration of clique complexes $X (G_{0}) \subset X (G_{1}) \subset \dots \subset X (G_{N}) = X (G),$ (3)

At each stage in the filtration, we add a node and replace all cliques that may result from its addition with relevant simplices. While some newly added simplices create cavities, others close older ones. Equivalence allows us to compute persistent homology wherein we track the evolution of each cavity from the moment it is first born to the moment it is completely filled in by higher simplices. At any index p of the filtration, the k-th Betti number β_k(p) records the number of active cavities of dimension k. We then define the k-th Betti curve as the sequence of numbers $\{β_{k} (p)\}_{p = 0}^{N}$ (Table 1). We compute persistent homology for all knowledge networks using the Ripser package (Tralie et al., 2018).

Table 1.

Notation for information gap theory.

Parameter	Description
$G_{p}$	Graph induced by the visited nodes at time p
k	Topological dimension that defines a specific type of cavity in a network
$X (G_{p})$	Clique (or simplicial) complex derived from graph $G_{p}$ at time p
σ	A simplex σ is a shape enclosed by flat sides
K	A hollow enclosure formed out of simplices
β_k(p)	The number of topological cavities of dimension k at time p

For a more comprehensive treatment of topological data analysis, we direct the interested reader to Refs. Bianconi, 2021; Carlsson, 2009; Ghrist, 2007; Hatcher, 2002; Sizemore et al., 2019; Zomorodian and Carlsson, 2005.
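As a small illustration of counting non-equivalent cycles, the sketch below computes the Betti numbers β_0 and β_1 of the clique complex of a single graph using linear algebra over the two-element field. It is a toy calculation under stated assumptions and not a substitute for Ripser: it handles only one stage of a filtration, enumerates cliques by brute force, and stops at 2-simplices. All function names are hypothetical.

```python
from itertools import combinations


def clique_complex(nodes, edges, max_dim=2):
    """Simplices of the clique complex, grouped by dimension 0..max_dim."""
    adjacent = {frozenset(e) for e in edges}
    simplices = {0: [frozenset([v]) for v in nodes]}
    for k in range(1, max_dim + 1):
        simplices[k] = [
            frozenset(c)
            for c in combinations(sorted(nodes), k + 1)
            if all(frozenset(pair) in adjacent for pair in combinations(c, 2))
        ]
    return simplices


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    pivots = {}  # highest set bit -> reduced row with that leading bit
    for row in rows:
        while row:
            high = row.bit_length() - 1
            if high not in pivots:
                pivots[high] = row
                break
            row ^= pivots[high]
    return len(pivots)


def boundary_rank(simplices, faces):
    """Rank of the boundary map from k-simplices to (k - 1)-simplices."""
    index = {face: i for i, face in enumerate(faces)}
    rows = []
    for simplex in simplices:
        mask = 0
        for face in combinations(simplex, len(simplex) - 1):
            mask |= 1 << index[frozenset(face)]
        rows.append(mask)
    return rank_gf2(rows)


def betti_0_1(nodes, edges):
    """Return (beta_0, beta_1) for the clique complex of the graph."""
    cx = clique_complex(nodes, edges, max_dim=2)
    r1 = boundary_rank(cx[1], cx[0])
    r2 = boundary_rank(cx[2], cx[1])
    beta0 = len(cx[0]) - r1            # connected components
    beta1 = (len(cx[1]) - r1) - r2     # loops not filled by triangles
    return beta0, beta1


square = [(1, 2), (2, 3), (3, 4), (4, 1)]
open_loop = betti_0_1([1, 2, 3, 4], square)
filled = betti_0_1([1, 2, 3, 4], square + [(1, 3)])
```

Adding the chord (1, 3) turns the square into two filled triangles, so the single loop-like cavity in `open_loop` is closed in `filled`.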

Computing network compressibility

In order to estimate the compressibility of a network, we consider a binary graph $G_{p}$ with p nodes and q edges, which can be represented by a symmetric adjacency matrix $M \in \mathbb{R}^{p \times p}$. A message containing information about the network’s structure can be conveyed to an arbitrary receiver by encoding it in the form of a random walk $x = (x_{1}, x_{2}, \dots)$. The walk sequence is generated by transitioning from a node to one of its neighbors uniformly at random. Thus, for a random walk on $G_{p}$, the probability of transitioning from node i to node j is P_ij = M_ij/∑_k M_ik. Since the random walk is Markovian, the rate at which such a message transmits information (or its entropy) is given by $H (x) = - \sum_{i} π_{i} \sum_{j} P_{i j} \log P_{i j}.$ (4) Here, π_i is the stationary distribution representing the long-time probability that the walk arrives at node i, which is given by π_i = ∑_j M_ij/2q.
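Equation (4) can be evaluated directly from an adjacency matrix. The sketch below is a minimal illustration; the function name is hypothetical, and the use of base-2 logarithms (so that the result is in bits per step) is an assumption, since the text does not fix the base.

```python
import math


def random_walk_entropy(adjacency):
    """Entropy rate of a uniform random walk on an undirected binary graph.

    adjacency: symmetric 0/1 matrix given as a list of lists.
    Assumption: logarithms are taken in base 2.
    """
    degrees = [sum(row) for row in adjacency]
    two_q = sum(degrees)  # every undirected edge is counted twice
    entropy = 0.0
    for i, row in enumerate(adjacency):
        if degrees[i] == 0:
            continue  # an isolated node is never visited by the walk
        pi_i = degrees[i] / two_q          # stationary probability of node i
        for m_ij in row:
            if m_ij:
                p_ij = m_ij / degrees[i]   # transition probability i -> j
                entropy -= pi_i * p_ij * math.log2(p_ij)
    return entropy


# A 4-node ring: every node has two neighbors, so each step is a fair coin flip.
ring = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
ring_entropy = random_walk_entropy(ring)
```

On a regular graph in which every node has degree k, the expression reduces to log k, which gives a quick sanity check for any implementation.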

Assigning clusters to nodes leads to a coarse-grained sequence $y = (y_{1}, y_{2}, \dots)$, where y_t is the cluster containing node x_t. The number of clusters n can be used to define a scale of the network’s description S = 1 − (n − 1)/p. For example, when n = p, the network is described at a fine-grained scale S = 1/p; by contrast, when n = 1, the network is described at the largest possible scale S = 1. In general, the distorted sequence y is non-Markovian. However, we can still use equation (4) to find an upper bound on its information rate. At every scale of description, it is possible to identify a clustering of nodes that minimizes this upper bound. After computing these optimal clusterings across all scales, we arrive at a rate-distortion curve R(S), which represents the minimal upper bound on the information rate as a function of the scale S. The compressibility C of the network is then given as the average reduction in R(S) across all scales (Lynn and Bassett, 2021).

$C = H (x) - \frac{1}{p} \sum_{S} R (S).$ (5)

Visually, this quantity represents the total area above the rate-distortion curve and below the entropy of the original random walk H(x) (Figure 4(b)). For a graph filtration such as in equation (1), we abuse indexing notation and define the compressibility curve as the sequence $\{C (p)\}_{p = 0}^{N}$, where p denotes the number of nodes in subgraph $G_{p}$ (Table 2).

Table 2.

Notation for compression progress theory.

Parameter	Description
$G_{p}$	Graph induced by the p visited nodes at time p
M	Adjacency matrix corresponding to graph $G_{p}$
x	A random walk sequence on graph $G_{p}$
P_ij	Probability of transitioning from node i to node j uniformly at random
H(x)	Entropy of the random walk sequence x
π_i	Stationary (long-time) probability that the random walk is at node i
q	Number of edges in the graph $G_{p}$
y	Walk sequence where node identities are replaced by cluster identities
n	Number of clusters into which nodes are separated
S	The scale at which the network is being considered
R(S)	Information rate of a clustered walk sequence at scale S
C(p)	Network compressibility at time p
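The quantities in Table 2 can be tied together with a short Python sketch. It rests on one reading of how equation (4) is applied to the clustered sequence: the bound is taken to be the conditional entropy of the next cluster given the current one, which always upper-bounds the entropy rate of a stationary sequence. The search for the optimal clustering at each scale is not shown, base-2 logarithms are assumed, and all function names are hypothetical.

```python
import math


def clustered_rate_bound(adjacency, clusters):
    """Upper bound on the information rate of the cluster sequence y.

    adjacency: symmetric 0/1 matrix as a list of lists.
    clusters:  clusters[i] is the cluster label of node i.
    Assumption: the bound is H(y_{t+1} | y_t) at stationarity, in bits.
    """
    two_q = sum(sum(row) for row in adjacency)
    # Joint probability of consecutive clusters (a, b): sum of pi_i * P_ij,
    # which equals M_ij / 2q, over nodes i in a and j in b.
    joint = {}
    for i, row in enumerate(adjacency):
        for j, m_ij in enumerate(row):
            if m_ij:
                key = (clusters[i], clusters[j])
                joint[key] = joint.get(key, 0.0) + m_ij / two_q
    marginal = {}
    for (a, _), prob in joint.items():
        marginal[a] = marginal.get(a, 0.0) + prob
    return -sum(p * math.log2(p / marginal[a]) for (a, _), p in joint.items())


def compressibility(entropy, rates):
    """Equation (5), given one minimal rate R(S) for each of the p scales."""
    return entropy - sum(rates) / len(rates)


ring = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
finest = clustered_rate_bound(ring, [0, 1, 2, 3])    # n = p, scale S = 1/p
coarsest = clustered_rate_bound(ring, [0, 0, 0, 0])  # n = 1, scale S = 1
```

At the finest scale the bound coincides with the entropy of the original walk, and at the coarsest scale it vanishes, which matches the two end points of the rate-distortion curve described in the text.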

Computing mechanical network features

Consider a set of nodes $V_{p} = \{1, \dots, p\}$ embedded in d dimensions. Each node $i \in V_{p}$ is located at a particular coordinate in space $z_{i} \in \mathbb{R}^{d}$. On its own, this system possesses dp degrees of freedom, as each node is able to move independently in space. If we connect these nodes with edges in the set $E_{p} \subseteq V_{p} \times V_{p}$, then each edge $e_{i j} \in E_{p}$ between node i and node j removes one degree of freedom along the direction of edge extension. Each edge generates a constraint that keeps the distance between the nodes constant, such that ${(z_{i} - z_{j})}^{⊤} (z_{i} - z_{j}) = \text{constant}.$

To linear order, this constraint can be rewritten by taking the total derivative of both sides and dividing by 2 to yield ${(z_{i} - z_{j})}^{⊤} (d z_{i} - d z_{j}) = 0,$ (6) where the d in d z_i denotes the differential operator rather than the embedding dimension. Intuitively, equation (6) is simply a dot product between the vector pointing from z_j to z_i and the relative motion of the two nodes. Hence, equation (6) implies that the nodes must move relative to one another in a direction perpendicular to the edge, such that the edge does not change length. If we compile all such constraints for every edge in $E_{p}$ then we obtain $q = | E_{p} |$ constraints on the node motions. If these constraints are independent, then the total number of degrees of freedom is reduced to dp − q. Among these degrees of freedom, d(d + 1)/2 are rigid body motions that do not change the distance between any pair of nodes. Hence, the number of conformational degrees of freedom is given by $DoF_{C} = d p - q - \frac{d (d + 1)}{2}.$

This line of reasoning was first put forth by Maxwell (1864).
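The role of constraint independence can be checked numerically by stacking one copy of equation (6) per edge into a matrix and computing its rank. The Python sketch below is an illustration under stated assumptions: node positions are hypothetical, nodes are indexed from 0, and the function names are not from the study. The rank counts the independent constraints, so the count dp − q − d(d + 1)/2 is exact only when the rank equals q.

```python
from fractions import Fraction


def rigidity_matrix(positions, edges):
    """One row per edge: (z_i - z_j) in node i's columns, (z_j - z_i) in node j's."""
    d = len(positions[0])
    rows = []
    for i, j in edges:
        row = [Fraction(0)] * (d * len(positions))
        for axis in range(d):
            diff = Fraction(positions[i][axis] - positions[j][axis])
            row[d * i + axis] = diff
            row[d * j + axis] = -diff
        rows.append(row)
    return rows


def matrix_rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(r) for r in rows]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Hypothetical unit square in d = 2 dimensions.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
frame = [(0, 1), (1, 2), (2, 3), (3, 0)]
braced = frame + [(0, 2), (1, 3)]

frame_rank = matrix_rank(rigidity_matrix(corners, frame))    # 4 independent constraints
braced_rank = matrix_rank(rigidity_matrix(corners, braced))  # 5, although there are 6 edges
```

For the open frame, all four constraints are independent and 2 × 4 − 4 − 3 = 1 conformational degree of freedom remains. For the fully braced square, six edges supply only five independent constraints, and the one redundant edge corresponds to a state of self-stress.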

Many important extensions of this idea exist. One important extension considers the violation of independent constraints. This violation can occur in several ways. One such way is over-constraining a network. For example, a network of p = 4 nodes embedded in d = 2 dimensions is over-constrained if we place edges between all node pairs, such that $E_{p} = \{(1,2), (1,3), (1,4), (2,3), (2,4), (3,4)\}$. Here, there are $| E_{p} | = 6$ edges, such that DoF_C = [(2 × 4) − 6] − 3 = −1. For a network to possess negative degrees of freedom, there must exist patterns of edge compressions and tensions that are load-bearing, such that there are balanced internal forces held within the edges and experienced by the nodes (Mao and Lubensky, 2018). In the conformational change theory of curiosity, we treat such states of self-stress as aversive. Practically, whenever DoF_C becomes negative, we resolve competing constraints by incrementing the embedding dimensionality by 1 (Table 3).

Table 3.

Notation for conformational change theory.

Parameter	Description
$G_{p}$	Graph induced by the p visited nodes at time p
$V_{p}$	Node set comprising the p nodes in graph $G_{p}$
$E_{p}$	Edge set comprising the q edges in graph $G_{p}$
d	Number of Euclidean dimensions in which the graph $G_{p}$ is embedded
z_i	Coordinates in space marking the location of the i-th node
DoF_C	Number of ways in which the network $G_{p}$ can change shape
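The counting rule and the dimension increment can be sketched as follows. The function names and the sequence of edge counts are hypothetical and chosen only for illustration; the sketch assumes a simple graph and a starting dimension of 1, as in the text.

```python
def conformational_dof(d, p, q):
    """DoF_C = d*p - q - d*(d + 1)/2 for p nodes and q edges in d dimensions."""
    return d * p - q - d * (d + 1) // 2


def track_flexibility(edge_counts, d=1):
    """Return (dimension, DoF_C) for each stage p = 1, 2, ... of a growing graph.

    edge_counts[p - 1] is the number of edges q once p nodes are present.
    """
    history = []
    for p, q in enumerate(edge_counts, start=1):
        if q > p * (p - 1) // 2:
            raise ValueError("more edges than a simple graph on p nodes allows")
        while conformational_dof(d, p, q) < 0:
            d += 1  # raise the embedding dimension until DoF_C is non-negative
        history.append((d, conformational_dof(d, p, q)))
    return history


# Hypothetical edge counts for a six-node filtration.
history = track_flexibility([0, 1, 3, 4, 6, 10])
```

With these counts, the third node brings the graph to three edges and forces the dimension from 1 to 2, and the sixth node brings it to ten edges and forces the dimension from 2 to 3. The guard on the edge count matters because the loop is only guaranteed to terminate for simple graphs.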

Statistical testing

We use non-parametric permutation testing to determine whether feature curves, such as those for compressibility and conformational flexibility, for empirical knowledge networks differ significantly from those for corresponding null model networks (Ramsay and Silverman, 2005). For a given feature, we first compute the area A between the average curve for the observed data and the average curve for the null model data using numerical integration. We then pool all data together and randomly re-assign each data point to either the empirical data group or the null model data group. Each group results in a pseudo-curve of values for a given feature-of-interest. We compute the area A′ between the pseudo-curves for the two groups and repeat this process for I = 1000 iterations. For the group difference between empirical and null model data, we define the p-value p_perm as the number of times A′ is greater than A divided by the number of iterations I.
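A minimal Python sketch of this permutation procedure is given below. Several details are assumptions made for illustration: each curve is treated as one data point, group sizes are kept fixed during re-assignment, the area is the trapezoidal integral of the absolute gap at unit spacing, and the random seed and function names are hypothetical.

```python
import random


def mean_curve(curves):
    """Pointwise average of equally long curves."""
    return [sum(values) / len(values) for values in zip(*curves)]


def area_between(curve_a, curve_b):
    """Trapezoidal area between two curves sampled at unit spacing."""
    gaps = [abs(a - b) for a, b in zip(curve_a, curve_b)]
    return sum((g0 + g1) / 2 for g0, g1 in zip(gaps, gaps[1:]))


def permutation_p_value(empirical, null, iterations=1000, seed=0):
    """Fraction of shuffles whose pseudo-curve area exceeds the observed area."""
    rng = random.Random(seed)
    observed = area_between(mean_curve(empirical), mean_curve(null))
    pooled = list(empirical) + list(null)
    exceed = 0
    for _ in range(iterations):
        rng.shuffle(pooled)
        pseudo_empirical = pooled[:len(empirical)]
        pseudo_null = pooled[len(empirical):]
        area = area_between(mean_curve(pseudo_empirical), mean_curve(pseudo_null))
        if area > observed:
            exceed += 1
    return exceed / iterations


# Hypothetical feature curves: six empirical curves shifted well above six null curves.
empirical = [[10 + k + t for t in range(5)] for k in range(6)]
null = [[k + t for t in range(5)] for k in range(6)]
p_perm = permutation_p_value(empirical, null)
```

With I = 1000 iterations, the smallest non-zero value this estimate can take is 0.001, so a reported p_perm < 0.001 corresponds to no shuffle exceeding the observed area.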

Citation diversity statement

Recent work in a number of scientific fields has identified a bias in citation practices such that papers by women and other minority scholars are under-cited relative to the number of such papers in the field (Bertolero et al., 2020; Caplar et al., 2017; Chatterjee and Werner, 2021; Dion et al., 2018; Dworkin et al., 2020; Fulvio et al., 2021; Maliniak et al., 2013; Mitchell et al., 2013; Wang et al., 2021). Here, we sought to proactively choose references that reflect the diversity of the field in thought, form of contribution, gender, race, ethnicity, and other factors. First, we predicted the gender of the first and last authors of each reference using databases that store the probability of a first name being carried by a woman (Dworkin et al., 2020; Zhou et al., 2020a). By this measure (and excluding self-citations to the first and last authors of our current paper), our references contain 16.27% woman(first)/woman(last), 11.80% man/woman, 19.30% woman/man, 52.63% man/man citation categorizations. This method is limited in that (a) names, pronouns, and social media profiles used to construct the databases may not, in every case, be indicative of gender identity and (b) it cannot account for intersex, non-binary, or transgender people. Second, we obtained predicted racial/ethnic category of the first and last author of each reference using databases that store the probability of a first and last name being carried by an author of color (Ambekar et al., 2009; Sood and Laohaprapanon, 2008). By this measure (and excluding self-citations), our references contain 4.67% author of color/author of color, 9.86% white author/author of color, 20.34% author of color/white author, and 65.12% white author/white author citation categorizations. 
This method is limited in that (a) names, Census entries, and Wikipedia profiles used to make the predictions may not be indicative of racial/ethnic identity, and (b) it cannot account for Indigenous and mixed-race authors, or those who may face differential biases due to the ambiguous racialization or ethnicization of their names. We look forward to future work that could help us to better understand how to support equitable practices in science.

Supplemental Material

Supplemental Material, sj-pdf-1-chc-10.1177_26339137231207633, for Curiosity as filling, compressing, and reconfiguring knowledge networks by Shubhankar P Patankar, Dale Zhou, Christopher W Lynn, Jason Z Kim, Mathieu Ouellet, Harang Ju, Perry Zurn, David M Lydon-Staley and Dani S Bassett in Collective Intelligence

Footnotes

Acknowledgements

The authors gratefully acknowledge helpful discussions with Drs. Lorenzo Caciagli, Erin G. Teich, and Kieran Murphy.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Center for Curiosity. The authors would also like to acknowledge additional support from the Army Research Office (Grafton-W911NF-16-1-0474, Falk-W911NF-18-1-0244, DCIST-W911NF-17-2-0181) and the National Institute of Mental Health (1-R21-MH-124121-01). The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agencies.

ORCID iDs

Shubhankar P Patankar

Harang Ju

Data availability statement

We will make data available upon request to the corresponding author. All code used is available at

Supplemental Material

Supplemental material for this article is available online.

References

Amancio, Wolff (2019) Information compression as a unifying principle in human learning, perception, and cognition. Complexity 2019: 1879746. DOI: 10.1155/2019/1879746.

Ambekar, Ward, Mohammed, et al. (2009) Name-ethnicity classification from open sources. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, 28 June–1 July 2009, pp. 49–58.

Aubret, Matignon, Hassas (2019) A survey on intrinsic motivation in reinforcement learning. arXiv:1908.06976.

Bennett, Bode, Brydevall, et al. (2016) Intrinsic valuation of information in decision making under uncertainty. PLoS Computational Biology 12(7): e1005020–e1005021. DOI: 10.1371/journal.pcbi.1005020.

Bertolero, Dworkin, David, et al. (2020) Racial and ethnic imbalance in neuroscience reference lists and intersections with gender. bioRxiv. DOI: 10.1101/2020.10.12.336230.

Bianconi (2021) Higher-order Networks. Cambridge: Cambridge University Press. DOI: 10.1017/9781108770996.

Botvinick, Braver (2015) Motivation and cognitive control: from behavior to neural mechanism. Annual Review of Psychology 66: 83–113.

Brydevall, Bennett, Murawski, et al. (2018) The neural encoding of information prediction errors during non-instrumental information seeking. Scientific Reports 8(1): 6134. DOI: 10.1038/s41598-018-24566-x.

Caplar, Tacchella, Birrer (2017) Quantitative evaluation of gender bias in astronomical publications from citation counts. Nature Astronomy 1(6): 0141.

Carlsson (2009) Topology and data. Bulletin of the American Mathematical Society 46: 255–308.

Chatterjee, Werner (2021) Gender disparity in citations in high-impact journal articles. JAMA Network Open 4(7): e2114509.

Chrastil, Warren (2014) From cognitive maps to cognitive graphs. PLoS One 9(11): e112544–e112548. DOI: 10.1371/journal.pone.0112544.

Chrastil, Warren (2015) Active and passive spatial learning in human navigation: acquisition of graph knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition 41(4): 1162–1178.

Christianson, Sizemore Blevins, Bassett (2020) Architecture and evolution of semantic networks in mathematics texts. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 476(2239): 20190741. DOI: 10.1098/rspa.2019.0741.

Chung, Abbott (2021) Neural population geometry: an approach for understanding biological and artificial neural networks. Current Opinion in Neurobiology 70: 137–144. DOI: 10.1016/j.conb.2021.10.010.

Clark, Vincent, Wang, et al. (2021) Smokers’ curiosity for tobacco-related trivia aids memory of tobacco-related information. PsyArXiv.

Collins AGE (2017) The cost of structure learning. Journal of Cognitive Neuroscience 29(10): 1646–1655. DOI: 10.1162/jocn_a_01128.

Constantinescu, O’Reilly, Behrens TEJ (2016) Organizing conceptual knowledge in humans with a gridlike code. Science 352(6292): 1464–1468. DOI: 10.1126/science.aaf0941.

Copeland (2019) On serendipity in science: discovery at the intersection of chance and wisdom. Synthese 196(6): 2385–2406.

Daddaoua, Lopes, Gottlieb (2016) Intrinsically motivated oculomotor exploration guided by uncertainty reduction and conditioned reinforcement in non-human primates. Scientific Reports 6(1): 20202. DOI: 10.1038/srep20202.

Dion, Sumner, Mitchell (2018) Gendered citation patterns across political science and social science methodology fields. Political Analysis 26(3): 312–327.

Dweck (1986) Motivational processes affecting learning. American Psychologist 41(10): 1040–1048.

Dworkin, Linn, Teich, et al. (2020) The extent and drivers of gender imbalance in neuroscience reference lists. bioRxiv. Retrieved from: https://www.biorxiv.org/content/early/2020/01/11/2020.01.03.894378

Ericson, Warren (2020) Probing the invariant structure of spatial knowledge: support for the cognitive graph hypothesis. Cognition 200: 104276.

Fulvio, Akinnola, Postle (2021) Gender (im)balance in citation practices in cognitive neuroscience. Journal of Cognitive Neuroscience 33(1): 3–7.

Garvert, Dolan, Behrens TEJ (2017) A map of abstract relational knowledge in the human hippocampal-entorhinal cortex. eLife 6: e17086. DOI: 10.7554/eLife.17086.

Gholizadeh, Seyeditabari, Zadrozny (2018) Topological signature of 19th century novelists: persistent homology in text mining. Big Data and Cognitive Computing 2(4). DOI: 10.3390/bdcc2040033.

Ghrist (2007) Barcodes: the persistent topology of data. Bulletin of the American Mathematical Society 45: 61–76.

Gottlieb, Oudeyer P-Y, Lopes, et al. (2013) Information-seeking, curiosity, and attention: computational and neural mechanisms. Trends in Cognitive Sciences 17(11): 585–593. DOI: 10.1016/j.tics.2013.09.001.

Hatcher (2002) Algebraic Topology. Cambridge: Cambridge University Press.

Hsee, Ruan (2016) The Pandora effect: the power and peril of curiosity. Psychological Science 27(5): 659–666.

Johnson-Laird (2010) Mental models and human reasoning. Proceedings of the National Academy of Sciences 107(43): 18243–18250.

Ju, Zhou, Blevins, et al. (2020) The network structure of scientific revolutions. arXiv:2010.08381.

Kahn, Karuza, Vettel, et al. (2018) Network constraints on learnability of probabilistic motor sequences. Nature Human Behaviour 2(12): 936–947.

Karuza, Thompson-Schill, Bassett (2016) Local patterns to global architectures: influences of network topology on human learning. Trends in Cognitive Sciences 20(8): 629–640.

Kidd, Hayden (2015) The psychology and neuroscience of curiosity. Neuron 88(3): 449–460. DOI: 10.1016/j.neuron.2015.09.010.

Kim, Lu, Strogatz, et al. (2019) Conformational control of mechanical networks. Nature Physics 15(7): 714–720. DOI: 10.1038/s41567-019-0475-y.

Kim, Lu, Blevins, et al. (2022) Nonlinear dynamics and chaos in conformational changes of mechanical metamaterials. Physical Review X 12.

Loewenstein (1994) The psychology of curiosity: a review and reinterpretation. Psychological Bulletin 116(1): 75–98.

Lydon-Staley, Falk, Bassett (2020a) Within-person variability in sensation-seeking during daily life: positive associations with alcohol use and self-defined risky behaviors. Psychology of Addictive Behaviors 34(2): 257–268.

Lydon-Staley, Zurn, Bassett (2020b) Within-person variability in curiosity during daily life and associations with well-being. Journal of Personality 88(4): 625–641. DOI: 10.1111/jopy.12515.

Lydon-Staley, Zhou, Blevins, et al. (2021) Hunters, busybodies and the knowledge network building associated with deprivation curiosity. Nature Human Behaviour 5(3): 327–336. DOI: 10.1038/s41562-020-00985-7.

Lynn, Bassett (2020) How humans learn and represent networks. Proceedings of the National Academy of Sciences 117(47): 29407–29415. DOI: 10.1073/pnas.1912328117.

Lynn, Bassett (2021) Quantifying the compressibility of complex networks. Proceedings of the National Academy of Sciences 118(32): e2023473118. DOI: 10.1073/pnas.2023473118.

Lynn, Kahn, Nyema, et al. (2020) Abstract representations of events arise from mental errors in learning and memory. Nature Communications 11(1): 1–12.

Maliniak, Powers, Walter (2013) The gender citation gap in international relations. International Organization 67(4): 889–922.

Mao, Lubensky (2018) Maxwell lattices and topological mechanics. Annual Review of Condensed Matter Physics 9(1): 413–433. DOI: 10.1146/annurev-conmatphys-033117-054235.

Maxwell (1864) L. On the calculation of the equilibrium and stiffness of frames. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 27(182): 294–299.

McAllister, Frappier, Meynell, et al. (2012) Thought Experiment and the Exercise of Imagination in Science. New York: Routledge.

Mitchell, Lange, Brus (2013) Gendered citation patterns in international relations journals. International Studies Perspectives 14(4): 485–492.

Momennejad (2020) Learning structures: predictive representations, replay, and generalization. Current Opinion in Behavioral Sciences 32: 155–166.

Okamura (2019) Interdisciplinarity revisited: evidence for research impact and dynamism. Palgrave Communications 5(1): 141. DOI: 10.1057/s41599-019-0352-4.

Park, Miller, Boorman (2021) Inferences on a multidimensional social hierarchy use a grid-like code. Nature Neuroscience 24(9): 1292–1301. DOI: 10.1101/2020.05.29.124651.

Pathak, Agrawal, Efros, et al. (2017) Curiosity-driven exploration by self-supervised prediction. Proceedings of the 34th International Conference on Machine Learning 70: 2778–2787.

Peer, Brunec, Newcombe, et al. (2021) Structuring knowledge with cognitive maps and cognitive graphs. Trends in Cognitive Sciences 25(1): 37–54. DOI: 10.1016/j.tics.2020.10.004.

Ramsay, Silverman (2005) Functional Data Analysis. London: Springer Nature.

Roets, Van Hiel (2007) Separating ability from need: clarifying the dimensional structure of the need for closure scale. Personality and Social Psychology Bulletin 33(2): 266–280. DOI: 10.1177/0146167206294744.

Savinov, Raichuk, Marinier, et al. (2018) Episodic curiosity through reachability. arXiv:1810.02274.

Schapiro, Rogers, Cordova, et al. (2013) Neural representations of events arise from temporal community structure. Nature Neuroscience 16(4): 486–492. DOI: 10.1038/nn.3331.

Schapiro, Turk-Browne, Norman, et al. (2016) Statistical learning of temporal community structure in the hippocampus. Hippocampus 26(1): 3–8. DOI: 10.1002/hipo.22523.

Schmidhuber (2008) Driven by compression progress: a simple principle explains essential aspects of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes. arXiv. Short version: J. Schmidhuber. Simple algorithmic theory of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes. Journal of SICE 48(1): 21–32.

Shiffrin, Schneider (1977) Controlled and automatic human information processing: II. Perceptual learning, automatic attending and a general theory. Psychological Review 84(2): 127–190.

Singh, Lewis, Barto, et al. (2010) Intrinsically motivated reinforcement learning: an evolutionary perspective. IEEE Transactions on Autonomous Mental Development 2(2): 70–82. DOI: 10.1109/TAMD.2010.2051031.

Sizemore, Karuza, Giusti, et al. (2018) Knowledge gaps in the early growth of semantic feature networks. Nature Human Behaviour 2(9): 682–692. DOI: 10.1038/s41562-018-0422-4.

Sizemore, Phillips-Cremins, Ghrist, et al. (2019) The importance of the whole: topological data analysis for the network neuroscientist. Network Neuroscience 3(3): 656–673. DOI: 10.1162/netn_a_00073.

Sood, Laohaprapanon (2008) Predicting race and ethnicity from the sequence of characters in a name. arXiv:1805.02109.

Stiso, Lynn, Kahn, et al. (2022) Neurophysiological evidence for cognitive map formation during sequence learning. eNeuro 9(2). DOI: 10.1523/ENEURO.0361-21.2022.

Sutton, Barto (2018) Reinforcement Learning: An Introduction. Cambridge: The MIT Press.

Tenenbaum, Kemp, Griffiths, et al. (2011) How to grow a mind: statistics, structure, and abstraction. Science 331(6022): 1279–1285. DOI: 10.1126/science.1192788.

Tompson, Kahn, Falk, et al. (2019) Individual differences in learning social and nonsocial network structures. Journal of Experimental Psychology: Learning, Memory, and Cognition 45(2): 253–271.

Tompson, Kahn, Falk, et al. (2020) Functional brain network architecture supporting the learning of social networks in humans. Neuroimage 210: 116498.

Tralie, Saul, Bar-On (2018) Ripser.py: a lean persistent homology library for Python. Journal of Open Source Software 3(29): 925. DOI: 10.21105/joss.00925.

Valadao, Anderson, Danckert (2015) Examining the influence of working memory on updating mental models. Quarterly Journal of Experimental Psychology 68(7): 1442–1456.

Wang, Dworkin, Zhou, et al. (2021) Gendered citation practices in the field of communication. Annals of the International Communication Association 45: 134–153. DOI: 10.1080/23808985.2021.1960180.

Warren (2019) Non-Euclidean navigation. Journal of Experimental Biology 222(Pt Suppl 1): jeb187971. DOI: 10.1242/jeb.187971.

Warren, Rothman, Schnapp, et al. (2017) Wormholes in virtual space: from cognitive maps to cognitive graphs. Cognition 166: 152–163. https://www.sciencedirect.com/science/article/pii/S0010027717301373. DOI: 10.1016/j.cognition.2017.05.020.

Webster, Kruglanski (1994) Individual differences in need for cognitive closure. Journal of Personality and Social Psychology 67(6): 1049–1062.

Zhou, Cornblath, Stiso, et al. (2020a) Gender diversity statement and code notebook v1.0. Zenodo. DOI: 10.5281/zenodo.3672110.

Zhou, Lydon-Staley, Zurn, et al. (2020) The growth and form of knowledge networks by kinesthetic curiosity. Current Opinion in Behavioral Sciences 35: 125–134. DOI: 10.1016/j.cobeha.2020.09.007.

Zhou, Lynn, Cui, et al. (2020b) Efficient coding in the economics of human brain connectomics. Network Neuroscience 6(1): 234–274.

Zomorodian, Carlsson (2005) Computing persistent homology. Discrete & Computational Geometry 33(2): 249–274. DOI: 10.1007/s00454-004-1146-y.

Zurn, Zhou, Lydon-Staley, et al. (2021) Edgework: viewing curiosity as fundamentally relational. PsyArXiv.
