Abstract
I Introduction
Historical research in geography is changing. Over the past thirty years, digital technologies have significantly altered the way that historical sources are stored and retrieved. The term ‘digital archives’ captures both this move of sources online and the wider ecosystem of digital tools and platforms that scholars use to find and analyse archive materials (see Owens and Padilla, 2021). The ability, within seconds, to locate exact words or phrases across inconceivably vast databases of text is now a routine part of our daily practice.
The creation of digital archives raises distinct issues of how best to preserve and disseminate historical material, how to manage notoriously unstable file formats, and how to approach intellectual property rights. These are important questions, but they are also predominantly technical ones. It is easy to mistake digital archives for tools of convenience: that they simply speed up the kind of work
In this paper, we argue that digital archives not only bring new ways of doing historical geography; they engender a deep and philosophical shift in our relationship to the past itself. Following the work of the cultural theorist Steve Anderson, we argue that digital archives are giving rise to new recombinant historical geographies. Traditional research – once highly ordered by place and archival arrangement – is being supplanted by the rise of digital, text-searchable databases that offer a world of ‘infinitely retrievable fragments’ (Anderson, 2014: 101). These fragments, alongside the platforms that host them, and the algorithms we use to find them, are reorienting our research around practices of
This is not the same work
The geographical promise of digitisation is to unbind archives from place, giving us the ability to view sources from anywhere. But digital space is not synonymous with anywhere. Instead, a different set of geographical configurations emerges over what gets digitised, where it is viewed, and who has access. Online, archives often exist in a very particular space: that of the proprietary research platform. These platforms are presented to us as benign digital replications of the physical world in a way that profoundly underplays their determinative role in our research. Although they appear to overcome the hierarchy of the archive, in practice they substitute one form of ordering for another. In so doing, digitisation makes the historical and geographical claims of archives harder to see, even as it makes information more widely accessible.
By bringing work from history and archive studies into dialogue with geographical scholarship, this paper develops its argument in three parts. In the first part, we conceptualise recombination. We show how recombination differs from the principles of original order and
II Conceptualising recombination
Until recently, virtually all historical work in geography was done with physical sources in a so-called ‘analogue’ archive. Since the mid-nineteenth century, archives have been organised by a basic principle: records which come from different creators or origins
The sanctity of this principle is enshrined in the concept of
Yet in the digital world, creatorship is no longer the primary access point or organising principle. Digital archives exist as databases of information and, as Jefferson Bailey (2013: np) argues, in ‘a database, objects are related but not ordered’ – certainly not in the ways of traditional archives. Bailey argues that ‘nonlinear retrieval supplants the narrative logic of respect des fonds with a broader notion of context and discoverability’ (Bailey, 2013: np). Navigation is not predetermined by a creator, an archivist, or a finding aid, but arrangement is dynamic and dependent upon our search terms. Materials are open to new connections and
If you have used a digital archive – or platforms like Google Books or JSTOR – you will be familiar with the search box that invites us to enter our ‘key terms’. It has the look and feel of a finding aid that we might use to identify a call number in a physical archive. But, as Ted Underwood (2014) argues, the underlying technology and philosophical principles are vastly different. The search bar does not help us navigate the arrangement of the archive; it allows us to circumvent it. Digital platforms privilege searching over browsing, and that searching has more in common with data mining than document retrieval. The more precise the phrasing, the more efficiently digital search can personalise our results. This is part of the nature of computer search or ‘information retrieval’ – it is very effective at identifying exact terms and can do so across millions of data points in seconds.
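The difference between navigating an arrangement and searching across it can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of how any actual platform works: the collection names, the sample records and the functions `build_index` and `search` are all hypothetical, and commercial relevance ranking is far more elaborate and usually undisclosed.

```python
from collections import defaultdict

# Hypothetical records: (collection, position in original order, text).
RECORDS = [
    ("Port Authority papers", 1, "Minutes on harbour dues and dock labour"),
    ("Port Authority papers", 2, "Letter concerning the cholera outbreak"),
    ("Mission Society letters", 1, "Report of cholera among the congregation"),
    ("Evening Gazette", 1, "Cholera feared at the docks"),
]


def build_index(records):
    """Map each lower-cased word to the records that contain it."""
    index = defaultdict(list)
    for record in records:
        for word in set(record[2].lower().split()):
            index[word].append(record)
    return index


def search(index, term):
    """Return matching texts only.

    Collection and position are deliberately dropped, mimicking results
    that arrive detached from provenance and original order.
    """
    return [text for _collection, _position, text in index.get(term.lower(), [])]


INDEX = build_index(RECORDS)
hits = search(INDEX, "cholera")
```

In this sketch a single exact term pulls fragments from three separate collections into one list, while a near miss (searching `dock` when the record says `docks`) silently returns less than a reader browsing the collection would have seen.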
When we receive our search results, fragments of historical information are recombined from multiple collections and places with little fidelity to original order or provenance. As Sassoon (cited in Sternfeld, 2011: 565) notes, digital archives return results as ‘a databank of orphans which have been removed from their transactional origins and evidence of authorial intent’. Digital archives, then, offer us an unprecedented means to quickly find exactly
The ability to extract and recombine historical information based on its proximity to our search terms brings a clear confirmation bias. Digital search is not akin to a finding aid, but to an experiment – and, as Underwood (2014: 65) reminds us, ‘there’s something a bit dubious about experiments that get repeated until they produce a desired result’. And how representative are those results? How well can we understand them when they are decontextualised from their historical site of meaning? Results sorted by relevance filter out historical ideas that might contradict the assumptions underpinning our search terms. Take the example of newspapers. They constitute the largest bulk of digitised material, with multiple national and local titles and therefore thousands of potential data points for digital search to ‘hit’ (Gooding, 2016). Against that volume of material, platforms show the information you need and little more. Articles appear separate from information that offered credibility and context to historical readers of the newspaper: mastheads, page layout, even other articles in the same issue. Searches also return multiple versions of syndicated stories, comparatively small in their own day but now given artificial prominence. This is the version of research ‘efficiency’ and ‘empowerment’ that digital platforms promise, and it is to these platforms that we now turn.
III Platforms and recombination
Digital archives have emerged in the context of an increasingly competitive marketplace of digital platforms and software providers. These companies, in partnership with leading public and private institutions, offer high-quality research platforms alongside support for preserving and cataloguing archive collections. But we should be in no doubt: they are commercial enterprises. They include
Aggregation and recombination are central to the business model of these platforms and their mode of value capture. As the opening quote from
The term digital archive is a misnomer, then. Research platforms are rarely archives in any actual sense. Rather, they act as intermediaries that monetise the digital reproduction and exchange of materials whose original versions continue to exist in a pre-existing location, copyright status and arrangement. The platform is an on-screen interface that allows researchers to view archive materials but, as Lizzie Richardson (2020: 460) notes, it is also a ‘flexible spatial arrangement’. It reorganises archives through ‘novel technologies of coordination’ that rearrange sources already in existence elsewhere. Value (or investment return) is generated by creating increasingly advanced coordinating activities and elaborate recombinative effects. Disparate collections are brought together; search is enhanced to read further and deeper; sources are repeatedly recombined into new thematic collections based on evolving trends.
For commercial publishers, recombination is crucial to securing licensing agreements and justifying charges to access materials that are ultimately owned by someone else or are outside copyright. For example,
As elsewhere in the digital economy, commercial logics are not unchallenged. For public providers, recombination matters because it offers the opportunity to unlock collections, widen access and increase usage. In this respect California, the birthplace of the platform, offers a compelling example. The California Digital Library, part of the University of California, was founded in 1997 and today it constitutes one of the world’s largest digital research archives – accessible through its purpose-built platform,
Power in the digital archive appears to move from the archivist to the researcher. After all, online we arrive at materials not through the arrangement of the archive, but through tools that prioritise
We need to situate digital ‘discovery tools’ within the context of critical historical geographies of archival research that interrogate practices of selection and ordering. For example, scholars rarely know (or can know) the relevance metrics that an algorithm is using to organise and display their results (Underwood, 2014). If archiving not only represented the world but participated in it, then today algorithms increasingly shape our understandings of the past. This point is more widely appreciated. Kitchin and Dodge (2014: 44), for example, note that ‘software needs to be understood as an actant in the world; it augments, supplements, mediates and regulates our lives… Software transforms and reconfigures the world in relation to its own systems of thought’. This presents a clear research challenge. As Louise Amoore (2020: 20) writes: ‘To attend to algorithms as generating active, partial ways of organizing worlds is to substantially challenge notions of their neutral, impartial objectivity’.
Although digital platforms often feel neutral and comprehensive, they are no less selective and determinative of our research findings. As Sternfeld (2011: 557) notes, in its representational form, the platform
IV Platforms and aggregation
Digitisation promotes forms of aggregation. Source material which existed in separate archives and places is bundled together on digital platforms. ProQuest (n.d.a) notes how its ‘vast content sets’ span centuries of newspapers and primary sources.
The drive to aggregation reveals how digitisation is simultaneously a business model, a political ideology and a distinct perspective on understanding archives. As with other areas of the platform economy, like Google for search, the goal of totality and market dominance is inherent to the commercial strategy of digital platforms, driven by the stark imbalance between high start-up cost and low marginal cost. This economic logic also shapes public providers who have their own need to demonstrate value for money. In the case of digital archives, platforms frequently pitch that the virtually limitless capacity and global reach of the internet age offers the possibility that we might search across the world’s knowledge, near perfectly preserved. But this is to present a political point merely as a technical one. The idea of the ‘total archive’, which has been reactivated in our digital present, has a longer, discernibly analogue history and geography.
Here, several accounts turn to the historical example of the Mundaneum. Planned by Belgian internationalists Paul Otlet and Henri La Fontaine in the late nineteenth century, the Mundaneum was to gather together the world’s knowledge, and organise it under a standard system of decimal classification in the Brussels-based Palais Mondial. Otlet’s vision might seem a historical eccentricity, were it not for the fact that in 2012 Google, responding to regulatory and cultural concerns about its dominance of online search, announced a partnership with the Mundaneum Archive Center, now in Mons. Its innovative
That same tension continues to play out on digital platforms. Supporters and the companies themselves are keen to pitch digital archives as placeless, neutral tools for scholarship. It is Google’s self-proclaimed mission ‘to organise the world’s information and make it universally accessible’ (Google, n.d.). However, critics see those same platforms as the enclosure and privatisation of knowledge. In short, digital archives cannot be separated from wider political struggles over the control and ownership of cultural memory. There is a dynamic set of tensions between international capital and sovereignty in this digital landscape, most vividly expressed in the ambition of aggregation. National claims to archives collide with commercial interests that ostensibly drive towards internationalist agendas, even while having to defend themselves in specific legal jurisdictions. For example, when challenged by American authors over its digitisation of books, Google appealed to US fair use copyright law (Liptak and Alter, 2016).
Thylstrup’s important account lays bare how digital assemblages are remaking political, cultural, economic and historical geographies of knowledge in profound and far-reaching ways. New technical and economic infrastructures are ‘governed less by the hierarchical world of curators, historians, and politicians, and more by feedback networks of tech companies, users, and algorithms’ (Thylstrup, 2018: 51). And there is not simply an economics to digital archives, there is a geopolitics too: ‘The
V Ethics and recombination
1 Remote access
As we have argued, recombination is an inherently geographical process; the act of digitisation is also always an act of dislocation. And if we consider the location and display of records to be central to the task of understanding the work that they historically performed, how do geographers remain attentive to the impress of setting when accessing materials remotely (see Griffiths and Baker, 2020)? We can see that the locational geography of archives matters in the case of disputed collections (Lowry, 2017; Shepard, 2015) or those themes, such as internationalism (Hodder et al., 2021) or race (Hyacinth, 2019), marginalised by state-centric recording practices. The promise of digitisation is precisely its ability to work against the locational geography of archives by pulling together disparate collections and thereby removing the time and cost involved in research travel. Remote access is therefore central to the version of ‘efficiency’ that platforms promise, and to the task of opening up collections to wider audiences. These benefits are clearly significant, but there is nothing inherently egalitarian about digital platforms that make it possible for
Putting to one side subscription fees and technology costs, remote access raises a larger ethical issue. How might we manage our vastly increased opportunities to write about people and places we never have to visit? Putnam argues that with these ‘research efficiencies’ we also risk losing the unintended, but important, experiential learning of fieldwork. How much do we really know about the fragments that surface in our search results? Digital platforms require ‘almost no prior contextual knowledge: that’s what happens when you piggyback on commercial technology honed to connect people to purchases as easily as possible’ (Putnam, 2016a: 399).
Without travel, how might we ethically stress-test our archival research? As geographers have highlighted (e.g. Haines, 2019), the archive is a space in which documents are read, but so too is the researcher. It is the space where we are forced to confront our positionality and our relationship to the material. By contrast, digital archives invite us to confront pasts – sometimes difficult and uncomfortable ones – from the safety of our world. We risk becoming insulated from the people and places we claim to know and write about. In their study of the digitisation of community archives in Scotland’s Western Isles, Beel et al. (2015: 203) note that traditional archives existed ‘like “silos” of local knowledge whereby you have to be in-place to add to them or view them’. In that sense, geography was an obstacle. Now the idea of the geographer being
Digital archives have largely been welcomed with a benign sense of enthusiasm, and we fully recognise their convenience and their capacity to broaden our research. The novelty of this new digital infrastructure, however, does raise a series of ethical questions. How can we maintain ethical standards of anonymity in a world of full-text search or guarantee the right to be forgotten (Allen, 2017; Crossen-White, 2015; Mkadmi, 2021)? What are the effects of making wholesale historical collections about marginalised groups available online? What kind of ethic of care is there to help users navigate sensitive materials? What does it mean to enable records to be broken apart, decontextualised and recombined? And what happens when we do not have to confront the racial and colonial power structures that fixed records in their current locations? These questions invite no clear-cut answers, and we do not wish to defend a narrow conception of expertise, but by considering them we can certainly better fit our training to meet the new ethical challenges and opportunities presented by digital technologies.
2 Absence and erasure
If one has any training in archive methods, it has likely highlighted how acts of erasure or silencing have shaped the production and management of records (see Mills, 2013; McGeachan, 2018). One of the great hopes of digitisation is that recombination might herald new, emancipatory historical geographies. Digital discovery tools offer new ways of navigating the older infrastructures and biases of collections previously separated by place, time and context. The sub-text of this is that aggregation (combined with powerful search tools) offers the possibility of scaling-up absence into presence; digital archives can be used to resurrect ‘forgotten’ individuals. However, Caroline Bressey repeats the warning of feminist historians that digitisation ‘has not transformed the nature of the sources we are searching’ (Hunter, 2017: 210, cited in Bressey, 2020). Here alternative methods might merit attention, such as Saidiya Hartman’s (2008: 11) notion of ‘critical fabulation’. Hartman’s creative response to absences in the archive on trans-Atlantic slavery was to blend archive fragments with fictional narratives. Such work marks one response to the limitations of using archival sources as evidence. For our purposes, mass digitisation might appear to represent another; but by itself it cannot overcome the silencing intrinsic to historical recording practices – it simply scales it up.
Importantly, digitisation can compound forms of absence and erasure (Hodder, 2017). Platforms, and the commercial logics that power them, generate results, however tangential. The effect, as Brian Maidment (2012: 112) notes, is that ‘Any sense of what might be absent recedes under the press of what is so obviously and overwhelmingly present’. As more material becomes available online, the records left behind seem to become more hidden, less important. Leary (2005: 82) argues that soon analogue materials may ‘simply cease to exist’ to anyone but the most dedicated of specialists. He coins the term the ‘offline penumbra’ to refer to ‘that increasingly remote and unvisited shadowland into which even quite important texts fall if they cannot yet be explored [online]’ (Leary, 2005: 82). For Leary, the offline penumbra is one half of a new ‘digital divide’ that is fundamentally transforming historical work.
Digitisation has facilitated a wider shift in how we understand archives, then. In a world of disparate analogue archives, knowledge had tended to be imagined as scarce and therefore fiercely guarded, and the labour to retrieve it arduous. Conversely, digital platforms present a world of limitless information in which knowledge is being relentlessly expanded. This shift has undoubtedly made archives more appealing to those who would not traditionally have used them, reaching beyond the sub-discipline of historical geography. This is to be welcomed. However, to develop the ideas of Anderson (2014), the illusion of comprehensiveness obscures how digitisation is a spatially uneven and unrepresentative process. The rationale for choosing which collections are digitised differs between commercial and public providers. But in both cases, it usually includes some combination of ease of digitisation, perceived popularity and copyright issues. Beyond those points, however, we know that whole institutions and collections remain offline.
In practice, digitisation has a geography. Its origins in elite, well-funded institutions in the Global North still indelibly mark the boundaries of our scholarship. For some areas of enquiry, this has enabled a transformative expansion. It is not insignificant that Putnam (2016a) notes the enabling effects of digitisation on the development of transnational history. Different regional platform configurations will continue to determine the geographical parameters of our work. As is hinted at by its very name,
VI Conclusion
In 2002, when the RGS-IBG embarked on the first step of ‘unlocking its collections’ by digitising its archive catalogue, Charles Withers (2002: 309) wrote that ‘It is too soon to say whether the WWW will act simply as a means of recall from a global archive, or if it marks the beginning of “a new inventive relationship to knowledge, a relationship that is dissolving the hierarchy associated with the archive”’ (Caygill, 1999, cited in Withers, 2002: 309). Twenty years on, as hundreds of thousands of items from the RGS-IBG archive have been digitised, we can see more conclusively that digitisation is fundamentally changing historical geographical research.
This paper has considered the impacts of digitisation, drawing on discussions underway in cognate disciplines such as history and archive studies. We have argued that digital archives and their associated technologies are fundamentally changing our relationship to the geographical past – from one heavily structured by place and archival arrangement, toward one shaped by processes of aggregation, fragmentation and recombination. In doing so, our aim is not to reify the pre-digital age as the benchmark of ideal practice. Digitisation has many positive aspects, including widening access, preserving collections and promoting novel, interdisciplinary scholarship. Instead, digitisation invites us critically to consider how the material qualities of paper documents have an increasingly undue influence on how we conceptualise historical research in geography. Our methods teaching must go further to interrogate not simply the experience of working in archives, but the changing political and economic infrastructures of digitisation. As we have argued above, digital platforms make the user think they are in control, that it is their effort that returns new connections. But it is the platform that sorts, reorders, tabulates, extracts and recombines results in ways that are profoundly determinative of our research outcomes.
As with analogue archives, working with digital platforms demands a consciously antagonistic approach and a clear theorisation of underlying technology (Beckingham and Hodder, 2023). At present, we are using digital platforms as tools of convenience without critically examining their role in determining the conclusions we draw. As Ted Underwood (2014: 69) argues, ‘Researchers can never afford to treat algorithms as black boxes that generate mysterious authority. If we’re going to use algorithms in our research, we have to crack them open and find out how they work’. Geographers do not need to be coders, but we do need to ask: what relevance metrics is an algorithm using to organise our results? What are the basic assumptions that underpin them? What kind of information is likely to be included, prioritised or lost through that process? And how can we better report those parameters in our writing? The rise of recombinant historical geographies is not inherently worrisome, but it does demand that we take a different set of critical questions into the archive.
Digitisation prompts us to consider how we cite material. Advocates for greater source transparency – influenced by the open science movement – have called for researchers to publish accompanying datasets. As we come to rely on digital archives, which are potentially accessible to our readers, could such demands be made of historical work? For example, Cope (2018) raises the prospect of providing hyperlinks to each source in full from our publications (also see Elman and Kapiszewski, 2017). But digitisation invites a far broader discussion about what it means to have ‘used’ a source (Leary, 2005). We can no longer assume that an archival footnote is evidence that the contextual education of fieldwork has been gained. Nor can we assume that a source is representative of the wider collection or context in which it sits. In a world of recombination, full transparency would require us to explain how we arrive at sources, not merely link to what they say.
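One way to picture what reporting ‘how we arrive at sources’ might involve is a simple search record kept alongside each citation. The Python sketch below is purely illustrative and rests on assumptions: the field names, the `SearchRecord` class, the `methods_note` helper and the example platform are hypothetical, not an established citation standard or any platform’s export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchRecord:
    """Hypothetical record of how a digitised source was reached."""
    platform: str          # where the search was run
    query: str             # exact search string entered
    filters: tuple         # date ranges, titles, collections, etc.
    sort_order: str        # e.g. 'relevance' or 'date'
    date_searched: str     # results can change as platforms are updated
    results_returned: int  # size of the result set
    results_read: int      # how many results were actually consulted


def methods_note(record):
    """Render the record as a sentence suitable for a footnote or appendix."""
    filters = ", ".join(record.filters) if record.filters else "none"
    return (
        f"Searched {record.platform} on {record.date_searched} "
        f"for '{record.query}' (filters: {filters}; "
        f"sorted by {record.sort_order}); "
        f"{record.results_returned} results returned, "
        f"{record.results_read} read."
    )


example = SearchRecord(
    platform="Example Newspaper Platform",  # hypothetical platform name
    query="cholera docks",
    filters=("1848-1850",),
    sort_order="relevance",
    date_searched="2024-01-15",
    results_returned=412,
    results_read=30,
)
note = methods_note(example)
```

Even a record this minimal exposes the gap between what a search returned and what was read, and fixes the date of a search whose results may not be reproducible later.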
We probably also need a better way to fully account for the hidden benefits of offline research – or, the ‘unsheddable contextualization that makes work with analog sources so inefficient’ (Putnam, 2016a: 393). We need to value those practices that embed our scholarship, as much as those that speed it up. With hindsight, we can see that the need to publish more, faster, laid the groundwork for the expansion and adoption of digital archives. Platforms cater to a demand, identified by Lorimer (2010: 254) over a decade ago, that researchers be ‘directed along the shortest, quickest and easiest search routes likely to lead to the desired archival object, or anticipated “find”’. In this way, platforms are a response to the demands we have made as researchers, as a discipline and as a scholarly industry.
By opening this discussion, we suggest that historical geographers not only have much to reflect on with respect to their own practice but can also contribute to disciplinary discussions on new digital geographies (e.g. Ash et al., 2018a, 2018b; Kinsley, 2014; Offer, 2013; Pickrell, 2018) and histories (Dougherty and Nawrotzki, 2013; Owens, 2018; Weller, 2012). The technologies mobilised in digital archives are part of a set of applications and platforms whose everyday use has transformed geographical relations, ‘altering space, time, memory, and collective knowledge’, shaping what pasts become available, when and to whom (Elwood and Mitchell, 2015: 147; Elwood, 2021).
Ultimately, whether we access sources in person or online, we need a new model of historical research in geography. A model that rewards geographers for taking the time to learn about the richness of what was going on in particular periods and places, rather than one that disproportionately rewards us for recombining fragments that surface in search results. That is the intellectual challenge ahead of us but, as we have argued in this paper, it is an ethical challenge too.
