Abstract
Introduction
The movements by national governments, funding agencies, universities, and research communities toward “open data” face many difficult challenges. As a slate of recent studies has shown, the phrase “open data” itself raises at least two central questions, namely (1) what are “data” (Borgman, 2015; Leonelli, 2015)? and (2) what is “open” (Levin et al., 2016; Pasquetto et al., 2016; Pomerantz and Peek, 2016)? Given the vagueness of these terms, individuals, research projects, communities, and organizations define “data” and “openness” in a variety of ways, often via informal norms in lieu of codified policies.
The concepts of “accountability” and “transparency” provide insight into how open data requirements and expectations are achieved in different circumstances. An individual or organization is accountable for “open data” when they are answerable for the act(s) of making data open, whatever those acts might be. Being accountable means having to justify actions and decisions to some individual or organization. Transparency, on the other hand, refers to the notion that information about an individual or organization’s actions can be seen from the outside. Both concepts feature prominently in research and policy discussions concerning the relations that governments, organizations, and other social bodies have with their constituents or communities (Leshner, 2009; Lessig, 2009; McNutt et al., 2016).
Accountability and transparency
In high-level visions of open data, researchers’ data and metadata practices are expected to be robust and structured. The integration of the internet into scientific institutions amplifies these expectations, as it provides a seemingly ubiquitous data distribution mechanism (Agre, 2002). When examined critically, however, the data and metadata practices of scholarly researchers often appear incomplete or deficient (Van Tuyl and Whitmire, 2016; Vines et al., 2014). The concept of accountability helps to explain data practices that seem, on the surface, to be insufficient. “Accountability” is a concept drawn from multiple social science traditions, including studies of governance in organizations and nations (Bovens, 1998) and studies of mundane activities in everyday life (Garfinkel, 1967; Woolgar and Neyland, 2013). It is important to remember that for most researchers, working with data is a very mundane activity. As Pink et al. (2017) note, data are intertwined with everyday routines and often entail significant improvisation, both in data generation and use. For field-based scientists, such as ecologists and archaeologists, data may literally emerge from the dirt. For laboratory and computational scientists, data generation and management are less obviously subject to worldly interference, but are nevertheless imperfect human activities (Gitelman, 2013). To be accountable for data, researchers must be able to describe, in a way sufficient for the social situation at hand, how any perceived data problems are anomalous, correctable, or in fact not problematic at all; that is, they must be “answerable” for their data. Simply being answerable for data can be called
Turning now to transparency, being transparent is often described as a public value and norm of behavior that counters corruption and enables easy access to and use of information (Ball, 2009). Diverse political drivers are increasing attention to transparency as it relates to open data (Levy and Johns, 2016). As a result, researchers are increasingly being asked or required to make their data transparent by sharing with colleagues or making data available on the web. Transparency in research is almost always selective, however (Jasanoff, 2006). Researchers may have numerous incentives to keep particular aspects of their work out of the eye of the public or their research competitors, including the fear of being scooped, or a lack of time to fully clean, process, and package data. This selective character of research openness suggests a distinction between different kinds of transparency, specifically,
Categorizing “open data”
The strength of these two concepts, accountability and transparency, emerges when they are coupled together. Enabling one does not necessarily mean enabling the other (boyd, 2016). Figure 1 presents a model of open data that couples the
Figure 1. The relationship between accountability and transparency in different open data scenarios.
The top row in the model illustrates how the possibility of sanctions (
The middle row depicts how open data might manifest under a
The bottom row in the model depicts scenarios in which no accountabilities for open data exist, or scenarios in which accountabilities related to data are so diffused in the context of highly distributed scientific activity as to be effectively absent (Leonelli, 2016). The distinction between
Conclusion
This model illustrates a few key insights. First, good data management can happen even without sanctions. Many research communities have developed robust data repositories and other institutional support for data archiving without formal requirements from research funders or journals. These efforts find ways to integrate community norms and routines, data and metadata standards for archiving and interchange, professional data management roles, and technical infrastructures (Mayernik, 2016). Second, the transparency concept clearly implies accessibility—if something is not accessible, it cannot be transparent—but providing access does not itself make something transparent. Achieving
Being a “competent researcher” has always involved the ability to generate data that meet the standards of evidence in a given domain (Borgman, 2015). In most situations involving daily research tasks (e.g. data collection and documentation, and writing publications and reports), researchers’ daily data practices do not have to be perfect; they just have to be explainable. The integration of the internet into research institutions has changed the kinds of accountabilities that apply to research data, and enabled new kinds of transparency. Achieving “openness” requires the navigation of these context-specific accountabilities and transparencies.
