Abstract
Introduction
Agency is deeply connected to the distribution of knowledge and power. If we understand agency as ‘the longer processes of action based on reflection, giving an account of what one has done, even more basically, making sense of the world
The open data movement is a particularly interesting case because it intersects with two ongoing transformations of knowledge and power that seem to contradict each other in terms of agency: datafication and the proliferation of hacking or open source culture. On the one hand, the practices and imaginaries of open data activists are centered on the distribution and use of data and thus linked to datafication, the ubiquitous quantification of social life (Mayer-Schönberger and Cukier, 2013: 78), for which Big Data is the most prominent expression. Big Data ‘reframes key questions about the constitution of knowledge’ (Boyd and Crawford, 2012: 665) and raises concerns about the agency of publics. As Couldry and Powell (2014: 4) note, Big Data technologies and the growing relevance of algorithms may disconnect ‘system and experience’ because the traces of data people leave behind are often left unconsciously and are not meaningful to them, and the insights generated by companies or governments are not, or only partially, ‘folded back into the experience of everyday life’. The comprehensive surveillance of online activities made possible by Big Data technologies thus might impede our potential to act in an agentic manner. On the other hand, open data activists apply practices and values from open source culture to the creation and use of data. This links them to other initiatives rooted in open source culture, like Open Access, Wikipedia, Wikileaks, Anonymous or Creative Commons (Beyer, 2014a, 2014b; Coleman, 2014; Sauter, 2014). Like datafication, these phenomena raise fundamental questions about ‘the nature of knowledge and expertise, how information is organized and evaluated, and who decides’ (Lievrouw, 2011: 26). Unlike datafication, however, open source culture is associated with a transparent and collaborative form of governance that might support agency.
As Raymond (2001) famously pointed out when he contrasted the ‘bazaar model’ of open source with the ‘cathedral approach’, open source culture is fundamentally concerned with the rights to access and distribute knowledge. Open source is based on voluntary participation (Weber, 2004: 62) and collaboration, granting access to the source code of software and incorporating contributions from potentially everyone. The implications of transferring the ‘open source process’ (Weber, 2004: 16) and the values inherent in this process to new domains with different ways of organizing knowledge ‘reach directly into the heart of the legitimacy, certainty, reliability and especially the finality and temporality of the knowledge and infrastructures we collectively create’ (Kelty, 2008: 6–7).
An analysis of the open data movement offers a unique opportunity to
Tracing the influence of open source culture on open data activists
It is generally acknowledged that activism around open data is rooted in hacking culture (cf. Bates, 2012; Davies, 2010; Johnson, 2014), or more specifically open source culture as one of the most prominent genres of hacking (Coleman and Golub, 2008). However, while this connection is frequently pointed out, it is rarely examined in more detail. Authors usually refer to a set of broad ethical commitments taken directly from traditional hacker culture. These ethical commitments have been famously described by Levy (1984) as follows: access to computer technology and information should be free, centralized forms of power are rejected in favor of decentralization, hackers adhere to a meritocratic culture of technological excellence in which a hacker should be judged only by his or her code, and computers are believed to be able to create a ‘better world’. While these principles are indeed relevant, we run the risk of oversimplifying the relationship between open data activism and open source culture if we rely solely on them. As Coleman (2013: 17) points out, the frequent reference to Levy’s account is problematic because it ‘whitewashes’ the diversity among hackers. While hackers share some technical and ethical commitments (for which Levy’s description is still useful), hacker culture should not be treated as a ‘singular code formulated by some homogeneous group called hackers but instead as a composite of distinct yet connected moral genres’ (Coleman, 2013: 19).
To develop a more nuanced and differentiated picture of how open data activists draw from open source culture, we can turn to research on its broader cultural significance and influence beyond software development. Particularly helpful here is an approach developed by Kelty (2008). While most attempts to grasp this phenomenon are primarily interested in making generalizations—for example by asking whether diverse initiatives rooted in open source culture are forming a coherent movement with a political project (cf. Beyer, 2014b; Clement and Hurrell, 2008; Kapczynski and Krikorian, 2010) or how the organizational features of open source software development can be generalized and applied to new domains (cf. Demil and Lecocq, 2006; Matei and Irimia, 2014; Weber, 2004)—Kelty developed a model that can be used to trace the influence of open source culture for specific cases. In his study of the cultural significance of free software,
Kelty suggests that open source advocates associate with each other not just through a set of ethical commitments, but through a range of key practices and social imaginaries (Taylor, 2004). He understands open source as an experimental system made up of five key practices or ‘components’: sharing source code, defining openness, writing copyright licenses, coordinating collaborations, and forming a movement. Understood in this way, open source becomes ‘a system of thresholds, not of classification’ (Kelty, 2008: 16): ‘Within each component are a range of differences in practice, from conventional to experimental. At the center, so to speak, are the most common and accepted versions of a practice; at the edges are more unusual or controversial versions’ (Kelty, 2008: 15).
Due to their flexibility, these components are not exclusive to the development of software: each of these practices can be adapted or ‘modulated’ to apply them to other domains. Therefore, Kelty (2008: 246) calls initiatives like Wikipedia or Creative Commons
To trace the influence of open source culture, Kelty (2008: 278) suggests treating its key practices as a template that interacts with other forms of knowledge management: ‘Where the practices match, no change occurs, and where they don’t, it is the reorientation of knowledge and power’. Therefore, the proliferation of open source culture can be described as the proliferation and modulation of its key practices in order to alter the means of knowledge production and circulation. Tracing the influence of open source culture on open data activists then comes down to a set of specific questions: Which practices are modulated? How are they modulated? How does this change the domain to which they are applied? Answering these questions will help us to grasp how activists try to apply the more transparent and collaborative forms of governance associated with open source to politics, and how this might support the agency of datafied publics.
Practices and imaginaries of open data activists
The following analysis is based on 10 semi-structured interviews with members of the OKF DE core team (including the chairman and founder, main developers, committee members and project managers) and a content analysis of nine relevant documents selected through theoretical sampling, for example self-descriptions from the official homepage. The data was collected in three rounds between September 2012 and January 2013 and analyzed using a grounded theory approach (Glaser and Strauss, 1967). I will structure the presentation of the findings in a way that shows how one modulation of open source culture leads activists to other, subsequent adaptations and interpretations.
By regarding data as a prerequisite for generating knowledge, activists transform the sharing of source code to include the sharing of raw data. Sharing raw data would allow others to make their own interpretation of it and generate their own knowledge, which represents a ‘democratization of information’ for activists. Seeing information as a necessary precondition for political participation, activists connect this idea to an open and flexible form of representative democracy by applying the open source model of participation (the ‘bazaar model’) to political participation, which should lead to more participation of citizens in political decision-making processes and more active and engaged local communities. A third set of practices refers to activists’ acknowledgment that raw data needs to be ‘refined’ to create knowledge for citizens, which is why they seek to create, and become, ‘data intermediaries’ for the public. This leads them to a special interest in journalism.
In the following, I explain each of these modulations and their implications in more detail.
Raw data as source code
The overall mission of the
Implicit here is that activists do not simply modulate the practice of sharing source code by replacing code with data. They also adapt the metaphors and concepts behind this practice. To execute human-readable source code on a computer, it has to be translated into binary instructions that are only readable by machines. These binary instructions cannot be retranslated back into the source code from which they have been generated. Having only the binary code without the source code (which is the case for most proprietary software) means that it is not possible to understand or modify the ‘inner workings’ of the software. Similarly, open data activists treat raw data as source code and interpretations—or knowledge—as binary code. As one activist explains, raw data ‘is not really neutral’ but it allows more interpretation than a ‘summary or a press conference’ (Interview: Developer 1).
That is because summaries are
It is interesting to contrast the notion of ‘raw data’ developed by activists with the way the term is used in discussions about Big Data. ‘Raw data is an oxymoron’ (Gitelman, 2013) is one of the most common critiques of Big Data advocates’ belief in ‘objective quantification’ (Van Dijck, 2014: 198) or Big Data’s ‘aura of truth, objectivity, and accuracy’ (Boyd and Crawford, 2012: 664). In their critique, authors point out that data is always prefigured through gathering mechanisms (Van Dijck and Poell, 2013: 10) and that collected data has to be interpreted to make it meaningful and actionable, a process that is guided by specific interests and rationalities and cannot be considered objective. Essentially, this questions whether something like ‘raw’ data actually exists when we understand it as something ‘pure’ beyond human influence. However, members of the OKF DE adopt a different understanding of ‘raw’ data. For them, ‘raw’ simply means ‘as collected’. Accordingly, sharing data in ‘raw’ form, that is, ‘as collected’, is not about revealing an unbiased and objective truth, but about making the biases of this data
Using this understanding as a basis, members of the OKF DE are also concerned with the conditions that must be met to ensure this type of transparency, i.e. with the way raw data has to be provided to fulfill their vision of a democratization of information. This leads them to another modulation of open source practices: defining openness. More specifically, they define both the legal and technical characteristics of openness in relation to data in order to delineate open data from ‘closed’ data. For legal openness, the OKF developed the international Open Definition (Open Knowledge, n.d. b), according to which data is ‘open’ when it can be accessed, modified and shared by anyone for any purpose without restrictions. Technical openness is about ensuring that these rights can be exercised without too much effort. Key here are the principles developed by the Sunlight Foundation (2010) and the rating system developed by Tim Berners-Lee (2010). According to these guidelines, datasets should be complete, released in a timely fashion, accessible, machine readable, and available in open formats. While activists acknowledge that personal data and data crucial to security should not be made available in this way, they suggest that these legal and technical conditions are necessary to effectively break the interpretative monopoly of governments.
Given the importance of knowledge for agency, this type of transparency has the potential to support the agency of datafied publics. As activists acknowledge themselves, however, the mere provision of raw data is insufficient and only represents ‘the first step’ (Interview: Chairman & Founder). As I will explain in the following sections, this provision should go along with more continuous and flexible forms of participation and ‘data intermediaries’ that make raw data accessible to the public.
Data and democracy
The democratization of information described above is not regarded as an end in itself by activists. Ultimately, this form of transparency is taken as a means through which ‘the people should be considered again as the sovereign’ (Interview: Project Manager 1). Even though they do not explicitly talk about agency themselves, activists’ articulations of their broader aims are interesting for understanding
This means that creating more possibilities for citizens to participate in political decision-making processes is a major goal for members of the OKF DE: ‘to participate, people need information’ (Interview: Project Manager 2). In this respect, they regard themselves as part of an ‘Internet generation’ that is not content with periodic voting: ‘[I want] a higher degree of participation … a more continuous form of participation’ (Interview: Committee member). This does not, however, necessarily translate into a demand for more direct democracy. Instead, the open source model of participation is taken as a paradigm: ‘What is powerful about open source development is that people can elect themselves as participants. I mean people can find my project and then decide for themselves to participate in its development and contribute to it. I think this model of self-selective participation is extremely powerful and I believe it can be applied to politics’ (Interview: Developer 1).
This means that everybody who wants to participate in the decision-making process of a particular issue should have the opportunity to do so in a meaningful way. Here, activists explicitly modulate another practice from open source culture: coordinating collaborations, the organization of open source projects (Kelty, 2008, ch. 7). As mentioned above, this organizational model has been described as the ‘bazaar model’ (Raymond, 2001) because it encourages and incorporates contributions from potentially everyone. Just as there is not one standard model for coordinating collaborations in open source—larger and more well-known projects like the Linux kernel, the Apache servers, or the Debian project have all developed distinct organizational models over time (Coleman, 2013; Kelty, 2008; Weber, 2004)—activists reject clearly prescribing a specific model of participation. For them, applying the bazaar model of open source to governance is first and foremost about experimentation. There will not be ‘this one solution that you just need to apply. I think public authorities will need to have the courage to experiment’ (Interview: Chairman & Founder). This illustrates that more participation is not seen as a natural outcome of open data. Activists argue that it requires a cultural change within public institutions: a change towards a ‘beta culture’ that is willing to experiment and risk failure (Schwegmann, 2012), and a more collaborative and less authoritative relationship with citizens. Public institutions, it is argued, should promote the use of data and actively include citizens in decision-making processes: ‘It is not just about opening data…but also about investments from public institutions to ensure that this data is used’ (Interview: Committee member). Activists think that this cultural change will mainly happen at a local level, where issues are ‘closer’ to the people and institutions can experiment with ‘less resources’ (Interview: Chairman & Founder).
Taken together, the way activists apply the open source model of participation to governance results in a notion of a more open and flexible form of representative democracy. ‘Open’ refers to a higher degree of transparency (by sharing raw data) and the openness of political decision-making processes for public participation. ‘Flexible’ means that activists think that the inclusion and coordination of citizens’ voluntary, ‘self-selective participation’ should be adapted to the issue at hand and to the local context. At the same time, activists do not question representative democracy as such and are rather skeptical about elements of direct democracy, e.g. referendums: ‘I don’t know if direct democracy is always the right answer … but I definitely want more mechanisms to involve people more often’ (Interview: Chairman & Founder). From the perspective of democratic theory, they negotiate between representative models of democracy (in which participation is mainly limited to periodic voting) and direct models of democracy, where entire electorates vote on certain proposals. This is similar to Barber’s (2004) model of ‘strong democracy’, a more explicit attempt to develop an alternative to representative and direct democracy. Put briefly, strong democracy is based on a ‘creative consensus’ that is meant to recognize the diversity of interests and ‘is premised on citizens’ active and perennial participation in the transformation of conflict through the creation of common consciousness and political judgment’ (Barber, 2004: 224). Similarly, the diverse and flexible modes of organizing voluntary participation envisioned by OKF DE members require the active involvement of citizens and imply a consensus building process that is ‘creative’ in negotiating diverse interests
We can summarize the ideas and aims of open data activists described thus far to articulate—as an intermediary result—a first proposal about the conditions that must be met to support the agency of datafied publics: the transparency created through the sharing of raw data should be accompanied by a cultural change within public institutions to support voluntary and flexible forms of participation similar to those found in open source projects. As I will detail in the next section, activists not only emphasize the importance of public institutions, but also of other intermediaries to facilitate this participation.
Creating empowering intermediaries: Complementing or replacing journalism?
Even though the idea behind the democratization of information is to
In terms of agency, more interesting than the basic acknowledgment that intermediaries are necessary is what kind of intermediaries are deemed necessary to empower citizens. Three criteria can be identified that constitute an ‘empowering intermediary’ in the eyes of activists. First, they should be
To ‘create’ these intermediaries, activists try to cooperate with other NGOs and professional journalists and offer teaching. Here, they are part of a larger phenomenon: the increased interaction between the social worlds of technology and journalism, or more specifically between hackers and journalists (Karlsen and Stavelin, 2014; Lewis and Usher, 2013; Parasie and Dagiral, 2013; Royal, 2010). Members of the OKF DE are involved, for example, in Hacks/Hackers events (Lewis and Usher, 2014), where hackers and journalists come together to innovate news; the News Challenge of the Knight Foundation, an open-to-all contest rewarding projects that aim to transform news and information distribution (Lewis, 2012a); or the Knight-Mozilla Fellowships, which bring together hackers and technologists ‘to spend 10 months working on open source code with partner newsroom[s]’ like
However, activists do not only try to influence journalism by interacting with professional journalists or by becoming programmer-journalists in newsrooms. They also act as intermediaries outside the profession and develop independent, non-profit applications to ‘implement’ their ideas. Key here are so-called ‘civic technologies’: small-scale, specialized applications that aim to ‘connect people’ (Interview: Developer 1). These applications are either about improving government services for citizens, or about helping citizens to coordinate with each other to solve problems together. Often these are relatively simple web applications that focus on one task. For example, there are civic technologies that help people to exchange deposit bottles, that show how and where to engage in local building projects, that inform people about the local air quality, that visualize which parts of the city are barrier-free and which are not, and so forth. Even though civic technologies do not always depend on open data, data is key to their functioning in two ways: first, the availability of open data creates more opportunities to develop civic technologies (for example, when they require traffic data); second, they often datafy the activities they are concerned with, i.e. they often create new data. For example, FragDenStaat.de (inspired by the British WhatDoTheyKnow) makes it easier to submit freedom of information requests to public authorities and tracks both the requests and the responses from institutions. This crowdsourcing approach created a database that can be used to analyze and compare how different institutions react to these requests, what kinds of requests are more likely to be refused, and so forth. This illustrates that the development of civic technologies is not only interesting because it could support the agency of citizens. It also shows how activists use or create data to meet their own ends by developing tools to put their ideas into practice.
For OKF DE members, the purpose of these applications is twofold. On the one hand, they are supposed to help citizens to be more active and engaged in their local communities in a general sense, for example by helping people with disabilities to move around the city. On the other hand, they hope to create new communities or ‘alternative publics … with a controlling function’ (Interview: Developer 1). An often-cited example is Ushahidi, which was originally developed by a group of citizen journalists to track violent outbreaks after a disputed election in Kenya (Giridharadas, 2010). Because journalists received threats about their work, Ushahidi was designed as a crowdsourcing application that maps incidents reported anonymously by users. Both in the sense of more active and engaged citizens and of ‘controlling publics’, civic technologies are linked to a notion of ‘self-empowerment’ (Interview: Chairman & Founder) or ‘do-it-yourself-empowerment’ through data, understood as the ability of citizens to solve issues without the help of governments or businesses.
In terms of agency, the development of civic technologies by activists is interesting for another, less obvious reason. Civic technologies can be described as alternative ways of fulfilling functions traditionally described as ‘journalistic’ (making governments more transparent and accountable and engaging citizens in public issues) or of accessing and using public services (e.g. with an easy-to-use website to submit freedom of information requests). In other words, these applications are developed A recursive public is
By being able to maintain their own terms of existence (to a certain degree at least), recursive publics can act as ‘actually existing alternatives’. In this sense, civic technologies developed by activists could to some degree act as ‘actually existing alternatives’ to professional journalism or (ways of accessing) public services. Activists are well aware of this potential: The ultimate goal of developing alternative services with civic technologies is to pressure established institutions to adopt them. ‘Flagship projects’ (Interview: Chairman & Founder) are intended to demonstrate what is possible and to invite (or provoke) established institutions to imitate them. As one member notes: ‘We have discovered software as a lobbying tool’ (Interview: Developer 2). Let me illustrate this with another example: Frankfurt-Gestalten.de (∼‘Shaping-Frankfurt’) monitors information provided by local parliaments in the city of Frankfurt and displays it on a map. Users can check what is currently being discussed in their street or district (e.g. building projects), comment on it or initiate new discussions. Activists use this project to advocate for easier access to local parliamentary data, and for local public institutions to offer similar services. Moreover, I suggest that applications like Frankfurt-Gestalten.de represent a data-driven form of local journalism that is focused on engaging citizens on a local level. As such, Frankfurt-Gestalten.de has a complex relationship with professional journalism: First, it could complement professional journalism because local journalists can use it as a research tool. Second, however, it also represents a potential threat to professional local journalism if people use an application like Frankfurt-Gestalten.de instead of consulting their local news media.
Yet it is also conceivable, third, that news media develop and maintain similar applications themselves, offering them as services to their audience and using them as research tools for their own investigations; Bell (2014) recently made a similar suggestion. This example illustrates how activists attempt to directly or indirectly influence established institutions on many different levels through the development of civic technologies, and shows that acting as intermediaries themselves is as much about directly putting ideas into practice as it is about transforming existing institutions. It not only shows how activists use data to directly meet their own ends, but also how they attempt to influence the conditions of the wider public to support the agency of ordinary citizens.
Conclusion: Data hacking and new forms of agency?
I conclude by returning to the questions raised at the beginning of this article. What do the practices and values developed by members of the OKF DE tell us about the conditions under which datafication can support agency?
When we look at activists themselves, datafication obviously does not undermine, but rather supports their agency in important ways: their technological expertise enables them to utilize or create data to meet their own ends. They even use the applications they create as lobbying tools that pressure institutions by offering actually existing alternatives. These findings emphasize the connection between datafication and the proliferation of hacking culture. The ability to ‘hack’ and to create recursive publics fundamentally depends on the availability and modifiability of the underlying technology (Kelty, 2008: 10–11): participants have to be able to access and modify the technology needed to build their own, independent infrastructures. Otherwise, the expressive use of technology—the expression of imaginaries, values and rationalities
Moreover, members of the OKF DE are primarily concerned with how they can support democratic values and the agency of citizens through open data. As I showed in this article, three interrelated conditions must be met in their eyes: raw data should be shared openly to make decision-making processes more transparent, public institutions should actively include citizens in these decision-making processes to create a more open and flexible form of representative democracy, and ‘empowering intermediaries’ are needed to make raw data accessible to the wider public. It seems clear that these propositions have a
One direction for future research suggested by this analysis is to look at the way activists’ practices and ideas are institutionalized, i.e. how they are adopted by other NGOs, news media, or public institutions. As activists acknowledge themselves with their emphasis on the importance of empowering intermediaries, their influence on the wider public (and therefore their potential to support the agency of datafied publics) depends on transforming existing institutions rather than on building new, alternative ones. To study these processes, we can take further inspiration from Couldry’s (2010: 1) concept of ‘effective voice’, the assurance that ‘my voice matters’, which is a crucial aspect of both agency and democratic legitimacy. We can argue that activists describe important preconditions for processes of effective voice in datafied societies. Yet we have to be critical about whether the adoption of activists’ practices and ideas really leads to
