Abstract
Introduction
In recent years, the healthcare field has welcomed an emerging set of practices captured under the umbrella term of ‘Big Data’. Big Data initiatives are welcomed because of their envisioned benefits for faster and more representative knowledge that is presumed to improve the process, management and predictability of care (Murdoch and Detsky, 2013). The healthcare field traditionally favours high-quality evidence from randomized controlled trials (RCTs) and observational studies to guide treatment decisions and to organize the field (Timmermans and Berg, 2003). However, as the persistent discussions about evidence-based medicine show, the field has been struggling with the reductionist and generalized character of this evidence (Berwick, 2016; Greenhalgh et al., 2014). Patient guidelines are, for example, often based on time-consuming RCTs conducted on selective populations, which makes it hard to extrapolate results to individual patients (Felder and Meerding, 2017). Big Data seem to offer an attractive alternative and are surrounded by claims of quick and comprehensive analysis of data, ‘with the aura of truth, objectivity and accuracy’ (Boyd and Crawford, 2012: 663). These grand promises lead to a positive rhetoric that surrounds the term and that drives implementation of Big Data in healthcare.
Publications about Big Data frequently discuss topics related to knowledge generation, evidence and causation (e.g. Anderson, 2008; Mayer-Schönberger and Cukier, 2014). Provocatively, these publications celebrate the inevitable decline of traditional research as Big Data are supposed to handle large volumes of messy real-world data more efficiently and can uncover hidden correlations. In response to these claims, there has been a recurrent call for more studies into the epistemological implications of Big Data (Boyd and Crawford, 2012; Crawford et al., 2014; Mittelstadt and Floridi, 2016), which scholars have started to address. As a result, a critical scholarly discourse that reflects on how Big Data shape our knowledge and understanding is forming in, primarily, the fields of Science and Technology Studies (STS) and Critical Data Studies (e.g. Kitchin, 2014; Leonelli, 2014; Rieder and Simon, 2016). While these fields have been instrumental in elaborating the neglected and problematic dimensions of Big Data, it remains an open question how and to what extent such insights become embedded in other fields, such as healthcare.
This paper critically reviews the epistemological claims and envisioned implications that accompany Big Data in the healthcare domain. The healthcare field is characterized by a strongly institutionalized set of epistemological principles and generally accepted scientific methodologies (Timmermans and Berg, 2003). Big Data challenge these principles and methodologies with the consequence that the epistemological implications of Big Data practices could be particularly profound. What we value as evidence and knowledge has implications for the way medical decisions are taken and healthcare is organized. Opening up the assumptions allows us to evaluate the role of Big Data in healthcare critically and open up opportunities for debate and fruitful intervention.
We base the paper on a systematic and comprehensive review of scientific editorials as these, in particular, summarize and reflect upon developments in the field. We focus on discourses surrounding Big Data in the analysis and construct five ideal-typical discourses based on a detailed analysis of the language conveyed in the editorials. The discourses show the diverse ways in which Big Data and the epistemological claims are conceptualized. We chose this focus as language is the medium through which people come to understand Big Data and it influences the way Big Data initiatives are performed and legitimated. Three questions guide our analysis:
(1) What Big Data discourses can be identified in scientific healthcare literature? (2) How do the discourses conceptualize the meaning of evidence? (3) What are the consequences of these conceptualizations for the way Big Data is understood in healthcare?
Big Data as material practice and semantic reality
Many authors have discussed the ambiguity surrounding the term Big Data. The term is often characterized by its volume, velocity and variety (‘the 3Vs’; Mayer-Schönberger and Cukier, 2014). However, many believe that these three characteristics do not sufficiently capture Big Data. The 3Vs are thus often extended with extra ‘V’s, such as value, viability, variability, visualization and veracity (DeVan, 2016; Kitchin and McArdle, 2016). Others use different qualifications to characterize Big Data, such as exhaustivity, relationality, extensionality and scalability (Boyd and Crawford, 2012; Kitchin and McArdle, 2016; Mayer-Schönberger and Cukier, 2014). Despite the many attempts, there is still no consensus about the term Big Data.
Inspired by the approach of Beer (2016) and Rudinow Saetnan et al. (2018), we conceptualize Big Data as a set of practices and ideas that exist both (1) in real material practice and (2) in a semantic reality. First, Big Data exist in specific actions, technologies and initiatives that are introduced to restructure healthcare. They are linked to the collection and aggregation of available data and to correlation, pattern-recognition and predictive analyses. These data and analytics are subsequently used in real initiatives that aim to collect data, track, profile and predict behaviour, preferences and characteristics (Mittelstadt and Floridi, 2016). Second, Big Data exist in a semantic reality, as they are something that we talk and write about in order to anticipate their (possible) effects. In this semantic reality, we envision and give meaning to the present and future of Big Data. Of course, the way we describe Big Data subsequently influences the way Big Data are performed and legitimated and vice versa.
In this paper and our analysis, we focus on the semantic reality of Big Data and on the discourses and metaphors that make up this reality. This is not to argue that detailed empirical investigations into material practices are less important. However, if we want to explore the implications of Big Data we also need a better understanding of how Big Data are discursively constructed. The crucial role of metaphors in people’s experience and sense-making of the world has long been recognized (Lakoff and Johnson, 2011), as metaphors play a large role in framing debates in particular ways. Metaphors are not neutral; they embody assumptions and imagined implications, and they impose opportunities and limitations (Puschmann and Burgess, 2014; Zinken et al., 2008). This makes metaphors especially valuable as we want to open up the epistemological claims and assumptions that accompany Big Data in healthcare.
Methodology
We conducted a comprehensive and systematic search of scientific literature to show the different ways in which Big Data and its epistemological claims are being articulated in the healthcare field. We chose this approach because we did not want to miss major views and also wanted to gain insight into the relative spread of the articulations. Although our search of the literature fits the methodological approach of a systematic literature review, we subsequently departed from this approach in the interpretation and analysis of the results. While a ‘traditional’ review counts and synthesizes the results and provides an exhaustive
Identifying relevant studies
A search term was composed with the help of a librarian to select the relevant studies. The search term covered terms related to (1) ‘healthcare’ and (2) ‘Big Data’ and related techniques, such as data mining. We wanted to be as inclusive as possible. The librarian and the first author looked for mentions of the term Big Data in relevant studies and included those. Also, they started with a small list of techniques related to Big Data and iteratively added further techniques to the search term if these were frequently mentioned in the studies found and resulted in relevant studies. The minimum requirement for inclusion was the mention of unusually large data sets or combinations of diverse types of data sets. We chose not to include the search term ‘artificial intelligence’ as this resulted in thousands of additional studies for inclusion. In addition, we decided not to include ‘knowledge’, ‘evidence’ and related terms in the search profile, because we assumed that even studies that do not mention these terms can still make epistemological claims. The exact search terms are listed in Appendix 1. Eventually, we conducted the extensive search in Embase, Medline Ovid, Web of Science, Scopus, LISTA EBSCOhost and Google Scholar in January 2017.
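The two-block structure of the search term described above can be illustrated with a minimal Python sketch. It is an illustration only: the exact terms are those in Appendix 1, and the lists below are placeholders. Only ‘healthcare’, ‘big data’ and ‘data mining’ are taken from the description; ‘health care’ and ‘medicine’ are hypothetical additions, as are the function names and the generic Boolean syntax, which each database would require in its own dialect.

```python
def or_block(terms):
    """Join the synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*concept_blocks):
    """Combine concept blocks with AND so a hit must match every concept."""
    return " AND ".join(or_block(block) for block in concept_blocks)


# Placeholder term lists (hypothetical; see Appendix 1 for the real search terms).
healthcare_terms = ["healthcare", "health care", "medicine"]
big_data_terms = ["big data", "data mining"]

query = build_query(healthcare_terms, big_data_terms)
print(query)
```

Adding a technique iteratively, as described above, then amounts to appending a term to the second list and rebuilding the query.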
We chose to limit our search to editorials from scientific journals in the healthcare domain because of their distinct characteristics. Editorials are expressions, reflections or commentaries on developments. They are a medium for editors, researchers and clinicians to communicate with peers and informed publics, as well as a forum for the explicit expression of beliefs and opinions (Loke and Derry, 2003; Miller et al., 2006). They can contain substantial scientific content, compelling messages, calls for action and discuss little known scientific facts with far-reaching consequences (Rousseau, 2009). They are usually written by the journals’ editors or leading authors of the field. Editorials are often accessed and appear in well-regarded academic journals (Loke and Derry, 2003; Youtie et al., 2016). We selected editorials instead of viewpoints and opinion articles because we assume that editorials have a more critical role in defining the standpoint of the journal as compared to presenting the opinions of individuals. Lastly, editorials set the agenda for specific research fields and are a basis for future action. Hence, we believe that editorials capture Big Data discourses in the scientific community and have an important function in disseminating assumptions about Big Data in the healthcare domain.
Given the size of the original body of selected documents, further selection criteria were needed to obtain a manageable data set for detailed analysis. Hence, we chose to define a timeframe (2012–2016) for the review. As other studies have, we noticed an exponential increase in the number of publications about Big Data in general in 2012 (Youtie et al., 2016). Therefore, we chose 2012 as the starting point. Also, we included only English language editorials for practical reasons. If we could not find the editorial text online, we contacted the first author to gain access. In 24 instances, this did not work, and these documents were excluded because we could not access the full text.
The final selection of documents contained 1204 original documents. The first author of this paper read the title and abstract or the first and last paragraphs (if an abstract was unavailable) and excluded the irrelevant texts. Documents were excluded in close cooperation with the second and third authors because they either did not qualify as editorials or were outside the scope of this review (i.e. documents that were not about Big Data or were unrelated to health or healthcare). After screening, 206 editorials were eventually included for detailed review (see also Figure 1). Appendix 2 provides an overview of the included editorials.
Figure 1. Selection of the editorials.
Data analysis
The analysis was conducted in three phases. First, the first author randomly selected 20 editorials and flagged sections of interest. The authors of the paper discussed trends in the editorials and composed a list of questions that would be relevant to answer for each editorial. Subsequently, the first and second author both analysed another 20 editorials and the list of questions was finalized. The list contained questions about (1) conceptualization of Big Data (e.g. how is Big Data described?), (2) the epistemological position (e.g. what is described as a good way of obtaining evidence/knowledge?), (3) the envisioned consequences (e.g. how are outcomes of Big Data used?) and (4) noticeable discursive elements, such as metaphors and surprising examples or comparisons. In the second phase, all remaining editorials were analysed with the finalized analytical scheme by the first author, second author and a junior researcher. The questions were answered for all the editorials and organized in a spreadsheet. Ten per cent of the editorials were also analysed by another member of the research team to ensure analytical consistency. Third, to organize and interpret the spreadsheet and to construct the ideal-typical discourses, the authors of this paper jointly tested, critically interrogated and experimented with the analytical themes and organization of results until consensus was reached about the structure and characteristics of the several discourses. This process eventually resulted in the construction of the five discourses.
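The consistency check in the second phase can be sketched as follows. This is a hedged illustration, not the procedure actually followed: the paper does not state how the ten per cent subset was chosen or whether an agreement statistic was computed. The sketch assumes a seeded random draw from all 206 included editorials and simple percentage agreement on the assigned discourse; all function names and the example codes are hypothetical.

```python
import math
import random


def draw_double_coding_sample(editorial_ids, fraction=0.10, seed=0):
    """Draw a reproducible random subset of editorials for a second coder."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    k = math.ceil(len(editorial_ids) * fraction)  # round up to whole editorials
    return sorted(rng.sample(editorial_ids, k))


def percent_agreement(codes_a, codes_b):
    """Share of double-coded editorials on which both coders assigned the same code."""
    shared = codes_a.keys() & codes_b.keys()
    if not shared:
        raise ValueError("no editorials were coded by both coders")
    matches = sum(codes_a[i] == codes_b[i] for i in shared)
    return matches / len(shared)


# Hypothetical identifiers 1..206 standing in for the included editorials.
editorial_ids = list(range(1, 207))
sample = draw_double_coding_sample(editorial_ids)

# Made-up codes for two editorials, purely to show the calculation.
coder_1 = {1: "modernist", 2: "pragmatist"}
coder_2 = {1: "modernist", 2: "scientist"}
agreement = percent_agreement(coder_1, coder_2)
```

Percentage agreement is used here only because it is the simplest measure; a chance-corrected statistic could be substituted without changing the sampling step.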
Results
Description of data set and overview of findings
Based on our analysis, we were able to construct five ideal-typical discourses: modernist, instrumentalist, pragmatist, scientist and critical-interpretive. We drew inspiration for the names of the discourses from the relations we saw between implicit assumptions about evidence and knowledge and diverse philosophical and epistemological positions. The discourses were distributed over the editorials in the following way: modernist (
Presence of the ideal-typical discourses in the editorials.
Overview of the discourses.
The modernist discourse: Capturing data
The conceptualization of Big Data
In this ideal-type, Big Data are often not defined, but the editorials link it to large amounts of data. Big Data are described as a positive development and the editorials stress the beneficial effects of Big Data. They state, for example, that it will lead to proactive, predictive, preventive, participatory and patient-centred health (Shah and Tenenbaum, 2012; Weinstein, 2016). However, the precise meaning of these statements often remains unclear and ambiguous, as they are not discussed further.
The editorials unanimously and unambiguously recommend the use of Big Data in healthcare. This is emphasized by three rhetorical techniques. First, the tone of these editorials is optimistic, signified by such words as ‘explosion’, ‘revolutionizing’, and ‘world-changing possibilities’. Big Data are presented as innovative and as a rupture with the past that will radically transform healthcare (Restifo, 2013; Weinstein, 2016). Second, a sense of urgency is created in the editorials as they often draw a contrast between the medical domain and other sectors that supposedly already take advantage of Big Data. The medical domain is presented as slow, conservative and old-fashioned, while other domains are already taking Big Data analytics for granted. This discursively constructs the field of medicine and its current approaches as unsustainable and outdated (MacRae, 2012; Risoud et al., 2016). Third, there is almost no attention to the negative sides of Big Data, such as potential issues with privacy and the consequences of shifting power relations, or to practical questions concerning implementation. Illustrative of this position is that non-use of Big Data is almost completely absent as a theme in this discourse.
Epistemological assumptions
Capturing data is the metaphor (Figure 2) that most clearly illustrates the epistemological assumptions in the modernist discourse. First, the modernist discourse assumes data to exist in the world and to have inherent value (like a butterfly or other natural resources). The assumptions are that the data can be captured and that this results in new insights, evidence and practices. Second, capturing is a relatively simple act that leaves the data itself unaffected, which reflects the ease with which Big Data are portrayed in these editorials as arriving at knowledge. This process is viewed in such simplistic terms that data seem to equal knowledge. This creates the idea that merely ‘capturing data’ already leads to new knowledge.
Figure 2. Capturing data metaphor.
Consequences
The modernist discourse strives for a radical change as the traditional ways of knowledge production in the medical domain are rejected. Editorials in the modernist discourse aim to overthrow the status quo in order to transform knowledge production in healthcare radically. Big Data are seen as a legitimate source of knowledge in these editorials because Big Data are argued to lead to more timely and reliable knowledge that is viewed as immediately useful in practice. However, the discourse seems to be naïve in the sense that it only addresses grand visions and is not concerned with, for example, the practical development and application of Big Data, nor with the societal effects.
The instrumentalist discourse: Illuminating data
The conceptualization of Big Data
In this ideal-type, Big Data are understood in terms of a range of analytical techniques, such as pattern-recognition, data mining and machine learning (Amato et al., 2013). The editorials have a positive tone and describe ways in which these Big Data techniques can aid healthcare, for example by predicting disease outcomes and increasing the understanding of the causes of diseases (Belgrave et al., 2014; Van De Ville and Lee, 2012). The editorials typically discuss how analytic techniques should be used and how they can be improved. The editorials contain advice on how one should deal with the missing data, correlated features and replication and separation of training and validation sets.
The editorials recommend that Big Data techniques should be developed and enhanced to gain better results. Editorials in this discourse place a high value on experimentation. For example, innovative studies in which Big Data techniques are used for brain decoding and the development of clinical decision support systems are presented (Najarian et al., 2013; Van De Ville and Lee, 2012). Using Big Data techniques for these purposes is by no means standard practice, but by trying out and experimenting with data analytic processes, the techniques are improved. Illustratively, terms like improving, experimenting, exploring, developing and learning frequently occur in the instrumentalist editorials.
Epistemological assumptions
The illuminating data metaphor (Figure 3) best represents the epistemological assumptions in the instrumentalist discourse and is exemplified by phrases such as ‘casting light’ and ‘highlighting’ in the editorials. Similar to the modernist discourse, in the instrumentalist discourse data seem to exist in the world and are viewed as having an intrinsic value. However, the process of knowledge discovery through Big Data is depicted in less simplistic terms than in the modernist discourse, as the editorials emphasize that information can only be extracted from highlighting the data with specific analytic techniques so that patterns in the data can be seen (Amato et al., 2013; Rosenstein et al., 2014). This is an indirect critique of the more traditional methods for knowledge generation, which are implicitly depicted as outdated and inefficient. The editorials thus suggest that by constructing and positioning the ‘light sources’ (e.g. the analytic techniques), we are increasingly able to ‘see’ the data and emerging trends within them. This means that knowledge improves together with the set of analytical techniques.
Figure 3. Illuminating data metaphor.
Consequences
The instrumentalist discourse promotes the use of Big Data techniques in healthcare as they become a reliable source for decision-making. Less radically than the modernist discourse, editorials in this discourse still argue for a change of the ways knowledge is obtained in healthcare, as Big Data are expected to solve persistent problems in healthcare. The discourse seems to envision Big Data as a tool to solve problems and the tool is valid to the extent that it helps to make accurate predictions and increases our understanding. However, similar to the modernist discourse, the instrumentalist discourse also neglects the broader implications and potential societal effects of the use of Big Data techniques.
The pragmatist discourse: Harnessing data
The conceptualization of Big Data
In this ideal-type, Big Data are conceptualized as a useful (managerial) instrument for problem-solving and decision-making in healthcare (Garrison, 2013; Klonoff, 2013; Potters et al., 2016). Big Data are discursively constructed in the editorials as a phenomenon that is already here and is likely to stay (Basak et al., 2015; Ghani et al., 2014; Hay et al., 2013). Big Data are described as a positive development. However, in this discourse, people are presumed to have a significant influence on the way Big Data take shape, as opposed to the more technological determinist pattern of thinking that characterizes the modernist discourse.
The editorials in this discourse primarily focus on how Big Data should be implemented and describe the steps for successful implementation. They discuss, for example, the training, recruitment and the introduction of data scientists or knowledge engineers, cultural factors that need to change in healthcare, new rules and regulations that have to be made, the adoption of new platforms and information systems, and how access should be gained to the data and analytics (Cases et al., 2013; Kottyan et al., 2015; Narula, 2013; Potters et al., 2016). The editorials do mention concerns and other challenges that need to be overcome or solved, as the following quote from McNutt et al. (2016: 914) illustrates: ‘We envision future systems that incorporate [Big Data] decision support models into the clinical systems in ways that enable clinicians to improve both the quality and the safety of care they give and the efficiency with which they give it. To reach this vision, there remain technological needs and human challenges to overcome.’
Epistemological assumptions
The metaphor of ‘harnessing data’ (Figure 4) best illustrates the ideas and assumptions about Big Data in the pragmatist discourse. Similar to the previous discourses, data continue to be described as something ‘out there’, simply existing in the world. The data are viewed as valuable as they can be translated into information and knowledge. What differs is that this discourse sees traditional scientific and Big Data methods as complementary approaches that can both generate ‘evidence’ and have practical relevance (Basak et al., 2015; Klonoff, 2013). A more pragmatic attitude towards evidence seems dominant, as evidence is not strictly related to scientific processes. There are no fundamental objections against using Big Data outcomes. Big Data are viewed as beneficial whenever they help to gain knowledge about situations that traditional scientific methods cannot study, and decision-makers pragmatically make choices on the basis of the available evidence. Discussions about the status of the outcomes of traditional scientific studies and Big Data analyses recede into the background in this discourse, as the actionable character of the outcomes is emphasized.
Figure 4. Harnessing data metaphor.
Consequences
Similar to the instrumentalist discourse, the pragmatist discourse envisions a change in the way decisions are taken as Big Data offer more knowledge than currently is available and can generate useful new insights for healthcare practice. Big Data are seen as a valuable source for decision-making next to traditional knowledge producing approaches. This discourse deals – more than the previous discourses – with some of the practical issues surrounding Big Data implementation (such as the recruitment of data scientists). However, the epistemological and normative changes that Big Data bring are not addressed.
The scientist discourse: Selecting data
The conceptualization of Big Data
In this ideal-type, Big Data are described as a new trend that deals with data collection, analysis and outcomes in a less rigorous way than scientific methodologies do. The editorials mention that Big Data can be useful in some situations because of their potential for identifying valuable research directions, generating hypotheses and exploring massive data sets (Khoury and Ioannidis, 2014; Krakoff and Phillips, 2016). Big Data can thus only be used in an exploratory way, hinting at possible directions for traditional research designs. The tone of the editorials is critical, especially compared with the modernist discourse, and Big Data are seen as a potentially dangerous development.
The editorials argue for caution with regard to Big Data and claim that traditional scientific methods will remain essential despite the arrival of Big Data methodologies. The editorials try to distinguish ‘proper’ from erroneous science. They do this, for example, by comparing Big Data outcomes with findings from RCTs (Freeman and Saxon, 2015). Some editorials mention the limitations of traditional studies. For example, they state that RCTs are costly or not always possible because of ethical considerations (Freeman and Saxon, 2015; Leem, 2016). However, the consensus seems to be that despite the potential of Big Data as a starting point for research, Big Data analyses always need to be followed by more substantive research. Or as Khoury and Ioannidis (2014: 1054) state in their editorial: ‘We should embrace (and not run away from) principles of evidence-based medicine.’
Epistemological assumptions
The epistemological assumptions about Big Data within this discourse can be summarized by the metaphor of ‘selecting data’ (Figure 5). The notion that Big Data can lead to reliable and valid knowledge is questioned and sometimes outright denied in the editorials. Two arguments are frequently made. First, the editorials stress that data are essential to arrive at knowledge. However, data are not viewed as pre-existing in the world. As such, they cannot simply be captured, illuminated or harnessed, but need to be selected and processed via specific methods. This position is reinforced by statements like ‘garbage in, garbage out’ (denoting the idea that failing to select ‘high-quality’ data from the masses of available, often poor-quality data leads to useless analyses), or by presenting the data of Big Data as erroneous or as a ‘dumping site’ (Brown, 2016; Patrick, 2016). By discursively contrasting high-quality data with ‘garbage’, the editorials point to the need to have the proper procedures for data gathering and analysis in place. Such procedures are meticulous and less easily abandoned than presumed in, for example, the modernist discourse. Second, the editorials problematize the assumption that more data equal better knowledge. This idea is widespread in the modernist, instrumentalist and – to some extent – pragmatist discourses. According to editorials in the scientist discourse, this assumption is wrong. As Onukwugha (2016: 92) explains: ‘We cannot assume that more data necessarily means more information. Indeed, as the volume of data increases, it will be important to pay continued (or more) attention to established concerns regarding measurement, bias, and fallacies relevant to empirical analysis and interpretation.’
Figure 5. Selecting data metaphor.
Consequences
The scientist discourse argues against a radical change in healthcare because, according to this discourse, Big Data are not a reliable source of knowledge. The only proper knowledge seems to be scientific knowledge, and such knowledge can only come from the use of strict scientific methods. The consequences of Big Data would be erroneous evidence and knowledge with possibly large, detrimental effects. This discourse discusses in depth the epistemological concerns and how Big Data relate to traditional structures for knowledge generation.
The critical-interpretive discourse: Constructing data
The conceptualization of Big Data
In this ideal-type, Big Data and data are presented as an oversimplified presentation of reality. The critical-interpretive discourse incorporates diverse forms of criticisms. Generally, the editorials share a concerned tone and their criticisms are both epistemological and societal.
The editorials advocate discussion on the position of Big Data in our society as a whole. Two lines of critique can be distinguished in this discourse. First, the simplicity of data is frequently addressed. Big Data are dismissed because they are a reductionist and oversimplified presentation of reality, unable to adequately capture and account for the richness and diversity of human experience. Editorials make this point by describing data that are missing in Big Data sets and by stressing the importance of personal experience, objectives and preferences (Pope et al., 2014; von Gunten et al., 2016; Zurlinden, 2016). Second, the editorials stress the normative aspects of Big Data and point out that these aspects are often overlooked or neglected. The editorials, for example, focus on the danger of Big Data that are not interpreted by physicians and warn that Big Data can be a first step towards ‘dangerous’ automatic decision models. As Von Gunten et al. (2016: 1240) state: ‘It [Big Data outcomes] must be interpreted by a seasoned clinician with critical thinking skills.’
Epistemological assumptions
The epistemological assumptions that characterize editorials in this discourse can be best understood via the metaphor of ‘constructing data’ (Figure 6). In terms of epistemological assumptions, the critical-interpretive discourse is the most distinct from the other discourses as it reasons from a different set of epistemological assumptions (building on constructivist traditions in philosophy of science as opposed to positivist approaches). Consequently, data are no longer presented as something given that can be captured or illuminated, but understood as the result of the social and political processes that created them. As Pope et al. (2014: 68) state: ‘We must remember that all data – big or small – are socially constructed.’ This perspective means a recognition that data always emphasize certain aspects of the world while leaving out other elements. Importantly, the constructed data present an image, but editorials in this discourse warn that this image can never be complete. This discourse can especially be contrasted with the modernist discourse, in which the ideal of ‘complete knowledge’ is maintained. Big Data, therefore, according to the critical-interpretive discourse, will always generate limited knowledge and data have to be handled with care.
Figure 6. Constructing data metaphor.
Consequences
The critical-interpretive discourse warns of the limitations of Big Data. According to this discourse, while Big Data create new possibilities for generating knowledge, the use of these possibilities is not seen as a positive change. The starting point is that it is better not to use Big Data (or at most only with great restraint). The consequences of Big Data would be that limited data are extrapolated, leading to erroneous outcomes that could cause harm to people and healthcare systems. In addition, if the use of automated decision models, for example, keeps people from recognizing that data are constructed, essential aspects of care would be lost.
Discussion
Reviewing literature is a first step in gaining a better understanding of the epistemological implications of Big Data in healthcare. Based on a systematic literature search and subsequent interpretive analysis, we constructed five ideal-typical discourses of Big Data in healthcare. These five discourses all highlight particular aspects of Big Data, neglecting others, and thereby frame Big Data and its (epistemological) implications in specific ways. Studying them matters because discourses and metaphors pre-structure the way that the material practices of Big Data take shape. As such, they are highly consequential in shaping current and future debates on Big Data. In this discussion, we will take the next step by drawing attention to the political dynamics of the discourses. We build on insights from STS and Critical Data Studies to point to issues that have been ignored or neglected in the current construction of the Big Data debate in healthcare editorials. We end with suggestions for future research.
We noticed that the discourses that frame Big Data in positive terms (modernistic, instrumentalist and pragmatist) were more present in our empirical material (
The discourses that frame Big Data in more critical terms (scientist and critical-interpretive) were less present in the editorials (
Especially editorials in the critical-interpretive discourse were limited (
We argue that the healthcare field would benefit from a more prominent critical-interpretive discourse, as three important issues that the other discourses do not address would otherwise be neglected: (1) The normative assessment of Big Data, for example, the role that automated decision models should play in the doctor’s office and issues related to data access and consent (Mittelstadt and Floridi, 2016). (2) Reflection on the situatedness of data. Data do not speak for themselves and we must remember that they are always an oversimplification of reality. Reflection on which aspects of a phenomenon are emphasized in the data and which aspects are occluded is therefore crucial (Boyd and Crawford, 2012; Mittelstadt and Floridi, 2016). (3) The social and political processes that create Big Data. While Big Data may seem objective to many, they remain subjective and contain biases and other limitations that should be opened up (Boyd and Crawford, 2012). We believe that the pragmatist discourse deals with the first issues too pragmatically, and that the scientist discourse deals with the last issues too statically and without enough attention to the social dynamics. Consequently, the healthcare field would benefit from more critical reflection and intervention.
Based on this review, we stress that the epistemological discussion in healthcare needs to be developed further and that we have to find ways to better integrate aspects of the critical-interpretive discourse in the healthcare domain. We suggest the following directions for further research:
Further study into the five ideal-typical discourses could provide important insights into the ways in which (and the extent to which) similar discourses and dynamics are also noticeable in other disciplines. Quantitative approaches could investigate correlations between the background of editors/authors and the discourses they endorse. As discourses are not only part of editorials, but also of broader cultural discussions, future research could study the various ways in which the semantic realities of Big Data intersect with material practices and vice versa. Especially warranted are comparative studies that open up the ways Big Data are depicted in different cultural domains and the sociotechnical imaginaries (Jasanoff and Kim, 2015) in which these depictions are embedded. Empirical reflections on the material practices of Big Data are warranted as well. Discourses and sociotechnical imaginaries are still part of theoretical discussions, while at the same time many Big Data initiatives are being started in healthcare. Studying such initiatives ethnographically is likely to provide highly valuable insights into the dynamic encounters between data and healthcare.
Conclusion
The fields of STS and Critical Data Studies have been instrumental in opening up discussions about the epistemological and ethical implications of an emerging field of practices, captured under the umbrella term ‘Big Data’. On the basis of this study, we have to conclude that these reflections have not been embedded in the healthcare field in any substantial way. Based on a systematic analysis of scientific editorials, we constructed five ideal-typical discourses to gain a better understanding of how Big Data are discursively constructed. We observed that editorials in the critical-interpretive discourse were limited (only 5.3%). We conclude that the healthcare field would benefit from a more prominent critical-interpretive discourse, since important reflections on the normativity and situatedness of Big Data, as well as the social and political processes that create Big Data, are not addressed by the other discourses.
