Abstract
This article is part of a special theme on Datafied Development. For a full list of all articles in this special theme, see: https://journals.sagepub.com/page/bds/datafied_development?pbEditor=true
Introduction
The convergence of technical advances in big data, artificial intelligence (AI), and algorithmic complexity, along with the growing accessibility and affordability of services integrating these technologies, is bringing foundational transformations to a humanitarian sector otherwise known for its constancy. An emerging frontier in this datafication of humanitarianism sees a number of startups and consultancies mobilizing datafied services to build what they view as representative localized depictions of humanitarian needs and context, without any need for direct interaction on the ground. These AI tools were originally developed to identify patterns, sentiments, and cultural nuances in large volumes of online data for purposes of market research, brand strategy, and product development. Targeting the humanitarian sector, they are now sold as services enabling spatio-temporal proximizations that put humanitarian organizations and people of concern into co-presence, bypassing distances through data.
Firms such as American “on-demand insights company”
Initially celebrated for its perceived ability to increase professionalization through standardization, speed, and objectivity (Raymond and al-Achkar, 2016)—or extolled for its supposed empowerment stemming from accessibility, transparency, and a wide stakeholder base that can provide information and resources in real time (Mulder et al., 2016)—the datafication of humanitarianism is now widely scrutinized for its potentially detrimental effects. A growing critical literature highlights how digital humanitarianism has enabled remote management techniques that sideline concerns regarding data regulation and privacy protection (Duffield, 2016) or questions the dominance of private corporations in shaping the utilization and outcomes of “extractive” data practices and systems primarily designed for commercial gain (Sadowski, 2019). The datafication of humanitarianism has more recently coincided with an increased focus on inclusion and empowerment in the humanitarian sector, epitomized by the
This article suggests that these two defining developments—the integration of big data to enhance professionalization and accountability and the localization agenda with its focus on inclusion—are now coalescing to reshape core tenets of the humanitarian sector. While localization suggests that proximity to crisis allows more efficient responses as locals have a deeper understanding of needs through better knowledge of the context—“a personal understanding of what needs to be done” (Elkahlout and Elgibali, 2020: 237)—digital humanitarianism encapsulates how new digital tools facilitate the collection of large amounts of data from a distance (Aarvik, 2020). By simulating (Kopytowska, 2015) or projecting proximity, these new digital tools claim an ability to construct a digitalized geometry capturing and representing localized needs and perspectives by scraping posts and likes from social media platforms and messaging services in a delimited geospace, effectively making the granularity of data an argument for decontextualization.
We argue that this framing of (big) data as a shortcut to localization presents a new development in the perpetuation of epistemic injustices and logics of paternalism in humanitarianism, as cultural context is deconstructed and recontextualized through the granularity of big data (Squire and Alozie, 2023). Specifically, we show that this form of datafied localization has the potential to exacerbate existing unequal power hierarchies in humanitarianism through three main concerns: the fabrication of contextualization; representation at a distance; and the reproduction of power imbalances. We examine these three concerns through a review of existing scholarship, drawing examples from the humanitarian literature. Methodologically, the article builds on conceptual work on the entrenchment of big data practices within existing unequal practices of power and on extensive fieldwork in places of humanitarian disaster (Clausen, 2019; Fejerskov, 2022). The conceptual work is complemented with data from online public sources and informed by insights gained from attending conferences, meetings, and events hosted by institutional actors where so-called thought leaders, including AI practitioners, data analysts, CEOs, and representatives from the humanitarian sector, meet and discuss how AI can be harnessed to meet future challenges.
The rest of the paper is structured as follows. We start by introducing key debates on localization and big data related to distance and proximity in humanitarianism, outlining how this article places itself within a long-standing debate in humanitarianism on the impact of distance and proximity on relational ontologies of connection that dictate the manner, extent, and ways in which individuals extend assistance to others, a perspective aligned with Bauman's (1990) and Levinas's (1979) conceptualizations of proximity in morality through concepts such as “face” and “face-to-face.” Next, we tease out the three main concerns that we identify as particularly relevant for how the mobilization of big data to convey local perceptions of needs can perpetuate epistemic injustices and power imbalances in humanitarianism. The first is the fabrication of contextualization, a form of recontextualization that allows an understanding of big data as something that travels pristinely from one context to another and across sectors. The second concerns representation at a distance through the implementation of services specifically aimed at capturing unbiased and representative beneficiary preferences. The third is the reproduction of power imbalances as implied in localization efforts that are primarily market driven and defined by actors in the Global North. In the conclusion, we bring together the core concerns outlined in the previous sections. While these are emerging trends, the article's analysis aims to inform future inquiry and debate on the ways that datafied localization might reproduce exclusionary practices in humanitarianism.
Datafied localization in humanitarianism
Distance and connectivity have long been central topics in humanitarianism, especially in discussions of humanitarian communication that aims to bridge time and space gaps to foster empathy and support (Meyrowitz, 1985; Tomlinson, 1999). Efforts to deterritorialize experiences have focused on making spatially and temporally distant events relatable locally. However, decolonial scholars contend that this approach tends to decontextualize individuals, reducing aid recipients to faceless biological beings. Instead, they argue for including diverse perspectives in humanitarian work to challenge the traditional helper-receiver dynamic (Brun and Horst, 2023; Radice, 2022). Localization seeks to move away from this “politics of pity” (Boltanski, 1999) that positioned beneficiaries as being without agency. The localization policy agenda has come to dominate humanitarian discourse as a response to the critique that humanitarianism has been dominated by Western responses to conflicts and disasters, sidelining local actors who have historically received less than 0.3% of formal system funding (Poole, 2018; see also Ayobi et al., 2017; Metcalfe-Hough et al., 2020). Epitomized in the 2016 Grand Bargain, this political agenda aims to empower local communities and local humanitarian organizations through increased funding, capacity building, equitable partnerships, and inclusive coordination platforms.
The backdrop to the localization agenda is an expansive literature that has shown how local participation and leadership create more effective global responses (Honig, 2018; Fox, 2020; Khoury and Scott, 2024). The argument foregrounds that proximity to crisis leads to faster and more contextually relevant responses (Cretney, 2015; Khan and Kontinen, 2022; Macrae, 2008). However, an unequal hierarchy between international (Western) humanitarians and locals, a category that itself has been criticized for being reductionist, remains (Shuaib, 2022). Despite rhetorical emphasis on partnership, equality, and commitment to bottom-up decision-making, the literature documents how humanitarian collaborations frequently result in hierarchized relationships where local non-governmental organizations (NGOs) act as subcontractors with limited decision-making power (Obrecht et al., 2022; Kraft and Smith, 2019; Schenkenberg van Mierop et al., 2020). This has led scholars to critique the humanitarian system as perpetuating colonial practices (Barnett, 2011; Melis and Apthorpe, 2020; Pailey, 2020). Consequently, a key debate in the localization literature revolves around the tension between inclusion and transformation (Fast and Bennett, 2020; Pincock et al., 2020). The integration perspective advocates for the integration of local actors into the existing international humanitarian system, making funding and coordination mechanisms more accessible, while adherents of the transformation perspective advocate systemic adaptations to address deep-seated imbalances embedded in colonial or neocolonial dynamics (Robillard et al., 2021). In sum, progress on the localization agenda remains uneven, with divergent interpretations and applications of localization in scholarly discourse (Lucatello and Gómez, 2022; Mulder, 2023; Roepstorff, 2020).
While international humanitarian organizations are limiting their direct physical presence in targeted areas, digital technologies are introduced to maintain a sense of proximity (Tammi, 2022). Humanitarian organizations seek to gain real-time insights into the evolving dynamics of a particular crisis by harnessing a plethora of data sources, including satellite imagery, social media feeds, and mobile communication patterns (Meier, 2015). In this way, technologies like big data analytics are applied to enhance risk assessment, resource allocation, and decision-making in crisis response (Burns, 2014: 52; 2015). The result is the integration of big data into a variety of humanitarian interventions, ranging from customized healthcare systems (Amankwah-Amoah, 2016), real-time environmental monitoring and crisis mapping (Specht, 2020), to the registration of biometrical data points to identify and to track individuals or groups of people (Jacobsen, 2015). But humanitarian actors go further in seeing digitalization as having the ability to simulate “the experience of proximity” (Collinson et al., 2013). The International Committee of the Red Cross (ICRC), for example, does not see the digital as solely a method or approach to deliver aid but as a space to connect and interact—a novel dimension in the proximity and distance nexus of humanitarian action. According to the ICRC: “This involves combining physical proximity and digital proximity with the aim of interacting more closely with affected populations – who are increasingly online – while handling and protecting personal data responsibly” (ICRC, 2020: 5). This introduction of big data into humanitarian efforts has reinforced the involvement of specialized private companies in data extraction and analysis, monetizing (already) generated data, or using opaque methods and methodologies that end up informing the aid response (Taylor and Broeders, 2015). 
These companies perceive complex datasets as opportunities for efficient business growth, leveraging patterns, market trends, and knowledge (Sadowski, 2019). While our analysis does not delve into the nuances of data control domination, we underscore the intertwined processes of datafication and localization in humanitarianism. These processes reveal the sector's struggle to reconcile distance and proximity, often yielding seemingly contradictory responses. We highlight how the extraction and utilization of big data under the guise of localization hinder the envisioned decolonization of power structures within the humanitarian system. Specifically, we explore three main concerns: the fabrication of contextualization; representation at a distance; and the reproduction of power imbalances.
Fabrication of contextualization
The localization agenda advocates for a paradigm shift towards empowering local actors, assuming that proximity enables quicker and more efficient responses to humanitarian crises. However, critical literature highlights how a decontextualized understanding of individuals may worsen the unequal relationship between those positioned as helpers and those positioned as receivers in humanitarian contexts by reducing the latter to “bare” or “biological” life (Agamben, 1998; Brun and Horst, 2023). Drawing on this literature, we argue that the use of big data to make human suffering commensurable across borders, based on an individualist or universalist ontology of needs (Glasman, 2020), can exacerbate unequal power hierarchies in humanitarianism.
Digital universalism promotes digital media as a means of independence from local constraints and the challenges of proximity, representing a form of place agnosticism (Chan, 2014; Loukissas, 2019). In doing so, digital universalism “tends to assimilate the heterogeneity of diverse contexts and to gloss over differences and cultural specificities” (Milan and Treré, 2019: 324). The amount of data, and its very granularity, becomes an argument for its decontextualization. This notion is exemplified by the consultancy firm Premise, which highlights its capability to “capture an unparalleled level of granularity to enable precise analysis and targeted strategies” (Premise, 2024). More specifically, Premise presents itself as offering “an unparalleled field presence” resulting in scalable insights. The company's services are framed as an alternative to the “human-intensive endeavor” of fieldwork, yet one that produces decontextualized knowledge by meeting organizational “data needs across scales – from the hyperlocal to the global – and across social, cultural, physical, and political boundaries” (Premise, 2024). This is achieved by relying on gig workers who deliver localized data by accepting tasks on Premise's data platform, thereby linking the distant platform owner and client with on-demand workers (Couldry and Mejias, 2024). In other words, they operate as if big data produces depictions of reality that are simultaneously decontextualized and empirically situated. This speaks to a broader discussion on the explanatory value of decontextualization, understood as the belief that data can be removed from the situation in which it was learned and collected (Van Oers, 1998). Decontextualization is criticized for having limited explanatory value as it is focused on what is not there, i.e. context, but more importantly, if context is seen as subjective, decontextualization implies actions occurring without interpretation within a specific setting. 
Yet, that contradicts the idea that context shapes meaning.
McCosker and colleagues (2022) have argued that although big data can seem void of context, all data is local, embedded in socio-technical, cultural, and organizational contexts (McCosker et al., 2022: 2). Consequently, we argue that representations of humanitarian crises produced from a distance through big data can be understood as abstracted representations of people and social phenomena. This constitutes a fabrication of context or a recontextualization rather than decontextualization. Recontextualization signifies a shift from viewing big data as contextless to seeing big data as offering an image of an empirical reality crafted from real-time microdata, rich in detail but detached from specific geographical locations. Instead, the digitalized reality reflects power structures and a political economy, where those collecting data come to define the boundaries of the visualized reality. This can take place at a macro level where big data offers the opportunity to re-visualize countries and populations through remote data collection techniques such as social media scraping. Taylor and Broeders (2015) describe how a so-called “data for development” project in Côte d’Ivoire produced a rhetorically powerful visualization of an optimized transport network, but since it was based solely on mobile users, it offered only a partial image of the transport network that was then scaled up without considering local context (Taylor and Broeders, 2015: 234; see also Burns, 2015). In the same way, both Premise and Quilt.AI assume the ability of big data to create digitalized “data doubles” of reality (Haggerty and Ericson, 2000). For instance, Quilt.AI used big data from online searches and public posts to analyze changes in violence against women during COVID-19 in eight Asian countries in a partnership with UN Women and UNFPA.
The findings are described as “proxy information regarding general trends,” implying that the abstracted data creates a proximation of reality from a distance. This proximation is then used to inform generalized recommendations for action across the eight countries with potential localized consequences (Quilt and UN Women, 2021). This shows how abstracted data, scraped from a distance, travels into localized contexts where it informs actions based on its granularity and visualizable character.
Yet, this is a constructed reality shaped by algorithmic processes controlled by corporations, where each data point is extracted from its original context into an alternative meta-reality. Consequently, context becomes a collection of data points assembled by the organizing algorithm, cultivating a specific vision of reality. While big data is often presented by digital humanitarians as empowering and as dismantling exclusivity dynamics in humanitarianism by incorporating messaging and microblogging data from nonprofessional participants during disasters (Mulder et al., 2016), this perspective overlooks that not all individuals use digital tools, particularly during crises. Moreover, it fails to consider that global platforms such as Facebook may be used differently depending on the context (Costa, 2018) or that linguistic studies suggest that words and phrases carry diverse connotations depending on their context of use, as discussed in more detail in the next section. But while Quilt.AI refers to bias in AI outputs as “the elephant in the room,” pointing to the “inadvertent Western-centric lens” that follows from the fact that AI is predominantly trained on English digital data, data collection online is still presented as less biased than “conventional research initiatives” without further explanation (Quilt, 2024). The approach utilizes the internet as a repository of human data, free to be indexed to identify patterns, sentiments, and cultural nuances in large volumes of online data. It does not confine itself to the abovementioned forms of geospatial analysis, but also crosses into the most intimate sphere by using AI-powered tools to deconstruct complex social interactions to deliver analysis that is cast as “seamlessly recognizing objects, emotions, and even cultural nuances” embedded within video (Quilt, 2024).
In this way, even human bodies and emotions can be abstracted from their contextual setting and separated into discrete series of data points. This suggests that despite bias and unequal access, big data represents decontextualized information capable of traveling unchanged from the northern to the southern hemisphere and from one context to another.
But data needs to be interpreted to become knowledge and the diversity of local cultures, phrases, and media usage renders the adaptation of universal principles to local contexts exclusionary (Micheli et al., 2022). Thus, the data-driven aggregated classifications suggested by consultancies such as Premise and Quilt.AI create generalizations that may exclude marginalized perspectives. This means that an image of the world is produced from a specific cultural or historical vantage point, leaving out that which does not fit into the established system of standardized classifiers, including those of users (of interest, accessible) and non-users (not of interest, inaccessible) (Bechmann and Bowker, 2019). This is particularly significant given the growing recognition that comprehending events, actions, and crises within their broader cultural, socio-political, and environmental contexts enhances the cultural appropriateness and sustainability of response and recovery efforts (Frennesson et al., 2022). Moreover, as data is moved from one group of contributors to another, individuals directly impacted and locally situated are often excluded from the flow of information and interpretation processes, diminishing their role in co-creating context. Reducing beneficiaries or local communities to mere providers of extractable data contradicts the goals of localization as a paradigm shift aimed at empowering local actors in humanitarian endeavors. In this way, the digitalized rendition of reality, shaped by big data, constructs a singular artifact of local knowledge, which is perceived as coherent and static, thereby overshadowing the existence of multiple, often conflicting, local knowledges. This process of fabrication not only recontextualizes humanitarian crisis but does so from a specific viewpoint, granting the ability to define the boundaries of the AI-created reality squarely in the hands of those who control the data, algorithm, and subsequent humanitarian action.
Representation at a distance
Representation plays a crucial role in humanitarian action by ensuring the inclusion of diverse voices and perspectives in decision-making processes, thereby enhancing the appropriateness, cultural sensitivity, and effectiveness of crisis responses. Local representation, in particular, fosters accountability and legitimacy by reflecting the needs and priorities of affected communities, aligning with the key priority of transferring responsibilities, capacities, and resources to local actors within the localization agenda (Frennesson et al., 2022). This imperative extends beyond efficient disaster management to uphold fair representation as a normative ideal, addressing broader discussions on rights and justice (Heeks and Shekhar, 2019).
The issue of representation has evolved from a universal imperative to assist those in need to a focus on empowering beneficiaries. This shift has gradually gained ground since the 1970s, when it grew out of criticism of the starving-child appeal or
The multifaceted nature of the terms “localization” and “local actors” presents challenges with diverse definitions reflecting the lack of consensus within the humanitarian community (Barbelet et al., 2021; Wall and Hedlund, 2016). One challenge stems from the inherent relativity of the concept of “local,” intertwined with the spatio-geographical, social, and identity distinctions within crisis-affected nations. A static understanding of “local” as bound to a specific space or locale struggles to encompass diaspora, migrants and internally displaced people, something that has spurred calls for a critical localism, viewing the local as highly contextual and relational, focusing on the processes through which this
In the digitized and datafied geometries constructed by companies such as Premise and Quilt.AI, the question of defining the local becomes even more pertinent. How are the boundaries of the local defined and delimited in a recontextualized version of the local? When the consultancy Premise describes its methodology as a “type of nonprobability sampling that is hyper efficient in dynamic events, involving the sample being drawn from the population that is ready to assist” that produces a “ground-truth,” one is tempted to question how representative of a diversity of local voices this approach can really be (Premise, 2024). Showing the weakness of treating proximity as the main factor in determining localism, the individuals engaged through Premise's systems may be somewhat near humanitarian situations, but that does not mean they are part of these or understand those who are. Rather, this seems to align with Krause's (2014) compelling description of the humanitarian field as a quasi-market in which beneficiaries become “the means to an end” (Krause, 2014: 40). This is underlined by the delegation of collection and analysis of big data to private companies, whose main focus revolves around making a profit. It is thought-provoking that the United States (US) Census Bureau, in the lead up to the 2020 US census, ultimately recommended against using third-party databases and administrative records to find data on so-called “Hard To Find” or HTF populations, citing racialized disparities in those databases to conclude that they might exacerbate the problem of non-representation (Gilman and Green, 2018), while the humanitarian sector continues to expand its use of similar practices.
The introduction of big data in humanitarianism has also accentuated the digital divide across access, usage, and outcomes, especially for companies such as Quilt.AI that depend on social media for construing their geometries of local needs and interests. It is well-documented that under-privileged groups tend to benefit less from digital technologies than privileged ones (Blank and Lutz, 2018; Van Deursen and Helsper, 2015), a finding that is sometimes referred to as first, second, and third-level digital divides. The first-level divide pertains to access to technologies and connectivity, with internet penetration in Ethiopia, for instance, at around 17% compared to Western internet penetration rates of around 90%. This discrepancy can worsen if analysis relies on platforms with limited usage, such as X (formerly Twitter), utilized only by a small percentage of society (Blank, 2017). Additionally, accessibility might be limited by the lack of availability of platforms in local languages like Swahili, Amharic, and Hausa, spoken by hundreds of millions of Africans. Overall, social media use exhibits notable inequality across gender, age, socioeconomic status, and attitudinal variables (Kvasny and Keil, 2006; Lutz, 2022), with some platforms operating with limited transparency or representational obligations. This inequality is further perpetuated in the second tier, which denotes more complex divides in capabilities, skills, and practices, and the third tier, which alludes to outcomes from internet use and the ability to translate online capabilities to offline benefits, resulting in certain individuals being better equipped to leverage digital resources to strengthen their social and economic positions (see van Deursen and van Dijk, 2015).
In 2023, Premise launched a “rapid-response sentiment survey” to understand grounded Ethiopian opinions about the country's suggested agreement with Somaliland on access to the sea. Utilizing “convenience sampling,” a methodology that collects data primarily based on availability, the company drew conclusions on how the broader Ethiopian public perceived the deal with Somaliland. However, convenience sampling is widely criticized for biases arising from under-representation. When convenience (i.e. access) becomes the sole criterion for inclusion, there is no mechanism to screen for sampling biases, raising doubts about both internal and external validity. And when informants are effectively approached as users and gig workers in a market, samples are heavily skewed toward those with some resources in the first place (Taylor and Broeders, 2015). Additionally, as mentioned above, internet penetration is approximately 17% in Ethiopia, raising questions about the method's overall representativeness. The primary concern appears to be speed and accessibility, with traditional methodological and substantive representation taking a back seat, implying that the sheer volume of data is considered sufficient to ensure representativeness in big data analysis (Madianou et al., 2016). In this way, the introduction of big data for localization not only sidelines on-the-ground engagement with local actors or communities but also allows humanitarian organizations to continue speaking on their behalf. Ultimately, these concerns tell the story of social media as a locus of inequality and under-representation, reflecting the social or political marginalization of certain communities or groups, aligned with virtual marginalization within datasets or a fundamental lack of data.
The emphasis on localization-from-a-distance driven by big data is likely to blur the distinction between local elite views and an interpretation of “the local” as a fixed location whose concerns can be readily extracted, transported, and interpreted across distances. Importantly, reducing representation to data entry undermines the intention of localization, which is to shift real decision-making power to local communities.
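The coverage problem described above can be illustrated with a short arithmetic sketch. It assumes a population split into an online share and an offline share, and compares the figure an online-only sample converges on with the population-wide figure. The function name and the support levels (70% among those online, 40% among those offline) are hypothetical and chosen only for illustration; only the 17% penetration figure is taken from the text.

```python
def coverage_bias(p_online, mean_online, mean_offline):
    """Gap between what an online-only sample measures and the true population mean.

    p_online     -- share of the population that is reachable online
    mean_online  -- average value of the quantity among the online share
    mean_offline -- average value among the offline share (never sampled)
    """
    # True population mean is the share-weighted average of both groups.
    population_mean = p_online * mean_online + (1 - p_online) * mean_offline
    # An online-only sample can at best recover mean_online.
    return mean_online - population_mean


# Hypothetical illustration: 17% online; support for a policy is assumed to be
# 70% among online respondents and 40% among people who are offline.
bias = coverage_bias(0.17, 0.70, 0.40)
print(f"Overstatement of support: {bias:.3f}")
```

Because sample size appears nowhere in the calculation, collecting more online responses leaves this gap unchanged, which is one way of stating why data volume alone does not secure representativeness.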
Reproduction of power imbalances
Localization aims to reconfigure the humanitarian system by bolstering local decision-making power and agency to challenge entrenched hierarchies. While often framed to enhance the reach, effectiveness, and accountability of humanitarian action, localization ideally serves as a decolonial or social justice endeavor, reshaping assistance dynamics and power structures (Roepstorff, 2020). However, some scholars view data practices as “extractive,” echoing historical colonialisms’ appropriation of land and resources (Couldry and Mejias, 2019, 2024; Madianou, 2019; Sandvik, 2023), leading to concerns that the introduction of big data into localization efforts risks perpetuating power imbalances by reducing local communities to mere data suppliers or by bypassing local humanitarian organizations through the supposed direct link enabled between headquarters and people of concern.
The ongoing digital transformation of humanitarianism and the shift towards localization have prompted discussions about the skills required for frontline humanitarians to implement technology-driven solutions (Frost et al., 2022). This process can be supported by industry-focused localization, understood as “taking a product and making it linguistically and culturally appropriate to the target locale (country/region and language) where it will be used and sold” (Esselink, 2020). This is envisioned as a way to mitigate the consequences of the growing gap between the complexity of digital technologies being rolled out by international humanitarian organizations and the level of digital literacy among local partners (Frost et al., 2022). This gap is further exacerbated by the prevalence of short-term funding structures that local organizations often operate under (Ghorkhmazyan, 2022). Yet, while datafied localization can enhance technology's reach by tailoring it to local conditions, it should be distinguished from the broader humanitarian localization agenda. For instance, datafied localization might facilitate remote management techniques, maintaining local organizations in contractual relationships with international donors and potentially reinforcing existing power imbalances (Elkahlout and Elgibali, 2020). Elkahlout and Elgibali (2020) refer to “remotely managed localized humanitarian action” and use the case of Syria to establish that remote management can facilitate localization, although with ethical and legal risks for local NGOs.
Owners of gig apps often claim that their platforms benefit society, as when Premise presents its end-to-end solutions for international development as “empowering local citizens to gather actionable data” (Wilson, 2024). However, the result often facilitates remote surveillance that addresses “the challenge of humanitarian and health product diversion from donors’ most geographically dispersed and highly funded activities” (Wilson, 2024). In this way, instead of rectifying power imbalances in the responder-beneficiary dynamic, these tools may exacerbate them (Brun and Horst, 2023: 67; Madianou et al., 2016). This type of remote management can exacerbate unequal power dynamics by shifting risks onto local partners (Duclos et al., 2019) and introducing new issues related to organizational accountability, risk management, and forms of ignorance (Fejerskov et al., 2023). While local partners might undergo digital literacy training, it often concentrates on specific tools and applications, rather than building their overall capacity to utilize digital technology and data effectively and independently. Consequently, this approach may relegate partners to passive recipients of narrowly focused “capacity-building” initiatives, limiting the potential for sustainable transformation (Frost et al., 2022: 14), thus enabling international actors to retain decision-making power and control in the guise of localization efforts (Barbelet et al., 2021).
Digital technologies have also been lauded for their potential to amplify the voices of affected people and hold humanitarian agencies accountable by cutting out the middleman, a process known as disintermediation. This can include employment opportunities, as big data collection utilizes microwork tasks, which involve simple, easily outsourced data entry and “human computation” activities (Le Ludec et al., 2023). These tasks are tailored by development partners and assigned to individuals known as “data contributors.” An example of this is Premise's gig-work platform which, according to Premise itself, has 6 million so-called contributors who earn a small income by completing brief surveys or similar tasks, with payments as low as 0.10 USD. Consequently, Premise effectively acts as a subcontractor for humanitarian organizations, further subcontracting to local communities. In this setup, which is presented by Premise as part of its mission to “democratize data collection” by making it possible for everyone to have their voices heard and “earn a living through development work,” the “local” is defined as a member of this “community” (Wilson, 2022). However, this type of subcontracting model has been criticized for potentially perpetuating power imbalances, as it may impede local decision-making and equitable financial compensation (D’Arcy, 2019; de Geoffroy and Grunewald, 2017; Howe and Stites, 2019).
Just as the inclusion of certain groups and their relation to specific tasks in the humanitarian information chain enables forms of exploitation or extraction, hierarchies are also strongly formed by efforts of
These examples illustrate how the introduction of technology can enable more efficient extraction and commercialization of data by private entities predominantly located in the Global North.
This dynamic of technocolonialism (Madianou, 2019), wherein powerful technology firms and developed nations exert control over less technologically advanced regions, often results in digital dominance, enabling large corporations to become data monopolies and shape the interpretation and contextualization of information. Locals are reduced to data producers through gig work, thereby becoming part of territories from which data can be extracted and exploited from a distance. In these relationships, employer responsibility is fragmented across long supply chains and total power lies with the client (Couldry and Mejias, 2024: 94–95). Hence, instead of challenging the foundational power imbalances of the humanitarian system that permit some people to extract data from others, datafication makes the process of extraction even more efficient. This trend reflects a broader shift towards what scholars have termed “information capitalism” (Parayil, 2005), “surveillance capitalism” (Zuboff, 2015, 2019), “digital colonialism” (Kwet, 2019), or “data colonialism” (Aitken, 2017; Sadowski, 2019; Thatcher et al., 2016). Such literature has problematized the extraction of valuable data without corresponding compensation or benefits provided to the communities from which the data originates (Couldry and Mejias, 2019, 2024; Madianou, 2019). Moreover, this extraction might occur without individuals or communities fully understanding it, giving informed consent, or receiving any benefit. In humanitarian contexts where datafication becomes essential, individuals may rely heavily on aid and lack agency or alternative options, blurring the lines of consent and benefit (Cieslik and Margócsy, 2022). This is not limited to data extraction but also encompasses ownership over data, as evidenced by ongoing struggles surrounding storage, processing, and handling of data within specific geographical boundaries or jurisdictions (Clausen, 2021; Gazi, 2020).
One particular concern revolves around the accessibility of data to private companies for the advancement of third-party technologies, codes, or algorithms, especially within contexts of dependence (Martin and Tazzioli, 2023). The transfer of data to remote servers can potentially offer enhanced data security, but it risks undermining data ownership and consent rights, underscoring the need for comprehensive discussions on data agency and legal frameworks within humanitarian contexts (AccessNow, 2024).
In this limited perspective of local inclusion facilitated by subcontracted private companies, there is a lack of transformative intent, and local knowledge and experiences are disregarded. The primary function assigned to the “local” is to provide the desired data point in a format dictated by others, rather than to contribute to design or decision-making processes. This dynamic can further exacerbate power imbalances, as these infrastructures embody a specific way of structuring the world. Such issues extend beyond the data collection process and into data analysis, perpetuating perceived Western superiority in knowledge hierarchies (MacRae, 2008; Pailey, 2020). Thus, despite the emphasis on empowerment and on balancing out hierarchies between international humanitarians and local communities, datafied localization risks reinforcing inequalities through the extensive involvement of the private sector and the use of its data, technologies, or algorithmic systems. In pursuing big datafication processes to bridge distances in humanitarian response, the humanitarian sector risks reproducing power imbalances by reducing local individuals, communities, and organizations to mere data producers or precarious subcontractors with limited decision-making power.
Conclusion
In this article, we have used insights from the humanitarian literature on distance and proximity to argue that the use of big data as a shortcut to localization risks intensifying existing hierarchies between international and local actors in humanitarianism. We have done so by examining three main concerns that emerge from the juxtaposition of localization and big data as working in conjunction to empower local populations: the fabrication of contextualization, representation at a distance, and the reproduction of power imbalances through big data(fication). We have argued that data points are artifacts linked to specific geographical sites, challenging the notion that big data unproblematically allows for depictions of reality that are simultaneously empirically situated and decontextualized. Rather, the process of knowledge production, in which big data is transformed into actionable knowledge, is a fabrication of contextualization from a distance, dominated by a certain ontology of power that reflects the Western-led humanitarian system. Additionally, we have built on how localization has gained traction as part of a humanitarian strategic intent to delegate humanitarian tasks and decision-making to local actors. However, we argue that when community aspirations are filtered through issues of digital connectivity and provers, representation suffers, as not all groups or individuals are assigned the same level of visibility in the global digital system. Finally, when local communities are reduced to data producers in a digitalized global humanitarian system dominated by Western humanitarian organizations and companies, datafied localization can enable international actors to retain decision-making power and control, from a distance, in the guise of localization efforts.
In combination, these arguments show how the mobilization of big data to convey local needs from a distance, which we call datafied localization, can perpetuate epistemic injustices and paternalistic practices in the humanitarian field. We see this as a nascent trend whose consequences are yet to fully emerge. The article has therefore aimed to make the case for more critical reflection before the potential of big data as a means of localization is accepted. This is not just an abstract discussion, as big data increasingly defines and directs intentional action, shaping the design and execution of development and humanitarian programs and efforts. As we have emphasized, big data does not remove questions of contextualization, representation, and power hierarchies. Instead, questions of what data is, who it represents, and what it shows remain a significant structuring force for the delivery of humanitarian support. We have used recurring examples from two specific consultancies, Premise and Quilt.AI, but see these and the services they offer as part of a broader trend in which humanitarian sector leaders such as UN bodies, the World Bank, or the Gates Foundation are seeking out potentialities of datafied localization. By pointing to this emerging sphere in humanitarianism where AI, machine learning, and algorithms are presented as a shortcut to localization, we hope to advance the discussion on distance and proximity in both critical data studies and humanitarianism. These processes speak to the intersections between specification and aggregation (contextualization and abstraction), and we have shown here how they risk reproducing power asymmetries in ways that run counter to the core intentions of localization. While localization through datafication may ostensibly be seen as a way to acknowledge local voices, we show that the modes in which it is currently conceptualized can have the opposite effect.
The centralization of not just decision-making but also of the data that informs it reminds us that centralization is not typically a form of deterritorialization but rather a Westernization—not towards a universal global level, but towards another form of the local, specifically the capitals of donors and development partners.
