Abstract
Suddenly, [an internet user] is silently captured in a database and will soon receive information through the mail tailored to specific interests. What was learned cruising the internet has been vacuumed, and converted to a targeted selling proposition.
Donald Libby, “Cruising and Vacuuming the Internet”,
Cited by Turow (2006), p.74.
As the above quote shows, since the 1990s, marketing professionals have seen the emerging Internet as a means of profoundly renewing their approaches and tools. As datafication makes individual behaviors measurable (Gitelman, 2013; Kitchin, 2014), marketers have been considering digital technologies not only as a new communication channel, but also and more importantly, as a new field of experimentation for calculating consumers and developing new marketing techniques and strategies (Beauvisage and Mellet, 2019). I will focus here on predictive algorithms, whose current momentum, 25 years after the first email marketing campaigns, perpetuates the digitalization of marketing methods.
Indeed, during the last few years, a new kind of algorithm, often labeled as "artificial intelligence" but more precisely belonging to the field of machine learning, has spread at an unprecedented scale. Its main feature is its ability to make statistical predictions from very large sets of heterogeneous, possibly unstructured data. 1 Born in the 1960s at the margins of the statistics field, these algorithms are now enjoying wide diffusion thanks to the increase in computing power and to the datafication of society (Jones, 2018; Plasek, 2016). Now designed not only by mathematicians, but also by a new population of "data scientists, programmers and hackers" (Cardon et al., 2018: 210), this new kind of algorithm has considerably expanded the realm of predictive computing (Mackenzie, 2015).
Within the marketing field in particular, specialized service providers (startups, software vendors and consulting firms), as well as in-house data scientists, are increasingly using machine learning to predict consumer behavior for large companies. A customer's interest in a given product, or their risk of churn or fraud, is predicted through algorithmic models that assign probabilistic scores to individual customers. These scores rely on data describing their past behaviors, whether collected from the web or already held and stored in the companies' internal databases (Alemany Oliver and Vayre, 2015).
These methods, which I refer to here under the generic term of predictive marketing, are defined by their promise of individualized customer knowledge, much more granular than the knowledge produced by traditional market research (questionnaires, focus groups, etc.). They are part of the longstanding project of personalizing market strategies, which dates back to the 1920s (Lauer, 2012) but has been given new prominence since the 1990s (Turow, 2006). Predictive marketing also borrows from well-established customer scoring practices, which originated in the banking sector to rationalize credit (Poon, 2007) and have been used since the 2000s in customer relationship management (Benedetto-Meyer, 2014). By systematically modelling a very large number of variables, these algorithms are said to enable the anticipation of individual behaviors, and thus to produce a much better match between customers and the goods and services offered to them.
How do these predictive algorithms for personalized marketing change the way corporations know and act upon their customers? As "personalization" constitutes the horizon of countless contemporary algorithmic devices (Lury and Day, 2019; Mackenzie, 2018), I study how this general promise unfolds in action, analyzing at a fine-grained level the discussions and material practices of the actors involved in the conception and use of predictive marketing algorithms. Drawing from STS and the sociology of quantification, I consider the epistemic and political consequences of these very practices and assemblages on the production of consumer knowledge (Bowker, 2005; Diaz-Bone and Didier, 2016; Espeland and Stevens, 2008). In particular, this qualitative study contributes to the reflections on the new forms of social ordering performed by Big Data analytics, and on how these technologies shape and account for individuality (Bolin and Schwarz, 2015; Couldry et al., 2016). It focuses on the original ambivalence of personalized algorithmic marketing, which draws on both instrumental and humanistic arguments, as it simultaneously aims to optimize market strategies and to take better account of each customer's singularity.
To this end, I rely on 24 semi-structured interviews with data scientists, data engineers, client advisors and marketing analysts, conducted in different settings: the "data labs" and marketing departments of a major telco company, a retailer, a banking group, and a startup. 2 These interviews lasted between one and two hours, and mainly focused on the technical tasks and choices carried out by these various professionals, based on accounts of actual cases. I also attended a series of three online training seminars organized by this startup, and accessed a number of professional and commercial documents (whitepapers, presentations, brochures, etc.). This in-depth qualitative material is essential to move beyond the hype and commercial statements of Big Data analytics companies; it makes it possible to understand "algorithms in practice" (Christin, 2017), i.e. the material settings in which they are conceived, experimented with and interpreted.
In this paper, I specifically focus on two distinct cases: the data lab of the banking group, which I will call "The Bank", and a predictive marketing startup, which I will call Predicto. While these two organizational contexts differ, data scientists from both the Bank's data lab and Predicto find themselves in a position where they strive to articulate the requests addressed to them with the pre-existing data infrastructures of their (internal or external) clients: customer databases, purchase records, data management platforms, etc. The cases presented, set in the worlds of banking and insurance, make it possible to observe how predictive marketing takes place in universes that have historically been populated (or even saturated) with all kinds of calculations (Lazarus, 2012; Porter, 1995). Despite this specificity, the results developed in this paper have a more general reach, as the practices studied here are rather similar to what I observed in other domains where predictive marketing was used (namely retail and telecommunications).
I show that there is more to algorithmic predictive marketing than the issues of surveillance and control raised by many critical studies of Big Data. Rather, this paper aims to understand the paradox of contemporary algorithmic personalization: the promise to know and address each customer as a singular person, through calculations that only become possible at the scale of massive, aggregated data.
In the first section, I discuss earlier research on personalization through Big Data analytics, highlighting the theoretical framework and contributions of this paper. The next three sections describe how predictive algorithms are successively negotiated, tuned and interpreted within collective practices, and how this affects the conception of the individual customer. The conclusion discusses the implications of these results and suggests future lines of research.
Personalization as a disputed moral ground
Personalization is the cornerstone of contemporary algorithmic devices. What we buy, the news we read, the music we listen to and so many more components of our everyday lives increasingly depend on algorithmic suggestions, supposedly tailored to fit our personal interests. What Gerlitz and Helmond (2013) call the "like economy" has extended its roots deep into our daily behaviors, and allows algorithms to classify and treat people according to the digital expression of their tastes, thus profoundly intertwining our everyday lives with commercial calculation.
Personalization for commercial uses has mostly been criticized in Big Data scholarship. Many authors see it as a mere cynical communication artifice, a "fairytale vision" used by marketers to dissimulate the "algorithmic manipulation of consumers" (Darmody and Zwick, 2020: 2). In this perspective, personalization would essentially mean a strengthened grip on individual behavior and the optimization of marketing strategies (Beckett, 2012). Thanks to the increasing computability of a datafied world, marketing would now consist of a constant surveillance of individuals' actions, colonizing ever more aspects of social life (Pridmore and Zwick, 2011). This "new marketing paradigm" (Arvidsson, 2002) would thus grant companies unprecedented control over their consumers, through sophisticated algorithms and ever-increasing data volumes. This configuration is sometimes even described as an emergent, fully-fledged modality of capitalism (Srnicek, 2017; Zuboff, 2018).
Importantly, this line of research considers predictive algorithms as the advent of a new mode of government that would definitively dissolve the very reference to the central figure of our liberal democracies and economies: the individual subject. In a Deleuzian perspective, Rouvroy and Berns argue that "algorithmic governmentality" no longer addresses reflexive individual subjects at all: it operates on infra-individual data and supra-individual profiles, bypassing the individual as a site of meaning and decision.
These critical approaches raise significant questions, drawing attention to the dynamics of power and value extraction at stake in the algorithmic shaping of the social world. Nevertheless, they present two important limitations that this paper aims to address. First, many contributions on this subject mainly rely on various publicly available material, such as conferences, news stories and public communication of Big Data analytics companies. They lack first-hand empirical accounts of how algorithms are designed for consumer measurement. This leads to a sometimes general and relatively mundane denunciation of surveillance (Castagnino, 2018), which lumps together very different types of actors (from the all-too-famous GAFAM to data brokers, historical software vendors, startups, etc.), and in which the power of algorithms is often taken for granted rather than empirically described (Beer, 2016).
Second, and somehow consequently, these contributions tell us little about what it actually means to build and use algorithms for personalized consumer knowledge, and what it does to the shaping of the consumer. In particular, they largely overlook the practices and motivations of data scientists and other professionals involved in these activities that cannot be reduced to a mere instrumentalism. As shown by Mackenzie (2018) and Thévenot (2019), personalization has become an essential moral underpinning of Big Data quantification devices, justifying practices, on the users’ side (Siles et al., 2020), but also on the part of developers and practitioners. Putting aside “common-sense distinctions […] between ethical critics and unethical practitioners, positivist programmers and interpretive ethnographers, and so on” (Moats and Seaver, 2019: 3), this investigation aims to account for the varied ways in which predictive marketing algorithms are collectively adjusted to fit with local forms of customer knowledge (Kiviat, 2019). It describes how data scientists and their interlocutors constantly strive to produce rich, detailed accounts of their clients’ identities, by accumulating thick data about their biographies, preferences and the events in their lives. As suggested by Matzner (2019), the algorithmic sorting of people does not necessarily erase the figure of liberal individuality, but rather reshapes it, in a way that we need to analyze in detail.
This paper thus studies how actors resolve in practice the tensions of algorithmic mass personalization, between the massiveness of calculation and the promise of personalization, and what it does to the way clients are represented and acted upon by corporations. To this end, in the continuity of an emerging research stream, I follow as closely as possible the actors involved in the design and circulation of algorithmic models. I thus study predictive marketing as an activity, inscribed in an ecology of practices, rooted within organized work collectives (Christin, 2017; Jaton, 2017). Such an approach goes beyond a strictly semiotic analysis of algorithms and their general spirit, or "fetishization of code" (Chun and Kyong, 2008). It makes it possible to account for the numerous negotiations and frictions that occur in the design and circulation of predictive marketing devices (Bechmann and Bowker, 2019; Seaver, 2017).
Summoning the world and embedding it into algorithms
I will first show that predictive algorithms do not act upon customers from a position of exteriority. Before any calculation is run, their designers bring into the data lab the people who hold ordinary knowledge about clients, and embed this situated knowledge into the very definition of the models.
Data marketing as a science of “life” and a relational crutch
This attention to people stems from what might be called the humanistic ambition of personalized marketing, shared by the employees of the data lab, a structure created in 2015 within the Bank's marketing department. 3 Its Scientific Director, 42 years old, worked in several large industrial companies before joining the Bank in 2015. He particularly values the variety of problems facing his profession, all of which, in his opinion, have a common denominator: "Data science really deals with every aspect of our lives. And for me, what really interested me was the human being, who is at the center of it all". 4 Among the many solicitations that a man of his experience receives in the young and trendy field of data science (Brandt, 2016), the banking sector was not, at first, the one that attracted him most. He was nonetheless convinced by the variety of problems related to the Bank's relationship with its clients. "Ok, [working for the Bank] is about banking, insurance, finance, etc. But there is also security, publishing, real estate… In short, everything that affects a person's life". 5 The epistemic plasticity of data science and its ability to pervade a wide variety of social worlds (Dagiral and Parasie, 2017) underpin the trajectory of this interviewee, who sees his work as solving the many different problems of people's lives.
As a result, many of the activities and projects deployed by the data lab are described as a way to improve the quality of commercial relationships, both for customers and customer advisers. Indeed, the Bank's client advisers each have to manage portfolios of several hundred clients, which makes individual follow-up difficult. Data science is therefore considered here as the means to better achieve the objective of a quality service relationship, supposedly based on the knowledge of the client's life trajectories and projects (all the more so in the banking sector, a world of long-lasting commercial relations). The scores thus constitute a form of crutch, or support, for the interpersonal relation. As infrastructural devices, they accumulate the clues needed to organize and prioritize the relational and commercial work of advisers. In this respect, data lab employees see their activity as quite distinct from aggressive commercial prospecting or blanket surveillance.
The client and her spokespeople
Taking people into account as much as possible then implies a particular organization which consists, as for Pasteur in the canonical article by Bruno Latour (1983), in transporting the world into the closed and controlled enclosure of the (data) laboratory. During what Slota et al. (2020) describe as the "prospection" phase, data scientists audit existing data infrastructures to gather and organize usable data for their projects. To this end, they organize meetings to which multiple authorized spokespersons from the world of the customers are invited, in order to include their points of view in the design of measurement projects.

The idea is not to start from scratch; it is to contribute something else, something complementary, to enrich current knowledge. So we ask them: for this output, this variable that we want to predict, in your opinion, what could be the relevant variables? What are the indicators that you are used to building? Even beyond existing data, we often ask the question of the interest for a product, for example: why might a customer be interested? So we make lists. And then we're going to prospect: does this type of information exist in internal databases? Can we look for it in external databases?

Scientific Director of the Data Lab (interview, July 2018)

What I call the client's spokespeople are the professionals gathered in these meetings (client advisers, marketing analysts and other custodians of ordinary customer knowledge), each of whom speaks for a specific facet of the relationship between the Bank and its clients.
This polyphonic arrangement guaranteed the representation of different modes of relationships between customers and the Bank, and brought out the ordinary forms of client knowledge, by asking participants questions such as: which product might be of interest to which category of customer? What indicators already exist? etc. It also made it possible to gather the opinion of professionals on the content and properties of future algorithmic models, the variables deemed relevant, or the data they would like to use – and of course, the legal and technical constraints regarding the use of said data. We can thus observe the relative epistemic modesty of data scientists, who actively solicit the expertise of the usual custodians of the problem they seek to address through computation.
What concrete variables should then be taken into account to define the "financial difficulties" whose occurrence they were trying to predict? For data scientists, the obvious thing to do was to define these difficulties based on the variables contained in the litigation databases: late payment, overdraft, etc. According to them, such events, faithfully recorded in a dedicated database, were a rather plausible indicator of a customer's difficulty. However, this first version of the predictive model, which was submitted to the assessment of their colleagues, was widely criticized by customer advisor representatives.

We started with that, because it's something that is traced in our information systems. But what we want to do is to avoid getting to that point, so we're discussing with the advisers, because this was the first deliverable they received after the various workshops. And now we think we're getting in rather late. […] What the advisers told us is: we want to avoid being in front of a client who is already in default, etc. This client, I can't help anymore: it becomes a litigation, a debt recovery.

Scientific Director of the data lab (interview, July 2018)
The way in which the early stages of designing algorithmic models were organized shows that an essential condition for the epistemic as well as organizational success of predictive marketing devices lies in the way data scientists make themselves permeable to the varied knowledge locally held by their colleagues. In this case, data scientists and client advisers negotiated the person that the algorithm had to model and predict. At first, the model predicted the wrong person, a client already in too much distress to be rescued. Training data had to be adjusted and redefined in order to predict a still-rescuable client. The action of algorithms on the world of people is therefore not only an effect of the scores and rankings they produce: it intervenes early on, by anticipation, in their very design.
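To fix ideas, here is a minimal sketch, in Python, of what this renegotiation of the target variable might look like on a toy customer table. The column names and thresholds are hypothetical illustrations introduced for this paper, not the Bank's actual schema or criteria.

```python
import pandas as pd

# Toy customer table; columns and values are hypothetical illustrations.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "in_litigation": [True, False, False, False],
    "days_overdrawn_last_90d": [45, 20, 0, 25],
    "savings_balance_trend": [-0.6, -0.3, 0.1, -0.8],
})

# First version of the target: the event already recorded in the litigation
# database -- by then, the client can no longer be helped.
customers["target_v1"] = customers["in_litigation"]

# Revised version, after discussion with client advisers: early warning signs,
# so that the model predicts clients who can still be helped.
customers["target_v2"] = (
    (customers["days_overdrawn_last_90d"] > 15)
    & (customers["savings_balance_trend"] < 0)
    & ~customers["in_litigation"]
)

print(customers[["customer_id", "target_v1", "target_v2"]])
```

The point is not the code itself: what matters is that the person the model will learn to recognize (a client already lost, or a client who can still be helped) is settled at this early stage, before any algorithm runs.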
When blindness is robustness
We will now study the precise moment when personalization becomes massive, i.e. when data centralization and unification make it possible to calculate "likenesses" between large populations of individuals (Seaver, 2012). Until now, making "good" predictions meant paying close attention to the many facets of the client. Once the data are imported into the data lab, however, building effective predictions requires data scientists to deliberately "blind" themselves, through dedicated procedures, to specific traits of the people described in the selected data. Only then is it possible to perform calculations whose heuristic value can be transposed outside the data lab. We will thus see that the modeling of clients does not mechanically result from "feeding" or "throwing" data to the algorithm (Amoore and Piotukh, 2015), as a recurring food metaphor would suggest, but from the work of enrichment and selection of variables deployed by data scientists.
The first challenge for them is to build the training environment in which machine learning algorithms will explore the most significant correlations, between a set of descriptive variables of customer behavior, and one (or several) "target variables" (for example: the customer's future financial difficulties). To this end, they need to unify multiple and heterogeneous data sources, often stored in variously accessible databases: transaction records, purchase histories, calls to technical support, litigation, current contracts, etc. In this way, data scientists seek to constitute what they call a "master file", or "ground truth" (Jaton, 2017): a table in which each customer is described by a single line, with the multiple descriptive variables in columns. This table materializes unique, calculable and therefore predictable individual behaviors. The challenge here is to transform the fragmented and scattered presence of clients into individualized profiles, a task similar – in its spirit, if not in its technical means – to the work of rearranging individual files described by Denis (2018). 9
Once the master file has been compiled, it is then paradoxically important for data scientists to "blind" the algorithms to the persons represented in it, in several ways. Two deliberate blinding measures come first. One consists of anonymizing the lines representing clients, so that data workers cannot access nominative personal information. The other is to make the gender variable disappear: while machine learning algorithms do aim to differentiate individuals as finely as possible, sorting customers on the basis of gender is precisely what must be avoided.
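A minimal sketch of this assembly and blinding work might look as follows; the file names, columns and pseudonymization method are illustrative assumptions, not a description of the Bank's actual infrastructure.

```python
import pandas as pd

# Hypothetical extracts from heterogeneous internal databases
# (file names and columns are illustrative, not the Bank's actual schema).
customers = pd.read_csv("customers.csv")          # customer_id, name, gender, age, ...
transactions = pd.read_csv("transactions.csv")    # customer_id, amount, date
support_calls = pd.read_csv("support_calls.csv")  # customer_id, call_date

# Aggregate each behavioral source so that one row describes one customer.
spend = (transactions.groupby("customer_id")["amount"]
         .sum().rename("total_spend").reset_index())
calls = (support_calls.groupby("customer_id")
         .size().rename("n_support_calls").reset_index())

# The "master file": one line per customer, descriptive variables in columns.
master = (customers
          .merge(spend, on="customer_id", how="left")
          .merge(calls, on="customer_id", how="left")
          .fillna({"total_spend": 0, "n_support_calls": 0}))

# "Blinding" the table before any learning takes place: nominative information
# is replaced by an opaque identifier, and the gender variable is dropped.
master["pseudo_id"] = pd.util.hash_pandas_object(master["customer_id"]).astype(str)
master = master.drop(columns=["customer_id", "name", "gender"])
master.to_csv("master_file.csv", index=False)
```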
The learning algorithms can then come into play. In what is called the training (or learning) phase, the algorithm explores the master file for the combinations of variables that best predict the target variable, and progressively adjusts a model that assigns each customer a probabilistic score.
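As a minimal sketch of this training phase, under the same illustrative assumptions as above (a hypothetical master file to which a negotiated "target" column has been added, and scikit-learn's gradient boosting standing in for whatever model the data lab actually uses):

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Hypothetical master file (see previous sketch), assumed here to include a
# "target" column, e.g. early signs of financial difficulty.
master = pd.read_csv("master_file.csv")

X = master.drop(columns=["pseudo_id", "target"])
y = master["target"]

# Part of the customers are held out: the model will be judged on people it
# never "saw" during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = GradientBoostingClassifier().fit(X_train, y_train)

# Each held-out customer receives a probabilistic score between 0 and 1,
# the raw material of the rankings discussed below.
scores = model.predict_proba(X_test)[:, 1]
print("share of unseen customers correctly classified:", model.score(X_test, y_test))
```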
In the controlled environment of the data lab, everything goes quite smoothly: provided with a large amount of data and variables, the machine learning algorithm almost always manages to produce models with a good prediction rate. But how can data scientists be sure that this mathematical rule will retain its predictive power when applied to new, real-world cases, to future customers? This raises the question of the robustness of the model, i.e. its ability to be transposed and used outside the data laboratory.
This is where data scientists deliberately organize the blindness of their algorithms. First, it means hiding part of the data available for learning from the algorithm, in order to be able to test and "regularize" the predictions: the model's accuracy is assessed on customers it never encountered during training, before any real-world use.
But the essential blinding operation here is linked to the risk of overfitting, which a data scientist from Predicto explains as follows:

An algorithm is always going to be able to say: if he had a red hat and sneakers, he'd take this; if he had this, this and that, he'd buy that. It is able to learn everything by heart. That's overfitting. That's when it learns everything by heart. If there are too many specificities, he'll learn that John Doe buys the Big Mac, and the problem is that if you know that John Doe buys the Big Mac, but one day you're introduced to someone who's not John Doe, then you won't be able to generalize. In fact, you have to find all the common points between different profiles to say: when it's a similar profile, I know what he's buying. If you know all the specificities of the person, in fact, you're not learning a typical profile: you're learning John Doe. […]

There are mathematical techniques that allow data to be randomly and artificially degraded, so that the algorithm doesn't learn too well, so that it doesn't give too much weight to certain values, so that it doesn't learn everything by heart. Randomly, I'm going to introduce some error. When it tries to minimize the error, it basically tries to minimize the weights so that the error is as small as possible. So it adjusts the weights less finely. It does something coarser, less efficient, but more robust. You take off its glasses, so its vision becomes blurry: it can't recognize John Doe any more.

Data scientist, Predicto (interview, May 2018)
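The interviewee describes noise injection and weight shrinkage. As a minimal illustration of the same logic, using a different and simpler technique (limiting the depth of a decision tree) and purely synthetic data, the following sketch contrasts a model that "learns John Doe by heart" with a deliberately coarser one; none of it describes Predicto's actual tools.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic "customers": 20 descriptive variables, a few informative ones,
# and some label noise, as in any real-world dataset.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=4,
                           flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# An unconstrained tree can memorize the training customers -- it "learns
# John Doe by heart" -- and is typically weaker on customers it has not seen.
memorizer = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A deliberately coarser, "blinder" model gives up some accuracy on known
# customers in exchange for robustness on new ones.
coarser = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)

for name, m in [("unconstrained", memorizer), ("depth-limited", coarser)]:
    print(name,
          "| train:", round(m.score(X_train, y_train), 3),
          "| unseen:", round(m.score(X_test, y_test), 3))
```

The coarser model is less accurate on the customers it was trained on, but typically more reliable on customers it has never seen, which is precisely the trade-off the interviewee describes.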

Figure 1. Graphical representation of the regularization of overfitting. 13
The construction of predictions, far from relying on the mechanical "ingestion" of large datasets, involves complex work in which the data are centralized, aggregated and tested by data scientists. Paradoxically, the promise of personalization in data marketing implies, during this phase, making the algorithms blind to some specificities of the customers, and sometimes to the most significant data points of their behaviors. The algorithm must not know too precisely the unique individuals included in its training environment, otherwise an overly personalized calculation will subsequently prevent it from effectively predicting the behaviors of new individuals. Algorithmic learning thus puts in tension the attention to persons, which is its horizon, and the need for generalization, i.e. making predictions that will apply to future data, and not only to the people included in the training data.
Interpreting scores, reconstructing explainable consumer figures
Finally, once the algorithmic model is out of the data lab, an important task for data scientists is to interpret its results, which in themselves are not self-evident. They carry out a specific articulation work (Strauss, 1988), seeking to reduce the frictions between the world predicted by the algorithm, and the world of real customers. As in the case of digital advertising (Bolin and Schwarz, 2015), their objective is then to make predictive marketing intelligible and to prove its specific value, which implies linking its results to knowable figures of the consumer and interpretable macroscopic behaviors. This work is all the more important as the predictive model generates apparently incoherent results, as in the two examples analyzed here.
More targeting, less conversion? The paradoxes of “ultra-personalization”
I focus on the case of the startup company Predicto, created in 2015 by a computer scientist and a sales engineer. 12 One of its clients, a mutual insurance company, was seeking to optimize its commercial efforts to sell funeral insurance policies to those of its clients who did not already have one. The ambition was to better target potentially interested clients, so as to reduce the cost of the thousands of letters sent, but also not to "miss" potentially interested customers. This type of optimization, one of Predicto's standard services, relied on what the startup calls its "ultra-personalization" algorithms, as shown below. It is based on a positivist epistemology in which interested customers preexist the commercial strategies deployed to "reach" them. This makes it possible to consider direct marketing not as a technique to construct or trigger the client's interest, but as an optimization problem.
Figure 2 illustrates the difference between contacting the same number of clients (4% of the whole population), based on traditional segmentation methods (first graph), and based on machine-learning “ultra-personalized” segmentation that takes into account both “strong” and “weak signals” (second graph). Both are compared to randomly contacting clients (the black line).

Figure 2. Source: training seminar "From Segmentation to Ultra-Personalization", October 2018 (original titles in French).
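The logic of such a comparison can be sketched with synthetic data; the figures produced below are illustrative and bear no relation to Predicto's actual results.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clients = 100_000

# Synthetic ground truth: each client has a (rare) true probability of being
# interested in the policy; the model's scores are a noisy estimate of it.
true_interest = rng.beta(1, 30, size=n_clients)
scores = true_interest + rng.normal(0, 0.01, size=n_clients)
subscribes_if_contacted = rng.random(n_clients) < true_interest

budget = int(0.04 * n_clients)  # mailing budget: 4% of the client base

top_scored = np.argsort(scores)[::-1][:budget]                   # score-based targeting
random_pick = rng.choice(n_clients, size=budget, replace=False)  # random contact

print("subscriptions with score-based targeting:", int(subscribes_if_contacted[top_scored].sum()))
print("subscriptions with random contact:       ", int(subscribes_if_contacted[random_pick].sum()))
```

Under these toy assumptions, contacting the top-scored 4% yields several times more subscriptions than contacting a random 4%, which is the kind of gap the second graph of Figure 2 visualizes.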
This comparison generated some frictions. The first surprise was that, although the scoring carried out by Predicto did indeed allow more policy contracts to be sold in total, it was because it produced more requests for quotes (+30%) from the customers contacted. However, it produced a lower conversion rate than traditional methods; in other words, a smaller proportion of these requests for information resulted in an effective subscription. For instance, if the traditional method generates 50 requests for quotes, the algorithmic targeting generates around 65; even if a somewhat smaller share of these 65 convert, the total number of contracts signed ends up higher.
The perfect client: When the algorithm predicts too well
The post-campaign evaluation generated another inconsistency, when Predicto realized that the customers who actually purchased a policy were not the top-ranked ones: they did have a good ranking, but the top-scored customers proved to be surprisingly unresponsive to the company's solicitations. This observation called for another bit of interpretative work: as Predicto's team pointed out, a customer may well already hold a funeral policy with a competitor such as Roc Eclerc (a French funeral services company), without this being visible anywhere in the insurer's data.
Here is another tension regarding the person produced by algorithmic prediction, and the necessity for the actors to interpret and contextualize the results of algorithms in order to establish their value. It is linked to a structural limitation of predictive scoring, which necessarily depends on the data available to train the algorithm and on the choices made by data scientists in defining its desired outcomes (Grosman and Reigeluth, 2019). By measuring the likelihood of subscribing through comparison with actual policyholders, the algorithm may have assigned the highest scores to people who resembled existing policyholders so strongly that they were likely to already have a policy elsewhere.
The cases studied here allow us to depart from the ordinary representation of algorithmic targeting that would produce conversion rates tending towards 100% (Beer, 2017; Cardon, 2015). In reality, quite often, algorithms do not do exactly what they are supposed to. Moreover, they call for human interpretation work, not only because machine learning produces black boxes, but to establish the utility and value of the calculation. What the algorithm predicts acquires value only once it has been related to knowable figures of the consumer and to the commercial ends it is supposed to serve.
Learning algorithms thus displace uncertainty in customer knowledge. In accordance with the historical ambition of market research (Berghoff et al., 2012; Cochoy, 1998), they quite considerably reduce the uncertainty surrounding the customer's tastes and behavior. They do, however, induce a new type of uncertainty, linked to the explanation of scores (the black box effect), but above all to their interpretation. Contrary to the critiques and the praise symmetrically addressed to machine learning algorithms, they do not produce a cold, seamless and automated process of knowing and governing data points. Until the last moment, humans adjust the perimeter and nature of algorithmic action on people.
Conclusion
As we have seen from the cases presented here, mass personalization involves an iterative, back-and-forth process between the world and the data lab. Contrary to a widespread representation (Darmody and Zwick, 2020; Steiner, 2012; Zuboff, 2018), algorithms do not “manipulate” the social world from the outside, at the end of a linear process of learning, calculation and prediction. The world and the algorithms define each other, through the articulation work carried out not only by the “little hands” of the information society (Bowker and Star, 1999; Dagiral and Peerbaye, 2012), but also by data scientists. They centralize customer knowledge, harmonize the available data, interpret the results and reintegrate them into explanatory schemes specific to the predicted universes. As Grosman and Reigeluth (2019) point out, both learning objectives and training datasets are constructed, negotiated and adjusted by humans, themselves embedded in local knowledge regimes, organizations and cultures. In the case of predictive marketing, even the most abstract mathematical operations (such as the techniques to limit overfitting) are inseparably computational and social.
What happens to the individual customer in this process? Far from being dissolved by calculation, the singular client remains the explicit horizon of predictive marketing. Yet, to be predicted at all, this client must first be made comparable to others, temporarily stripped of the very specificities that personalization claims to honor. The figure of the individual is thus not erased but reworked, caught between the attention to persons cultivated in the data lab and the generalization that any robust model requires (Matzner, 2019).
Although Amoore and Piotukh (2015) rightly highlight the centrality of algorithmic calculation for making sense out of Big Data, this does not simply imply "throwing data at the algorithm" (Amoore and Piotukh, 2015: 343). The "ingestion" of data is actually a complex process in which non-mathematical, local epistemologies play a crucial part. It does not abstract data from their social context, or at least not entirely. If anything, the reason why algorithms can predict consumer behavior is that their conception and interpretation are continuously nurtured by exogenous epistemologies of the consumer, produced at the interplay of working collectives, routines and inherited data infrastructures (Bowker and Star, 1999; Christin, 2017). In this respect, Big Data quantification is more of a "remediation" than a disruption (McGuigan, 2019: 3). As described by Bolin and Schwarz (2015) in the case of media planning, this remediation may involve a back-and-forth process between older and newer modes of representation, between "Gaussian" and "Paretian" statistics (Bolin and Schwarz, 2015: 3). It may be due to "institutional inertia", but it may also be – as it is here, or as shown in previous work (Kotras, 2015) – a crucial condition for epistemic success. This result argues for a deepened investigation into the hybridizations of the various forms of knowledge involved in the "manufacture of the public" (Bermejo, 2009).
This result also suggests that we need to take seriously the moral claims made by data science, which are key to its growing pervasiveness. Here, the promise of supporting better market relations, more adjusted to individual authentic aspirations, is part of a longstanding criticism of traditional statistical categories, considered as insufficiently precise to do justice to the specificity of individuals (Boyd and Crawford, 2012; Desrosières, 2011). The description of data science as a science of "life" in general, as posited by the Bank's data lab scientific director at the beginning of this paper, is a widespread moral horizon that must be studied as such. Of course, this does not mean taking the humanistic discourse of data marketers for granted. This article is rather an invitation to study its concrete consequences in terms of practices, attitudes and expectations. Documenting the way in which algorithms are constantly informed, parameterized and adjusted by exogenous knowledge (Bechmann and Bowker, 2019) also appears as a suitable way to overcome the divide between data scientists and social scientists, in which the latter are ultimately reduced to representing "ethics" at the end of an opaque process (Moats and Seaver, 2019). As Lee et al. (2019) write, it is then less a matter of denouncing the "biases" of algorithms, or assessing their deviation from some unbiased standard, than of describing how they fold together heterogeneous forms of knowledge and take part in the realities they measure.
Finally, these results call for further contributions on algorithmic objectivity. Following Angèle Christin (2016), we can put to the test the classic contribution of Lorraine Daston and Peter Galison (2010), who showed how objectivity has historically been defined as the ability to produce knowledge disconnected from the humans who produce it. "To be objective is to aspire to a knowledge that bears no trace of the knower – knowledge unmarked by prejudice or skill, fantasy or judgment, wishing or striving" (Daston and Galison, 2010: 17). Algorithmic modelling seems to be a mechanistic knowledge instrument, in which the automated abstraction of calculation should guarantee the elimination of human preconceptions and biases, and would thus produce more "objective" results than those generated by traditional market research. Nevertheless, this investigation into the practice of data marketing has shown the constant and decisive attention that data scientists pay to local forms of knowledge, and to the people who are the bearers as well as the subjects of this knowledge. Further contributions are needed to document the interweaving of knowledge embedded in humans and machines in the production of algorithmic objectivity.
