This article is a part of special theme on The Turn to AI. To see a full list of all articles in this special theme, please click here: https://journals.sagepub.com/page/bds/collections/theturntoai
In his 1976 article ‘Science and Statistics’, the British statistician George Box famously argues that ‘since all models are wrong, the scientist cannot obtain a “correct” one by excessive elaboration’ (Box, 1976: 792). Instead of seeking an unattainable perfect model, Box suggests the scientist ‘seek an economical description of natural phenomena’ because ‘the ability to devise simple but evocative models is the signature of the great scientist [while] overelaboration and overparameterisation is often the mark of mediocrity’ (792). Ockham’s razor – the principle that, other things being equal, simpler explanations and models are to be preferred over more complex ones – captures this scientific ideal of parsimony.
This study considers the issue of using simplicity as a response to the complexity of machine learning models employed in the financial industry, specifically, in quantitative investment management and algorithmic trading. It has recently been argued that machine learning will transform how people trade securities and manage investments for generations (Lopez de Prado, 2018). With the increase in computing power, decrease in data storage costs, and availability of Big Data, applying machine learning techniques to trading and investment management problems has become an increasingly viable and widely exercised option in the financial industry (Arnott et al., 2018; Buchanan, 2019; Dixon and Halperin, 2019; Guida, 2019; Lopez de Prado, 2018). Machine learning is a subset of artificial intelligence in which systems learn from data rather than follow explicitly programmed rules. Machine learning techniques appeal to the financial industry because of their capacity to effectively discover patterns, correlations and anomalies in large and complex datasets. I argue that machine learning techniques enhance financiers’ ability to take advantage of opportunities, but at the same time possess a degree of ‘unavoidable complexity’ (Burrell, 2016: 5), which developers and users of such techniques must find ways to manage and control.
Unlike more conventional rule-based algorithms that follow established rules, machine-learning algorithms learn from pre-set optimisation criteria, which offsets the need for developers to specify every decision rule in advance.
To grasp the transformatory role machine learning models allegedly play in financial markets, I argue it is necessary to examine the multifaceted and dynamic relationship between models and the people who develop and use them (Svetlova, 2018: 4). Various SSF studies have explored the reciprocal influence of models, equations, and algorithms on one hand, and developers and users on the other. MacKenzie’s work on the performativity of the Black–Scholes options pricing model is an example of a tone-setting contribution to the field (MacKenzie, 2003, 2008; MacKenzie and Millo, 2003), but several other thorough empirical studies have deepened the understanding of human–model interactions and entanglements in the financial markets (such as MacKenzie, 2011; MacKenzie and Spears, 2014a, 2014b; Millo and MacKenzie, 2009; Svetlova, 2012, 2013; Wansleben, 2014). One thing that differentiates the machine learning models examined in this study from, for example, the discounted cash flow valuation models examined by Svetlova (2012, 2013) is the capacity of the former class of models to learn and optimise without interference from model developers and users. By examining how these learning models shape the ways practitioners perceive and attempt to manage them, this study contributes to the sociological research on the use of models and algorithms in high finance.
To examine the question of how the complexity of machine learning models is being managed in the financial industry, I draw on interviews with market participants who develop and use machine learning techniques for specific trading and investment management purposes. As a theoretical frame, I use the perspective of ‘distributed cognition’, which originates from the idea that human cognition is always situated in ‘a complex sociocultural world and cannot be unaffected by it’ (Hutchins, 1995a: xii). Distributed cognition describes ‘a situation in which one or more individuals reach a cognitive outcome either by combining individual knowledge not initially shared with the others or by interacting with artifacts organized in an appropriate way (or both)’ (Giere, 2002: 641). Giere and Moffatt (2003) argue that ‘distributed cognitive systems’ enable ‘the acquisition of knowledge that no single person, or a group of people without instruments, could possibly acquire’ (305). In the context of financial markets, Beunza and Stark (2012) have demonstrated how traders working at an investment bank’s merger arbitrage trading desk reduce their ‘cognitive overload’ by using financial models and other instruments at their disposal (394). Similarly, Hardie and MacKenzie (2007; see also MacKenzie and Hardie, 2009) have shown how decision-making in hedge funds is characterised by distributed cognition, expressed in the way traders rely on colleagues, contacts, and technical devices when deciding what and when to buy or sell. In these studies, the instruments employed in the systems of distributed cognition compensate for traders’ bounded rationality, enable knowledge acquisition, and thus, inform human decision-making. However, as trading becomes increasingly automated, algorithms no longer exclusively inform human decision-making; they often also determine and execute those decisions themselves.
Even though much trading is conducted by algorithms at speeds and a scope that exceed human cognition, humans are not marginalised by but remain a crucial cog in contemporary algorithmic trading (Beverungen and Lange, 2018). Whereas ultrafast high-frequency trading (HFT) algorithms enhance traders’ ability to seize arbitrage opportunities long before any human would have been able to identify them, machine learning expands the scope of data mining and data processing and thus, enhances the capacity to trawl markets in search of patterns and correlations to exploit. I do not want to argue that machine learning extends the mind and thus, has epistemic agency (Clark and Chalmers, 1998; cf. Giere, 2007), but I contend that machine-learning algorithms’ ability to learn makes them different and more complex cognitive aids than most other technical devices used in trading and investment management. As Burrell (2016) points out, when a machine-learning algorithm learns, ‘it does so without regard for human comprehension’ (10), which influences the way practitioners see and attempt to manage such models. The analysis demonstrates that users of machine learning models are concerned that models might learn the wrong things and thus, become deceitful rather than informative. They use Ockham’s razor as a heuristic to help strike a balance between simplicity and complexity, and interpretability and accuracy. The value of Ockham’s razor in machine learning model building lies not necessarily in the scientific superiority of the simple model, as suggested by Box, but in ensuring the comprehensibility and explainability of the models that practitioners interact with. The simplicity heuristic, as employed by market practitioners, is best understood as a ‘proxy for comprehensibility’ (Domingos, 1999: 421), which helps secure a certain level of human control of model processes and enables explainability.
This emphasis on comprehensibility and explainability reflects a more general industry concern about transparency and machine learning (Bracke et al., 2019; World Economic Forum, 2019).
This introduction is followed by five sections. The first is a brief outline of the empirical material and my approach to studying machine-learning model use. This is followed by a section on applying a specific machine learning technique, reinforcement learning, to the problem of optimal trade execution. The subsequent sections examine the problem of overfitting and the automation of Ockham’s razor, before the article closes with a concluding discussion.
Data and methods
An increasing number of trading and investment management firms adopt machine learning techniques and practitioners are generally willing to discuss possible applications of such techniques in finance. Nevertheless, it is difficult to obtain in-depth information about specific machine learning models. Unsurprisingly, firms are reluctant to share the recipe for the ‘secret sauce’ that gives them their competitive edge. Because machine learning models are complex systems comprising numerous lines of code and an internal decision logic that changes as models learn from running on training data (Burrell, 2016: 5), it would be difficult to extract meaningful information from the source code itself (if available) without also having to obtain information about the context of application, training data, and implementation process. It is nevertheless possible to obtain contextual knowledge about machine learning models by talking to people in the financial industry who develop and use them, because it makes little sense to talk about machine learning without also considering the specific context of application (Weigand, 2019: 95). Thus, my approach is to treat models not as technical objects decoupled from the organisational, institutional, and social context in which they are developed and applied, but rather as components of socio-technical systems comprising both humans and models.
To examine these socio-technical systems comprising humans and models, I draw on 31 interviews with practitioners who develop and use machine learning techniques for trading and investment management purposes.
The two main criteria for selecting interviewees were (1) expertise within the area of machine learning and/or (2) practical experience with regard to applying machine learning techniques in finance. Since developing implementable machine-learning algorithms for trading and investment management purposes is rarely a one-person endeavour, but rather a task distributed between a number of people in different roles, I have prioritised gathering input from a range of people involved in the practices of leveraging machine learning techniques in finance, instead of focussing solely on one group (e.g. data scientists). The majority of the interviewees worked in hedge funds or in algorithmic trading firms where they had different roles at different levels in the organisational hierarchy. I obtained consent from informants to use all quotes included in the article. A few interviewees wanted their identities disclosed, whilst the rest are either pseudonymised or anonymised.
The interviews are sampled from a body of 182 in-depth, semi-structured interviews conducted during a two-year period (2017–2019) by members of the ‘Algorithmic Finance: Inquiring into the reshaping of financial markets’ research project of which I am a part. Interviewees worked in hedge funds, asset management firms, proprietary algorithmic trading firms, banks, brokers, regulators, exchanges, and technology vendors (see Table 2, in Appendix 1). Most of these firms and financial institutions are located in New York, Chicago, and London, but we also conducted interviews in Washington, San Francisco, and a few other cities in the US and continental Europe. Most interviews were conducted
The optimal trade: Reinforcement learning for trade execution
Machine learning techniques have existed in the financial industry for quite some time. Banks and credit card companies have used machine learning models for classification purposes in fraud detection since the 1990s and credit scoring since the early 2000s (Buchanan, 2019). In the fields of trading and investment management, notoriously secretive systematic hedge funds, like the Cold War code breaker James Simons’ Renaissance Technologies, D. E. Shaw, and Two Sigma, are said to have profited substantially from using machine learning models for a number of years (see, e.g. Zuckerman, 2019). Today, there is no shortage of investment management and trading firms experimenting with artificial intelligence and machine learning. Although only a fraction of firms may be able to boast of having successful machine learning systems in production, interest in the technology is undeniably massive and increasing.
In general, the choice of machine learning technique depends on the task at hand and the dataset available. As a quant working in one of the world’s leading quantitative hedge funds explained, machine learning is ‘just another mathematical technique used to extract information from a dataset and certain datasets have characteristics that lend themselves better to machine learning techniques than others’. In that sense, machine learning comprises just another set of techniques or tools in the quant toolbox. Like simpler calculative devices, machine learning models are technical aids firms use as a way of compensating for the limited information processing, calculative, and information storage (memory) capacities of humans. Thus, the task or problem at hand suggests which tool to use; that is, it indicates how the distribution of cognition materialises in concrete financial practices (MacKenzie and Hardie, 2009: 47). This distribution of a specific task to a certain type of machine learning technique is the main focus of the following analysis. Besides presenting the problem of optimal execution and the type of algorithm proposed as its solution, I discuss challenges associated with devising and implementing learning algorithms in a dynamic complex system such as financial markets.
Within the spectrum of possible applications there are various examples of adding machine learning to existing, proven quantitative strategies. One area where machine learning techniques have shown considerable promise is algorithmic trade execution. When brokers, who act as intermediaries between investors and markets, execute large orders on behalf of institutional clients – each trade containing blocks of several thousand shares often worth millions of dollars – they do their utmost to impact the market as little as possible. If the entire order were submitted at once, the market would move instantaneously and result in what financiers and financial economists refer to as ‘slippage’. Slippage is the undesired discrepancy between the expected trade price when the order is submitted and the actual price at which it is executed (Frino and Oetomo, 2005). For example, say an investor wants to buy 100,000 shares of Apple and hopes to purchase them at a market price of 230 USD per share. If the entire buy order were submitted to the market at once, Apple sellers would ask for a higher price, and instead of the trade executing at the expected total of 23,000,000 USD, the effective cost would be significantly higher. To reduce the cost of market impact as much as possible, many brokers use algorithms that slice and dice orders into smaller lots and submit them to the market at certain intervals over a set period of time (Dixon and Halperin, 2019: 4).
The emphasis on developing sophisticated execution algorithms has increased with the rise in HFT. The risk of being front-run by predatory HFT algorithms (Lewis, 2014) has arguably made the optimal execution problem more pertinent to institutional investors and the brokers who bring their orders to the market. A main purpose of execution algorithms has thus become to disguise orders from HFT algorithms, yet our interviews with HF traders suggest this is a tremendously difficult thing to do (corroborated by HF traders quoted in MacKenzie, 2018b: 1654–1655). Whilst financial economists are divided on the question of whether HFT actually impacts institutional execution costs (Brogaard et al., 2014; Kervel and Menkveld, 2019), practitioners are concerned with and try to account for the possible presence of predatory HFT algorithms when they execute large orders. When I asked the hedge fund quant whose definition of machine learning I quoted earlier in this section about their approach to aggressive HFT ‘taker’ algorithms (MacKenzie, 2018a), he explained that they basically tried to devise execution algorithms capable of making the small, so-called ‘child orders’ appear as random as possible when they are submitted to the market. ‘You try to randomise your orders so that they do not reveal your big order,’ he explained, and further noted that ‘in essence, it all comes down to having as little market impact as possible for your trade, and that is what execution research is doing’. The hedge fund has an entire team exclusively devoted to execution research (i.e. to devising execution algorithms).
There are two dominant approaches to the algorithmic execution of large trades: time-weighted average price (TWAP) and volume-weighted average price (VWAP). The TWAP strategy divides orders into bits and executes them evenly over a specified period of time; the average price of a security is calculated over a set period of time and trades are executed as close to that average price as possible. The VWAP approach also splits the position into child orders, but uses volume instead of time as the criterion for calculating the average price of each child order. The prices at which the most volume is traded weigh more in the VWAP calculation of average price (Law, 2018). The common objective of both the TWAP and VWAP strategies is to provide the best possible solution to the optimal execution problem.
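The difference between the two schedules can be sketched in a few lines of Python. This is a purely illustrative sketch, not any firm's execution logic; the function names and the six-bucket volume profile are hypothetical.

```python
# Hypothetical sketch: slicing a 100,000-share parent order into child orders.

def twap_schedule(total_shares, n_slices):
    """Time-weighted: (near-)equal child orders at regular time intervals."""
    base, rem = divmod(total_shares, n_slices)
    return [base + (1 if i < rem else 0) for i in range(n_slices)]

def vwap_schedule(total_shares, volume_profile):
    """Volume-weighted: child orders proportional to expected traded volume."""
    total_volume = sum(volume_profile)
    sizes = [round(total_shares * v / total_volume) for v in volume_profile]
    sizes[-1] += total_shares - sum(sizes)  # absorb rounding error in the final slice
    return sizes

print(twap_schedule(100_000, 8))                   # eight equal 12,500-share slices
print(vwap_schedule(100_000, [5, 3, 2, 2, 3, 5]))  # heavier slices where volume is expected
```

Under a VWAP schedule with this (hypothetical) U-shaped intraday volume profile, the child orders cluster at the open and close, where more volume is expected to absorb them.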
Machine learning is well suited to solving optimisation problems, such as the optimal execution problem. Reinforcement learning – a form of machine learning in which algorithms employ trial and error to identify solutions to problems (Osiński and Budek, 2018) – has shown significant promise when applied to the problem of optimised trade execution (Dixon and Halperin, 2019; Kearns and Nevmyvaka, 2013; Li and Lau, 2019). In a game-like situation, the agent is either penalised or rewarded when performing an action. The objective is for the model to maximise its total reward (Li and Lau, 2019: 2). The model learns the most effective way to maximise its reward by being rewarded (penalised) for making the right (wrong) decisions. Whilst the model developer defines the criteria for what should be considered right and wrong actions, the model figures out for itself how to solve the task at hand. Reinforcement learning is deemed an effective way to utilise a ‘machine’s creativity’ (Osiński and Budek, 2018) and unlike humans, the algorithms can learn and thus, gain experience from playing numerous parallel decision-making games simultaneously. It is thus a technique that expands the realm of possible ways to optimise market action by performing cognitive tasks at a scope that exceeds human capabilities. In this case, the task distributed to a reinforcement learning algorithm is to optimise trade execution.
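The reward-driven trial and error described above can be made concrete with a toy sketch. Everything here is a simplifying assumption for illustration: a four-lot order, a quadratic market-impact cost, and tabular Q-learning; real execution systems operate on far richer state. The point is only the mechanism: the agent is penalised for costly actions and discovers for itself that the cost-minimising schedule is the uniform, TWAP-like one.

```python
import random

STEPS, LOTS = 4, 4  # liquidate a 4-lot parent order over 4 time steps

def impact_cost(size):
    return size ** 2  # simplified quadratic market-impact assumption

Q = {}  # Q[(step, remaining)][action] -> estimated total future cost

def q(state, action):
    return Q.setdefault(state, {}).setdefault(action, 0.0)

def feasible(step, remaining):
    # the last step must liquidate whatever remains
    return [remaining] if step == STEPS - 1 else list(range(remaining + 1))

random.seed(0)
alpha, eps = 0.1, 0.2  # learning rate and exploration probability
for episode in range(20_000):
    remaining = LOTS
    for step in range(STEPS):
        state = (step, remaining)
        actions = feasible(step, remaining)
        # epsilon-greedy: mostly exploit, occasionally explore (trial and error)
        if random.random() < eps:
            action = random.choice(actions)
        else:
            action = min(actions, key=lambda a: q(state, a))
        cost = impact_cost(action)  # the 'penalty' for this decision
        remaining -= action
        if step == STEPS - 1:
            future = 0.0
        else:
            future = min(q((step + 1, remaining), a)
                         for a in feasible(step + 1, remaining))
        # Q-learning update towards cost plus best estimated future cost
        old = q(state, action)
        Q[state][action] = old + alpha * (cost + future - old)

# Extract the greedy schedule after training: one lot per step, i.e. TWAP-like.
schedule, remaining = [], LOTS
for step in range(STEPS):
    action = min(feasible(step, remaining), key=lambda a: q((step, remaining), a))
    schedule.append(action)
    remaining -= action
print(schedule)
```

Because the impact cost is convex, splitting the order evenly minimises the total penalty; the agent converges on that policy without it ever being programmed in.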
Matt Le Tissier (pseudonym), who had about a decade’s worth of experience in devising machine-learning algorithms that execute large institutional orders in an optimal fashion, explained how reinforcement learning could improve a VWAP strategy:

Our problem is to take a large order and split it over time in order to ideally minimise market impact, but also achieve some kind of improvement over time, if you can. The baseline assumption would be VWAP, right? Take a VWAP strategy where you are trading uniformly over volume time. That is fine, but not good enough. People want some improvement over that. The RL [reinforcement learning algorithm] essentially applies an adaptation over the baseline strategy, but it does that by providing the optimization parameter, which controls how much it is going to shift the overall schedule forward and backward.
Although the reinforcement learning adaptation of the VWAP strategy allegedly provides better execution, Matt stressed that applying the technique in financial markets can be very complicated. One of the main challenges modellers face is making sure that the model is not derailed by shifts in the market regime. A market regime is basically the state or behaviour of a market at a given point in time. Regime changes occur when, for example, volatility spikes or liquidity dries up (Ang and Timmermann, 2012). Many market-internal and -external factors can cause changes in a market regime, but the point is that it can be difficult for both human and non-human trading agents to operate in changing market regimes. If a model has many parameters, as machine learning models can have, it might be capable of performing well on one dataset, but if there is even the slightest change in market regime, the model risks adapting to these changes in an undesirable way. Whilst the model might optimise order execution, its many variables simultaneously combined with the dynamic and complex nature of financial markets produce obstacles and challenges that quants need to deal with. Thus, distributing a task to an automated trading agent – in this case the reinforcement learning execution algorithm – involves considering how to appropriately (i.e. optimally) distribute tasks and control between humans and machines.
Devising adaptive models capable of performing consistently in changing market regimes is a major and very tedious challenge. To manage their machine learning models’ complexity and thus, ensure that they do not become uncontrollable and incomprehensible, quants use Ockham’s razor as a way of ‘framing’ their model. Framing is constructing ways to select, categorise, and interpret data or, in this case, an automated data processing tool (Beunza and Garud, 2007; Hardie and MacKenzie, 2007: 391). In the following section, I analyse and discuss ways quants use simplicity as a framing tool in model development. I argue that Ockham’s razor helps them reduce the risk of producing overfitted models and enables them to better control the complexity of the machine learning techniques they employ.
The overfitting problem
‘[T]he noise in the data is clouding our vision.’ (Black, 1986: 541)
Financial markets are intricate systems full of information to be extracted and exploited, but they are also brimming with noise. Noise, Black (1986) argues, is ‘what makes our observations imperfect’ (529). It can be difficult to distinguish information from noise and occasionally people act on noise as if it was information. Mistaking noise for information is not limited to humans; from time to time, machine learning models also have their vision clouded by noise. Many such models tend to have a difficult time discriminating strong correlations from spurious ones. Consequently, there is a risk of making inferences about possible relationships in the market based on completely arbitrary and coincidental correlations between factors in the data. One of the problems emerging from this predicament is what modellers refer to as ‘overfitting’.
Although I am mainly interested in ‘overfitting’ as an emic term used by market participants to ascribe meaning to specific concerns about the relation between model performance and model accuracy, I acknowledge it is a technical term that requires some definitional clarification. A technical way to understand overfitting is to distinguish between prediction errors caused by bias and errors caused by variance: an overfitted model keeps bias low by capturing the idiosyncrasies of its training data, but at the cost of high variance, which means it generalises poorly to new data.
Overfitting is an inaccuracy problem, but also one of deception. An overfitted model acts on ‘idiosyncratic noise’ (Mullainathan and Spiess, 2017: 91) or ‘random quirks’ (Domingos, 2012: 81) in the data as if it were valid information, and as a result, identifies correlations that do not exist. If a model’s complexity is high – that is, if it has a large number of parameters – then the risk of overfitting also tends to be high. As Arnott et al. (2018) argue, ‘the greater the complexity, and the greater the reliance on non-intuitive relationships, the greater the likely slippage between backtest simulations and live results’ (14). Based on the views of the quants, it is apparent that their preference for simplicity in machine learning modelling is closely associated with their concerns about models fitting on noise. Simplicity (i.e. Ockham’s razor) functions as a safeguard against relying too heavily on non-intuitive relationships that might be purely spurious correlations discovered by an overfitted, and thus, inaccurate model. As a heuristic, Ockham’s razor allows quants to frame the model so that its complexity becomes manageable. It helps quants control model complexity and the process of adding features to models (i.e. adding complexity). Whilst machine learning techniques help compensate for the bounded rationality of analysts, traders, or portfolio managers – by processing data in ways, and at a pace and scope that exceed human capacities – simplicity helps quants mitigate the risk of being deceived by overly complex model processes.
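This deception can be made concrete with a toy sketch on synthetic data (not a market dataset; all names and numbers here are illustrative). A parsimonious two-parameter line and a maximally complex polynomial that passes exactly through every training point are fitted to the same noisy sample of a linear signal: the complex model's training error is zero, yet it 'fits on noise' and is far less accurate out-of-sample.

```python
import random

random.seed(1)

def truth(x):
    return 2.0 * x  # the underlying signal

def noisy_sample(xs):
    return [(x, truth(x) + random.gauss(0, 0.5)) for x in xs]

n = 12
train = noisy_sample([i / (n - 1) for i in range(n)])
test = noisy_sample([(i + 0.5) / (n - 1) for i in range(n - 1)])  # fresh, off-grid points

def fit_line(pts):
    # two-parameter least-squares fit: the parsimonious model
    m = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    slope = (m * sxy - sx * sy) / (m * sxx - sx * sx)
    intercept = (sy - slope * sx) / m
    return lambda x: intercept + slope * x

def fit_interpolator(pts):
    # Lagrange polynomial of degree len(pts)-1: maximal complexity,
    # passing exactly through every (noisy) training point
    def f(x):
        total = 0.0
        for i, (xi, yi) in enumerate(pts):
            term = yi
            for j, (xj, _) in enumerate(pts):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return f

def mse(f, pts):
    return sum((f(x) - y) ** 2 for x, y in pts) / len(pts)

line, interp = fit_line(train), fit_interpolator(train)
print(mse(interp, train))  # essentially zero: a 'perfect' in-sample fit
print(mse(line, train))    # non-zero: the line does not chase the noise
print(mse(interp, test) > mse(line, test))  # the complex model breaks out-of-sample
```

The interpolator's apparent predictive power is exactly the self-deception the quants describe: it is an artefact of the training data, and it collapses as soon as the model sees fresh observations.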
For developers, let alone users, of machine learning models, it can be difficult not to be swayed by what at first glance appears to be a high-performing model. However, what looks like predictive power is often in fact overfitting. Being sceptical about a model’s output rather than naïvely trusting it arguably helps to mitigate the risk of being deceived by the model. This is how Alan Smith (pseudonym), a machine learning specialist, diagnosed the problem. Alan worked in Matt’s team at the brokerage firm, where he experimented with applying machine learning techniques to different financial problems, including the optimal execution problem. He aptly summarised the problem of overfitting and explained how it could lead to self-deception on the part of model developers and users if they were not careful:

A big mistake lots of people make is that they always want to use something really complex to solve a problem that does not need it. You always want to default the model with the least complexity. You do not want to be fitting to an artefact in the dataset. You just want to make sure that what you are seeing is real and you are not actually kidding yourself with any kind of predictive power.
The importance of simplicity in modelling was further emphasised by another quant, Brian Peterson, who had many years of experience applying machine learning to financial and risk management problems. Brian now leveraged his expertise in the Chicago-based algorithmic trading firm Braverock Investments and as a lecturer at the University of Washington (computational finance and risk management) and University of Illinois (financial engineering). Brian explained that he too resorted to the simple model as the default option:

I always start with the simplest model that I think will be minimally informative, and then use that as a benchmark. We will, for example, take a logistic regression model and use it as our benchmark model, and then do experiments based on [the following questions]: Does adding this factor help? Does using a more sophisticated model help? The more sophistication I add to the model, the more model risk I add to the strategy. […] We always try to start with the simplest model we think will work and move very, very carefully to more complex models.
To prevent adding more than manageable model risk without compromising model accuracy, Brian aimed for parsimony in modelling:

It is very difficult to beat a parsimonious model. All other things being equal, I always prefer the model that has fewer parameters to the model that has more parameters, because the model with fewer parameters is far less likely to break out-of-sample. But that is not always possible. Sometimes I need a large feature space to describe the data that is in front of me. However, given the choice, if I have two otherwise equivalent models, if I cannot statistically distinguish between the models and the predictions that come out of those models, I will always choose the simpler one, because otherwise the chance that I am going to fit on noise is really, really high.
As noted by Domingos (2012), simplicity is valuable in machine learning modelling because ‘simplicity is a virtue in its own right, not because of a hypothetical connection with accuracy’ (86). Simplicity is ‘a goal in itself’ insofar as it is considered a ‘proxy for comprehensibility’ (Domingos, 1999: 421). As the complexity of models used in finance increases, quants become increasingly preoccupied with questions about comprehensibility and explainability (Bracke et al., 2019). Firms want to be able to know, explain, and communicate what their models are doing, especially if something goes wrong (World Economic Forum, 2019). Investors can, for example, quickly lose faith in and abandon an asset manager who cannot convincingly explain their model in periods of weak performance (Kollo, 2019: 5). Thus, Ockham’s razor functions as an enabler of explainability rather than a guarantor of accuracy.
The problem concerning the comprehensibility and explainability of machine learning models was addressed, in relation to accountability, by Ernest Baver, senior research scientist at Hutchin Hill Capital. Ernest explained how the sheer complexity of machine learning models was hindering the ability to attribute a proper cause in the case of a model misstep:

If you use a more intuitive model and something does not work right, then you have a simple way to do attribution analysis. You can actually figure out what went wrong… If you are using complicated nonlinear models, it is harder to say what went wrong. There is no simple way to decompose a deep learning structure into simple blocks.
Several other interviewees used the distinction between intuitive and non-intuitive models as a way of framing the models they applied. Some emphasised the importance of having an ‘intuitive feel’ for the model, which basically means they are aware of what the model is doing. Having an intuitive understanding of the model’s operations also means knowing when and how to intervene if the model suddenly begins to operate in discord with the modeller’s original intent. An intuitive feel for the model thus sustains a ‘qualitative overlay’ (Svetlova, 2012), which allows for judgment calls from traders, portfolio managers, quants, and others. However, machine-learning algorithms and, to some extent, automation, redefine the role of the qualitative overlay or, to use another term, the human overlay.
Automating Ockham’s razor or reimagining the human element
Ockham’s razor is our friend! We realised that we had to teach the machine to be the scientist and to prefer simpler theories. We look at all theories. We look at how much machine learning went in and how much overfitting the algorithms can do and then judge them accordingly.
These are the words of chief scientist Tom Cleverly (pseudonym) of a US-based hedge fund with a strong commitment to machine learning and automation. In the previous section, simplicity is regarded as a framing mechanism used in the modelling process and thus, is something that takes place ‘outside’ the model (Svetlova and Dirksen, 2014: 572). However, in this hedge fund’s case, Ockham’s razor is written into the machine learning system itself. Tom’s rationale for implementing simplicity preference in the system is that if Ockham’s razor, as proposed by Box, is the hallmark of the great scientist, then a good model must be capable of applying that principle. According to Tom, this built-in simplicity preference constitutes ‘meta machine learning’, that is, a layer of machine learning operating on top of a signal detection machine learning model. Whilst the basic model processes market data in search of signals to trade on, the meta-layer ensures only simple and robust strategies are pursued. The purpose of automating Ockham’s razor is basically to prevent the system from overfitting and thereby producing unreliable signals. Tom further emphasised that the meta-machine learning layer automates the actual investment decision process, which in most hedge funds takes place in an investment committee. Normally, the investment committee is composed of a group of high-level people from the firm and its job is basically to determine the fund’s direction. Automating the investment committee function thus eliminates human bias from the strategy- and decision-making process. ‘There is no human layer on top,’ insisted Tom. In other words, the qualitative human overlay is replaced by a mechanised layer of machine learning.
Modellers use various techniques to offset or mitigate the risk of overfitting. One of these techniques is regularisation. Regularisation penalises model complexity during training and thereby discourages selecting a more complex model. Only if the complex model significantly outperforms the simple model (determined by some prefixed value) does the machine learning system switch to the more complex one (Arnott et al., 2018: 13–14). The meta-layer on the hedge fund’s machine learning system can be regarded as a sophisticated regularisation tool. The system is built in accordance with a scientific ideal which says simpler, more economical descriptions of a phenomenon are preferable to more complex ones. However, when Ockham’s razor is automated, it is no longer a framing mechanism utilised by quants in the model development process: it reduces the heuristic to an algorithmic operation, not contingent on any intervention by human judgment. Making Ockham’s razor a fixed principle presupposes the very relationship between simplicity and accuracy that Domingos (2012) describes as ‘hypothetical’ (86). Although automating Ockham’s razor does leverage the procedural qualities of the principle, it simultaneously eliminates the possibility that quants may consider if the practical concern (i.e. the specific problem in a given market context) in fact requires a more complex technical solution.
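The switch-only-if-significantly-better rule described by Arnott et al. can be sketched as a tiny selection routine. The helper name, the candidate list, and the threshold value are all hypothetical; the point is only the logic of a prefixed margin that a more complex model must clear.

```python
def select_model(candidates, threshold=0.05):
    """Pick the least complex candidate unless a more complex one beats the
    incumbent's validation error by at least `threshold` (a prefixed margin).

    candidates: list of (complexity, validation_error) pairs, any order.
    """
    ranked = sorted(candidates)          # simplest model first
    best_complexity, best_err = ranked[0]
    for complexity, err in ranked[1:]:
        if best_err - err > threshold:   # must outperform by the prefixed value
            best_complexity, best_err = complexity, err
    return best_complexity

models = [(1, 0.30), (2, 0.28), (3, 0.18)]
print(select_model(models))  # the 3-parameter model wins only because it clears the margin
```

The 2-parameter model improves on the 1-parameter benchmark by only 0.02 and is rejected; the 3-parameter model improves by 0.12 and is accepted. In effect, this hard-codes the default-to-simplicity preference that the quants quoted above apply by judgment.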
In general, the objective of using machine learning in finance is to discover reliable patterns and structures in the noisy, adaptive, and deeply entangled reality of financial markets. Because financial markets are anything but simple, the simplest model is not always the optimal solution when dealing with very intricate problems. Instead of perceiving Ockham’s razor as a truth claim, seeking simplicity in modelling may rather be a means for enabling quants to activate their judgment and expertise when working on models. It is a means to direct and control the model development process; it is not an end. As Domingos (1999) argues, modellers should ‘seek to constrain induction using domain knowledge, and decouple discovering the most accurate (and probably quite complex) model from extracting comprehensible approximations to it’ (421). Thus, the quant’s domain knowledge is considered a decisive factor in constraining complexity and certainly has not been made redundant by automation. As Beverungen and Lange (2018) argue in their study of distributed cognition and automation in HFT, a crucial part of being an HF trader is to ‘establish control, awareness and consciousness of what is going on’ when their algorithms trade in the markets (84). Something similar can be said about machine learning quants, who use simplicity to control and thus manage complexity when devising models.
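Domingos’s suggestion to decouple discovering the most accurate model from extracting comprehensible approximations to it can be illustrated with a surrogate-model sketch: a complex fit is made first, and a simpler, human-readable proxy is then fitted to the complex model’s predictions rather than to the raw data. Everything below (the synthetic data, the polynomial degrees standing in for ‘complex’ and ‘simple’ models, and the fidelity measure) is a hedged illustration, not a description of any system discussed in this article.

```python
import numpy as np

# Synthetic data: a nonlinear signal plus noise (illustrative assumption).
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=400)
y = np.sin(x) + rng.normal(scale=0.2, size=400)

# Step 1: the 'most accurate (and probably quite complex) model' --
# a degree-9 polynomial stands in for an opaque learner.
complex_coefs = np.polyfit(x, y, 9)
complex_pred = np.polyval(complex_coefs, x)

# Step 2: the 'comprehensible approximation' -- a cubic surrogate fitted
# to the complex model's predictions, so a human inspects four
# coefficients instead of the full model.
surrogate_coefs = np.polyfit(x, complex_pred, 3)
surrogate_pred = np.polyval(surrogate_coefs, x)

# Fidelity: the share of the complex model's behaviour the surrogate captures.
ss_res = np.sum((complex_pred - surrogate_pred) ** 2)
ss_tot = np.sum((complex_pred - complex_pred.mean()) ** 2)
fidelity = 1 - ss_res / ss_tot
print(f"surrogate fidelity to complex model: {fidelity:.3f}")
```

The point of the decoupling is that the accurate model does the predicting while the surrogate does the explaining; the quant’s judgment enters in deciding whether the surrogate’s fidelity is high enough for the approximation to be trusted.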
Although the human element – the individual market participant’s domain knowledge and judgment – has not been rendered obsolete by machine learning, quants are forced to exercise their reasoned judgment differently. This alleged shift in how quants are supposed to utilise their expert judgment was addressed by Mahmood Noorani, the founder and CEO of the London-based analytics provider Quant Insight. Having worked in trading and portfolio management since the 1990s, Mahmood experienced first-hand the proliferation of increasingly sophisticated algorithms and models in the financial industry. Although he embraced the opportunities created by recent leaps in artificial intelligence and machine learning research, his view on applying learning algorithms in investment management was marked by a profound faith in expert judgment. According to Mahmood, the non-static nature of financial markets combined with the scope and adaptability of the most sophisticated models used in the industry only made ‘the human overlay’ more indispensable:

The problem is that you cannot have a controlled repeated experiment in finance … If you are going to build machines to trade markets, one thing you still have to have is domain expertise, because the big issue on the quantitative side is the so-called overfitting issue. If you build a machine that goes and takes a vast amount of data and try to understand an asset price, you can get these spurious associations where rainfall in Brazil seems to be explaining, with a 1-week lag, the movement of the S&P 500. That is just wrong. You have to have a human overlay to understand causality. A couple of years back, we were trying to build a model of crude oil prices … [T]he question was: ‘should we have global inflation expectations as one of the explanatory variables for crude oil?’ We decided not to, because it should actually go the other way around. Changes in crude oil prices shift inflation activity.
Changes in inflation expectations do not shift crude oil prices. You need to have a human, who understands how the relationships work. I do not think we are anywhere close to the point where the machine is going to understand that causality … [T]here still is a place for the human element, but it is in guiding and training the machine and choosing the right data, rather than going straight to the investment decision.
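The overfitting problem Mahmood describes, spurious lagged associations surfacing from large datasets, can be reproduced with purely synthetic data. The sketch below scans several hundred independent noise series for a 1-week-lagged correlation with an equally random ‘market’ series; the sample sizes and variable names are illustrative assumptions, and no real rainfall or index data are involved.

```python
import numpy as np

# A minimal sketch of the 'spurious association' problem: with enough
# candidate predictors, some lagged correlation looks strong purely by
# chance, even though every series here is independent noise.
rng = np.random.default_rng(42)
n_weeks, n_candidates = 104, 500           # two years of weekly observations

market = rng.normal(size=n_weeks)          # stand-in for weekly index returns
candidates = rng.normal(size=(n_candidates, n_weeks))

def lagged_corr(x, y, lag=1):
    """Correlation between x at time t-lag and y at time t."""
    return np.corrcoef(x[:-lag], y[lag:])[0, 1]

corrs = np.array([lagged_corr(c, market) for c in candidates])
best = np.abs(corrs).max()
# The best candidate 'explains' the market with a sizeable lagged
# correlation -- a pattern a naive learner would happily trade on.
print(f"strongest spurious lagged correlation: {best:.2f}")
```

A purely data-driven system has no way of knowing that the winning series is noise; ruling it out requires exactly the causal domain knowledge the human overlay supplies.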
Conclusion
This article’s main objective was to delve into some of the challenges associated with applying machine learning techniques in financial securities trading and investment management. Specifically, the goal was to investigate how quants strive to maintain a certain level of comprehensibility and control in response to the model complexity resulting from the proliferation of machine learning techniques in the financial industry. I used the theoretical perspective of distributed cognition to capture the human–model dynamic and examine ways cognitive tasks are distributed to machine learning-powered financial models. This theoretical framing allowed for an analysis of the organisation of distributed cognitive systems where machine learning techniques are employed as technical aids. The study contributes to research on the relationship between and interaction of humans and models in finance and beyond by examining how this relationship and interaction dynamic is shaped by applying emerging technologies – in this case machine-learning algorithms used for investment management and trading.
The article examines how quants attempt to manage the inevitable complexity of machine-learning algorithms by resorting to simplicity as a virtuous rule of thumb in model development and model implementation processes. Quants consider simplicity, or Ockham’s razor, a heuristic that helps them manage and control machine learning model complexity. Applying Ockham’s razor in their modelling practices, quants seek to ensure their models are comprehensible, interpretable, and thus, explainable. Hence, Ockham’s razor helps frame the modelling process in a way that makes it more foreseeable and provides the quant with a certain degree of control, which is also a way of fortifying accountability (Goodman and Flaxman, 2017). Moreover, the study suggests that comprehensibility and explainability become increasingly important factors in quantitative investment and algorithmic trading for the simple reason that the most sophisticated models used in finance – such as multi-layer artificial neural networks – comprise elements that cannot be fully comprehended and thus explained. Rather than being able to account for every detail in learning models, comprehension is a matter of understanding the model’s logic and being able to interpret its output. The ability to interpret and explain complex models is particularly important when faced with a model mishap.
Besides securing explainability, the simplicity preference functions as a safeguard against the risk of models learning from noise in the data rather than from information. Quants’ concerns about the risk of models overfitting on noise are the primary reason they consider simplicity a virtue. Simpler models with fewer parameters are less likely to overfit and thus produce incorrect predictions. The quants’ preoccupation with Ockham’s razor should not be dismissed as blind reliance on a contested yet resilient scientific principle, but rather understood as an expression of the crucial importance human judgment still has in machine-driven finance. By demonstrating how quants exercise judgment when dealing with complex machine learning models, the article contributes to pertinent academic discussions of the human role in the increasingly automated world of high finance. In electronic markets, trading automation has reshaped the work of traders: they seldom trade manually, but rather observe algorithms trading and intervene if an algorithm does not act as intended. Similarly, machine learning is changing the role of the quant. The quant’s job is to ensure models are maximally capable of determining how, what, and when to trade. Machine learning quants train and sustain automated decision-makers. In other words, they are responsible for the organisation and facilitation of socio-technical systems of distributed cognition.
Hence, the domain knowledge possessed by an individual expert is not made redundant by the propagation of adaptive machine-learning algorithms, but is used for guiding and ensuring stable, accurate, and transparent model performance. It is arguably not in spite of but because of the presence of learning algorithms that a human element is necessary in quantitative finance.
There is a need for further studies of the ways automation and the extended use of machine learning techniques shape model use practices in trading and investment management and, perhaps, to reconceptualise judgment as an action undertaken by socio-technical assemblages rather than by an individual aided by machines. Another potential avenue for further research presents itself in the study of the subjective factors, that is, the values shaping practices and decision-making in machine learning-driven quant finance. As Kuhn (1977) pointed out more than 40 years ago, decision-making in scientific practice depends on a ‘mixture of objective and subjective factors’ (106; cf. Laudan, 1984). Considerations around theory or model accuracy, simplicity, comprehensibility, etc. – such as those presented in this article – are presumably always influenced by ‘non-epistemic values’ (Douglas, 2000). The applied science conducted by machine learning quants (i.e. the application of theories and techniques from computer science, mathematics, physics, etc. to financial problems) hinges on collaborative efforts (Rolin, 2015): negotiations between teams of quants on one side and traders, portfolio managers, firms’ fiduciary responsibilities, etc. on the other. All these internal and external factors set the scene in which the practices of model development and model use play out, and they make these practices heavily value-laden.
