This article is a part of special theme on The Turn to AI. To see a full list of all articles in this special theme, please click here: https://journals.sagepub.com/page/bds/collections/theturntoai
In his 1976 article ‘Science and Statistics’, the British statistician George Box famously argues that ‘since all models are wrong, the scientist cannot obtain a “correct” one by excessive elaboration’ (Box, 1976: 792). Instead of seeking an unattainable perfect model, Box suggests the scientist ‘seek an economical description of natural phenomena’ because ‘the ability to devise simple but evocative models is the signature of the great scientist [while] overelaboration and overparameterisation is often the mark of mediocrity’ (792). Ockham’s razor – the principle that, other things being equal, simpler explanations and models are to be preferred over more complex ones – captures this scientific ideal of parsimony.
This study considers the issue of using simplicity as a response to the complexity of machine learning models employed in the financial industry, specifically, in quantitative investment management and algorithmic trading. It has recently been argued that machine learning will transform how people trade securities and manage investments for generations (Lopez de Prado, 2018). With the increase in computing power, decrease in data storage costs, and availability of Big Data, applying machine learning techniques to trading and investment management problems has become an increasingly viable and widely exercised option in the financial industry (Arnott et al., 2018; Buchanan, 2019; Dixon and Halperin, 2019; Guida, 2019; Lopez de Prado, 2018). Machine learning is a subset of artificial intelligence in which systems learn from data rather than follow explicitly programmed rules. Machine learning techniques appeal to the financial industry because of their capacity to effectively discover patterns, correlations and anomalies in large and complex datasets. I argue that machine learning techniques enhance financiers’ ability to take advantage of opportunities, but at the same time possess a degree of ‘unavoidable complexity’ (Burrell, 2016: 5), which developers and users of such techniques must find ways to manage and control.
Unlike more conventional rule-based algorithms that follow established rules, machine-learning algorithms learn from pre-set optimisation criteria, which offsets the need for developers to specify every decision rule in advance.
To grasp the transformatory role machine learning models allegedly play in financial markets, I argue it is necessary to examine the multifaceted and dynamic relationship between models and the people who develop and use them (Svetlova, 2018: 4). Various SSF studies have explored the reciprocal influence of models, equations, and algorithms on one hand, and developers and users on the other. MacKenzie’s work on the performativity of the Black–Scholes options pricing model is an example of a tone-setting contribution to the field (MacKenzie, 2003, 2008; MacKenzie and Millo, 2003), but several other thorough empirical studies have deepened the understanding of human–model interactions and entanglements in the financial markets (such as MacKenzie, 2011; MacKenzie and Spears, 2014a, 2014b; Millo and MacKenzie, 2009; Svetlova, 2012, 2013; Wansleben, 2014). One thing that differentiates the machine learning models examined in this study from, for example, the discounted cash flow valuation models examined by Svetlova (2012, 2013) is the capacity of the former class of models to learn and optimise without interference from model developers and users. By examining how these learning models shape the ways practitioners perceive and attempt to manage them, this study contributes to the sociological research on the use of models and algorithms in high finance.
To examine the question of how the complexity of machine learning models is being managed in the financial industry, I draw on interviews with market participants who develop and use machine learning techniques for specific trading and investment management purposes. As a theoretical frame, I use the perspective of ‘distributed cognition’, which originates from the idea that human cognition is always situated in ‘a complex sociocultural world and cannot be unaffected by it’ (Hutchins, 1995a: xii). Distributed cognition describes ‘a situation in which one or more individuals reach a cognitive outcome either by combining individual knowledge not initially shared with the others or by interacting with artifacts organized in an appropriate way (or both)’ (Giere, 2002: 641). Giere and Moffatt (2003) argue that ‘distributed cognitive systems’ enable ‘the acquisition of knowledge that no single person, or a group of people without instruments, could possibly acquire’ (305). In the context of financial markets, Beunza and Stark (2012) have demonstrated how traders working at an investment bank’s merger arbitrage trading desk reduce their ‘cognitive overload’ by using financial models and other instruments at their disposal (394). Similarly, Hardie and MacKenzie (2007; see also MacKenzie and Hardie, 2009) have shown how decision-making in hedge funds is characterised by distributed cognition, expressed in the way traders rely on colleagues, contacts, and technical devices when deciding what and when to buy or sell. In these studies, the instruments employed in the systems of distributed cognition compensate for traders’ bounded rationality, enable knowledge acquisition, and thus, inform human decision-making. However, as trading becomes increasingly automated, algorithms no longer exclusively inform human decision-making; they often also determine and execute those decisions themselves.
Even though much trading is conducted by algorithms at speeds and a scope that exceed human cognition, humans are not marginalised by but remain a crucial cog in contemporary algorithmic trading (Beverungen and Lange, 2018). Whereas ultrafast high-frequency trading (HFT) algorithms enhance traders’ ability to seize arbitrage opportunities long before any human would have been able to identify them, machine learning expands the scope of data mining and data processing and thus, enhances the capacity to trawl markets in search of patterns and correlations to exploit. I do not want to argue that machine learning extends the mind and thus, has epistemic agency (Clark and Chalmers, 1998; cf. Giere, 2007), but I contend that machine-learning algorithms’ ability to learn makes them different and more complex cognitive aids than most other technical devices used in trading and investment management. As Burrell (2016) points out, when a machine-learning algorithm learns, ‘it does so without regard for human comprehension’ (10), which influences the way practitioners see and attempt to manage such models. The analysis demonstrates that users of machine learning models are concerned that models might learn the wrong things and thus, become deceitful rather than informative. They use Ockham’s razor as a heuristic to help strike a balance between simplicity and complexity, and interpretability and accuracy. The value of Ockham’s razor in machine learning model building lies not necessarily in the scientific superiority of the simple model, as suggested by Box, but in ensuring the comprehensibility and explainability of the models that practitioners interact with. The simplicity heuristic, as employed by market practitioners, is best understood as a ‘proxy for comprehensibility’ (Domingos, 1999: 421), which helps secure a certain level of human control of model processes and enables explainability.
This emphasis on comprehensibility and explainability reflects a more general industry concern about transparency and machine learning (Bracke et al., 2019; World Economic Forum, 2019).
This introduction is followed by five sections. The first is a brief outline of the empirical material and my approach to studying machine-learning model use. This is followed by a section on applying a specific machine learning technique, reinforcement learning, to the problem of optimal trade execution. The subsequent sections examine the problem of overfitting and the automation of Ockham’s razor, before the article closes with a concluding discussion.
Data and methods
An increasing number of trading and investment management firms adopt machine learning techniques and practitioners are generally willing to discuss possible applications of such techniques in finance. Nevertheless, it is difficult to obtain in-depth information about specific machine learning models. Unsurprisingly, firms are reluctant to share the recipe for the ‘secret sauce’ that gives them their competitive edge. Because machine learning models are complex systems comprising numerous lines of code and an internal decision logic that changes as models learn from running on training data (Burrell, 2016: 5), it would be difficult to extract meaningful information from the source code itself (if available) without also having to obtain information about the context of application, training data, and implementation process. It is nevertheless possible to obtain contextual knowledge about machine learning models by talking to people in the financial industry who develop and use them, because it makes little sense to talk about machine learning without also considering the specific context of application (Weigand, 2019: 95). Thus, my approach is to treat models not as technical objects decoupled from the organisational, institutional, and social context in which they are developed and applied, but rather as components of socio-technical systems comprising both humans and models.
To examine these socio-technical systems comprising humans and models, I draw on 31 interviews with practitioners who develop and use machine learning techniques for trading and investment management purposes.
The two main criteria for selecting interviewees were (1) expertise within the area of machine learning and/or (2) practical experience with regard to applying machine learning techniques in finance. Since developing implementable machine-learning algorithms for trading and investment management purposes is rarely a one-person endeavour, but rather a task distributed between a number of people in different roles, I have prioritised gathering input from a range of people involved in the practices of leveraging machine learning techniques in finance, instead of focussing solely on one group (e.g. data scientists). The majority of the interviewees worked in hedge funds or in algorithmic trading firms where they had different roles at different levels in the organisational hierarchy. I obtained consent from informants to use all quotes included in the article. A few interviewees wanted their identities disclosed, whilst the rest are either pseudonymised or anonymised.
The interviews are sampled from a body of 182 in-depth, semi-structured interviews conducted during a two-year period (2017–2019) by members of the ‘Algorithmic Finance: Inquiring into the reshaping of financial markets’ research project of which I am a part. Interviewees worked in hedge funds, asset management firms, proprietary algorithmic trading firms, banks, brokers, regulators, exchanges, and technology vendors (see Table 2, in Appendix 1). Most of these firms and financial institutions are located in New York, Chicago, and London, but we also conducted interviews in Washington, San Francisco, and a few other cities in the US and continental Europe. Most interviews were conducted
The optimal trade: Reinforcement learning for trade execution
Machine learning techniques have existed in the financial industry for quite some time. Banks and credit card companies have used machine learning models for classification purposes in fraud detection since the 1990s and credit scoring since the early 2000s (Buchanan, 2019). In the fields of trading and investment management, notoriously secretive systematic hedge funds, like the Cold War code breaker James Simons’ Renaissance Technologies, D. E. Shaw, and Two Sigma, are said to have profited substantially from using machine learning models for a number of years (see, e.g. Zuckerman, 2019). Today, there is no shortage of investment management and trading firms experimenting with artificial intelligence and machine learning. Although only a fraction of firms may be able to boast of having successful machine learning systems in production, interest in the technology is undeniably massive and increasing.
In general, the choice of machine learning technique depends on the task at hand and the dataset available. As a quant working in one of the world’s leading quantitative hedge funds explained, machine learning is ‘just another mathematical technique used to extract information from a dataset and certain datasets have characteristics that lend themselves better to machine learning techniques than others’. In that sense, machine learning comprises just another set of techniques or tools in the quant toolbox. Like simpler calculative devices, machine learning models are technical aids firms use as a way of compensating for the limited information processing, calculative, and information storage (memory) capacities of humans. Thus, the task or problem at hand suggests which tool to use; that is, it indicates how the distribution of cognition materialises in concrete financial practices (MacKenzie and Hardie, 2009: 47). This distribution of a specific task to a certain type of machine learning technique is the main focus of the following analysis. Besides presenting the problem of optimal execution and the type of algorithm proposed as its solution, I discuss challenges associated with devising and implementing learning algorithms in a dynamic complex system such as financial markets.
Within the spectrum of possible applications there are various examples of adding machine learning to existing, proven quantitative strategies. One area where machine learning techniques have shown considerable promise is algorithmic trade execution. When brokers, who act as intermediaries between investors and markets, execute large orders on behalf of institutional clients – each trade containing blocks of several thousand shares often worth millions of dollars – they do their utmost to impact the market as little as possible. If the entire order were submitted at once, the market would move instantaneously and result in what financiers and financial economists refer to as ‘slippage’. Slippage is the undesired discrepancy between the expected trade price when the order is submitted and the actual price at which it is executed (Frino and Oetomo, 2005). For example, say an investor wants to buy 100,000 shares of Apple and hopes to purchase them at a market price of 230 USD per share. If the entire buy order were submitted to the market at once, Apple sellers would ask for a higher price, and instead of the trade executing at the expected total of 23,000,000 USD, the effective cost would be significantly higher. To reduce the cost of market impact as much as possible, many brokers use algorithms that slice and dice orders into smaller lots and submit them to the market at certain intervals over a set period of time (Dixon and Halperin, 2019: 4).
The emphasis on developing sophisticated execution algorithms has increased with the rise in HFT. The risk of being front-run by predatory HFT algorithms (Lewis, 2014) has arguably made the optimal execution problem more pertinent to institutional investors and the brokers who bring their orders to the market. A main purpose of execution algorithms has thus become to disguise orders from HFT algorithms, yet our interviews with HF traders suggest this is a tremendously difficult thing to do (corroborated by HF traders quoted in MacKenzie, 2018b: 1654–1655). Whilst financial economists are divided on the question of whether HFT actually impacts institutional execution costs (Brogaard et al., 2014; Kervel and Menkveld, 2019), practitioners are concerned with and try to account for the possible presence of predatory HFT algorithms when they execute large orders. When I asked the hedge fund quant whose definition of machine learning I quoted earlier in this section about their approach to aggressive HFT ‘taker’ algorithms (MacKenzie, 2018a), he explained that they basically tried to devise execution algorithms capable of making the small, so-called ‘child orders’ appear as random as possible when they are submitted to the market. ‘You try to randomise your orders so that they do not reveal your big order,’ he explained, and further noted that ‘in essence, it all comes down to having as little market impact as possible for your trade, and that is what execution research is doing’. The hedge fund has an entire team exclusively devoted to execution research (i.e. to devising execution algorithms).
There are two dominant approaches to the algorithmic execution of large trades: time-weighted average price (TWAP) and volume-weighted average price (VWAP). The TWAP strategy divides orders into bits and executes them evenly over a specified period of time; the average price of a security is calculated over a set period of time and trades are executed as close to that average price as possible. The VWAP approach also splits the position into child orders, but uses volume instead of time as the criterion for calculating the average price of each child order. The prices at which the most volume is traded weigh more in the VWAP calculation of average price (Law, 2018). The common objective of both the TWAP and VWAP strategies is to provide the best possible solution to the optimal execution problem.
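The difference between the two schedules can be sketched in a few lines of Python. This is a purely illustrative sketch, not any firm's execution logic; the function names and the six-bucket volume profile are hypothetical.

```python
# Hypothetical sketch: slicing a 100,000-share parent order into child orders.

def twap_schedule(total_shares, n_slices):
    """Time-weighted: (near-)equal child orders at regular time intervals."""
    base, rem = divmod(total_shares, n_slices)
    return [base + (1 if i < rem else 0) for i in range(n_slices)]

def vwap_schedule(total_shares, volume_profile):
    """Volume-weighted: child orders proportional to expected traded volume."""
    total_volume = sum(volume_profile)
    sizes = [round(total_shares * v / total_volume) for v in volume_profile]
    sizes[-1] += total_shares - sum(sizes)  # absorb rounding error in the final slice
    return sizes

print(twap_schedule(100_000, 8))                   # eight equal 12,500-share slices
print(vwap_schedule(100_000, [5, 3, 2, 2, 3, 5]))  # heavier slices where volume is expected
```

Under a VWAP schedule with this (hypothetical) U-shaped intraday volume profile, the child orders cluster at the open and close, where more volume is expected to absorb them.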
Machine learning is well suited to solving optimisation problems, such as the optimal execution problem. Reinforcement learning – a form of machine learning in which algorithms employ trial and error to identify solutions to problems (Osiński and Budek, 2018) – has shown significant promise when applied to the problem of optimised trade execution (Dixon and Halperin, 2019; Kearns and Nevmyvaka, 2013; Li and Lau, 2019). In a game-like situation, the agent is either penalised or rewarded when performing an action. The objective is for the model to maximise its total reward (Li and Lau, 2019: 2). The model learns the most effective way to maximise its reward by being rewarded (penalised) for making the right (wrong) decisions. Whilst the model developer defines the criteria for what should be considered right and wrong actions, the model figures out for itself how to solve the task at hand. Reinforcement learning is deemed an effective way to utilise a ‘machine’s creativity’ (Osiński and Budek, 2018) and unlike humans, the algorithms can learn and thus, gain experience from playing numerous parallel decision-making games simultaneously. It is thus a technique that expands the realm of possible ways to optimise market action by performing cognitive tasks at a scope that exceeds human capabilities. In this case, the task distributed to a reinforcement learning algorithm is to optimise trade execution.
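The reward-driven trial and error described above can be made concrete with a toy sketch. Everything here is a simplifying assumption for illustration: a four-lot order, a quadratic market-impact cost, and tabular Q-learning; real execution systems operate on far richer state. The point is only the mechanism: the agent is penalised for costly actions and discovers for itself that the cost-minimising schedule is the uniform, TWAP-like one.

```python
import random

STEPS, LOTS = 4, 4  # liquidate a 4-lot parent order over 4 time steps

def impact_cost(size):
    return size ** 2  # simplified quadratic market-impact assumption

Q = {}  # Q[(step, remaining)][action] -> estimated total future cost

def q(state, action):
    return Q.setdefault(state, {}).setdefault(action, 0.0)

def feasible(step, remaining):
    # the last step must liquidate whatever remains
    return [remaining] if step == STEPS - 1 else list(range(remaining + 1))

random.seed(0)
alpha, eps = 0.1, 0.2  # learning rate and exploration probability
for episode in range(20_000):
    remaining = LOTS
    for step in range(STEPS):
        state = (step, remaining)
        actions = feasible(step, remaining)
        # epsilon-greedy: mostly exploit, occasionally explore (trial and error)
        if random.random() < eps:
            action = random.choice(actions)
        else:
            action = min(actions, key=lambda a: q(state, a))
        cost = impact_cost(action)  # the 'penalty' for this decision
        remaining -= action
        if step == STEPS - 1:
            future = 0.0
        else:
            future = min(q((step + 1, remaining), a)
                         for a in feasible(step + 1, remaining))
        # Q-learning update towards cost plus best estimated future cost
        old = q(state, action)
        Q[state][action] = old + alpha * (cost + future - old)

# Extract the greedy schedule after training: one lot per step, i.e. TWAP-like.
schedule, remaining = [], LOTS
for step in range(STEPS):
    action = min(feasible(step, remaining), key=lambda a: q((step, remaining), a))
    schedule.append(action)
    remaining -= action
print(schedule)
```

Because the impact cost is convex, splitting the order evenly minimises the total penalty; the agent converges on that policy without it ever being programmed in.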
Matt Le Tissier (pseudonym), who had about a decade’s worth of experience in devising machine-learning algorithms that execute large institutional orders in an optimal fashion, explained how reinforcement learning could improve a VWAP strategy:

Our problem is to take a large order and split it over time in order to ideally minimise market impact, but also achieve some kind of improvement over time, if you can. The baseline assumption would be VWAP, right? Take a VWAP strategy where you are trading uniformly over volume time. That is fine, but not good enough. People want some improvement over that. The RL [reinforcement learning algorithm] essentially applies an adaptation over the baseline strategy, but it does that by providing the optimization parameter, which controls how much it is going to shift the overall schedule forward and backward.
Although the reinforcement learning adaptation of the VWAP strategy allegedly provides better execution, Matt stressed that applying the technique in financial markets can be very complicated. One of the main challenges modellers face is making sure that the model is not derailed by shifts in the market regime. A market regime is basically the state or behaviour of a market at a given point in time. Regime changes occur when, for example, volatility spikes or liquidity dries up (Ang and Timmermann, 2012). Many market-internal and -external factors can cause changes in a market regime, but the point is that it can be difficult for both human and non-human trading agents to operate in changing market regimes. If a model has many parameters, as machine learning models can have, it might be capable of performing well on one dataset, but if there is even the slightest change in market regime, the model risks adapting to these changes in an undesirable way. Whilst the model might optimise order execution, its many variables simultaneously combined with the dynamic and complex nature of financial markets produce obstacles and challenges that quants need to deal with. Thus, distributing a task to an automated trading agent – in this case the reinforcement learning execution algorithm – involves considering how to appropriately (i.e. optimally) distribute tasks and control between humans and machines.
Devising adaptive models capable of performing consistently in changing market regimes is a major and very tedious challenge. To manage their machine learning models’ complexity and thus, ensure that they do not become uncontrollable and incomprehensible, quants use Ockham’s razor as a way of ‘framing’ their model. Framing is constructing ways to select, categorise, and interpret data or, in this case, an automated data processing tool (Beunza and Garud, 2007; Hardie and MacKenzie, 2007: 391). In the following section, I analyse and discuss ways quants use simplicity as a framing tool in model development. I argue that Ockham’s razor helps them reduce the risk of producing overfitted models and enables them to better control the complexity of the machine learning techniques they employ.
The overfitting problem
‘[T]he noise in the data is clouding our vision.’ (Black, 1986: 541)
Financial markets are intricate systems full of information to be extracted and exploited, but they are also brimming with noise. Noise, Black (1986) argues, is ‘what makes our observations imperfect’ (529). It can be difficult to distinguish information from noise and occasionally people act on noise as if it was information. Mistaking noise for information is not limited to humans; from time to time, machine learning models also have their vision clouded by noise. Many such models tend to have a difficult time discriminating strong correlations from spurious ones. Consequently, there is a risk of making inferences about possible relationships in the market based on completely arbitrary and coincidental correlations between factors in the data. One of the problems emerging from this predicament is what modellers refer to as ‘overfitting’.
Although I am mainly interested in ‘overfitting’ as an emic term used by market participants to ascribe meaning to specific concerns about the relation between model performance and model accuracy, I acknowledge it is a technical term that requires some definitional clarification. A technical way to understand overfitting is to distinguish between prediction errors caused by bias and errors caused by variance: an overfitted model keeps bias low by capturing the idiosyncrasies of its training data, but at the cost of high variance, which means it generalises poorly to new data.
Overfitting is an inaccuracy problem, but also one of deception. An overfitted model acts on ‘idiosyncratic noise’ (Mullainathan and Spiess, 2017: 91) or ‘random quirks’ (Domingos, 2012: 81) in the data as if it were valid information, and as a result, identifies correlations that do not exist. If a model’s complexity is high – that is, if it has a large number of parameters – then the risk of overfitting also tends to be high. As Arnott et al. (2018) argue, ‘the greater the complexity, and the greater the reliance on non-intuitive relationships, the greater the likely slippage between backtest simulations and live results’ (14). Based on the views of the quants, it is apparent that their preference for simplicity in machine learning modelling is closely associated with their concerns about models fitting on noise. Simplicity (i.e. Ockham’s razor) functions as a safeguard against relying too heavily on non-intuitive relationships that might be purely spurious correlations discovered by an overfitted, and thus, inaccurate model. As a heuristic, Ockham’s razor allows quants to frame the model so that its complexity becomes manageable. It helps quants control model complexity and the process of adding features to models (i.e. adding complexity). Whilst machine learning techniques help compensate for the bounded rationality of analysts, traders, or portfolio managers – by processing data in ways, and at a pace and scope that exceed human capacities – simplicity helps quants mitigate the risk of being deceived by overly complex model processes.
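This deception can be made concrete with a toy sketch on synthetic data (not a market dataset; all names and numbers here are illustrative). A parsimonious two-parameter line and a maximally complex polynomial that passes exactly through every training point are fitted to the same noisy sample of a linear signal: the complex model's training error is zero, yet it 'fits on noise' and is far less accurate out-of-sample.

```python
import random

random.seed(1)

def truth(x):
    return 2.0 * x  # the underlying signal

def noisy_sample(xs):
    return [(x, truth(x) + random.gauss(0, 0.5)) for x in xs]

n = 12
train = noisy_sample([i / (n - 1) for i in range(n)])
test = noisy_sample([(i + 0.5) / (n - 1) for i in range(n - 1)])  # fresh, off-grid points

def fit_line(pts):
    # two-parameter least-squares fit: the parsimonious model
    m = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    slope = (m * sxy - sx * sy) / (m * sxx - sx * sx)
    intercept = (sy - slope * sx) / m
    return lambda x: intercept + slope * x

def fit_interpolator(pts):
    # Lagrange polynomial of degree len(pts)-1: maximal complexity,
    # passing exactly through every (noisy) training point
    def f(x):
        total = 0.0
        for i, (xi, yi) in enumerate(pts):
            term = yi
            for j, (xj, _) in enumerate(pts):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return f

def mse(f, pts):
    return sum((f(x) - y) ** 2 for x, y in pts) / len(pts)

line, interp = fit_line(train), fit_interpolator(train)
print(mse(interp, train))  # essentially zero: a 'perfect' in-sample fit
print(mse(line, train))    # non-zero: the line does not chase the noise
print(mse(interp, test) > mse(line, test))  # the complex model breaks out-of-sample
```

The interpolator's apparent predictive power is exactly the self-deception the quants describe: it is an artefact of the training data, and it collapses as soon as the model sees fresh observations.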
For developers, let alone users, of machine learning models, it can be difficult not to be swayed by what at first glance appears to be a high-performing model. However, what looks like predictive power is often in fact overfitting. Being sceptical about a model’s output rather than naïvely trusting it arguably helps to mitigate the risk of being deceived by the model. This is how Alan Smith (pseudonym), a machine learning specialist, diagnosed the problem. Alan worked in Matt’s team at the brokerage firm, where he experimented with applying machine learning techniques to different financial problems, including the optimal execution problem. He aptly summarised the problem of overfitting and explained how it could lead to self-deception on the part of model developers and users if they were not careful:

A big mistake lots of people make is that they always want to use something really complex to solve a problem that does not need it. You always want to default the model with the least complexity. You do not want to be fitting to an artefact in the dataset. You just want to make sure that what you are seeing is real and you are not actually kidding yourself with any kind of predictive power.
The importance of simplicity in modelling was further emphasised by another quant, Brian Peterson, who had many years of experience applying machine learning to financial and risk management problems. Brian now leveraged his expertise in the Chicago-based algorithmic trading firm Braverock Investments and as a lecturer at the University of Washington (computational finance and risk management) and University of Illinois (financial engineering). Brian explained that he too resorted to the simple model as the default option:

I always start with the simplest model that I think will be minimally informative, and then use that as a benchmark. We will, for example, take a logistic regression model and use it as our benchmark model, and then do experiments based on [the following questions]: Does adding this factor help? Does using a more sophisticated model help? The more sophistication I add to the model, the more model risk I add to the strategy. […] We always try to start with the simplest model we think will work and move very, very carefully to more complex models.
To prevent adding more than manageable model risk without compromising model accuracy, Brian aimed for parsimony in modelling:

It is very difficult to beat a parsimonious model. All other things being equal, I always prefer the model that has fewer parameters to the model that has more parameters, because the model with fewer parameters is far less likely to break out-of-sample. But that is not always possible. Sometimes I need a large feature space to describe the data that is in front of me. However, given the choice, if I have two otherwise equivalent models, if I cannot statistically distinguish between the models and the predictions that come out of those models, I will always choose the simpler one, because otherwise the chance that I am going to fit on noise is really, really high.
As noted by Domingos (2012), simplicity is valuable in machine learning modelling because ‘simplicity is a virtue in its own right, not because of a hypothetical connection with accuracy’ (86). Simplicity is ‘a goal in itself’ insofar as it is considered a ‘proxy for comprehensibility’ (Domingos, 1999: 421). As the complexity of models used in finance increases, quants become increasingly preoccupied with questions about comprehensibility and explainability (Bracke et al., 2019). Firms want to be able to know, explain, and communicate what their models are doing, especially if something goes wrong (World Economic Forum, 2019). Investors can, for example, quickly lose faith in and abandon an asset manager who cannot convincingly explain their model in periods of weak performance (Kollo, 2019: 5). Thus, Ockham’s razor functions as an enabler of explainability rather than a guarantor of accuracy.
The problem concerning the comprehensibility and explainability of machine learning models was addressed, in relation to accountability, by Ernest Baver, senior research scientist at Hutchin Hill Capital. Ernest explained how the sheer complexity of machine learning models was hindering the ability to attribute a proper cause in the case of a model misstep:

If you use a more intuitive model and something does not work right, then you have a simple way to do attribution analysis. You can actually figure out what went wrong… If you are using complicated nonlinear models, it is harder to say what went wrong. There is no simple way to decompose a deep learning structure into simple blocks.
Several other interviewees used the distinction between intuitive and non-intuitive models as a way of framing the models they applied. Some emphasised the importance of having an ‘intuitive feel’ for the model, which basically means they are aware of what the model is doing. Having an intuitive understanding of the model’s operations also means knowing when and how to intervene if the model suddenly begins to operate in discord with the modeller’s original intent. An intuitive feel for the model thus sustains a ‘qualitative overlay’ (Svetlova, 2012), which allows for judgment calls from traders, portfolio managers, quants, and others. However, machine-learning algorithms and, to some extent, automation, redefine the role of the qualitative overlay or, to use another term, the human overlay.
Automating Ockham’s razor or reimagining the human element
Ockham’s razor is our friend! We realised that we had to teach the machine to be the scientist and to prefer simpler theories. We look at all theories. We look at how much machine learning went in and how much overfitting the algorithms can do and then judge them accordingly.
These are the words of chief scientist Tom Cleverly (pseudonym) of a US-based hedge fund with a strong commitment to machine learning and automation. In the previous section, simplicity is regarded as a framing mechanism used in the modelling process and thus, is something that takes place ‘outside’ the model (Svetlova and Dirksen, 2014: 572). However, in this hedge fund’s case, Ockham’s razor is written into the machine learning system itself. Tom’s rationale for implementing simplicity preference in the system is that if Ockham’s razor, as proposed by Box, is the hallmark of the great scientist, then a good model must be capable of applying that principle. According to Tom, this built-in simplicity preference constitutes ‘meta machine learning’, that is, a layer of machine learning operating on top of a signal detection machine learning model. Whilst the basic model processes market data in search of signals to trade on, the meta-layer ensures only simple and robust strategies are pursued. The purpose of automating Ockham’s razor is basically to prevent the system from overfitting and thereby producing unreliable signals. Tom further emphasised that the meta-machine learning layer automates the actual investment decision process, which in most hedge funds takes place in an investment committee. Normally, the investment committee is composed of a group of high-level people from the firm and its job is basically to determine the fund’s direction. Automating the investment committee function thus eliminates human bias from the strategy- and decision-making process. ‘There is no human layer on top,’ insisted Tom. In other words, the qualitative human overlay is replaced by a mechanised layer of machine learning.
Modellers use various techniques to offset or mitigate the risk of overfitting. One of these techniques is regularisation. Regularisation penalises model complexity during training and thereby discourages selecting a more complex model. Only if the complex model significantly outperforms the simple model (determined by some prefixed value) does the machine learning system switch to the more complex one (Arnott et al., 2018: 13–14). The meta-layer on the hedge fund’s machine learning system can be regarded as a sophisticated regularisation tool. The system is built in accordance with a scientific ideal which says simpler, more economical descriptions of a phenomenon are preferable to more complex ones. However, when Ockham’s razor is automated, it is no longer a framing mechanism utilised by quants in the model development process: it reduces the heuristic to an algorithmic operation, not contingent on any intervention by human judgment. Making Ockham’s razor a fixed principle presupposes the very relationship between simplicity and accuracy that Domingos (2012) describes as ‘hypothetical’ (86). Although automating Ockham’s razor does leverage the procedural qualities of the principle, it simultaneously eliminates the possibility that quants may consider if the practical concern (i.e. the specific problem in a given market context) in fact requires a more complex technical solution.
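The switch-only-if-significantly-better rule described by Arnott et al. can be sketched as a tiny selection routine. The helper name, the candidate list, and the threshold value are all hypothetical; the point is only the logic of a prefixed margin that a more complex model must clear.

```python
def select_model(candidates, threshold=0.05):
    """Pick the least complex candidate unless a more complex one beats the
    incumbent's validation error by at least `threshold` (a prefixed margin).

    candidates: list of (complexity, validation_error) pairs, any order.
    """
    ranked = sorted(candidates)          # simplest model first
    best_complexity, best_err = ranked[0]
    for complexity, err in ranked[1:]:
        if best_err - err > threshold:   # must outperform by the prefixed value
            best_complexity, best_err = complexity, err
    return best_complexity

models = [(1, 0.30), (2, 0.28), (3, 0.18)]
print(select_model(models))  # the 3-parameter model wins only because it clears the margin
```

The 2-parameter model improves on the 1-parameter benchmark by only 0.02 and is rejected; the 3-parameter model improves by 0.12 and is accepted. In effect, this hard-codes the default-to-simplicity preference that the quants quoted above apply by judgment.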
In general, the objective of using machine learning in finance is to discover reliable patterns and structures in the noisy, adaptive, and deeply entangled reality of financial markets. Because financial markets are anything but simple, the simplest model is not always the optimal solution when dealing with very intricate problems. Instead of perceiving Ockham’s razor as a truth claim, seeking simplicity in modelling may rather be a means for enabling quants to activate their judgment and expertise when working on models. It is a means to direct and control the model development process; it is not an end. As Domingos (1999) argues, modellers should ‘seek to constrain induction using domain knowledge, and decouple discovering the most accurate (and probably quite complex) model from extracting comprehensible approximations to it’ (421). Thus, the quant’s domain knowledge is considered a decisive factor in constraining complexity and certainly has not been made redundant by automation. As Beverungen and Lange (2018) argue in their study of distributed cognition and automation in HFT, a crucial part of being an HF trader is to ‘establish control, awareness and consciousness of what is going on’ when their algorithms trade in the markets (84). Something similar can be said about machine learning quants, who use simplicity to control and thus manage complexity when devising models.
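Domingos’s suggestion to decouple discovering the most accurate model from extracting comprehensible approximations to it can be illustrated with a surrogate-model sketch: a complex fit is made first, and a simpler, human-readable proxy is then fitted to the complex model’s predictions rather than to the raw data. Everything below (the synthetic data, the polynomial degrees standing in for ‘complex’ and ‘simple’ models, and the fidelity measure) is a hedged illustration, not a description of any system discussed in this article.

```python
import numpy as np

# Synthetic data: a nonlinear signal plus noise (illustrative assumption).
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=400)
y = np.sin(x) + rng.normal(scale=0.2, size=400)

# Step 1: the 'most accurate (and probably quite complex) model' --
# a degree-9 polynomial stands in for an opaque learner.
complex_coefs = np.polyfit(x, y, 9)
complex_pred = np.polyval(complex_coefs, x)

# Step 2: the 'comprehensible approximation' -- a cubic surrogate fitted
# to the complex model's predictions, so a human inspects four
# coefficients instead of the full model.
surrogate_coefs = np.polyfit(x, complex_pred, 3)
surrogate_pred = np.polyval(surrogate_coefs, x)

# Fidelity: the share of the complex model's behaviour the surrogate captures.
ss_res = np.sum((complex_pred - surrogate_pred) ** 2)
ss_tot = np.sum((complex_pred - complex_pred.mean()) ** 2)
fidelity = 1 - ss_res / ss_tot
print(f"surrogate fidelity to complex model: {fidelity:.3f}")
```

The point of the decoupling is that the accurate model does the predicting while the surrogate does the explaining; the quant’s judgment enters in deciding whether the surrogate’s fidelity is high enough for the approximation to be trusted.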
Although the human element – the individual market participant’s domain knowledge and judgment – has not been rendered obsolete by machine learning, quants are forced to exercise their reasoned judgment differently. This alleged shift in how quants are supposed to utilise their expert judgment was addressed by Mahmood Noorani, the founder and CEO of the London-based analytics provider Quant Insight. Having worked in trading and portfolio management since the 1990s, Mahmood experienced first-hand the proliferation of increasingly sophisticated algorithms and models in the financial industry. Although he embraced the opportunities created by recent leaps in artificial intelligence and machine learning research, his view on applying learning algorithms in investment management was marked by a profound faith in expert judgment. According to Mahmood, the non-static nature of financial markets combined with the scope and adaptability of the most sophisticated models used in the industry only made ‘the human overlay’ more indispensable:

The problem is that you cannot have a controlled repeated experiment in finance … If you are going to build machines to trade markets, one thing you still have to have is domain expertise, because the big issue on the quantitative side is the so-called overfitting issue. If you build a machine that goes and takes a vast amount of data and try to understand an asset price, you can get these spurious associations where rainfall in Brazil seems to be explaining, with a 1-week lag, the movement of the S&P 500. That is just wrong. You have to have a human overlay to understand causality. A couple of years back, we were trying to build a model of crude oil prices … [T]he question was: ‘should we have global inflation expectations as one of the explanatory variables for crude oil?’ We decided not to, because it should actually go the other way around. Changes in crude oil prices shift inflation activity.
Changes in inflation expectations do not shift crude oil prices. You need to have a human, who understands how the relationships work. I do not think we are anywhere close to the point where the machine is going to understand that causality … [T]here still is a place for the human element, but it is in guiding and training the machine and choosing the right data, rather than going straight to the investment decision.
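The overfitting problem Mahmood describes, spurious lagged associations surfacing from large datasets, can be reproduced with purely synthetic data. The sketch below scans several hundred independent noise series for a 1-week-lagged correlation with an equally random ‘market’ series; the sample sizes and variable names are illustrative assumptions, and no real rainfall or index data are involved.

```python
import numpy as np

# A minimal sketch of the 'spurious association' problem: with enough
# candidate predictors, some lagged correlation looks strong purely by
# chance, even though every series here is independent noise.
rng = np.random.default_rng(42)
n_weeks, n_candidates = 104, 500           # two years of weekly observations

market = rng.normal(size=n_weeks)          # stand-in for weekly index returns
candidates = rng.normal(size=(n_candidates, n_weeks))

def lagged_corr(x, y, lag=1):
    """Correlation between x at time t-lag and y at time t."""
    return np.corrcoef(x[:-lag], y[lag:])[0, 1]

corrs = np.array([lagged_corr(c, market) for c in candidates])
best = np.abs(corrs).max()
# The best candidate 'explains' the market with a sizeable lagged
# correlation -- a pattern a naive learner would happily trade on.
print(f"strongest spurious lagged correlation: {best:.2f}")
```

A purely data-driven system has no way of knowing that the winning series is noise; ruling it out requires exactly the causal domain knowledge the human overlay supplies.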
Conclusion
This article’s main objective was to delve into some of the challenges associated with applying machine learning techniques in financial securities trading and investment management. Specifically, the goal was to investigate how quants strive to maintain a certain level of comprehensibility and control in response to the model complexity resulting from the proliferation of machine learning techniques in the financial industry. I used the theoretical perspective of distributed cognition to capture the human–model dynamic and examine ways cognitive tasks are distributed to machine learning-powered financial models. This theoretical framing allowed for an analysis of the organisation of distributed cognitive systems where machine learning techniques are employed as technical aids. The study contributes to research on the relationship between and interaction of humans and models in finance and beyond by examining how this relationship and interaction dynamic is shaped by applying emerging technologies – in this case machine-learning algorithms used for investment management and trading.
The article examines how quants attempt to manage the inevitable complexity of machine-learning algorithms by resorting to simplicity as a virtuous rule of thumb in model development and model implementation processes. Quants consider simplicity, or Ockham’s razor, a heuristic that helps them manage and control machine learning model complexity. Applying Ockham’s razor in their modelling practices, quants seek to ensure their models are comprehensible, interpretable, and thus, explainable. Hence, Ockham’s razor helps frame the modelling process in a way that makes it more foreseeable and provides the quant with a certain degree of control, which is also a way of fortifying accountability (Goodman and Flaxman, 2017). Moreover, the study suggests that comprehensibility and explainability become increasingly important factors in quantitative investment and algorithmic trading for the simple reason that the most sophisticated models used in finance – such as multi-layer artificial neural networks – comprise elements that cannot be fully comprehended and thus explained. Rather than being able to account for every detail in learning models, comprehension is a matter of understanding the model’s logic and being able to interpret its output. The ability to interpret and explain complex models is particularly important when faced with a model mishap.
Besides securing explainability, the simplicity preference functions as a safeguard against the risk of models learning from noise in the data rather than from information. Quants’ concerns about the risk of models overfitting on noise are the primary reason they consider simplicity a virtue. Simpler models with fewer parameters are less likely to overfit and thus produce incorrect predictions. The quants’ preoccupation with Ockham’s razor should not be dismissed as blind reliance on a contested yet resilient scientific principle, but rather understood as an expression of the crucial importance human judgment still has in machine-driven finance. By demonstrating how quants exercise judgment when dealing with complex machine learning models, the article contributes to pertinent academic discussions of the human role in the increasingly automated world of high finance. In electronic markets, trading automation has reshaped the work of traders: they seldom trade manually, but rather observe algorithms trading and intervene if an algorithm does not act as intended. Similarly, machine learning is changing the role of the quant. The quant’s job is to ensure models are maximally capable of determining how, what, and when to trade. Machine learning quants train and sustain automated decision-makers. In other words, they are responsible for the organisation and facilitation of socio-technical systems of distributed cognition.
Hence, the domain knowledge possessed by an individual expert is not made redundant by the propagation of adaptive machine-learning algorithms, but is used for guiding and ensuring stable, accurate, and transparent model performance. It is arguably not in spite of but because of the presence of learning algorithms that a human element is necessary in quantitative finance.
There is a need for further studies of the ways automation and the extended use of machine learning techniques shape model use practices in trading and investment management and, perhaps, to reconceptualise judgment as an action undertaken by socio-technical assemblages rather than by an individual aided by machines. Another potential avenue for further research presents itself in the study of the subjective factors, that is, the values shaping practices and decision-making in machine learning-driven quant finance. As Kuhn (1977) pointed out more than 40 years ago, decision-making in scientific practice depends on a ‘mixture of objective and subjective factors’ (106; cf. Laudan, 1984). Considerations around theory or model accuracy, simplicity, comprehensibility, etc. – such as those presented in this article – are presumably always influenced by ‘non-epistemic values’ (Douglas, 2000). The applied science conducted by machine learning quants (i.e. the application of theories and techniques from computer science, mathematics, physics, etc. to financial problems) hinges on collaborative efforts (Rolin, 2015): negotiations between teams of quants on one side and traders, portfolio managers, firms’ fiduciary responsibilities, etc. on the other. All these internal and external factors set the scene in which the practices of model development and model use play out, and they make these practices heavily value-laden.
