Introduction
Bitcoin has become the most famous decentralized cryptocurrency since it was first outlined by Nakamoto (2008). Although Bitcoin was originally designed as a peer-to-peer payment system (Nakamoto, 2008), it has long been regarded as a financial asset or an investment tool. In addition to articles discussing the nature of Bitcoin (Baek & Elbeck, 2015; Baur et al., 2018; Kwon, 2020; Yermack, 2015), a large stream of the literature has explored its financial features, including its price (Baig et al., 2019; Corbet et al., 2018; Hu et al., 2019; Li et al., 2020), return and volatility (Baur et al., 2018; Bouri, Azzi, & Dyhrberg, 2017; Bouri, Jalkh, et al., 2017; Bouri, Molnár, et al., 2017; Dyhrberg, 2016a, 2016b; Klein et al., 2018; Thies & Molnár, 2018), market efficiency (Bariviera, 2017; Kurihara & Fukushima, 2017; Nadarajah & Chu, 2017; Urquhart, 2016), connections with traditional financial markets and within cryptocurrency markets (Andrada-Félix et al., 2020; Antonakakis et al., 2019; Borri, 2019; Kurka, 2019; Li et al., 2020; Nguyen et al., 2020), linkages between price or return and transaction activity on the blockchain (Ante & Fiedler, 2020; Koutmos, 2018a, 2018b), and intraday dynamics based on high-frequency data (Bariviera et al., 2018; Eross et al., 2019; Zargar & Kumar, 2019). Most of these studies focus on the performance of Bitcoin using daily or lower-frequency prices rather than on the market microstructure of this new financial asset.
Recent research has examined the microstructure of the Bitcoin market from the viewpoint of order flow or order imbalance using historical trade data. For instance, Dimpfl (2017) inferred Bitcoin trade directions from tick data downloaded from bitcoincharts.com. Feng et al. (2018) employed normalized order imbalance to measure informed trading in the Bitcoin market using microstructure data from bitcoincharts.com. Wang et al. (2020) studied the impact of informed trading, indexed by order imbalance, on Bitcoin return and volatility using data from the same source as Dimpfl (2017) and Feng et al. (2018). Ibikunle et al. (2020) also utilized order imbalance in research on Bitcoin price discovery based on historical trade data from Bitstamp. On one hand, microstructure data that identify the trade initiator are not always available, so trade directions must be assigned by classification algorithms. As far as we know, the tick rule has been applied to classify historical trade data in the Bitcoin market by Dimpfl (2017), Feng et al. (2018), Wang et al. (2020), and Ibikunle et al. (2020). On the other hand, the classification accuracy of the tick rule in the Bitcoin market, which consists of multiple online crypto exchanges, remains unknown.
This study investigates the accuracy of the tick rule in the Bitcoin market. The tick-level Bitcoin data commonly used in research on the Bitcoin market microstructure, which are freely downloadable from bitcoincharts.com, do not indicate the trade initiator. However, Bitcoin/US Dollar (USD) market data with a trade direction indicator from Kaiko provide an opportunity to test the accuracy of the tick rule in the Bitcoin market. This study focuses on three main questions. First, what is the classification success rate of the tick rule for Bitcoin market data? Second, what factors are associated with the classification accuracy of the tick rule in the Bitcoin market? Third, how accurate are order imbalances based on the classification results of the tick rule in the Bitcoin market?
The primary results of this study are as follows. First, classification of 11.9 million Bitcoin/USD trades on Bitstamp shows that the overall accuracy of the tick rule is 76.87%, close to that reported for stock markets, and that daily accuracy ranges from 68.98% to 83.76% during the sample period. Second, the likelihood of misclassification increases with the time elapsed between trades. As information spillover exists across Bitcoin exchanges (Brandvold et al., 2015), longer gaps between trades allow more information to arrive and drive price changes. Third, all of the order imbalances computed using the tick rule differ significantly from the true ones, although the biases of order imbalances calculated from large-size trades are smaller than those calculated from the whole sample.
The empirical findings of this study contribute to the small but ongoing exploration of the Bitcoin market microstructure in terms of research methodology. To the best of our knowledge, this is the first study to assess the classification accuracy of the tick rule in the Bitcoin market. Recently, tick-based classification methods have been used in research on the Bitcoin market microstructure (Dimpfl, 2017; Feng et al., 2018; Ibikunle et al., 2020; Wang et al., 2020) without knowledge of the classification accuracy of the tick rule. The conclusions from this study can thus serve as a guide for future research related to the methodological approach to be applied to the Bitcoin market microstructure when the transaction direction is not available.
This study also contributes to the existing literature on trade classification algorithms. With the exceptions of Aktas and Kryzanowski (2014) and Carrion and Kolay (2020), prior examinations of the tick rule's classification accuracy were conducted in slower trading environments, that is, before high-frequency trading became widespread. Using tick data stamped to seconds from the Bitcoin market, this article confirms the finding of Carrion and Kolay (2020) that the classification accuracy of the tick rule remains similar to that observed in slower trading environments.
The remainder of this article is organized as follows. “Literature Review” section briefly reviews the literature on the tick rule. “Classification Accuracy” section reports on the empirical analysis of the classification accuracy of the tick rule. “Order Imbalance” section presents the biases of order imbalances based on trade directions assigned by the tick rule. “Discussion” section discusses the empirical findings of this study. Finally, “Conclusions” section concludes the article.
Literature Review
Trade classification algorithms are commonly used to discern trade intentions when trade direction is not available in market data. Popular methods include the tick rule (or tick test), the quote rule, the Lee–Ready algorithm proposed by Lee and Ready (1991), and bulk volume classification (Easley et al., 2012, 2016). The tick rule assigns trade direction based on price movements when no quote data are available: if the transaction price is higher than the previous transaction price (an uptick), the trade is classified as buyer-initiated; if the price is below the latest price (a downtick), the trade is classified as seller-initiated; and if there is no price change (a zerotick), the trade is assigned the same direction as the previous one. The quote rule classifies a trade as buyer-initiated (seller-initiated) if the price is above (below) the midpoint of the bid and ask. The Lee–Ready algorithm (Lee & Ready, 1991) combines the two rules, applying the tick rule to trades executed at the midpoint that the quote rule leaves unclassified. Finally, bulk volume classification (Easley et al., 2012, 2016) uses the empirical distribution of price changes to infer the probability that the aggregate volume of each bar is buyer-initiated (seller-initiated). As transaction direction is not always available, classification algorithms are essential for inferring trade intentions in market microstructure research.
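To make the tick rule concrete, the following Python sketch implements the classification logic described above. It is an illustrative implementation written for this discussion, not code from any of the cited studies.

```python
def tick_rule(prices):
    """Assign trade directions with the tick rule.

    Returns +1 (buyer-initiated) on an uptick, -1 (seller-initiated)
    on a downtick, and the previous direction on a zerotick. The first
    trade (and any leading zeroticks) cannot be classified and is
    marked 0 here.
    """
    directions = []
    last_dir = 0  # unknown until the first price change
    for i, p in enumerate(prices):
        if i == 0:
            d = 0
        elif p > prices[i - 1]:      # uptick
            d = 1
        elif p < prices[i - 1]:      # downtick
            d = -1
        else:                        # zerotick: inherit previous direction
            d = last_dir
        directions.append(d)
        if d != 0:
            last_dir = d
    return directions

# Example: first trade, uptick, zerotick, downtick, zerotick
print(tick_rule([100.0, 100.5, 100.5, 100.2, 100.2]))
# [0, 1, 1, -1, -1]
```

Note that the zerotick case makes the rule path-dependent: a run of unchanged prices inherits the direction of the last price change.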
The tick rule is usually used in research related to market microstructure when only trade data are available. For example, Barber et al. (2009) employed the tick rule to identify directions of partial trades. Bernile et al. (2016) also used the tick rule to compute order imbalance, which measures informed trading activity ahead of the Federal Open Market Committee’s policy announcements. Dimpfl (2017), Feng et al. (2018), Wang et al. (2020), and Ibikunle et al. (2020) used the tick rule to assign directions to measure informed trading activity in the Bitcoin market.
Although the tick rule has been used in existing studies, the accuracy of the resulting classifications is a cause for concern. Theissen (2001) noted that market microstructure research can be systematically biased by inaccurate trade classification. The accuracy of the tick rule has therefore been examined extensively in stock markets. Using TORQ data for the U.S. stock market, Odders-White (2000) reported that 78.6% of transactions were correctly classified by the tick rule. Tests by Finucane (2000) using the TORQ database from November 1990 to January 1991 showed a classification accuracy of 83.0% for a sample of 144 NYSE firms. Ellis et al. (2000) utilized NASDAQ data from September 27, 1996, to September 29, 1997, and found a success rate of 77.66% for the tick rule. Tests by Chakrabarty et al. (2007) on NASDAQ stocks traded on INET and ArcaEx revealed an overall success rate of 75.4% during the sample period. A recent examination by Carrion and Kolay (2020) showed a classification success rate of 78.62% in a sample of data stamped to seconds from the NASDAQ HFT database over a subset of dates during 2008 to 2010, when trades were more frequent than before; the success rate ranged from 69.75% to 83.34% across individual stocks. Similar studies have also examined non-U.S. stock markets. Aitken and Frino (1996) studied 2 years of trades on the Australian Stock Exchange and reported a success rate of 74% for the tick rule. Using 15 stocks on the Frankfurt Stock Exchange in 1996, Theissen (2001) documented that the tick rule correctly classified 72.2% of transactions. An examination of classification algorithms on data from the Taiwan Stock Exchange by Lu and Wei (2009) revealed an overall success rate of 74.18% for the tick rule.
Aktas and Kryzanowski (2014) examined the accuracy of different classification algorithms using data on component firms of the BIST-30 index and found that the success rate of the tick rule ranged from 84.86% to 92.15% across subsamples. In addition, Omrane and Welch (2016) found a classification success rate of about 68% for 1.2 million trades on a foreign exchange electronic communication network. To the best of the authors' knowledge, no previous research has examined the accuracy of trade classification algorithms in cryptocurrency markets.
Classification Accuracy
Data and Methodology
Historical market data are required to test the accuracy of the tick rule in the Bitcoin market. Owing to decentralization and the lack of regulation, Bitcoin is traded simultaneously and continuously (24/7) on multiple online crypto exchanges. To examine the classification accuracy of the tick rule, this study uses market data from Bitstamp, an order-driven online crypto exchange whose liquidity makes it representative of the market. Founded in Europe in 2011, Bitstamp is one of the oldest and largest global crypto exchanges offering trading between Bitcoin and USD. The tick-by-tick trade data of Bitstamp were acquired from Kaiko, a cryptocurrency market data provider. Every transaction record includes a unique trade ID, a timestamp, the transaction price in USD, the amount in Bitcoin, and a trade direction indicator. This study includes classifiable trades from December 6, 2017, to October 7, 2018 (Greenwich Mean Time), a total of 11,919,298 observations. Because the trades are stamped to seconds, some trades share the same timestamp owing to fast trading; this study therefore uses the trade ID to determine the order of such trades.
The detailed classification process of the tick rule applied to a specific tick $t$ can be written as

$$D_t = \begin{cases} +1 & \text{if } p_t > p_{t-1} \\ -1 & \text{if } p_t < p_{t-1} \\ D_{t-1} & \text{if } p_t = p_{t-1}, \end{cases}$$

where $D_t$ denotes the assigned trade direction ($+1$ for buyer-initiated, $-1$ for seller-initiated) and $p_t$ is the transaction price of tick $t$.

As defined in the previous literature, the classification accuracy is

$$\text{Accuracy} = \frac{N_{\text{correct}}}{N_{\text{total}}},$$

where $N_{\text{correct}}$ is the number of trades whose assigned direction matches the true initiator and $N_{\text{total}}$ is the total number of classified trades.
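The accuracy computation can be illustrated end to end with a small script: sort trades by trade ID (as described above for same-second timestamps), assign directions with the tick rule, and compare them against the recorded initiators. The trade records below are invented for illustration.

```python
# Each trade is (trade_id, price, true_direction); data are made up.
trades = [
    (3, 100.5, 1), (1, 100.0, 1), (2, 100.5, 1),
    (4, 100.2, -1), (5, 100.2, 1),
]
trades.sort(key=lambda t: t[0])            # order trades by trade ID
prices = [p for _, p, _ in trades]
true_dirs = [d for _, _, d in trades]

assigned = []
last_dir = 0
for i, p in enumerate(prices):
    if i == 0:
        d = 0                              # first trade: unclassifiable
    elif p > prices[i - 1]:
        d = 1                              # uptick
    elif p < prices[i - 1]:
        d = -1                             # downtick
    else:
        d = last_dir                       # zerotick
    assigned.append(d)
    if d:
        last_dir = d

# Accuracy over classifiable trades only
pairs = [(a, t) for a, t in zip(assigned, true_dirs) if a != 0]
accuracy = sum(a == t for a, t in pairs) / len(pairs)
print(f"accuracy = {accuracy:.2%}")
```

In this toy sample, three of the four classifiable trades are labeled correctly, so the script prints 75.00%.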
Classification Accuracy
Table 1 reports the classification accuracy of the tick rule for directions in Bitcoin transactions. According to the Bitstamp market data, true seller-initiated (buyer-initiated) orders account for 41.50% (58.50%) of all trades, whereas 45.84% (54.16%) of all trades are classified as seller-initiated (buyer-initiated) orders by the tick rule. Furthermore, the tick rule wrongly classifies 9.40% of all trades as buyer-initiated orders and 13.74% of all trades as seller-initiated orders. Thus, the overall classification success rate of the tick rule on the Bitstamp market data during the sample period is 76.87%.
Classification Accuracy for Trade Directions.
Figure 1 shows the daily change in the misclassification rate. The misclassification rate of the tick rule varies over time, ranging from 16.24% to 31.02%, which means that the daily success rate of the tick rule on the Bitcoin/USD transaction data of Bitstamp ranges from 68.98% to 83.76%.

Daily misclassification rate.
Table 2 displays the misclassification rate for each of the three tick types. The misclassification rates for uptick, downtick, and zerotick are 16.92%, 26.76%, and 25.28%, respectively, indicating that the classification success rate of the tick rule for upticks is much higher than that of the other tick types. In addition, the misclassification of zeroticks contributes the most to the total errors.
Classification Accuracy for Ticks.
Misclassified Direction
Figure 2 displays the daily proportion of seller-initiated orders. Although the statistics in Table 1 show that misclassified seller-initiated orders are fewer than misclassified buyer-initiated ones overall, Plot A in Figure 2 illustrates that the proportion of seller-initiated orders among misclassified trades, defined as the number of intraday misclassified seller-initiated orders divided by the number of intraday misclassified trades, changes from day to day and is not always smaller than that of buyer-initiated orders. Plot B presents the proportion of seller-initiated orders among intraday trades, defined as the number of intraday seller-initiated orders divided by the number of intraday trades. The Pearson (Spearman) correlation coefficient of the two time series is 0.76 (0.73). Accordingly, the share of seller-initiated orders among misclassified trades is positively associated with the proportion of seller-initiated trades in the sample.

Proportion of seller-initiated orders: (A) daily change of misclassified sell/misclassified sample; (B) daily change of sell/total.
Classification Under Different Market Conditions
Table 3 reports the classification accuracy of the tick rule during two subperiods to examine whether Bitcoin market conditions affect the accuracy of this classification algorithm. Panel A displays the classification accuracy during the first subperiod, from December 7, 2017, through March 31, 2018, which covered a strong bull market in December 2017 and the subsequent crash in the Bitcoin price in January 2018. The tick rule wrongly classifies 9.59% of all trades as buyer-initiated orders and 13.57% of all trades as seller-initiated orders; that is, the classification accuracy of the tick rule is 76.84% during the first subperiod. Panel B displays the classification accuracy for the second subperiod, from April 1, 2018, to October 7, 2018, during which the Bitcoin price was relatively stable. The classification accuracy of the tick rule is a similar 76.90% during the second subperiod. Hence, market conditions do not materially affect the overall classification accuracy of the tick rule.
Classification Accuracy During Subperiods.
Multivariate Analysis
To analyze the variables associated with misclassification, this study draws from Ellis et al. (2000) and examines the following four variables: true trade direction, time from previous trade, trade size in Bitcoin, and price in USD. In addition, as trades are stamped to seconds, the time gaps of trades occurring at the same timestamp are set at zero.
Table 4 reports the distribution of the misclassification rate of the tick rule across subsamples. Panel A divides all trades into four groups according to the time elapsed since the previous trade. In the first group, where trades occur within 5 s of the previous trade, 22.19% of trades are misclassified, whereas 33.14% of trades occurring more than 60 s after the previous trade are misclassified. Panel B divides all trades into four groups based on trade size in Bitcoin. The classification success rate is highest when the order amount is no greater than 0.01 Bitcoin, but no monotonic relation is observed between trade size and classification success. Finally, Panel C divides all trades into four groups based on the executed price in USD. Notably, 21.57% of trades in the highest price group are misclassified, compared with 23.79% in the lowest price group. Because these variables originate from the same data, they may be correlated; further study of the relationship between misclassification and time from previous trade, trade size, and price level is thus needed.
Time, Amount, Price, and Misclassification.
Table 5 reports the results of multivariate regressions, including ordinary least squares (OLS) and logistic regressions. For the full sample, the regressions indicate that a seller-initiated order, the trade amount, and the price are negatively associated with the likelihood of misclassification, while the time from the previous trade is positively associated with it. The subsample results, however, do not fully align with those of the whole sample. When the time from the previous trade is no longer than 5 s, the likelihood of misclassification is positively associated with a seller-initiated order and the time from the previous trade, and negatively associated with trade size and price level. When trade size is no more than 0.1 Bitcoin, the likelihood of misclassification increases with all four independent variables, whereas for executed prices of no more than 10,000 USD, the likelihood of misclassification decreases with price. In all regressions, the likelihood of misclassification is positively associated with the time between trades.
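The linear probability specification behind such regressions can be sketched as follows. The snippet fits an OLS regression of a misclassification dummy on the four covariates using simulated data; the variable names, distributions, and coefficients are illustrative assumptions, not the paper's estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Simulated covariates (illustrative only): seller-initiated dummy,
# time gap from previous trade (s), trade size (BTC), price (USD).
sell = rng.integers(0, 2, n)
gap = rng.exponential(10.0, n)
size = rng.exponential(0.5, n)
price = rng.normal(8000.0, 500.0, n)

# Simulated misclassification indicator whose probability rises
# with the time gap, mimicking the sign reported in Table 5.
p = 1.0 / (1.0 + np.exp(-(-1.5 + 0.05 * gap)))
missed = rng.binomial(1, p)

# Linear probability model: missed ~ const + sell + gap + size + price
X = np.column_stack([np.ones(n), sell, gap, size, price])
beta, *_ = np.linalg.lstsq(X, missed, rcond=None)
for name, b in zip(["const", "sell", "gap", "size", "price"], beta):
    print(f"{name:>6}: {b: .6f}")
```

Because the simulated misclassification probability increases with the time gap, the estimated coefficient on `gap` comes out positive, matching the direction of the paper's finding.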
Multivariate Regression Results.
*, **, and *** denote statistical significance at the 5%, 1%, and 0.1% levels, respectively.
Order Imbalance
Order imbalance is usually employed as a measure of informed trading activity, which cannot be observed directly. Order imbalance here is defined as $(B - S)/(B + S)$, where $B$ and $S$ denote the buyer- and seller-initiated totals, measured by the number of trades (OIN), the trade size in Bitcoin (OIS), or the volume in USD (OID).
Figures 3 to 5 present daily order imbalances measured by OIN, OIS, and OID, respectively. Plots A and C in Figure 3 show the true OINs, calculated from the true trade directions, and the OINs calculated from directions assigned by the tick rule, respectively. Plot E in Figure 3 shows the bias of the estimated OIN, defined as the difference between the true OIN and the OIN estimated using the tick rule. Plots B, D, and F in Figure 3 apply the same method to large-size trades. The two lines in Plots E and F mark the values of −0.1 and 0.1. The proportions of OIN biases whose absolute value exceeds 0.1 are 148/306 ≈ 48.37% in Plot E and 74/306 ≈ 24.18% in Plot F. Figures 4 and 5 report the daily order imbalances measured by OIS and OID, respectively, using the same method. The corresponding proportions of biases whose absolute value exceeds 0.1 are 70/306 ≈ 22.88% (OIS) and 73/306 ≈ 23.86% (OIS95) in Figure 4, and 69/306 ≈ 22.55% (OID) and 72/306 ≈ 23.53% (OID95) in Figure 5. In general, all these order imbalance measures are biased to a certain degree.
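A minimal sketch of how the three imbalance measures can be computed from signed trades, assuming the standard normalized form (buy − sell)/(buy + sell); the trades below are toy values, not sample data.

```python
def order_imbalances(directions, sizes, prices):
    """Compute OIN, OIS, and OID from signed trades.

    directions: +1 buyer-initiated, -1 seller-initiated.
    sizes: trade sizes in Bitcoin; prices: execution prices in USD.
    Each measure is (buy - sell) / (buy + sell) under its own weighting.
    """
    def imbalance(weights):
        buy = sum(w for d, w in zip(directions, weights) if d == 1)
        sell = sum(w for d, w in zip(directions, weights) if d == -1)
        return (buy - sell) / (buy + sell)

    oin = imbalance([1.0] * len(directions))                 # trade counts
    ois = imbalance(sizes)                                   # BTC volume
    oid = imbalance([s * p for s, p in zip(sizes, prices)])  # USD volume
    return oin, ois, oid

oin, ois, oid = order_imbalances(
    directions=[1, 1, -1, 1],
    sizes=[0.5, 1.0, 2.0, 0.5],
    prices=[8000.0, 8100.0, 8050.0, 8200.0])
print(oin, ois, oid)
```

In this toy example three of four trades are buys (OIN = 0.5), but the single sell is large, so OIS is 0 and OID is close to 0: the three weightings can disagree on the same trades.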

Order imbalance based on number of trades (OIN): (A) true OIN; (B) true OIN95; (C) OIN estimated using tick rule; (D) OIN95 estimated using tick rule; (E) bias of OIN; (F) bias of OIN95.

Order imbalance based on trade size (OIS): (A) true OIS; (B) true OIS95; (C) OIS estimated using tick rule; (D) OIS95 estimated using tick rule; (E) bias of OIS; (F) bias of OIS95.

Order imbalance based on volume in USD (OID): (A) true OID; (B) true OID95; (C) OID estimated using tick rule; (D) OID95 estimated using tick rule; (E) bias of OID; (F) bias of OID95.
Table 6 reports the results of a parametric test (the Welch two-sample t-test) of the difference between the true order imbalances and those estimated using the tick rule.
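The Welch two-sample t statistic, which does not assume equal variances across the two samples, can be computed as in the self-contained sketch below; the two input series are invented for illustration, not the paper's order imbalance series.

```python
import math

def welch_t(x, y):
    """Welch two-sample t statistic and Welch-Satterthwaite
    degrees of freedom (no equal-variance assumption)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((xi - mx) ** 2 for xi in x) / (nx - 1)   # sample variances
    vy = sum((yi - my) ** 2 for yi in y) / (ny - 1)
    se2 = vx / nx + vy / ny                            # squared std. error
    t = (mx - my) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df

# Hypothetical "true" vs. "estimated" daily imbalances
t, df = welch_t([0.10, 0.20, 0.15, 0.30], [0.05, 0.10, 0.12, 0.20])
print(f"t = {t:.3f}, df = {df:.1f}")
```

The same statistic is available as `scipy.stats.ttest_ind(x, y, equal_var=False)`; the explicit form above shows where the unequal-variance correction enters.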
Difference Between True and Estimated Order Imbalances.
*, **, and *** denote statistical significance at the 5%, 1%, and 0.1% levels, respectively.
Table 7 reports regressions of Bitcoin daily return and volatility on each daily order imbalance to explore whether order imbalances can predict Bitcoin return or volatility. The first three columns display estimates from regressions of Bitcoin daily return: all order imbalances (both the true ones and those estimated with the tick rule, marked "TR") are positively correlated with daily return at the 1% significance level. Interestingly, the adjusted R-squared shows that the order imbalances computed using the tick rule, rather than the true ones, explain more of the variation in Bitcoin daily return. The next three columns display estimates from regressions of Bitcoin realized variance (RV, hereafter) multiplied by $10^4$; RV, proposed by Andersen and Bollerslev (1998), is commonly used as an ex-post volatility measure in the financial literature. All estimates of order imbalances are negative and statistically significant at the 5% level, meaning that realized variance decreases as order imbalances increase. Nevertheless, the adjusted R-squared of each order imbalance is less than 10%, with the order imbalances based on the number of trades (OIN and OIN (TR)) outperforming the others. Finally, the rest of Table 7 displays estimates from regressions of positive semi-variance, the component of realized variance attributable to positive intraday returns.
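As a brief illustration of the volatility measures, realized variance is the sum of squared intraday returns, and the positive semi-variance keeps only the squared positive returns; the prices below are hypothetical.

```python
import math

# Hypothetical intraday prices (e.g., 5-minute marks) in USD
prices = [8000.0, 8010.0, 7995.0, 8020.0, 8005.0]
returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

rv = sum(r ** 2 for r in returns)                 # realized variance
rv_pos = sum(r ** 2 for r in returns if r > 0)    # positive semi-variance
rv_neg = sum(r ** 2 for r in returns if r < 0)    # negative semi-variance
print(rv * 1e4, rv_pos * 1e4)                     # scaled by 10^4 as in the text
```

By construction, the positive and negative semi-variances sum to the realized variance, which decomposes volatility into its upside and downside components.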
Predictability of Order Imbalance.
*, **, and *** denote statistical significance at the 5%, 1%, and 0.1% levels, respectively.
Discussion
Even though Bitcoin is listed on multiple unregulated online crypto exchanges, the classification accuracy of the tick rule in the Bitcoin market is similar to that in stock markets. The empirical analysis in this study shows that the overall classification accuracy is 76.87% and that daily accuracy ranges from 68.98% to 83.76% in the Bitcoin market. According to previous research, the classification success rate of the tick rule ranges from 72.2% (Theissen, 2001) to 92.15% (Aktas & Kryzanowski, 2014) in U.S. and non-U.S. stock markets. Of the research cited in this work, the study by Carrion and Kolay (2020) examines a similarly fast trading environment using high-frequency NASDAQ data stamped to seconds, and the accuracy assessed in this study falls within the corresponding range for individual stocks in Carrion and Kolay (2020), namely 69.75% to 83.34%.
The empirical results further indicate a positive correlation between the likelihood of misclassification and the time from the previous trade in the Bitcoin market, as shown in Tables 4 and 5. By contrast, Ellis et al. (2000) found a higher classification success rate when trades were slow, owing to a higher turnover rate of quotes. The difference can be attributed to the fact that Bitcoin is traded simultaneously on multiple online crypto exchanges, so information spillover from other exchanges can move the price (Brandvold et al., 2015). Consequently, it may be more difficult to discern trade direction from the previous trade when a long period elapses between trades.
In addition, order imbalances calculated using large-size trades are relatively closer to their true values in the Bitcoin market during the sample period. As shown in Table 6, the results of the Welch two-sample t-test indicate that the biases of order imbalances computed from large-size trades are smaller than those computed from the whole sample.
Conclusions
This study investigates the accuracy of the tick rule in the Bitcoin market, where Bitcoin is listed on multiple online crypto exchanges rather than traditional regulated exchanges. Although the tick rule has been utilized in research on the microstructure of this innovative market (Dimpfl, 2017; Feng et al., 2018; Ibikunle et al., 2020; Wang et al., 2020), the accuracy of this trade classification method requires examination. This study addresses three issues: the success rate of the tick rule in the Bitcoin market, the factors associated with classification success, and the bias of order imbalances, which are commonly used as indexes of informed trading, when computed using the tick rule.
This study answers the three above-stated questions through empirical analysis of tick-by-tick Bitcoin/USD transaction data with signed initiators from Bitstamp covering December 6, 2017, to October 7, 2018. First, the overall success rate of the tick rule is 76.87%, and daily accuracy ranges from 68.98% to 83.76% during the sample period. There are fewer misclassified seller-initiated orders than misclassified buyer-initiated ones on the whole, a result associated with the smaller share of seller-initiated trades in the sample. In general, trade classification using the tick rule in the Bitcoin market has limited success. Second, the longer the time between trades, the higher the likelihood of misclassification: it is more difficult to discern transaction intentions when transactions are less frequent in this innovative market of multiple online crypto exchanges. Third, the empirical analysis indicates that order imbalances computed using the tick rule in the Bitcoin market lack sufficient accuracy, although order imbalances calculated from large-size trades are relatively closer to their true values. Evidently, attention must be paid to the accuracy of the trade classification algorithm when conducting research on the microstructure of the Bitcoin market.
