This is an original post from Henrik Warne’s blog
I really enjoyed reading Algorithmic Trading: A Practitioner’s Guide by Jeffrey M. Bacidore. Before starting, I imagined it would cover various strategies for trading in the markets, along the lines of “buy on this condition, sell on this condition”. But that is not what this book covers. What trade to make is always a given, typically from a portfolio manager. Instead, the book is all about how to make it happen, almost always by portioning out the trade little by little, while trying to get the best price.
It is fascinating how many factors come into play when implementing this seemingly simple task. The book covers all parts of this process in a clear and concise way, with lots of illuminating examples. The author has over 20 years of experience in the field of algorithmic trading, both from industry and academia. I particularly liked all the examples of implementation corner cases and gotchas that clearly come from experience.
TRADING, COSTS, ALPHA
The book starts by defining and explaining several concepts in trading. The most important concept is the order book. It is a list of bids and asks/offers (buy and sell orders) ordered by price levels. The price is the limit set when placing the order. The order book also includes the aggregate size at each level. The gap between the highest buy order and the lowest sell order is the bid-ask spread. A marketable order is one that can execute immediately, that is, it will cross the spread.
The orders are sorted by price first, and within each price level by arrival time (first in first out). Orders have to be priced in specific increments, the minimum price variation, or tick size. There is a good example in the book explaining why a tick size is needed. Say that the current bid in the market is $20. If you want to buy at that price, you will be placed last in the list of orders. But since the sorting order is first by price, then by arrival time, you could get first in line by putting in an order with a price only slightly better than $20 (say $20.00000001). The tick size limitation stops this behavior. If the tick size is $0.01, you would have to bid at least $20.01 in order to get priority by price.
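To make the queueing rule concrete, here is a minimal sketch (my own illustration, not code from the book) of price-time priority for buy orders, together with a tick size check. The order fields and the helper names `is_valid_price` and `sort_bids` are made up for the example, and the tick size of $0.01 is the one assumed in the text above.

```python
from decimal import Decimal

TICK = Decimal("0.01")  # assumed tick size, as in the example above


def is_valid_price(price, tick=TICK):
    """A price is accepted only if it is a whole number of ticks."""
    return price % tick == 0


def sort_bids(bids):
    """Sort buy orders by price (highest first), then by arrival (earliest first)."""
    return sorted(bids, key=lambda o: (-o["price"], o["arrival"]))


bids = [
    {"id": "A", "price": Decimal("20.00"), "arrival": 1},
    {"id": "B", "price": Decimal("20.00"), "arrival": 2},
    {"id": "C", "price": Decimal("20.01"), "arrival": 3},
]

# C arrived last but bids one full tick more, so it jumps ahead of A and B.
queue = [o["id"] for o in sort_bids(bids)]

# Trying to jump the queue with a tiny price improvement is rejected.
sub_tick_ok = is_valid_price(Decimal("20.00000001"))
```

Here `queue` ends up as C, A, B, and `sub_tick_ok` is false: gaining priority costs at least one full tick.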
Many markets use the maker-taker fee structure. Traders that place orders that rest on the exchange earn a maker rebate, and traders that “take” liquidity, that is, execute orders against the existing resting orders, pay a taker fee. The taker fee is higher than the maker rebate, and the exchange earns the difference between the two. This fee structure encourages traders to place resting orders on the exchange (providing liquidity), and this will in turn attract taker orders.
The explicit cost of trading is the fees paid. However, there are also several implicit costs. One such cost is the bid-ask spread. If we assume the current fair price is the midpoint of the spread, then the cost of crossing will be half of that spread. There can also be costs due to market impact. If a large order is placed, there is a risk that the price moves unfavorably when the other market participants adjust their prices to take advantage of the demand. In many cases it is therefore better to hide the size of the order, for example by dividing it up into smaller parts over a period of time. There is also the problem of adverse selection. This happens when one party has to set a price first. If you put an order out, an informed counterparty will only “select” to trade with you if the price is to their advantage. If not, they will not trade with you. Ways to avoid this are to trade with retail investors, who typically trade for liquidity reasons (they have money to invest, or need to free up invested money), and to get fast market data updates, so you don’t have prices based on old information.
In investments, alpha means return above some benchmark, typically the beta-adjusted market return of an asset. In this book, alpha means that over the trade horizon, the price moves in a specific direction. So for example, if you are buying, it could be that the price is expected to go up during your trade (positive alpha).
A simple strategy is to spread out an order over a fixed time interval (for example one hour), and try to trade at a constant rate. This is the Time-Weighted Average Price (TWAP) algorithm. Typically, there will be upper and lower limits deciding how much ahead, or how far behind, of the ideal path the execution is allowed to be. Furthermore, it costs more to send marketable orders that will cross, compared to putting out passive resting orders that may or may not fill. The algorithm designer needs to find a balance here.
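As a rough sketch of the idea (my own illustration, not code from the book), a TWAP schedule is just the order quantity split into equal slices, plus a check that the execution stays within a band around the ideal path. The function names and the 5% band are assumptions chosen for the example.

```python
def twap_schedule(total_qty, n_slices):
    """Split total_qty into n_slices near-equal child quantities."""
    base, remainder = divmod(total_qty, n_slices)
    # Spread the remainder over the first slices so the sum is exact.
    return [base + (1 if i < remainder else 0) for i in range(n_slices)]


def within_band(executed, target, total_qty, band=0.05):
    """True if executed is within +/- band (as a fraction of the order) of target."""
    return abs(executed - target) <= band * total_qty


# 1,000 shares over one hour in 5-minute slices.
schedule = twap_schedule(1000, 12)

# Halfway through, the ideal path says 500 shares should be done.
on_track = within_band(executed=480, target=500, total_qty=1000)
too_slow = not within_band(executed=400, target=500, total_qty=1000)
```

When the execution drifts outside the band, the algorithm would typically switch from passive to marketable orders to catch up, which is where the cost trade-off comes in.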
A variation on the TWAP strategy is Volume-Weighted Average Price (VWAP). Like in TWAP, the order is executed over a fixed time interval. But instead of using a constant rate over the interval, the volume traded is proportional to the historical volume of a typical day for the asset. US equities usually trade more in the first and last 30 minutes of the trading day, relative to the rest of the day. The idea here is to trade more when there is typically more trading by others, and less when there is typically less. The volume data is divided up into bins of for example 5 minutes, and within each bin the trading rate is constant.
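In code, the difference from TWAP is only in how the slices are sized. The sketch below is my own illustration with a made-up, U-shaped volume profile. It allocates the order in proportion to the historical volume in each bin.

```python
def vwap_schedule(total_qty, historical_bin_volumes):
    """Allocate total_qty across bins in proportion to historical volume."""
    total_hist = sum(historical_bin_volumes)
    return [total_qty * v / total_hist for v in historical_bin_volumes]


# Hypothetical profile: heavy volume at the open and close, quiet in between.
profile = [30, 10, 10, 10, 40]
schedule = vwap_schedule(1000, profile)
```

With this profile, 300 shares go into the first bin, 100 into each of the middle bins, and 400 into the last one.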
A third algorithm is the schedule-based Implementation Shortfall (IS) algorithm, also known as Arrival Price. The benchmark to compare against is a hypothetical trade for the full volume of the order done costlessly when the order starts (that is, at the “arrival price”). Buying above, or selling below, the arrival price represents an implementation cost. Three factors influence how well the algorithm will do. First there is the execution cost: the more marketable orders that are used, the higher the cost, and the more passive resting orders, the lower the cost. This cost falls non-linearly with time. Second, if there is positive alpha over the trade horizon, it means that the market moves in the direction of the trade. This means that as time passes, the price will be less and less favorable. On the other hand, if there is negative alpha, the execution prices will get better with time. Finally, there is risk aversion. The longer it takes to complete the whole order, the greater the risk that the price moves unfavorably. Therefore, the less risk the trader is willing to take, the faster the trading should finish.
If there is no alpha, and no risk aversion, the only factor to consider is trading cost. The longer the trading horizon, the lower the total cost of the trade. However, both positive alpha and a risk aversion penalty increase with time. With a model for how these three components develop over time, it is possible to determine the optimal trade horizon (that is, at what time will this cost function be at a minimum). This determines how long the schedule for the trade should be. It can be difficult to estimate alpha, and to put a numerical value on the risk aversion. In practice it is common to combine these two values into an urgency parameter.
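To illustrate the trade-off, here is a toy version of such a cost function (my own sketch, the functional forms are assumptions and not the book’s model). The execution cost is assumed to fall as `impact_coef / horizon`, and the alpha and risk aversion penalties are lumped into a single `urgency` term that grows linearly with the horizon. A simple search over candidate horizons finds the minimum.

```python
def total_cost(horizon, impact_coef, urgency):
    """Toy cost model: execution cost falls with time, the urgency penalty rises."""
    execution_cost = impact_coef / horizon  # assumed form, falls non-linearly
    time_penalty = urgency * horizon        # assumed form, alpha and risk combined
    return execution_cost + time_penalty


def optimal_horizon(impact_coef, urgency, candidates=range(1, 61)):
    """Pick the candidate horizon (in minutes) with the lowest total cost."""
    return min(candidates, key=lambda t: total_cost(t, impact_coef, urgency))


patient = optimal_horizon(impact_coef=100.0, urgency=1.0)
urgent = optimal_horizon(impact_coef=100.0, urgency=4.0)
```

With these made-up numbers the patient order gets a 10-minute schedule and the urgent one a 5-minute schedule: higher urgency shortens the optimal horizon.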
In the Percent of Volume (POV) algorithm, the aim is to participate at a certain rate, for example 10% of the realized volume traded. So when 900 shares have been traded, the POV algorithm submits an order for 100 shares. Those 100 shares will be 10% of the 900 + 100 shares. In the VWAP algorithm, the trading is proportional to the historic volume. In the POV algorithm, the idea is to participate at the given rate of the actual volume. This means that there is no fixed schedule. Instead, the end time depends on the traded volume. This strategy has intuitive appeal. It will trade more aggressively when risk increases, since risk and volume are positively correlated. It therefore reduces risk (reducing “inventory”) at times of increased risk.
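The arithmetic in the example generalizes as follows. This is a minimal sketch of my own, and the function and parameter names are made up. If others have traded V shares and the target rate is r, the algorithm’s own cumulative quantity should be r / (1 - r) × V, so that it makes up the fraction r of the total.

```python
def pov_child_qty(other_volume, own_executed, rate):
    """Shares to send so that own volume is `rate` of total volume (others + own)."""
    target_own = rate / (1.0 - rate) * other_volume
    # Never send a negative quantity if we are already ahead of the rate.
    return max(0, round(target_own - own_executed))


first = pov_child_qty(other_volume=900, own_executed=0, rate=0.10)
later = pov_child_qty(other_volume=1800, own_executed=100, rate=0.10)
ahead = pov_child_qty(other_volume=900, own_executed=150, rate=0.10)
```

The first call reproduces the example above: 100 shares, which is 10% of 900 + 100.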
One problem with the POV algorithm is that it is often implemented to stay quite close to the participation rate, which means it uses more marketable orders and fewer passive orders, leading to higher costs. Another problem is that its reactive nature means that it often follows volume rather than actually participating in the realized volume. If there has been a large order in the market, the POV algorithm needs to send a larger order to maintain its rate. This can draw other participants in, further increasing the volume. The result can be trades at temporarily inflated prices.
A way to combat this is to try to forecast the volume, and participate in it at the given rate. Then it will be easier to use passive limit orders to earn a spread. Another variation is to allow for a “must complete” option. Many portfolio managers prefer the orders to finish the same day. This can be accomplished by switching to a TWAP or VWAP schedule if the rate is not high enough for the order to complete by the end of trading.
“Hide and Take”
Opportunistic algorithms aim to take advantage of specific conditions. The Hide and Take algorithm will stay hidden, and only trade when favorable price or liquidity conditions occur. If the price moves favorably relative to some benchmark, it will send out marketable orders to exploit the opportunity. It does the same if the liquidity increases, through either greater depth or a tighter spread.
The Adaptive IS algorithm conditions its trading on the direction and magnitude of price movements relative to the arrival price. For example, if the price moves in the trader’s favor (declining when buying, rising when selling), the algorithm will trade more to lock in the good price. If it moves in the opposite direction, the algorithm will trade less. Interestingly, some traders want an algorithm with the exact opposite behavior, that is trading more aggressively if the price moves unfavorably. The motivation in this case is to lock in a price before it gets any worse. There has been a lot of debate on whether these strategies are useful, or if they are only reacting to noise (and therefore only increasing cost without any benefit).
In some cases there is a relation between two assets. For example, stock ABC may usually be valued at twice the price of stock XYZ. In pairs trading, the aim is to exploit when this relation temporarily deviates from the historical or expected value. The algorithm is triggered when the deviation is large enough, for example 1%. It will then buy the stock that is relatively undervalued, and sell the stock that is relatively overvalued. When the relationship reverts back to its expected value, the trades are reversed.
The algorithm is executed in steps, buying and selling in equal values up to the maximum position size. One of the assets is the leader, the other the follower. The leader is typically the asset that is most difficult or costly to trade. The algorithm will trade the leader passively using limit orders. It will wait to trade the follower until it has “legged into” the leader. Then the follower will be traded, often with a marketable order in order to minimize the time the legs are unbalanced (leg risk).
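The trigger for a pairs trade can be sketched like this (my own illustration, not code from the book). The expected ratio of 2 and the 1% threshold are taken from the example above, while the function name and the returned strings are made up.

```python
def pairs_signal(price_abc, price_xyz, expected_ratio=2.0, threshold=0.01):
    """Return which leg to buy and sell when the price ratio deviates enough."""
    deviation = (price_abc / price_xyz) / expected_ratio - 1.0
    if deviation > threshold:
        return "sell ABC, buy XYZ"  # ABC is relatively overvalued
    if deviation < -threshold:
        return "buy ABC, sell XYZ"  # ABC is relatively undervalued
    return "no trade"


rich = pairs_signal(41.0, 20.0)     # ratio 2.05, 2.5% above expected
cheap = pairs_signal(39.0, 20.0)    # ratio 1.95, 2.5% below expected
inside = pairs_signal(40.1, 20.0)   # ratio 2.005, within the 1% threshold
```

This only covers the decision to trade. The leader and follower logic for actually executing the two legs is a separate step.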
Portfolio algorithms are different from pairs trading. They are multi-order extensions of single order IS algorithms. Like in the single order case, the idea is to balance the cost and the risk to find the optimal schedule for the individual trades. The trading cost for the portfolio is just the sum of the costs of trading the assets in the portfolio. However, if there is any correlation between different assets, the risk can be reduced by taking these correlations into account.
CHILD ORDER PRICING, SIZING AND ROUTING
When the overall algorithm has been decided (for example, using TWAP or POV), you still need to decide what limit price to set on each child order, how big the orders should be, and which venues it should be routed to (assuming there is more than one venue to choose from).
When considering how to set the price, it is useful to divide the price into two components: the fair value, and the edge. The fair value is the true economic value of the asset. For stocks, it could be calculated as the discounted value of all future cash flows. However, mostly the fair value is assumed to lie within the bid-offer spread. If it didn’t, arbitrage trades would be possible. Often, the assumption is that it is at the midpoint of the spread. But other models are possible, for example an average of the bid and offer prices weighted by the sizes (or the logarithm of the sizes). However, these more complicated models have problems of their own, for example that size usually is more volatile than price, so simply using the midpoint is often a good choice.
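Here is a small sketch of the two fair value estimates (my own illustration). The exact weighting convention is an assumption on my part: the bid price is weighted by the offer size and vice versa, so that a large bid size pulls the estimate towards the offer.

```python
def midpoint(bid, ask):
    """Fair value as the middle of the spread."""
    return (bid + ask) / 2.0


def size_weighted_fair_value(bid, ask, bid_size, ask_size):
    """Fair value leaning towards the side with less resting size (assumed convention)."""
    return (bid * ask_size + ask * bid_size) / (bid_size + ask_size)


mid = midpoint(20.00, 20.02)
weighted = size_weighted_fair_value(20.00, 20.02, bid_size=300, ask_size=100)
```

With three times as much size on the bid, the weighted estimate is 20.015 instead of the 20.01 midpoint. Since sizes change quickly, this estimate will also jump around more than the midpoint does.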
The edge is the discount received on a buy order, or the premium earned on a sell order, relative to the fair price. A positive edge is a gain, and a negative edge a loss for a given trade. The fair value is not affected by the order, so setting the limit price of the child order simply becomes deciding what the edge should be. The higher the edge, the lower the chances of the order being filled.
The first decision to make is whether to send a marketable order (with a negative edge), or to send a passive order that will rest on the book and maybe get filled. This can depend on whether the algorithm is ahead of or behind the schedule, or whether the algorithm has a “must complete” instruction. To find how to set the edge, you calculate the expected gain for each value of the edge, and pick the edge with the highest expected gain. To calculate the expected gain, you need to be able to estimate the probability of a fill for a given edge, and you have to subtract the cost of a non-fill. The fill probability can be estimated from historical data.
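A minimal sketch of this calculation (my own illustration with made-up numbers): the expected gain is the fill probability times the edge, minus the non-fill probability times the cost of a non-fill. The fill probabilities per edge would in practice come from historical data, and here they are simply assumed.

```python
def expected_gain(edge, fill_prob, non_fill_cost):
    """Gain if filled, weighted by fill probability, minus the expected non-fill cost."""
    return fill_prob * edge - (1.0 - fill_prob) * non_fill_cost


def best_edge(fill_probs, non_fill_cost):
    """Pick the edge with the highest expected gain."""
    return max(
        fill_probs,
        key=lambda e: expected_gain(e, fill_probs[e], non_fill_cost),
    )


# Hypothetical fill probabilities per edge (in cents). -1.0 is a marketable order.
fill_probs = {-1.0: 1.0, 0.0: 0.6, 1.0: 0.4, 2.0: 0.1}

relaxed = best_edge(fill_probs, non_fill_cost=0.5)
pressed = best_edge(fill_probs, non_fill_cost=3.0)
```

When a non-fill is cheap, the best choice is a passive order with an edge of 1 cent. When a non-fill is expensive (for example when the algorithm is behind schedule), the marketable order wins despite its negative edge.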
The closer you are to the end of the schedule, the more urgent it will be to get a fill. By dividing the remaining time into, for example, one-minute intervals, you can work backwards from the last interval (where you must get a fill if the order is unfilled at that point) in a dynamic programming-like way to find how to set the edge optimally in earlier periods.
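The backward step can be sketched like this (my own simplified illustration, not the book’s model). It assumes the same fill probabilities in every interval, a fair value that does not drift, and a forced marketable order with a known negative edge in the last interval. The value of not getting filled now is then simply the value of the next interval.

```python
def optimal_edges(fill_probs, marketable_edge, n_periods):
    """Work backwards from the last period to find the best edge in each period."""
    continuation = marketable_edge  # last period: forced to cross the spread
    plan = [marketable_edge]
    for _ in range(n_periods - 1):
        def value(edge):
            p = fill_probs[edge]
            # Filled now with this edge, or unfilled and carried to the next period.
            return p * edge + (1.0 - p) * continuation

        best = max(fill_probs, key=value)
        continuation = value(best)
        plan.append(best)
    plan.reverse()  # earliest period first
    return plan


# Hypothetical fill probabilities per passive edge (in cents).
fill_probs = {0.0: 0.7, 1.0: 0.4, 2.0: 0.2}
plan = optimal_edges(fill_probs, marketable_edge=-2.0, n_periods=4)
```

With these numbers the plan is to ask for an edge of 2 cents first, then 1 cent, then 0, and finally to cross the spread: the order gets less patient as the deadline approaches.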
One strategy to update the edge as the market fair price and spread change is to use pegging. This means that the algorithm will set the price in relation to the current best bid or offer, either exactly, or with an offset from these values. As the market values change, the algorithm updates its order values in relation to those changes. There are however pitfalls with this strategy. Suppose the pegged limit order price is set to $20 to match the $20 current best bid. If all the other traders reduce their buy prices, the best bid would stay at $20 because of the pegged order. There is also a risk that short-lived (fleeting) orders will make the pegged order update its price. This can be countered by requiring that new prices must be present for at least X seconds before updating the price. But then the pegged order would be further back in the queue, reducing its fill probability.
There is also a special order type called post-only. It is designed to only supply liquidity, never take liquidity. If the market moves between the decision to send out an order, and the order reaching the exchange, the order will not cross. Instead, it will be hidden, or cancelled. This makes it easier for algorithm designers to get the behavior they intend (that is, resting orders will not accidentally be converted into crossing orders).
Schedule-based algorithms can adjust the price depending on whether they are ahead of or behind the schedule, setting a more aggressive price if they need to catch up. The same idea can be used regarding size: setting a larger size if they need to catch up, and a smaller size if they are ahead. There is also a technique to place multiple orders in the order book, where some are resting at more passive levels to be able to take advantage if an overly aggressive liquidity demander (prepared to pay a large premium) enters the market. This is called layering the book. A disadvantage of this is that some information on the size of the demand is leaked.
One way of hiding the size is to use a reserve order. In this type of order, only a fraction of the size is displayed in the market, and as soon as it is filled, the order is refreshed with more quantity from the hidden part. This is also called an iceberg order, since only the tip is visible. However, other market participants can infer the existence of a reserve order if they notice that the size keeps getting refreshed.
A Smart Order Router (SOR) is the component that decides where to send the order the algorithm has decided on. For a marketable order, the goal is to find the current best price. This can involve splitting the order up and sending the parts to different venues. Because the order books can change very rapidly, there is always a risk that the book has been updated when the order arrives. Furthermore, sending a regular limit order could mean that instead of crossing, the order would rest if the price has moved away. To handle this case, an Immediate or Cancel (IOC) limit order is used. If it can’t execute right away, the order is cancelled back to the sender.
For a non-marketable order, which can’t be immediately filled due to the limit price set on it, the goal for the SOR is to maximize the probability of a fill. Ideally as fast as possible, so the market doesn’t move away while resting. The fill probability depends both on the venue’s queue length, and on its trading rate. Even if the queue is longer, it can still be the better choice, if the orders tend to get filled there faster. The fill probability can also depend on the size of the order. The SOR needs a model so it can estimate the fill probability in each case. The model can use past data on asset-, order-, and market-level statistics to get an estimate, for example by using a logistic regression.
It’s tricky to measure how good an algorithm is, since there is no way of knowing what price it could have received if it had followed some other strategy. The best we can do is to compare the price a trade executed at to a benchmark. The realized performance will then be the difference in the actual price relative to the benchmark. If a trader bought an asset at $20 when the benchmark was $19, the realized performance was -$1. In other words, the cost was $1.
It is important to also consider unrealized performance. The unrealized price is defined as the prevailing market price at the end of execution. The portion of the original order that was not executed is compared to this unrealized price, while also taking into account what the trading costs would have been (for example by using a trading cost model). Combining the realized and unrealized performance gives the total performance of the order. If we don’t consider unrealized performance, it can appear better not to trade at all, since then you don’t incur any trading costs.
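A sketch of how the two parts can be combined (my own illustration, with made-up names and numbers). The filled shares are compared to the benchmark at their actual price, and the unfilled shares at the end-of-execution price plus an estimated cost of trading them. The per-share cost estimate stands in for a trading cost model.

```python
def total_performance(side, order_qty, filled_qty, avg_fill_price,
                      benchmark, end_price, est_cost_per_share):
    """Realized plus unrealized performance in dollars (negative means a cost)."""
    sign = 1 if side == "buy" else -1
    realized = sign * (benchmark - avg_fill_price) * filled_qty
    unfilled_qty = order_qty - filled_qty
    # Price the unfilled shares as if traded at the end price, including costs.
    hypothetical_price = end_price + sign * est_cost_per_share
    unrealized = sign * (benchmark - hypothetical_price) * unfilled_qty
    return realized + unrealized


partly_filled = total_performance("buy", 1000, 600, 20.00, 19.90, 20.20, 0.05)
not_traded = total_performance("buy", 1000, 0, 0.0, 19.90, 20.20, 0.05)
```

In this example the partly filled order costs $200 in total, while not trading at all costs $350 once the unfilled shares are accounted for. Without the unrealized part, not trading would have looked free.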
The most commonly used benchmark is the arrival price, that is the price of the asset at the start of trading. The trading cost is the difference between the realized price and what would have been paid if the trade had been (costlessly) executed at the arrival price. There are other benchmarks, for example the volume-weighted average price (VWAP) over the life of the order. However, the advantage of using the arrival price is that it is not affected by the trading itself (since the trading may move the price), and it can’t be influenced by the traders themselves. The disadvantage is that there will be a component of randomness to it, since the market price can vary during the trade horizon, independently of the impact from the trades. Even though these movements may average to zero, the impact can be large for the performance measurement of a single order. Therefore, a large number of samples is needed to reliably judge the performance of a given algorithm.
Performance is typically measured in percent of the price, to make the comparisons valid even if the sizes of the trades vary. And because trading costs tend to be small in terms of percentage, basis points (bps, pronounced “bips”) are used. One basis point is 0.01%, so for example 50 bps equals 0.5%.
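As a small worked example (my own sketch, the function names are made up), here is the realized performance from the $20 versus $19 case above expressed in basis points of the benchmark price, with the sign chosen so that a negative number is a cost.

```python
def performance_bps(side, exec_price, benchmark_price):
    """Realized performance relative to the benchmark, in basis points."""
    sign = 1 if side == "buy" else -1
    return sign * (benchmark_price - exec_price) / benchmark_price * 10_000


def bps_to_percent(bps):
    """One basis point is 0.01%."""
    return bps / 100.0


buy_perf = performance_bps("buy", exec_price=20.0, benchmark_price=19.0)
sell_perf = performance_bps("sell", exec_price=20.0, benchmark_price=19.0)
fifty = bps_to_percent(50)
```

Buying at $20 against a $19 benchmark is a cost of roughly 526 bps. Selling at the same prices would be a gain of the same magnitude.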
Sometimes you are interested in the absolute performance, for example to estimate how much it will cost to execute a trade idea (to make sure it is likely to make money). One way of estimating the cost is to compare to previous trades in a similar situation regarding asset, size, spread, volatility, time of day etc. However, there are so many parameters that can influence whether the situation is similar or not that this approach breaks down. Instead, you can use a trading cost model.
A simple but useful model is: ExpectedTradingCost = HalfSpread + σ * γ * sqrt(Ordersize/Volume)
σ is the volatility and γ is a model parameter estimated empirically. For small order sizes, the cost will be roughly the half-spread. The trading cost increases with increasing order size (relative to the traded volume), but at a decreasing rate.
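The model is easy to put into code. In this sketch (my own, with made-up input values) all cost terms are assumed to be in basis points, and γ is simply set to 1 since it would have to be estimated from data.

```python
import math


def expected_trading_cost(half_spread, sigma, gamma, order_size, volume):
    """ExpectedTradingCost = HalfSpread + sigma * gamma * sqrt(OrderSize / Volume)."""
    return half_spread + sigma * gamma * math.sqrt(order_size / volume)


# Hypothetical inputs: 2 bps half-spread, 100 bps volatility, gamma = 1.
tiny = expected_trading_cost(2.0, 100.0, 1.0, order_size=1, volume=1_000_000)
small = expected_trading_cost(2.0, 100.0, 1.0, order_size=10_000, volume=1_000_000)
large = expected_trading_cost(2.0, 100.0, 1.0, order_size=40_000, volume=1_000_000)
```

The tiny order costs little more than the half-spread. Going from 1% to 4% of the volume only doubles the impact term (from 10 to 20 bps), which is the decreasing rate mentioned above.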
To decide which trading strategy has better performance, a “horse race” can be used, where both strategies are used and you compare the results. However, the characteristics of the orders they are used for are most likely not identical. Therefore, you can use a cost model to try to account for differences in market conditions.
When analyzing performance, it is important to look at the distribution of the values, not just the averages. For example, sample A has three trades of size $50,000 each. Sample B has three trades of size $1, $50,000 and $99,999. Both have the same average size, but the orders in sample B will have higher average cost. This is because the average cost increases with the order size.
It is also important to watch out for outliers and influential orders. Samples often contain some orders that are hundreds or thousands of times larger than the other orders in the sample. These outlier orders have often been longer in the market (because of their size), so the variance is greater, which can lead to large performance numbers. To temper the effect of these orders, trimming or winsorizing can be used.
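The difference between the two techniques is easiest to see in code. In this sketch (my own, with made-up performance numbers in bps), trimming drops the k most extreme values at each end, while winsorizing keeps them but clamps them to the nearest remaining value. Both functions assume there are more than 2k values.

```python
def trim(values, k=1):
    """Drop the k smallest and k largest values."""
    return sorted(values)[k:len(values) - k]


def winsorize(values, k=1):
    """Clamp the k smallest and k largest values to their nearest neighbors."""
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    return [min(max(v, low), high) for v in values]


costs_bps = [-400, 3, 5, 6, 8, 900]
trimmed = trim(costs_bps)
winsorized = winsorize(costs_bps)
raw_mean = sum(costs_bps) / len(costs_bps)
winsorized_mean = sum(winsorized) / len(winsorized)
```

The raw mean is 87 bps, dominated by the two extreme orders, while the winsorized mean is 5.5 bps.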
Influential orders may not cause extreme performance results, but should still be looked at. For example, if there are 1,000 orders for $2,000, and one order for $2 million, the large order may skew the result. One way to handle this is to run the analysis both with and without the influential order, and see if the result is robust, or if it changes.
I really like that the book is written in such a clear style. At only 222 pages, the information density is high. It also contains helpful tables and diagrams where appropriate. I also like that there are lots of examples of pitfalls, and how to avoid them. My only complaint is that I would have liked to have the chapter number and title written somewhere on the page. As it is now, it takes a bit of turning pages to find which chapter a given page is part of.
I find the subject of algorithmic trading quite interesting. It reminds me a bit of Core War: algorithms battling each other, but here trying to make as profitable trades as possible. There is an arms race in trying to outwit other algorithms, and no wonder, since so much money is at stake. This book is most useful if you work in the trading space. But even if you don’t, it is still worth reading, since the problems described are interesting, and because markets are of such importance today.