Understanding Trading Latencies 2024

Electronic Trading Operations
The Execution Venue
Sell-Side Broker/Dealer
The Buy Side
Different trading latency types
Techniques to Measure Trading Latencies
Trading Latencies’ Metrics
Reducing trading latencies in the order chain
TCA and impacts on trading latencies
Conclusion

Regardless of how fast a transaction occurs, there will always be some delay due to the number and types of devices between the parties. This is what is known as trading latency.

Sending data over long distances and through networking devices such as routers and switches takes time, which leads to delays in data center, metro, and wide-area networks alike.

In addition to latency caused by computers, there are delays caused by their associated storage devices.

Trading latencies must therefore be measured, understood, and controlled to mitigate their negative effects.

Electronic Trading Operations

As of today, the equity market structure consists of 15 national securities exchanges, some thirty alternative trading systems, multiple single-dealer platforms within brokerage firms, as well as order matching. There are also dozens of order types in the equity markets, as well as a multitude of market connectivity options and a range of market information products that provide data in microseconds. We use this data to form the basis for a wide variety of algorithmic trading strategies that rapidly submit orders across markets, resulting in price movements, which in turn generate more data used for the next algorithmic trading strategy.

The following table summarizes the percentage, share volume, and dollar volume of trades, reported by each registered exchange and trade reporting facility for all NMS stocks in 2019.

Venue	Trades	Shares	$Vol
Cboe BYX	6.2%	3.8%	3.0%
Cboe BZX	8.7%	5.5%	6.4%
Cboe EDGA	4.3%	2.2%	2.1%
Cboe EDGX	6.4%	4.8%	4.7%
IEX	3.8%	2.7%	2.9%
Nasdaq	24.1%	17.2%	19.7%
Nasdaq BX	3.1%	1.8%	1.8%
Nasdaq PSX	0.9%	0.7%	0.9%
NYSE	8.5%	13.5%	12.4%
NYSE American	0.4%	0.3%	0.2%
NYSE Arca	9.4%	8.4%	9.3%
NYSE Chicago	<0.01%	0.4%	0.8%
NYSE National	2.1%	1.4%	0.8%
TRF Nasdaq Carteret	18.6%	29.7%	29.3%
TRF Nasdaq Chicago	0.1%	0.1%	0.1%
TRF NYSE	3.5%	7.5%	5.6%
Source: NYSE TAQ

Approximately 78% of all trades were executed on exchanges. By share volume, however, only about 63% traded on exchanges (37% traded off-exchange), and by dollar volume about 65% did.
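These splits can be checked against the table by summing the three trade reporting facility (TRF) rows, which carry the off-exchange activity. A minimal Python sketch using only the TRF figures from the table; the variable and function names are illustrative:

```python
# Off-exchange trades are reported through the trade reporting facilities (TRFs).
# Values are the TRF rows of the table: (trades %, shares %, dollar volume %).
trf_rows = {
    "TRF Nasdaq Carteret": (18.6, 29.7, 29.3),
    "TRF Nasdaq Chicago": (0.1, 0.1, 0.1),
    "TRF NYSE": (3.5, 7.5, 5.6),
}


def off_exchange_share(column: int) -> float:
    """Sum one column (0=trades, 1=shares, 2=dollar volume) over the TRF rows."""
    return sum(row[column] for row in trf_rows.values())


for name, column in (("trades", 0), ("shares", 1), ("dollar volume", 2)):
    off = off_exchange_share(column)
    print(f"{name}: {off:.1f}% off-exchange, {100 - off:.1f}% on-exchange")
```

The on-exchange figures come out at roughly 78% of trades, 63% of shares, and 65% of dollar volume.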

It is much the same in Europe and Asia, with substantial numbers of trading entities. There are 48 exchanges in Asia, and seven major exchanges in Europe, along with 80 smaller exchanges.

The electronic trading chain is shown here in a simplified form.

Execution Venues | Exchanges | Buy-Side | Sell-Side | Broker/Dealer

The Execution Venue

Execution venues exist to match buyers and sellers through an order matching engine. Venues include the largest exchanges, alternative trading systems (ATSs), and dark pools. The last two are private venues that do not publish the names of the parties involved in transactions.

Each of these venues has an “order matching engine” which is the heart of the execution trading system, maintaining a very large list of buy and sell orders that are matched in microseconds. When a match occurs, both parties in the sale are notified and the notice of the trade is sent to the market data distributor. This entire process usually occurs in less than 10 microseconds.
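To make the idea of a matching engine concrete, here is a minimal Python sketch of an order book. It assumes price-time priority (best price first, earliest arrival breaking ties) and limit orders only. These are illustrative assumptions, not a description of any particular venue, and the class and method names are hypothetical.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy limit order book, assuming price-time priority."""

    def __init__(self):
        self.bids = []  # heap keyed on negated price, so the highest bid is on top
        self.asks = []  # heap keyed on price, so the lowest ask is on top
        self._seq = count()  # arrival counter: earlier orders win price ties

    def submit(self, side, price, qty):
        """Match an incoming limit order and rest any remainder.

        Returns the list of (price, quantity) fills generated by this order.
        """
        fills = []
        if side == "buy":
            own, opposite = self.bids, self.asks
        else:
            own, opposite = self.asks, self.bids

        while qty > 0 and opposite:
            key, seq, resting_price, resting_qty = opposite[0]
            crosses = price >= resting_price if side == "buy" else price <= resting_price
            if not crosses:
                break
            traded = min(qty, resting_qty)
            fills.append((resting_price, traded))  # trade at the resting order's price
            qty -= traded
            if traded == resting_qty:
                heapq.heappop(opposite)
            else:
                # Partially filled: keep the same priority, reduce the quantity.
                heapq.heapreplace(opposite, (key, seq, resting_price, resting_qty - traded))

        if qty > 0:
            key = -price if side == "buy" else price
            heapq.heappush(own, (key, next(self._seq), price, qty))
        return fills


book = OrderBook()
book.submit("sell", 100.10, 200)
book.submit("sell", 100.05, 100)
fills = book.submit("buy", 100.10, 250)
print(fills)
```

The incoming buy order takes the cheaper resting sell order first and then part of the second one, leaving the remainder of that sell order in the book. A production engine would also handle cancels, other order types, and the notifications to both parties and the market data feed.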

The market data feed distributes trades widely and can amount to millions of trades per second for large exchanges. This information is used for price discovery and is accessible to all parties in the market. It is typically presented as a series of times and prices. Traders subscribe to these feeds from exchanges and use them in their algorithms to create trading strategies, or to buy or sell certain securities based on what has occurred in the market over a certain period of time.

Sell-Side Broker/Dealer

Sell-side brokers/dealers are firms that take orders from buy-side firms and then “work” the orders, typically by splitting them into smaller orders that are sent to the exchange or to other firms. A new order to buy or sell is first placed with a sell-side broker/dealer; it is then said to have “gone through” the broker. Depending on where the order is placed, different actions might then be taken.

The smart order router (SOR) will send the order to the exchange with the best price, or if there is no clear best price, it will route the order to multiple exchanges in an attempt to get the best execution price. The speed of light limits how fast an order can be sent from the broker/dealer to the exchange, so proximity to exchanges is also a consideration in choosing an execution venue. This is an example of trading latency that can provide either an advantage or a disadvantage.
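The speed-of-light limit can be put into numbers. The sketch below estimates the minimum one-way propagation delay over a given distance, assuming the signal travels through optical fiber at roughly two thirds of the speed of light in vacuum. The distances are hypothetical, and real links add switching and serialization delays on top.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum
FIBER_FACTOR = 2 / 3  # assumption: light in optical fiber travels at about 2/3 of c


def one_way_delay_us(distance_km: float, factor: float = FIBER_FACTOR) -> float:
    """Minimum one-way propagation delay, in microseconds, over distance_km."""
    return distance_km / (SPEED_OF_LIGHT_KM_PER_S * factor) * 1_000_000


# Hypothetical distances between a broker/dealer and an execution venue.
for km in (1, 50, 1_000):
    print(f"{km:>5} km: at least {one_way_delay_us(km):,.1f} microseconds one way")
```

Under this assumption every kilometer of fiber costs about 5 microseconds each way, which is why proximity to the exchange matters.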

The Buy Side

These are companies that invest in securities, including insurance firms, mutual funds, hedge funds, and pension funds. These firms buy securities for their own accounts or for clients with the goal of generating a return.

The electronic trading process on the buy-side begins with the generation of an alpha signal. This signal is then fed into an electronic trading model which will determine when to buy or sell a security. The model will also take into account factors such as price, volume, and liquidity. Once the order is generated, it is routed to an exchange.

Sometimes, bigger funds can trade directly with the exchanges through direct market access (DMA) and so don’t require a middleman. They can also offer this service to their clients who want to benefit from DMA technology without going through an intermediary. This reduces trading latency for the traders who care most about speed.

Low trading latency means less price slippage between the moment the buy/sell decision is made and the moment the trade actually happens, which translates into better execution quality for the models. It also means lower trading costs, since it takes less time and fewer resources to execute a trade. High-volume, high-frequency trading depends on low trading latency.
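Slippage is commonly expressed in basis points relative to the price at decision time. The following sketch shows one way to compute it; the function name, the sign convention (positive means the fill was worse than the decision price), and the prices are illustrative assumptions.

```python
def slippage_bps(decision_price: float, fill_price: float, side: str) -> float:
    """Slippage in basis points; positive means the fill was worse than intended."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    sign = 1 if side == "buy" else -1  # paying more hurts a buyer, receiving less hurts a seller
    return sign * (fill_price - decision_price) / decision_price * 10_000


# Hypothetical example: the model decides to buy at 50.00 but is filled at 50.02.
print(f"{slippage_bps(50.00, 50.02, 'buy'):.1f} bps")
```

In this example the price moved 2 cents against the buyer while the order was in flight, which is 4 basis points of slippage.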

Different trading latency types

Although electronic trading involves many applications, the key latency that the financial community is tackling is that between co-located brokers and the exchange they execute trades with.

So much so that some exchanges are now offering cheaper and better colocation services. The challenge, then, has shifted to the changing mix of market participants, who are more likely to trade through their own co-located servers directly with the exchanges.

The resulting co-location strategy is to reduce the complexity of the trading infrastructure, which in turn allows for a more transparent and accessible market. It also reduces costs and the need for capital investments that must be absorbed over time.

Average Latency

The lower the latency, the sooner your algorithmic trading system gets the information and, in turn, the sooner it can react, so you want network communication times to be as short as possible. Typical latencies for data networks are under one millisecond.

The benefits of lower trading latency are numerous. The most important is receiving information about trades as soon as possible so that the best course of action can be taken. Another potential benefit is identifying market conditions sooner and thus reacting to them more accurately.

Latency Jitter

There are scenarios in which predictable message delivery latency is just as important as the average message latency, if not more so. Low latency jitter means that there is little variation in the delay; jitter generally describes the deviation from the mean latency. Low latency jitter is distinct from low message delay. The terms “latency”, “delay”, and “jitter” are often used interchangeably, but they are not the same thing, and different measures may be used to assess each of them.
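The difference between average latency and jitter is easy to see with two sets of measurements that share the same mean. This sketch uses the standard deviation as the jitter measure, which is one common choice among several; the sample values are hypothetical.

```python
import statistics

# Hypothetical one-way latencies in microseconds for two network paths.
steady_path_us = [100, 102, 98, 101, 99]
bursty_path_us = [60, 140, 70, 130, 100]

for name, samples in (("steady", steady_path_us), ("bursty", bursty_path_us)):
    mean_us = statistics.mean(samples)
    jitter_us = statistics.pstdev(samples)  # deviation from the mean
    print(f"{name}: mean {mean_us:.1f} us, jitter {jitter_us:.1f} us")
```

Both paths average 100 microseconds, but the second one is far less predictable, which matters to a strategy that has to budget for its worst cases.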

Throughput

Throughput is a measurement of how much data a system can process in a given amount of time. It is usually defined as the number of messages processed per unit of time and is measured in updates per second.

This information is crucial for designing a system that can provide real-time data without any loss of data and with minimal variance in processing latency.

Techniques to Measure Trading Latencies

Techniques to measure trading latencies include the use of hardware and software to measure the delay from when an order is placed to when it is filled. The electronic trading system then needs to route the order through the necessary exchanges and ECNs to find a match. The time it takes for each exchange or ECN to fill an order varies, so it’s important for traders to have their systems configured in a way that will minimize trading latency.

-Pinging: This involves sending a signal from one computer to another and measuring how long it takes for the signal to return. Pinging can be used to measure the latency of individual components in a system, such as routers or switches, as well as the overall system latency.

-Queuing: This is a more sophisticated method of measuring trading latency that takes into account the time it takes for an order to enter and exit each component in the system. Queuing can be used to identify bottlenecks in the system and to determine which components are causing delays.

-Logging: This is a method of measuring trading latency that involves recording the time stamp of each event in the system, such as when an order is placed or when it is filled. Logging can be used to measure the overall latency of the system or to identify specific events that are causing delays.
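The logging and queuing approaches above can be combined by timestamping an order at each stage and then computing the time spent between consecutive stages. The sketch below is a minimal illustration: the class, the stage names, and the injected fake clock are all hypothetical, and a production system would typically rely on hardware timestamps and synchronized clocks.

```python
import time


class LatencyLog:
    """Records a timestamp for each stage an order passes through."""

    def __init__(self, clock=time.perf_counter_ns):
        self._clock = clock  # injectable so the example below is reproducible
        self.events = []  # list of (order_id, stage, timestamp_ns)

    def record(self, order_id, stage):
        self.events.append((order_id, stage, self._clock()))

    def stage_durations(self, order_id):
        """Nanoseconds elapsed between consecutive recorded stages of one order."""
        stamps = [(stage, ts) for oid, stage, ts in self.events if oid == order_id]
        return {
            f"{a} -> {b}": tb - ta
            for (a, ta), (b, tb) in zip(stamps, stamps[1:])
        }


# Deterministic demo: a fake clock stands in for time.perf_counter_ns.
fake_times = iter([1_000, 4_000, 9_500, 10_000])
log = LatencyLog(clock=lambda: next(fake_times))
for stage in ("placed", "risk_check", "sent_to_venue", "filled"):
    log.record("order-1", stage)

durations = log.stage_durations("order-1")
bottleneck = max(durations, key=durations.get)
print(durations)
print("slowest hop:", bottleneck)
```

Summing the durations gives the overall placed-to-filled latency, while the largest single entry points to the component worth investigating first.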

Trading Latencies’ Metrics

Tick to Trade: Tick to trade is the time interval between receiving a market ‘tick’ (a price movement in the market) and processing the resulting buy or sell order. In such time-sensitive markets, the time taken to respond to incoming market data determines how competitive trading can be: the quicker the response, the more competitive you can be.
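In software, tick to trade can be approximated by taking a timestamp when the tick arrives and another once the order has been handed off. The sketch below assumes a hypothetical decide callback for the strategy and a hypothetical send_order callback for the order gateway. It measures only the time spent in the application itself, not the network or the network adapter.

```python
import time


def on_tick(price, decide, send_order):
    """Handle one tick and return the tick-to-trade time in nanoseconds.

    Returns None when the strategy decides not to trade on this tick.
    """
    t_tick = time.perf_counter_ns()  # timestamp taken on receipt of the tick
    order = decide(price)
    if order is None:
        return None
    send_order(order)
    return time.perf_counter_ns() - t_tick


# Hypothetical strategy: buy 100 shares whenever the price is above 100.
sent = []
latency_ns = on_tick(
    101.0,
    decide=lambda p: ("buy", 100) if p > 100 else None,
    send_order=sent.append,
)
print(f"tick to trade: {latency_ns} ns")
```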

Throughput: The throughput of a system can be estimated by dividing the number of messages processed in a given time period by the length of that period. For example, if a process handles 100 messages in 10 seconds, its throughput is 100/10, or 10 updates per second.

When we measure these metrics, the first thing we get is a raw list of timestamped events, which is not very helpful on its own. To gain useful insights, you need to aggregate the data into subsets of related cases. This aggregation can be carried out in many ways; however, grouping latencies by how frequently they occur is the most effective method.

We can divide the domain of the latencies into intervals or buckets, then count how many times each measure falls into each interval.
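A minimal sketch of this bucketing in Python, assuming fixed-width buckets and a small set of hypothetical measurements in microseconds:

```python
from collections import Counter


def bucket_counts(latencies_us, bucket_width_us):
    """Count samples per bucket; each key is the bucket's lower bound."""
    counts = Counter(
        int(value // bucket_width_us) * bucket_width_us for value in latencies_us
    )
    return dict(sorted(counts.items()))


samples_us = [12, 15, 18, 22, 25, 27, 31, 48, 95, 14]  # hypothetical measurements
histogram = bucket_counts(samples_us, 10)

# Print a simple text histogram, one row per non-empty bucket.
for lower, n in histogram.items():
    print(f"{lower:>3}-{lower + 10:<3} us | {'#' * n}")
```

Most samples land in the 10-20 and 20-30 microsecond buckets, with a single outlier in the 90-100 bucket. A long tail of this kind is hidden when only the average is reported.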

A graph showing an experiment measuring trading latency by requests per second

Despite being fairly straightforward, this distribution provides a lot of useful information. For example, we can find the probability that an event takes a certain amount of time or, alternatively, the probability that an event experiences a delay below a given threshold.

Latencies are typically reported as percentiles. For example, 99.999% of events were delivered within 10 ms, or 97% of round-trip times within the order management system were within 15 ms.
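Both the percentile and the threshold view can be computed directly from the raw samples. The sketch below uses the nearest-rank definition of a percentile, which is one of several conventions in use; the function names and sample values are hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples or not 0 < p <= 100:
        raise ValueError("need samples and 0 < p <= 100")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def fraction_below(samples, threshold):
    """Empirical probability that a sample is below the threshold."""
    return sum(1 for s in samples if s < threshold) / len(samples)


samples_ms = [3, 4, 4, 5, 5, 6, 7, 8, 12, 40]  # hypothetical round-trip times
p50 = percentile(samples_ms, 50)
p90 = percentile(samples_ms, 90)
p99 = percentile(samples_ms, 99)
under_10ms = fraction_below(samples_ms, 10)
print(f"p50={p50} ms, p90={p90} ms, p99={p99} ms, under 10 ms: {under_10ms:.0%}")
```

In this sample the median is 5 ms, but the 99th percentile is 40 ms, so reporting high percentiles alongside the median exposes the outliers that matter most for trading.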

Reducing trading latencies in the order chain

Location

Financial institutions that need low latency should be co-located with the exchange(s) they trade through. That is, their servers are placed in the same data center and on the same network as the exchange, which allows for the rapid execution of trades.

Networking & Kernel

Generally, the server’s CPU, which is often made by Intel, connects to a 10GbE network adapter over a PCI Express bus.

Two key factors affect the latency of this path:

a. How quickly the network adapter can shuttle packets back and forth between the network and your server’s memory across the PCI Express bus

b. Whether the network adapter offers an alternative to the standard kernel TCP/IP stack for communicating with trading applications, as low-latency, high-performance adapters do

These two factors are usually intertwined: some organizations bypass the OS kernel altogether, so that the network adapter communicates directly with user space, where the trading application runs.

Application

Architecting the system to handle high-throughput market data with the lowest possible trading latencies is critical for any HFT firm. HFT firms employ a variety of strategies and techniques to gain an edge in the market, but all of them require access to market data at extremely low latencies.

For more details, you can read our article here.

FPGA

The main advantage of implementing trading applications on an FPGA is that the path from the 10GbE network to the FPGA fabric, where the application is implemented, is around an order of magnitude faster than the path across a high-speed internal data bus.

The current generation of FPGAs is increasingly powerful. FPGA applications can be clocked faster, and the quantity of resources within the FPGA, such as RAM, has increased by over an order of magnitude from previous generations without consuming more power. It is now possible for multiple electronic trading applications to coexist on the same FPGA.

With these approaches, trading systems can be up to 10 times faster than traditional software solutions, with latencies under 100 nanoseconds.

TCA and impacts on trading latencies

Transaction Cost Analysis (TCA) is the practice of monitoring and reporting on trade performance. It is certainly not a new concept: buy-side firms have long used broker TCA services to analyze their executions and optimize equity order flow.

The goal of TCA is to achieve a seamless analysis of executions across all tradable markets and to provide actionable recommendations on strategy behavior. This requires profiling order flow in the traditional (historical) sense, measuring the performance of individual brokers and venues, and also monitoring strategies in real time. Execution analysis is also a tool for liquidity analysis, highlighting value and toxicity across fragmented markets.

TCA relies on low-latency data delivery, fast execution technology, and analysis tools built for a fragmented market structure. Determining exactly where to execute at any given moment has become a key driver of best execution. This includes generating real-time alerts on market conditions that feed back into strategy logic, so that strategies can be modified mid-day.

Conclusion

Latency is a key indicator of responsiveness. For fast trading strategies that need to react in microseconds, the trading latency needs to be low.

Latency is inevitable, but there are tools that can help to identify its sources. These tools can then be used to improve the specific areas that are causing latency issues.

The low-latency arms race focuses on squeezing every microsecond out of exchange-based transactions through improvements to hardware, software, and diagnostic tools, all of which contribute to reducing overall trading latencies.

We at SiS Software Factory implement these systems to improve trading latencies in every aspect of the development process.

