How do I design high-frequency trading systems and their architecture? Part II

In the first part, I explained the basic concepts of architecting a low-latency trading system and gave some examples of how to implement a very fast order book.

In this second part, I will explain how to implement the next components and the key part: what pattern to use.

3. Order management system: OMS

This module will manage all orders sent to the venues, based on signals generated by your strategy. It will handle sending, canceling and replacing orders, as well as accessing information about executed orders, including pending and open orders.

We must send these orders in a very efficient and cost-effective manner, routing each order, depending on one or more of the following:

  • the signal strategy
  • venue costs
  • latencies between venues
  • best prices available
  • shares or contracts available on each venue
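
As a rough illustration of how some of these criteria could be combined, the sketch below picks a venue for a buy order by comparing an all-in cost per share (best ask plus fee), skipping venues that have too little size or too much latency. The `VenueQuote` struct, its fields, and `pick_venue` are hypothetical names made up for this example; they do not come from any real exchange API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical snapshot of one venue; every field here is illustrative.
struct VenueQuote {
    std::string name;
    std::int64_t ask_ticks;   // best ask, in integer ticks to avoid float rounding
    std::int64_t fee_ticks;   // per-share taker fee, in the same units
    std::int64_t ask_size;    // shares available at the best ask
    std::int64_t latency_us;  // measured round-trip latency to the venue
};

// Returns the index of the cheapest venue that can fill `qty` shares within
// `max_latency_us`, or std::nullopt if no venue qualifies.
std::optional<std::size_t> pick_venue(const std::vector<VenueQuote>& venues,
                                      std::int64_t qty,
                                      std::int64_t max_latency_us) {
    std::optional<std::size_t> best;
    std::int64_t best_cost = 0;
    for (std::size_t i = 0; i < venues.size(); ++i) {
        const VenueQuote& v = venues[i];
        if (v.ask_size < qty || v.latency_us > max_latency_us) continue;
        const std::int64_t cost = v.ask_ticks + v.fee_ticks;  // all-in cost per share
        if (!best || cost < best_cost) {
            best = i;
            best_cost = cost;
        }
    }
    return best;
}
```

A real router would also weigh the strategy's signal and might split an order across venues; this sketch only ranks single-venue fills.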


It also needs to be smart enough to know when an order was:

  • Rejected or canceled by the venue
  • Partially filled
  • Fully filled

So, depending on which of the above statuses it receives, your order management system may execute different paths.
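
One way to picture those paths is a small state tracker per order. This is only a sketch: the `OrderStatus` and `Action` enums, the `Order` struct, and the choice to re-route after a reject or cancel are assumptions made for illustration, not a prescribed OMS design.

```cpp
#include <cassert>
#include <cstdint>

enum class OrderStatus { PendingNew, Open, PartiallyFilled, Filled, Canceled, Rejected };
enum class Action { Wait, KeepWorking, Reroute, Done };  // hypothetical OMS reactions

struct Order {
    explicit Order(std::int64_t qty) : quantity(qty) {}
    std::int64_t quantity;
    std::int64_t filled = 0;
    OrderStatus status = OrderStatus::PendingNew;
    std::int64_t remaining() const { return quantity - filled; }
};

// Handlers for the messages a venue may send back.
inline void on_ack(Order& o)    { o.status = OrderStatus::Open; }
inline void on_reject(Order& o) { o.status = OrderStatus::Rejected; }
inline void on_cancel(Order& o) { o.status = OrderStatus::Canceled; }

inline void on_fill(Order& o, std::int64_t fill_qty) {
    assert(fill_qty > 0 && fill_qty <= o.remaining());
    o.filled += fill_qty;
    o.status = (o.remaining() == 0) ? OrderStatus::Filled
                                    : OrderStatus::PartiallyFilled;
}

// Decides which path the OMS takes next for this order.
inline Action next_action(const Order& o) {
    switch (o.status) {
        case OrderStatus::PendingNew:      return Action::Wait;
        case OrderStatus::Open:
        case OrderStatus::PartiallyFilled: return Action::KeepWorking;
        case OrderStatus::Rejected:
        case OrderStatus::Canceled:        return Action::Reroute;  // assumption: try another venue
        case OrderStatus::Filled:          return Action::Done;
    }
    return Action::Done;  // not reached; keeps compilers quiet
}
```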


4. Strategies

As the brain of our system, strategies will take the limit order book from each venue and make decisions based on different parameters and values.

Some strategies will need to analyze the entire depth of the books; others need just the top-of-book prices, that is, the best bid and best ask.

Here you can apply an almost infinite variety of strategies, as long as you make sure they are profitable ideas.

One could be a simple latency arbitrage strategy: each venue receives market information at different times, and if our system is fast enough we can take advantage of those price gaps. Usually, these discrepancies last no more than 500 microseconds, and after that the market participants bring prices back into balance.

Here’s one example. A big institution is in the market to buy a large quantity of a given stock. Its algorithms will execute the trade slowly, trying to get the best price: they will take whatever’s available at, say, $4.50 per share, then what’s available at $4.51, and so on. This is where “latency arbitrage” may come in. Our strategy can see that this fund’s algorithm is in the market and essentially buy up all the available shares at $4.50 an instant before it does. Now the fund’s algorithm moves on and looks for shares at $4.51. Our algorithm sells it, at $4.51, all the stock it just bought at $4.50, earning a completely risk-free penny a share.

It sounds small, but if you do this several thousand times per day on large volumes, it can add up to many millions of dollars per trading day, and several billion per year.
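
The price-gap check behind a strategy like this can be sketched in a few lines. The `Quote` struct and both function names are hypothetical, and prices are held as integer ticks (here, cents) purely as a simplifying assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical top-of-book quote from one venue, in integer ticks.
struct Quote {
    std::int64_t bid_ticks;
    std::int64_t ask_ticks;
};

// Gross edge per share from buying at `buy_venue`'s ask and selling at
// `sell_venue`'s bid. Positive means the two venues are out of line.
inline std::int64_t gap_ticks(const Quote& buy_venue, const Quote& sell_venue) {
    return sell_venue.bid_ticks - buy_venue.ask_ticks;
}

// The gap is only worth acting on if it exceeds fees and other per-share costs.
inline bool is_actionable(const Quote& buy_venue, const Quote& sell_venue,
                          std::int64_t cost_ticks) {
    return gap_ticks(buy_venue, sell_venue) > cost_ticks;
}
```

With a stale venue still offering at 450 while a faster venue already bids 451, `gap_ticks` returns 1: the penny a share from the example, before costs.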


Another example could be the well-known “triangular arbitrage”: an arbitrage that exploits price discrepancies between three currency pairs. Forex is traded in pairs, and three pairs that share currencies, e.g. EUR/USD, EUR/GBP and GBP/USD, must stay in ratio with each other. During a big market event (a failed coup in Turkey, to take an extreme example), EUR/USD may move faster than it should to stay in ratio with the rate of EUR/GBP. That can be just a market function: traders sell EUR/USD before EUR/GBP without algorithms. Or large orders can push the difference between EUR/USD and EUR/GBP slightly off. Even if the rates are off by only a fraction of a cent, this can lead to millions in profits if you are fast enough.
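
The arithmetic can be sketched as follows, assuming a third pair, GBP/USD, closes the triangle with EUR/USD and EUR/GBP. The function names and the rates used afterwards are invented for illustration, and the sketch ignores fees, slippage and execution risk.

```cpp
#include <cassert>

// EUR/USD rate implied by the other two legs:
// (GBP per EUR) * (USD per GBP) = USD per EUR.
inline double implied_eur_usd(double eur_gbp, double gbp_usd) {
    return eur_gbp * gbp_usd;
}

// Profit in USD from the round trip USD -> EUR -> GBP -> USD.
// We buy EUR at the EUR/USD ask, sell EUR for GBP at the EUR/GBP bid,
// then sell GBP for USD at the GBP/USD bid.
inline double round_trip_profit_usd(double start_usd,
                                    double eur_usd_ask,
                                    double eur_gbp_bid,
                                    double gbp_usd_bid) {
    const double eur = start_usd / eur_usd_ask;
    const double gbp = eur * eur_gbp_bid;
    const double usd = gbp * gbp_usd_bid;
    return usd - start_usd;
}
```

For example, with EUR/USD quoted at 1.1000 while EUR/GBP at 0.8500 and GBP/USD at 1.3000 imply 1.1050, a round trip starting from 1,000,000 USD returns roughly 4,545 USD more than it started with. When the quoted and implied rates agree, the same round trip earns nothing.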


5. Software Architecture: Patterns

The fun starts when we want to put all these pieces together, to interact concurrently and with the lowest latency possible between processes.

The basic architecture looks like this:



As we already know, concurrency is key, and to have it working properly we will need to use synchronization methods (to concurrently access data in memory) and a design architecture that fits our needs of low latency.

That’s why we need to talk about “software design patterns” and choose the best for our needs.

There are several well-known options we can use here:

Observer Design Pattern: this is a software design pattern in which an object, called the subject, maintains a list of its dependents, called observers, and notifies them automatically of any state changes.

This is fine, but if you have multiple strategies running on the same system, the notifications will be processed one by one. That is, the subject will first notify “strategy 1”, which does its calculations and sends orders if some criteria are met; only then does it continue with “strategy 2”, which again does its calculations and checks whether its criteria are met. This is a sequential process!


So, our order book module will be the subject, and each strategy an observer.

As I show below, the implementation using C++ will look something like this:


Observer and Subject implementation
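
A minimal sketch of such an arrangement, written for illustration only: the class names `OrderBook`, `Strategy` and `RecordingStrategy`, and the `TopOfBook` struct, are assumptions of this sketch and not the code from the gist.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct TopOfBook {
    double bid;
    double ask;
};

// Observer: every strategy implements this interface.
class Strategy {
public:
    virtual ~Strategy() = default;
    virtual void on_price_change(const TopOfBook& tob) = 0;
};

// Subject: the order book keeps a list of observers and notifies them.
class OrderBook {
public:
    void attach(Strategy* s) { observers_.push_back(s); }

    void update(double bid, double ask) {
        tob_ = TopOfBook{bid, ask};
        // Observers are called one after another, in attach order:
        // strategy N waits until strategies 1..N-1 have returned.
        for (Strategy* s : observers_) s->on_price_change(tob_);
    }

private:
    TopOfBook tob_{0.0, 0.0};
    std::vector<Strategy*> observers_;
};

// Toy strategy that only records the order in which it was notified.
class RecordingStrategy : public Strategy {
public:
    RecordingStrategy(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {}
    void on_price_change(const TopOfBook&) override { log_.push_back(name_); }

private:
    std::string name_;
    std::vector<std::string>& log_;
};
```

Attaching two `RecordingStrategy` objects and calling `update` once fills the log in attach order, which makes the sequential nature of the notification loop visible.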



What I did here is set up an observer (the strategy) and the subject (the order book).

The order book will send notifications whenever any price has changed, so all the strategies can receive them and act on them. If a strategy meets its specific criteria, it will trigger orders to the exchange. That process of checking whether the criteria are met can be more or less complicated, and hence can take more or less time.

As you can see, this is a serial process, and if for any reason “strategy 1” takes a couple of milliseconds to do some fancy calculations, then by the time “strategy 2” gets the notification it is too late, and so on, until we reach the last strategy, which is already making decisions on past information.

That’s the reason why we can’t use this design pattern at all.

Just in case you are thinking of spawning a thread for each notification or doing it asynchronously, let me tell you that it will be even worse. The overhead will be such that doing it sequentially will be faster.

You can also find this code on my gist repository


Signals and Slots Pattern: used for communication between objects or processes. The underlying implementation is similar to the Observer Pattern: an object can send signals containing event information, which others receive through special functions known as slots. This is similar to C++ callbacks (function pointers), but a signal/slot system ensures the type-correctness of the callback arguments.


Since this pattern is also derived from the observer pattern, I prefer not to use it either.

All event/messaging/signal patterns are variations of the observer pattern, so they are not suitable for my low-latency purposes.
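
To make the type-correctness point concrete, here is a bare-bones sketch built on `std::function`. The `Signal` class template and its `connect` and `fire` members are hypothetical names for this example; this is not the Qt or Boost.Signals2 API.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// A signal parameterised on its argument types. Connecting a slot whose
// parameters do not match Args... fails at compile time.
template <typename... Args>
class Signal {
public:
    using Slot = std::function<void(Args...)>;

    void connect(Slot slot) { slots_.push_back(std::move(slot)); }

    // Calls every connected slot in connection order, like an observer loop.
    void fire(Args... args) const {
        for (const Slot& slot : slots_) slot(args...);
    }

private:
    std::vector<Slot> slots_;
};
```

A `Signal<double, double>` for price changes accepts any callable taking two doubles. Note that `fire` still walks the slots one by one, which is the same sequential behaviour as the observer loop.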


Ring buffer pattern: now we are getting closer. This pattern performs very well, and it is implemented in many low-level applications. The ring buffer is a circular queue data structure with first-in-first-out (FIFO) behavior. It has two indices: one indicating where your process can read from, and one indicating where it can write to. With a single reader and a single writer, the two never touch the same slot at the same time, so no locks are needed (only atomic updates of the indices). Structures of this kind are called “lock-free”, and they usually outperform lock-based alternatives.
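
A minimal single-producer, single-consumer ring buffer might look like the sketch below. It is written for illustration, not taken from LMAX or any other library; the name `SpscRing` is made up, and the design assumes exactly one thread calls `push` and exactly one calls `pop`.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2, "one slot is always left empty");

public:
    // Producer thread only. Returns false if the buffer is full.
    bool push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) return false;  // full
        buffer_[head] = value;
        head_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    // Consumer thread only. Returns std::nullopt if the buffer is empty.
    std::optional<T> pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return std::nullopt;  // empty
        T value = buffer_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);  // free the slot
        return value;
    }

private:
    std::array<T, Capacity> buffer_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};
```

Because one slot is kept empty to tell "full" from "empty", a `SpscRing<int, 4>` holds at most three items. The acquire/release pairs on the indices are what replace the lock: each side only ever writes its own index.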


One big adopter of this pattern is LMAX, with its Disruptor. Below is an image of their implementation using this pattern.


This kind of pattern is ideal for socket communications, where serial data has to be managed, so in this case it is not suitable for our trading architecture either.


In the next article, I will explain what I would choose as the architecture for an ultra-low-latency system like the one we intend to build.


To be continued …


Ariel Silahian

Keywords: #hft #quants #forex #fx #risk $EURUSD $EURGBP $EURJPY

