I was using timers, and I was getting insanely different times for the same code, going anywhere from 0ms to 20ms without any obvious changes to the environment or anything.
I was banging my head against it for hours, until I realized that async code is weird. Async code isn’t directly “run”, it’s “scheduled” and the calling thread can yield until we get the result. By trying to do microbenchmarks, I wasn’t really testing “my code”, I was testing the .NET scheduler.
It was my first glimpse of why benchmarking is deceptively hard. I think about it every time I have to write performance tests.
There was a dedicated team of folks who had built a random forest model to predict what the latency SHOULD be, based on features like:
- trading team
- exchange
- order volume
- time of day
- etc
If that system detected a change, an unexpected spike, etc., it would fire off an alert, and then it was my job, as part of the trade support desk, to go investigate why: was it a different trading pattern, did the exchange modify something, and so on.
One day, we get an alert for IEX (of Flash Boys fame). I end up on the phone with one of our network engineers and, from IEX, one of their engineers and their sales rep for our company.
We are describing the change in latency and the sales rep drops his voice and says:
"Bro, I've worked at other firms and totally get why you care about latency. Other exchanges also track their internal latencies for just this type of scenario so we can compare and figure out the the issue with the client firm. That being said, given who we are and our 'founding story', we actually don't track out latencies so I have to just go with your numbers."
* mean/median/p99/p99.9/p99.99/max over day, minute, second, and 10ms windows
* software timestamps from the rdtsc counter for interval measurements - am17 says why below (a sketch follows this list)
* all of that not just on a timer - but also for each event - order triggered for send, cancel sent, etc - for ease of correlation to markouts.
* hw timestamps off some sort of port replicator that has under 3ns jitter - and a way to correlate to above.
* network card timestamps for similar - Solarflare cards (now AMD) support start-of-frame to start-of-Ethernet-frame measurements.
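As promised above, here is a minimal sketch of the rdtsc item on that list: read the TSC around an event and reduce the samples to min/median/p99/p99.9/max. It assumes an x86 target with GCC or Clang, the measured work is elided, and it glosses over TSC-frequency calibration and serialization, so it is an illustration rather than a production harness.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>
#include <x86intrin.h>  // __rdtsc on GCC/Clang, x86 only

// Read the time-stamp counter. Units are TSC ticks, not nanoseconds;
// converting requires calibrating the TSC frequency, which is skipped here.
static inline uint64_t tsc_now() { return __rdtsc(); }

// q-th quantile (q in [0,1]) over a copy of the samples.
uint64_t percentile(std::vector<uint64_t> v, double q) {
    std::sort(v.begin(), v.end());
    return v[static_cast<size_t>(q * (v.size() - 1))];
}

int main() {
    std::vector<uint64_t> samples;
    samples.reserve(100000);

    for (int i = 0; i < 100000; ++i) {
        uint64_t t0 = tsc_now();
        // ... the event being measured would go here, e.g. "order triggered for send" ...
        uint64_t t1 = tsc_now();
        samples.push_back(t1 - t0);
    }

    std::printf("min=%llu med=%llu p99=%llu p99.9=%llu max=%llu (ticks)\n",
                (unsigned long long)*std::min_element(samples.begin(), samples.end()),
                (unsigned long long)percentile(samples, 0.50),
                (unsigned long long)percentile(samples, 0.99),
                (unsigned long long)percentile(samples, 0.999),
                (unsigned long long)*std::max_element(samples.begin(), samples.end()));
}
```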
The number one mistake I see people make is measuring one time and taking the results at face value. If you do nothing else, measure three times and you will at least have a feeling for the variability of your data. If you want to compare two versions of your code with confidence there is usually no way around proper statistical analysis.
Which brings me to the second mistake. When measuring runtime, taking the mean is not a good idea. Runtime measurements usually skew heavily towards a theoretical minimum, which is a hard lower bound. The distribution is heavily lopsided, with a long tail. If your objective is to compare two versions of some code, the minimum is a much better measure than the mean.
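To make that concrete, here is a small sketch of such a comparison: run each version many times and compare minima instead of means. The version_a/version_b functions are hypothetical stand-ins, and a real harness would also have to keep the compiler from optimizing the measured work away.

```cpp
#include <chrono>
#include <cstdio>

// Run f() `reps` times and return the minimum wall-clock time in nanoseconds.
// The minimum approximates the hard lower bound the measurements skew towards.
template <typename F>
long long min_runtime_ns(F f, int reps = 30) {
    long long best = -1;
    for (int i = 0; i < reps; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        f();
        auto t1 = std::chrono::steady_clock::now();
        long long ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
        if (best < 0 || ns < best) best = ns;
    }
    return best;
}

// Hypothetical stand-ins for the two versions being compared.
void version_a() { /* old implementation */ }
void version_b() { /* new implementation */ }

int main() {
    std::printf("version_a min: %lld ns\n", min_runtime_ns(version_a));
    std::printf("version_b min: %lld ns\n", min_runtime_ns(version_b));
}
```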
The other thing is that L1/L2 switches provide this functionality of taking switch timestamps and marking frames with them, which is the true test of e2e latency without any clock drift etc.
Also, fast code is actually really, really hard; you just need to create the right test harness once.
And measuring is hard. This is why consistently fast code is hard.
In any case, adding some crude performance testing to your CI/CD suite, and flagging a problem if a test ran for much longer than it used to, is very helpful for quickly detecting bad performance regressions.
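For illustration, here is a sketch of how crude such a check can be and still catch regressions: time a workload and fail the build if it exceeds a stored baseline by more than some slack. The workload, baseline value, and 30% tolerance are all placeholders; a real setup would read the baseline from a previous run's artifact.

```cpp
#include <chrono>
#include <cstdio>
#include <cstdlib>

// Hypothetical workload standing in for the code path under test.
void workload() {
    volatile long sum = 0;
    for (long i = 0; i < 10000000; ++i) sum = sum + i;
}

int main() {
    // Baseline from a previous known-good build, in ms. In a real setup this
    // would come from a file or CI artifact, not a hard-coded placeholder.
    const double baseline_ms = 25.0;
    const double tolerance   = 1.3;  // allow 30% slack for run-to-run noise

    auto t0 = std::chrono::steady_clock::now();
    workload();
    auto t1 = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();

    std::printf("workload took %.2f ms (baseline %.2f ms)\n", ms, baseline_ms);

    // A non-zero exit code fails the CI job and flags a possible regression.
    if (ms > tolerance * baseline_ms) {
        std::fprintf(stderr, "possible performance regression\n");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```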
The article states the opposite.
> Writing fast algorithmic trading system code is hard. Measuring it properly is even harder.