FRESH

Hacker News

High-Performance DBMSs with io_uring: When and How to use it

187 points by matt_d

by to_ziegler

2 subcomments

We also wrote up a very concise, high-level summary here, if you want the short version: https://toziegler.github.io/2025-12-08-io-uring/

by eliasdejong

1 subcomments

by CoolCold

2 subcomments

> Figure 9: Durable writes with io_uring. Left: Writes and fsync are issued via io_uring or manually linked in the application. Right: Enterprise SSDs do not require fsync after writes.
This sounds strange to me, of not requiring fsync. I may be wrong, but if it was meant that Enterprise SSDs have buffers and power-failure safety modes which works fine without explicit fsync, I think it's too optimistic view here.

by kinds_02_barrel

1 subcomments

Really nice paper.
The practical guidelines are useful. Basically “first prove I/O is actually your bottleneck, then change the architecture to use async/batching, and only then reach for features like fixed buffers / zero-copy / passthrough / polling.”
I'm curious, how sensitive are the results to kernel version & deployment environments? Some folks run LTS kernels and/or containers where io_uring may be restricted by default.

by jelder

3 subcomments

I just today realized io_uring is meant to be read as "I.O.U. Ring" which perfectly describes how it works.

by lukeh

2 subcomments

by melhindi

0 subcomment

by bikelang

2 subcomments

Do any cloud hosting providers make io_uring available? Or are they all blocking it due to perceived security risks? I think I have a killer use case and would love to experiment - but it’s unclear who even makes this available.

by maknee

1 subcomments

by LAC-Tech

1 subcomments

NVMe passthrough skips abstractions. To access NVMe de- vices directly, io_uring provides the OP_URING_CMD opcode, which issues native NVMe commands via the kernel to device queues. By bypassing the generic storage stack, passthrough reduces software- layer overhead and per-I/O CPU cost. This yields an additional 20% gain, increasing throughput to 300 k tx/s (Figure 5, +Passthru).
Which userspace libraries support this?. Liburing does, but Zig's Standard Library (relevant because a tigerbeetler wrote the article) does not, just silently gives out corrupt values from the completion queue.
On the rust side, rustix_uring does not support this, but widely doesn't let you set kernel flags for something it doesn't support. tokio-rs/io-uring looks like it might from the docs, but I can't figure out how (if anyone uses it there, let me know).

by anassar163

1 subcomments

This is one of the most easy-to-follow papers on io_uring and its benefits. Good work!

by user3939382

0 subcomment

I have a better way to do this. My optimizer basically finds these hot paths automatically using a type of PGO that’s driven by the real workloads and the RDBMS plug-in branches to its static output.