c0l0
This would have been such a great resource for me just a few weeks ago!

We wanted to finally encrypt the L2 links between our DCs and got quotes from a number of providers for hardware appliances, and I was like, "no WAY this should cost that much!", and went off to try to build something myself that hauled Ethernet frames over a WireGuard overlay network at 10 Gbps using COTS hardware. I did pull it off after about ten days of work, undercutting the cheapest offer by about 70% (and the most expensive one by about 95% or so...), but there was a lot of intricate reading and experimentation involved.
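
For readers wondering what "hauling Ethernet frames over a WireGuard overlay" can look like in practice: one common approach (and only a guess at what the setup above actually was) is a GRETAP tunnel running over the WireGuard interface, bridged to the local LAN port. A minimal sketch with made-up interface names and addresses:

```python
# Hypothetical sketch (not necessarily the setup described above): carry Ethernet
# frames across a routed WireGuard link by running a GRETAP tunnel over the
# WireGuard tunnel addresses and bridging it to the local LAN port.
# Interface names and addresses are made up; requires root and iproute2.
import subprocess

WG_LOCAL = "10.255.0.1"   # local WireGuard tunnel address (assumption)
WG_REMOTE = "10.255.0.2"  # remote WireGuard tunnel address (assumption)
LAN_IF = "eth1"           # NIC facing the DC-local L2 segment (assumption)

def ip(*args):
    subprocess.run(["ip", *args], check=True)

# GRETAP encapsulates Ethernet frames in GRE; point it at the WireGuard peer.
ip("link", "add", "gretap1", "type", "gretap", "local", WG_LOCAL, "remote", WG_REMOTE)
# Bridge the tunnel and the local port so frames are forwarded transparently.
ip("link", "add", "br0", "type", "bridge")
ip("link", "set", "gretap1", "master", "br0")
ip("link", "set", LAN_IF, "master", "br0")
for dev in ("gretap1", LAN_IF, "br0"):
    ip("link", "set", "dev", dev, "up")
# Note: GRE + WireGuard headers eat into the MTU, so either lower the MTU on the
# bridged segment or raise it on the underlay; getting this right at 10 Gbps matters.
```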

I am looking forward to validating my understanding against the content of this article - it looks very promising and comprehensive at first and second glance! Thanks for creating and posting it.

hyperman1
I wonder if it's worth it, with this number of tunables, to write software to tune them automatically, gradient-descent style: choose a parameter from a whitelist at random and slightly increase or decrease it, within a permitted range. Measure performance for a while, then undo the change if things got worse, or push further in the same direction if things got better.
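
To make that concrete, here is a minimal sketch of such a random-perturbation loop in Python. The whitelist, ranges, settle time, and the iperf3-based throughput probe (peer 192.0.2.1) are all placeholder assumptions, not anything taken from the article.

```python
import json
import random
import subprocess
import time

# Hypothetical whitelist: sysctl name -> (min, max, step). Values are illustrative only.
WHITELIST = {
    "net.core.rmem_max":           (262144, 268435456, 1048576),
    "net.core.wmem_max":           (262144, 268435456, 1048576),
    "net.core.netdev_max_backlog": (1000, 100000, 1000),
}

def get_sysctl(name):
    out = subprocess.run(["sysctl", "-n", name], capture_output=True, text=True, check=True)
    return int(out.stdout.split()[0])

def set_sysctl(name, value):
    subprocess.run(["sysctl", "-w", f"{name}={value}"], check=True)

def measure_throughput():
    # Placeholder probe: a short iperf3 run against a hypothetical peer (192.0.2.1).
    # In practice you would measure whatever production metric you actually care about.
    out = subprocess.run(["iperf3", "-c", "192.0.2.1", "-J", "-t", "10"],
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)["end"]["sum_sent"]["bits_per_second"]

def tune(iterations=100, settle_seconds=30):
    best = measure_throughput()
    for _ in range(iterations):
        name = random.choice(list(WHITELIST))
        lo, hi, step = WHITELIST[name]
        old = get_sysctl(name)
        # Nudge the parameter one step up or down, staying inside the permitted range.
        new = min(hi, max(lo, old + random.choice((-step, step))))
        set_sysctl(name, new)
        time.sleep(settle_seconds)   # let the system settle before measuring
        score = measure_throughput()
        if score > best:
            best = score             # keep the change and keep exploring from here
        else:
            set_sysctl(name, old)    # undo if things got worse

if __name__ == "__main__":
    tune()
```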
dakiol
I find this cool, but as a software engineer I rarely get the chance to run any of the commands mentioned in the article. The reason: our systems run in containers that are stripped-down versions of some Linux, and I don’t have shell access to production systems (and usually reproducing a bug in a dev or QA environment is useless because they are very different from prod in terms of load and the like).

So the only chance I get to run any of the commands in the article is when playing around with my own systems. I guess they would also be useful if I were working as a platform engineer.

betaby
"net.core.wmem_max: the upper limit of the TCP send buffer size. Similar to net.core.rmem_max (but for transimission)."

and then we have `net.ipv4.tcp_wmem`, which brings up two questions: 1. why is there no IPv6 equivalent, and 2. what's the difference from `net.core.wmem_max`?

totallyunknown
What's missing a bit here is debugging and tuning for >100 Gbps throughput. Serving HTTP at that scale often requires kTLS, because the first bottleneck that appears is memory bandwidth. Tools like AMD μProf are very helpful for debugging this. eBPF-based continuous profiling is also helpful for understanding exactly what's happening in the kernel and in user space. But overall, a good read!
rjgonza
This seems pretty cool, thanks for sharing. So far, at least in my career, whenever we need "performance" we start with kernel bypass.
hnaccountme
Thank you