Quiet networking
The kindest thing a piece of network software can do for an operator is to have no opinion. The second kindest is to know, when it does have to have one, that the operator had one first.
Most of the network software we work with day to day fails on the first count and has never heard of the second. It assumes its own importance. It pages someone at 03:42 to tell them a thing flipped from "up" to "up". It "reconciles" config that someone let drift on purpose ten minutes earlier. It logs a hundred lines of structured JSON to tell you that one packet looked, in passing, slightly unusual.
The problem isn't that the software is bad. Most of it is fine. The problem is that it has been allowed to grow without anybody whose job it is to take the side of the operator's evening.
Three rules we try to follow
1. The default for a new feature is off. Not "off behind a feature flag we'll forget about." Off, in source. Removing it should be a one-line patch. The line of code that makes a thing happen should be inside the user's repo, not ours.
2. The default log level is one we'd watch live. If you wouldn't tail it on a quiet Wednesday afternoon, it's not info. Most things that look like info are debug. Most things that look like warn are debug. The actual warn line, the one that says "something is becoming worse," is so rare you'll recognise it.
3. Identical inputs produce identical outputs, including in the logs. Don't include a hash of the current time. Don't include the request ID in the human-readable string. The line you're trying to grep for in the postmortem is the line you wrote three months ago, and you want it to match.
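As a rough illustration of rules 2 and 3, here is a minimal sketch using Python's standard logging module. The logger name, the report_link_state helper, the interface name, and the decision to carry the request ID as a structured field (via extra) rather than drop it entirely are all assumptions made for the example, not a description of any particular system.

```python
import io
import logging


def build_logger(stream, level=logging.INFO):
    """Build a logger whose human-readable line has no timestamp or request ID."""
    logger = logging.getLogger("quiet.example")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    # Level and message only: nothing here varies between identical runs.
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def report_link_state(logger, interface, old, new, request_id):
    """Hypothetical helper: log a link transition without per-request noise."""
    if old == new:
        # "up" to "up" is not a change; it is debug at most.
        logger.debug("link state unchanged: %s %s", interface, new)
        return
    # Assumption: the request ID rides along as a structured field on the
    # record, so the greppable string stays identical across requests.
    logger.warning(
        "link state changed: %s %s -> %s",
        interface, old, new,
        extra={"request_id": request_id},
    )


def capture(request_id):
    """Run the same two events and return exactly what was written."""
    buf = io.StringIO()
    log = build_logger(buf)
    report_link_state(log, "eth0", "up", "up", request_id)
    report_link_state(log, "eth0", "up", "down", request_id)
    return buf.getvalue()


if __name__ == "__main__":
    # Two different request IDs, byte-identical log text.
    print(capture("req-1") == capture("req-2"))  # prints True
    print(capture("req-1"), end="")  # prints: WARNING link state changed: eth0 up -> down
```

The "up" to "up" event never reaches the default output, and the one line that does appear is the same string every time, so a grep written months earlier still matches it.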
None of this is novel. It's mostly the residue of writing too many systems and watching too many people stop using them.