Hacker News new | comments | ask | show | jobs | submit login

A systems programmer will specialize in subdomains much like any other programmer, but there are some characteristic skills and knowledge that are common across most of them. Systems programmer coding style is driven by robustness, correctness, and performance to a much greater degree than higher up the stack. Most systems programming jobs these days is C/C++ on Linux and will be for the foreseeable future.

- Know how your hardware actually works, especially CPU, storage, and network. This provides the "first principles" whence all of the below are derived and will allow you to reason about and predict system behavior without writing a single line of code.

- Understand the design of the operating system well enough to reimplement and bypass key parts of it as needed. The requirements that would cause you to to build, for example, a custom storage engine also means you aren't mmap()-ing files, instead doing direct I/O with io_submit() on a file or raw block device into a cache and I/O scheduler you designed and wrote. Study the internals of existing systems code for examples of how things like this are done, it is esoteric but not difficult to learn.

- Locality-driven software design. Locality maximization, both spatial and temporal, is the source of most performance in modern software. In systems programming, this means you are always aware of what is currently in your hardware caches and efficiently using that resource. As a corollary, compactness is consistently a key objective of your data structures to a much greater extent than people think about it higher up the stack. One way you can identify code written by systems programmer is the use of data structures packed into bitfields.

- Understand the difference between interrupt-driven and schedule-driven computing models, the appropriate use cases for both, and how to safely design code that necessarily mixes both e.g. multithreading and coroutines. This is central to I/O handling: network is interrupt-driven and disk is schedule-driven. Being explicit about where the boundaries are between these modes in your software designs greatly simplifies reasoning about concurrency. Most common concurrency primitives make assumptions about which model you are using.

- All systems are distributed systems. Part of the systems programmers job is creating the illusion that this is not the case to the higher levels in the stack, but a systems programmer unavoidably lives in this world even within a single server. Knowing "latencies every programmer should know" is just a starting point, it is also helpful to understand how hardware topology interacts with the routing/messaging patterns to change latency -- tail latencies are more important than median latencies.

The above is relatively abstract and conceptual but generalizes to all systems programming. Domains of systems programming have deep and unique knowledge that is specific to the specialty e.g. network protocol stacks, database engines, high performance graphics, etc. Because the physics of hardware never changes except in the details, the abstractions are relatively thin, and the tool chains are necessarily conservative, systems programming skills age very well. The emergence of modern C++ (e.g. C++17) has made systems programming quite enjoyable.

Also, the best way to learn idiomatic systems programming is to read many examples of high-quality systems code. You will see many excellent techniques in the code that you've never seen before that are not documented in book/paper form. I've been doing systems work for two decades and I still run across interesting new idioms and techniques.






> Know how your hardware actually works, especially CPU, storage, and network.

I learned about MIPS CPUs in college, 20 years ago, with Patterson and Hennessy, and built one from logic gates. It was basically a high-powered 6502 with 30 extra accumulators, but split across the now-classic pipeline stages.

I understand that modern CPUs are much different than this. I understand that there are deeper pipelines and branch predictors and superscalar execution and register renaming and speculative execution (well, maybe a bit less than last year) and microcode. Also, I imagine they're not built by people laying out individual gates by hand. But since I can only interact with any of these things indirectly, I have no basis for really understanding them.

How does anyone outside Intel learn about microcode?


For the average systems programmer, 90% of the really useful CPU information can be found in two places that change infrequently:

- Agner Fog's instruction tables for x86[1], which has the latency, pipeline, and ALU concurrency information for a wide range of instructions on various microarchitectures.

- Brief microarchitecture overviews (such as this one for Skylake[2]), that have block diagrams of how all the functional units, memory, and I/O for a CPU are connected. These only change every few years and the changes are marginal so it is easy to keep up.

Knowing the bandwidth (and number) of the connections between functional units and the latency/concurrency of various operations allows you to develop a pretty clear mental model of throughput limitations for a given bit of code. People that have been doing this for a long time can look at a chunk of C code and accurately estimate its real-world throughput without running it.

[1] https://www.agner.org/optimize/instruction_tables.pdf

[2] https://en.wikichip.org/wiki/intel/microarchitectures/skylak...


Do we have similar resources for Discrete GPU's? I checked on wikichip, the GPU information is hard to come by or I am missing some Google fu.

Thanks for sharing the above very useful resources, blocked my Saturday for first scan!


My belief is that most programmers today are stuck with a model of the hardware architecture that theuly learned in university, which is why most people are unfamiliar with the bottlenecks in current architectures and lack "mechanical sympathy".

> Most systems programming jobs these days is C/C++ on Linux and will be for the foreseeable future.

I don't disagree, but at least anecdotally a lot of the shops I've been involved with/worked with are really excited about Rust and Go. A previous employer that used C++ exclusively has even shipped a few Golang based tools, and is planning to introduce it into the main product soon. No new projects are being started in C++ either.

Definitely recommend learning C, but having Rust (or Golang) exposure will likely be helpful in the near future.

Great post btw.


Golang is garbage collected. I thought due to this, system programmers don't like it.

It depends what you need from it. If you need real-time-like behaviour, it may be a problem. If you need static thread-to-cpu mapping, it may be a problem. Etc.

Know your requirements and then you can decide on the runtime.


I do systems programming professionally, there are many reasons that Go is unsuitable for systems programming but all of them come down to having very little control over the resultant assembly that your program generates and very little possibility for abstracting away low-level details. AFAIK modern C++ is excellent for both of these but I only use Rust, assembly and the tiniest amount of C. Rust is absolutely excellent for writing code that looks high-level but compiles to high-quality assembly, if you know what you’re doing.

Yes this is a problem for some things, but at least when I was working in systems you'd be surprised how frequently GC doesn't matter, especially with all the huge GC improvements in recent versions (and future versions) and the trend toward breaking apart monolithic code bases.

The more general problem is that the Go implementation depends on its own runtime library and is not particularly suited for linking into other executables or running in unusual contexts, e.g. without threads or without an OS.

Could you share some pointers to codebases you find high quality?

Suggested reading? I'm not planning to switch careers but am intrigued.



Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: