Hm. Seems very C++ oriented. Talks about specific keywords of C++ and their semantics. Not even mentioned "actor" in the whole document. "immutable" not found.
Seems rather like a very incomplete view of a C++ programmer, wanting to raise it to "systems programmer".
This document seems to be at a lower level. An actor model would be implemented in terms of these primitives. These primitives are available in many languages.
Immutable is kind of irrelevant in a discussion of multi-threading because immutable data has no problems in this regard. Which, of course, in a way also makes it relevant, because using it prevents issues. Ultimately one cannot have just immutable data, because then the computer would not be allowed to do anything. This document is about the data that cannot be immutable.
But the complaint sounds a bit like an article about assembly language getting the complaint that it does not explain how to implement a word processor.
> Immutable is kind of irrelevant in a discussion of multi-threading because immutable data has no problems in this regard.
To push back on this: no data is completely immutable; it is mutable during initialization. Treating it as immutable between threads is ok after this stage, but if you disregard issues with timing and mutation during initialization, you will run into problems between threads.
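A minimal sketch of that initialization hand-off, assuming C++ and illustrative names (Config, g_config): the data is mutated while one thread constructs it, then published with a release store so other threads can treat it as immutable afterwards.

    #include <atomic>
    #include <string>
    #include <thread>

    struct Config {                       // never modified after construction
        std::string name;
        int retries;
    };

    std::atomic<const Config*> g_config{nullptr};

    void initializer() {
        auto* c = new Config{"prod", 3};  // mutation happens here, single-threaded
        // Release store: everything written to *c becomes visible to any
        // thread that later observes the pointer with an acquire load.
        g_config.store(c, std::memory_order_release);
    }

    void reader() {
        const Config* c;
        while ((c = g_config.load(std::memory_order_acquire)) == nullptr) {
            std::this_thread::yield();    // wait until initialization is published
        }
        // From here on the data is treated as immutable; no further locking needed.
        (void)c->retries;
    }

    int main() {
        std::thread t1(initializer), t2(reader);
        t1.join(); t2.join();             // the Config is intentionally leaked in this sketch
    }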
There's disagreement about what exactly makes you a systems programmer, but dealing with the reality that memory is finite is definitely a requirement.
You need mutation or garbage collection (and systems programming is the kind of environment where that's not just "someone else's problem").
Plenty of languages have immutable data without monads or even being statically typed. Clojure is a fine example. And there is also the somewhat popular https://immutable-js.com/ that brings the benefits of immutable data to JavaScript.
> Seems rather like a very incomplete view of a C++ programmer, wanting to raise it to "systems programmer".
I'm not sure I understand this comment. C++ is one of the very few languages that can be used for programming everything from a web server to a garbage collector to firmware. There's no "raising" going on here.
In this context, "systems programmer" means someone who writes software with tight, direct affinity to the underlying system (i.e. hardware, OS). C and C++ still dominate that world by huge measure.
Presumably, you're thinking of systems in the more abstract sense: systems of more or less pure software components that model some domain. That's just not what this article is about.
C++ is a very rare guest in that world. Not really worth mentioning. You could probably find more Python in that world than C++. Certainly more Shell than C++.
You could absolutely get a job as a system programmer not knowing and never touching C++, but not knowing Shell will not get you anywhere in that field.
It doesn't really have to be direct or tight (not even sure what that might mean). I'm guessing you were trying to say that it has to have minimal overhead compared to theoretically possible performance, in which case that's not true at all (a lot of Linux services and management tools are written in Shell, which just isn't known for great performance at all).
If it helps: among other examples, systems programming (as used here) characterizes much of what your Python and shell scripts call, whether commands or libraries.
C was the overwhelming language of choice for a very long time, but C++ made steady and very significant inroads in the last 20 years. Objective-C and Java had/have niches. Rust and Go have traction and continue to gain more. Swift is working towards it.
Actors and immutable types, which the GP mentioned, have less significance here because they largely trade transparent resource management (which is critically important in many of these projects) for correctness guarantees and sophisticated domain modeling. That said, we can expect there to be more room for them, especially where implementations of those features are zero/low-overhead.
Nah. C++ is still pretty much hated in that field. It's the reason Go or Rust exist, not a complement.
C++ has very little (basically nothing) to do with what Python or Shell call. When I mention these languages in the context of system programming, I mean programs like, eg. ifup / ifdown or GRUB configuration, LDAP client utilities and so on.
C++ was and still is an unwelcome guest in this area. It's not minimalist enough to be easy to use in constrained environments, but when resources allow it, there are better languages for any task you can think of. In my experience, the only people who insist on using C++ are the Visual C++ Windows programmers. But the overwhelming majority of system jobs aren't for MS Windows, so the Visual C++ people can be safely ignored, in the same way that you can safely ignore Objective-C when it comes to system programming. You can work in the field of system programming your whole life, switch dozens of companies and products, and never hit a C++ or Objective-C requirement. They are too niche and unimportant in the larger scheme of things.
If, on the other hand, you were looking for a field where C++ is the most common language, then video games come to mind, as do enterprise applications, especially those targeting MS Windows, usually with a significant GUI component. It's really not meant for, and doesn't work well as, a system programming language. It's OK for application development (but there are better alternatives today). It survives due to the existing libraries and compilers and, of course, the expertise accumulated in the programming community. But that expertise and those libraries are mostly in the fields mentioned above.
Check out this deep learning stuff people talk about. Torch is written in C++ using plenty of CUDA (which is arguably a dialect of C++), and deals with all the resource management and hardware interfacing. People then use Python to interface with it.
I don’t particularly like how Torch is designed and written, but that’s a different story. TensorFlow is C++ as well, in case you wondered.
But maybe that all does not match your requirements for system programming? Try Haiku; at least back when it was BeOS, I really liked that the whole OS was written in C++. I have a colleague who wrote some microcontroller code in C++ because he likes it so much, and I don’t think there’s anything wrong with that either.
Torch is as far removed from system programming as humanly possible... CUDA is a library for application development, not for system programming.
The whole OS was written in many languages. It's hard to think of a mainstream language that wasn't used to write an OS. I think you can even find an OS written in PDF, if you persist in your search. This doesn't mean that PDF is a good system programming language, or that it's worth considering in this domain...
I think the term “system programming” is not sharply defined. I consider e.g. Torch and CUDA to lean more towards it, as they require knowledge and understanding of hardware and resources, and they provide an interface for the next layer of developers. Is pthreads or OpenMP written by system programmers or application programmers?
PDFs are, if I’m not mistaken, not Turing complete; they omitted those PostScript features from the standard for various reasons, so you probably can’t write anything reasonable in “PDF” (unless you use embedded JS etc.).
There are more operating systems written in C++ than Go or Rust which you mention elsewhere. Microsoft Windows is the most popular operating system in the world and it's written in C++ primarily. C++ can be easily integrated anywhere C is used, provided hardware is good enough. System programmers are more likely to know C and C++ than Java, Go, or Rust. If a toaster has enough CPU to run Doom then I think your arguments against C++ "bloat" don't hold any water, and probably haven't made sense since the 90s.
As far as mentioning shell and Python, I don't consider those primary system programmer tools. You might need to know them to be a good generalist *nix system programmer, but it's really tangential to whether C++ has a place in systems programming. Python and shell are in competition with Perl and Tcl, not with C++. To say that they are is like saying a carpenter is in competition with a welder because they both build stuff.
Those are all gimmicks. They aren't really worth considering if talking about system programming. Again, if you want to be a system programmer, and you want to know what languages will get you there, then C++ isn't in that list. It's unimportant. Can you use it for system programming? -- Yes, if you don't have better things to do with your life. Will it be a good choice? -- No, for virtually any problem that exists in that domain there will be a better pick.
> Microsoft Windows is the most popular operating system in the world
No it's not. It's a very niche operating system. It's the most popular operating system in the world of desktops. But if we get to pick an arbitrary electronics form factor, then any operating system can be made the most popular in the world. Pretty sure FreeBSD is the most popular operating system in the world, if you choose that the only computers that matter are those that run FreeNAS.
> C++ can be easily integrated anywhere C is used
And so are Python, JavaScript, Rust, Ada, PostScript, Awk, Forth and many, many other languages. What's your point?
> System programmers are more likely to know C and C++ than Java
Tell me you've never worked in the field of system programming without telling me you've never worked in the field of system programming. Java is a very popular pick for writing management components for enterprise-grade system products. I've seen multiple storage products where the part of the product that did the storage part was written in C, but the management was written in Java. I've seen this more often than I've seen any C++ used in system programming. The problem is more cultural, probably. The Visual C++ programmers usually don't have a good grasp of what good system programming tools / practices would be. Just the very mention of C++ in one's resume would often be enough for that person not being invited to an interview.
> I don't consider those primary system programmer tools.
Why should anyone care about what you consider? Are you an expert in this field? So far you've shown complete lack of expertise...
It's the most popular operating system except on phones, and it has been for decades.
>Why should anyone care about what you consider? Are you an expert in this field? So far you've shown complete lack of expertise...
I have over 20 years of experience writing C++ and I know it is used widely. Your characterization of it as niche in embedded or OS is completely wrong. I'm not going to say where I work but you've heard of it.
> The Visual C++ programmers usually don't have a good grasp of what good system programming tools / practices would be. Just the very mention of C++ in one's resume would often be enough for that person not being invited to an interview.
The fact you keep referring to "Visual C++" and make sweeping generalizations about C++, especially in a systems context, tells me you are severely lacking in expertise.
> Pretty sure FreeBSD is the most popular operating system in the world, if you choose that the only computers that matter are those that run FreeNAS.
This is such a mischaracterization that you should know better. Windows is on literally billions of computers and embedded things. FreeBSD is not even on most NAS devices.
>And so are Python, JavaScript, Rust, Ada, PostScript, Awk, Forth and many, many other languages. What's your point?
If you think any of those is as easy to integrate with a C operating system as C++, you're dangerously uneducated and have no place lecturing anybody about anything programming-related, especially systems programming. I don't enjoy being so blunt but you're wasting my time and yours and spreading silly anti-C++ propaganda.
To be fair most of the consensus-imposed[1] interface to memory ordering semantics was developed as part of the C++ memory model standardization process. GCC/clang have imported them as extensions for C code, etc...
[1] Which is not to say "best". My standard complaint in these threads is that the C++ ordering semantics are a mess designed to make things easier for compiler writers to target architectures with variant ordering guarantees and not for algorithm developers to create correct code. But lockless code is much harder to write than compilers! It's a bad trade. ARM had this right decades ago with simple read/write barriers, which are vastly easier to understand when reading code.
Because Linux pre-dates that work significantly (over a decade), and even multi-processor Linux pre-dates it by enough that it could not have waited, the Linux ordering rules are actually different from those C got from C++.
One of the interesting problems for Rust-for-Linux is what to do about that, Rust has all of the C++ 11 memory ordering except the problematic (and in practice abandoned) Consume ordering, but it doesn't have a convenient way to admit a different model, yet Linus isn't exactly likely to wake up one day and declare now Linux has the C++ 11 ordering model.
As to your concern, of course the compiler vendors have to worry about portability. It's all very well to say this works on ARM, but the expectation from a person writing Rust or even C++ is that a correct program should work on all of an ARM, an x86, archaic PowerPC and somebody's cheap in-order embedded CPU.
If the vendor only really cares about one architecture (cough, Microsoft) then they might invent weird rules that only really make sense for that architecture and your "correct" programs either don't work, don't compile, or are very slow on every other CPU. But that's hardly a reliable way forward into the future.
> of course the compiler vendors have to worry about portability.
Yes, but the point was that "defining a language that doesn't suck" is a competing requirement. Again, C++ just dropped the ball here and solved the wrong problem. And we all suffer for it with repeated blog posts like this one trying desperately to explain exactly what "acquire" means and not how hardware actually works.
And yes: Linux has clean and easy (or as easy as these things get) read/write barriers, the implementation of which has now been terribly polluted by the toolchain's incompatible choice of semantics.
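For concreteness, here is a rough sketch of the two styles for one producer publishing a value behind a flag. The "barrier" version uses std::atomic_thread_fence as a stand-in for Linux's smp_wmb()/smp_rmb(); that's an approximation of the kernel's primitives, not the real thing.

    #include <atomic>

    int payload = 0;
    std::atomic<bool> ready{false};

    // Barrier style (roughly what smp_wmb()/smp_rmb() express):
    void producer_barrier() {
        payload = 42;
        std::atomic_thread_fence(std::memory_order_release);  // "write barrier"
        ready.store(true, std::memory_order_relaxed);
    }
    void consumer_barrier() {
        while (!ready.load(std::memory_order_relaxed)) { }     // spin until flagged
        std::atomic_thread_fence(std::memory_order_acquire);   // "read barrier"
        // payload is guaranteed to read as 42 here
    }

    // Acquire/release style: the ordering is attached to the flag accesses themselves.
    void producer_acqrel() {
        payload = 42;
        ready.store(true, std::memory_order_release);
    }
    void consumer_acqrel() {
        while (!ready.load(std::memory_order_acquire)) { }
        // payload is guaranteed to read as 42 here
    }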
C tries to be everything to everyone. It has to be able to target every architecture, within certain limits. It took them a long time to standardize on two's-complement signed integers - they could only do that after the whole industry had migrated to two's-complement signed integers as they are the technically best way to represent signed integers.
The x86 is the weird architecture that tried to keep existing programs working when they moved to multicore. Most other architectures implemented whatever is easier to implement in hardware, and compilers have to target that. And the compiler can't simulate a strong memory model on top of a weak one, because that means lots of overhead for every memory operation and that's not what C does.
Everyone has, even the hardware vendors are treating these rules as the golden standard to which they can optimize and conform. So it's too late. But it's still a shame.
The acquire/release semantics make perfect sense for locks, because that's what it was designed for. Designing anything else with them is extremely difficult, and in practice such work tends to lean heavily on the full-barrier "sequentially consistent" primitive instead. The older read/write barrier instructions were much simpler. (And x86 is simpler still, with just one reordering rule to worry about and a single idea of serializing instructions to regulate it).
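To illustrate the lock use-case, here is a minimal spinlock sketch where acquire/release reads naturally (a toy, not a production lock: no backoff, no fairness):

    #include <atomic>

    class Spinlock {
        std::atomic_flag locked = ATOMIC_FLAG_INIT;
    public:
        void lock() {
            // Acquire: everything the previous holder wrote before unlock()
            // is visible once test_and_set succeeds.
            while (locked.test_and_set(std::memory_order_acquire)) {
                // spin; a real lock would back off or yield here
            }
        }
        void unlock() {
            // Release: publish all writes made inside the critical section.
            locked.clear(std::memory_order_release);
        }
    };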
Hardware vendors adhering to the C++ memory model when designing their chips? I think you got it backwards, or I misunderstood your point. Memory barriers exist because of how the cache memory hierarchy and coherency protocols are implemented in multi-core chips, and not to "optimize and conform to the C++ memory model".
The C++ memory model is there to make the life of a programmer easier across different micro-architectures, because the memory model of the micro-architecture is what differs.
There is an argument, particularly from C++ people, that the CPU architectures designed after C++ 11 shipped with the Acquire/Release model have all chosen to provide features that can do Acquire/Release smoothly.
After all, if the software most people execute rewards CPUs which do X and punishes CPUs which do Y instead, it's a tough sell to design your CPU to do Y: ordinary users will get the idea it's worse than it really is, and your sales people don't like that. By the same logic, it seems reasonable to argue that the reason modern x86-64 CPUs ignore the weird x87 FPU features is that nobody's programs want those weird features.
I remember in the mid-1990s there was a lot of hype for high clock frequency 80486 and clones because they had very good integer performance. However, Intel's Pentium had excellent FP performance at similar or lower clocks. The video game Quake needed floating point, and so Quake on my 100 MHz Pentium was very quick, while on a friend's 100MHz 486 it was pretty nasty. Increased sales of the Pentium were I believe widely attributed to the game Quake showing off the FP performance (it's nothing to look at in today's terms, we did not have GPUs back then).
I have never worked for a CPU design firm or CPU manufacturer, so I can't speak to whether this is in practice a meaningful influence.
While influential, I don't think C++11 was the primary reason for architectures standardising around acquire/release. I think that, with the advent of commodity multi-core designs, the time was ripe to move beyond haphazard, ad hoc and underspecified descriptions of hardware ordering rules to more formal and rigorous models.
Acquire/release happened to be a good fit: easy enough to program for, and relaxed enough to map relatively closely to existing architectures. So C++11 embracing it was just moving with the zeitgeist.
For example, Itanium got acquire/release operations well before the work on the C++11 memory model even started.
> There is an argument, particularly from C++ people, that the CPU architectures designed after C++ 11 shipped with the Acquire/Release model have all chosen to provide features that can do Acquire/Release smoothly.
I never heard that argument (a source would be good?) but this is very different from what the parent comment said. Code does not run in a vacuum - it runs on the CPU. And the CPU does not exist in a vacuum - it's there to run that very code, so they're both very inter-dependent and intertwined. As much as CPU designs change to accommodate new algorithmic requirements, the code also changes to make use of new CPU designs.
So, of course, CPU vendors will do everything that is in their control to make a new chip design appealing to their customers. This has been done forever. If that means spending extra transistors to run some C++-heavy data center code 10% faster, of course they will do it - there's a very large incentive in doing so.
But that doesn't mean that CPU vendors are designing their chips to accommodate abstract programming language models. In this case, memory models.
Probably one of the easiest examples of such practices to understand, and the one I could think of right now, is Jazelle: ARM CPU circuitry designed to execute Java bytecode directly in the CPU itself.
I'm pretty sure there's a Herb Sutter C++ talk which explicitly associates newer CPUs having instructions well suited to the Acquire/Release model with that C++ 11 memory model. I have a lot of Herb's talks in my Youtube history so figuring out which one I meant will be tricky. Maybe one of the versions of "Atomic Weapons" ? This idea is out there more generally though.
I don't think I agree that this doesn't mean the memory model infects the CPU design. Actually I don't think I agree more generally either. For example I would say that the Rust strings are fast despite the fact that modern CPUs have gone out of their way to privilege the C-style zero terminated string. There are often several opcodes explicitly for doing stuff you'd never want except that C-style strings exist. They would love to be faster but it's just not a very fast type, so there's not much to be done.
Contrast this with say, bit count, which is a good idea despite the fact it's tricky to express in C or a C-like language. In a language like C++ or Rust you provide this as an intrinsic, but long before that happened the CPU vendor included the instruction because this is fundamentally a good idea - you should do this, it's cheap for the CPU and it's useful for the programmer, the C language is in the way here. "Idiom recognition" was used in compilers to detect OK, this function named "count_pop" is actually the bit count instruction, so, just use that instruction if our target architecture has it. More fragile than intrinsics (because it's a compiler optimisation) but effective.
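As a small illustration of that point (count_pop is the name used above; the rest is standard C++20), the portable loop may or may not be pattern-matched to a POPCNT instruction, while std::popcount is explicitly backed by it where available:

    #include <bit>        // std::popcount, C++20
    #include <cstdint>

    // Portable fallback the compiler may (or may not) recognise as a bit count.
    int count_pop(std::uint64_t x) {
        int n = 0;
        while (x) { x &= x - 1; ++n; }   // clear the lowest set bit each iteration
        return n;
    }

    int main() {
        std::uint64_t v = 0xF0F0F0F0F0F0F0F0ull;
        int a = count_pop(v);            // idiom the compiler can recognise
        int b = std::popcount(v);        // library call backed by the instruction
        return (a == b) ? 0 : 1;
    }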
At an even higher level, from the point of view of a CPU designer, it would be great to do away with cache coherence. You can go real fast with less transistors if only the stupid end users can accept that there's no good reason why cache A over here, near CPU core #0 should be consistent with cache D, way over on another physical CPU, near CPU core #127. Alas, turns out that writing software for a non-coherent system hurts people's brains too much so we've been resolutely not doing that. But that's exactly a model choice - we reject the model where the cache might not be coherent. Products which lack cache coherence struggle to sell.
RISC-V designers explicitly declare in their base specification: "The AMOs were designed to implement the C11 and C++11 memory models efficiently." So at least one example is present.
I think it is just a generational/educational difference. I learned lock-free programming around the time C++11 was being standardized, and the acquire/release model seems very natural to me; it is easier to understand and to model algorithms in than the barrier-based model.
> Hm. Seems very C++ oriented. Talks about specific keywords of C++ and their semantics. Not even mentioned "actor" in the whole document. "immutable" not found.
The term "systems programmer" is right there in the title. They are also the very first words in the article. The examples are in C++, but the whole document focuses on both C and C++. The article clearly focuses on the primitives, as it should.
It's undoubtedly a great article on the subject, but nitpickers have to nitpick, and one-uppers have to one-up.
This 2020 review carefully summarizes the overly complex and error-prone outcome of the programming language memory model changes of the early 2010s. It's tempting to master such tricky details, but doing so almost inevitably leads to the costliest bugs ever: rare data corruption that's impossible to detect and reproduce.
Russ Cox in 2021 offered a higher-level, insightful history of evolving hardware and software support for concurrency, discussed just yesterday:
https://news.ycombinator.com/item?id=42397737
His take (as I take it): systems programmers should only use sequentially-consistent atomics (and only via well-tested concurrent data structures).
I am inclined to agree with the opinion that Sequentially Consistent Atomics are another of the long list of wrong defaults chosen by WG21 (the C++ committee).
The argument goes roughly like this:
If you don't offer a default here, the programmer who supposes that the answer to their question is atomics will be obliged next to figure out which Order they wanted. Or they could give up and say "I need a grown-up". Because C++ chooses to offer Sequentially Consistent as the default, that programmer will presume this must mean it now does what they naively anticipated. Russ' overview might suggest that's true too.
Here's the problem: It may not do what they wanted, it may be that no ordering rule would achieve their actual goals, but since they weren't asked to go read the ordering rules and pick one, they never confronted this awful problem, and are instead blissfully unaware. They are strolling into traffic knowing that they've picked the best possible road crossing rule and not aware that in fact no possible rule could make crossing this Interstate on foot safe to do.
Here's another bad problem: If it does achieve what they wanted, it's probably overkill. The only reason programmers should reach for atomics is that the provided facilities which were safe for them to use didn't deliver the performance they need. But the Sequentially Consistent ordering may hurt their performance, injecting barriers they don't need, using slower instructions or even stalling when that was unnecessary. If they'd read the rules they might for example have realised they only needed a Relaxed ordering - much better, or that with a slight tweak they can use Acquire/Release.
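A small sketch of that overkill case, assuming the common "statistics counter" pattern: the counter only needs atomicity, not ordering, so Relaxed is enough, while the seq_cst default (what plain hits++ gives you) pays for a global ordering nothing here relies on.

    #include <atomic>
    #include <thread>
    #include <vector>

    std::atomic<long> hits{0};

    void worker() {
        for (int i = 0; i < 1'000'000; ++i) {
            // hits++ would be fetch_add(1, std::memory_order_seq_cst):
            // correct, but stronger (and potentially slower) than needed.
            hits.fetch_add(1, std::memory_order_relaxed);
        }
    }

    int main() {
        std::vector<std::thread> pool;
        for (int i = 0; i < 4; ++i) pool.emplace_back(worker);
        for (auto& t : pool) t.join();
        return hits.load(std::memory_order_relaxed) == 4'000'000 ? 0 : 1;
    }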
Having worked with some of the concurrency issues mentioned and observed both failures and successes, I’d like to share a perspective: developers who struggle most with concurrency often lack a formal Computer Science background. This isn’t to say that a CS degree is the only path to competence, but it provides structured exposure to concepts like concurrency through coursework, exams, and practical projects. These academic challenges can help solidify an understanding that’s harder to acquire in a fragmentary way.
While a formal Computer Science background is helpful, I don't think it is necessary for learning Concurrency. The reason people struggle with Concurrency is that they are not taught it properly, and of course many are not willing to study it and figure it out for themselves.
Once you start thinking about a program as a sequence of loads/stores (i.e. reads/writes to shared memory) and note that Pipelining/OOO/Superscalar are techniques to parallelize these (and other) instructions for a single thread of control, you start getting an idea of how sequential order can be preserved though the actual execution is not quite so.
Now add in another thread of control on another core and you get the idea of Memory Consistency problems and can be introduced to various "Consistency Models" (https://en.wikipedia.org/wiki/Consistency_model). Next introduce caches in between and you get "Cache Coherence" problems (https://en.wikipedia.org/wiki/Cache_coherence). Walk through a simple example like "c=a+b" in thread1 and "z=x+c" in thread2 and observe the interleavings and possible problems.
Now move on to "Atomic Instructions" (https://en.wikipedia.org/wiki/Linearizability#Primitive_atom...) and how they can be used to implement "Locking" (https://en.wikipedia.org/wiki/Lock_(computer_science) ). Finally show "Memory Barriers" (https://en.wikipedia.org/wiki/Memory_barrier) as a means of enforcing a checkpoint on memory operations.
All these together should make an understanding of Concurrency clearer. One can then jump back up into the language and study the language primitives for concurrency management and their common usage patterns.
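To flesh out the c=a+b / z=x+c walkthrough, here is a hedged C++ sketch: without the flag, thread2's read of c would be a data race; with a release store and an acquire load, z is guaranteed to see c == 3.

    #include <atomic>
    #include <thread>

    int a = 1, b = 2, x = 10;
    int c = 0, z = 0;
    std::atomic<bool> c_ready{false};

    void thread1() {
        c = a + b;                                        // plain store
        c_ready.store(true, std::memory_order_release);   // publish it
    }

    void thread2() {
        // Without the flag, reading c here races with thread1's write: z could
        // see the old value, the new value, or (formally) be undefined behaviour.
        while (!c_ready.load(std::memory_order_acquire)) { }
        z = x + c;                                        // guaranteed to see c == 3
    }

    int main() {
        std::thread t1(thread1), t2(thread2);
        t1.join(); t2.join();
        return z == 13 ? 0 : 1;
    }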
A formal education, in any domain, forces you to explore areas of knowledge that may not be directly applicable to immediate 'real world' problems. It does however give you the background to identify and explore those areas when you approach something that does overlap with work that people have spent literal lifetimes considering ahead of you.
I say this as someone without a formal CS education but has been working with computers for some time. I've spent countless hours, days, and months playing catch-up on that knowledge when I stumble into those blindspots.
Several other people have commiserated, so rather than do the same maybe we can help.
Are you quite sure you understand the atomic primitives you're working with? Mara's book https://marabos.nl/atomics/ is pretty good. It's written in Rust, but if you thought you needed a memory ordering it doesn't cover, you're boned anyway, because your tools don't support that ordering either (and probably admit as much in an appendix or footnote somewhere).
Reach for tooling to diagnose ordering problems. Loom and Shuttle are where I'd start in Rust, hopefully you have equivalents in any language you're using. Those are: A thorough tool which tries every possible correct interleaving and tells you what happens, and, a probabilistic approach which just picks at random and figures on average it should find the problem cases (but we can't prove it does).
When in doubt, steal: There is often a published, correct, algorithm for whatever you're doing. It might take an hour to convince yourself that algorithm is correct, but hey, you're planning to spend four hours on your algorithm which isn't correct, that's a bad trade.
Rubber Ducks are invaluable. If you feel awkward addressing an inanimate object, I don't recommend using pets or livestock, as both have no reason to be attentive; try a human who likes you. The important property of the rubber duck is not that they understand what you mean (after all, a plastic bath toy can't understand anything) but that you will feel the need to make whatever you're saying make sense, and in doing so you may uncover a place where your algorithm is inconsistent.
That is just not true and you are being unnecessarily hyperbolic. Even when I was learning/doing concurrency/multi-threading (using pthreads) long ago, I never got "into a world of random crashes and segfaults". It was of course challenging, but not too difficult. You structure your application following standard usage patterns given in a good book (eg. Butenhof's) and then plug in your app logic in the thread routines. With some experience things get clearer over time and you begin to have an intuitive feel for structuring multi-threaded code. The key is to stay at a high enough level of abstraction appropriate for your use case (eg. mutexes/semaphores/condition variables) before diving into compiler and hardware level intrinsics/atomics/etc.
A good book to study and get a handle on all aspects of Concurrent Programming is Foundations of Multithreaded, Parallel, and Distributed Programming by Gregory Andrews - https://www2.cs.arizona.edu/~greg/mpdbook/
In fairness the person you were responding to was referring to their own personal experience. They certainly are not the first person to conclude that doing non-trivial concurrent programming is too difficult for them. I agree that it is achievable with an appropriate level of care and experience, but I know there are many very smart people that conclude that multithreaded programming in C++ is too difficult for their taste.
Even Rich Hickey, when discussing concurrency in Java/C#/C++ said "I am tired of trying to get right, because it is simply far too difficult."
> In particular, talk about shared state, how we do it today, how you do it in C#, Java, or C++, what happens when we get into a multi-threaded context, and specifically what are some of the current solutions for that in those spaces. Locking, in particular. That is something I have done a lot of, over a long period of time, and I am tired of trying to get right, because it is simply far too difficult.
The first point to understand is that a knowledge of Concurrent Programming (in all its guises) is mandatory for all programmers today.
The second point to note is that when people like Rich Hickey or John Ousterhout talk about multi-threaded programming being "hard" they are talking about a level of depth far beyond what a "normal" application programmer will encounter in his/her entire career. These guys span everything from Apps/OS/Compilers/Language/Hardware and hence by necessity know the full gamut of complexity involved in concurrency. Trying to understand concurrency across all the above abstraction layers is very difficult and that is what they mean when they say it is "hard".
But for most application programmers the above is simply not relevant and they can comfortably stay at higher-level abstractions given by their language/library and ignore lower-level details unless and until forced by other needs like performance etc. One can do a lot with this knowledge alone and indeed that is what most of us do.
So instead of making wild statements like "random crashes and segfaults" and "too hard to program", learn to use heuristics/common sense to simplify the code structure, e.g. a) copy code patterns given by reputed authors so one does not make unnecessary errors, b) keep the number of locks to a minimum by using a few "global locks" rather than a lot of "granular locks", c) learn to use Thread Local Storage, d) acquire locks/resources in the same order to avoid deadlocks, etc. (a minimal sketch of (d) follows below).
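A minimal sketch of (d), with made-up mutex names; either pick one global acquisition order by convention, or let std::scoped_lock (C++17) do the deadlock-free locking for you:

    #include <mutex>

    std::mutex m_accounts, m_audit_log;   // illustrative names only

    void transfer_consistent_order() {
        // Convention: always acquire accounts before the audit log.
        std::lock_guard<std::mutex> lk1(m_accounts);
        std::lock_guard<std::mutex> lk2(m_audit_log);
        // ... update both structures ...
    }

    void transfer_scoped_lock() {
        // Or let the library lock both in a deadlock-avoiding way (C++17).
        std::scoped_lock lk(m_accounts, m_audit_log);
        // ... update both structures ...
    }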
You can get pretty far with keeping sharing to an absolute minimum and when you do need to share data, slap a lock free ringbuffer between them to communicate.
Pretty simple to get right.
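For what it's worth, a minimal single-producer/single-consumer version of that idea (class and member names are mine): a fixed-size ring where the producer only writes head, the consumer only writes tail, and one slot is left empty to distinguish full from empty.

    #include <array>
    #include <atomic>
    #include <cstddef>
    #include <optional>

    template <typename T, std::size_t N>
    class SpscRing {
        std::array<T, N> buf_{};
        std::atomic<std::size_t> head_{0};  // next slot to write (producer-owned)
        std::atomic<std::size_t> tail_{0};  // next slot to read  (consumer-owned)
    public:
        bool push(const T& v) {             // call from the producer thread only
            const std::size_t h = head_.load(std::memory_order_relaxed);
            const std::size_t next = (h + 1) % N;
            if (next == tail_.load(std::memory_order_acquire)) return false;  // full
            buf_[h] = v;
            head_.store(next, std::memory_order_release);  // publish the slot
            return true;
        }
        std::optional<T> pop() {            // call from the consumer thread only
            const std::size_t t = tail_.load(std::memory_order_relaxed);
            if (t == head_.load(std::memory_order_acquire)) return std::nullopt;  // empty
            T v = buf_[t];
            tail_.store((t + 1) % N, std::memory_order_release);  // free the slot
            return v;
        }
    };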
You are right, but what you call "lock free" is not the same as many other things that are called "lock free", even if indeed a ringbuffer needs no locks, so this may be confusing for newbies.
I strongly dislike the term "lock free", which is really just a marketing term invented by people trying to promote the idea that some algorithms are better than those "lock-based", when in fact those "lock-free" algorithms were only choosing a different trade-off in performance, which can be better or worse, depending on the application.
Even worse is that after the term "lock free" has become fashionable, it has also been applied to unrelated algorithms, so now it has become ambiguous, so you cannot know for sure what is meant by it, unless more details are provided.
When accessing shared data structures, the accesses are most frequently done in one of three ways.
The first is to use mutual exclusion, where the shared data structure is accessed within critical sections and only one thread can execute the critical section at a given time. This method is usually called lock-based access.
The second is to use optimistic access, where the shared data structure is accessed concurrently by many threads, but they are able to detect interference from the other concurrent accesses and they retry their accesses in such cases. This is what is most frequently referred to as "lock free" access (a small sketch of this retry pattern follows below). Compared to mutual exclusion, this access method may be faster in the best cases, but it is much slower in the worst cases, so whether this is a good choice depends on the application.
The third method happens when it is possible to partition the shared resource between the threads that access it concurrently, so their concurrent accesses can proceed without fear of interference. This partitioning is usually possible for arrays and for buffers a.k.a. FIFO memories a.k.a. message queues (including one-to-one, many-to-one, one-to-many and many-to-many message queues).
So your "lock free ringbuffer" refers to the third method from above, which is very different from the "lock free" algorithms of the second kind from above.
Whenever concurrent access to partitioned shared resources is possible, it is much better than accesses with mutual exclusion or optimistic accesses, which require either waiting or retrying, both of which are wasting CPU time.
Therefore using correctly-implemented message queues or other kinds of shared buffers is usually the best method to achieve high levels of concurrency, in comparison with other kinds of shared data structures, because it avoids the bottlenecks caused by mutual exclusion or optimistic accesses.
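To make the "optimistic access" (second) method above concrete, here is a small sketch of a retrying update, with an illustrative maximum-tracking counter; compare_exchange_weak reloads the observed value on failure and the loop retries:

    #include <atomic>

    std::atomic<int> observed_max{0};

    void record_sample(int sample) {
        int cur = observed_max.load(std::memory_order_relaxed);
        // Optimistic: compute the new value, then retry if another thread
        // changed observed_max in the meantime.
        while (sample > cur &&
               !observed_max.compare_exchange_weak(cur, sample,
                                                   std::memory_order_relaxed)) {
            // cur now holds the freshly observed value; loop and re-check.
        }
    }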
The fact that it has been coined by some academics in their research papers is not in contradiction with the fact that it has been chosen exactly like any marketing term, to imply that something is better than it really is.
The alternative term "optimistic access" describes much better the essence of those algorithms, while "lock free" attempts to hide their nature and to make them look like something that is guaranteed to be better (so receiving money for researching them is justified), because locks are supposed to be bad.
"Lock free" and "wait free" have been buzzwords that have provided subjects for a huge number of research papers in the academia, most of which have been useless in practice, because the described algorithms have been frequently worse than the lock-based algorithms that they were supposed to replace.
I don't agree with your characterization of these algorithms as "worse".
They have a desirable property. If you needed a wait-free algorithm, and this is a wait-free algorithm it's not "worse" for you than an existing algorithm that isn't wait-free, regardless of whether it's slower, or more memory intensive or whatever. You needed wait-free and this is wait free.
Why is wait-free desirable? Well, unlike a lock-free algorithm, the wait-free algorithm makes progress in defined time for everybody and it might be that it's actually much worse if anybody is stalled than for the averages to be bad for example.
If you mean "fast on average", say that. If you mean (as C++ programmers often do) "validates my feeling of self-worth", then say that. I don't know whether anybody wants to pay you more money to validate your self-worth, but at least you're being honest about your priorities.
Step 1: use concurrency on a very high level. For example, write an app, and then run 4 instances of it, each working on a quarter of your data (a small sketch of this follows after this list).
Step 2: when you absolutely need concurrency within one app, try using some library that already solved the issue. For example, there are lots of databases with pretty strong concurrency contracts while still being efficient.
Step 3: if you absolutely need custom solution, use a library/language that provides reasonable tools out of the box and keep your logic to minimum.
Following the first two steps will solve 95% of your concurrency issues, if you include step 3 it goes to 99%.
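A tiny sketch of Step 1, under a made-up convention that each instance is started as ./app <instance_index> <instance_count> and carves out its own slice of the data, so there is no sharing at all:

    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>

    // Hypothetical launch: ./app 0 4  ...  ./app 3 4  (four independent processes)
    int main(int argc, char** argv) {
        if (argc != 3) return 1;
        const std::size_t index = std::strtoul(argv[1], nullptr, 10);
        const std::size_t count = std::strtoul(argv[2], nullptr, 10);

        const std::size_t total = 1'000'000;               // stand-in for "your data"
        const std::size_t begin = total * index / count;
        const std::size_t end   = total * (index + 1) / count;

        for (std::size_t i = begin; i < end; ++i) {
            // process item i -- no sharing, no locks, no atomics
        }
        std::printf("instance %zu handled [%zu, %zu)\n", index, begin, end);
    }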
If C++ makes it possible to blow your leg off, multithreading with C++ makes it possible to blow yourself up, along with half your neighborhood and some homes halfway around the country, with nothing in the code ever making that explicit.
Really? I don't share either sentiment. I found multithreading in C++ to be pretty mundane, and in fact a lot easier than it probably once was with lambdas. You definitely need to be on top of the multithreading primitives such as atomics, mutex, condition variables, shared_ptr, etc. But otherwise it's pretty straightforward.
One of the problems I usually face with this is the need to call library functions from C++ code, especially TLS-related stuff. It doesn't matter what primitives C++ has to offer; as long as you are using library code, things will break.
The standard library concurrency primitives are still too low-level for a lot of general purpose concurrency needs. IMO, the minimum level of abstraction that you should start with for many apps is a thread-safe queue that dispatches messages with immutable data to one or more worker threads (e.g. actor-like model). You can go lower-level, but it needs to be deliberately chosen.
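Something like the following is the minimum I have in mind; a hedged sketch, not a library recommendation (the names Worker and post are mine): callers enqueue messages, a single worker thread drains them in order, and the only shared state is the queue itself.

    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>

    class Worker {
        std::mutex m_;
        std::condition_variable cv_;
        std::queue<std::function<void()>> q_;
        bool stop_ = false;
        std::thread th_{[this] { run(); }};   // started last, after the members above

        void run() {
            for (;;) {
                std::function<void()> msg;
                {
                    std::unique_lock<std::mutex> lk(m_);
                    cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
                    if (stop_ && q_.empty()) return;   // drain remaining work, then exit
                    msg = std::move(q_.front());
                    q_.pop();
                }
                msg();                                 // runs outside the lock
            }
        }
    public:
        void post(std::function<void()> msg) {         // safe to call from any thread
            { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(msg)); }
            cv_.notify_one();
        }
        ~Worker() {
            { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
            cv_.notify_one();
            th_.join();
        }
    };

Usage is just Worker w; w.post([]{ /* work */ });. The "immutable data" part is a discipline: capture copies in the message, not references to mutable state.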
People keep reinventing those and thread pools over and over in C++. I've been researching one of our older systems (slated for decommissioning in 2021; as is typical, that did not happen). In trying to understand it I have found many areas of concern around how they deal with concurrency, in particular that they created their own queue and thread pool. Based on past experience, there's a 50/50 chance for each that they were created correctly (with proper concurrency controls), and less than that that the tasks submitted to the thread pool themselves make use of proper concurrency controls rather than assuming that they can read/write whatever they want as if the system were single-threaded.
We had multi-CPU stuff in the 90s. C/C++ was dead set on ignoring it for a long time. Every OS had its own way of handling it, all with subtle weird bits that did not act like other OSes. You could not even trust the stdlib or CRT to get it right, with their globals that could change underneath you.
So it was left to the developer. It is much better now, but for so long the problem was ignored, and now we have decades of 'not sure if I can touch that code'. Also, by default C/C++ are fairly open about sharing memory, so it is very easy to create race conditions on memory. It would be nice if the base language had a concept of 'locked to a thread' or 'I want to share this with other threads'; then the compiler could flag where I have wandered into the weeds for a class, so we could catch the race conditions at compile time, or at least get some sort of warning.
Sharing semantics were awful for a long time. stdlib has done some very good things to help clean that up but it is still very easy to share between threads and cause yourself a headache.
C++ itself doesn't have the owning mutex, but there is one in Boost for example.
The problem with an owning mutex in such a language is that you can (on purpose or by mistake) keep accessing the thing it owned after you've released the protecting mutex. Rust's Mutex<T> owns the T, and it has the behaviour you want: if you try to keep the access to T but give back the mutex, that doesn't compile. "I tried to unlock the mutex but I still need it, d'oh".
And the same problem applies broadly, you should not share write access to things without an ordering guarantee, but it's hard to ensure that guarantee is followed in C++.
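For comparison, a rough sketch of what an owning mutex looks like in C++ (in the spirit of Boost's synchronized_value, but this Guarded type and its with() function are my own simplified invention), plus the escape hatch that Rust closes and C++ cannot:

    #include <mutex>
    #include <utility>

    // A mutex that owns the data it protects: access only through a closure
    // that runs while the lock is held.
    template <typename T>
    class Guarded {
        std::mutex m_;
        T value_;
    public:
        explicit Guarded(T v) : value_(std::move(v)) {}

        template <typename F>
        auto with(F f) {
            std::lock_guard<std::mutex> lk(m_);
            return f(value_);
        }
    };

    Guarded<int> counter{0};

    void ok()  { counter.with([](int& v) { ++v; }); }

    // Nothing stops a C++ caller from leaking a reference past the lock,
    // which is exactly the hole Rust's borrow checker plugs:
    int* leaked = nullptr;
    void bad() { counter.with([](int& v) { leaked = &v; }); /* *leaked now races */ }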
Exactly. This stuff has been known about for a long time. It was just kind of ignored, and you kinda hoped your library might have something to deal with it (boost, win32, pthread, etc). Then each one acted differently on different platforms or with each other. Some of the std lib is starting to have things we need. But now I have to deal with things in the CRT and stdlib that actively break multi-threading. Mutexes, semaphores, flags, pulsing, etc are not exactly new patterns. It's a real mess, and you have to understand it too deeply for it to be meaningful to most people. It is why things like promise/async/await are very popular with JavaScript and its libraries: it looks like multithreaded programming with a decently clear interface as to what is going on.
CPU design makes it inherently hard. C or C++ is just a thin layer above it making no tradeoffs. If you can live with the tradeoff then Rust land or VM-language land is more appropriate.
In theory none but in practice the codebase ends up littered with hidden shared state mostly disguised through one or another shared pointer implementation. And this happens because that's what the Rust compiler is pessimistically enforcing upon you by default.
For heavy workloads, this approach doesn't scale particularly well.
It sounds to me as though what you're saying is that when you write Rust programs which don't scale very well, they don't scale very well, whereas when you write C++ programs you don't do that. I suggest learning not to do it in Rust either.
Easier said than done, since that's one of the core tradeoffs of the Rust language. The language forces these semantics upon you, and while it is possible to get around it with unsafe blocks, that turns out to be much more difficult in practice. So, by default, Rust code is in the majority of cases going to be designed around shared ownership.
If you actually have shared ownership, but in C++ you're getting away with pretending you don't, chances are that'll bite you really hard. Maybe it's already biting and you didn't notice, so Rust actually did you a massive favour.
If there is no shared ownership then inventing it so as to make your Rust slower is just a problem with you, not with Rust.
No. For 98% of the multi-core sensitive code I don't have, nor do I need, shared ownership. While C++ doesn't force you into such semantics but merely provides you with the ability, the Rust semantics and compiler pessimistically do. I am going to stop here, since I'm repeating myself and you're blatantly going past my points.
PDFs are perfect on the Kindle Scribe, so long as they are not color. This is my primary use for the device. On smaller Kindles, PDFs are probably a pain to read though.
(I can share my converted file, but I don't know which file-sharing service is permitted on HN. If anyone can give me suggestions on this, I will be happy to upload it and share my epub file.)
> whatever happened to that Parallella "super computer" project?
I had a couple of those. (Well, still do – but not using them for anything now. Mine are the ones with 16 cores, that I got for backing them on Kickstarter when they did their original crowd funding) And I was wondering the same a while ago. This guy that was part of Adapteva, makers of Parallella, is working on some ASIC stuff now it looks like.
And since Adapteva website also says: “Adapteva is now Zero ASIC”, I guess maybe some other people that were originally doing the Parallella thing are now doing ASIC things with that guy too.
It often refers to programmers that develop things that other programmers use to construct applications. A systems programmer may work on compilers, operating systems, framework libraries, etc.
They will also often work with lower-level abstractions, such as threads, so that the higher-level developers need not think about them in their own designs.
Below system level you can harmonize (merge) branches of the same system (repository). At system level you would harmonize repositories - might involve politics I guess
Tree vs forest (more than one root)
Beyond systems level would be organic (system of systems)
For the purpose of job interviews (and not necessarily any kind of sound definition), this means that the position you are looking at will have you dealing with software that implements some operating system services.
To give you some examples: you might be working on an SDS product (software-defined storage), perhaps a filesystem, or maybe a network-attached block device, etc., that the operating system has a way of interacting with and exposing to other software as its own service. To make it even more concrete: you can write a filesystem in a way that it interfaces with the Linux kernel, and, once installed, users will interact with it through the VFS interface (so they don't even have to know they are interacting with your specific product). ZFS is a good example of this kind of software.
Similarly, it could be SDN (network), eg. various VPN software s.a. OpenVPN. Or user/permissions management software (eg. an LDAP server, s.a. slapd). Software to manage system services (eg. systemd). Software to manage specialized hardware (on top of drivers) that, again, provides its services through the operating system (eg. nvidia-smi). Or maybe it's software that manages memory, terminals, virtualization, or containers (eg. DMA over IB, tmux, QEMU, containerd).
Sometimes this category is extended to system monitoring or higher-level management tools that still give some kind of general service similar to the one provided by the operating system. Think about SAR or mdadm kind of tools. Or, this could also be testing of operating system services (eg. FIO).
Sometimes, this can also mean some kind of administrative tools that perform either more high-level management of operating system (eg. yum, the RHEL package manager), or management of groups of operating systems. Cloud tools can fall into this category (eg. boto3 AWS client). But, this is kind of stretching the definition.
----
So, as per usual in the field of programming... we don't have an authoritative source and a definition for a widely used term. You just have to accept (at least for now), that different people mean different (but often significantly overlapping) things when they use the term and ask more questions to refine what they mean by it.
These days? A programmer that does anything but CRUD for web apps.
In the before times, the terminology came from System/360 and descendant mainframe systems, where the systems programmers tended to be intimately involved with the deployment and programming of the operating system and associated "exits". Such tasks tended to presume some working knowledge of assembly language (operating systems were very crude at that time, and many shops tended to "customize" them, if that makes sense). You would sometimes have serious problems with operating-system-level things, and at that time the generic response would be "contact your systems programmer", who would either fix it or get in touch with IBM. So the TL;DR is basically a sysadmin who is good at assembly. Big companies could afford to have systems programmers on staff and needed them, because every shop was different and computers were very custom and crude at that point in history.
That's where the terminology came from originally but now it has expanded and can mean different things.
The "systems programmers" terminology was mainly just a differentiation from the more generic "application programmers" (who would probably program in COBOL or something similar, doing business related tasks) or "system operators" (who would not usually be programmers at all, and would do a lot of what we would now consider sysadmin work).
Nowadays it just means anything that isn't bog standard CRUD or frontend web app stuff. The terminology has evolved but has a somewhat similar meaning to when it originated. It's tough to understand exactly without the historical context which is why I have included that.
For obvious reasons, this has some overlap with the more modern definition of "programmers that build systems that other programmers then use", because the application programmers on those mainframe systems would be relying on the systems programmers not to screw up their assembly wizardry, or else everything would fall apart.