What's this referring to? Has something been proposed recently?
> Now, many compilers have done some work with this using what are known as intrinsics. However, these do not allow us to fully exploit what we can in assembly. Most of the intrinsic instructions follow the C calling conventions, so they are required to load from RAM, do something with it, then store back into RAM.
> Since we store the intermediate values, we’re exercising the memory controller for every intrinsic. For our assembly, we load at the beginning and store at the very end. Nothing is in between.
I don't understand this claim; it seems demonstrably false to me. Intrinsics don't have to follow the C calling convention, and in practice they don't. You can try it for yourself: the compiler definitely doesn't store intermediate results to memory for his example code.
In fact, isn't what clang generates roughly the same assembly as the hand-written assembly in the article?
The author looks like an expert so I guess I'm missing something.
The introduction was meant to be tongue-in-cheek. There was a talk back in January which made headlines due to its claims. https://www.youtube.com/watch?v=8SoJR3sCaR4
When I started learning the Nim programming language, the claims were that what you can do in C is what you can do in assembly... In other words, there was no more reason to learn or write assembly, unless you were making a C compiler yourself. I don't think they were ever taken seriously, but neither was my introduction supposed to be taken seriously. The point being that instead of having your brand new language compile to assembly, why not compile to C?
Regarding the intrinsics, you are correct if you optimize. Using your same link, you can turn off the optimizer and see that it follows C ABI conventions--mainly so the debugger can follow each step.
The point here is that we're still writing assembly, with a C marinade on top. _mm_hadd_ps wraps an assembly instruction (haddps) directly. However, to see what this is actually doing, we need to know a bit of assembly. In other words, we're writing C in syntax only.
Perhaps I can reword that part better. Thanks for the comment!
That talk has 3194 views at the moment. That's pretty obscure.
> Using your same link, you can turn off the optimizer and see that it follows C calling conventions--mainly so the debugger can follow each step.
If you change -O3 to -O0 you will see a lot of stack traffic; that's true. But this has nothing to do with calling conventions -- there are no calls generated for the intrinsics. Rather, you see stack traffic because GCC doesn't try to do proper register allocation at -O0. Nothing to do with intrinsics.
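To make this concrete, here is a minimal sketch of the kind of code under discussion. It is not the article's code: hsum4 is a hypothetical helper name, and the scalar fallback is only there so the snippet builds when SSE3 is not enabled (for example without -msse3).

```c
#if defined(__SSE3__)
#include <pmmintrin.h>  /* SSE3 intrinsics, including _mm_hadd_ps */
#endif

/* Sum four floats. hsum4 is a hypothetical name, not from the article. */
static float hsum4(const float *v)
{
#if defined(__SSE3__)
    __m128 x = _mm_loadu_ps(v);  /* single load at the start */
    x = _mm_hadd_ps(x, x);       /* lanes: v0+v1, v2+v3, v0+v1, v2+v3 */
    x = _mm_hadd_ps(x, x);       /* every lane now holds v0+v1+v2+v3 */
    return _mm_cvtss_f32(x);     /* read lane 0 */
#else
    /* Scalar fallback with the same pairwise order of additions. */
    return (v[0] + v[1]) + (v[2] + v[3]);
#endif
}
```

With optimization on, compilers typically keep x in a register between the two haddps instructions; the loads and stores between steps only show up in unoptimized builds. Comparing the output of -O0 and -O2 for this function should show the difference.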
I guess it depends on the intended audience of this post?
I might also find it hard to understand some of the context of a post poking fun at some trend on some knitting mailing list, if I wasn't a member of that list.
I think by "made headlines" they might mean "made a stir in some of the communities I frequent". Given the idea of bloggers as news people, a "headline" is a blog post that gains traction in circles you care about...
The C standard states that an unqualified "char" type (without the "signed" or "unsigned" keywords) is implementation-defined to be either signed or unsigned.
> char - type for character representation which can be most efficiently processed on the target system (has the same representation and alignment as either signed char or unsigned char, but is always a distinct type). Multibyte characters strings use this type to represent code units. The character types are large enough to represent any UTF-8 eight-bit code unit (since C++14). The signedness of char depends on the compiler and the target platform: the defaults for ARM and PowerPC are typically unsigned, the defaults for x86 and x64 are typically signed.
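A small sketch of how code can check and sidestep this, assuming only <limits.h> from the standard library. Both function names are hypothetical, chosen for illustration.

```c
#include <limits.h>

/* Returns 1 if plain char is signed for this compiler/target, 0 otherwise.
   CHAR_MIN is 0 when plain char is unsigned and negative when it is signed. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Returns the value of a byte held in a plain char as 0..UCHAR_MAX,
   whichever signedness plain char has. */
static int byte_value(char c)
{
    return (unsigned char)c;
}
```

Converting through unsigned char before widening is the usual way to avoid negative values when indexing tables or calling the <ctype.h> functions with bytes above 0x7F.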
This article is a fantastic look at the lower parts of the stack, and it’s really made me look forward to the next time I’ll be teaching this topic in my own classes.
RISC looks much easier to learn than the code I’ve been using: clang generated asm in macOS.
I was actually thinking of moving to using assembler in DOSBox (1) for nostalgia reasons and (2) because it could be a lot simpler and less crufty.
What’s an equivalent platform / set of tools with which I could do some real world RISC? Is my macOS assembler RISC already, but just not as elegant as this author’s examples?
RISC (https://en.m.wikipedia.org/wiki/Reduced_instruction_set_comp...) is a class of computer architectures. RISC-V (https://en.m.wikipedia.org/wiki/RISC-V) is an architecture in this family.
> I was actually thinking of moving to using assembler in DOSBox
That would mean using cumbersome tools in a cumbersome environment to target the same cumbersome x86 architecture. It's a horrible idea. Please don't inflict your nostalgia on students. Your past context means nothing to them, and it would not teach them anything useful.
> What’s an equivalent platform / set of tools with which I could do some real world RISC?
If your Clang/LLVM is compiled appropriately, it includes a RISC-V backend. You can pass something like --target=riscv64 to get code generated. You will need a cross-libc installed if you want to be able to call anything but syscalls. Then use an instruction set simulator to execute the binary. There is a simulator called Spike for RISC-V, or you could use QEMU. The answer here sketches these steps: https://stackoverflow.com/questions/54670887/how-can-i-compi...
Alternatively, set up a whole RISC-V Linux system inside QEMU.
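If you only want to read the generated RISC-V assembly, a freestanding function with no library calls avoids the cross-libc question entirely. A minimal sketch, where sum_to is a hypothetical example function and the exact flags are an assumption about your toolchain:

```c
/* No #includes and no libc calls, so no cross-libc is needed just to look
   at the generated assembly. Assumed invocation (adjust for your setup):
   clang --target=riscv64 -O1 -S sum.c */

/* Sum of the integers 1..n; returns 0 for n <= 0. */
long sum_to(long n)
{
    long total = 0;
    for (long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Running the result still needs a simulator and, for anything that does I/O, the libc setup described above.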
This might be beyond the scope of the article, but it may be worth noting https://ispc.github.io/, which is a C language extension and accompanying (LLVM-frontend) compiler. ISPC exposes an SPMD programming model that efficiently maps to the CPU's vector lanes (think CUDA kernels, but running on those vector lanes instead of a GPU). Except for a few new keywords and some library functions, it is almost identical to C and has C linkage.
People always confuse these things, and it's important that we don't if we want to keep striving for better programming languages.