
Assembly's Perspective of C - pavehawk2007
https://blog.stephenmarz.com/2020/05/20/assemblys-perspective/
======
chrisseaton
> If you haven’t been living under a rock quarantined against COVID-19, you’ve
> probably heard about C becoming assembly 2.0, or the new assembly, or YES
> THIS.

What's this referring to? Has something been proposed recently?

> Now, many compilers have done some work with this using what are known as
> intrinsics. However, these do not allow us to fully exploit what we can in
> assembly. Most of the intrinsic instructions follow the C calling
> conventions, so they are required to load from RAM, do something with it,
> then store back into RAM.

> Since we store the intermediate values, we’re exercising the memory
> controller for every intrinsic. For our assembly, we load at the beginning
> and store at the very end. Nothing is in between.

I don't understand this claim. It seems demonstrably false to me. Intrinsics
don't have to follow the C calling convention, and in practice they don't. You
can try it for yourself: the compiler definitely doesn't store intermediate
results to memory for his example code.

In fact, isn't what clang generates roughly the same assembly as the hand-
written assembly in the article?

[https://godbolt.org/z/rfmFB5](https://godbolt.org/z/rfmFB5)

The author looks like an expert so I guess I'm missing something.
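
For reference, here's a sketch of the kind of intrinsic code under discussion
(the article's exact example may differ); at -O2, Clang compiles this to two
haddps instructions with no intermediate stores:

    #include <immintrin.h>

    /* Horizontal sum of four floats via the SSE3 hadd intrinsic.
       Compile with: -O2 -msse3 */
    float hsum(__m128 v) {
        v = _mm_hadd_ps(v, v);   /* (a+b, c+d, a+b, c+d) */
        v = _mm_hadd_ps(v, v);   /* every lane now holds a+b+c+d */
        return _mm_cvtss_f32(v); /* extract lane 0 */
    }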

~~~
pavehawk2007
Hello,

The introduction was meant to be tongue-in-cheek. There was a talk back in
January which made headlines due to its claims.
[https://www.youtube.com/watch?v=8SoJR3sCaR4](https://www.youtube.com/watch?v=8SoJR3sCaR4)

When I started learning the Nim programming language, the claim going around
was that anything you can do in assembly, you can do in C... In other words,
there was no more reason to learn or write assembly, unless you were making a
C compiler yourself. I don't think those claims were ever taken seriously, but
neither was my introduction supposed to be taken seriously. The point being:
instead of having your brand-new language compile to assembly, why not compile
to C?

Regarding the intrinsics, you are correct if you optimize. Using your same
link, you can turn off the optimizer and see that it follows C ABI conventions
-- mainly so the debugger can follow each step.

The point here is that we're still writing assembly, with a C marinade on top.
_mm_hadd_ps is wrapping an assembly instruction directly. However, to see what
it is actually doing, we need to know a bit of assembly. In other words, we're
writing C in syntax only.
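
A minimal sketch of that 1-to-1 wrapping (with optimization on, the function
body below is a single haddps instruction):

    #include <pmmintrin.h> /* SSE3 */

    __m128 pairwise_sums(__m128 a, __m128 b) {
        return _mm_hadd_ps(a, b); /* wraps the haddps instruction */
    }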

Perhaps I can reword that part better. Thanks for the comment!

~~~
tom_mellior
> There was a talk back in January which made headlines due to its claims.
> [https://www.youtube.com/watch?v=8SoJR3sCaR4](https://www.youtube.com/watch?v=8SoJR3sCaR4)

That talk has 3194 views at the moment. That's pretty obscure.

> Using your same link, you can turn off the optimizer and see that it follows
> C ABI conventions -- mainly so the debugger can follow each step.

If you change -O3 to -O0 you will see a lot of stack traffic, that's true.
This has nothing to do with calling conventions -- _there are no calls
generated for the intrinsics_. Rather, you just see stack traffic because GCC
doesn't try to do proper register allocation at -O0. Nothing to do with
intrinsics.
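
You can check this by diffing the generated assembly at the two optimization
levels (file name is hypothetical):

    # -O0: GCC spills values to the stack after nearly every step.
    # -O2: the same code stays entirely in xmm registers.
    gcc -O0 -msse3 -S example.c -o example_O0.s
    gcc -O2 -msse3 -S example.c -o example_O2.s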

~~~
kbenson
> That talk has 3194 views at the moment. That's pretty obscure.

I guess it depends on the intended audience of this post?

I might also find it hard to understand some of the context of a post poking
fun at some trend on some knitting mailing list, if I wasn't a member of that
list.

I think by "made headlines" they might mean "made a stir in some of the
communities I frequent". Given the idea of bloggers as news people, a
"headline" is a blog post that gains traction in circles you care about...

~~~
tom_mellior
Yes, the video was probably passed around in the Nim community. The featured
article isn't directed at the Nim community, though. The author just misjudged
the video's reach outside of their knitting mailing list.

------
rajeevk
I wrote a very similar article a while back.
[http://www.avabodh.com/cin/cin.html](http://www.avabodh.com/cin/cin.html)

------
nayuki
> All in all, the compiler, when emitting its assembly code, must make these
> decisions. One data type of note is char. The standard does not 100% state
> whether we should sign extend or zero extend a char. In fact, I noticed a
> difference between GNU’s RISC-V toolchain vs GNU’s Intel/AMD toolchain. The
> former will sign extend, whereas the latter zero extends.

The C standard states that an unqualified "char" type (without the "signed" or
"unsigned" keywords) is implementation-defined to be either signed or
unsigned.

[https://en.cppreference.com/w/cpp/language/types#Character_t...](https://en.cppreference.com/w/cpp/language/types#Character_types)

> char - type for character representation which can be most efficiently
> processed on the target system (has the same representation and alignment as
> either signed char or unsigned char, but is always a distinct type).
> Multibyte characters strings use this type to represent code units. The
> character types are large enough to represent any UTF-8 eight-bit code unit
> (since C++14). The signedness of char depends on the compiler and the target
> platform: the defaults for ARM and PowerPC are typically unsigned, the
> defaults for x86 and x64 are typically signed.
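
A quick way to check which choice your implementation made (a minimal sketch):

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        /* CHAR_MIN is 0 where plain char is unsigned (typically ARM,
           PowerPC) and negative where it is signed (typically x86). */
        printf("char is %s\n", CHAR_MIN < 0 ? "signed" : "unsigned");
        return 0;
    }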

------
loosetypes
Would assembly consider C a declarative language?

~~~
uryga
"it's so simple, you just describe that you want `x` to become equal to `y+1`,
and the compiler figures out the implementation" ;)

------
daniel-thompson
> "Another reason we use assembly over a high-level language is to use
> instructions that C or C++ cannot really use. For example, the advanced
> vector extensions (AVX) for an Intel processor don’t have a 1-to-1
> conversion from C."

This might be beyond the scope of the article, but it may be worth noting
[https://ispc.github.io/](https://ispc.github.io/), a C language extension
with an accompanying LLVM-based compiler. ISPC exposes an SPMD programming
model that maps efficiently onto the CPU's vector lanes (think CUDA kernels,
but running on those vector lanes instead of a GPU). Except for a few new
keywords and some library functions, it is almost identical to C and has C
linkage.
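
For flavor, a tiny kernel in ISPC's C-like syntax (a sketch along the lines of
ISPC's own examples; `uniform` marks values shared by all program instances,
and `foreach` spreads iterations across the SIMD lanes):

    // Each program instance handles a slice of the iterations,
    // mapped onto the CPU's vector lanes.
    export void scale(uniform float vin[], uniform float vout[],
                      uniform int count, uniform float factor) {
        foreach (i = 0 ... count) {
            vout[i] = vin[i] * factor;
        }
    }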

------
gorgoiler
The thing I adore about computer science is that it is quite tractable to
understand absolutely everything, from theories of data and computation,
through programming languages and operating systems, down to the hardware.

This article is a fantastic look at the lower parts of the stack, and it’s
really made me look forward to the next time I’ll be teaching this topic in my
own classes.

RISC looks much easier to learn than the code I’ve been using: clang-generated
asm on macOS.

I was actually thinking of moving to assembler in DOSBox, (1) for nostalgia
reasons and (2) because it could be a lot simpler and less crufty.

What’s an equivalent platform / set of tools with which I could do some real
world RISC? Is my macOS assembler RISC already, but just not as elegant as
this author’s examples?

~~~
tom_mellior
> RISC

RISC
([https://en.m.wikipedia.org/wiki/Reduced_instruction_set_comp...](https://en.m.wikipedia.org/wiki/Reduced_instruction_set_computer))
is a class of computer architectures. RISC-V
([https://en.m.wikipedia.org/wiki/RISC-V](https://en.m.wikipedia.org/wiki/RISC-V))
is an architecture in this family.

> I was actually thinking of moving to using assembler in DOSBox

That would mean using cumbersome tools in a cumbersome environment to target
the same cumbersome x86 architecture. It's a horrible idea. Please don't
inflict your nostalgia on students. Your past context means nothing to them,
and it would not teach them anything useful.

> What’s an equivalent platform / set of tools with which I could do some real
> world RISC?

If your Clang/LLVM is compiled appropriately, it includes a RISC-V backend.
You can pass something like --target=riscv64 to get code generated. You will
need a cross-libc installed if you want to be able to call anything but
syscalls. Then use an instruction set simulator to execute the binary. There
is a simulator called Spike for RISC-V, or you could use Qemu. The answer here
sketches these steps: [https://stackoverflow.com/questions/54670887/how-can-i-compi...](https://stackoverflow.com/questions/54670887/how-can-i-compile-with-llvm-clang-to-risc-v-target)
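
For just looking at the generated assembly, no cross-libc or simulator is
needed at all (illustrative invocation; exact target triples vary by install):

    # Emit RISC-V assembly from C; stopping before linking avoids
    # the need for a cross-libc:
    clang --target=riscv64 -O2 -S example.c -o example.s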

Alternatively, set up a whole RISC-V Linux system inside Qemu.

------
pavehawk2007
I write OSes most of the time, but seeing how languages boil down to machine
code is a very good exercise for learning the architecture.

------
Ericson2314
The nice thing about assembly code isn't the lack of abstractions, but the
fact that the image of compilation (the assembly process) is bigger than that
of most languages' compilers: just about any output you want has a
corresponding input.

People always confuse these things, and it's important that we don't if we are
to continue striving for better programming languages.

