1. On amd64 those ints are actually 64-bit. If you used int32, they would be word (4-byte) aligned in the parameter list. However, there is a gotcha with that: the return values will always start at a dword (8-byte) aligned offset on a 64-bit system.
2. NOSPLIT is defined in "textflag.h" which Go's compiler automatically provides. However, NOSPLIT is, from everything I've read, only respected on runtime.XX functions, so it's not doing anything there, and it's also not necessary. NOSPLIT tells the compiler not to insert code to check if the stack needs to split because it's going to overflow, which is technically unnecessary if the function doesn't need any stack space. It's basically only there on the function that checks for stack splits, to prevent that code from being injected into itself.
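For anyone who wants to check point 1 without writing assembly, here is a minimal sketch of the offset arithmetic. The helper names (`alignUp`, `resultOffset`) are made up for illustration, and it only models scalar arguments under the stack-based layout that hand-written assembly sees, so treat it as an approximation rather than what the toolchain actually runs.

```go
package main

import (
	"fmt"
	"math/bits"
)

// ptrSize is 8 on 64-bit targets and 4 on 32-bit ones.
const ptrSize = 4 << (^uintptr(0) >> 63)

// alignUp rounds off up to the next multiple of align (a power of two).
func alignUp(off, align int) int {
	return (off + align - 1) &^ (align - 1)
}

// resultOffset is a hypothetical helper: given the sizes in bytes of a
// function's scalar arguments, it returns the FP offset at which the first
// result would start. Assumption: each scalar is aligned to its own size,
// which holds for the integer types discussed here on amd64.
func resultOffset(argSizes ...int) int {
	off := 0
	for _, size := range argSizes {
		off = alignUp(off, size)
		off += size
	}
	// Results start at a pointer-aligned offset after the arguments.
	return alignUp(off, ptrSize)
}

func main() {
	intSize := bits.UintSize / 8 // int has the same width as uint
	fmt.Println("int is", bits.UintSize, "bits wide")
	fmt.Println("func(a, b int) int: result at", resultOffset(intSize, intSize))
	fmt.Println("func(a, b, c int32) int32: result at", resultOffset(4, 4, 4))
}
```

On amd64 both signatures should put the result at offset 16: three int32 arguments only occupy 12 bytes, but the result is still pushed out to the next 8-byte boundary, which is the gotcha.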
> 4 represents “NOSPLIT” which we need for some reason
For those who are curious: "In the general case, the frame size [the parameter after NOSPLIT] is followed by an argument size, separated by a minus sign. (It's not a subtraction, just idiosyncratic syntax.) The frame size $24-8 states that the function has a 24-byte frame and is called with 8 bytes of argument, which live on the caller's frame. If NOSPLIT is not specified for the TEXT, the argument size must be provided. For assembly functions with Go prototypes, go vet will check that the argument size is correct."
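To make the "it's not a subtraction" part concrete, here is a small sketch that splits a size operand like `$24-8` into its two numbers. `parseFrameSpec` is a hypothetical helper written for this comment, not something from the Go toolchain, and it only handles the simple non-negative form quoted above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFrameSpec is a hypothetical helper that splits a TEXT size operand
// such as "$24-8" into frame size and argument size. The "-" is a
// separator, not a minus sign.
func parseFrameSpec(spec string) (frame, args int, err error) {
	if !strings.HasPrefix(spec, "$") {
		return 0, 0, fmt.Errorf("size operand %q must start with $", spec)
	}
	frameStr, argStr, hasArgs := strings.Cut(spec[1:], "-")
	if frame, err = strconv.Atoi(frameStr); err != nil {
		return 0, 0, err
	}
	if !hasArgs {
		// Form like "$0": the argument size was omitted.
		return frame, 0, nil
	}
	if args, err = strconv.Atoi(argStr); err != nil {
		return 0, 0, err
	}
	return frame, args, nil
}

func main() {
	frame, args, err := parseFrameSpec("$24-8")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("frame=%d bytes, args=%d bytes\n", frame, args)
}
```

So `$24-8` reads as a 24-byte frame plus 8 bytes of arguments, not as 16.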
This is a fair question. Initially I just assumed this was true, because someone who did research on this topic would not get something like this wrong. And besides that, if you know a little about the project, this name could make some sense.
But the more I look into it, the more I think this is just an LLM hallucination.
The doc about the 'assembly' format doesn't give it a proper name. It just calls it the 'Go assembler'.
And I think the source of this hallucination was this first paragraph:
> The assembler is based on the input style of the Plan 9 assemblers, which is documented in detail elsewhere. If you plan to write assembly language, you should read that document although much of it is Plan 9-specific. The current document provides a summary of the syntax and the differences with what is explained in that document, and describes the peculiarities that apply when writing assembly code to interact with Go.
Maybe you should actually read something from the official website before spending time writing multiple paragraphs assuming it's fake. A lot of the people involved in Go were also involved in Bell Labs' Plan 9 project, going back to the 1980s (Kernighan and Pike especially go back that far). The CSP threads from Plan 9 were influential in the development of the programming language. And you can find this on their official site:
No, it doesn't have a name. Plan 9 is an operating system, and this style of assembly language syntax originates from the assembler used on this operating system. It's like saying "The GNU Compiler Collection uses its own internal assembly language called Unix."
They aren't Linux; they use the Linux kernel alongside a Java or JavaScript userspace, which is not really the same thing. It is a reality Termux refuses to acknowledge, and that is why it is no longer available on the Play Store.
My dear summer child. My degree trained me to build computers from logic, write an operating system, write userspace code and applications (with a side of AI) all before the year 2000.
I don't know where you did your degree or when. But my friend you are objectively wrong.
Termux no longer runs because it no longer allows (possibly using Linux capabilities?) subprocesses from around Android 10. Android 12 if memory serves actually starts killing background processes.
No hacks. Unless your degree describes using the POSIX fork()/exec() API as "hacks".
Please don't embarrass yourself further. It was quite painful reading your prior response.
A hack that Termux folks now suffer from, because it fails Play Store API validation for forking processes, which sideloading works around, until Google decides to forbid that as well.
Coding since the 1980s, and only fools are afraid to be embarrassed.
Hmm the NDK is for userspace, you can remove functions out of the standard libraries, but the Linux syscall API will likely be untouched
Apple does this too for its more locked down devices.
I've been coding since the 80s too. I had assumed from your hubris and ignorance that you were young. My mistake, it's clear that you're merely an idiot.
Enjoy the weekend, happy in the knowledge that I shall no longer be engaging with you.
> Overall, pretty weird stuff. I am not sure why the Go team went down this route. Maybe it simplifies the compiler by having this bespoke assembly format?
Rob Pike spoke on the design of Go's assembler at a talk in 2016 [1][2]. I think it basically came down to the observation that most assembly language is roughly the same, so why not build a common assembly language that "lets you talk to the machine at the lowest level and yet not have to learn a new syntax." It also enables them to automatically generate a working assembler given an instruction manual PDF for a new architecture as input.
Yes, this is a great leap forward in my opinion. I had to do a project at a previous job where I wrote an agent that ran on x86, MIPS and ARM, and doing it in Go was a no-brainer. The other teams who had a bunch of C code that was a nightmare to cross-compile were so jealous they eventually moved a lot of things to Go.
I've been doing this for 35 years and cross compiling anything nontrivial was always a toolchain nightmare. Discovering a world where all I had to do was set GOARCH=mips64 (and possibly GOOS=darwin if I wanted mac binaries) before invoking the compiler is so magical I was extremely skeptical when I first read about it.
It's still pretty slow, but overall correct. There's tricks, like reader connections and a single writer connection to reduce contention. There was a blog post on here detailing some speedups in general.
A fair enough assessment, it be that way, however I will note that a large reason that C exists in the first place was to have a machine independent language to write programs in.
> however I will note that a large reason that C exists in the first place was to have a machine independent language to write programs in.
That's fair, but what we call a monstrosity by modern standards is much simpler than porting the assembly
There were cross-platform languages before C, but they never really took off for system development the way C did (OSs, for example, were commonly written in pure assembly).
A side effect of C not having a price tag associated with it: anyone with UNIX source tapes got a C compiler for free, until commercial UNIX became a thing and split into user/developer SKUs, and thus GCC, largely ignored until then, became a thing worth supporting.
mips64!? That's a blast from the past. It must be some kind of legacy hw that's getting current software updates in some kind of really niche use case. Or academia. :)
Like previous you, I have to admit I'm skeptical but would be happy to be wrong.
> mips64 .. must be some kind of legacy hw that's getting current software updates
Hundreds of thousands of linux-based smartnic cards, actually. Fun stuff. Those particular ones were EOLd and have been replaced with ARM but the MIPS based ones will live on in the datacenters until they die, I'm sure.
> Like previous you, I have to admit I'm skeptical but would be happy to be wrong
Seriously, you are going to be delighted to be wrong. On your linux machine, go write a go program and write "GOOS=darwin GOARCH=arm64 go build ..." and you will have yourself an ARM mac binary. Or for going the other way, use GOOS=linux GOARCH=amd64. It really is that simple.
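If you want something tiny to try this with, here is a minimal sketch of a program that reports the target it was compiled for. The `target` helper is just a name picked for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// target reports the OS/architecture pair this binary was compiled for.
// runtime.GOOS and runtime.GOARCH are fixed at build time, so they reflect
// the GOOS/GOARCH values used when invoking the compiler.
func target() string {
	return runtime.GOOS + "/" + runtime.GOARCH
}

func main() {
	fmt.Println("built for", target())
}
```

Build it once normally and once with `GOOS=darwin GOARCH=arm64 go build` in front, then run each binary on a matching machine; each should report the platform it was built for rather than the one it was built on.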
Ah I found this https://ctrl-c.us/posts/test-goarch I guess it's qemu-user-binfmt registering the alternate bin formats to automatically run under QEMU, that's pretty neat
The Go build system runs under your current architecture, cross-compiling tests to your target architecture.
Then, the Go test runner also runs under your current architecture, orchestrating running your cross compiled test binaries.
Since you registered to run cross-compiled binaries under QEMU, those test binaries magically run through QEMU.
The Go test runner collects test results, and reports back to you.
The first run might be slowish, as the Go compiler needs to cross compile the standard library and all your dependencies to your target platform. But once that's done and cached, and if your tests are fast, the edit-test cycle becomes pretty quick.
"EdgeOS" is based on Linux, and people run vanilla Linux distributions on those boxes, as well as OpenBSD and NetBSD.
I wonder how long Marvell will continue selling those Octeon MIPS64 chips, though. Marvell (then Cavium) switched to ARM nearly a decade ago (2016) for newer chips in the Octeon series. I think Loongson sells more modern MIPS64 (or at least MIPS64-like) chips, but they don't seem to be commercially available outside China.
Go essentially copied the design from Plan9 compilers, which it was originally based on. It's one of the many things it inherited from Plan9 environment.
I would love to see a deep dive on what features / architectural paradigms the Golang runtime shares with Plan9. Has anything like that been written?
One that always sticks out to me personally is the use in Go of the term "dial" instead of "connect" for network connection establishment. This is, AFAICT, another Pike+Thompson-ism, as it can be seen previously in the form of the Plan9 dial(3) syscall — https://9fans.github.io/plan9port/man/man3/dial.html .
---
A tangent: I have wondered before whether Pike and Thompson drafted the design for the language that would become Golang long before working at Google, initially to replace C specifically in the context of being the lingua-franca for systems programming on Plan 9. And that, therefore — at least in the designer's minds — Golang would have always had Plan9 as its secret "flagship target" that it should have a 1:1 zero-impedance abstraction mapping onto. Even if they never bothered to actually make a Plan9 Golang runtime.
You could test this hypothesis by implementing an actual Golang runtime for Plan9†, and then comparing it to the Golang runtimes for other OSes — if Plan9 were the "intended home" for Golang programs, then you'd expect the Golang runtime to be very "thin" on Plan9.
(To put that another way: imagine the Golang runtime as something like WINE — a virtualization layer that implements things that could be syscalls / OS library code, in the form of client-side runtime shim code. A "WINE implementation for Windows" would be an extremely thin shim, as every shim call would just point to a 1:1-API-matched piece of OS-provided code. My hypothesis here is that "Golang for Plan9" is the same kind of thing as "WINE for Windows.")
† I was saying this as a thought experiment, not thinking there would actually be a Plan9 implementation of the Golang runtime... but there is! (https://go.dev/wiki/Plan9) So someone can actually check this :)
> I would love to see a deep dive on what features / architectural paradigms the Golang runtime shares with Plan9. Has anything like that been written?
If it has, then it's most likely available on https://cat-v.org/. Even if it hasn't, cat-v.org is a great starting point.
Besides, close to your line of thought, and assuming you didn't know about this already, Pike et al. previously worked on Limbo[0], a "predecessor" of Go, used to write Inferno[1], a Plan9-like OS, which could be hosted on arbitrary OSes via a bespoke virtual machine called "Dis".
So there were indeed a few previous "drafts" for Go. I'd doubt that Go has been designed "for" Plan9 though.
There’s also libthread[1] which implements concurrency model similar to goroutines (at least earlier, as they appear to be no longer just cooperatively scheduled?). That manual also mentions Alef and Newsqueak as influential predecessors.
While good cross-compilation is one of Go's strengths... I don't really think a unified assembly language contributed to that? Assembly code can't be shared across multiple architectures in general, so the only potential benefit would be the ease of parser development, which doesn't sound like a big thing. Rob Pike himself even noted that it can be off-putting to outsiders but considered it a worthy trade-off nevertheless [1], a conclusion I don't really think is justified.
This is by far one of the best parts of Go. It's all-around simple and painless to use. Anyone designing a language should study what Go did well and what it didn't.
Relevant to which processors go supports, is this section (1). Base x64 support includes SSE and SSE2. I don't know if the go compiler produces it, though. Unlike extremely complex compilers like gcc, where performance is the top priority, the go compiler favours simplicity in a Wirthian(2) fashion favouring a simple, fast compiler.
I was gonna say this: it seems like an LLM misinterpretation of the code. I can't think of any other way one would know the term plan9 and how to dive into assembly while not knowing what waters he is in. And then I read that others had the same thought.
If this is true, I kindly ask the author not to feel embarrassed or "exposed", but to be honest, so we can learn from this. I'd like to gain confidence in these types of "LLM exposed" things, but it never seems like people will admit it, no matter how obvious. And of course, here it is not obvious; this is a wild, very judgemental guess.
A bit over my head, but I enjoyed the way the writing brings us along for the ride.
This can’t be the first pass someone has made at something like this, right? There must be literal dozens of SIMD thirsty Gophers around. Would a more common pattern be to use CGO?
The problem with cgo is the high function-call overhead; you only want to use it for fairly big chunks of work. Calling an assembly function from Go is a lot cheaper.
I think people certainly have been trying for a while. In fact, I recall being on a (Skype?) call with my brother almost a decade ago while he was trying to write an SIMD library in Go. If I remember correctly, at that time, a bunch of the AVX instructions weren't even encodable in Go's Plan9 assembler - so we had to manually encode them as bytes [0].
The most complete library I've seen (though admittedly never used) uses CGO _partially_, with a neat hack to avoid the overhead that it comes with [1].
I would recommend checking out Avo (https://github.com/mmcloughlin/avo) if you're interested in writing Go assembly programs. It provides type safety and does some checks to ensure you output valid assembly. It can dynamically allocate registers for you and calculate things like stack and frame size so you don't have to do that by hand. It can also handle calling convention details for you, which makes it very easy to load an argument into whatever register/location you'd like.
I recently ported all of the amd64 assembly in Go's crypto libraries over to Avo. Very useful library for this sort of work!
Go's calling convention uses registers, except for hand-written assembly functions, which still use the stack-based convention; that is also how all Go code worked in the past. See https://go.dev/s/regabi and https://go.dev/doc/asm
Build tags have used the form "//go:build" instead of "// +build" since Go 1.17, so for a couple of years already.
More about build tags: using both build tags and filename suffix for arch-based conditional compilation is redundant. Just use one of them, not both.
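To see that the two build tag syntaxes say the same thing, here is a minimal sketch using the standard library's `go/build/constraint` package. The helpers `sameConstraint` and `matches` are made up for this comment, and the tag lines are only examples.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// sameConstraint reports whether two build constraint lines parse to the
// same boolean expression.
func sameConstraint(a, b string) (bool, error) {
	ea, err := constraint.Parse(a)
	if err != nil {
		return false, err
	}
	eb, err := constraint.Parse(b)
	if err != nil {
		return false, err
	}
	return ea.String() == eb.String(), nil
}

// matches evaluates a constraint line for a given OS and architecture,
// treating only those two names as satisfied tags.
func matches(line, goos, goarch string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	return expr.Eval(func(tag string) bool {
		return tag == goos || tag == goarch
	}), nil
}

func main() {
	newStyle := "//go:build linux && amd64"
	oldStyle := "// +build linux,amd64"

	fmt.Println("new style recognised:", constraint.IsGoBuild(newStyle))
	fmt.Println("old style recognised:", constraint.IsPlusBuild(oldStyle))

	same, err := sameConstraint(newStyle, oldStyle)
	fmt.Println("equivalent:", same, err)

	ok, err := matches(newStyle, "linux", "arm64")
	fmt.Println("builds on linux/arm64:", ok, err)
}
```

The filename suffix works as an implicit constraint of the same kind: a file named something like `add_amd64.s` is already limited to amd64, so repeating `amd64` in a tag inside that file adds nothing.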
This is neat, and it's great that Go provides such simple access to low-level primitives.
But for the particular case of SIMD operations, wouldn't it make more sense to use the GPU instead of the CPU? GPUs excel at parallelism and matrix operations, so the performance difference would be even greater. I suppose the lack of well maintained GPU packages and community around it don't make Go particularly well suited for this.
For a tool you run locally, a GPU could be an easy win. But most Go code is probably run on cloud servers. Adding requirements to your runtime environment isn't something to do lightly.
- SIMD: up to 400% speed boost, most likely on the same VMs you were already using
- GPU: orders of magnitude faster, but now you need to make sure your cloud servers have compatible GPUs attached
If you really do need crazy performance then it's worth it. If you're already stable and right-scaled and SIMD allows you to reduce your VM spend by 25%, then you probably just take the savings and move on.
Very minor nit, doesn't change anything about the article otherwise, but the SIMD acronym stands for *Single* Instruction, Multiple Data conventionally.
> The assembler is based on the input style of the Plan 9 assemblers, which is documented in detail elsewhere. If you plan to write assembly language, you should read that document although much of it is Plan 9-specific.
Go is the evolution of Limbo from Inferno, which was designed based on the failure of Alef on Plan 9, combined with a minor set of Oberon-2 influences.
There was a controversy when Go came out about the naming due to another language also being called Go, and the top voted alternative name was Plan9, and as an homage they may have used that internally instead.
The top voted alternative was "Issue 9", which served as a reference to Plan 9 and happened to be the actual issue number in the Go project on Google Code opened by the guy whose programming language (named "Go!") was already out there.
Erm... The title is rather wrong. It doesn't encourage thinking that the author actually understands the topic. I like to think he does, but the way he writes about the topic, also within the article, is confusing at least.
I mean, the fact that Go has its own Plan 9-derived assembler format has absolutely nothing to do with the task the author aims to solve.
Here is a good explanation provided by Ian Lance Taylor:
> This proposal is focused on code generation rather than language, and the details of how it changes the language are difficult to understand. Go is intended to be a simple language that it is easy to understand. Introducing complex semantics for performance reasons is not a direction that the language is going to take.
In case it's not obvious, the "Plan 9" in the Go Assembler's name comes from https://en.wikipedia.org/wiki/Plan_9_from_Bell_Labs, and the reason for that is of course that two of the "Go founders" (Rob Pike and Ken Thompson) are Bell Labs alumni. Some more background on the Go assembler: https://go.dev/doc/asm
The reason for that is not just being alumni, but also that Go implementation started on top of Plan 9's compiler infrastructure that was later rewritten in pure Go
Interesting trivia about the connection to plan9 the operating system.
>Go uses its own internal assembly language called Plan9.
Plan9 is the name of the OS. You wouldn't name a programming language "Linux", even if Linus created it and it was super related or not at all related.
It's not "plan9 assembly language" as in "the assembly language named plan9". Read it as "the otherwise unnamed custom assembly language used in the plan9 operating system".
The article simply misspoke by saying that the assembly language is "called plan9".
No way, the article consistently refers to the assembler syntax as "Plan9" throughout the text and title and they talk about "x86 Plan9" and "arm Plan9".
Considering there is no introduction at all to this beyond "I discovered it's called Plan9", I'm assuming the author really thinks this is a language widely named "Plan9".
> IIRC before Go was self compiling, it was compiled using 9c, and its architecture inherits from 9c.
Back in those days I actually found that, with a few small tweaks, I could compile the Plan 9 operating system using Go's C suite. We didn't pursue it further but this was one of the options we looked into for cross-compiling Plan 9 from other operating systems.
Since the Go C suite was a port of the Plan 9 compilers, I'm not sure why this would be a surprise. Since I'm obviously missing something, would you share your thoughts on what challenges you expected?
It wasn't any surprise, nor did I intend to imply it was surprising, just relaying an anecdote. There were a few differences that required small tweaks to the source, but we got the Plan 9 kernel compiled with the Go C compilers in a day.
EDIT: it seems the author is just mistaken; everyone else refers to the Go assembler simply as the "Go assembler", and the name has nothing to do with Plan 9 beyond the various connections to its origin.