> To make it a real thing I'd start by calling morestack manually from a NOSPLIT assembly function to ensure we have enough goroutine stack space (instead of rolling back rsp) with a size obtained maybe from static analysis of the Rust function (instead of, well, made up).
> It could all be analyzed, generated and built by some "rustgo" tool, instead of hardcoded in Makefiles and assembly files.
Maybe define a Go target to teach Rust about the Go calling conventions? You may also want to use "xargo", which is built specifically for stripping or customising "std" and for working with targets that lack a binary stdlib.
Two main points:
- Go uses a very small stack for goroutines to make them dirt cheap. When you exceed this stack, Go transparently maps in more stack for you. Rust-generated ASM is running on the Go stack, but when it exceeds its stack it expects to explode, like a C program, since that would normally be the OS stack. This is a larger problem than you think: Rust likes to put a ton of stuff on the stack. One of the nice things about Rust is that putting _a ton_ of data on the stack is cheap, and it makes ownership simpler.
- Go's system calls and concurrency primitives are cooperative with its runtime. When they are made, they communicate with the runtime so the goroutine can yield to it. Targeted Rust code would _also_ have to make these calls, as would 3rd-party crates.
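The first point is easy to demonstrate; here's a hedged sketch (`deep` is a made-up function) showing a goroutine happily recursing through megabytes of stack that it never had to ask for:

```go
package main

import "fmt"

// deep burns ~256 bytes of stack per call; 10000 frames needs a few MB,
// far past a goroutine's initial ~2kB stack. Go grows the stack
// transparently; Rust code running on that same stack would assume a
// fixed C-sized stack and simply overflow instead.
func deep(n int) int {
	var pad [256]byte
	pad[0] = 1 // touch the padding so it isn't optimized away
	if n == 0 {
		return 0
	}
	return deep(n-1) + int(pad[0])
}

func main() {
	done := make(chan int)
	go func() { done <- deep(10000) }()
	fmt.Println(<-done) // 10000: the goroutine's stack grew as needed
}
```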
Again, none of this is impossible; linker directives and FFI magic could import these functions/symbols. But this would also require Go to have a stabilized runtime environment for other languages to link against. Currently, just stating that Go has a runtime is controversial, so I expect this won't happen soon.
If you want full Go embedding sure, but this discussion is in the context of TFA whose stated purpose is the ability to build optimised "pure" sub-functions without having to use plan9 assembly.
In that case, what a Go target does is remove the need for a trampoline and manually fucking around with calling conventions.
This point confuses me; if Rust expects to run on a limited stack, why would it expect to put a ton of data on the stack?
> Currently just stating Go has a runtime is controversial
I've never heard any controversy... The Go community certainly call it a runtime, and there's even a "runtime" package. Do folks from VM languages get grumpy because Go's runtime is statically linked?
Rust runs on a C stack; while it's not infinite, it's in a whole other ballpark than a Go stack since it's non-growable (Rust used growable stacks before 1.0): the default C stack size is in the megabyte range (8MB virtual on most unices), while in Go the initial stack is 2kB (since 1.4; 8kB before).
You can set the size to "unlimited", but systems vary in their behaviour there: on my OSX it sets the stack size to 64MB, while Linux apparently allows actual unlimited stack sizes, though I've no idea how it actually provides that.
I think libpthread defines its own stack size, circa 2MB, so 8MB would be the stack of your main thread and 2MB for sub-threads, but I'm not actually sure.
As for concurrency and system calls, I'd say that Go people would probably get a tremendous amount of value out of just single-threaded Rust code that didn't do I/O. Like parsers.
If you ever want to do more than the most trivial FFI, you'll eventually want to be able to pass types back and forth (usually opaque to either side). AFAIK Go doesn't offer any pinning of its GC'd types, so you can have the collector move them out from under you.
C# has this beautiful thing where you can pass a delegate as a raw C fn pointer. It makes building interop a wonderful thing but you have to make sure to pin/GCHandle it appropriately.
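On the Go side, a common workaround for the lack of pinning is to hand foreign code opaque integer handles instead of pointers; a sketch of that pattern (the `registry` type here is made up, not any official API):

```go
package main

import (
	"fmt"
	"sync"
)

// registry keeps real Go objects in a Go-side table and hands out
// stable integer handles. Since the GC may move or collect objects a
// foreign caller holds, the foreign side never sees a raw pointer.
type registry struct {
	mu      sync.Mutex
	next    uintptr
	objects map[uintptr]interface{}
}

func newRegistry() *registry {
	return &registry{next: 1, objects: make(map[uintptr]interface{})}
}

// Put stores obj and returns an opaque handle safe to pass across FFI.
func (r *registry) Put(obj interface{}) uintptr {
	r.mu.Lock()
	defer r.mu.Unlock()
	h := r.next
	r.next++
	r.objects[h] = obj
	return h
}

// Get resolves a handle back to the Go object.
func (r *registry) Get(h uintptr) interface{} {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.objects[h]
}

// Free releases the handle so the GC can reclaim the object.
func (r *registry) Free(h uintptr) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.objects, h)
}

func main() {
	r := newRegistry()
	h := r.Put("some Go object")
	fmt.Println(r.Get(h)) // the foreign side only ever sees h
	r.Free(h)
}
```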
Defining a Go target for Rust actually makes sense in the context of replacing assembly (which has no runtime, GC or concurrency connotations), I was just too lazy to do it that way :)
This is a good thing, because `cgo` is really bad. No ASLR, always forking. These are completely _insane_ defaults; it manages to be slower than JNI, which is an FFI from a completely managed, stack-based VM universe, not a _compiled_ language.
Somehow a compiled language calling a static binary manages to be slower than a dynamic language's runtime calling a static binary...
`cgo` isn't doing _anything right_. It is doing a lot of things wrong.
Also, I don't see any reason why jni should be slower than cgo. Go has its own scheduler that brings overhead to cgo and needs to switch stacks whereas Java doesn't have to deal with such things.
Well, sometimes the solution is to reinvent it and provide another solution. Sometimes a project has a specific goal which might preclude them from using your idea, or the people involved just have a slightly different vision.
Sometimes, if you believe the existing project is wrong from the ground up, the solution is to reinvent it as something else. Sometimes it doesn't pan out, sometimes it does.
That's the beauty of open source.
Excuse my language, or don't. But the Go maintainers really don't give a shit about improving their language's performance. Also, ASLR is disabled for FFI for debugging simplicity, which I can only imagine means they debug by reading literal core dumps by hand.
Furthermore I'd rather not donate my time and energy to a company as large as Google.
> Also, I don't see any reason why jni should be slower than cgo. Go has its own scheduler that brings overhead to cgo and needs to switch stacks whereas Java doesn't have to deal with such things.
The JVM has its own scheduler to keep locks fair, and OpenJDK does green threading in its runtime to allow for GC cycles on JIT'd code. I'm pretty sure Oracle and Azul do as well, since JIT execution/cleanup requires doing stack swapping.
On my machine (2010 MBP @ 2.4GHz), cgo calls have an overhead of ~120ns.
Based on https://github.com/golang/go/issues/12416 / https://golang.org/cmd/cgo/ :
> However, C code may not keep a copy of the Go pointer after the call returns.
It looks like they've just punted on the pinning issue and not allowed code that does it.
Having written a ton of FFI code across a wide range of languages this type of restriction means that you're not going to be able to implement certain things which is unfortunate.
Is there some issue with this approach that I'm missing? Is the additional process overhead really enough that it's worth bending over backwards to avoid it?
The thing that should probably be said is that the difficulty is all on the Go side. Rust doesn't have any of the clumsiness that Go has when interacting with other languages. It's fully fluent in the lingua franca of FFI (the C ABI). If you were integrating your Ruby, Node or Python code with Rust instead of Go, there are nice libraries that make it simple, easy and very low overhead.
For users of these scripting languages, Rust is a nice tool to keep in your back pocket to pull out in the rare cases that you're not getting the performance you need. It means being able to choose your tools based solely on developer ergonomics and existing team knowledge, knowing that in the rare cases that you do need to do something computationally intensive, you can drop to Rust, push everything through a Rayon parallel iterator, write the performance-sensitive logic and push the result back. It's also really useful to use Rust in environments like Lambda/Cloud Functions that only support those scripting languages, since those environments tend to charge based on memory and CPU time and Rust makes it easy to get by with a minimum of both.
 https://github.com/tildeio/helix (Ruby)
 https://github.com/neon-bindings/neon (Node)
 https://github.com/pyo3/pyo3 (Python)
I would think it would be better overall to just create a local http server in Go and use that instead. Or sockets if you're feeling up to it.
I'm not seeing any performance issues with stdout, but I'm also not writing much data.
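A hedged sketch of the local-server approach (handler and function names are made up); listening on port 0 lets the OS pick a free port, so nothing needs configuring:

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// startServer listens on a free localhost port (":0" lets the OS
// choose) and serves a trivial endpoint in a background goroutine.
func startServer() string {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/double", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "84") // a real service would parse the request here
	})
	go http.Serve(ln, mux)
	return ln.Addr().String()
}

// callDouble plays the role of the other language: just an HTTP client.
func callDouble(addr string) string {
	resp, err := http.Get("http://" + addr + "/double")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	return string(body)
}

func main() {
	addr := startServer()
	fmt.Println(callDouble(addr))
}
```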
Sorry, what? Just make the port configurable?
Anyway, for my purposes, this wouldn't work, since the executable is embedded in libraries that are meant to run anywhere without any configuration. But yeah I could see that being fine under other circumstances I guess.
C is still the common denominator. You'd think it'd be easy, but it's hard. Years ago, when LLVM was showing promise and Google was going to get Python running on top of it, I was hopeful.
I guess nowadays it's a better design to run separate processes and have your languages communicate out of process (pipes, http, etc) rather than in-process.
> Go strives to find defaults that are good for its core use cases, and only accepts features that are fast enough to be enabled by default, in a constant and successful fight against knobs.
"A constant and successful fight against knobs" really gets at a lot of what makes Go (often) a joy to use.
- Fast compilation
- Multi platform
- Great tooling
- Good std lib
- Easy learning curve
- Fast enough for most scenarios
- Good concurrency model
With the caveat that I'm still fairly new to Go, I'm not sure that I'd call this a complete win for Go over Rust. With regards to editing tools, yes, Go has the edge; completion, formatting, and function lookup are all still better than anything Rust has to offer, although rustfmt has come leaps and bounds from where it used to be, and RLS looks very promising. With regards to what I'll call "infrastructure" tooling (building, packaging, testing, etc.), Rust absolutely wins, hands down. Cargo is miles ahead of anything I've heard about in the Go world, which currently has multiple commonly used tools for dependency management (none of which are nearly as good as Cargo), and in the few weeks I've been writing Go I've already encountered Makefiles and custom shell scripts for common building tasks that would be a breeze with Cargo. That's not to say that there aren't equally good options out there for Go, but if they exist, they don't seem to be universally used, which mitigates their usefulness.
More on a personal-opinion level, I also utterly despise the way GOPATH works. I generally try to group the projects I work on by category rather than by language (e.g. ~/code/projects for side projects, ~/code/forks for open source repos I contribute to but don't maintain, ~/code/scratch for simple projects in each language I use for when I want to try out a package or something, etc.), which GOPATH is completely incompatible with, as I'd need to add each of them to my GOPATH, put all my Go projects of each type into a "src" directory in each of them, and then either move all my non-Go projects there or have a bunch of Go-specific directories littered around in each of them. If I don't put my Go projects in a "src" directory inside one of the paths on my GOPATH, then I can't use gorename to change the name of a variable or function, which is frankly ridiculous. This isn't to say that my way of organizing my projects is the "best" way, but I feel like enforcing a directory structure outside of the project that the tool is being used in is borderline hostile to the user.
In return, you spend more time debugging since there are no interesting static guarantees.
So is Rust? Even if it weren't, you've already tied yourself to the platform support of Rust at this point.
Not familiar enough to comment on this one.
I can't take Go's stdlib seriously with the way they handle errors.
Despite using it for a year I still have to google Go's syntax and semantics daily. By far the least consistent and hardest to learn general purpose language I have touched.
This is literally just wrong. How can you claim to be serious about concurrency when you have no concept of immutability in your language?
- In return, you spend more time debugging since there are no interesting static guarantees.
I spend almost no time debugging my Go code... It generally either works correctly, or fails to compile... and in the cases where it doesn't work correctly, I'd prefer to have a test that makes it obvious what went wrong, and then make the test pass.
- - Great tooling
- Not familiar enough to comment on this one.
Go's tooling is one of the reasons I use it, esp. the very strict formatting style
- - Good std lib
- I can't take Go's stdlib seriously with the way they handle errors.
I prefer the way Go handles errors, cause it makes it so all control flow is visible by default.
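For what it's worth, the style being defended looks like this (a sketch; `parsePort` is a made-up function): every failure path is an explicit, visible branch at the call site rather than an invisible unwind.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parsePort returns an error as an ordinary value; callers must
// inspect it, so all control flow stays visible in the source.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("not a number: %v", err)
	}
	if n < 1 || n > 65535 {
		return 0, errors.New("port out of range")
	}
	return n, nil
}

func main() {
	for _, s := range []string{"8080", "99999", "http"} {
		p, err := parsePort(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("port:", p)
	}
}
```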
- - Easy learning curve
- Despite using it for a year I still have to google Go's syntax and semantics daily. By far the least consistent and hardest to learn general purpose language I have touched.
I'd seriously question this, I was competent in the language after about a week, meaning at the point of just having to look up package specific stuff. Granted, I have a lot of background in C family languages, but still...
- - Good concurrency model
- This is literally just wrong. How can you claim to be serious about concurrency when you have no concept of immutability in your language?
Share memory by communicating, don't communicate by sharing memory. Yes, it's completely different from what most people are used to, but it's a valid paradigm. I'd go look up "communicating sequential processes" and do some reading if I were you.
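A minimal sketch of that paradigm (`squareAll` is a made-up example): values move between goroutines over channels, so only one goroutine touches a value at a time and no mutex is needed.

```go
package main

import "fmt"

// squareAll wires up the CSP style: a producer goroutine transfers
// ownership of each value into jobs, a single worker receives it,
// computes, and sends the result on; nothing is shared concurrently.
func squareAll(ns []int) []int {
	jobs := make(chan int)
	results := make(chan int)
	go func() { // worker: owns each job after receiving it
		for n := range jobs {
			results <- n * n
		}
		close(results)
	}()
	go func() { // producer: hands each value over the channel
		for _, n := range ns {
			jobs <- n
		}
		close(jobs)
	}()
	var out []int
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // [1 4 9]
}
```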
I suspect the parent was referring to the fact that Go just has this as a convention; there's no static guarantee that the programmer gets it right, and it's undefined behaviour if they don't (fortunately the race detector exists). Interestingly, Rust does provide guarantees about this sort of concurrency pattern, allowing one to get stronger control over sharing while using Go/CSP-like channels.
On the other hand, it's fair that the convention/tooling is usually good enough; this is a stronger argument and probably a better one to be making than a dubious one about being irrelevant.
LOL this is the classic sort of condescension from the Go community that makes the programming languages community rage. There is a lot of irony here because Go is basically founded on an anti-intellectual ignoring of all previous research.
He is just writing a more direct, manual version of cgo in assembly that bypasses a lot of what cgo does, to be much faster.
> Before anyone tries to compare this to cgo
The only meaningful message in this blog is that it's possible to write a faster cgo, that's it. Comparing it to cgo is the only useful possible outcome, but...
> But to be clear, rustgo is not a real thing that you should use in production. For example, I suspect I should be saving g before the jump, the stack size is completely arbitrary, and shrinking the trampoline frame like that will probably confuse the hell out of debuggers. Also, a panic in Rust might get weird.
So when you actually fix all those things, you might be back where cgo was at the beginning.
This guy comes across as a classic "but i wanna be cool" hacker who discovers that when you bypass all the normal protections in a library and make some kind of direct custom call, things can be faster.
I guess so what?
This is for a few reasons:
Rust has support for 128-bit integers (i.e. u128), which allows for faster arithmetic when operating on what are effectively multi-limb bignums in an elliptic curve library.
LLVM is generally more sophisticated about optimizations. It wasn't until recently that Go had an SSA-based compiler, so its optimizer isn't nearly as sophisticated as LLVM's, which has been under development for many years now.
There is some ongoing work on an LLVM-based compiler for Go, and LLVM now has officially supported Go bindings.
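For the curious, Go's substitute for u128 in this kind of bignum code is math/bits, which exposes the 64x64 -> 128-bit multiply and add-with-carry primitives directly (a sketch; `mulAcc` is a made-up helper, roughly what Rust expresses in one line with u128):

```go
package main

import (
	"fmt"
	"math/bits"
)

// mulAcc computes (hi, lo) = a*b + c, the standard inner step of a
// multi-limb bignum multiply: a full 128-bit product via bits.Mul64,
// then carry propagation via bits.Add64.
func mulAcc(a, b, c uint64) (hi, lo uint64) {
	var carry uint64
	hi, lo = bits.Mul64(a, b)
	lo, carry = bits.Add64(lo, c, 0)
	hi += carry
	return hi, lo
}

func main() {
	hi, lo := mulAcc(1<<32, 1<<32, 5) // 2^64 + 5
	fmt.Println(hi, lo)               // 1 5
}
```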
Rust vs Go
Rust vs C
Building is faster, though, so that's nice.
You can also eliminate some slice bounds-checking by "asserting" the size of the slice outside the hot path; see http://www.tapirgames.com/blog/go-1.7-bce
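The trick from that post looks like this (a sketch; `sum8` is a made-up function):

```go
package main

import "fmt"

// sum8 demonstrates the bounds-check-elimination hint: the `_ = b[7]`
// access proves to the compiler that b has at least 8 elements, so the
// eight indexed loads below compile without per-access bounds checks.
func sum8(b []byte) int {
	_ = b[7] // hint: panics here if len(b) < 8
	return int(b[0]) + int(b[1]) + int(b[2]) + int(b[3]) +
		int(b[4]) + int(b[5]) + int(b[6]) + int(b[7])
}

func main() {
	fmt.Println(sum8([]byte{1, 2, 3, 4, 5, 6, 7, 8})) // 36
}
```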
Go also has a great profiler for discovering exactly which functions/statements are causing slowdown or allocating excessive memory.
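A minimal sketch of the runtime/pprof flavour (the file name and workload here are made up); the resulting profile is explored with `go tool pprof`, and net/http/pprof offers the same data from a running server:

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// busyWork is a stand-in hot function for the profiler to catch.
func busyWork() int {
	total := 0
	for i := 0; i < 1_000_000; i++ {
		total += i % 7
	}
	return total
}

func main() {
	f, err := os.Create("cpu.prof")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	// Record a CPU profile around the workload; inspect it afterwards
	// with: go tool pprof cpu.prof
	if err := pprof.StartCPUProfile(f); err != nil {
		panic(err)
	}
	defer pprof.StopCPUProfile()
	fmt.Println(busyWork())
}
```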
At the end of the day, though, I don't think most Go programs can be optimized to run as fast as their Rust equivalents (without dropping into asm, at least). There's just too much overhead that you aren't allowed to disable.
To be clear, I meant programs that don't rely on things like concurrency or networking. For example, I think editing binary blobs (like images) would be just as fast as Rust, if I remember some of the research I did correctly. I don't have any sources on that, and it was a while back, so it's kinda irrelevant, I guess.