
Go vs. Crystal Performance - open-source-ux
https://ptimofeev.com/go-vs-crystal-perfomance/
======
systemvoltage
I am not an expert at this but I want to inquire:

\- Every time there is any kind of benchmark between 2 or more languages, there
is always a caveat - "Not a fair comparison" or the algorithm isn't right.

What, then, is a good way to compare languages? After all, they are not
apples-oranges comparison. They are a tool to get things done. It is like I am
comparing 2 different brands of hammers. Sure the grip is different, the shape
of the head is different, but if we are testing a particular aspect of nailing
- say nails/sec, then it is worth investigating which one is better. One
hammer takes a lot of expertise to use it and yet another one is easy to use.
So, then, write those pros and cons down, it doesn't make the comparison a
worthless activity which is the cliche response to benchmarks - developer time
matters and not the language.

What is a good way to compare 2 languages, say Python vs. C? Are there
standard implementations of algorithms (e.g. Mandelbrot) or something like
that we can use and definitively compare the speed? Shouldn't we have
standards around these benchmarks? Some kind of an ISO implementation of an
algorithm reviewed by experts that can be used for benchmarking?

For the majority of programmers, the speed doesn't quite matter. But that
discussion is off the table and orthogonal. Sure for most things, I personally
pick up Python and it is the fastest way to develop for me. But there are many
reasons why we should compare languages and these concerns shouldn't stop
us from evaluating objectively.

~~~
espadrine
> _What, then, is a good way to compare languages?_

Digging into details and yielding very specific conclusions.

Let’s take the example benchmarks.

The first one, Fibonacci, can only be properly assessed by inspecting the
assembly. It is an analysis of _compiler output_ , not of the language, not
even of compiler optimizations.

It therefore only measures the CPU speed of function calls, and one compiler
outputs a different set of assembly instructions:

• Go (go tool objdump fib) is all like “CMPQ JBE(label(morestack)) …
LEAL(-1,cx) MOVL(cx) CALL(fib) …×3 ADDL(-2,cx) MOVL(cx) CALL(fib) MOVL ADDL
…×3 RET label(morestack) CALL(morestack) JMP”.

• Crystal (crystal build --emit=asm fib.cr) is all like “subl(1) …×6
callq(fib) …×1 subl(2) …×7 callq(fib) movl addl …×6 addq retq” for the
recursive condition.

So the conclusion is that Go outputs less instructions, but has to maintain
its segmented stacks (useful for goroutines, its lightweight threading system)
on every call, while Crystal uses the standard assembly calling system with
little overhead beyond light dynamic typechecks.

So there is a language-motivated difference: calls in Go are a tiny bit more
expensive, but built-in lightweight threads consume little memory because
their stack can increase dynamically according to use[0].

The second benchmark, HTTP servers, has everything to do with the
implementation of the default library, and nothing to do with either the
language or the compiler. And I presume Crystal is backed by an optimized C
library for TCP[1] while Go is probably a full reimplementation of TCP and
HTTP.

[0]: [https://blog.cloudflare.com/how-stacks-are-handled-in-
go/](https://blog.cloudflare.com/how-stacks-are-handled-in-go/)

[1]: [https://github.com/crystal-
lang/crystal/blob/master/src/sock...](https://github.com/crystal-
lang/crystal/blob/master/src/socket/tcp_socket.cr)
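
For reference, a minimal Go sketch of the kind of doubly recursive function being discussed. The exact source used in the article is not reproduced here, so the name `fib`, the `uint32` type, and the input value are assumptions. Building it and running the `go tool objdump` command mentioned above should show the stack check at the top of the function.

```go
package main

import "fmt"

// fib is a naive doubly recursive Fibonacci. Nearly all of its running time
// goes into function calls, so timing it mostly measures call overhead,
// including the stack check in Go's function prologue.
func fib(n uint32) uint32 {
	if n < 2 {
		return n
	}
	return fib(n-1) + fib(n-2)
}

func main() {
	// A small input keeps this sketch quick; larger inputs take
	// exponentially longer.
	fmt.Println(fib(30))
}
```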

~~~
abainbridge
> Go outputs less instructions, but has to maintain its segmented stacks

I think they dropped segmented stacks in 2014:
[https://en.wikipedia.org/wiki/Go_(programming_language)#Vers...](https://en.wikipedia.org/wiki/Go_\(programming_language\)#Version_history)

The stack can still grow, but contiguously. I'm not sure if there are any
instructions required to monitor the stack size and grow it if needed.

~~~
amscanne
> CMPQ JBE(label(morestack)) …

^^ It's still the preamble for most functions :)

------
totalperspectiv
Completely anecdotal:

\- In my small benchmarks, Crystal is very fast

\- In benchmarks not my own, Crystal is very fast

\- Crystal has a very nice batteries included stdlib.

Benchmarks that aren't mine: [http://lh3.github.io/2020/05/17/fast-high-level-
programming-...](http://lh3.github.io/2020/05/17/fast-high-level-programming-
languages)

The only other language that is high level in its ballpark is D.

~~~
mratsim
Benchmarking strings is often benchmarking memory allocation speed or the GC.

If you manipulate strings day in day out in your workflow, the first thing you
do is remove all those allocations and for what is left you use a memory pool.

The speedup could easily reach 20x in any language. It's actually one of
the areas where Python or JS can beat statically typed languages, they focused
a lot on optimizing strings while statically typed languages often leave
that as an exercise to the reader.
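
A rough sketch of the "memory pool" idea in Go, assuming the standard `sync.Pool` and `bytes.Buffer` types are an acceptable stand-in for whatever pool a real workload would use. The helper name `joinPooled` is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers so hot paths do not allocate a fresh
// backing array for every string they build.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// joinPooled (hypothetical helper) concatenates parts using a pooled buffer.
// Once the pool is warm, the main remaining allocation is the copy made by
// buf.String().
func joinPooled(parts []string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)
	for _, p := range parts {
		buf.WriteString(p)
	}
	return buf.String()
}

func main() {
	fmt.Println(joinPooled([]string{"go", " vs ", "crystal"}))
}
```

How much this helps depends on the workload, so it is worth measuring before and after.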

~~~
throwaway894345
> It's actually one of the areas where Python or JS can beat statically typed
> languages, they focused a lot on optimizing strings while statically typed
> languages often leave that as an exercise to the reader.

I think this has more to do with writing the string handling functions in
heavily-optimized C rather than in Python/JS than it does with memory tricks,
but I'll happily accept correction.

~~~
mratsim
Yes low-level C + retaining memory around in the GC. The second part avoids
many malloc/free that may be hidden in destructors in low-level languages.

------
cfors
Sounds like another win for languages with an LLVM backend. Definitely excited
to see this language grow, especially as it has generics already.

Does anybody use Crystal professionally here? Any thoughts so far if so?

~~~
mauricio
We run it in production. Our apps are mostly Ruby but we have been rewriting
services in Crystal. Originally we were attracted by the speed and type
checks, but one surprising benefit has also been the reduced memory
consumption. It's difficult to compare directly, but in some cases it consumes
10x less memory and performs around 20-35x better. Even I/O bound services are
sped up since we can take advantage of Crystal's concurrent fibers (rumored to
come in Ruby 3.0).

The main downside has been the somewhat frequent deprecations of methods and
changes to the standard lib. But it's mostly due to the preparation to launch
1.0.

~~~
Exuma
Is there any web framework for it, like rails? That's really the main reason I
use ruby is because as a single developer it's amazingly fast to set up new
apps (admin panels, reporting panels, sales funnels, etc). The out-of-the-box
functionality of rails makes this super painless. I'd LOVE to use something
faster but I don't really want to have to custom write a lot of things (CSRF,
sessions, cookies, blah blah).

Also, how do you handle jobs with Crystal? Sidekiq being the common one for
ruby.

~~~
paulcsmith
Hello! I'm the creator of the Lucky web framework
[https://luckyframework.org](https://luckyframework.org). I've been building
it for about 3 years and we've got a number of people using it in production.

It still lacks some features found in bigger frameworks, but is nearing 1.0
and gaining many new contributors that are helping us fill in the gaps.

Feel free to hop on our chatroom to ask questions about it. We try to be super
friendly and love answering questions and getting feedback
[https://gitter.im/luckyframework/Lobby](https://gitter.im/luckyframework/Lobby)

~~~
Exuma
Awesome, thanks!! What would you say the top 2-3 missing things are currently?

~~~
paulcsmith
Right now I'd say we need to make it easier to work with nested params so you
can easily save Parent + (n) children. Easier handling of uploaded files is
another big one. It's being actively worked on right now. There are a few more
escape hatches that are needed when you need to break out of the framework.
But overall it is fairly full featured.

You can check out our roadmap to 1.0 here:
[https://docs.google.com/document/d/1EYzx37Kq5h7iLH9SQTFyXNwb...](https://docs.google.com/document/d/1EYzx37Kq5h7iLH9SQTFyXNwby2xVvzuRUlMuxcoktx8/edit)

------
awb
This is a pretty basic benchmark, but there are more robust ones for Crystal
and others here:
[https://www.techempower.com/benchmarks/](https://www.techempower.com/benchmarks/)

Under Filters you can select the languages / frameworks that are of interest
to you. For example Go seems to have several implementations that beat
Crystal:
[https://www.techempower.com/benchmarks/#section=data-r19&hw=...](https://www.techempower.com/benchmarks/#section=data-r19&hw=ph&test=fortune&l=zdjvnj-1r)

Having programmed in both though, I far prefer the language / idea of Crystal
for innovation. But it is hard to beat the performance and explicit nature of
Go for production applications.

~~~
throwaway894345
The non-recursive Go fibonacci implementation computes in ~160ms and uses only
1.5mb on my MacBookPro (and I can get the binary size down to 850K by
stripping symbols and using the println builtin instead of fmt.Println). This
microbenchmark seems to tell us more about function call overhead than it does
about typical application performance.

Further, while I don't doubt that Crystal binary sizes are smaller in general,
I don't think we can get much information about binary sizes from these micro-
toy-sized programs. For example, the Go version shaved off a full mb by using
the `println` builtin rather than `fmt.Println`, probably because the latter
pulls in more of the runtime than the former. These toy benchmarks are very
sensitive to these kinds of things, and the size of the runtime is going to be
a trivial portion of the size of any real application anyway.
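
A sketch of what a non-recursive version might look like. This is an assumption about its shape, not the exact program measured above; the name `fibIter` and the `uint64` type are illustrative.

```go
package main

// fibIter computes the n-th Fibonacci number with a loop, so there is no
// function call per step.
func fibIter(n uint32) uint64 {
	var a, b uint64 = 0, 1
	for i := uint32(0); i < n; i++ {
		a, b = b, a+b
	}
	return a
}

func main() {
	// println is a Go builtin that writes to standard error; using it means
	// the program does not import fmt at all.
	println(fibIter(47))
}
```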

------
zelly
Benchmarks don't matter. It's 2020. We use Stock Price Driven Development now.

The main reason to use Swift is Apple created it.

The main reason to use Go is it's the first two letters of Google.

The main reason to use Python is Google hired GvR in the 2000s and created
momentum.

The main reason to use Kotlin is Google announced at I/O 2017 its support in
Android Studio.

Get with the times gramps.

~~~
7532yahoogmail
Wow. No.

Engineering school was/is supposed to help us (you) move beyond cynical
judgements to thinking.

~~~
crimsonalucard5
I could care less if a statement was positive or negative what is important is
if the statement has a realistic possibility of being true.

If anything Engineering school should have taught you how to logically observe
situations rather than viewing things through an optimistic or cynical lens.

------
dcu
the HTTP benchmark is not fair since the crystal implementation is setting the
content type explicitly while the Go implementation is auto detecting it.

~~~
Thaxll
Indeed the benchmark was dismissed on Reddit a couple of days ago:
[https://www.reddit.com/r/programming/comments/h0knmi/go_vs_c...](https://www.reddit.com/r/programming/comments/h0knmi/go_vs_crystal_performance/)

~~~
dgb23
The repost[0] in r/golang shows similar critiques. This perf comparison is
naive on many levels.

\- binary size (static vs. dynamic linking)

\- recursion is not idiomatic in go, it is assumed that you write imperative
for loops.

\- mathematical functions like the Fibonacci sequence are a rather atypical
computation use-case for a Go program (I don't know about Crystal). Tree/graph
traversal/mutations would be a more fitting test. Or generally something that
is composed of dynamically growing and shrinking slices and maps.

\- http test apparently cannot be reproduced, some get better results for Go,
some for Crystal.

\- http tests w/o involving some parsing/marshaling/serialization or something
along those lines aren't that useful. You usually want to either read or send
some JSON string or similar.

[0]
[https://www.reddit.com/r/golang/comments/h0kogq/go_vs_crysta...](https://www.reddit.com/r/golang/comments/h0kogq/go_vs_crystal_performance/)

~~~
yxhuvud
Recursion is really not the idiomatic way to solve most things in Crystal
either, not that it makes the test relevant.

------
euph0ria
Why doesn't Crystal use more than 100% CPU like Go does if the server has 8
cores? Why didn't both languages max out the cores? Is it because the params
to wrk were within what one core could handle in Crystal?

~~~
WatchDog
It seems crystal is single threaded by default. Although you can configure it
to use more threads.

[https://crystal-lang.org/2019/09/06/parallelism-in-
crystal.h...](https://crystal-lang.org/2019/09/06/parallelism-in-crystal.html)

~~~
euph0ria
Yeah, but in a benchmark it seems strange that he doesn't restrict both to use
just one CPU or unlimited.. kind of compares apples/oranges.

------
karmakaze
I've taken Crystal out for a spin with a number of popular and less popular
frameworks. Even for my small test application, edit/compile/run times were
slow. Only the thinnest frameworks like Kemal seem tolerable to me.

I really do hope that the compiler gets faster and the larger frameworks
figure out how to compile faster. These are the benchmarks that matter to me
before I'd make a recommendation.

~~~
Trasmatta
My understanding is that the compiler performance issues in Crystal are sort
of inherent to the language design and making it much faster will be very
hard. I believe it's related to Crystal's type inference -- it has to traverse
every code branch to ensure type safety. I believe explicitly typing
everything is supposed to help, but I'm not sure by how much (especially if
you have dependencies that aren't doing that).

~~~
entha_saava
maybe LLVM as well..

------
danielsokil
Here is a Gist where I compare Crystal, Go, and OCaml, including compile time.
[https://gist.github.com/s0kil/155b78580d1b68768a6c601a66f8e2...](https://gist.github.com/s0kil/155b78580d1b68768a6c601a66f8e29b)

~~~
dleslie
Why did you combine compile time and execution time?

~~~
danielsokil
The idea is to have a balance of compilation + runtime performance.

------
piinbinary
I wonder how the GC pause times and throughput compare. My understanding is
that Go sacrifices some compute performance for better GC pause times.
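
For the Go half of that comparison, the runtime exposes its own pause numbers. A small sketch, assuming `runtime.ReadMemStats` is a good enough source for a first look; the function name `gcPauses` and the allocation loop are illustrative only.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// gcPauses forces one collection, then reports how many GC cycles have
// completed and the cumulative stop-the-world pause time so far.
func gcPauses() (cycles uint32, total time.Duration) {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC, time.Duration(m.PauseTotalNs)
}

// sink keeps allocations reachable for a while so they are not optimized away.
var sink [][]byte

func main() {
	// Allocate some short-lived garbage so the collector has work to do.
	for i := 0; i < 100000; i++ {
		sink = append(sink, make([]byte, 1024))
		if len(sink) > 1000 {
			sink = nil
		}
	}
	cycles, total := gcPauses()
	fmt.Printf("GC cycles: %d, total pause: %v\n", cycles, total)
}
```

Throughput would need a separate measurement, since these counters only cover pauses.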

~~~
dgb23
> better GC pause times

What does that mean exactly / in this case?

My assumption is that "better" means more predictable. Or does it mean
straight up fewer/shorter?

~~~
recursivecaveat
Go sacrifices basically every other GC metric to minimize average pause times.
Sort of an 'if you can't succeed, redefine success' mentality if you will.

~~~
sharpy
That's not necessarily a bad thing for writing web services. I used to spend a
fair bit of time tuning JVM to achieve acceptable tail latencies for Java
services, not to mention once in a while, they would need to be tuned again...
Not so with Go.

~~~
apta
Java now has two low latency collectors: ZGC and Shenandoah.

------
seabass
It seems strange that Crystal would not utilize more than 100% of the CPU in
the HTTP benchmark given that it ran on an 8-core machine and with fibers
should be able to split the load across multiple cores. What explanation is
there for that? Also, it's impressive that while using one third of the CPU
Crystal outperformed the Go HTTP server on throughput.

~~~
yxhuvud
Because currently Crystal is singlethreaded. You only get multithreading if
you compile with -Dpreview_mt . I suppose it will be on by default at some
later point.

------
jjtheblunt
Might binary size comparisons be misleading, unless the Crystal binary is
statically linked, as presumably the Go binary is?

------
samuell
Does anybody know the status of light-weight threads, channels and automatic
multiplexing ("mapping") of light-weight threads on operating system threads
in Crystal?

(This has been promised to come in Crystal (as opposed to pretty much any
other language than Go), but has always seemed to be "yet to come").

~~~
mauricio
It's available [https://crystal-
lang.org/reference/guides/concurrency.html](https://crystal-
lang.org/reference/guides/concurrency.html). Fibers (like goroutines) are
scheduled by Crystal and map to system threads.

The work on parallelism is available as a compile-time flag, but not yet GA:
[https://crystal-lang.org/2019/09/06/parallelism-in-
crystal.h...](https://crystal-lang.org/2019/09/06/parallelism-in-crystal.html)

~~~
kodablah
One wonders, once it is GA, if there is value in a Go transpiler. There is a
lot of useful software in Go land that would be immediately useful for Crystal
devs, and Go itself is not too complicated to map to another language with the
same features (a few of the runtime features will have to be emulated).

~~~
unixhero
How would this be achieved? Could you expand a little on this idea?

~~~
kodablah
Literally convert Go code to Crystal code. I have not looked into Crystal
enough to confirm its features are a superset of Go's. For example, I saw
that Kotlin, with the advent of its coroutines, had most of Go's features so I
wrote a transpiler[0] that got pretty far (can run all of this[1]). I
abandoned the project because I have abandoned the JVM.

0 - [https://github.com/cretz/go2k](https://github.com/cretz/go2k)

1 - [https://github.com/cretz/go2k/tree/master/compiler/src/test/...](https://github.com/cretz/go2k/tree/master/compiler/src/test/go)

------
dzonga
I always find these benchmarks superficial. Unless you're doing some serious
number crunching, it doesn't make sense. People need to start benchmarking
languages on speed of development, developer ergonomics, error reporting, ease
of deployment, etc.

------
mikece
Tangent: would technical writers please stop using the phrasing “is X times
smaller” when “is one Xth the size” is more accurate? I had to read the
reference to “Crystal’s binary size is 5 times smaller than Go’s” twice to
realize it was 0.2x and not 5x.

~~~
jiofih
I don’t get it. “5x smaller” and “0.2x the size” are the same.

Can you explain, then, what you expect the meaning of “two times
smaller” to be?

~~~
mikece
Clarity of language for one thing, but also that "times" is indicative of
multiplication so "five times" anything has to be bigger[1]. One fifth is far
clearer as there's no way to interpret that other than being a _part_ of the
size of the reference, ergo, smaller. Is it impossible to understand? No, but
it's poor style for technical writing.

[1] I'm ignoring the case of decimal math since the reference in this case is
always the whole integer 1.

~~~
jiofih
I don’t know under which rock you’ve been hiding but “x times faster/smaller”
has been in common use for _decades_ , including technical writing. Everyone
understands it as 1/X, there is absolutely no confusion.

------
skyzyx
Without having done _any_ of my own research, I’m initially skeptical of
Crystal’s binary size. I initially saw something similar with Swift, but
that’s just because the runtime is external to the binary.

------
pier25
Now compare Crystal with Nim.

~~~
nobleach
I want to say this: [https://framework.embarklabs.io/news/2019/11/18/nim-vs-
cryst...](https://framework.embarklabs.io/news/2019/11/18/nim-vs-crystal-
part-1-performance-interoperability/index.html) was published on HN at one
point. Regardless, it's a decent rundown.

~~~
pier25
Oh wow I expected Nim to be at least as fast as Crystal.

~~~
mratsim
Nim JSON parser is not optimized, it was mostly written for maintenance (for
example it allocates a Table per node in the json file)

I.e. the difference in speed here is a difference in elbow grease.

I have yet to come into an optimization problem where you cannot reach the
speed you can achieve in C in Nim.

------
gigatexal
Interesting. But more important is how productive can I be and how expansive
is the ecosystem? What’s the developer experience like? Is the community
welcoming?

------
decafbad
I suggest the word femtobenchmark for this kind of work.

------
strictfp
Consider using wrk2 in order to avoid coordinated omission.

------
sbmthakur
Has anyone compared Crystal and Rust on similar lines?

~~~
tracker1
Did a rust build with the following...

    
    
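        fn fibonacci(n: u32) -> u32 {
          match n {
            // Base cases fib(0) = 0 and fib(1) = 1 keep fib(47) = 2971215073
            // within u32; starting from 1, 1 would overflow at n = 47.
            0 => 0,
            1 => 1,
            _ => fibonacci(n - 1) + fibonacci(n - 2),
          }
        }
    
        fn main() {
          println!("{}", fibonacci(47));
        }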
    
    

size: 2.67kb time: 7.391s

I think rust wins :-D

\--- edit: not sure if I used valgrind/massif right, but these are the largest
values in the output file.

    
    
        mem_heap_B=1816
        mem_heap_extra_B=32
    

I think this means it stayed under 2k ram?

~~~
sk0g
Wouldn't processing time depend on your CPU, and unless you run the benchmarks
in the article, this comparison would be meaningless?

~~~
tracker1
Don't have a Go environment set up, but could do that. The executable size and
memory use were also lower.

~~~
sk0g
Well to be honest, Go binary sizes are always going to be big, to the benefit
of ease of development and deployment.

Memory usage, I don't know. The Fibonacci benchmark code was a bit... shit.

Do you want to send over your Rust code in a Gist or something, and I can
compile and compare?

~~~
tracker1
It's in the comment above...

------
poorman
I'm still waiting for someone to write an LLVM backend for Go.

~~~
hootbootscoot
gollvm? vs gccgo. that's why i said "an IR interface" when someone asked about
transpiling crystal to go and vice-versa.

------
ed25519FUUU
It’s always interesting that when people compare Go to other languages they
always use the file size of the binary, as if that’s something that anyone in
modern professional engineering considers.

Another good example would be to statically cross-compile a non-contrived
program into ARM64, 32-bit linux, and Darwin without needing google!

And if you think you have your binary statically linked, go test it in the
“scratch” docker image. You may be surprised at how difficult it really is.

~~~
pjmlp
Sure we do, I have spent the last week trying to upload 10 MB files in a quite
slow uplink, and yes in Europe, suburban area of a German town.

First reason why people uninstall apps on mobile devices is app size.

~~~
sk0g
Are mobile apps the target market for Go? CI/CD takes care of upload for the
dev side at least.

It would be handy to have a compile time flag indicating the desired file size
optimisation level still, with tradeoffs being ease of debugging, and
performance.

~~~
pjmlp
Given that gomobile exists, maybe.

CI/CD is something almost unknown outside the HN bubble.

Also, those were just two examples; here are another two: cost of production for
USB Armoury keys running Go bare metal, or download costs for WebAssembly
modules written in TinyGo.

~~~
sk0g
Are you sure re: CI/CD? I was looking around the job market recently, and
every role was interested in my experience in it, since they were using it
too. To be honest, that could be self-selecting, as my resume likely attracts
companies similar to ones I have worked at. I still do hear about
microservices, CI/CD a whole lot in job descriptions etc, but whether they are
already practicing it (well), is another question...

I was going to suggest TinyGo! I mainly work on a backend system, so I guess
we have different priorities and needs. The embedded work I have done has all
been C/ ASM, though it could be fun to revisit with Go.

~~~
sfkdjf9j3j
In terms of sheer numbers, I would guess that at least a plurality of websites
are running old versions of PHP and are released via FTP. But I don't think
that's really meaningful or interesting to worry about, since those services
aren't even considering the tooling decisions we're talking about.

