
One Program Written in Python, Go, and Rust - gilad
http://www.nicolas-hahn.com/python/go/rust/programming/2019/07/01/program-in-python-go-rust/
======
yalue
The "image" interface is one of my favorite parts of Go's standard library,
and I think it's one of the best showcases of Go's "interface" feature. Want
to save something as an image? Just implement three functions, two of which
are trivial boilerplate and the third of which is just "return the color at
this coordinate". You don't even need to manage the buffer of image data
yourself this way. For example, if you want an image full of noise, just make
your "get color" function return a random value every time it's called. I've
used this myself for things like simple fractals.

And, on top of all that, once you've written code satisfying the image
interface, the standard library even includes functions for saving your image
as one of several possible image formats. And, because the interface itself
is specified by the standard library, virtually all third-
party image encoding or decoding libraries for Go use it, too. So, every image
reader or writer I've seen for Go, even third-party ones, can be drop-in
replacements for one another.

Anyway, it's not Go's standard use case, but as someone who loves fractals and
fiddles with images all the time it's one of my favorite parts of the
language.

~~~
benhoyt
I like the simplicity of this (very Go-like). However, what are the
performance implications of only being able to get one pixel value at a time?
Wouldn't it be much less efficient than say "get this line of pixels" or "get
this rectangle of pixels"?

~~~
jerf
You'll get the overhead of a non-inlineable vtable-based method call for each
pixel. How badly that hurts you depends on the ratio of expense of that call
vs. how expensive the pixel is to generate. If you've already got all your
pixel values manifested in memory and you're just indexing into an array, it's
going to be fairly expensive. If you're generating noise with a random number
generator, it's going to be noticeable but not necessarily catastrophic (since
"generating a random number" and "making a method call" are somewhat
comparable in size, varying based on the number generator in question). If
you're generating a fractal the overhead will rapidly be lost in the noise.

But I'd also point out that the Go standard library does not necessarily claim
to be the "last word" for any given task; it's generally more an 80/20 sort of
thing. If you've got a case where that's an unacceptable performance loss, go
get or write a specialized library. There's nothing "un-Go-ic" about that.

~~~
diroussel
I would expect the vtable-based dispatch to be handled quite well with branch
prediction. And surely the cache misses from those nested loops would have a
much worse impact. Even if a few extra instructions have to run per pixel it's
going to be quicker than a fetch from main memory.

All the examples, in each language, could be rewritten as a cache oblivious
algorithm to optimise cache usage. This would speed them all up. See
[https://en.wikipedia.org/wiki/Cache-oblivious_algorithm](https://en.wikipedia.org/wiki/Cache-oblivious_algorithm)
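
The access-order effect is easy to sketch in Python, with the caveat that CPython lists hold pointers rather than packed pixel data, so the gap is far smaller than with Go or Rust arrays (a rough illustration, not a benchmark; the grid and sizes are made up):

```python
import time

# Build an 800x800 grid as a list of rows (row-major layout).
n = 800
grid = [[(x + y) % 256 for x in range(n)] for y in range(n)]

def sum_by_rows(g):
    # Walks each inner list in order: the cache-friendly order.
    return sum(v for row in g for v in row)

def sum_by_cols(g):
    # Hops to a different row on every access: the unfriendly order.
    return sum(g[y][x] for x in range(len(g[0])) for y in range(len(g)))

t0 = time.perf_counter()
by_rows = sum_by_rows(grid)
t1 = time.perf_counter()
by_cols = sum_by_cols(grid)
t2 = time.perf_counter()

assert by_rows == by_cols
print(f"by rows: {t1 - t0:.3f}s  by cols: {t2 - t1:.3f}s")
```

A cache-oblivious version would recurse on quadrants instead of committing to either loop order.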

~~~
jerf
In general, I tend to agree there's a lot of people who have kind of picked up
"vtables always bad and slow" and overestimate the actual overhead.

But I have actually benchmarked this before, and it is possible to have a
function body so small (like, for example, a single slice array index lookup
and return of a register-sized value like a machine word) that the function
call overhead can dominate even so.

(Languages like Erlang and Go that emphasize concurrency have a constant low-
level stream of posts on their forums from people who do an even more extreme
version, when they try to "parallelize" the task of adding a list of integers
together, and replace a highly-pipelineable int add operation that can
actually come out to less than one cycle per add with spawning a new execution
context, sending over the integers to add, adding them in the new context, and
then synchronizing on sending them back. Then they wonder why
Erlang/Go/threading in general sucks so much because this new program is
literally hundreds of times slower than the original.)

But it is true this amortizes away fairly quickly, because the overhead isn't
that large. Even the larger random number generators, like the Mersenne Twister,
will go a long way toward dominating the function call overhead. I don't
even begin to worry about function call overhead unless I can see I'm trying
to do several million per second, because generally, you _can't_ do several
million per second on a single core because the function bodies themselves are
too large and doing too much stuff, such that even if function call overhead
was 0 it would still be impossible in the particular program to do it.
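
The "tiny function body" case is easy to reproduce; a Python sketch (Python's call overhead is much larger than a Go vtable call, so treat this only as an illustration of the shape of the problem, not a Go benchmark):

```python
import timeit

data = list(range(100_000))

def get(i):
    # A body so small that the call itself is most of the work:
    # a single indexed lookup, as in the benchmark described above.
    return data[i]

# One C-level loop over the list, no Python-level call per element.
inline = timeit.timeit(lambda: sum(data), number=20)

# The same sum, but paying a function call per element.
called = timeit.timeit(
    lambda: sum(get(i) for i in range(len(data))), number=20)

print(f"inline: {inline:.3f}s  per-element calls: {called:.3f}s")
```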

------
deathanatos
> _Rust slaps you and demands that you clean up after yourself. This stung at
> first, since I’m spoiled and usually have my languages pick up after me,
> moreso even than moving from a dynamic to a statically typed language._

And yet, in the author's example, all memory handling in Rust was completely
automatic. And that includes, AFAICT, no Box'd pointers¹, no ref-counting, and
certainly no raw/unsafe pointers.

IME, this seems to be a common response among people coming from GC'd
languages; I think the expectation is that they're going to be doing C-style
memory management (manually alloc/free pairs), when the truth is that 99.9% of
allocations will just happen automatically and invisibly thanks to RAII.

In the end, I really think it's _resources_, of which memory is just one
type, that matter. Python doesn't do anything for resources, and you have to
know you're allocating something (e.g., files, locks, connections, connection
leases, etc.) that will require a manual .close() or with statement s.t. it
gets dealloc/cleaned/released; RAII will handle this just like memory, and
automatically handle it, and because of that, I find myself doing less
_resource_ management in Rust than I do in Python.

¹There are conditions in which I would argue that some uses of Box are
"automatic", depending on the reason it's pulled in. E.g., I've used it to
reduce the size of an enum in the common case, but still allow it to store a
heavy structure in the rare case. The handling of the Box itself is
essentially still automatic.

~~~
piinbinary
> Python doesn't do anything for resources

The "with" blocks can help with certain kinds of resources, for example:

    
    
        with open('foo.txt', 'r') as inf:
            data = inf.readlines()

~~~
deathanatos
That statement was too vague, you're right. I meant that it doesn't
automatically do anything. I'm aware of with (it's mentioned in the post
you're responding to ;) ).

The thing about `with` (and similar constructs in other languages) is that you
must _remember_ to use them at every single site of use. If you forget, while
the resource usually gets cleaned up when the language's GC decides to
finalize the object, it's non-deterministic and might be too late. The
consequences of forgetting are significantly milder than in C, as the GC will
_usually_ get to it in time, but that's not something I'd like to _depend_ on.
But the amount of "effort" and _possibilities_ for mistakes is, generally, the
same as C (in number, not severity): each resource allocation and each
resource free requires the programmer's attention — excepting the specific,
and I grant, quite common, case of memory, which one can say the GC will
handle for you. RAII makes it s.t. you don't have to remember except in the
(hopefully rare) case of implementing a new resource, and you can treat
variables holding resources at sites-of-use like any other variable.
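
A minimal sketch of the difference, using a toy resource class (hypothetical, not from the article):

```python
class Resource:
    """A toy resource that records whether it was released."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()

# If you remember the `with`, cleanup is deterministic at block exit...
with Resource() as r:
    assert not r.closed
assert r.closed

# ...but nothing forces you to: this one waits on the GC's whims.
leaked = Resource()
assert not leaked.closed
```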

For your specific example, if you can do w/o the list of lines (which I find
is often possible) Python's actually got a somewhat nicer construct w/ Path:

    
    
      import pathlib
      pathlib.Path('foo.txt').read_text()
    

This only reads the data into a single string, whereas yours is a list of
strings. Slightly different, but usually acceptable. But it moves the
required `with` into Path's helper method, so then it's harder for you to
forget it. But this is a specific case (reading all data in a file), and
doesn't generalize.

~~~
piinbinary
Good point. There is a big gulf between "can do for you" and "automatically
does for you."

I really appreciate the fact that Rust does it automatically (and that it's
not easy to turn that automatic management off).

------
chriswitts
You can use Hyperfine [1] instead of time for a nicer CLI benchmark.

I'd also be curious to know if pillow-simd [2] gets the Python performance
closer to Go/Rust, and if using Rayon [3] and changing your .iter()'s in your
Rust code to .par_iter()'s will yield an improvement there.

[1]
[https://github.com/sharkdp/hyperfine](https://github.com/sharkdp/hyperfine)

[2] [https://github.com/uploadcare/pillow-simd](https://github.com/uploadcare/pillow-simd)

[3] [https://github.com/rayon-rs/rayon](https://github.com/rayon-rs/rayon)

------
kwhitefoot
I don't know much Rust (essentially none) but the sample doesn't seem to me to
warrant the 'murkier' comment. It took me a few moments to realize that the
loop variable was not produced from

    
    
        image1 
    

but instead from

    
    
        image1.raw_pixels().iter().zip(image2.raw_pixels().iter())
    

I would have created a local for that before the loop for clarity. And I would
not have declared the ratio variable, that is surely unidiomatic as the last
expression provides the result.

But apart from those stylistic quibbles it seems a model of clarity, the
opposite of murky, whereas the Python version simply glues together opaque
library calls.

~~~
kevincox
Alternatively you can avoid the loop for the summing and just do.

    
    
        let diffsum: u64 = image1.raw_pixels().iter()
          .zip(image2.raw_pixels().iter())
          .map(|(&p1, &p2)| u64::from(abs_diff(p1, p2)))
          .sum();
    

I find this a lot easier to read because there is no control flow in this
snippet. I can see that it is just summing the diffs. I know some people
prefer iterators and some prefer loops; however, I think mixing them as was
done here often leads to less readable code.
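
For comparison, the Python spelling of the same no-control-flow shape (with stand-in lists here rather than real raw_pixels() buffers):

```python
# Stand-ins for two images' flat per-channel byte sequences.
pixels1 = [10, 200, 30, 40]
pixels2 = [12, 190, 30, 45]

# Sum of absolute per-channel differences; no explicit loop body,
# so there is no control flow to read past.
diffsum = sum(abs(p1 - p2) for p1, p2 in zip(pixels1, pixels2))
print(diffsum)  # 17
```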

~~~
carlmr
I prefer iterators in concise places like this, but loops when there's a bit
more logic inside.

~~~
kevincox
Definitely. If your logic flow naturally maps to common iterator methods I
think it helps readability. If you need complex control flow, it is best to
use a loop rather than try to wrangle iterator methods to achieve it.

------
majewsky
> No Optional Arguments: Go only has variadic functions which are similar to
> Python’s keyword arguments, but less useful, since the arguments need to be
> of the same type.

What I like to do is to make the last arg take a struct such that you
indirectly have named (and optional) arguments:

    
    
      type DownloadOpts struct {
        UserAgent string
        TimeoutSeconds int
        //...
      }
    
      func Download(url string, opts DownloadOpts) (io.Reader, error) {
        ...
      }
    
      //use like:
      contents, err := Download("https://news.ycombinator.com", DownloadOpts {
        UserAgent: "example-snippet/1.0",
      })
    

> Never needing to pause for garbage collection could also be a factor [in
> Rust's greater speed compared to Go].

Would be nice if the author had checked

    
    
      var ms runtime.MemStats
      runtime.ReadMemStats(&ms)
      print(ms.NumGC)
    

to see if there was actually any garbage collection performed.

~~~
bkq
>What I like to do is to make the last arg take a struct such that you
indirectly have named (and optional) arguments:

Rob Pike and Dave Cheney both posted about this [1][2]. They concluded that
self-referential functions are a better way of handling options to a
function. This gives the benefit of allowing the options to be easily
extensible by yourself, and any users that would be interacting with your API.

[1] - [https://commandcenter.blogspot.com/2014/01/self-referential-functions-and-design.html](https://commandcenter.blogspot.com/2014/01/self-referential-functions-and-design.html)

[2] - [https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis](https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis)

~~~
nemo1618
Functional options were exciting when they were first discovered, and they're
used in a few popular APIs (e.g. gRPC), but they haven't become as ubiquitous
as you'd expect if they were truly superior. My guess is that a struct is the
more direct and obvious approach for most people. Functional options also make
it awkward to save and reuse a set of options later.

------
vorachose
Nice writeup overall. Tiny nitpick, however: in your Rust generation of the
diff image, you go by column instead of by row, making it quite a lot slower
than it should be (probably because of cache locality, I ain't no expert).
Switching from

    
    
        for x in 0..w {
            for y in 0..h {
                let mut rgba = [0; 4];
                [...]
    

to

    
    
        let mut rgba = [0; 4];
        for y in 0..h {
            for x in 0..w {
                [...]
    

cuts down runtime by between 35 and 50% on my side.

EDIT: moving the call to get_pixel outside the inner loop takes off another
~10%, bringing it to sub-0.145s from ~0.290s.

    
    
        let mut rgba = [0; 4];
        for y in 0..h {
            for x in 0..w {
                let pix1 = image1.get_pixel(x, y);
                let pix2 = image2.get_pixel(x, y);
                for c in 0..4 {
                    rgba[c] = abs_diff(
                        pix1.data[c],
                        pix2.data[c],
                    );
                }
                let new_pix = image::Pixel::from_slice(&rgba);
                diff.put_pixel(x, y, *new_pix);
            }
        }

~~~
diroussel
The optimal cache usage would be to not use straight nested for loops, but to
segment the whole image recursively. So take the whole image, then take each
quarter, then process each quarter again.

At least that is my understanding of how to implement a cache oblivious
algorithm. See [https://en.wikipedia.org/wiki/Cache-oblivious_algorithm](https://en.wikipedia.org/wiki/Cache-oblivious_algorithm)

Used that approach once when computing a large cross product and it gave a
good speed up.

------
valzam
To anyone coming from Python and wanting to try out Rust: If you haven't
worked with C before I highly recommend spending a weekend hacking something
in C before starting Rust. It really made me appreciate the borrow checker and
super restrictive type system.

~~~
capdeck
Nope. That's a 7-mile "shortcut" for a 100m bridge. It really takes time to
feel the pain of all the foot guns in C. Just read Jim Blandy's awesome "Why
Rust" summary instead: [https://www.oreilly.com/programming/free/files/why-rust.pdf](https://www.oreilly.com/programming/free/files/why-rust.pdf)

~~~
valzam
Thx for the link, it looks useful. However, I disagree. The experience of
trying to make something work when the compiler gives grief for no obvious
reason or doesn't give any errors but you segfault for the 10th time cannot be
substituted by reading a book.

------
kerkeslager
> I’ve used statically typed languages in the past, but my programming for the
> past few years has mostly been in Python. The experience was somewhat
> annoying at first, it felt as though it was simply slowing me down and
> forcing me to be excessively explicit whereas Python would just let me do
> what I wanted, even if I got it wrong occasionally. Somewhat like giving
> instructions to someone who always stops you to ask you to clarify what you
> mean, versus someone who always nods along and seems to understand you,
> though you’re not always sure they’re absorbing everything. It will decrease
> the amount of type-related bugs for free, but I’ve found that I still need
> to spend nearly the same amount of time writing tests.

This is one of the biggest confusion points when comparing Python and Go.

Go's static type system is weak, the weakest of any widely-used statically
typed language except C/C++. Its lack of generics means you have to cast in
and out of interface{}, so you're constantly sidestepping the supposed
benefits of the static type system. Type errors will, of course, be caught at
runtime, but that's completely identical to Python. You can, of course, use
go generate to avoid a lot of the interface{} casting, but then you get all
the wonderful problems of macros.

Python doesn't enforce types at compile time (because there isn't a compile
time) but it does enforce types--it _does not_ "nod and seem to understand
you", if, for example you decide to do `42 + "Hello, world"`. It very much
tells you that this is not valid.
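
A quick check of that claim in CPython:

```python
# Python refuses mixed-type arithmetic rather than nodding along.
try:
    42 + "Hello, world"
except TypeError as e:
    print(e)  # unsupported operand type(s) for +: 'int' and 'str'
```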

This confusion comes from the common misconception that static types = strong
types, and dynamic types = weak types. The truth is that there are a good
number of static languages (C being the most obvious) which have much weaker
type-checking than some dynamic languages like Python.

~~~
nerdponx
Is "duck typing" strong or weak?

Maybe one could argue that Python has a robust but hard-to-use interface
system called "documentation".
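
For what it's worth, since Python 3.8 `typing.Protocol` turns that documentation-only interface into a structural one, checked statically by mypy and (for method presence) at runtime (a sketch, not from the thread):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Quacker(Protocol):
    def quack(self) -> str: ...

class Duck:
    def quack(self) -> str:
        return "quack"

# No inheritance needed: Duck satisfies Quacker structurally,
# much like a type satisfying a Go interface.
assert isinstance(Duck(), Quacker)
```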

------
room271
Thanks, this is a pretty great write-up. It's not a language-war, the
discussion is knowledgeable and pragmatic, and I like that you recognise that
a language suitable for work might be different from one you'd use on a side
project.

------
Thorentis
This is a pretty silly comparison to be honest. Why would you choose an
example for which a library exists in Python that can already do most of the
work for you? Sure, it might be the most "Pythonic" thing to do (importing a
library) but for an actual language comparison, you should be comparing things
1 to 1.

EDIT: For example, implementing a merge sort (without using any .sort()
functions) would be interesting to compare between the three languages. Though
to be honest, I wouldn't expect major differences aside from basic syntax.

~~~
coldtea
> _but for an actual language comparison, you should be comparing things 1 to
> 1._

Not really, you should be comparing the idiomatic ways in each language.

And if Python is more of a "batteries included / easily installable" language,
then that's how you should use it.

> _EDIT: For example, implementing a merge sort (without using any .sort()
> functions) would be interesting to compare between the three languages._

Only if you want to see how each language feels in terms of primitives and
syntax and so on. Not if you want to see how you'd actually use the language,
and what facilities and ecosystem you can leverage, which is more important.

If I'm going to compare Python for scientific computing to Java, for example,
of course I'll consider that Python has NumPy, Pandas, Anaconda, and so on,
and similar for what Java offers, not just try to e.g. write my own math code
in pure Python and Java. Same for most domains...

~~~
Thorentis
Sure, that makes sense in a specific way - perhaps if you were deciding which
language to use for a particular project. You'd look at which libraries it
offers, which features you will be making the most use of etc.

But in this case, it's comparing two languages generally to each other. And
the example chosen biases Python in terms of brevity and complexity because
Python has a library for the use case chosen for the comparison. If you want
to do a general comparison you need to be general in your approach.

~~~
chucksmash
> the example chosen biases Python in terms of brevity and complexity because
> Python has a library for the use case

Which would also be true for many other examples they might have picked, which
makes it a reflection of the ecosystem, not a distortion.

It's a "subjective, primarily developer-ergonomics based" comparison. Seems
like fair game to me.

------
epx
That's ironic, I also wrote a similar app for image diffing, and it is also
written in Python, Go, Rust... and Node. It also spits out a number as a diff
metric, but it is an integer. Code is at [https://github.com/elvis-epx/pictdiff](https://github.com/elvis-epx/pictdiff)

------
leshow
> There was one place where because of the type system, the implementation of
> the imaging library I was using would have led to an uncomfortable amount of
> code repetition.

If you truly don't care about handling the different variants and want them
all to execute the same code you could have just as easily done:

    
    
        let w = image1.width();
        let h = image1.height();
        let mut diff = image::DynamicImage::new_rgb8(w, h);
    

If you don't care about image1.color() 's variants, why even have the match at
all? I guess I don't understand why it 'rubbed you the wrong way'. You don't
have to match anything if you don't want to, and if you do want to match
variants, you have all the tools necessary to reduce duplication & only handle
the variants you want to.

~~~
deathanatos
I don't know if others do this, or if the Rust style guide particularly
sanctions it, but in match sections that get overly verbose like that, I find
it useful to do a local use statement to pull the enum variants into scope.
E.g.,

    
    
      use image::ColorType::*;
    
      match ... {
        RGB(_) => DynamicImage::new_rgb8(w, h),
        RGBA(_) => DynamicImage::new_rgba8(w, h),
      }
    

If you don't necessarily want the variants in the broader scope, it keeps them
out of it, but within the local context of the match, the reader often will
know what's being referred to. (And if they don't, the use statement will tell
them.)

Also, the author already has DynamicImage available in the scope (there's a
use for it at the top) so the image:: prefix isn't needed in that section.

There's also what looks like a manual expansion of try!() in run(), that could
just be create_diff_image(...)? which would be less verbose.

~~~
leshow
I prefer having the constructors in scope, personally, because it feels
similar to Haskell. I'm not sure why they decided to make them hidden.

------
WoodenChair
I think the author does a good job comparing the three languages here. That
said, comparing just one program can provide some insights, but not enough to
know if the language is a good choice for larger projects. I have rewritten a
whole book of programs, first in Swift, then in Python, and soon in Java
([https://classicproblems.com](https://classicproblems.com)). And if I didn't
know the three languages well, I probably would have to say that even writing
that book has not provided me enough insight to say "this language is good for
my next 100,000 line project."

------
icebraining
I wish these comparisons that involve typing would include Mypy. Python is now
effectively an optionally-typed language, not just a dynamically-typed
language, and in my experience it solves a good deal of the problems, while
not getting in the way.
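
For example, a sketch of annotated Python that mypy checks at analysis time while CPython runs it unchanged:

```python
def mean(values: list[float]) -> float:
    # mypy rejects a call like mean("oops") before the program runs;
    # CPython itself ignores the annotations entirely.
    return sum(values) / len(values)

print(mean([1.0, 2.0, 3.0]))  # 2.0
```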

------
truth_seeker
I don't understand people's obsession with comparing the performance of
statically typed and dynamically typed languages.

Besides that, CPython is not the only Python runtime or interpreter. The
author should have tried the PyPy runtime, which has a mature JIT.

~~~
nicolashahn
Yeah, there's a lot more I could do to have made this fairer. But, in the end,
my goal was to write the program in the most pragmatic way I could for each
language, not level the playing field as much as possible.

Also, this blew my mind:
[https://lobste.rs/s/xz5l8t/one_program_written_python_go_rus...](https://lobste.rs/s/xz5l8t/one_program_written_python_go_rust#c_lxm4vf)

~~~
tracker1
any chance I can get an invite? tracker1 at gmail

------
gigama
Interesting comparison, coming from a python background this was actually
helpful info!

Just odd to read noun-verb combinations like:

> "If you’re comfortable with Python, you can go through the Tour of Go in a
> day or two..."

> "I would go as far as saying that Go’s strength is that it’s not clever."

> "I decided to give an honest go at learning Rust."

> "Go propagates errors by returning tuples: value, error from functions
> wherever something may go wrong."

------
dev_dull
> _Its minimalism and lack of freedom are constraining as a single developer
> just trying to materialize an idea. However, this weakness becomes its
> strength when the project scales to dozens or hundreds of developers_

I actually appreciate this not just for other developers, but even for my own
code. Let's be honest, it's difficult to come back to a code base a year or
two after it went into maintenance, even if you were the writer.
------
zeveb
I love that this Lisp version from a comment at lobste.rs
([https://lobste.rs/s/xz5l8t/one_program_written_python_go_rus...](https://lobste.rs/s/xz5l8t/one_program_written_python_go_rust#c_lxm4vf))
runs in less than half the time of Rust, and is more readable to boot:

    
    
        (declaim (ftype (function (string string) double-float) img-diff))
        
        (defun img-diff (first-file-name second-file-name)
          (declare (optimize (speed 3) (safety 0) (debug 0) (space 0)))
          (let ((im1 (png:decode-file first-file-name))
                (im2 (png:decode-file second-file-name)))
            (declare ((SIMPLE-ARRAY (UNSIGNED-BYTE 8) (* * *)) im1 im2))
            (/ (loop
                for i fixnum below (array-total-size im1)
                summing (abs (- (row-major-aref im1 i)
                                (row-major-aref im2 i))) fixnum)
               (* (ash 1 (png:image-bit-depth im1)) (array-total-size im1))
               1.0))) ; Convert rational to float
        
        (img-diff "file1.png" "file2.png")

~~~
kiaulen
Less than half the time is provable. More readable is a bit of a stretch.

What does declaim mean? What about ftype? What is the (* * *) in the array
declaration? What is ash? I've read through some of CLtL and the rust book
both, but none of those are constructs I've come across.

Also (this doesn't matter in practice due to rainbow parens for almost every
editor), it's really hard to read lisp code without syntax highlighting. Rust
isn't super easy, but I don't have to count braces to see what lines up.

------
Touche
Great post, I like comparisons like this because it's about implementing
something practical. I like to see how the languages do in the real world.

------
todd8
This is great. I’ve been thinking about doing this very thing in a different
problem domain.

------
DannyB2
If there were one perfect language, for all purposes, we would be using it
already.

Also a genuine dollars-and-cents factor: development time.

Do not just measure CPU cycles and bytes. Development time is money, and it
is a measurable factor, too, that must be considered as a language tradeoff.

------
yydcool
Good to read! It let me know that Pillow has a lot of room for optimization.

------
wiineeth
Should have also considered C++

------
s_k_
you could've given at least one JVM language, maan o_0

~~~
BuckRogers
I always feel that way about the Python/Go/Rust obsession on HN. For the
majority of cases, you could just use Java (or Kotlin, etc.) or C#, and while
"boring", the overall benefits and balance those two provide would usually
make Python/Go/Rust pointless, while doing it with one language as well.

