
Nim binary size from 160 KB to 150 Bytes - def-
http://hookrace.net/blog/nim-binary-size/
======
teh
Today I looked at Nim in a bit more depth because it keeps popping up. I have
a slightly uncomfortable feeling about it that I hope is unfounded!

To me it looks like it makes the unsafety of C more accessible because of
better tooling and nicer syntax. Looking at e.g. [1] there are still pointers,
null pointers etc., just like in C. So now you have a language that looks
superficially simple but is actually very dangerous. Compare this to e.g. Rust,
which was the most painful thing I've learned recently, but I also know that it
brings something fundamentally new to the table.

Anyway, there's a lot I don't understand about Nim and I'd be happy to see
evidence to the contrary.

[1] http://nim-lang.org/0.11.0/tut1.html#advanced-types-reference-and-pointer-types

~~~
filwit
So Rust has some cool safety features, especially for concurrent code. But,
and perhaps I'm just uninformed, I never really understood the safety benefit
of Rust's 'never nil' design. Nil is a useful modelling tool, even in Rust
where it exists via Option<>/None, correct? Perhaps by forcing you to be
extremely explicit (and enforcing `match` always handles all conditions) you
gain some arguable safety, but at what cost? It's certainly not easier to use
and reason about, IMO. And it seems just as likely you'll end up crashing your
program due to a bounds-check error (which may happen more often since Rust
encourages indexing over references due to this very design.. at least, so
I've read).

It seems to me the design was chosen more as a way to ensure memory lifetimes
could be better predicted by the compiler than out of any strong argument for
safety.. but then, I'm not well read on the subject, and it's very likely
there are good safety arguments for it I'm not aware of.. either way, in my
experience nil-deref errors are rarely a painful thing. They happen often, but
are also fixed quickly.

~~~
pcwalton
> Nil is a useful modelling tool, even in Rust where it exists via
> Option<>/None, correct?

It's not that null is not useful. It's that most pointers can never be null,
so nullability is the wrong default. And it is useful for the compiler to
force you to handle the case in which pointers are null.

> Perhaps by forcing you to be extremely explicit (and enforcing `match`
> always handles all conditions) you gain some arguable safety, but at what
> cost?

There's basically no downside to having no null pointers. With constructs like
Option::map the code is usually even less verbose than the equivalent code
with null.
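As a rough sketch of the comparison being made here (the `shout` helper is invented purely for illustration):

```rust
// With Option, the "might be absent" case is explicit in the type,
// and `map` applies the transform only when a value actually exists,
// so no manual null check is needed at all.
fn shout(name: Option<&str>) -> Option<String> {
    name.map(|n| n.to_uppercase())
}

fn main() {
    assert_eq!(shout(Some("rust")), Some("RUST".to_string()));
    assert_eq!(shout(None), None);
}
```

The null-based equivalent would need an explicit `if (name != null)` branch before every use.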

> It's certainly not easier to use and reason about, IMO.

You never have to worry about your program failing whenever you type "." or
"*". With null, the semantics of the language are that an exception can be
thrown [1] whenever those constructs are invoked. That's pretty much
_objectively_ easier to reason about.

> It seems to me the design was chosen more as a way to ensure memory lifetime
> could be better predicted by the compiler rather than any strong argument
> for safety

Huh? Lifetimes are totally independent. We could have had null pointers with
the lifetime system (and there were languages like Cyclone that had both). The
system exists precisely because of safety.

We also get some really nice optimizations out of it that are impossible to
get in C. All pointers in Rust are _dereferenceable_ per the LLVM definition,
which opens up some really neat optimizations like loop invariant code motion
on loads.

> in my experience nil-deref errors are rarely a painful thing. They happen
> often, but are also fixed quickly.

Not in my experience. They show up in production all the time.

[1]: Or you could do what Nim does, and make dereferencing null undefined
behavior instead of guaranteeing that an exception is thrown, but that strikes
me as worse than what Java does.

~~~
filwit
> it is useful for the compiler to force you to handle the case in which
> pointers are null.

Well I agree that it's very useful (and we have that in Nim), but..

> With constructs like Option::map the code is usually even less verbose than
> the equivalent code with null.

I'm still not convinced of this part. It certainly hasn't been the case with
the admittedly small amount of Rust code I've seen. However, I'll look for
more comparisons in the future (or offer Nim comparisons to Rust snippets
anyone posts). Point is, nil is still a useful and commonly used tool. So the
argument about verbosity and convenience is relevant, IMO.

> With null, the semantics of the language are that an exception can be thrown
> [1] whenever those constructs are invoked. That's pretty much objectively
> easier to reason about.

That completely depends on how often you want to use nil refs, and how easy
they are to use. Like I said in another response, I agree Rust's design may be
better for some domains, but I certainly wouldn't call it "objectively" easier
to reason about in a general sense.

> Huh? Lifetimes are totally independent.

Well, like my post implied, I was only guessing as to the design. And it's
interesting to hear that it takes advantage of special compiler
optimizations. That said, I still don't see how it's completely decoupled from
the lifetime system.. you're saying that if I have an Option<> reference to a
mutable list in Rust, the compiler can determine whether or not the list is
'frozen' based on the runtime state of that reference?

> Or you could do what Nim does, and make dereferencing null undefined
> behavior.

I didn't think derefing nil was undefined behavior. I thought only
dereferencing a pointer which points to once-valid-but-now-free memory was
undefined behavior, and that situation is covered by GCed refs. Can you
explain this a bit?

EDIT:

> Not in my experience. They show up in production all the time.

I did say 'rarely', and I drew a comparison to bounds-check crashes, which
surely also show up in production.

~~~
pcwalton
> Point is, nil is still a useful and commonly used tool. So the argument for
> verbosity and conveniences is relevant, IMO.

The only advantage of having null references is that the pattern "if this
reference is non-null, dereference it; otherwise throw an exception" is
shorter. But the question is: how often do you want that pattern? In a robust
program, the answer to that is "rarely".

Put another way, it would be trivial to add sugar for the ".unwrap()" pattern
to Rust (perhaps with the "!" operator) if it were necessary, gaining back the
only verbosity-related advantage of null pointers. But nobody in the Rust
community is asking for it. That's because _this pattern is rare_. If it were
a problem, someone would have at least submitted an RFC by now!
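The `.unwrap()` pattern mentioned above can be illustrated with a short sketch:

```rust
fn main() {
    // `.unwrap()` is Rust's explicit spelling of "dereference or
    // panic" -- the behavior an implicitly nullable dereference
    // performs silently on every `.` in a language with null.
    let present: Option<i32> = Some(5);
    assert_eq!(present.unwrap(), 5);

    let absent: Option<i32> = None;
    // Calling absent.unwrap() here would panic at runtime, just as
    // a null deref throws in Java -- but you have to opt in, rather
    // than risking it on every access.
    assert!(absent.is_none());
}
```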

> I certainly wouldn't call it "objectively" easier to reason about in a
> general sense.

If you write down, formally, what the star or dot operators do, there are
strictly more steps involved when you have null pointers. That's why a
language without null is objectively easier to reason about.

> if I have a Option<> reference to a mutable list in Rust, the compiler can
> determine weather or not the list is 'frozen' based on the runtime state of
> that reference?

I don't know what this means. Lifetimes rule out dangling pointers. They don't
have anything to do with nullability. The borrow checker only cares about the
structure of your data enough to construct loan paths.

> Can you explain this a bit?

Dereference of null is undefined behavior in C, and Nim compiles to C code
that blindly dereferences pointers without inserting null checks. So
dereference of null is UB in Nim too. In an earlier comment I was able to
construct a Nim program that exhibited very different behavior in debug and
optimized builds, using nothing but GC'd pointers.

> I did say 'rarely', and I drew a comparison to bounds-check crashes, which
> surely also show up in production.

Actually, Rust does try to prevent indexing-related issues by preferring
iterators to raw array indexing. But, in any case, the comparison isn't
relevant for a couple of reasons. First of all, in a general sense if you have
big problems A and B, the fact that you can't solve B isn't an excuse to not
solve A. More specifically, though, the amount of type system machinery needed
to fully eliminate bounds check failures is much higher than that needed to
eliminate null pointer exceptions—you basically need dependent types, whereas
to eliminate null pointers all you need are bog-standard algebraic data types,
which have existed since the 70s.
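The "bog-standard algebraic data type" point can be made concrete: an Option-like type is definable in a few lines. `Maybe` below is a re-sketch for illustration, not a real std type:

```rust
// A plain algebraic data type: a value is either present or absent,
// and the compiler forces every `match` to handle both cases.
enum Maybe<T> {
    Just(T),
    Nothing,
}

fn describe(m: Maybe<i32>) -> &'static str {
    match m {
        Maybe::Just(_) => "present",
        Maybe::Nothing => "absent",
    }
}

fn main() {
    assert_eq!(describe(Maybe::Just(1)), "present");
    assert_eq!(describe(Maybe::Nothing), "absent");
}
```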

~~~
filwit
> there are strictly more steps involved when you have null pointers.

Well yes, and both Nim and Rust have non-nil pointers.. I suppose I misread
your original statement as "Rust is objectively better ..." when you actually
just said non-nil vars are an objectively better design pattern in general. My
mistake.

Our argument seems to stem from two assertions (one from you, and one from
me): "nil vars _are_ rare (in optimally written code)", and "Rust's way of
working with 'nil' vars is verbose". I suppose I'll concede that non-nil vars
are a better default (though I'll hold my reservations until I see more real
statistics; I don't find "no RFC yet!" hugely convincing), but I also feel
Rust could do a better job of giving access to "nilable" vars when they're
needed.

> I don't know what this means. Lifetimes rule out dangling pointers...

I mean, Rust prevents you (via compile-time mechanisms) from mutating a
variable while it's borrowed by another reference.. If that reference is
Option<>, it's only known at runtime whether or not a reference has actually
borrowed said variable. Rust must either treat every Option<> reference as a
potential 'loan path', which would significantly diminish their usefulness as
references, encouraging indexing for these scenarios, which leads to almost
identical potential for out-of-bounds crashes... or it's relying on some kind
of more complex mechanism (lifetime vars maybe?).. or additional runtime
overhead.

I really don't know enough about Rust to know how far off-base that is. So any
clarity is appreciated.

> In an earlier comment I was able to construct a Nim program that exhibited
> very different behavior in debug and optimized builds, using nothing but
> GC'd pointers.

I remember this comment, but I didn't remember it achieving UB in debug code..
I'll look through the history and take another look.

~~~
phaylon
> Rust must either treat every Option<> reference as a potential 'loan path',
> which would significantly diminish their usefulness as a references,
> encouraging indexing for these scenarios, which leads to almost identical
> potential for out-of-bounds crashes... or it's relying on some kind of more
> complex mechanism (lifetime vars maybe?).. or additional runtime overhead.

Can you give a concrete example of this? I'm a bit confused, but it might just
be a terminology thing. In Rust, `Option<T>` does not imply a reference. If
you have an `Option<i32>` there are no references involved. An `Option<T>` also
owns the `T` if there is one. You can get a reference to it, but you have to
check that it indeed holds a `T` (via `match` or `match`-based functionality).

Note: I should clarify: What's confusing me is the indexing stuff. I'm not
sure if this is referring to something about the `Option<T>` or something
else.
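A minimal sketch of the ownership point above:

```rust
fn main() {
    // An Option<i32> owns its value outright; no references involved.
    let owned: Option<i32> = Some(42);

    // To inspect it without consuming it, borrow the contents with
    // `as_ref()` and check which case you have via `match`.
    match owned.as_ref() {
        Some(v) => assert_eq!(*v, 42),
        None => panic!("expected a value"),
    }

    // `owned` is still usable afterwards: it was only borrowed.
    assert_eq!(owned, Some(42));
}
```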

~~~
filwit
> I should clarify: What's confusing me is the indexing stuff. I'm not sure if
> this is referring to something about the `Option<T>` or something else.

By indexing, I meant as an alternative to references.. For example, if you had
a Sprite type which held a 'reference' to a Texture in your game's Texture
list.. as soon as you allocate a Sprite it must borrow a reference to a
Texture, preventing any future mutation of the Texture list for the lifetime
of the Sprite, which obviously is too restrictive for most games.. so the
alternative is to have the Sprite simply hold an index to an item in the array
instead, but this basically comes with the same pitfalls as nilable refs (i.e.,
if you accidentally change it, your program can crash due to bounds-checking
errors.. or end up with visual glitches.. not sure which would be more
annoying).

The other alternative is to use an Option<&Texture> instead. However, I'm not
familiar enough with Rust to know the restrictions here, or even if that's
possible (taking a look at the docs, it looks like it is, but lifetime vars
come into play, which could complicate things).

~~~
phaylon
Rust solutions would probably be the following: some kind of runtime
assistance (`Rc<T>`, `Arc<T>` et al); using indices as you mentioned (though
with `list.get(index)` you'd still have to deal with the fact that it might
not be valid, since `get` will return an `Option<&T>`)[1]. Another solution
might be to allocate the textures in an arena that lives outside the scope
of your game logic, and have both the texture list and the sprites contain
borrows (note I'm not sure about this, as I haven't done much with arenas yet).

Although I'm unsure where the `not nil` as discussed above comes into play
here. What part in Nim would be `nil` here where Rust would have `Option<T>`?
The difference between `Option<&Texture>` and `&Texture` is that you have to
somehow deal with the possibility of no texture when handling the former.

[1] I should note that actual indexing behavior (`list[index]`) will assume
you know there's an element in there and panic if there isn't. This is one of
the things I dislike and hope there will be an optional (no pun intended) lint
for post-1.0.
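The `get` vs. indexing distinction can be sketched briefly (the texture names are invented for the example):

```rust
fn main() {
    let textures = vec!["grass.png", "stone.png"];

    // `get` surfaces a possibly-stale index as an Option you must
    // handle, instead of crashing automatically...
    assert_eq!(textures.get(1), Some(&"stone.png"));
    assert_eq!(textures.get(7), None);

    // ...whereas plain indexing assumes the index is valid and
    // panics on an out-of-bounds access:
    // let _ = textures[7]; // would panic at runtime
}
```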

------
ilitirit
I've tried Rust, Nim and Go and I prefer Nim. But this could also be because
of my background as a C/C++ programmer, and my particular requirements
(general purpose programming language that doesn't try to hold my hand _too_
much).

There are things I don't like about the language (e.g. case insensitivity),
but overall, if I had to choose a newish language for a new task, I'd choose
Nim over Rust and Go. (However, if you threw D into the equation I'd probably
go with D, simply because I feel it's slightly more mature.)

Incidentally, the way I tried to teach myself Nim (and to see if the language
was usable for creating small Windows apps) was to write a WinAPI program. It
took about the same effort as it would have in C/C++, or less, but it felt
much safer and more pleasant to work with.

~~~
k__
Do you have some directions on the WinAPI in Nim?

I felt a bit overwhelmed with the whole wrapping thing.

~~~
ilitirit
You don't have to wrap anything. It's exactly like using it from C, except
maybe a bit easier/safer. Just import windows. Then you can write code like
this:

    hWndMain = CreateWindowEx(
        WS_EX_TOPMOST,          # Optional window styles
        CLASS_NAME,             # Window class
        WINDOWNAME,             # Window text
        windowStyles,           # Window style

        # Size and position
        centerX, centerY,
        APP_WIDTH, APP_HEIGHT,

        cast[HWND](nil),        # Parent window
        cast[HMENU](nil),       # Menu
        hInstance,              # Instance handle
        cast[LPVOID](nil)       # Additional application data
    )

------
masklinn
Comparison summary between the Rust inspiration and the Nim version:

* Nim/GCC gains 2 bytes by smartly reusing the previously set AX register's value to set DI where Rust/Clang uses an immediate

* Nim can't express that stuff after the EXIT syscall is unreachable and wastes a byte on a RETQ.

------
Fastidious
I read this and wonder why software has gotten so fat. Any simple application
these days is easily a few dozen MB, most are a few hundred, and some are a
few GB in size. Why aren't we streamlining software to reduce its size? I
understand we have gotten "rich" on storage, but if the trend continues...

I am sure many portable devices would benefit if applications were trimmed
down.

~~~
Sir_Cmpwn
My experience writing KnightOS has given me an understanding of just how badly
we are wasting the obscene amount of resources modern computers make available
to us.
------
_blrj
I love these articles; I make sure to bookmark them just in case one day I
want to build binaries that do nothing.

~~~
level
In case you haven't seen it, here's a similar article on making a tiny ELF
executable. [1]

[1] http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html

------
onedognight
This post is Nim specific, but the key ideas for getting to a small binary
(optimize for size, remove the standard library, avoid compiler main() / crt0
baggage by defining _start, use system calls directly) are the same in C, C++,
Rust, etc.

------
bonesmoses
I like the end result. However, it makes me wonder just why it's so acceptable
that simple programs like this even compile down to a 160KB executable in the
first place.

The actual active code is essentially some text and an interrupt. That much,
at least, should be language independent. Are modern compilers incapable of
discarding unreferenced code, or am I missing something?

~~~
def-
The first compilation, which is 160 KB, is totally unoptimized, contains all
kinds of debugging and checks. It's just supposed to be for yourself during
development of the program.

Also there is some overhead every Nim program has. But if you get to bigger
programs you'll see that Nim's binary size is just fine, for example a NES
emulator is just 136 KB: http://hookrace.net/blog/porting-nes-go-nim/#comparison-of-go-and-nim

~~~
bonesmoses
Good to know. Also, pretty cool emulator!

------
killercup
> (1 byte smaller than in Rust)

Nice achievement! The article is quite the journey through various build
parameters, switching gcc for clang and glibc for musl along the way. In the
end, the secret sauce is syscalls and custom linking, though (as always with
this kind of thing).

------
thom_nic
This seems mostly useful in highly constrained embedded environments (AVR,
MSP430, ARM M0, PIC, etc.). Unfortunately, none of these "modern" systems
languages (Nim, Rust) seem to be putting much effort towards embedded
platforms :(

~~~
steveklabnik
Rust isn't putting a lot of specific effort into embedded, but we already work
on many embedded platforms. As the language matures, I expect that support to
grow.

~~~
simias
Embedded and especially bare-metal applications really are 2nd class citizens
in the rust ecosystem though. You basically can't use cargo (or at least not
without a whole bunch of hacks) and many important features for low-level code
are still gated and won't be available for 1.0.

I think it's a bit of a shame because that's basically the #1 differentiator
with languages like Go or Java as far as I'm concerned.

But beyond that it's true that the language itself has a lot of potential for
embedded applications. The runtime can be made almost as tiny as C's and with
libcore you get a much nicer and safer "bare metal" environment than what
you'd get in C. And thanks to LLVM you can easily target a whole bunch of
architectures.

~~~
tshepang
Why can't one use Cargo for bare-metal applications?

~~~
steveklabnik
See all the extra work that had to be done here:
https://github.com/Ogeon/rust-on-raspberry-pi

------
jug
> Who needs error handling when you can have a 6 KB binary instead

Haha!

What I found most impressive was the small binary even without the tricks.

------
delinka
"The speed optimized binary is much smaller..."

Did I miss where he optimized for speed?

~~~
lukevers
Including `-d:release` optimizes it, so that's probably what he meant. It's
one of the first things in the [tutorial](http://nim-lang.org/0.11.0/tut1.html).

------
cristaloleg
WOW. 10/10

~~~
nodejsisbest
Hey this isn't a constructive comment & you'll probably get downvoted for it.
It happens to me whenever I comment in a thread that mentions Node.js because
of my name. It is the best, but not everyone understands.

