
Introducing .NET IL Linker - jsingleton
https://github.com/dotnet/announcements/issues/30
======
jbevain
Fun fact, the Mono Linker project started 10 years ago during the second
edition of the Google Summer of Code!

It was originally used to reshape the entire Mono class library into a subset
to expose the Silverlight class library API surface for Moonlight.

It was then used to link iOS and Android applications in MonoTouch and Mono
for Android, and Xamarin continued to use and improve it.

The Mono Linker has an open architecture, making it reasonably easy to
customize how it processes code and to detect patterns specific to each
platform so they can be linked away. Xamarin added linker steps that go beyond
tree-shaking and remove dead code inside methods. For instance:

    if (TargetPlatform.Architecture == Architecture.X64) {
      // ..
    }

The entire if body can be removed if the linker knows that
TargetPlatform.Architecture will not be X64.

And now, it's the base for the .NET Core linker. Quite a journey!

------
christophilus
It's been a few years since I used .NET, but the self-contained deployment
story sounds like a big win that I used to really want:

"Self-contained deployment. Unlike FDD, a self-contained deployment (SCD)
doesn't rely on the presence of shared components on the target system. All
components, including both the .NET Core libraries and the .NET Core runtime,
are included with the application and are isolated from other .NET Core
applications. SCDs include an executable (such as app.exe on Windows platforms
for an application named app), which is a renamed version of the platform-
specific .NET Core host, and a .dll file (such as app.dll), which is the
actual application."

So, this IL Linker is in support of that story, and aims to reduce the overall
size of self-contained applications... Maybe it's time to start tinkering
w/ .NET again.

~~~
danek
My company has some apps that we need to install on our customers' servers. A
linker would have been amazing for us 7 years ago, as we could have targeted
the (then new) 4.0 runtime and redistributed our binary. But as it was,
we could only expect the 2.0 runtime to be installed, so we couldn't really
leverage a lot of stuff that's been out since then and had to write our own
hacky versions of lots of common things.

Nowadays with dotnet core you can bundle the entire runtime with your binary.
The linker is cool, in that it will create a strictly smaller binary, as in
"hello world" might be 12 MB instead of 50 MB. That's cool, but it makes me
wonder if .NET would be in a strategically better place today if the linker
had been present 10 years ago. How many companies didn't choose to write their
app in C# simply because a customer might not have the runtime installed?

In any case I'm really impressed with everything Microsoft is doing with
dotnet core and hope that I get to use more of it.

~~~
int_19h
It certainly would. The reach of .NET was artificially restricted in many
respects by making it a component of Windows. Mono provided the ability to
just ship the runtime alongside the app, but Mono was also not "enterprise
grade"; and of course there were always the missing bits and pieces,
especially on the desktop front where such deployment made most sense.

It took a Microsoft that understood open source and non-Windows platforms, and
was serious about it, to ultimately get us to the point where .NET Core could
be a thing.

~~~
pjmlp
Until .NET Core provides support for many of the .NET Framework APIs still
lacking from .NET Standard 2.0, many companies won't consider it "enterprise
grade".

Examples in our case: WPF, WCF, and ODP.NET drivers for ADO.NET.

~~~
svick
I think .Net Core will never include WPF.

As far as I know, there are no plans for a cross-platform GUI on .Net Core. And
WPF is pretty much a dead technology.

------
blinkingled
> And, it wouldn't be right to announce this linker without a hat tip to Joel
> Spolsky who we can give the honorary and historical distinction of feature
> requestor.

[https://www.joelonsoftware.com/2004/01/28/please-sir-may-i-h...](https://www.joelonsoftware.com/2004/01/28/please-sir-may-i-have-a-linker/)
\- Joel asked for this in 2004 - Better 13 years late than never!

------
pcunite
If .NET (using C#) can produce a native binary I'll switch to that for as many
projects as I can. Currently I use C++, mostly to interface with the OS. I
love the feel I get when working in C#. Everything is so nice and cozy.

Memory: The memory usage "looks" bad. I suppose maybe GC would kick in when it
needs to. It has not been a problem for me, but when you see the code in C++
taking 1MB ram and a C# implementation taking 10MB, well, you have to wonder.

Performance: C# performs bounds checking, which I take it is a significant
performance hit. This has not mattered for me _yet_, but I've not converted
everything over to .NET.

~~~
int_19h
The nice thing about C# is that, unlike Java, you actually get all the low-
level things. They're just tucked away, out of sight of your average coder
who's more likely to shoot themselves in the foot with them. But if you know
how to use them, it's all there. Let me enumerate.

C# has value types, which follow the same memory model as in C (i.e. stack-
allocated for locals, embedded directly into the outer object as fields). You
can request explicit layout mode, whereby you can set an offset for every
field manually - if you set them all to zero, you get a C union.
Alternatively, you can request automatic layout mode, where the JIT is allowed
to reorder the fields in arbitrary ways to optimize memory and/or access,
which is something you don't even get in C.
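
As a minimal sketch of the explicit-layout point above: the struct below places two fields at offset zero so they overlap like a C union. The names `FloatBits` and `UnionDemo` are hypothetical and exist only for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Both fields start at byte offset 0, so they share the same 4 bytes,
// which is the C-union behaviour described in the comment above.
[StructLayout(LayoutKind.Explicit)]
public struct FloatBits
{
    [FieldOffset(0)] public float Value;
    [FieldOffset(0)] public uint Bits;
}

public static class UnionDemo
{
    // Reinterprets the bytes of a float as an unsigned integer by writing
    // one overlapping field and reading the other.
    public static uint BitsOf(float f)
    {
        FloatBits u = default; // zero-initialise so every field is assigned
        u.Value = f;
        return u.Bits;
    }
}
```

For example, `UnionDemo.BitsOf(1.0f)` yields `0x3F800000`, the IEEE 754 single-precision encoding of 1.0.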

It has raw pointers. Unlike object references, these have the usual C
semantics - there's no GC involved there, and you're responsible for keeping
the objects alive, so you can have dangling pointers etc. No boundary or null
checks, either - it's zero overhead. You can do pointer arithmetic on them,
including indexing with []. If you P/Invoke malloc or equivalent, this gives
you C-style heap allocated arrays.

It has stackalloc, which is a language operator that's equivalent to the non-
standard alloca() function in C - allocate a chunk of memory on the stack, and
return a pointer. This immediately gives you stack-allocated arrays as
flexible as C99 VLAs.
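
A small sketch of a stack-allocated, runtime-sized buffer. The comment above describes the pointer-returning form, which requires an `unsafe` context; this sketch instead assumes C# 7.2 or later, where `stackalloc` can be assigned to a `Span<int>` in safe code. `StackAllocDemo` is a hypothetical name.

```csharp
using System;

public static class StackAllocDemo
{
    // Sums the squares 0^2 .. (n-1)^2 using a scratch buffer that lives on
    // the stack. n should stay small: stack space is limited, and the buffer
    // is released automatically when the method returns.
    public static int SumOfSquares(int n)
    {
        Span<int> buffer = stackalloc int[n]; // size chosen at run time, like a C99 VLA
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = i * i;
        }

        int total = 0;
        foreach (int value in buffer)
        {
            total += value;
        }
        return total;
    }
}
```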

With generics, if a generic type parameter is a value type, the specialization
for that type is separately JIT-compiled and optimized. This can be used in a
way very similar to C++ templates, for zero-overhead inlined callbacks and
similar shenanigans.
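
The generics point can be sketched as follows, with the callback passed as a struct type parameter rather than a delegate. `IBinaryOp`, `Add`, and `Fold` are hypothetical names; whether the call to `Apply` is actually inlined is up to the JIT, so treat this as an illustration of the pattern rather than a performance guarantee.

```csharp
public interface IBinaryOp
{
    int Apply(int a, int b);
}

// A value type implementing the callback. Because TOp below is constrained
// to struct, the runtime compiles a separate specialization of Reduce for it.
public struct Add : IBinaryOp
{
    public int Apply(int a, int b) => a + b;
}

public static class Fold
{
    // The operation is known per specialization, so no delegate or
    // interface dispatch through a boxed object is needed.
    public static int Reduce<TOp>(int[] xs, int seed, TOp op)
        where TOp : struct, IBinaryOp
    {
        int acc = seed;
        foreach (int x in xs)
        {
            acc = op.Apply(acc, x);
        }
        return acc;
    }
}
```

A call looks like `Fold.Reduce(new[] { 1, 2, 3 }, 0, new Add())`.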

All in all, if you want to write high-perf code in C#, you certainly can.
You'll hit the limits of the JIT optimizer soon enough - it can't be as good
as a full-fledged AOT C++ compiler, say - but you can certainly shed most of
the overhead associated with a managed language.

~~~
pjmlp
In Java some of them are there, although the Unsafe package is not sanctioned,
and marked to be replaced by something safer that can provide the same
capabilities (e.g. VarHandles on Java 9).

As for the rest at least we can hope Java 10 brings them.

You missed a few C# goodies from 7.0 up to 8.0, coming from the experience
with Midori.

~~~
int_19h
The goodies in question are local refs and ref returns, right? Which really
only make a difference when you're writing memory safe code. If you're
dropping down to "unsafe" and manual memory management for other things, then
refs don't do anything that raw pointers don't.

~~~
pjmlp
There are other goodies, like memory views, blittable structs, interior
pointers and readonly refs.

[https://github.com/dotnet/roslyn/blob/master/docs/Language%2...](https://github.com/dotnet/roslyn/blob/master/docs/Language%20Feature%20Status.md)

------
caleblloyd
This is neat as it brings smaller DLL sizes. It does not include self-
contained executables that can be shipped, however. You still end up having to
ship a directory full of DLLs. I've been wanting self contained executables
for a while now. CoreRT (.NET Native) seems to have been put on the
backburner, plus sometimes the JIT case is faster.

Source: discussion at
[https://github.com/dotnet/core/issues/915#issuecomment-32608...](https://github.com/dotnet/core/issues/915#issuecomment-326081258)

~~~
joshschreuder
Not sure about .NET Core as I haven't used it yet, but you can use ILRepack to
merge DLLs into the executable so you don't need to ship a directory full.

[https://github.com/gluck/il-repack](https://github.com/gluck/il-repack)

We do this at work with one of our utilities

------
neilsimp1
Relevant Scott Hanselman blog post:
[https://www.hanselman.com/blog/ExperimentalReducingTheSizeOf...](https://www.hanselman.com/blog/ExperimentalReducingTheSizeOfNETCoreApplicationsWithMonosLinker.aspx)

------
sharpercoder
I thought this technique of eliminating dead code was called tree-shaking, and
that linking was something entirely different: linking assemblies at
(compile|run|analysis)-time.

~~~
mitchty
I've only ever heard it called dead code elimination. Tree shaking appears to
be more of a JavaScript thing? Not sure, I don't participate in that community
at all.

[https://en.wikipedia.org/wiki/Dead_code_elimination](https://en.wikipedia.org/wiki/Dead_code_elimination)

~~~
cwzwarich
It's an old Lisp term going back decades. Strictly speaking it's not the same
as interprocedural DCE, because due to the dynamic features of Lisps you might
need the user to specify the extent to which they are willing to disable the
dynamic features to save space.

~~~
mitchty
Gotcha. Would it be fair to characterize tree shaking as more about
whitelisting things that have been used, versus eliminating code paths that
cannot be accessed?
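
To make the "keep what is used" framing concrete, here is a minimal sketch of a reachability pass over a call graph: start from known roots and keep only what can be reached from them. The graph representation and the `TreeShaker` name are assumptions for illustration; this is not how the Mono or .NET linker is implemented.

```csharp
using System.Collections.Generic;

public static class TreeShaker
{
    // calls maps a method name to the methods it calls directly.
    // Returns every method reachable from the roots; anything absent
    // from the result is a candidate for removal.
    public static HashSet<string> Reachable(
        IDictionary<string, string[]> calls,
        IEnumerable<string> roots)
    {
        var kept = new HashSet<string>();
        var pending = new Stack<string>(roots);

        while (pending.Count > 0)
        {
            string method = pending.Pop();
            if (!kept.Add(method))
            {
                continue; // already visited
            }
            if (calls.TryGetValue(method, out string[] callees))
            {
                foreach (string callee in callees)
                {
                    pending.Push(callee);
                }
            }
        }
        return kept;
    }
}
```

Anything that is invoked only dynamically never appears as an edge in such a graph, which is why the roots sometimes have to be extended by hand.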

------
revelation
Now they just need a modern (XAML, QML) cross-platform UI story and they can
eat Qt's and Electron's lunch.

~~~
ledgerdev
MS/Xamarin is working on XAML Standard implementations for Win/Mac/GTK.

[https://blogs.windows.com/buildingapps/2017/05/19/introducin...](https://blogs.windows.com/buildingapps/2017/05/19/introducing-xaml-standard-net-standard-2-0/)

------
Nuzzerino
> "The linker removes code in your application and dependent libraries that
> are not reached by any code paths. It is effectively an application-specific
> dead code analysis."

Putting aside the discussion on whether this is a good practice, I assume this
would remove methods that are only called dynamically, or via reflection.
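
A short sketch of why this is a concern: in the code below, `Plugin.Greet` is referenced only through strings, so a static reachability analysis has no direct way to see that it is used. `Plugin` and `ReflectionDemo` are hypothetical names.

```csharp
using System;
using System.Reflection;

public class Plugin
{
    // Nothing in this file calls Greet directly.
    public string Greet() => "hello";
}

public static class ReflectionDemo
{
    // Looks up a type and a parameterless method by name at run time and
    // invokes it. The names may come from config or user input, so they
    // are opaque to any tool that only follows static references.
    public static string CallByName(string typeName, string methodName)
    {
        Type type = Type.GetType(typeName);
        if (type == null)
        {
            throw new InvalidOperationException("Type not found: " + typeName);
        }

        object instance = Activator.CreateInstance(type);
        MethodInfo method = type.GetMethod(methodName);
        if (method == null)
        {
            throw new MissingMethodException(typeName, methodName);
        }
        return (string)method.Invoke(instance, null);
    }
}
```

If `Plugin` or `Greet` were stripped from the assembly, this call would fail only at run time.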

~~~
danek
I'm also curious how this would work without breaking reflection

~~~
matthewwarren
They talk about this here
[https://github.com/dotnet/core/blob/master/samples/linker-in...](https://github.com/dotnet/core/blob/master/samples/linker-instructions-advanced.md#limitations)

~~~
danek
Thank you

------
marssaxman
I wonder how this interacts with the "no-PIA" feature, where the compiler
would embed and dead-strip COM interop types. Seems like some of that system's
guid-based type identity attributes might be relevant.

~~~
svick
This is (at least for now) only for .Net Core, which does not support COM
interop, I believe.

------
64738
Can you create a self-contained deployment (SCD) of a GUI app, or does it only
work for console apps?

~~~
mellinoe
The technology will work with any kind of app. I know that some folks working
on one of the bigger UI frameworks, Avalonia
([https://github.com/AvaloniaUI/Avalonia](https://github.com/AvaloniaUI/Avalonia))
have toyed with it recently. One thing to keep in mind is that lots of these
frameworks do dynamic, reflection-based data binding, which might be easily
broken if the linker removes too much stuff. You can tinker with the settings
of the linker to force it to preserve things that are actually necessary.

------
flukus
Don't forget to check the license of every library, even many of the more
permissive ones will require an attribution notice.

------
manigandham
Nice to see this progress, hopefully it's a serious part of the roadmap and
not just a side experiment.

------
bernadus_edwin
Does Java have a linker feature?

~~~
pjmlp
Yes, shipping on Java 9.

------
userbinator
Going from over 46MB to under 12MB seems impressive on a relative scale, but
if this is the code of the "dotnetapp-selfcontained" example,

[https://github.com/dotnet/dotnet-docker-samples/blob/master/...](https://github.com/dotnet/dotnet-docker-samples/blob/master/dotnetapp-selfcontained/Program.cs)

That is still, absolutely speaking, _twelve million bytes_ for not much more
than "Hello World" levels of functionality, which means there remains _plenty_
of room for improvement. I estimate the lower limit for this particular app is
somewhere in the hundreds of bytes, most of it being string constants.

~~~
judah
I see this fallacy repeated often. "Hello World is 12 MB - just imagine how
big a real app will be!!!"

Hello World doesn't benefit from that 12MB. Your real app would.

12MB is not for Hello World. 12MB is for Hello World + Common Language Runtime
(garbage collection, bounds checking, memory protection, execution environment
etc.) + the .NET standard libraries (LINQ, task parallel library, etc.)

You're probably going to need those things in a real app.

If you're actually doing Hello World -- and that is _all_ you're doing -- and
if using 12MB of your 16GB of RAM is unacceptable for you, yes, by all means
write some native code.

~~~
algorithmsRcool
Put another way: what should be left after linking is just the parts of the
CLR + BCL that the runtime needs to function. Apparently that takes 12MB
today. There is a JIT, GC and Execution Engine all in there.

The number might shrink as the linker gets more aggressive, but I wouldn't
count on it much.

The application IL code is perhaps 50 bytes for hello world.

~~~
Kuraj
12 megabytes doesn't sound too bad for the things you mentioned, especially if
it's a constant cost.

