
> Take a moment to appreciate what just happened here - I downloaded a Windows build of Zig, ran it in Wine, using it to cross compile for Linux, and then ran the binary natively. Computers are fun!

> Compare this to downloading Clang, which has 380 MiB Linux-distribution-specific tarballs. Zig's Linux tarballs are fully statically linked, and therefore work correctly on all Linux distributions. The size difference here comes because the Clang tarball ships with more utilities than a C compiler, as well as pre-compiled static libraries for both LLVM and Clang. Zig does not ship with any pre-compiled libraries; instead it ships with source code, and builds what it needs on-the-fly.

Hot damn! You had me at Hello, World!

> Take a moment to appreciate what just happened here - I downloaded a Windows build of Zig, ran it in Wine, using it to cross compile for Linux, and then ran the binary natively. Computers are fun!

Even though it probably doesn't qualify, this is pretty close to a Canadian Cross, which for some reason is one of my favorite pieces of CS trivia. It's when you cross compile a cross compiler.


> The term Canadian Cross came about because at the time that these issues were under discussion, Canada had three national political parties.

In what way is this even close?

What are the three targets in this case? It simply isn’t relevant at all.

It is tangentially relevant CS trivia. I found it interesting and fun. The only thing completely useless here is, unfortunately, your comment.

You wrote:

> this is pretty close a Canadian Cross

And on that point, your correspondent is right. The two bear no real resemblance to each other. The traditional cross compilation approach is not something to be held in high regard. It's the result of poor design: a lot of work involving esoteric implementation details to solve a problem that the person using the compiler should never have encountered in the first place. It's exactly the problem the Zig project leader is highlighting in the quote above when he contrasts Zig with Clang, etc.

The way compilers like Go and Zig work is the only reasonable way to approach cross compilation: every compiler should already be able to cross compile.

Thanks for putting it this way. I always wondered why cross compilation was a big deal. To me it sounds like saying "look, I can write a text in English without having to be in an English-speaking country!".

The problem with cross compilation isn't the compiler, it's the build system. See e.g. autoconf, which builds test binaries and then executes them to test for the availability of strcmp(3).

I feel like there should be a more sane way to test for the availability of strcmp than to build and run a whole executable and see if it works.

The sheer number of things autoconf does even for trivial programs has always been baffling to me.

Go doesn't cross compile: it only supports the Go runtime's own operating systems on a very limited number of processor variants.

If zig were to truly cross compile for every combination of CPU variant and supported version of every operating system, it would require terabytes of storage and already be out of date.

It doesn't require nearly as much storage as you think. For Zig, there are only three standard C libraries (glibc, musl, mingw) and seven unique architectures (ARM, x86, MIPS, PowerPC, SPARC, RISC-V, WASM). LLVM can support all of these with a pretty small footprint, and since the standard libraries can be recompiled by Zig, it really only needs to ship source code; no binaries necessary.

If it's only supporting two OS runtimes and a small subset of hardware it's mostly just a curiosity.

People really do appreciate such convenience. I am not familiar with Zig, but Go provides me a similar experience for cross-compilation.

Being able to build for FreeBSD/amd64, Linux/arm64, and other commonly used OS/arch combinations in a few minutes sounds like a dream, but it is reality for users of modern languages.

I'm all for cross compilation, but in reality you still need running copies of those other operating systems in order to be able to test what you've built.

Setting up 3 or 4 VM images for different OSes takes a few minutes. Configuring 3 or 4 different build environments across as many OSes on the other hand ...

And actually building on these potentially very low-power systems...

Sure, but building typically takes more resources than executing, so while it's not really feasible to use a Raspberry Pi to build, it can be fine for testing.

Yes but the dev setup is not really necessary for all of those OSes.

Only if not using OS APIs.

Yeah, sorry, I didn't think about that. It's probably very important, as low-level code like this exists mainly for talking to OS APIs.

You can do that in clang/gcc, but you need to pass -static and -static-pie. The second option ensures the binary is loader-independent; otherwise you get problems when compiling on glibc and running on musl platforms (or vice versa).

Could you elaborate/link on the loader-independency topic?

In brief, most programs these days are position-independent, which means you need a runtime loader to map the program's segments and resolve its symbols in memory, and to tell the other parts of the code where everything was put. Because of differences between musl libc and GNU libc, in practice this means a program compiled against GNU libc can be marked as executable, yet when the user tries to run it they are told it is not executable, because the binary is looking in the wrong place for the dynamic loader, which is named differently by the two libraries. There are also some archaic, non-standard symbols that GNU libc defines which musl libc lacks, and these can also cause problems for the end user.

e: I didn't realise it was 5am, so I'm sorry if it's not very coherent.

I would also appreciate it if you could be even more specific once more "coherency" is possible. I'm also interested in what more you can say about "The second option is to ensure it's loader-independent, otherwise you get problems when compiling and running across musl/glibc platforms".

Ok so, it's been a year or so since I was buggering around with ELF internals (I wrote a simpler header in assembly so I could make a ridiculously small binary...). Let's take a look at an ELF program. If you run `readelf -l $(which gcc)` you get a bunch of output, among which is:

    alx@foo:~$ readelf -l $(which gcc)

    Elf file type is EXEC (Executable file)
    Entry point 0x467de0
    There are 10 program headers, starting at offset 64

    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                     0x0000000000000230 0x0000000000000230  R      0x8
      INTERP         0x0000000000000270 0x0000000000400270 0x0000000000400270
                     0x000000000000001c 0x000000000000001c  R      0x1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                     0x00000000000fa8f4 0x00000000000fa8f4  R E    0x200000
You can see that among the program headers is an entry called INTERP that requests the loader. This is because the program has been compiled with the -fPIE flag, which requests a "Position Independent Executable". This means each section of the code has been compiled so that it doesn't expect the other sections at fixed positions in memory. In other words, you can't just run it on any UNIX computer and expect it to work: it relies on another program, the dynamic loader, to load each section and tell the other sections where it was loaded.

The problem with this is that the musl loader (I don't have my x200 available right now to copy some output from it to illustrate the difference) usually lives at a different path. What this means is that when the program is run, the kernel tries to find the requested program interpreter; because musl libc's interpreter has a different name and place in the filesystem hierarchy, the lookup fails and the program is refused execution.

Now you would think a naive solution would be to symlink the musl libc loader to the expected position in the filesystem hierarchy. The problem with this is illustrated when you look at the other dependencies and symbols exported in the program. Let's have a look:

    alx@foo:~$ readelf -s $(which gcc)

    Symbol table '.dynsym' contains 153 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __strcat_chk@GLIBC_2.3.4 (2)
         2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __uflow@GLIBC_2.2.5 (3)
         3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND mkstemps@GLIBC_2.11 (4)
         4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND getenv@GLIBC_2.2.5 (3)
         5: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND dl_iterate_phdr@GLIBC_2.2.5 (3)
         6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __snprintf_chk@GLIBC_2.3.4 (2)
         7: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __pthread_key_create
         8: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND putchar@GLIBC_2.2.5 (3)
         9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND strcasecmp@GLIBC_2.2.5 (3)
As you can see, the program not only expects a GNU program interpreter, but the symbols it has been linked against carry GLIBC_2.2.5-style version numbers as part of the imported symbols (although I cannot recall whether this causes a problem or not; memory says it does, but you'd be better off reading the ELF specification at this point, which you can find here: https://refspecs.linuxfoundation.org/LSB_2.1.0/LSB-Core-gene...). So the ultimate result of trying to run this program on a musl libc system is that it fails to run, because the symbols are 'missing'. On top of this, you can see with `readelf -d` that it relies on the libc library:

    alx@foo:~$ readelf -d $(which gcc)

    Dynamic section at offset 0xfddd8 contains 25 entries:
      Tag        Type                         Name/Value
     0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
     0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
     0x000000000000000c (INIT)               0x4026a8
Unfortunately for us, the libc.so.6 binary produced by the GNU system is also symbolically incompatible with the one produced by musl, and GNU libc defines some functions and symbols that are not in the C standard. The ultimate result is that you need to link statically against libc, and avoid depending on the program loader, for this binary to have a chance of running on a musl system.
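To make the static-linking fix concrete, here is a hedged sketch (assumes gcc with a static libc installed; file names are examples). A fully static binary carries no INTERP program header at all, so the glibc-vs-musl loader mismatch disappears:

```shell
# Build a fully static binary and confirm it requests no loader.
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF

gcc -static hello.c -o hello-static

# No INTERP header means no runtime loader is requested.
readelf -l hello-static | grep INTERP || echo "no INTERP header"
```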

Wow. Your answer really complements the details provided by the author of the original article.

Many, many thanks for the answer! I've already done some experimenting myself and wanted to do more, so it really means a lot to me.

For further interest you might want to take a look at:


I altered a version of that ELF64 header for 64 bit, and then modified it to work under grsec's kernel patches: https://gitlab.com/snippets/1749660

One example of a description of how the Linux linkloader works is here [0]. Other OSes are similar.

[0] https://lwn.net/Articles/631631/

Dlang is a better C. DMD, the reference compiler for Dlang, can also compile and link with C programs. It can even compile and link with C++03 programs.

It has manual memory management as well as garbage collection. You could call it hybrid memory management. You can manually delete GC objects, as well as allocate GC objects into manually allocated memory.

The Zig website says "The reference implementation uses LLVM as a backend for state of the art optimizations." However, LLVM is consistently 5% worse than the GCC toolchain at performance across multiple benchmarks. In contrast, GCC 9 and 10 officially support Dlang.

Help us update the GCC D compiler frontend to the latest DMD.

Help us merge the direct-interface-to-C++ into LLVM D Compiler main. https://github.com/Syniurge/Calypso

Help us port the standard library to WASM.

>However, LLVM is consistently 5% worse than the GCC toolchain at performance across multiple benchmarks

That is true, but it is ALSO true that LLVM is consistently 5% better than the GCC toolchain at performance across multiple benchmarks.

D seems like its pitch is "a better C++," but "a better C" doesn't seem quite right.

D's whole premise of "being a better C++" has always made them look like argumentative jerks. Why build a language on top of a controversy? Their main argument from the early 2000s: C++ requires a stdlib, and the compiler toolchain is not required to provide one. Wtf, D? I mean, I understand that C++ provides a lot of abstractions on top of C to call itself "more" than C, but what does D provide other than a few conveniences? If you even consider garbage collection, better-looking syntax, or a more consistent, less orthogonal syntax a convenience. It didn't even have most of its current features when it first came out back in the early 2000s. Trying to gain adoption by creating some sort of counterculture; what are they, 14? /oneparagraphrant

It is probably the case that D has a brilliant engineering team that doesn't really focus on the PR side of things. D definitely provides value over C/C++ beyond a bit of syntax sugar. It is just not communicated that well.

It has an official subset and associated compiler flag. (https://dlang.org/spec/betterc.html)

Can DMD compile C programs though? That's what "zig cc" does, and it's so much easier to get up-and-running than any crossdev setup I've used before.


Not really, because unlike Zig, D doesn't allow common C security exploits unless one explicitly writes code that way.
