These kinds of claims make me so dubious about a language. Also, who is trying to switch from C for an ever-so-slight performance improvement? That use case goes to specialized hardware such as FPGAs.
You picked that quote from the main page, not from the release notes; here is what directly follows:
* The reference implementation uses LLVM as a backend for state of the art optimizations.
* What other projects call "Link Time Optimization" Zig does automatically.
* For native targets, advanced CPU features are enabled (-march=native), thanks to the fact that cross-compiling is a first-class use case.
* Carefully chosen undefined behavior. For example, in Zig both signed and unsigned integers have undefined behavior on overflow, contrasted to only signed integers in C. This facilitates optimizations that are not available in C.
* Zig directly exposes a SIMD vector type, making it easy to write portable vectorized code.
So the argument is "exact same compiler as C/C++, but more opportunities for optimization thanks to better semantics and better access to native instructions". This seems reasonable on its face, so care to elaborate on the doubt?
The arguments for not switching from C are often performance or target related, so a language that purports to be an alternative to C would want to point out that those issues aren't a problem.
The reasons to switch away from C are numerous and well documented.
The latter. Zig actually kinda does define this behavior as "`a + b` shall not overflow", and inserts checks for it in debug and safe builds. To get wrapping overflow (which Zig does define), you use a different operator, `a +% b`, and no check is inserted. For speed-optimized builds, unless you've explicitly told the compiler to insert checks in that scope, it turns the checks into no-ops.
So, while it is technically correct to say that it has undefined behavior for overflow, the practical reality is quite different.
We do the same thing in Rust, but I think that characterizing this as UB is misleading, personally. We created a new category, "program error", for this, to distinguish from UB proper.
I'm not sure if Zig inherited the defined/implementation defined/undefined hierarchy from C and C++ though.
Undefined as in undefined by the language spec. There are various processor implementations with differing results that are often quite useful. Would you preclude their use?
Whether something is defined or not is a property of the language specification, as you stated.
What you’re now bringing up is something different: should a specification define this behavior, or not? I think you’ve properly identified a trade-off, but mis-identified the details. Defining a behavior here does privilege certain architectures, but it doesn’t make the others impossible to support. It means the behavior must be replicated in code on some architectures, which is slower.
This is the trade-off that Rust (and apparently Zig) are making. It is informed by the architectures that exist, and by which ones the language wishes to support. At this point, two’s complement is overwhelmingly used, and so the decision was made that defined behavior is worth it. Note the parallel here with POSIX specifying that CHAR_BIT must be 8.
Notably, the situation here is so overwhelming that even C++ now specifies two’s complement: C++20 says that signed integers must use it. This was recently voted in. I haven’t read the paper recently and don’t remember if it changes the UB rule here or not, but it does define the representation of integers, which is closely tied to the motivation for the UB here.
I was holding out for hardware-enforced overflow and underflow checking, but I guess it has been abandoned. Thanks for the info. These choices are being made in the opposite direction from the one taken in the C standard deliberations... I am interested to see how it will work out.
The way we’re dealing with that is that there’s room in the spec where we’re allowed to turn it to “always checked” as a compiler setting, and will if it ever gets good enough hardware support. We’ll see though!
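As an aside, and separate from the spec-level change described above: individual Rust projects can already opt release builds into overflow checks today via a Cargo profile setting (this flag is real; whether it fits a given project's performance budget is another matter):

```toml
# Cargo.toml: enable integer overflow checks even in release builds
[profile.release]
overflow-checks = true
```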
It's not about the speed being a benefit per se, but rather about it not being a detriment. One of the big reasons people still use C over other languages is that C is perceived to be faster (and indeed usually is).
I would, first of all, call bullshit on this claim. A language can't be "faster than C". Comparing a good implementation with a bad C implementation is where these claims come from.
- maps cleanly to what compilers know how to optimize best
- gives fine control over stack and heap allocation
- has intrinsics / an assembly escape hatch
- lets you specify that pointers don't alias (`restrict` in C, the default in Fortran)
- gives you prefetching primitives
You will be able to reach the performance of hand-tuned assembly (and not just C-like performance).
Case in point: I tuned my own matrix multiplication algorithms in Nim, carefully controlling register allocation, L1/L2/L3 cache usage, and vector intrinsics, to reach the speed of the assembly-tuned OpenBLAS and Intel MKL-DNN (with no assembly at all).
And matrix multiplication has decades of research and now dedicated hardware (tensor cores, EPU, TPU, NPU, ...) as this is a key algorithm for most numerical workloads.
Why not?
For instance, with current hardware, good use of SIMD is critical. If you can't write vectorized code in (standard) C while you can do it in another language, that language is "faster" in practice.
The same goes for other language features that enable faster code: compile-time computation, a better memory model with different pointer-aliasing semantics, and so on.
Yea. It's always funny when people say C is faster than Fortran for scientific computing. I mean, it's true, but only for a very limited number of people doing serious numeric work. C/C++ certainly aren't rare in this area, but Fortran is pleasant to work with for most numeric work, has lots of helpful built-ins, and runs close to C in speed.