> To release compression code in a non-safe language is risky enough At the mome...

PeCaN · on July 7, 2016

> At the moment, what's their real alternative?

Ada? Chapel? ATS? D if you avoid the GC?

> That said, I agree this isn't acceptable C code for something that runs on untrusted data while using tons of pointer arithmetic.

Why not? It's not the prettiest code ever, but it gets the point across of what's going on.

pilif · on July 7, 2016

You're right about C. C in general, I would find acceptable, because, yes, there aren't that many good alternatives around for this kind of code.

But there's nothing stopping you from writing readable C code. That's where my concerns come from.

msvalkon · on July 7, 2016

I don't really understand where the downvotes come from? I find the readability concerns legitimate, and would like to understand why compression algorithm developers feel like this is OK? Is it just the math heavy background? Can't think of any real benefits to this style.

throwaway2048 · on July 7, 2016

If you don't understand the underlying mathematical algorithms it's using, no amount of explicit variable names is going to help you. If you do, the concise structure makes things straightforward. The code is not meant to be read alone and understood; the papers published along with it need to be understood first.

rimantas · on July 7, 2016

Exactly. Not all code can be understandable to a layman with zero effort.

dang · on July 7, 2016

I suspect the downvotes are because code readability is a complex, subtle topic that often gets reduced to flamewars by people who are sure they know ‘the’ right way to do things.

Code readability is relative to the reader, the programming language, and the conventions of a codebase. That's a lot of things to be relative to! Knowing that ought to put speed bumps on the way to dismissing code one isn't familiar with.

I remember having a reaction years ago on seeing some of P.J. Plauger's C++ standard library code. I think I burst out laughing and said I'd fire anyone who wrote code like that for me. Years of subsequent experience have brought multiple layers of understanding how wrong I was.

72deluxe · on July 7, 2016

May I ask what is not memory safe about C++ in this situation? I am talking C++11, not the widespread C++98 stuff we find everywhere.

Or why not use local variables that are guaranteed to be cleaned up?

I suspect Apple would have the alternative of writing this in Swift, but that would probably have speed implications (I am guessing).

vertex-four · on July 7, 2016

Which bits of the Rust runtime(?) do you think are too high overhead for this?

johncolanduoni · on July 7, 2016

Using Rust 1.9.0, an empty (save for a function that adds two u32s) standalone dynamic library built in release mode on OS X is 1.6 MB. A static library is a whopping 2.4 MB. The comparable numbers for C are 4K and 800 bytes respectively.

Asking every client of the compression library to pull in that much overhead would likely make it rather unpopular. Until Rust gets better at eliminating unnecessary parts of the runtime when a program doesn't use them (something like GCC's gc-sections), it's not going to be feasible for small libraries to be written in it.
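For concreteness, a library that is "empty save for a function that adds two u32s" might look roughly like the sketch below. This is an assumption about the shape of the test, not the code that was actually measured; the name `add_u32` and the choice of wrapping arithmetic are hypothetical.

```rust
/// Adds two 32-bit unsigned integers using the C calling convention.
/// Wrapping arithmetic is used so overflow is defined instead of panicking.
///
/// To be callable by name from C, a real export also needs the no_mangle
/// attribute (written `#[no_mangle]` on older editions and
/// `#[unsafe(no_mangle)]` on edition 2024). It is left out here so the
/// sketch compiles unchanged on any edition.
pub extern "C" fn add_u32(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}
```

The function body itself compiles to a handful of instructions, so the sizes quoted above come almost entirely from what gets linked in alongside it.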

vertex-four · on July 7, 2016

My understanding is that the majority of that is jemalloc, libbacktrace, and glibc. jemalloc can be replaced with system malloc easily, libbacktrace can be removed by setting the compiler to interpret panics as aborts (which you need to do in a library used by C anyway, really), and glibc can be replaced with musl. This can bring binary size down to about 160 KB for a binary which just does println!(). Still not quite as good as C, but a lot better than default Rust with just a few tweaks.
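As a rough sketch of the allocator swap on a newer toolchain: this assumes the `#[global_allocator]` attribute, which was stabilized well after this thread and is not the mechanism available in 2016. Later releases also made the system allocator the default, so today the attribute mainly serves to make the choice explicit. The `squares` helper is hypothetical and only exists to exercise the heap. Turning panics into aborts is a build setting rather than code (`panic = "abort"` in the Cargo release profile).

```rust
use std::alloc::System;

// Route every heap allocation in the final artifact through the platform's
// malloc/free instead of a bundled allocator.
#[global_allocator]
static GLOBAL: System = System;

/// Hypothetical helper that allocates, so the allocator is actually used.
pub fn squares(n: u32) -> Vec<u32> {
    (0..n).map(|i| i * i).collect()
}
```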

steveklabnik · on July 7, 2016

> (which you need to do in a library used by C anyway, really)

You can also use panic::catch_unwind at the boundary; it depends on what you want to do.
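A minimal sketch of that approach, assuming a hypothetical `decode_len` routine that panics on bad input and a hypothetical C-facing wrapper that reports failure as -1:

```rust
use std::panic;

/// Hypothetical internal routine: panics when the header byte is invalid.
fn decode_len(header: u8) -> usize {
    if header == 0 {
        panic!("invalid header");
    }
    header as usize * 4
}

/// C-facing wrapper. A panic is caught here and turned into an error code
/// so that it never unwinds into the C caller. A real export would also
/// carry the no_mangle attribute, omitted here for edition portability.
pub extern "C" fn decode_len_c(header: u8) -> i32 {
    match panic::catch_unwind(|| decode_len(header)) {
        Ok(len) => len as i32, // at most 255 * 4, so the cast cannot truncate
        Err(_) => -1,          // the Rust side panicked
    }
}
```

This only works when panics unwind. If the library is built with panics set to abort, there is nothing to catch and the process terminates instead.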

nmjohn · on July 7, 2016

> A static library is a whopping 2.4 MB. The comparable numbers for C are 4K and 800 bytes respectively.

It is possible to significantly optimize that number [0]. Not that binary size is not an issue, but rather that 2.4 MB vs. 4 KB is not an apples-to-apples comparison.

[0]: https://lifthrasiir.github.io/rustlog/why-is-a-rust-executab...

johncolanduoni · on July 7, 2016

The article you linked is doing all of this with a binary. Last time I tried something like this with Rust, there were a lot more obstacles to cutting down this overhead with a library than a binary. Also note that towards the bottom he cuts out libstd, which loses any form of dynamic memory allocation, as well as a significant chunk of Rust's usability advantage.

The biggest factor however is that you have to read through that whole page, use unstable features (alloc_system) that condemn you to the nightly, and download and compile musl. This is a huge, brittle pain at the moment, and far from obvious to anyone who comes upon Rust and is thinking of building a C-compatible library using it.

steveklabnik · on July 7, 2016

> as well as a significant chunk of Rust's usability advantage.

What bits are you thinking of here? Just curious, as I do a lot of no_std work, and don't feel that way, and am probably blind to it :)

Rust 1.10, coming out later today, has a new crate type that removes Rust-specific metadata for dynamic libraries, by the way, making them a bit smaller for this kind of case.

johncolanduoni · on July 7, 2016

Unless I'm mistaken, no_std means no built-in non-manual dynamic allocation (Box, Rc, etc.), unless you use "extern crate alloc", once again requiring the nightly. Some fundamentals one expects from a modern language like Vec are also missing in either case.

This is fine if you're using no_std for something where these are anathema anyway (writing bare-metal OSes comes to mind) but a huge limitation for a humble user-space library. As it stands if you want to take advantage of Rust's safety you're going to need to reimplement at least Box, probably Vec, and Rc if your program requires that kind of thing. This isn't a huge time suck, but if I were feeling out C-compatible languages before writing a library it would be a major turnoff.

I really like Rust for low-overhead binaries but it is missing a lot when it comes to writing non-rlib libraries.

steveklabnik · on July 7, 2016

Ah, I see. There are two things here: first off, I'm using it in an OSdev context, so I don't expect any allocation to exist, since I haven't actually implemented that yet. And second, I took your comment to mean the language itself, which doesn't lose anything with no_std, but you mean the convenience of the libraries, which makes sense.

By the way, you _can_ reintroduce just those things if you want to. no_std means "don't include std", but you can then require them:

    #![feature(alloc)]
    #![feature(collections)]
    #![no_std]

    extern crate alloc;
    extern crate collections;

    use alloc::boxed::Box;
    use alloc::rc::Rc;
    use collections::vec::Vec;

    pub fn foo() -> Box<i32> {
        Box::new(5)
    }

    pub fn bar() -> Rc<i32> {
        Rc::new(5)
    }

    pub fn baz() -> Vec<i32> {
        let mut v = Vec::new();

        v.push(5);

        v
    }

Of course, as you can see, the facade crates are largely not stable, so doing this on _stable_ Rust isn't quite there yet, which is a thing that matters, as you originally pointed out. I expect this stuff to stabilize as Rust grows. After all, the std versions are re-exports of these, so this example is de facto stable, other than maybe the 'use' lines, which is an easy fix in the future.
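For readers on a newer toolchain, here is a sketch of how the same three functions look once the `alloc` crate is usable on stable. It assumes a release where `alloc` is stable and `Vec` lives under `alloc::vec` (the separate `collections` crate no longer exists), so it is not what was possible on stable when this was written.

```rust
// A real library would put `#![no_std]` at the top of its crate root. It is
// left out here so the snippet also compiles inside an ordinary test harness.
extern crate alloc;

use alloc::boxed::Box;
use alloc::rc::Rc;
use alloc::vec::Vec;

/// Heap-allocates a single integer.
pub fn foo() -> Box<i32> {
    Box::new(5)
}

/// Reference-counted integer.
pub fn bar() -> Rc<i32> {
    Rc::new(5)
}

/// Growable vector holding one element.
pub fn baz() -> Vec<i32> {
    let mut v = Vec::new();
    v.push(5);
    v
}
```

No feature gates are needed in this form, but a truly no_std build still relies on the final binary to supply a global allocator and a panic handler.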

Thanks for elaborating :)

johncolanduoni · on July 8, 2016

Yes, my issue is just that creating a small library with a C API requires a lot of machinations that are going to turn off anyone who isn't really set on going with Rust. For OS development, standalone binaries, or Rust libraries, I think Rust is in excellent shape as it is.

chillydawg · on July 7, 2016

I suspect any runtime, at all, would be too much overhead.

jdub · on July 7, 2016

This C code doesn't use the C library. Code like this in Rust wouldn't engage with any part of the Rust standard library either. There's no runtime work to be done, in either language.

johncolanduoni · on July 7, 2016

There is still an overhead in terms of executable size, unless you use #![no_std]. This environment is not easy to code in; it doesn't even have heap allocation.

vertex-four · on July 7, 2016

There isn't a Rust runtime, though, in the sense that you seem to be implying. There's a standard library, which is what I assumed they meant.

johncolanduoni · on July 7, 2016

Yes, I was using it to describe the standard library. C and C++ are both described as having a runtime, and Rust has one in the same sense. In any case I would consider the code needed for handling stack unwinding to be worthy of the name runtime (small though it may be).

I understand it doesn't have a runtime in the way a JIT'ed or GC'ed language has a runtime, but it's a runtime nonetheless.