Re: Integrating "safe" languages into OpenBSD? (2017) (marc.info)
319 points by xnyt 34 days ago | 389 comments



This is less "On Rust" and more "On accepting a rewrite of any tool into OpenBSD, on the merits of memory safety alone".

There isn't really much of a statement or judgment on Rust. At most there's an interesting point on its value proposition:

> However there is a rampant fiction that if you supply a new safer method everyone will use it. For gods sake, the simplest of concepts like the stack protector took nearly 10 years for adoption, yet people should switch languages? DELUSION.

This is mostly true. Developers in particular don't generally care about security, so selling Rust as a "secure" language is not going to be enough. I've said this since 1.0. But it's not entirely true for products, which often drive development - Chrome's pairing of performance and security led to tons of marketing wins.

Given that tools like "cat" etc are:

a) Not generally security sensitive

b) Target developers

I don't see anyone choosing the "rust cat" over the builtin. This is why people build tools like "bat", which aren't just memory-safe copies but memory-safe tools that add features and target slightly different use cases.

Not much else to get from this post, I think.


Yeah. Not sure about BSD, but I was wading into building the GNU coreutils and other GNU packages just yesterday. Fresh hell, they all seem to be build dependencies of each other. ‘sed’ is its own build dependency.

The whole C ecosystem is a joke with respect to builds—all dependencies are implicit; you’re just expected to have the exact dependencies installed on your system at the exact versions and in the exact locations on the filesystem that the build tooling is willing to look. Autotools and CMake are absolutely terrible; they make Python’s build ecosystem look sane and well-designed.

So no, “security” isn’t the most compelling use case (for me, anyway); it’s moving past these dumpster-fire build systems as quickly as possible so mere mortals can build their own software. Specifically, hasten the day when my entire application dependency tree doesn’t bottom out in some Autotools, CMake, or shell-script project.


> Yeah. Not sure about BSD, but I was wading into building the GNU coreutils and other GNU packages just yesterday. Fresh hell, they all seem to be build dependencies of each other. ‘sed’ is its own build dependency.

This is an area where the BSDs are pretty good. There may not be a dependency system here exactly, but because XBSD base builds XBSD base, everything you need is in one checkout. Generally with just (BSD) Make driving the build.
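From memory, the whole dance is roughly this (a sketch of the release(8) procedure, not verbatim):

  # inside a checkout of the OpenBSD source tree
  cd /usr/src
  make obj      # create the per-directory object directories
  make build    # build and install the base system, driven by BSD make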

Programs outside of base are generally built with Portfiles, which specify dependencies such that (BSD) Make can satisfy them. But somebody has to write the Portfile, and it's often not the authors of the program.


Agreed on a hundred thousand.

I used to feel sheepish defending npm and the whole node package ecosystem against its critics, but once I started trying to run deep learning applications or WebRTC media servers I quickly realized that some of the critics are probably coming from a much worse package management system that they’ve merely grown familiar with.

Want to have two pieces of software on one computer that rely on... idk, two different versions of libwebsockets? Yeah, you’ll need either docker/overlayfs, or some filesystem or pkgconfig hack that I’m sure a comment reply will helpfully mention, if you want to have two versions of a library installed for two different executables.


"two different versions of libwebsockets"

If your programs depend on different MAJOR versions of libwebsockets and libwebsockets' build system doesn't easily allow that to happen then complain to the developer of libwebsockets about it. If the developer of libwebsockets is not interested in solving this problem then complain to the developers of the programs which depend on libwebsockets that they should find a better library which doesn't suck. If they don't listen, rewrite the tools you depend on to not suck.

If, on the other hand, the programs depend on different minor versions of libwebsockets and libwebsockets broke API in such a way that just using the highest common version doesn't work then complain to the developer of libwebsockets that they should stop doing terrible things like breaking API and then complain to the developers of those applications that they shouldn't let this kind of breakage slide and should move to a different library.

If you think this is unrealistic then I think your push towards insane let's-pull-all-the-dependencies-into-one-binary/package build systems is unrealistic.

You want a sane build system? The change starts when you stop tolerating idiots who cause things not to work.

We don't tolerate engineers who build crappy bridges which break and kill people. You shouldn't be tolerating programmers who build crappy libraries which break API or don't allow two major versions to be installed and linked against simultaneously.


My issue is that the default behavior is to install these libraries globally, even when they could probably be installed locally for the thing you’re trying to build. (I picked libwebsockets out of the air, I don’t necessarily have a complaint with that particular library)

On my old MacBook Pro I have dozens of node projects that I’ve downloaded over the years just to mess around. If I ever want to clean up, I can delete a project’s folder, and know that everything it installed is now gone (because it would be in its node_modules folder).

On my linux desktop if I want to mess around with a cool project that’s written in C/C++ I immediately pull out Docker and start trying to build up an image, since I don’t want to pollute my global library folders with libraries that I might be downloading just to try out this one thing. I can then delete the docker image if I want to clean up. This works, but it feels like a workaround to the main problem of libraries being global.


My experience has been that, for sane libraries, the configuration process supports installing to an arbitrary prefix. In general, I end up making a new prefix for every project. And if doing this turns into dependency hell, shouldn't that make you think twice about your dependencies? Worst case, you find out you need a different libc, and then you start questioning your life decisions.
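For an autoconf-style project that looks roughly like this (a sketch; the prefix path is arbitrary):

  # build and install a dependency into a project-local prefix
  ./configure --prefix="$HOME/prefixes/myproject"
  make && make install

  # point a dependent build at that prefix (assuming it ships pkg-config metadata)
  export PKG_CONFIG_PATH="$HOME/prefixes/myproject/lib/pkgconfig"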


The benefit of global libraries is that they are easier to update or patch. If there is a vulnerability in an npm package, you will need to update it in all your node projects (and/or rebuild all your docker images analogously).

A hybrid solution where the package manager is aware of all installed versions, can symlink to a globally installed copy and can update all symlinks would probably work.

Unfortunately, there is very little interest in fixing the problem on both sides (distro PMs and language PMs) since they both believe their own approach is right and don't care about the other group's problems.


yarn uses a global cache and symlinks into it, if I remember correctly.

and npm has an "audit" feature, which is on by default, so there's quite a bit of movement in JS land to address the disadvantages of having a myriad of small packages.
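For what it's worth, the workflow is just (sketch):

  npm audit        # list known vulnerabilities in the dependency tree
  npm audit fix    # upgrade vulnerable packages within their semver ranges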


It's not about the size of the packages, it's about managing an entire system, for whatever definition of the word "system".

In the "olden" days, you would patch the vulnerable dependency and be sure that all services running on the machine have been updated to the new version. (in theory - if everything works great and there is ABI compatibility etc etc).

Since the library is installed in a well-known system location, there are only a few paths in the filesystem that you would need to check to make sure it's up to date. This makes machine maintenance easier.

Nowadays, to patch a vulnerable dependency you need to potentially go to all individual projects, check whether they depend on it, update the version for that project, make sure to rebuild any containers that... contain it. Rinse, repeat.

There is also not a single place where the dependencies live, so you can't just run the same command across multiple machines to do the check - you have to check the directory structure of every individual service.

This can be very slow and painful.


Agreed. Yet it seems the ecosystem simply accepts this trade-off. For increased productivity, increased memory-management security, etc.


You seem to be confused about how typical C/C++ projects are built. If you need globally installed dependencies, you're going to have to install them yourself. A typical makefile is not going to download a long list of random junk from the internet and install it on your machine without asking. Although you'd be forgiven for thinking so if you're used to npm.

If you don't like where the project installs its binaries or libraries, you can configure that before `make` or `make install`. You can install it to your home directory, have a separate directory tree for experiments, or whatever you want, and clean it up whenever it is convenient. You can also install things in /usr/local and just ignore them. The system is actually well designed if you learn how to use it.

If you need a dependency, use your package manager to install it and you can trivially delete it later... and if you try five different frontends for the same library, you only install one copy of it.

Docker is a workaround for a lack of Unix awareness and sysadmin experience.


The system is not well designed. For example: if I have 3 projects I'm developing locally, each needing a dozen (somewhat overlapping) libxxx-dev dependencies to build, how can I automatically clean up the dependencies that are no longer needed when I delete one of the 3 projects?

Distro package managers are focused on providing a good experience for sysadmins while ignoring needs during development time. Unfortunately, docker and the developer-centric improvements in language package managers have come at the expense of worse sysadmin experience.

The situation will remain sub-optimal until both groups start working together to address both needs.


If you've got some leftover libraries in /usr/local/lib on your development machine because of an experiment, why is that the end of the world? If you're worried about it, you shouldn't be.

The worst thing about Docker is that people think learning Docker is a substitute for basic competence as a sysadmin.


I'll put it in context.

Let's say I have leftover garbage in /usr/local/lib. I clone a new project that I need to patch because it's broken. I try to fix the new project but the fix isn't working. I have no idea why. I spend hours trying to trace down the problem.

What was the issue? The project picked up some of the old garbage libraries I had in /usr/local/lib, which had a bug that made my fix not work.

For developers, global dev library garbage = global mutable state. It interferes with development, creates more sources of potential issues, and takes time to trace and debug. A clean and/or localized environment is essential for the accurate pinpointing of bugs and issues.


> I spend hours trying to trace down the problem.

This is the problem. Once you understand how library loading works on your system, and learn, once, how to query it and see which libraries are being loaded and from where, you won't have to waste hours on this problem again. Until that point, you're operating a system without understanding, and that will always lead to breakage.
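On a glibc-based system the querying boils down to a couple of commands (sketch; ./mytool is a stand-in):

  ldd ./mytool                    # which libraries resolve, and from where
  LD_DEBUG=libs ./mytool          # trace the loader's search at run time
  ldconfig -p | grep websockets   # what the loader cache currently knows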


Of course I understand how it works. I just can't constantly keep in mind what garbage I played around with 3 months ago that might be responsible for the issue I'm looking at right now. I don't expect it, and more importantly, I shouldn't have to expect it.

It's a basic tenet of development. Eliminate all additional sources of issues and complexity by presenting a clean, well-controlled sandbox in order to effectively diagnose it and fix the real bug.

In fact, you should be able to try a newer version of a dependency, an older one, or run 10 variants (containers) each with different versions to e.g. test whether the project you're working on is widely compatible with a flexible range of that dependency. These things are done all the time.

The tools are inadequate, not the developers. Engineers on both sides need to stop assuming everyone else is an idiot.


I've installed hundreds of pieces of software from source, and I have a fairly decent idea of how linking and loading works. I still have to spend time figuring out why something doesn't build when it should due to something strange about my system.


As someone who’s spent a lot of time as a sysadmin over the years, your last comment sounds backwards to me. The DevOps trend has pushed more general awareness and Docker has played a key role in three ways: making explicit automation requirements for the setup process (Dockerfiles have to work, unlike that MS Word document someone last updated 3 years ago), being private by default and requiring someone to reverse engineer the system to learn which things should be shared, and being immutable means that teams are forced to document shared state and backups rather than leaving messes behind for ops to deal with.

The key to understanding the benefit is the power of having done this in a standard, widely used way which is supported by many tools, used by an increasing fraction of projects, and which makes knowledge and training broadly portable. Some shops had some of those benefits but most fell short in multiple areas, ran into arguments with vendors / contractors, or required substantial ramp-up time for anyone new. Distribution packaging solves some problems very well - there’s a reason almost everyone uses it inside containers - but it has well-known problems with conflict resolution and coordination which are not easy to solve, tending to lead to inflexible policies and restrictions. Having a standard answer for the boundaries between generic system-level and application-specific concerns has been a great way to improve both sides while saving a good deal of time.


> The key to understanding the benefit is the power of having done this in a standard, widely used way which is supported by many tools, used by an increasing fraction of projects, and which makes knowledge and training broadly portable.

You're talking about Makefiles here circa 25 years ago, right?

Makefiles have to work, and /bin/sh is still there last I checked on my machine. The problem with the broadly portable knowledge and training is that many people seem to think "don't bother with all that, just use docker" is an acceptable answer and "best practice" is not to learn Unix because it's not new and shiny.


Saying makefiles solve that is like saying screws solved construction. They partially address only one of the concerns I mentioned, doing nothing for the others, and as the numerous tools in that space (automake, cmake, etc.) show they’re hardly a complete solution even for just the challenge of building things.


> Makefiles have to work

And yet I get GNU Makefiles all the time that don't work when I type "make" at my computer, since they use things that aren't compatible with my version of Make. It's clearly more of a problem than you think it is.


It sounds like you just want a good package manager.

On my system I have built lots of random projects to "mess around" and to achieve the same effect of cleaning them up I can just make a list of all the packages I explicitly installed to make them work (including packages which I already had) in a simple flat file. I can then feed that file into my package manager to delete only packages which aren't used by anything else.
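On a Debian-style system that could look something like this (a sketch; deps.txt is an ad hoc list of package names):

  xargs sudo apt-get install < deps.txt   # install the project's deps
  xargs sudo apt-mark manual < deps.txt   # record them as wanted

  # when the project is deleted, release the marks and sweep
  xargs sudo apt-mark auto < deps.txt
  sudo apt-get autoremove                 # drops packages nothing else needs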

Now, I can hear you say: "this isn't practical, these projects depend on 400 dependencies". The question I would then ask is why it has become commonplace for a "dependency" to sometimes equate to 20 lines of code. For these "dependencies", sure, I think maybe some kind of local storage is appropriate, but I think that this is already solved in languages like C using git submodules. I don't personally like this approach myself; I would rather people just flesh out a standard library or other common libraries than produce a 20-line snippet that everyone includes in their project through submodules. But the fact of the matter is, this is still an option.

The other thing I should point out is that your docker image is surely going to take up an order of magnitude more space than just installing a few libraries and keeping them around. This in turn means that you could have just kept an order of magnitude more projects in the same amount of space. And finally, the approach of putting everything in one directory is a big hit in terms of disk space. I don't know much about node or JS but I have to assume that the language is still primarily text based. Keeping around hundreds of dependencies written in text (even with minification but does that happen with node?) is going to take up a lot more space than a few shared objects. So in an ecosystem where everyone has decided that disk space isn't so important, what is the point of easy cleanup?

Given the wording you use by "pollute" I assume you literally mean having those files there in the first place. But I don't understand what the problem would be with that. Especially when most projects use dependencies you're most likely already going to have installed. As long as you don't work against your package manager, nothing should ever break as a result of having too many packages installed.


> Want to have two pieces of software on one computer that rely on... idk, two different versions of libwebsockets?

You've pointed this out yourself, but this works just fine in any sensible OS. You just have two or more versioned instances of libwebsockets in your /usr/lib directory, and a symlink (or multiple symlinks with different partial-semvers) that points to some appropriate default. I think even Windows is doing something like this nowadays, as part of the whole WinSxS mess.
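Something like this, illustratively (made-up version numbers, not real ls output):

  $ ls -l /usr/lib | grep libwebsockets
  libwebsockets.so -> libwebsockets.so.17   # dev symlink: the "default"
  libwebsockets.so.16                       # old major, kept for old binaries
  libwebsockets.so.17                       # current major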


Why does a "default" need to exist at all at the system level? Why not just have various versions available and have version decisions be local to the software being developed/used.


It doesn't, strictly speaking, and in fact, it barely does.

It's usually just a symlink, libwhatever.so, pointing at libwhatever.latest.version.number, on the theory that you probably want the latest version of the library (with all its improvements and bugfixes).

There's sometimes an intermediate symlink for each major version too (e.g. libwhatever.2.so pointing at the newest 2.x), to avoid breaking changes.


It doesn't. Just don't install it if you don't want it.


Security. By having defaults, the operating system can share the most secure minor version of that library (or multiple major versions if necessary). When every program has its own copy of a library, most of your programs will end up using obsolete or insecure dependencies.


Plus it also cuts down on disk space usage.


There are a lot of package ecosystems, but even more packages, and one of these packages will always depend on something implicit.

Building a new system is not going to help. Xkcd 927 about standards applies


I agree that we shouldn’t try and wrangle npm, gems, cargo, etc. into one package manager. I have no problem with different communities having different package ecosystems.

My main problem is that when it comes to C/C++ dependencies, the package ecosystem defaults to installing globally and has no standard way to specify dependencies aside from having instructions to “sudo apt install these 20 libs”.


> Want to have two pieces of software on one computer that rely on... idk, two different versions of libwebsockets? Yeah, you’ll need either docker/overlayfs, or some filesystem or pkgconfig hack that I’m sure a comment reply will helpfully mention if you want two have to versions of a library installed for two different executables.

Just set the correct flag / env variable. That's not called a "hack", that's called knowing how to use a computer.

If you do not know how to compile and link a hello world in C, then indeed, you should probably stay with a hello world in npm/JS and its ~1500 package dependencies.
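For the record, the non-hack version is roughly the following (a sketch; the /opt paths are assumptions):

  # bake a specific install of the library into the binary at link time
  cc -o app app.c -I/opt/libwebsockets-3/include \
      -L/opt/libwebsockets-3/lib -lwebsockets \
      -Wl,-rpath,/opt/libwebsockets-3/lib

  # or steer the dynamic linker per invocation instead
  LD_LIBRARY_PATH=/opt/libwebsockets-2/lib ./other-app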


> Fresh hell, they all seem to be build dependencies of each other. ‘sed’ is its own build dependency.

Except that people have run into the exact same issue with stuff like rustc and cargo. Bootstrapping an entirely new platform is just hard, no way around it.


Comparing coreutils to a self-hosting compiler is a strained analogy. Bootstrapping a compiler is about as hard as it gets; there’s no reason coreutils needs to depend on itself.


> Comparing coreutils to a self-hosting compiler is a strained analogy.

What? Why? Unix is a self hosting environment. You can't even boot the OS you're building on without coreutils or the equivalent.


There's "no reason" why the rustc build should depend on cargo, but apparently it does. And this makes it way harder than it should be to keep rustc support up-to-date in any distro, which in turn impacts stuff like Firefox updates.


> There's "no reason" why the rustc build should depend on cargo,

rustc is a Rust project. Cargo is a build system for Rust projects.

> this makes it way harder than it should be to keep rustc support up-to-date in any distro

I am not aware of distro maintainers complaining about this, or at least, not any time in the last few years. Do you have some context to share? We want distros to have a good experience.


Debian unstable (IIRC) updated Firefox a couple weeks late because cargo got a couple of new dependencies and nobody added them to the Debian repos. This is because they forbid static linking, so there are hundreds of packages of popular Rust crates compiled as shared libraries. Even worse, some of them have multiple variants, for various combinations of features.


Sure. This is one instance of this happening, and it was fixed; it just took some work. Nobody even filed an upstream bug. This isn’t something exclusive to Rust. It’s arguably more due to the way that browsers are treated as an exemption from distro packaging policies (which I appreciate, personally).


That’s a Debian issue. They’re trying to foist a dependency management strategy intended for C on a different language with a different strategy


> Fresh hell, they all seem to be build dependencies of each other. ‘sed’ is its own build dependency.

Well, they are the "core" utilities after all. This becomes less of an issue the further out in the ecosystem you get.

> at the exact versions and in the exact locations on the filesystem that the build tooling is willing to look.

This isn't strictly true... you just need _compatible_ versions to be available, and most build tools, and even the compiler itself, will check several commonly used locations for code.

There are a few corner cases, but by and large, this isn't as big a problem as it might be for other languages that make writing incompatible interfaces orders of magnitude easier than with C.
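You can even ask the toolchain where those commonly used locations are (GCC/binutils sketch):

  # header search path the compiler will actually use
  gcc -xc -E -v /dev/null 2>&1 | sed -n '/search starts here/,/End of search list/p'

  # directories the linker searches for libraries
  ld --verbose | grep SEARCH_DIR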

> Specifically, hasten the day where my entire application dependency tree doesn’t bottom out in some Autotools, CMake, shell script, project.

Each of those solves the same problem at a different level. Perhaps this is because there is no _one true_ way to replace them, and the advantage is that each project uses the tool most suited to its case.

Perhaps the built-in tools with these "modern" languages are a good general tool for the majority of developers, perhaps they aren't. If they aren't, then now you have the same problem all over again in a new ecosystem.


I don’t know man, I’ve written a lot of C and C++ and other languages and the sheer effort to get your average C project to compile is much greater than others. The dependencies aren’t listed anywhere, they are rarely compatible with the versions available via your package manager (so now you’re building more things from source), ./configure regularly runs into problems (“couldn’t guess which platform you were on”, “your version of grep is incompatible”, etc). It’s just a big mess of yak shaving. If you think I’m making too big a deal, I invite you to spend some time in the Go or Rust ecosystems where you just “cargo build” or “go build” and you’re done. You don’t need to understand anything about the project, there are no shitty DSLs to grok (looking at you, CMake). Things just work.


With CMake you can build statically in a very reproducible cross platform way if you use Hunter and toolchain files.

The tools are all there; they're just not popular and aren't widely used.

I wrote a bit of an old intro to the workflow: https://geokon-gh.github.io/hunterintro.html

This is an example of a project that's well setup using this pattern: https://github.com/elucideye/drishti

And having things work dynamically... well, that's what package maintainers do. It seems inherently very fragile. Developers only build, test, and develop against whatever dependency versions they have locally. You can't test against every version under the sun.


Regarding build tools for C, you might find this one interesting:

https://github.com/vmchale/cpkg


Yeah, there are lots of good build systems for C (didn’t know about this one; will have a look; thanks for sharing), but the problem is that they have tiny adoption, especially among “core” ecosystem packages, like coreutils and friends. The problem isn’t technical, it’s political and cultural. This ecosystem is hostile toward its users for no discernible reason.

If anyone knows of a distribution that aims to use only software with sane build systems, let me know, I would be very curious.


Bootstrapping and too many binary dependencies at the early stage is a known problem in GNU Guix and there are people working to fix it, such as the GNU Mes project to bootstrap the system using a minimal compiler, shell and set of utilities written in Scheme. Despite this, there are still many steps to bootstrapping a full system and the complete process is not simple.

It's not for "no reason", in general you will not be able to get away from having a complicated build system at the lowest level. Otherwise you would need to be prepared to give up on nice things like cross-compiling and supporting more than one compiler, CPU architecture, etc.


Actually, focusing on cross-compilation usually forces the bootstrapping to be better.

The problem is most distros / deployed systems are managed and bootstrapped imperatively, so there is nothing pushing back against complicating the bootstrapping. Thus, entropy wins and it gets more complex.

As someone who has actually started reproducing BSD bootstraps declaratively (https://github.com/NixOS/nixpkgs/pull/82131/files), they are way more entangled than they need to be, so Theo de Raadt has no ground to stand on.


Nix is becoming more of a name in dependency management systems: https://blog.galowicz.de/2019/04/17/tutorial_nix_cpp_setup/

Its claims to fame are support for multiple languages, reproducible builds, and system-wide package caching.


Yeah, Nix is really neat. It's the right direction for sure, but it's mostly used as a wrapper around autotools and cmake projects by third parties and not so much as a first-class build tool by the maintainers of C/C++ libs (this isn't going to change, because Nix prefers to be "innovative" and unfamiliar in just about every design decision; its core approach to reproducible builds is right on, but the execution leaves a lot to be desired if you want to convince a community of developers to adopt it as a build tool). So anyone writing a Nix expression to package an autotools or cmake project (me, in this case) still has to deal with their nonsense.


Nix is a package manager, not a build tool. It's never been in-scope for it to figure out how to invoke a particular compiler or linker. All it can do is make it easier for you to invoke another build tool, of which there are quite a few.

I would not recommend the use of autotools in new projects, but in my experience CMake with Ninja is fine for distro maintainers to deal with; all other modern build tools I've seen are roughly equivalent to it. If you find that it's hard to get over the hump with that, then I hate to say it but I don't think solutions will come easily elsewhere on any OS or platform. The state of things on free software operating systems is constantly improving, but for now it's something you have to get used to. There are no silver bullets here.


> It's never been in-scope for it to figure out how to invoke a particular compiler or linker.

True

> Nix is a package manager, not a build tool.

False. Nix should be the go-to tool for executing and caching all dependency graphs. Just the planning of those graphs is domain-specific and out of Nix's purview.

Build systems that use Ninja actually get the execution-planning separation right. (Try Meson if you hate CMake.) So basically, we just need to make Nix better than Ninja in every way, and then hook it up to these build systems (directly or with ninja2nix).

Of course, then we will demonstrate all the impurities and non-determinism in people's CMakeLists. That alone is the cultural problem to solve for normie evangelizing. Everything else is technical, and there's no use bemoaning that Nix is just too weird.


> Nix is a package manager, not a build tool. It's never been in-scope for it to figure out how to invoke a particular compiler or linker. All it can do is make it easier for you to invoke another build tool, of which there are quite a few.

I agree that Nix is focused on package management and not building, and that’s a bummer because Nix is perfectly capable technically. In particular, Nix can trivially invoke a compiler or linker; its issues are that it assumes that it will only be used by a relatively small number of package maintainers instead of a much larger number of developers, so it prefers to be novel and innovative instead of familiar and straightforward. This makes it really hard to get developers to buy in, but there’s no technical reason why it can’t be used as a build system.

> I would not recommend the use of autotools in new projects, but in my experience CMake with Ninja is fine for distro maintainers to deal with — All other modern build tools I've seen are roughly the equivalent of it. If you find that it's hard to get over the hump with that, then I hate to say it but I don't think solutions will come easily elsewhere on any OS or platform. The state of things on free software operating systems is constantly improving but for now it's something you have to get used to. There are no silver bullets here.

The silver bullets are getting away from the C ecosystem to the greatest extent possible, otherwise you will have to build way more autotools and CMake projects and this isn’t a good use of anyone’s time. As fun as C is, for most applications, we have better alternatives nowadays. They aren’t perfect, but they let you move a lot faster especially with respect to building software.


[flagged]


The reason most of these things aren't replaced is sheer inertia. You're asserting they got this position due to merit. I assure you, spend any appreciable amount of time working with Autotools, or any of the "traditional" Unix build systems, and you'll find their success in large open source projects is only due to

1. Actually being better than the hodgepodge of shell/sed/awk/perl/etc scripts that people used before Autotools came around

2. The massive amount of work done by people to fix all of the corner cases where Autotools didn't work

I think you're maybe looking at them with rose colored glasses. When you've spent hours trying to track down obscure, undocumented `./configure` options, dealing with vague linker and compile issues, implicit, circular or broken dependencies, buggy and arcane configuration scripts... I can go on and on and on about the problems of building open source software. Building things from source is a yak shaving exercise -- always chasing your tail. It's great when you're using a configuration that has been tested and mostly fixed up, but step outside of those parameters and it can be hell.

It's not that newer build systems don't have their own issues, they just try to address the historical problems that people have experienced from using things like Autotools for decades. You can scoff all you want, but no sane individual is going to use Autotools as their modern build system. They're going to use something like CMake, or Bazel, or their language's build system.


> It's not that newer build systems don't have their own issues, they just try to address the historical problems that people have experienced from using things like Autotools for decades

I totally agree with what you said. And I am not defending Autotools here. I would even advise any new person here to NOT use Autotools if they can, and stick to something more modern (CMake, Meson, or others).

There are, however, many reasons why C build systems are what they are (heterogeneity of platforms, multiple compilers, ABIs, multiple standards, and multiple build systems themselves to deal with) that many other languages do not have (or not yet?).

And it would be good to understand that before calling them a dumpster fire because they are not as convenient as your cargo install.

This is not helping and deeply counter-productive.

This is the kind of attitude that led to the situation we have currently: 100 build systems and even more problems, because almost none of them learned the lessons of the past (SCons, Waf, imake, B2, QMake, qbs, etc, etc).


"If you are not able to run a ./configure / make / make install & 3 apt-get properly."

Nearly choked on my water reading this. Incredible how many developers are Stockholm-syndromed by their godawful tooling experience.


> Nearly choked on my water reading this. Incredible how many developers are Stockholm-syndromed by their godawful tooling experience.

I tend to call a "magic bullshit bloat tool" that makes me download 350MB of crap to compile a 'hello world' a "godawful tooling experience".

We are all free to have our own definition.


You can tell me that other build tools have problems, but the "You must know how to run some os/project-specific commands to write a program or you're a bad developer" is the funniest kind of gatekeeping to me.

But keep calling people "script kiddies" that's cool.


> commands to write a program or you're a bad developer" is the funniest kind of gatekeeping to me.

I am curious of your notion of "gatekeeping", because you are the one being funny here.

People have been able to understand Make, use it, and compile C for 20 damn years. Are people more stupid today? (Spoiler: no.)

If "today", with the unprecedented level of documentation you have on the Web, it is a problem for you to do so, then yes: you are indeed a bad developer.

I am a strong believer in the "show me the code" approach. It is easy to call all the current tooling a "dumpster fire"; show me a better one, or at least document in detail what a better one would be.

Criticism is easy, and in this precise context, without arguments, it is just FUD or religious belief.


> I am curious of your notion of "gatekeeping"

Repeatedly telling people that they're bad developers, should walk away from a computer, that they are "script kiddies", because they do not know or enjoy some arcane technical artifact. I call this gatekeeping.


> enjoy some arcane technical artifacts

Except that I do not think that you can call executing "make" and using your native package manager "arcane", I am sorry.

These are basic actions that are documented even in tutorials from the '90s. They are basic actions that are generally part of any first-year computer science curriculum.

That is not 'gatekeeping'; there is no gate to keep here. I would rather call it fighting laziness and understanding your tooling.


Speaking as a C developer of twenty or thirty years or so who's used make for all that time:

Make is absolute, utter arcane bullshit. It is mad. It is complete nonsense.


> Make is absolute, utter arcane bullshit. It is mad. It is complete nonsense.

Great. And you could be a developer for 40 years; if you do not say precisely why and how, it is your comment which is arcane bullshit and complete nonsense.


I think there’s a difference between being able to use make and being able to understand make. My suspicion is that most people can’t understand the makefile for a project they didn’t write or spend significant time on.

It’s nice that there is a stable convention, but it’s a shame that it’s so dated and is missing so many modern conveniences (mostly to do with dependency handling).


Understanding make is the easy part.

Manual dependency resolution is a tedious waste of time, no matter how well you understand what you are doing. It can and should be automated.


Considering the staggering number of languages out there that have implemented their own worse-version of make, or even those that haven't and the language-agnostic tooling systems that are implementing make instead...

I hate to say this but Bazel is probably the first time we have a tool that's better than make. It took over 40 years to get here. And it's still a pain in the ass.


> Considering the staggering number of languages out there that have implemented their own worse-version of make, or even those that haven't and the language-agnostic tooling systems that are implementing make instead...

The language-specific build tools tend to be good, at least for mainstream languages introduced in the last 20 years. Certainly better than make.

Bazel and especially Nix are really interesting and directionally correct, but like you mention, they are pains in the ass. Fortunately, their issues aren’t fundamental to the category of general purpose reproducible build tools. They’re just issues of execution. And arguably Nix isn’t even trying to be a build tool as much as a packaging tool (subtle but manifestly important difference), which is a damn shame.


> The language-specific build tools tend to be good, at least for mainstream languages introduced in the last 20 years. Certainly better than make.

They are good for 'quick & dirty' dev usage. And terrible for everything else. Especially integration.

If they were perfect, you would not see things like lerna in JS (https://github.com/lerna/lerna) or Conda in python (https://docs.conda.io/en/latest/) to go around them.

> And arguably Nix isn’t even trying to be a build tool as much as a packaging tool (subtle but manifestly important difference), which is a damn shame.

And they are right to be so. Build system and Package manager should be separated, or at least should be usable independently.

If your environment is multi-language or system oriented, there is no practical way you can get a tool that reliably does both and scales. It would imply rewriting the language-specific builder of every language.

I would say that in an ideal world, we would have language specific build system and one generic & portable set of dependency resolution tools (package managers).

In our current world, every language is its own isolated island and making them interoperate is a mess.


Yes! It’s ridiculous how modern languages all try to do package management too, and how they’re all mutually incompatible.

I’m not so sure about language-specific build systems, though. Trying to set up a custom asset pipeline in most build systems is a pain, when they want you to shoehorn everything into a single language rather than just calling out to external tools and scripts (ideally including tools built as part of the same project!) Bazel gets this right.


Python's build ecosystem IS sane and well-designed. It's the pinnacle of package handling so far. A separate package manager which works together with a separate build system which still works when a distribution wants to package a python package.

What build systems like cargo seem to think is awesome is probably the most backwards idea of build systems I've ever seen. Static linking and automatic downloading of dependencies. Even the ability to directly pull from a github repo. No wonder almost nothing written in rust ever gets packaged.

Dependencies are hard, following a spec like semver seems to have confused people. Developers are lazy and don't want to think about settling down their API. Solution? Just give up and encourage bad behaviour. Thanks guys. Great solution. I think I'll stick with the previous one.


> Python's build ecosystem IS sane and well-designed. It's the pinnacle of package handling so far.

Big oof. Python packaging has a lot of problems, and defaulting to a single shared location when installing, one that might conflict with your default package manager, is very high on the "What the fuck are you even doing" list of issues. You also very easily get into version hell. Unless you start using virtualenv, at which point you're basically doing the same thing cargo is.
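For comparison, the per-project dance looks like this (sketch):

  python3 -m venv .venv               # per-project library directory
  . .venv/bin/activate
  pip install -r requirements.txt     # lands in .venv, not system-wide
  rm -rf .venv                        # cleanup is deleting one directory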

> automatic downloading of dependencies

How is that a problem? Cargo has a lockfile to ensure the dependency downloaded is the exact same, various flags to prevent cargo from updating the lockfile (--frozen and --locked), and even the ability to vendor dependencies with cargo vendor[0]. It defaults to the most convenient things for developers because packages will be worked on many more times than they are packaged.
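Concretely (a sketch of the commands in question):

  cargo build --locked    # fail instead of silently updating Cargo.lock
  cargo fetch --locked    # pre-download exactly the locked versions

  # vendor all dependencies into ./vendor for offline or distro builds;
  # cargo prints the [source] snippet to put in .cargo/config.toml
  cargo vendor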

> Even the ability to directly pull from a github repo.

So does python/pip. And most package managers I've worked with. Crates.io will refuse packages that have github dependencies though, so this only comes up when packaging something that isn't really ready to be packaged.

As for linking, there are several rust-specific problems to dynamic linking:

1. Rust currently has no stable ABI (apart from the C ABI). This means updating rustc (and maybe even updating llvm) may cause the ABI to change, and libraries to become incompatible. There is currently no plan to fix this on Rust's part, and in fact some opposition, since it would prevent some optimizations from happening (such as niche-filling optimizations).

2. Generic functions and impls would not be part of this, as monomorphised variants would have to end up in the final binary. I believe C++ has the same problem with templates. A lot of Rust APIs are generic. This could theoretically be fixed (at least partially) through automatic promotion to dynamic dispatch, or other similar schemes.

[0]: https://doc.rust-lang.org/cargo/commands/cargo-vendor.html


> Python packaging has a lot of problems

Sure, but I think that what systems like cargo provide is worse across the board.

> defaulting to a single shared location when installing, that might conflict with your default package manager, is very high on the "What the fuck are you even doing" list of issues.

This is easily solved if pip were to have a way of listing all dependencies which it's about to install. Then I could mangle the names to fit my package manager's style and automatically install them with my package manager. But I will give it to you, this omission is one of the few things which pip could improve on.

> You also very easily get into version hell.

This is not a pip problem. Stop using things which result in dependency hell. Semver solved this problem a long time ago. If developers insist on not stabilising their APIs and breaking them all the time, those developers' libraries shouldn't be used.

> Cargo has a lockfile to ensure the dependency downloaded is the exact same,

Great, now I can make my software depend on an out-of-date and vulnerable version of a library. Why do you think encouraging bad behaviour is a good thing? Yes, getting versions right is hard, but it's better than the alternative of static linking and having to update 100 pieces of software when you find a critical vulnerability in <popular network library>. This gets even worse: not only does every developer have to change their lockfile to update their code, but because they used lockfiles, the upstream library developers felt it was safe to break API 100 times between when that library was locked and when it was patched. Now a simple update of <popular network library>, which would otherwise involve pushing an ABI-compatible patched version via distribution channels, requires 100 developers to all push updates to their software, which may or may not include having to rewrite portions of it.

> It defaults to the most convenient things for developers because packages will be worked on many more times than they are packaged.

Only because you allow and encourage it to happen. Which I hope I've made my point clear is NOT a good thing.

> So does python/pip. And most package managers I've worked with. Crates.io will refuse packages that have github dependencies though, so this only comes up when packaging something that isn't really ready to be packaged.

So it's good to see that this issue has been appropriately addressed. I wasn't aware that this was a restriction of crates.io. That being said, it doesn't stop everyone and their grandmother from just not using crates.io for certain things and not having their stuff on crates.io.

I wasn't aware this was possible with python but I also never see python packages using it.

> Rust currently has no stable ABI (apart from the C ABI).

Which isn't a good thing.

> 2. Generic functions and impls would not be part of this, as monomorphised variants would have to end up in the final binary. I believe C++ has the same problem with templates. A lot of Rust APIs are generic. This could theoretically be fixed (at least partially) through automatic promotion to dynamic dispatch, or other similar schemes.

I think Ada has a solution for this. Not 100% sure but it can't be impossible to solve if only rust developers made it their goal to make libraries work somehow.


Rust not having a stable ABI is not universally bad; it allows for breaking changes obviously. Swift has stabilized its ABI with generics, but its implementation is not quite zero-cost so I am not sure if Rust could go that direction.


> No wonder almost nothing written in rust ever gets packaged.

rustc is also built with cargo, and it is packaged on most distros. A lot of distros are including packages of other tools too.


Python packages don't interact well with the built in package manager on my os. It doesn't even try to, which is a major design flaw.

Now if python had a way to do this and package managers refused to interoperate (as I believe would happen) we would have a different discussion.


Pip doesn't need to integrate, it just needs to install elsewhere or give you a way of listing what will be installed. Cargo on the other hand just doesn't even bother letting you try to use it with a distro package manager.


This is pretty much a judgment of Rust:

  Such ecosystems come with incredible costs.  For instance,
  rust cannot even compile itself on i386 at present time
  because it exhausts the address space.

  Consider me a skeptic -- I think these compiler ecosystems
  face a grim bloaty future.
I would think that OpenBSD developers/users would care about security, since that's pretty much the value-proposition of OpenBSD.


Bizarre that he thinks a language or compiler is doomed unless it can compile itself on each platform it supports. Microcontrollers and cross-compilation wouldn't exist if that were the case.


I think it was more the implication that requiring more than 2GB of RAM to compile itself was an indication of its profligate resource usage.


That doesn’t seem like an especially strong argument since computing is full of time/space trade-offs and 2GB of RAM hasn’t been a significant amount for over a decade at this point (X86-64 only goes back to 2003 but it was a latecomer following Intel’s epic failure to deliver Itanium). Things like whole program optimization seem like a reasonable trade off for <$20 worth of RAM.


How much of that is indicative of rustc vs llvm? The latter isn't exactly a lightweight piece of software.


Until there is a rustc in production that doesn't use llvm to emit code in the backend, the point is moot, no?


To me the question is moot if the problem is llvm. I don't know if that's the case or not.

Modern tools require modern hardware. If the problem is that the target constraints don't allow modern tools to be used, the problem is the constraints and not the tools.

But maybe I just don't understand their stance here.


Depending on what you are working on / with, you may not be able to choose your constraints.

Constraints that you cannot change can certainly dictate what tools you have to choose from.


No, I don't think so. If LLVM is at fault then the issue is definitely not inherent to Rust the language or Rust the development culture.


Your code is what it depends on. If you are picking dependencies with no concern for the overhead they introduce, well, it says everything it needs to say: you value that dependency over the associated overhead it introduces.

Asking an OS to adopt those same preferences, without any appreciation for their needs or preferences, is pretty rude and arrogant.

A better solution would be to ask “what would it take, because I’d like to understand your needs”. Maybe then Theo could help you understand their constraints. Trying to argue with him that he needs something he clearly doesn’t isn’t very nice and won’t get a nice response if you push it too far.


You are ultimately responsible for your dependencies. No program is innocent of its dependency graph.


Interesting point.

Does anyone know how much memory is needed to compile llvm+clang? GCC?


That varies widely with your linker and how many threads you use. I have compiled LLVM on systems with 4 GB of RAM, though I was forced to fall back on single-threaded compilation.
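If memory serves, the big lever is limiting link parallelism, since linking dominates peak memory (sketch; LLVM's own CMake options, with the Ninja generator):

  cmake -G Ninja ../llvm \
      -DCMAKE_BUILD_TYPE=Release \
      -DLLVM_PARALLEL_LINK_JOBS=1   # linkers are the memory hogs
  ninja -j2                         # throttle compile parallelism too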


IDK about compiling llvm/clang, but using clang/llvm needs far less memory than gcc. I've used clang on machines with less than 256 MB of RAM.


It seems like a number of *BSD projects are generally in favor of llvm and clang, in part because they dislike GNU/FSF.

I know OpenBSD supports some more esoteric hardware that LLVM probably will not target, however.


Well, OpenBSD has this "dumb" rule that snapshots/releases are never cross-compiled and must be built on real hardware. So it's pretty much a rule that every hardware platform has to be self-hosting.

Microcontrollers are the exception, but then they don't run a full blown OpenBSD stack anyways.


[flagged]


I don't know what project this is but a blanket "add a test which fails without your patch for EVERY PATCH no exception" rule is a bit cargo-culty.

Especially since the kind of test you're asking for, when written by a developer who is less than passionate about writing such a test but rather more passionate about just solving the problem his patch addresses, is probably not going to be worth much compared to a properly thought out set of tests which target the class of problems which the patch solves. Or possibly even an adjustment to existing tests on the basis of what the patch solves.

Finally, surely you get plenty of patches which do small useful things like cleanup but which it would likely be impossible to write a test for.


If you're saying great tests are better than adequate tests, nobody's disagreeing. But untested code puts a variety of burdens on project maintainers. I don't see why maintainers should be obliged to take on that burden when the person wanting the patch won't bother.

Also, I don't see your quoted rule in the post you're responding to. Instead I see them saying that in a specific case, they ask for tests with patches. How did you get from that specific case to such a universal, no-exceptions rule?


> I don't know what project this is but a blanket "add a test which fails without your patch for EVERY PATCH no exception" rule is a bit cargo-culty.

We had (and still have) 100% code coverage on Windows MSVC, Windows GNU, MacOS, iOS, Android, Linux, FreeBSD, NetBSD, DragonflyBSD, Solaris, WASM, and many other platforms, as well as all possible hardware (ARM32/64, x86/x64, ppc32/64/be/le, sparc64, mips32/64/be/le, mips64, riscv32...).

The OpenBSD parts of our code base weren't tested at all; they weren't even compiled.

In practice that meant that pretty much every PR broke the OpenBSD support in some subtle way, and people would only discover those failures during the release process, which took a lot of time to fix, usually by somebody who did not care about OpenBSD at all, requiring them to create a VM to be able to develop and test from OpenBSD itself.

So yeah, we unanimously decided to require that code that cannot be tested be removed, and gave the OpenBSD devs the chance to fix it, starting with requiring the OpenBSD contributors to add testing support in their PRs.

It wasn't practicable, because their platform is designed to prevent non-OpenBSD users from developing for it. So we removed the code, and this became the OpenBSD user group's problem.

This solution actually worked better for them, because it became up to them to decide when new versions of our project got released for OpenBSD, so they could get the latest release, apply patches, test that it compiled locally and passed tests, etc. before doing a release for OpenBSD, something that we did not care about doing.

We believe that code that isn't compiled doesn't compile, and that code that isn't tested doesn't work correctly. Our OpenBSD parts were living proof that this is 100% true.

I find it quite ironic that this project is security related, and some of the bugs that hit OpenBSD were CVEs that were completely preventable. These CVEs only hit OpenBSD because it was the only untested platform. In a sense, OpenBSD users actually were quite lucky, in that most releases broke compilation for OpenBSD, and that saved them from many CVEs. Unluckily for them, not all bugs result in compilation errors.

> Finally, surely you get plenty of patches which do small useful things like cleanup but which it would likely be impossible to write a test for.

We have 100% code coverage. All our functionality is heavily tested and fuzzed. The modifications in a cleanup PR are exercised by many tests, and therefore tested.

If someone submits a bugfix, that means we were missing a test, so we require users to add one. OpenBSD devs weren't able to do that, so we stopped accepting their bugfixes, and at some point removed all the code.

If you don't think that bugfixes should be accompanied by tests that check that the fix works and the bug cannot be reintroduced again, then I disagree with you, and hope I never have to use your software.


> This is pretty much a judgment of Rust:

I guess. It's really a judgment of Rust (specifically the compiler toolchain) for this one use case.

> I would think that OpenBSD developers/users would care about security,

Maybe. Again, I think "cat" and other system tools are really low on the list of priorities for anyone securing a system. Not to say that they don't represent attack surface, by any means, but there are just a lot of things to do before rewriting the utility entirely.


“cat” and related tools make perfectly interesting exploit targets on Unix systems that follow the Unix philosophy:

https://www.cvedetails.com/cve/CVE-2014-9471/


There is a big difference between having vulnerabilities and being valuable attack surface. Again, I am not saying that cat and others do not provide attack surface, or even valuable attack surface, only that step 1 is not "rewrite cat".

Generally "cat" is not exposed to the internet, and if you're running a service and you said "I'm concerned about local attackers using cat" it's probably a lot easier to just put the service into some sort of environment (container, sandbox, user, whatever) that doesn't give access to cat.


> step 1 is not "rewrite cat"

Totally agree on this point. In fact, I think it’s brash for the rust armada to think that the borrow checker alone is going to replace 20+ years of maintenance on these kinds of tools, which are often crufty and gnarled by the sands and whims of the generations that preceded us.


As a member of the Rust Armada I can tell you that I don't believe that rewriting cat in rust is something I consider a high priority.


That's not BSD cat, that's GNU cat. And I am using AWK and KSH + dcgi in my Gopher server with a chrooted root gopher hole, it works fine.


In fact, it’s neither, it’s coreutils (gnu, yes) `date`. My point is that seemingly innocuous system utilities do make for exploit targets, because folks shell to them.

Yes, you can lock things down with jails/containers, but it’s a valid target. Defense in depth, and all that jazz.


They do care about security. They just don't care about Rust.


They can not care about more than one thing.


> Developers in particular don't generally care about security, so selling Rust as a "secure" language is not going to be enough.

You do realize that OpenBSD is, by far, the most secure general purpose operating system specifically due to decades of thankless work by people like Theo de Raadt, right?

I don’t think healthy skepticism of Rust is strong enough evidence to conclude that they don’t care about security.


I thought it was the most secure general purpose operating system because it's so rarely used - especially for use cases like running untrusted code in unprivileged accounts - that it's a low-value target. Sure, OpenBSD doesn't have exploitable security holes when used as e.g. a packet router, but when was the last time Linux did?

OpenBSD's website says things like "Only two remote holes in the default install, in a heck of a long time!", but remote holes in the default install is such a small surface on any OS (besides e.g. Windows XP). When was the last time you saw a remote hole in the default install of, say, Ubuntu desktop? macOS? The Amazon Linux AMI?

The interesting vulnerabilities are in server software that isn't running by default, local privilege escalation, etc.

Also, weren't there like four security bugs in December, at least one of which was remotely exploitable?


> Sure, OpenBSD doesn't have exploitable security holes when used as e.g. a packet router, but when was the last time Linux did?

didn’t Linux move a packet-routing VM into the kernel in the last ten years? I’d say chances are high there are multiple exploitable holes just based on time and surface area alone.


> When was the last time you saw a remote hole in the default install of, say, Ubuntu desktop?

Literally right now, with snaps. If you only mean unintentional remote holes, and not deliberate backdoors, then we have very different ideas of what "most secure general purpose operating system" means.

Also, BSD doesn't generally actively refuse to fix security holes when thousands of people complain about them.


Granting your framing for the sake of argument, then we've got one remote hole in a heck of a long time in Ubuntu desktop, and still nothing in macOS or Amazon Linux. I'm certainly not claiming that any of these OSes is better than OpenBSD. I'm claiming they're all about the same, and the "OpenBSD cares more about security than everyone else" narrative isn't actually based in evidence. OpenBSD has a particular view of security, which they care about very much, and OpenBSD has done some very cool and precedent-setting things - but that particular view is applicable to narrow use cases, and other people quite reasonably care about other parts, while incorporating versions of innovations from OpenBSD too.

Also, if I understand your unstated argument correctly, even if we don't admit a difference between unintentional remote holes and "deliberate backdoors," there's still a huge and meaningful difference between remote holes that can be exploited by a tiny number of people and remote holes that can be exploited by anyone.


> we've got one remote hole

Are you forgetting the time they sent local filesystem searches to some spyware company? And that's just things that were a: deliberate, and b: public enough that I remember them off the top of my head despite not having used Ubuntu in years? (I forget which specific problem made me drop it, or I'd probably have a third example.)

OSX is a toxic, vendor-supplied-malware infested cesspit that I've never used and don't pay much attention to, and I've never even heard of Amazon Linux, so I wouldn't expect to have examples for those.

> the "OpenBSD cares more about security than everyone else" narrative

Actually, my claim was that Ubuntu (and maybe a significant fraction of "everyone else", but that wasn't really my point) is actively opposed to security.

> even if we don't admit a difference between unintentional remote holes and "deliberate backdoors,"

There is a difference; there's a huge difference; deliberate backdoors are much, much worse. This kind of shit is something I would expect of Microsoft (Windows) or Google (Chrome).


> Are you forgetting the time they sent local filesystem searches to some spyware company?

That was not a remote hole.

> OSX is a toxic, vendor-supplied-malware infested cesspit that I've never used

Sure, but is any of the vendor-supplied malware in that cesspit a remote hole?

I'm happy to have broad, open-ended arguments about who sucks more in new and innovative ways, but let's finish the argument we're already having first. Is OpenBSD meaningfully more secure than other operating systems on the axis they are choosing to advertise, namely "remote holes in the default install"?

In particular, whatever you believe about deliberate backdoors, toxic cesspits, being actively opposed to security, etc., none of that is something the choice of programming language is in any way relevant to. If Theo's argument were "We don't need Rust because we are the only operating system that isn't actively opposed to security, so everyone else has lost the game already," we'd be having a very different conversation. But it's not and we aren't.


> You do realize that OpenBSD is, by far, the most secure general purpose operating system specifically due to decades of thankless work by people like Theo de Raadt, right?

Yes, I think it is fair to say that I am very familiar with all of this information. I work in information security, and was quite into Linux kernel security for a while. I disagree with your assertion about it being the most secure general purpose OS but I'm not gonna go there :)

> I don’t think healthy skepticism of Rust is strong enough evidence to conclude that they don’t care about security.

Cool, that was not my conclusion either. I'm saying that Rust's value proposition in general includes memory safety and I think that it's not compelling for most developers, especially with regards to rewrites of low-value attack surface.


> I disagree with your assertion about it being the most secure general purpose OS but I'm not gonna go there :)

Could you please go there? I'm very curious what the contenders for "most secure general purpose OS" might be.


I think geofft's points go a long way. I think having this conversation would require a lot though - what is "secure"? What is "general purpose"? Is OpenBSD even general purpose? Is the base install general purpose? What makes an OS secure?

I think maybe as a "with 0 configuration to the OS/services" option OpenBSD could be a top contender. With "I have practical security challenges to solve and I'm willing to change things in this system" I might choose Linux. If I'm handing a laptop out to an employee or friend, maybe a Chromebook, or even Windows! It's a pretty nuanced discussion that I couldn't do justice to without really caring and feeling that everyone involved in the conversation is equally invested / coming in with the right attitude, which is impossible on an internet forum.


Well, there is only one real "general purpose" OS that runs on supercomputers, mainframes, servers, laptops, tablets, mobile and embedded, and that's Linux. So it's also the most secure.


I like bat more than cat, not because it’s memory safe, but because it’s benefiting from the entire Rust ecosystem and provides features I enjoy that cat does not have.

It’s not always security, but that’s always a plus (though I have no idea if bat is any more secure than cat). I can make the same statement about exa and many of the other tools that are getting better in the Rust suite of CLI tools.


Indeed, I came here to say that I’ve replaced many such standard tools with more functional equivalents which happen to be written using Rust - bat, git-delta etc.


> Developers in particular don't generally care about security,

Bad developers don't, but many developers do. The Rust project itself has hundreds of contributors, to the point that it feels that it has more contributors than LLVM itself (I work on both, and this is an unbacked feeling I get from the velocity of the contributions).

Point being, if developers didn't care about Rust, they wouldn't be developing it.


Maybe I shouldn't have said that, because it's contentious, but I do believe it to be true.

I believe Rust's success has less to do with memory safety, which I think most developers (anyone coming from a GC'd language) consider table stakes, and much more to do with great documentation and incredibly powerful primitives and ecosystem such as the type system, cargo, crates.io, etc.


A GC only protects you from memory errors, not threading errors.

In general, Rust allows you to write software that doesn't break silently. That's a quite good value proposition for large scale software, where other languages often require programmers to be super careful with refactorings, while in Rust you can really refactor all the things.

The reason people are afraid to do large refactorings in say C++ is often "security-related": fear of introducing segfaults, memory errors, undefined behavior, threading errors, etc. but one can also see these fears as "productivity-related" (hours and hours of debugging), or through many other lenses (shipping bugs to users, having a segfault in the middle of a demo that costs you a client, introducing a segfault one day before the release of your game, etc.).

In general, every programmer wants to have a certain degree of security that their software "works" for some definition of "works". For some this security might be actual network security, but for others it might mean that they don't want their data-science app that has been running for a week burning thousands of dollars to crash due to a segfault while writing the results.
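A toy illustration of that (my example, not a claim about any particular codebase): grow an enum during a refactor, and every match that hasn't been updated becomes a compile error instead of a latent bug:

    enum Event {
        Connected,
        Disconnected,
        // Adding a `TimedOut` variant here turns the match below into a
        // compile error until every call site handles it, so the refactor
        // can't be shipped half-done by accident.
    }

    fn describe(e: &Event) -> &'static str {
        match e {
            Event::Connected => "connected",
            Event::Disconnected => "disconnected",
        }
    }

    fn main() {
        println!("{}", describe(&Event::Connected));
    }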


Rust's type system also only protects against threading errors if they happen to be data races on in-process data structures.

If they are external resources, there is little it can do.

While an improvement, it isn't a full solution, especially in the domain of distributed computing.


For multi-process data-structures using inter-process shared memory, you can still write Rust APIs that protect you from data-races as long as all processes involved use those APIs. If they don't, then protection is up to the operating system (you can lock and unlock shared memory, but whether processes can violate the locks is an OS thing).

For distributed computing, you typically use message passing of some form via network sockets, that's "safe" by design. You can also use RDMA, and there usually your process needs to opt into that, create an RDMA readable/writable region, and you are back to the mercy of your network stack up to which kind of protection you can enforce there.

From Rust point-of-view, if you make the wrong assumptions about shared memory or RDMA regions, your program has a bug. The fix is simple, use appropriate atomic (or atomic volatile) to read/write from those regions. Even if another process is writing while you are reading, as long as the operations are the right ones (e.g. <128-bit atomic reads/writes), you can avoid memory unsafety and data races.

This won't protect you from deadlocks or race conditions, etc. but that's not something that any programming language does. Rust has deadlock protection within a process, but that's not required for safety, so it is an extra feature on top.
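A minimal sketch of that last point (the mapping itself is assumed to come from mmap or similar; alignment and lifetime are the caller's problem):

    use std::sync::atomic::{AtomicU64, Ordering};

    /// Treat one 8-byte, 8-aligned slot of an already-mapped shared-memory
    /// region as an atomic counter. Safety contract: `slot` stays mapped
    /// and is only ever accessed atomically, by every process involved.
    unsafe fn as_atomic<'a>(slot: *mut u8) -> &'a AtomicU64 {
        &*(slot as *const AtomicU64)
    }

    fn main() {
        // Stand-in for an mmap'd region; the mechanics are identical.
        let mut slot: u64 = 0;
        let counter = unsafe { as_atomic(&mut slot as *mut u64 as *mut u8) };
        counter.fetch_add(1, Ordering::SeqCst); // atomic, even cross-process
        assert_eq!(counter.load(Ordering::SeqCst), 1);
    }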


That is exactly my point, while it is definitely good to have, it isn't the end solution that many assume at first thought.


However there is a rampant fiction that if you supply a new safer method everyone will use it.

A fiction indeed. This is how developers actually behave. If you build a better mousetrap, lots of developers will all too quickly give you n reasons why it's terrible, and why you won't be able to pry their (punch cards/assembler/compiler/favorite-language/favorite-tool) out of their cold, dead hands.

This has been true, literally since the modern era of computer programming began in the middle of the last century, and it shows no signs of abating.


Yes, we've changed the title from "Theo de Raadt on Rust", which broke the site guidelines:

"Please use the original title, unless it is misleading or linkbait; don't editorialize."

https://news.ycombinator.com/newsguidelines.html


(2017)


Oh good grief. I don't know how we missed that. Added now, thanks!

HN has enough sensational Rust discussions with fresh articles and didn't need to supplement those with a 3-year-old one. Oh well, win some lose some.


Anecdata to the rescue: since installing ripgrep, my use of grep has plummeted. If I ever use it, it's because it comes as part of a copy/paste command I'm running.


A lot is missing also.

First he doesn't know about ripgrep, which is far better than GNU or BSD grep.

Second, none of the known greps can find Unicode strings. Red Hat carried the Unicode patches along for a while, but when people complained about performance on the new UTF-8 locales, they dropped them. coreutils is still missing Unicode support, so we don't find equivalent strings with different bytes. No normalization, no case folding. Rust-based coreutils could solve that, because they have the proper libs, which are better than in C land and could survive the perf critics.

Third, rust is not secure. Just more secure than C. Memory safety, thread safety and type safety are lies. People buy it, but Theo should know.


> First he doesn't know about ripgrep

ripgrep isn't POSIX compliant. Never was and never will be. So I don't think it's really applicable here. ripgrep is maybe an existence proof that a competing tool can be written, but it is certainly not a suitable POSIX compliant grep replacement. Building a fully POSIX compliant grep tool with good performance like GNU grep is pretty difficult. It could be done. It would probably take me a couple months (and only by leveraging my existing work). I just don't have any incentive to do it, personally.

> Second,none of the known grep's can find unicode strings.

That's definitely not true. GNU grep can do this just fine:

    $ echo 'Δ' | LC_ALL=en_US.UTF-8 grep '\w'
    Δ
    $ echo 'Δ' | LC_ALL=en_US.UTF-8 grep 'δ' -i
    Δ
GNU grep does pay more of a performance penalty for Unicode support than ripgrep does. It's easy to see this with a somewhat pathological case by comparing `rg '\w{42}'` with `LC_ALL=en_US.UTF-8 grep -E '\w{42}'`. (Not all cases are pathological, but ripgrep and GNU grep are both very good about literal optimizations, so it's easiest to demonstrate with a pathological case.) I'm not sure if one can attribute this to better library support in the Rust ecosystem though. It's really about how the regex machinery itself is built.


Unicode search is not searching for byte equivalents. Unicode characters can be composed in different ways, with different bytes. What you see is an á, but internally it can be composed of different byte sequences: the single precomposed character, or an 'a' plus a combining accent.

Unicode search needs to be optimized to normalize characters, and similar problems exist for case folding (grep -i).

These problems are also security relevant, btw. Many Unicode strings are now identifiers, like names or file paths. That the stdlib still provides no functions to search and compare such strings is a much bigger beef. grep is just a symptom of a much bigger problem.


Yes, you mentioned normalization and I didn't correct that, because indeed, grep can't do that. But you also just said "Unicode search," which is a pretty general term. GNU grep is certainly Unicode aware, as I demonstrated. It just doesn't solve every use case you want. But it does have Unicode-aware character classes and also Unicode aware case folding.

A grep tool that did normalization would be extremely slow. You'd probably be better served by a specialized tool. Or even better, it should be possible to write a fairly simple wrapper that puts everything into your desired normal form before searching.
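For what a sketch of that wrapper could look like, leaning on the third-party unicode-normalization crate (crate choice is mine; NFC vs NFD is up to the user):

    use unicode_normalization::UnicodeNormalization;

    /// Bring both pattern and haystack into NFC before matching, so the
    /// precomposed "á" (U+00E1) and "a" + combining acute (U+0301) agree.
    fn nfc(s: &str) -> String {
        s.nfc().collect()
    }

    fn main() {
        let pattern = nfc("\u{00e1}");        // precomposed á
        let haystack = nfc("via a\u{0301}x"); // decomposed á in context
        assert!(haystack.contains(&pattern));
    }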


- Theo doesn't give a shit about ripgrep; he cares about grep(1) complying with standards and working under OpenBSD scripts.

- I am using grep(1) on Unicode stuff in Spanish on XTerm just fine, even in ed(1). This is not a GNU craputils base, but the OpenBSD one. Heck, I type Spanish-only characters with ed(1) in order to write my phlog. I can use nvi(1) from ports, but ed(1), fold(1) and ispell(1) are more than enough, and I can forget about switching modes.

- That's a great example of Cargo Cult.

- On Unix and its descendants, Plan 9 gave us UTF-8, and for sure its grep supports more UTF-8 stuff than your non-existent Rust-based OS.


In a thread about OpenBSD,

> Developers in particular don't generally care about security

Thanks, now I have to clean a mouthful of coffee off my screen!


Is the joke that Theo said virtually the exact same thing in the linked thread?

Again, quoting Theo directly...

> However there is a rampant fiction that if you supply a new safer method everyone will use it. For gods sake, the simplest of concepts like the stack protector took nearly 10 years for adoption, let people should switch languages? DELUSION.

Anyways, I think you've clearly misunderstood my point.


> This is less "On Rust" and more "On accepting a rewrite of any tool into OpenBSD, on the merits of memory safety alone".

If people wanted memory safety so much, they could have done that years ago with Java


It also doesn't help that Rust keeps pushing the idea that "static linking is the only way to go". This is another cargo cult which I wish hadn't ended up so deeply ingrained in the toolchain, because while it has some merits, it also has significant drawbacks on a typical unix distribution.

Static linking might be good for folks distributing a single server binary over a fleet of machines (pretty much like Go's primary use case), or a company that only cares about a single massive binary containing the OS itself (the browser), but it stops "being cool" very quickly on a typical *nix system if you plan to have hundreds of rust tools lying around.

Size _matters_ once you have a few thousand binaries on your system (mine has just short of 6k): think about the memory requirements of caching vs cold-starting each of them, and about how patching bugs or vulnerabilities will pan out. There's a reason dynamic linkage was introduced, and it's _still_ worth it today.

And rebuilding isn't cheap in storage requirements either: a typical cargo package will require a few hundred megabytes of storage, likely for _each_ rebuild.

Two days ago I rebuilt the seemingly innocuous weechat-discord: the build tree takes about 850mb on disk, with a resulting binary of 30mb. By comparison, the latest emacs from git rebuilds in a tenth of the time with all the features enabled; its build tree weighs 150mb (most of which is lisp sources+elc) and results in a binary of 9mb, not stripped.


I don't think the position of the language teams is that static linking is better than dynamic linking. I think that static linking is a significantly easier target for a relatively new language whose _novel_ features make it difficult to define a stable ABI.

Under the very specific constraints that Rust is operating in, static linking is currently the best option.

That said, I wouldn't mind seeing some kind of ABI guarantees that make dynamic linking possible. https://gankra.github.io/blah/swift-abi/ is a great, accessible read on some of the challenges of making a stable ABI, so I certainly don't expect it SOON.

I do wonder if there is some scheme that could be adopted in a performant way for making the ABI stable as of the Rust Edition, even if it were only for core and std, statically linking everything else?


Deployability really should be king. I hinted at this in my other post, but I don't think quite enough people think about this.

Static linking makes your deployability worries go away. That's not to say that it isn't at least possible with dynamic linking, but the complexity and gymnastics will sink you. Everyone gets bit by it eventually.


Eh, even with Rust's static linking, that's not a guarantee that there will be no worries with deployment. Most SSL libraries in Rust, for instance, currently link to (Open/Libre)SSL dynamically. So you often can't take a Rust binary compiled on a distro with one version of the SSL library, to another distro with a different version.
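(For what it's worth, some of these crates have an escape hatch: the openssl crate's "vendored" feature, i.e. `openssl = { version = "0.10", features = ["vendored"] }` in Cargo.toml, builds OpenSSL from source and links it statically, at the cost of making its security updates your problem.)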


So you're saying that static linking isn't a guarantee, because the build was polluted with dynamic linking.

Isn't that furthering my point?


Much of the ecosystem seems to be trending towards using rustls though, where this is of no concern.


Actually openssl is pretty compatible between versions, usually not an issue in the real world.


Just to drive the point home.

If all of your engineers die in a fiery plane crash en route to the company offsite, or your datacenter is wiped out in a flood, at least you have your statically linked binary that can run on commodity servers somewhere.

You have the peace of mind of knowing that your code as built should be able to run somewhere else in its current state without modification. You don't have to worry about the package availability of something that may have been around when you shipped your servers but may not be when you go to ship them again, or something that only coincidentally worked because your systems were installed via a certain upgrade path that's no longer reproducible.

It's a simple matter of business risk and minimizing surprises.


> Deployability really should be king.

Not everyone values deployability as much as you (or employers, I suspect) do. On my personal computer I care about getting security patches and disk usage.


Lots of people don't measure risk very well and get hit by black swan events.

Over 65% of startups have less than 6 months of cash reserves right now and 74% have been laying off staff. It turns out the majority of people are poor long-term planners.

I personally don't care about how you manage your personal workstation, but you're not who most of us are building for. Most of us aren't building tools to support you. We're writing a big ecosystem for everyone to collaborate.

In a professional setting, problems with how people manage their computers aren't acceptable. Any decently-sized company will get rid of such a problem quickly. In smaller companies I've seen people get fired over poor management of their workstation's environment.


Not everyone who uses software is a software engineer…


Not everyone who uses a hammer is in the trades either, but that's who hammers are designed for because that's who is buying.


I think most of the people who use computers are not software engineers.


The computer isn't the tool here, the context of our conversation is about software.


Deployability is the job of the package manager, not the compiler. You can always link dynamically and ship all the libraries with rpath set accordingly.

Also: static linking is not deployability if the source code is not shipped to the target - you don't have _my_ version of a library which is patched to support my hardware.


Static linking is more ergonomic for developers.

Dynamic linking is a better use of user resources, which is sort of ergonomics.

To say one metric should be king without context is missing the point.


How much do your programs really share? After libc, libm, and pthreads, the most common thing they link to is probably pcre, and I'm sure you can guess how many of your programs are using that.

A good linker will shave off the parts of a library you're not using, and the parts which are left over are usually not very big. The problem isn't with static linking, it's that some "developers" think that bundling an entire Chromium build with their app is a good idea.

Rust has a problem with big binaries (so does Go), but that's Rust's problem, not static linking's problem.
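For reference, "shave off the parts you're not using" is spelled roughly like this on a GNU toolchain (standard flags; libfoo.a is a stand-in, and the library itself must have been built with the same section flags for the pruning to reach inside it):

    $ cc -Os -ffunction-sections -fdata-sections main.c libfoo.a \
          -Wl,--gc-sections -o app    # drop unreferenced sections at link time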


Dynamic linking is pretty common in GUI applications (xlib, GTK/QT). Server-side there's also libssl, xml libraries, zlib, curl.


Outside of OS distro packaging, most Qt apps actually copy dynamically linked Qt binaries to the same app folder for re-distribution to end users within some form of installer or disk image. The same probably applies to Gtk apps too, as I think the last time I manually installed Gtk for Windows for a Gtk app was a decade ago.

Dynamic linking only works if you can guarantee ABI stability, and folks haven’t had to deal with C++ ABI changes since C++11, to the point where, if the C++ folks can’t bring themselves to change the ABI by C++23, we will forget it was ever a problem because we’ll have made the cost of change too high. And the current C++ ABI makes some parts of C++ executables sub-optimal unless you hunt down your own standard-library alternatives.

Additionally, with dynamic linking, both code authors and code users now need to agree on versions to support under assumptions that newer features in newer libraries aren’t worth adopting quickly. To that end, some OSes do update dynamic libraries more quickly, but doing so theoretically requires a lot more recompilation and potentially you’re downloading the same binaries more than once. At that point, dynamic linking is worth less than a binary-optimized compression algorithm, no? Especially for distributing changes to said executables.

Which isn’t to say, for OS distros, that dynamic linking is bad, far from it, it tends to be the only valid solution for programming against OS core components, but that in our haste to update dynamic libraries independently of code compiled for them, we tend to forget ABI compatibility and the costs to maintain API compatibility across a wide variety of dependency versions for packagers and developers (or alternatively, the lack of updates to new library features for end users).

Windows APIs never changing is the reason Windows stagnates more than macOS, where Apple is less afraid to say your older app simply won’t run any longer. Linux suffers less from this, but as pointed out in the email, part of that is because POSIX implementations are relatively stable over decades, whether or not significant improvements are still possible for more modern UX or security, for example.

The details of dynamic linking on OS platforms can be found in this recent series of posts: https://news.ycombinator.com/item?id=23059072 (in terms of stability guarantees besides libc dynamic linking)


It is also one of the reasons why macOS is not welcomed by enterprise IT for large scale deployments.

You can have dynamic linking with ABI stability with stuff like COM and UWP.

It is also the only viable way to do plugins in scenarios where IPC is too costly.


Wouldn't GUI bindings link to the underlying system library anyway, even in Rust? I think those Rust bindings are thin enough that having them as an additional dynamic library would not be helpful. Similar for the server side - in practice, Rust-based solutions are quite rare, and ones that don't expose a stable C ABI for ease of use from C/C++ are even rarer.


> bundling an entire Chromium build with their app is a good idea.

Does Chromium offer any other type of linking? I mean, is the problem in the developer including it when another option is possible or the fact that no other option is possible?


The problem is reaching for Chromium to solve any problem in the first place.


But chromium offers a lot of relevant functionality. If I remember correctly one of the alleged reasons Microsoft moved to a chromium browser was to actually sort of include it in the OS and have electron apps link to it dynamically.

But before that time the chromium bloat is 100% independent from the static linking bloat.


On my computer, there's the standard libraries and frameworks that provide things like GUI widgets, cryptography routines, and access to various user databases.


> I wish didn't end up being engrained so deep in the toolchain

For what it's worth, rustc does support dynamic linking, both for system dependencies and crates. You can compile a crate with the type `dylib` to produce a dynamic Rust library, which can then be linked against when producing a final binary by passing `-C prefer-dynamic`.

I don't think it's possible to make this work with cargo currently, but there are a couple open issues about it (albeit without much activity).

And of course, rust not having a stable ABI means the exact same rust compiler (and LLVM version, I suppose) would need to be used when compiling the dynamic libraries and final binary.
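A minimal demonstration of that rustc-level support on Linux (file names are mine; note that -C prefer-dynamic also links libstd dynamically, hence the library path when running):

    $ echo 'pub fn greet() { println!("hello from a dylib"); }' > hello.rs
    $ rustc --crate-type dylib hello.rs            # produces libhello.so
    $ echo 'fn main() { hello::greet(); }' > main.rs
    $ rustc -C prefer-dynamic --extern hello=libhello.so main.rs
    $ LD_LIBRARY_PATH=".:$(rustc --print sysroot)/lib" ./main
    hello from a dylib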


Generics in Rust make this difficult, not to mention the lack of a stable ABI. I would think it's still possible, but I'm not sure just how much easier static linking is over dynamic linking.

There has been work done on using sccache for cross-crate build artifact caching to reduce the footprint when building many binaries.


I don't think static linking is a cargo cult at all. It's just a simple and good enough solution, and there hasn't been enough interest into supporting anything else well. If it would show up as an articulated obstacle, you'd see movement to resolve it.


I can see your point, but to give you a little more perspective on the breadth of the problem domain:

I wish they went as far as Go with their ability to produce fully static binaries and allow for cross compilation. Go's ability to have one CI pipeline running on Linux that then produces binaries that run on every conceivable version of Linux, Mac and Windows is a huge productivity boost.

For most of my use cases, they don't go far enough with static linking!
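To be fair, Rust can get most of the way there today for Linux targets via musl, at least for pure-Rust dependency trees (binary name is hypothetical; crates with C code also need a musl C toolchain):

    $ rustup target add x86_64-unknown-linux-musl
    $ cargo build --release --target x86_64-unknown-linux-musl
    $ file target/x86_64-unknown-linux-musl/release/myapp   # "statically linked"

Cross-building to macOS or Windows is where it still falls short of Go, since you need a matching linker and SDK.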


Note that Go implements most of its toolchain itself while Rust uses parts of the C/C++ toolchain (llvm, C++ linker instead of go's own linker, etc). Go even has its own library loader. Once you have that it's not hard to make it compile to any target you want.

Also Rust is still evolving. They still have to add major new features to the compiler instead of being able to implement features like raw dylibs which would remove the need to include .lib files.


The problem is not the toolchain itself, it’s the runtime. Go’s runtime interfaces directly with OS kernels; Rust links against glibc, etc. I wish the Rust community would invest in a runtime that interacts with the kernel directly the same way Go does. Current approaches aim to implement libc drop-in replacements in Rust. There are also the musl targets, which could make building static binaries easier. A completely rusty approach that doesn't go through the libc interface would have tons of benefits, though!


I meant to include the runtime in the "toolchain" term. See how I spoke about the library loader that Go implements.

The libc is only part of the greater issue of community tolerance of C components in the stack, like openssl or host-OS TLS implementations, while Go mainly just uses the TLS implementation written by the language creators. There is rustls, but it's not regarded as good enough, and the defaults are native-tls or openssl. And even rustls uses C components, as it builds on ring, which has C components of its own...


Cargo cult - what a great pun!


You can find many articles on the origin of the term. I like this one from 1959:

https://www.scientificamerican.com/article/1959-cargo-cults-...


You're being downvoted--it's because you missed the pun. Everyone knows the term "cargo cult". The pun is combining that with that the Rust package manager is named "Cargo".


Okay, thanks for telling me.

Not everyone knows this term, surely.

And I didn't know the Rust package manager is called "Cargo".


Dynamic linking is just not very useful for Rust, due to the lack of a stable ABI. So yes, you can build shared libraries and link to them, but you'll be using the C ABI at the interaction boundary, which some devs might find unintuitive. (Or else you have to make sure that everything you link to is compiled by the same version of rustc and LLVM, which is not very realistic.) Static linking sidesteps this potential issue.
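Concretely, "using the C ABI at the boundary" means the exported surface looks something like this sketch (the crate would be built as a cdylib; names are made up):

    /// Exported as an unmangled, C-calling-convention symbol, so callers
    /// don't need to match the exact rustc/LLVM that built this library.
    /// Only C-compatible types cross the boundary: no generics, no trait
    /// objects, no String. Caller contract: `data` points to `len` bytes.
    #[no_mangle]
    pub extern "C" fn checksum(data: *const u8, len: usize) -> u32 {
        let bytes = unsafe { std::slice::from_raw_parts(data, len) };
        bytes.iter().fold(0u32, |acc, &b| acc.wrapping_add(b as u32))
    }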


A systems language should support everything, regardless of how useful it might be.

Unless it wants to leave some scenarios open for the systems programming languages that offer tooling for them.


It starts being cool way more quickly on typical *nix systems as a developer.

Due to subtle binary incompatibilities in shared libraries, a CD pipeline I maintain costs twice as much money to run in order to target MacOS and Debian systems. It's not "worth it" to me.

Static linking is a cornerstone of what I'd call "deterministic deployment." The cost savings of deterministic deployment are so immense that after reading your comment and reacting to it, I'm tempted to estimate how much money dynamic linking will cost my business this year and how much money we'd save by reworking our build tree to compile our dependencies from source and maintain static libs on the targets we care about.

>And rebuilding isn't cheap in storage requirements either: a typical cargo package will require a few hundred megabytes of storage, likely for _each_ rebuild.

shrug I don't care about build storage requirements until we start talking tens to low hundreds of GB and disk I/O becomes the time bottleneck or caching costs real money in CI.


Suppose a closed-source program you use has a critical vulnerability in rustls and the company that wrote it is out of business. How much effort do you need to hot-patch the binary in a safe manner? I routinely use 15-year-old software which is only usable by replacing libraries with their recent counterparts and changing the names to masquerade as the old ones. And I do prefer to have access to old files.


I'm not much for Rust, but I'm essentially a deploy/tooling engineer, supporting a build system for hundreds of engineers and tens of thousands of servers. I've worked in a few dozen languages at this point and have seen 20 years of the problems in this field.

I'm going to tell you that static linking is the only way to go.

The idea that needs to die is the personal workstation. It's just another deploy target. Ideally whatever is doing builds is reproducible and throwaway. Leaving those artifacts around on disk introduces its own host of problems.

Disk is cheap and you shouldn't have tons of binaries on your systems for no good reason.

(Aside from this though, I'm in full agreement with Theo.)


> Disk is cheap

Arguable, but I'll grant you this. But RAM ain't cheap.


Single Responsibility Principle, except applied to infrastructure. We're in a world where both virtualization and containerization are the norm, and with either flavor this is easy.

Each of your virtualized/containerized systems are going to have their own distinct instances of the shared libraries, so you're getting a fraction of the benefits of dynamic linking if you're doing your infrastructure properly today anyway. Stop keeping pets.

How many of these big binaries are you running on your servers at once?


> Each of your virtualized/containerized systems are going to have their own distinct instances of the shared libraries,

Certainly not mine, my memory deduplication is doing just fine.


We've had various mechanisms for this, and some of them have even been deprecated. Transparent Page Sharing was deprecated in VMware as it turned out not to be super useful with large pages. We have Kernel Same-page Merging, but many folks are disabling this in a post-Meltdown/post-Spectre world, as the potential vulnerabilities aren't worth it.


It’s a lot cheaper than it’s been in the past, especially for what we’re talking about: people are using more data these days, but the kinds of system shared libraries where common page mapping helps have seemed like diminishing returns for years, both as a fraction of private data size and because applications don’t sync their dependencies, so you end up with multiple versions of each library cutting into the savings.


It is certainly possible to use Rust with COM, like Rust/WinRT [1].

I'm currently building Firefox; it takes ages. It is a marvel that such complexity is still manageable on consumer hardware. Emacs is from another age.

[1] https://blogs.windows.com/windowsdeveloper/2020/04/30/rust-w...


The rationale is that rust can only give any guarantees at all if the resulting binary is self-contained. A shared library swap can break everything that the compiler guaranteed during compilation. This fits nicely with the generally restrictive mindset of the rust designers.


This is not "the rationale" whatsoever.

Stable ABIs are hard, and we have a lot of other work that's a higher priority for our users. There is no ideological opposition to dynamic linking, only practical blockers.


Let's try to summarize what he tries to say but kinda doesn't do very well:

_Rust and OpenBSD are not a good fit._

OpenBSD holds to certain principles about how to do things which just don't work well with Rust. This doesn't mean either is better or worse. They are just different.

For example, outside of rustc development hardly anyone compiles Rust themselves. As such, not many people work on making Rust compile itself on exotic systems (and yes, in 2020 i386 is an exotic system, even though it was the standard in the past). I mean, improving cross-compilation is for now much more important (as far as I can tell).

But OpenBSD requires the ability to self-compile, and supports hardware for a very long time, so Rust (for now) doesn't fit.

Another example is that many rewrites of classical small tools provide little value to anyone _if_ they are 100% compatible. As such, they tend _not to be compatible_, in favor of better interfaces/output etc. (E.g. ripgrep!!)

Lastly, OpenBSD is stuck in a world where things move very slowly. But this is also why many people use it. On the other hand, Rust is from a world where things move much faster; sure, with small steps and backward compatibility, but a small step every 6 weeks is still a lot over a few years. So they just don't fit well together.

I also believe that they might fit well together at some point in the future. But not now, and probably not in the next 3 or so years.


> I wasn't implying. I was stating a fact. There has been no attempt to move the smallest parts of the ecosystem, to provide replacements for base POSIX utilities.

This is in fact incorrect--there is a project aiming to build all of the coreutils in Rust (https://github.com/uutils/coreutils).

More to the point: while I do concur in the conclusion that Rust shouldn't be a part of the OpenBSD base system, the gatekeeping implied here (it's not a serious language because it's not used to build an operating system) is really toxic. Especially considering that the gate in question has already been thoroughly breached by the language in question (some universities have switched to using Rust in their OS courses, for example).


There is no gatekeeping. To actually achieve POSIX-compliance is not an easy task and requires lots of testing. If you don't do it then your replacement will break everyone's scripts. A distro maintainer will also want to retain compatibility with their supported GNU/BSD extensions so that's more work to add on.

I would still agree with his statement at least as far as BSD is concerned. Outside of Redox I haven't seen any serious projects to implement POSIX compatibility. The Rust coreutils project is a good start but from their readme it looks like they are more aiming to achieve GNU compatibility on Microsoft Windows while being MIT licensed — I don't think anyone is seriously using it to build a BSD (or even a GNU) distro. If I'm wrong about that I'd love to hear it though. Rust is a good choice to write these things in but let's be realistic about the time frame required to rewrite everything.

Plus, I just downloaded those Rust coreutils and tried to build them, and now I'm waiting for 400 (!!!) Rust dependencies to download and compile. Is it really appropriate for a core system component to have this many dependencies? Or is there something I'm missing here? I admit I am not familiar with best practices in Rust. As the project stabilizes I assume they will want to start eliminating the dependencies or vendoring them upstream? At what point do we decide to put these into a separate library like libbsd or gnulib?


> Or is there something I'm missing here?

There may be some platform-specific dependencies, but at least when building on Windows, I saw 67 dependencies downloaded.

One thing that may be skewing the count is that internally coreutils is packaged with one crate per command. There are about 100 commands, so you'll have seen at least that many separate crates building. They're listed here: https://github.com/uutils/coreutils/blob/6e8c901204934029c88...

My impression is that ecosystems where tracking dependencies is hard tend to discourage them, while ecosystems where tracking dependencies is easy encourage them. The coreutils crate depends on separate md5, sha1, sha2, and sha3 crates, if managing external dependencies was hard I imagine those would be bundled together. But it is easy, so it makes sense to organise the crates so that each just provides one specific piece of functionality. Think single responsibility principle applied to packaging.


Urgh, that's just a verification nightmare waiting to happen. Who is going to wade through all that mess, review every package, pin its version, etc.? You need reproducibility before you can get safety and security.


Cargo uses lockfiles by default to pin all dependencies of an application, so reproducibility shouldn't be much of an issue.
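And if you want the pin enforced rather than advisory, there's a flag for that (real cargo behavior; useful in CI):

    $ cargo build --locked   # fail instead of letting Cargo.lock change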


No one reads lockfiles. In python, people have successfully backdoored projects by sending a PR that upgrades a dependency and regenerates the lockfiles.

Somewhere in the 1MB lockfile patch, they pin a dependency to a version with a known remotely exploitable flaw (sometimes a zero day, sometimes not).


If you care about security, api stability and legal compliance, you should minimize the number of developers and packages you trust. My rule of thumb is that, for a team of ~10 developers, each package dependency adds a week or two of developer maintenance time each year or so.

Having a dozen packages where one will do, and having them do the same thing recursively means you need to spend 144 times longer dealing with such things.


Why is number of packages and not package size the relevant metric here? Especially when all the packages live in the same repository, there's practically no more overhead to review each package qua package, versus just a bunch of directories in a package monolith.

This "144 times longer" number seems pulled completely out of thin air.


Twelve packages with twelve dependencies each is 144. I’ve seen >10x this level of blow up on multiple products at multiple companies.

One company missed installing malware behind all the firewalls of most of the fortune 500 by less than a week.

We made a procedural change to the build process of a product that had been shipping for years.

The change hit our CI/CD 5 days before one of the big language package managers started spewing malware at us.

Package maintainer count bloat is a serious real world problem.

Ignoring it is negligent.


I'm not suggesting that we should stop reviewing dependencies. I'm pushing back against the idea that internally structuring a large repo or project via independent modules necessarily explodes the burden to review.

It may be the case that this preference for well modularized code leads to lots of disparate small packages in the ecosystem, but within the scope of a single project, I see no reason why modularization poses any intrinsic risk.


Both are relevant. To their point, each additional maintainer is also a hidden dependency of each package, so the number of packages is relevant.


In the case in question, the dependency count in crates is separate from the dependency count in git repos. Here, things that would normally be one big library were split into many different functional crates in the same repo.


Actually you did miss something.

Things like `cargo tree | wc -l` or the number of dependencies cargo builds are not appropriate, as they count _internal_ dependencies (from the same repo).

Given that it's split into many "sub-crates" in the same workspace, this makes the numbers unnaturally high.

In other words, _most_ of the 417 "dependencies" for building the "unix" feature set are internal ones from the same repository. I.e. the project is just split up into many small parts.

Additionally, many deps are reused between many internal deps.

Removing duplicate dependencies, dev-dependencies (e.g. testing tools) and build dependencies leaves fewer than 100 deps.

Looking through them, many are Rust versions of C/C++ libs which are "available by default", e.g. unix_socket, libc and similar.

Then some are things which come from Rust deps being split up more, e.g. there are md5, sha1, sha2, sha3 as separate crates.

There are also a bunch of "internal" deps of deps, e.g. backtrace-sys.

Anyway, still a lot of deps, but most are quite reasonable and could be "taken care of" in some way or another if they want to ship a distro based on this.
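For anyone who wants to reproduce that kind of accounting, cargo can do most of it (flags from a current cargo; exact counts depend on enabled features):

    $ cargo tree -e normal --prefix none | sort -u | wc -l   # unique runtime deps only
    $ cargo tree --duplicates                                # crates pulled in at >1 version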


Most scripts don’t depend on the POSIX standards anyway, they depend on GNU or BSD programs. CPython’s configure script, for example, blows up if the system grep is BSD and not GNU. The whole C build ecosystem is duct tape anyway and things rarely build without tons of manual effort (figure out what versions of which dependencies need to be installed to which locations in the file tree and then maybe things will build, but none of it will be documented and good luck reverse engineering those Autotools/CMake scripts!), so citing concerns about breaking things seems really disingenuous.


I believe a main point of the article is that while Linux might be fine with this state of affairs, OpenBSD is not, and actually cares about having a specific standard.


His statement was that there was "no attempt" to write POSIX tools. As a "statement of fact." Irrespective of how much work there is still to do for coreutils in terms of compliance, they are an attempt. They may be completely unsuitable as a drop-in replacement right now, but that was not the "fact" that he asserted.


Rust dependencies are just compilation units. Do people complain when a coreutils-scale project ultimately depends on 400 .c (and/or .o) files?


To replace anything in OpenBSD, you need way more than POSIX-compliance -- you need OpenBSD-compliance. Which means every non-POSIX option has to be implemented down to edge cases, every departure from or controversial interpretation of POSIX (POSIX is vague in many places) has to behave exactly the same. Or you'll definitely break scripts.


These nuts don't understand that OpenBSD's base doesn't map to GNU coreutils; it ships SEVERAL more tools. They are so GNU- and Rust-blind they don't get anything.

    $ ls -1 /bin/ /sbin/ /usr/bin/ /usr/sbin/ | sort -u | wc -l
    671

Good luck reimplementing that.


I don't think you're missing anything.

POSIX compliance is a headache which is not only hard to get right but also strongly limits your interfaces and internal tooling.

Many scripts will still run without full compliance, but that doesn't help if you're trying to build for OpenBSD.

Also, I'm not sure how they ended up with 400 deps for coreutils; I would have expected much less. But yes, this means that at least until better code signing and so on is the default, this isn't at all appropriate for the base of any OS.

I just wish the author had worded it a bit less meanly. (He was probably just annoyed, but still.)


The author is from the 1990s era of Linux/FOSS, before kindness was part of the culture, when the culture had a tough competitive streak and mistakes and humility were seen as signs of weakness and disrespected. High expectations were tightly bound with low levels of patience. And many people were scared off from contributing.


> POSIX compliance is a headache which is not only hard to get right but also strongly limits your interfaces and internal tooling.

Platforms which are POSIX compliant usually offer extensions and then allow for selecting actual POSIX-compliant behavior. It's not a very strong limitation if you're already fairly close to that anyways.


Somewhat academic as I have no dog in the fight, but ...

> To actually achieve POSIX-compliance is not an easy task and requires lots of testing.

Why are the existing tests not enough for a new implementation?


Sweet summer child, imagining OpenBSD has comprehensive tests? This is UNIX, old-school. If it seems to work, it works.


> Is this really appropriate for a core system component to have this many dependencies?

Absolutely not. The entire OpenBSD base system can be built without an internet connection.


You can also build Rust and its ecosystem without an internet connection. It works the same way; you need to get a copy of everything first, but the build itself does not require the internet.

(This has been a hard constraint for effectively forever, because both Firefox and Linux distributions have this requirement.)
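The standard mechanism, for the curious (the directory name is arbitrary; `cargo vendor` also prints the [source] stanza you paste into .cargo/config so the vendored copies get used):

    $ cargo vendor third-party/   # copy every dependency's source into the tree
    $ cargo build --offline       # then build with the network unplugged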


I understand that if you have all dependencies locally, you don't need to download more dependencies.

I understood NotAPlumber's remark as: openbsd has no third party dependencies. In other words: there is one party to trust, not 400.

If I misunderstood the BSD case, I'd love to hear it.


after you are done downloading all the source.


You can check that the source hasn’t been tampered with before starting the build, and know you’re building the right code base.

Builds that curl junk and use a dozen language specific package managers don’t have that property.


That assumes every user is a developer, expert in security assessment, knowledgeable in tricks like trusting trust, and all the languages used in the code base.


That is exactly the set of properties you get with Rust and Cargo: that the source hasn't been tampered with, and that you know you're building the right codebase. It would be deeply irresponsible to build a package management system without those properties.


This is one of those things that's strictly true, but is kind of misleading because it doesn't capture the full picture. In particular, there is actually no guarantee that the source code you see on GitHub matches the source code you compile from crates.io. Now, most responsible maintainers tag each release and publish exactly that tag to crates.io. Which is good. But that's just a convention. There's nothing enforcing it.

If you want to actually review the source code, then you'd have to download the crate archive itself and review the code there. There is some tooling for this (notably, cargo-crev), but it's definitely not a normal part of development in the Rust open source ecosystem as it stands right now. For the most part, trust and reputation hold the system together. For example, imagine what would happen if someone found malicious code in one of my crates that I was duplicitous about (i.e., published it on crates.io but kept those code changes off of GitHub).


You're right.

I've heard from reliable sources that this is actively being worked on.


True enough... but you still don't know what the dependencies do. E.g., one added ads. Another added use tracking. This could even be legitimately in line with the dependency's purpose.

Unless you're manually vetting all the code, 400 dependencies means trusting a lot of third parties.


Trusting a lot of third parties is almost as irresponsible as not bothering with checksumming, etc.

See npm and pip for examples. I suspect cargo will have its share of incidents over time, unless it is somehow curated by a small group that tests and maintains the packages, but if it is, then it’s comparable to apt for c/c++.


He's a maintainer, gatekeeping is literally his job. He's the one who decides what makes it in or not.

His reply is not great, but IMO posting to some venerable old project's ML saying, in the abstract and without actually having made any effort to analyze the issue in depth "Under what conditions would you consider replacing one of the current C implementations with an implementation written in another, "safer" language?" comes off as rather rude and not very productive IMO. I can understand him shutting it down immediately instead of wasting his time arguing with programming 101 students who cargo cult Rust without understanding the reality of maintaining something as complex as OpenBSD.


Some substantive comments I read from the mail:

* It takes a long time for a rewrite to be fully compatible and a good implementation. This project says it uses the busybox tests and that seems like a good start. Is it good enough though? Busybox itself is quite good for certain embedded niches but it would be absurd to replace a good chunk of OpenBSD userland with that, the usability would suffer a lot.

* OpenBSD needs to compile itself, in reasonable time, and at the time of writing Rust was slower than C at this and could not self-host on 32-bit x86 due to address space limitations. Has that been fixed?


> Busybox itself is quite good for certain embedded niches but it would be absurd to replace a good chunk of OpenBSD userland with that, the usability would suffer a lot.

Not to mention that Busybox (last I checked) uses a license unsuitable for inclusion in OpenBSD (which is trying to reduce the amount of GPL-licensed software in the base install).


> the gatekeeping implied here (it's not a serious language because it's not used to build an operating system) is really toxic.

He's not gatekeeping in that way. He isn't saying that Rust isn't used to create OS utilities (which, as an aside, is what he actually says; he doesn't say anything about building an OS), therefore isn't a serious language, and therefore shouldn't be included in OpenBSD. He's saying it's not used for making OS utilities and therefore has no specific use in OpenBSD that requires it to be in the toolchain. Very different points, with the latter being pretty reasonable, imo.


The post was written in 2017. How far along was the project in 2017?

Also, I am a bit confused about "the gatekeeping implied here (it's not a serious language because it's not used to build an operating system) is really toxic." Which line in the e-mail stated that?


The gatekeeping is strongly implied by this statement:

> As a general trend the only things being written in these new languages are new web-facing applications, quite often proprietory or customized to narrow roles. Not Unix parts.

Which I read in the tone of "go play with your toy language somewhere else, and let the real programmers program with real languages". Additionally, there are the allusions to the fact that the ls/grep replacements in Haskell aren't POSIX-compliant, which I again read in the tone of "they can't be taken as serious replacement efforts."


> As a general trend the only things being written in these new languages are new web-facing applications, quite often proprietory or customized to narrow roles. Not Unix parts.

That doesn't imply "real programmers program with real languages". It implies they're not used to enrich the UNIX / open source ecosystem, which might imply there's not a real need for them in the official repositories, since they're not used to build anything else in the repository.

> Additionally, there's the allusions to the fact that the ls/grep-replacements in Haskell aren't POSIX-compliant, which I again read in the tone of "they can't be taken as serious replacement efforts."

There is real value in being "POSIX-compliant". It means programs respect an interface. If I write a script that uses "grep" or "find", I have strict requirements from them. I expect them to take certain arguments and interpret them in a certain way. If I can't simply replace "grep" with "haskell-grep" and get the same results (with stronger safety guarantees or performance), I may not have a use for them in certain scenarios.

The critique that "They are completely different programs that have borrowed the names." doesn't imply "they can't be taken seriously"; it implies they can't be used to replace the existing ones.
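A trivial illustration of that contract (my example, not from the thread): scripts lean on exact flag and exit-status semantics, so a "grep" that prints differently, or exits 2 where POSIX says 1, breaks them silently:

    # POSIX grep: exit 0 if a line matched, 1 if none did, >1 on error.
    if grep -q 'PermitRootLogin yes' /etc/ssh/sshd_config; then
        echo 'root login enabled' >&2
    fi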


That tone seems more you than Theo. OpenBSD isn't known as a first-mover operating system for new technologies. If the utilities are rewritten and not compatible, then it really isn't a rewrite, it's a new program with the same name as a previous program. Frankly, if they don't implement the standard, they aren't serious replacement efforts. They could be useful, and maybe the options omitted aren't important to the person implementing it, but they are not suitable to replace the current utilities.


There is gatekeeping here... but it's not implied, it's active and out in front. The entire conversation here is about whether Rust code is going to be permitted through the gate of the OpenBSD base system. You don't have to parse through implications to find the gatekeeping, it's the topic of the conversation.


I think it's OK for platforms to have principles and if those principles are going to yield anything of value you have to be fairly consistent and ruthless about them.

There was a little bit of editorializing there but if 'base builds base' is table stakes then start working on that coreutils project first.


> the gatekeeping implied here [...] is really toxic

It's Theo de Raadt, toxic rhetoric is sort of his brand.

But... he has a real point here. It's not about grep or cat or whatever, really; those are just the use cases OpenBSD would care about.

It's that as Rust is reaching the second decade of its history, and despite some outrageous wins in press, evangelism, and general developer mindshare... Rust just hasn't made much progress in replacing exactly the software that it claims to be intended to replace!

Where are the pervasively used compression libraries in Rust? Video and audio codecs? Network stacks? Database engines? System management utilities? PKI and encryption stacks? All that stuff is still in C. After ten years of Rust success!

It's not like no one is using Rust. It's not like it doesn't have success stories (even in the categories above). But they're comparatively rare, still. There's nothing in Rust that rises to the category of "stuff everyone just uses because it's what everyone uses" (c.f. zlib, libjpeg, readline...) yet.

And if there isn't after the first decade, what needs to change to make it happen in the second? I mean, is it maybe fair to say that the window is closing for Rust to take over a significant fraction of the systems programming world?


" There's nothing in Rust that rises to the category of "stuff everyone just uses because it's what everyone uses" (c.f. zlib, libjpeg, readline...) yet."

IMHO, that's basically not possible, so it isn't a fair ask. If the Rust community produced a drop-in replacement that is every bit as good as readline in literally every way... still nobody would switch, because why would they? What's the benefit? You spend all this effort just to trade evenly across. If the Rust community produces something better, nobody's going to use whatever those extra features are outside of Rust itself.

This is the category of software that will be last.

The question for Rust, and equally every other ambitious language, isn't really "What percentage of existing code will be rewritten in this new language?" (And by "rewritten", I don't just mean "reimplemented but only your community uses it", but in the sense you mean... it actually replaces the original.) If your answer is any number significantly over zero, that itself represents entrance into the absolute top tier of languages... this is very rare. The question for Rust et al is "What percentage of future code will be written in your language?" On that front, Rust is still on a pretty decent trajectory.


> IMHO, that's basically not possible, so it isn't a fair ask

Is it not? There is software in that category written in the last ten years (not as much as in the preceding decade, obviously). How about Docker? V8/node? Most of the wasm ecosystem dates from that period. Actually there was a huge window with TLS implementations once the community collectively decided to move away from openssl, too.

Rust didn't hit any of those sweet spots. And I think we should maybe be spending more time asking why.


"How about Docker? V8/node? Most of the wasm ecosystem dates from that period."

Not a replacement of any previous tech, not a replacement of previous tech, and not a replacement of previous tech, respectively. You were suggesting Rust must replace things like "zlib, libjpeg, readline". These are not the same sort of thing. It is "zlib, libjpeg, readline" that I was suggesting is far too high a bar to clear to say a language is succeeding; I mean, what other language has replaced them either? It isn't just Rust failing to displace C from those places, it's all the languages.

I don't even know what it would take for a language to actively replace C in those positions. What could a language possibly offer to overcome the cost of throwing all that code away?

"Actually there was a huge window with TLS implementations once the community collectively decided to move away from openssl, too."

This is closer to valid, but still a huge ask. Rust may not have done it but neither did anybody else; SSL is still C.

Now you've shifted the question from "Why doesn't Rust successfully replace existing tech?", which is what I was saying is effectively impossible, to the question "Why doesn't Rust have a killer app?". This is a fine question to ask; I don't attack you for asking it. I'm merely defending my original characterization.

Still, I suspect Rust is like Go; while there is definitely a significant set of apps now that are implemented in Go that people use because they are the best-of-breed and not because they are Go, I expect that usage is dwarfed by usage of Go that you never see. I see Rust coming up around me in places where it's just happening organically, unrelated to HN.


> You were suggesting Rust must replace things like "zlib, libjpeg, readline". These are not the same sort of thing.

They're all parsers of foreign input, something C is extremely fast at, but also extremely dangerous doing.

> I mean, what other language has replaced them either? It isn't just Rust failing to displace C from those places, it's all the languages.

But did other languages have the same ambitions?

As I understood it, Rust wanted to be safer but just as fast as C, which would make it an ideal language to replace those input parsers. It's a pity that still hasn't happened, because we'll keep seeing plenty of CVEs for those libs.


At a guess, because it is much easier to yell from the sidelines than to be exposed with your stuff out in production across major installations. Maturity of code is measured in blood, bugs and overtime, regardless of what language it is written in.


> It's that as Rust is reaching the second decade of its history

Sort of. It's only been five years since it was actually stable enough for production use. (with a few exceptions of folks who were really invested)


And:

> Where are the pervasively used compression libraries in Rust? Video and audio codecs? Network stacks? Database engines? System management utilities? PKI and encryption stacks? All that stuff is still in C.

How many of those projects in any language are less than five years old? The ubiquitously used stuff is mostly much older. But among the projects which are that young, Rust has substantial representation.


I agree UNIX base is not its niche.

But take a look at Firefox:

* Video and audio codecs - MP4 metadata parser, Audio backend

* Database engines - key-value storage backed by LMDB

* Network stacks - A QUIC implementation, SDP parsing in WebRTC

* PKI and encryption stacks - TLS certificate store

* And more https://wiki.mozilla.org/Oxidation

Just five years since the first stable release.


> Where are the pervasively used compression libraries in Rust? Video and audio codecs? Network stacks? Database engines? System management utilities? PKI and encryption stacks? All that stuff is still in C. After ten years of Rust success!

It's not ten or twenty years; the first stable release of Rust was in 2015. Pre-1.0 Rust was a wildly different language, with green threads, segmented stacks, and regular and wild breakage, not really a C replacement! Please understand this point.

But anyway here are some projects:

https://github.com/ctz/rustls a TLS library that uses https://github.com/briansmith/webpki a pki library

https://github.com/burntsushi/rust-snappy a compression library

https://github.com/tikv/tikv a database engine

https://github.com/hyperium/tonic a gRPC library

https://github.com/tock/tock an embedded OS

https://lib.rs/command-line-utilities lots of CLI utilities which include system management

You ask for "pervasively used", but this is not under Rust's control. It's not feasible to replace decades-old setups in five years.

The most widely deployed Rust stuff is in Firefox and some Gnome libraries AFAIK.

> I mean, is it maybe fair to say that the window is closing for rust to take over a significant fraction the systems programming world?

I don't see why this should be the case.


You make valid point, but you can't fault Theo for predicting a different future in 2017.


https://github.com/uutils/coreutils/commit/d4e96b33e34373399...

I read 2013, Theo de Raadt's email is from 2017, over 4 years later. I think we can fault him for being overconfident and not checking his facts.


That commit doesn't have any files in it. The first release of "uutils coreutils" appears to be this:

https://github.com/uutils/coreutils/commit/511f138e3856e2d69...

which is only twenty days young, and a 0.0.1 release to boot (whatever that means).


The first four utils were added 7+ years ago: https://github.com/uutils/coreutils/commit/9653ed81a2fbf393f...

This has been developed for a while. You're looking at the official GitHub releases, which don't mean much; it just means they started providing official compiled releases. Not all projects even do that.

I know a lot of people getting into Rust, including myself, started by reading through this repo. This was one of the first good real-world examples of Rust that I found.


> You're looking at the official GitHub releases which doesn't mean much.

Fermat would like Rust then.


> There has been no attempt to move the smallest parts of the ecosystem, to provide replacements for base POSIX utilities.

There was an attempt that started over 4 years earlier, and if I understand correctly that attempt was pretty far along when this email was written. We can argue over definitions, but I'm pretty sure he wouldn't have written this if he had been aware of this project.


There have been POSIX implementations in Ada since the mid-'80s.


Yes, and 2017 should be in the title of the post.


There is nothing sinister being implied by wishing to maintain the course that the OpenBSD project has taken. This leadership may not please all of the people all of the time (least of all non-dependent external projects), and it has no obligation to do so.


Do any major Linux distros think uutils is good enough to ship as the default? Are they at all involved?

They'd be infinitely easier to persuade to try something like this than a BSD. For OpenBSD it's probably always going to be a non-starter because of all the hardware they'd have to leave behind.


BSD does not use GNU coreutils. Is it really unexpected that the developer of a BSD project might have been unaware at the time of a project attempting to replace it, or simply found it incompatible with the stated requirement of POSIX-compliant base utilities? OpenBSD does not care about compatibility with GNU; for example, longopts do not work for many utilities, with exceptions for permissively licensed replacements of historically used GNU utilities (grep, for example).
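To illustrate (a made-up example; exact behavior varies by utility):

    ls --all    # GNU long option; fails on OpenBSD's ls (no long options)
    ls -a       # the portable POSIX spelling works everywhere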

It also does not address the concern of higher build times compared to the contemporary C counterparts, which also support far more platforms than Rust.

The use of cargo, for example, suggests that at least a few uu utilities make network connections at build time to fetch dependencies!

For example something as simple as chown:

https://github.com/uutils/coreutils/blob/master/src/uu/chown...


The use of cargo implies this no more than the use of make. One can fetch all dependencies and then build in a completely offline manner.
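Roughly, a sketch (flags as in recent cargo; details may differ by version):

    cargo fetch              # one-time, networked: download all dependencies
    cargo build --offline    # thereafter: builds with no network access

Or vendor the crate sources into the tree with `cargo vendor` and point .cargo/config at the vendored directory.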


> the gatekeeping implied here is really toxic

The words 'gatekeeping' and 'toxic' are toxic.


Agreed, and it's even more toxic to use "gatekeeping" in this fashion for somebody who is LITERALLY a gatekeeper for the project under discussion.


The linked mailing list message is from 2017, which may be after the linked project was started/viable.


No need to speculate: if you follow the GitHub link you'll see that the project is older than 2017.


You’ll also see that cp, ls, expr, test, and others are still only considered semi-done.


How many utilities were rewritten and shipped at the time of the post?


79 by my count, but I'd be happy to have someone correct me.

https://github.com/uutils/coreutils/tree/ef4d09ee3c067c280ba...

It looks to me like the only changes in that table since November 2017 are that `join` and `df` were moved from 'todo' to 'semi-done'.


Are they done, or did they move on to something else?


I don't know.


uutils was started in 2013, and was decently complete well before 2017.


Your definition of "decently complete" and Theo's are likely to be waaaaay different... and within your respective contexts, you are both correct.


Not only that, there are several POSIX implementations in Ada.

They just don't get exposure in FOSS land.


Is it 'really toxic', or just uninformed? Note that the email conversation was from a few years ago.


Arguably, someone in that position speaking so definitively while also being uninformed is at least a little "toxic."


I'd say toxic. When you point out that "nobody has done this" in this way, there is a strong implication that it's not possible, not reasonable, or not worth the effort.

Rust explicitly aims to be a systems language. It explicitly challenges C and C++ on their own turf. Implying that it's not suitable for serious systems work is insulting.

> Note that the email conversation was from a few years ago.

Remember that the project he said didn't exist was 4 years older than his email. And if I understand other comments correctly, that project was well known among Rust practitioners.

He made strong claims outside his area of expertise. An easy mistake to make, but one that has consequences when you're this famous. Such mistakes are totally fine in private, but in public… they're a little toxic.


Meta: if someone could explain to me where the 5 downvotes come from, I'd be curious to know which part of my comment triggered them.

Is Theo de Raadt above criticism? Is Rust actually failing at its primary goal? Are words like "toxic" and "insulting" under a soft ban? Right now all I have to go by is a dead flagged strawman from a throwaway account.


Don't worry about it. My comment seems to be up and down like a yoyo.


Not only does coreutils exist, there is also another thing which shouldn't be forgotten.

When you re-write a tool, _why not improve it_? (Especially given the chance that a 100% backwards-compatible rewrite will not be accepted anyway.)

For a bunch of tools exactly this happens: a new tool which provides (roughly) the same functionality but with a new (IMHO often better) interface, slightly different options, or different syntax (for good reasons). Etc.

An example of this is ripgrep (rg), which has already become a widely used replacement for grep. It's just not a drop-in replacement. (It couldn't do what it does if it were.)
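Compare (illustrative invocations):

    grep -rn 'TODO' .    # recursive grep: searches every file
    rg 'TODO'            # recursive by default; honors .gitignore, skips binaries

Defaults like that are exactly what a strict drop-in replacement could never change.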

