Thoughts on Rust bloat

ajxs · on Aug 21, 2019

At the risk of being slightly tangential, I've been sorely wanting to air this particular grievance with Rust for some time. It's somewhat related, since the author mentions their package system. Its package ecosystem isn't nearly in the horrible state that node's is, but having a package system shouldn't be a substitute for designing a useful standard library for a language. I think that the attraction to 'small languages' is very much misplaced. If I can't get through Rust's official documentation without being recommended the use of third party packages for basic functionality (getopt, interfacing with static libraries... Etc) then the designers have made a terrible error.

burntsushi · on Aug 21, 2019

Opinions on this are a dime a dozen. You often see the reverse of it too, for example, you might have heard that "Python's standard library is where things go to die." You could just as easily call that a "terrible error." The fact that Python's standard library has an HTTP client in it, for example, doesn't stop everyone from using requests (and, consequently, urllib3) for all their HTTP client needs. So despite the fact the standard library provides a lot of the same functionality as a third party dependency, folks are still using the third party dependency.

I think the size of the standard library is just one of possibly many contributing factors that leads to a large number of dependencies. I think a part of it is culture, but another part of it is that the tooling _enables_ it. It's so incredibly easy to write some code, push it to crates.io and let everyone else use it. That's generally a good thing, but it winds up creating this spiral where there's almost no backpressure _against_ including a dependency in a project. This means there's very little standing in the way of letting the fullest expression of DRY run wild. There are some notable examples in the NPM ecosystem where it reaches ridiculous levels. But putting the extremes aside, there's a ton of grey area and it can be pretty difficult to convince someone to write a bit more code when something else might work off the shelf. (And I mean this in the most charitable way possible. I find myself in that situation.)

I do hope we can turn the Rust ecosystem around and stop regularly having dependency trees with hundreds of crates, but it's going to be a long and difficult road. For example, not everyone even agrees with my perspective that this is actually a bad thing.

mantap · on Aug 22, 2019

Python's standard library is where things go to die because of the terrible adhoc versioning system (the module name is the version number) and dynamic typing means they are afraid to change anything. But even then it's still better than having no standard library at all.

The advantage of a standard library is that you only need to learn one API instead of a dozen different APIs for doing the same thing, which means you can develop a degree of mastery over it. It also reduces the friction for using better abstractions. e.g. Every professional Python programmer knows defaultdict, whereas I rarely see that data structure used in other programming languages, it's too much of a leap to install a dependency to save a few if statements, but it all adds up.

swsieber · on Aug 22, 2019

> The advantage of a standard library is that you only need to learn one API instead of a dozen different APIs for doing the same thing, which means you can develop a degree of mastery over it.

The rust ecosystem has done well to converge on certain crates as sort of replacement for missing std features.

In practice (at least in the rust ecosystem), I only need to learn one interface for:

* regex (regex)

* serialization (serde)

* network requests (request)

There are de-facto base crates in the ecosystem.

kd5bjo · on Aug 22, 2019

As a relative outsider, it’s not obvious at all that these are the right crates to choose. I appreciate the commitment to long-term stability that the standard library appears to have, but that benefit goes out the window if I accidentally rely on a third-party crate that changes its API every six months.

Looking at crates.io, regex looks pretty safe, as it’s authored by “The Rust Project Developers” and includes explicit future compatibility policies. Unfortunately, I can’t find an index of only the crates maintained by the Rust team.

Serde is obviously popular, but at first glance is a giant Swiss Army knife that will likely have lots of updates to keep track of that are completely unrelated to my project (whatever it is). If I search for JSON, I get an exact match result of the json crate, followed by a bunch of serde-adjacent crates, but not serde itself.

Request hasn’t been updated in 4 years, and has a total of less than 7000 downloads.

nicoburns · on Aug 22, 2019

They probably meant reqwest (https://github.com/seanmonstar/reqwest), not request. Reqwest is maintained by the same developer (seanmonstar) as hyper, the de facto standard http library.

swsieber · on Aug 22, 2019

Ah, I did. I posted that on my phone. Autocorrect strikes again.

throwaway8941 · on Aug 22, 2019

Because it's "reqwest", not "request".

All these libraries are very well known within the community and are what I would come up with as a complete outsider (I don't think I've written more than a hundred lines of Rust code to this date).

You can also find some pointers here:

https://github.com/rust-unofficial/awesome-rust

https://lib.rs

roblabla · on Aug 22, 2019

There's actually a more official resource: the rust cookbook[0]. This is maintained by the rust-lang team (rust-lang-nursery is an official place for crates maintained by the rust language maintainers).

[0]: https://rust-lang-nursery.github.io/rust-cookbook/

Freak_NL · on Aug 22, 2019

That sounds like something that could be solved by having crates.io provide a curated list of common popular crates for certain features. That is, this seems to be mostly a documentation issue.

roblabla · on Aug 22, 2019

This was attempted by the Libz Blitz back in 2017. See the rust-cookbook: https://rust-lang-nursery.github.io/rust-cookbook/ .

I agree that this should be better documented and probably more integrated with crates.io somehow.

kd5bjo · on Aug 22, 2019

It’s really a reputation bootstrapping problem, for which popularity can be a useful proxy. For me to use third-party code, I have to trust that the future behavior of the developers will be reasonable: I want my side projects that don’t get touched for months or years to still mostly work when I get back around to them.

Not everyone or every project will have the same desires, though. Sometimes, a fast-moving experimental library is the right choice. The trouble is figuring out which I’m looking at.

xenocratus · on Aug 22, 2019

I'm not sure I follow these concerns about "working in the future" - as long as you specify versions that work for you in your Cargo.toml file, that should work at any point in the future given that you use Rust 1.x.

If you'll want to update to always be on the latest version of each crate, well that discomfort about them potentially not working is part of the price.

kd5bjo · on Aug 22, 2019

If I come back to something, it’s because I want to resume active development. Keeping a dependency pinned at an old version makes that more difficult in various ways, so I personally value forward compatibility.

Not everyone does, and that’s fine. I just want to know what a library developer’s stance on it is before I try to use their library.

p0nies · on Aug 22, 2019

Anyone can search for "awesome [lang]" when they want such a thing.

pjmlp · on Aug 22, 2019

Do they also compile and run across all platforms supported by rust compiler?

Because that is the biggest asset from stuff being in the standard library.

Ygg2 · on Aug 22, 2019

No, but not all parts of std run on all supported platforms.

To clarify, there are different levels of support on different platforms.

pjmlp · on Aug 22, 2019

Then it already starts with a failure of quality gate of what goes into std.

nicoburns · on Aug 22, 2019

Not really. Rust supports 8 bit microcontrollers. Lot's of libraries, including parts of the standard library make no sense on these kind of platforms.

The standard library and 3rd party crates generally have excellent compatibility across mainstream platforms.

pjmlp · on Aug 22, 2019

So do Basic, Ada, Pascal, C and C++ as well, and yet they have richer standard libraries, with deployment profiles.

floatboth · on Aug 22, 2019

Since when does libc include "rich" things like regex, serialization and http?? Even C++ standard lib is mostly containers and algorithms.

libc is barely a standard library, it's almost nothing.

pjmlp · on Aug 22, 2019

C might not include those specific examples, but POSIX + libc already include plenty of stuff.

C++ surely does include regex, with serialisation and http scheduled for C++23, or with luck with a TR as intermediate delivery.

Although serialisation depends on static reflection being finalized as well.

Ygg2 · on Aug 23, 2019

Still, it implies C fails, because you don't get POSIX on all available platforms.

I think the definition is meaningless. Rust spends huge amount of resources to test itself and its surrounding ecosystem on tier 1 platforms.

I'm fairly certain regex from c++ std will run like crap, if at all on something with 4MB of RAM.

pjmlp · on Aug 23, 2019

Except the small detail that C's POSIX support is much wider than Rust's tier 1 platforms.

I am fairly certain that without profiling and defining a test configuration for a set of specific C++ compilers / standard C++ library I won't assert anything about std::regex performance with 4 MB of RAM.

Ygg2 · on Aug 23, 2019

Perhaps. But you're comparing a 26 year old standard for a 47 year old language, to a language that's been stable for 4 years.

howdydoo · on Aug 22, 2019

>deployment profiles

In other words, not all functions in the stdlib of Ada, Pascal, C and C++ can be used in all possible target environments? Sounds like a failure to quality gate those standard libraries.

pjmlp · on Aug 22, 2019

Nice jab try there, the suble point you are missing is that a deployment profile still is a guarantee of support.

There aren't an endless number of profiles.

ajxs · on Aug 22, 2019

I'm not sure if you're just being disingenuous here, but you're right that you're not going to be able to use all functionality from the stdlib of Ada ( and others ) on every possible target, but you were never, ever going to. And Rust certainly won't solve this problem for you. It's not a consequence of poor standard library design either. It might not be immediately obvious, but even C has a runtime library, which needs to be specific to the architecture and OS that you're targeting. Just for a quick example, `malloc` is going to need to function differently depending on what OS you're running, and if you're targeting a microcontroller with extremely limited RAM it might not even need to be implemented at all.

nicoburns · on Aug 23, 2019

I don't think the parent was claiming the Rust was better in this regard, just that it was no worse. Other languages also restrict standard library features on some platforms.

Ygg2 · on Aug 22, 2019

Not exactly. Rust can be made to run on 16-bit toaster or even OsIJustWrote. Just because it can be run, doesn't mean Rust std lib devs will support 16-bit toasters or OsIJustWrote.

Each platform has different levels of support. Primary being Windows, Mac and Linux, where every pure Rust crate runs.

Std lib makes certain reasonable assumptions, for which it works e.g. malloc exists and panic! is implenented.

stdbrouw · on Aug 22, 2019

In Python, if you're running 3.7.1 then you're also running the standard library for 3.7.1. Sure, I guess it would be possible for a programming language to decouple these things so that it's possible to ask for a particular version of the standard library (in its entirety), or a particular version of a standard library... but then programmers can no longer rely on the standard library to "just work" and "just be there", which is its appeal. If you decouple the standard library from the language, might as well switch to a Rust-like system where you simply give an official stamp of approval to certain packages regardless of who developed them.

burntsushi · on Aug 22, 2019

I think you might have misunderstood me. I was contrasting one extreme interpretation with another. I was not really criticizing Python. Its large standard library is one of the things I like about it.

EdwardDiego · on Aug 22, 2019

Java 8 saw Map obtain the method computeIfAbsent, it saves a bunch of boilerplate, just like defaultdict.

https://docs.oracle.com/javase/8/docs/api/java/util/Map.html...

bad_user · on Aug 22, 2019

> dynamic typing means they are afraid to change anything

When talking of the standard library, static typing doesn't save you when breakage happens. It's better of course, at least the compiler protects you from obvious errors (although that doesn't work for transitive dependencies, with the dynamic linking to binaries that Java / the JVM does ;-))

The problem is when a piece of code that was compiling fine a year ago, fails to compile on a newer version of the standard library, due to breaking changes, that's going to take time and effort to fix.

And this gets worse when the breakage happens in dependencies and those dependencies are no longer maintained. This can always happen of course, not just due to the standard library, but due to transitive dependencies too. But still, breakage in the standard library, or in libraries that people depend on, is a bad thing. And consider that as the number of dependencies grows, so does the probability for having dependencies that are incompatible with one another (compiled against different versions of the same dependencies).

And semantic versioning doesn't work. Breaking compatibility will inflict pain on your downstream users, no matter how many processes you have in place for communicating it. And this is especially painful when you're talking about the standard library.

If the standard library introduces breaking changes, regardless if the language is static or dynamic, then it's not a standard library that you can trust. Period.

Also — when should you break compatibility, in the standard library or in any other library?

The answer should be never!. When we want to change things, we should change the namespace and thus publish an entirely new library that can be used alongside the old one. Unfortunately this isn't a widely held viewed, but I wish it was.

---

Going back to the batteries included aspect of some standard libraries, like that of Python, there's one effect that I don't like and that's not very visible in Python since the bar is pretty low there.

The standard library actively discourages alternatives.

When a piece of functionality from the standard library is good enough, it's going to discourage alternatives from the ecosystem that could be much better.

Some pieces of functionality definitely deserve to be "standard". Collections for example, yes, should be standard, because libraries communicate between themselves via collections. And that's what the primary purpose of a standard library is ... interoperability. Anything else is a liability.

raphlinus · on Aug 22, 2019

In my post, I specifically call on proc-macro support (syn and quote) plus rand (not all of rand though, just the "give me a random number" functionality that comprises 99% of the use cases but 10% of the implementation complexity) to be added to the Rust standard library. But I feel these are particularly justified because they're already present, just not accessible. Overall, I think Rust's "batteries not included" approach is a good tradeoff.

burntsushi · on Aug 22, 2019

I don't necessarily disagree. I have my own pet things that I'd like to see in std too. I've long wanted to see lazy_static in std. Funnily enough, it looks like `once_cell` might wind up being a nicer approach to achieving a similar end, but without using a macro. So if we had added it to std many years ago, we might find ourselves with an API that we regret! It's tough.

With that said, if `syn` finds itself in a spot where it needs to do a breaking change release for a new language feature, then that would be tricky. It sounds like syn's architecture is pretty flexible (non-exhaustive enums), but it's not clear to me that it could support all possible language additions without breaking changes.

kbenson · on Aug 22, 2019

> So if we had added it to std many years ago, we might find ourselves with an API that we regret! It's tough.

That is the ultimate problem of batteries includes vs a small standard library distilled into a specific example.

People used to tout Python's "batteries included" line as positive almost universally a decade and more ago. Then better replacements were developed, and those batteries started looking less appealing.

There's really no escaping that while including any base functionality. Either you include it and portions will be stale later as better interfaces and paradigms are developed, or you don't, and you risk slower adoption and harder usability as people need to figure out solutions for common tasks, even if that solution is as simple as find the crate that provides it. Because eventually that becomes a problem not of finding the crate, but finding the best crate out of the multiple that exist, which itself causes fragmentation of the common developer experience (and thus makes it harder to share knowledge and have a good community).

galangalalgol · on Aug 22, 2019

I think I get the tradeoffs but while my company would be a perfect fit for rust, we do all our development on an airgapped network. Custom registries are a thing now, but picking and choosing which packages our IT will consider trustworthy and then taking the subset of that with a license our legal will approve an then prunimg things with dpendencies that are now missing is a huge task that needs to happen at regular intervals and probably leaves some pretty big gaps in functionality. Its a shame, I like the language.

burntsushi · on Aug 22, 2019

At least from a licensing perspective, you should be all set. Virtually everything in the Rust ecosystem is permissively licensed.

Trustworthy is another thing altogether though. How do you handle this process in C and C++? Both of those languages have fairly spartan standard libraries as well.

mmastrac · on Aug 22, 2019

> Virtually everything in the Rust ecosystem is permissively licensed.

I just did our license audit and 100% of our shipped deps were Apache | MIT. If you can clear those two licenses with legal, you should be good to go for virtually any crate.

Might be worth releasing my one liner for this if anyone else finds it useful.

sansnomme · on Aug 22, 2019

Do it! This is HN after all, no better place to gain free karma and appreciation for your command line prowess

ajxs · on Aug 22, 2019

That does sound pretty useful!

Qworg · on Aug 22, 2019

Please do.

ajxs · on Aug 22, 2019

The spartan C standard library was what my comment regarding 'small languages' was referencing. I don't have much experience with many of C's contemporaries, but for an opposing view take for instance the standard library of Ada ( I use the term 'standard' here to connote the library that is mandated as per the standard of these languages ). It is definitely orders of magnitude larger than that of C, and takes a fundamentally different perspective. Ada's standard library, while dated by modern standards, looks more like it was designed to address the specific needs of its domain. Mind you, it could be argued that C's stdlib was also, if you restrict its domain to 'OS development'. My point is that for a language like Rust, that does not have a tiny stdlib like C, its standard library should address the common use cases required by its developers. Node.js is another bad offender here. If you look at the most popular npm modules, you'll see things like 'body-parser' and 'async', which clearly show gaps in the functionality that Node's stdlib is catering for.

pjmlp · on Aug 22, 2019

Which is only true if we forget about POSIX support that everyone kind of expects in every platform that offers C compilers.

pjmlp · on Aug 22, 2019

Actually they don't, C standardization just dumped that standard library into POSIX instead, which I why all major OS end up supporting POSIX if they want to make C developers feel at home.

And you usually see them cursing the platforms that don't care about POSIX support.

And ISO C++ has repented themselves from following C's footsteps and have been improving the standard library since C++11, mostly by integrating boost libraries.

burntsushi · on Aug 22, 2019

I don't understand what point you're trying to make. All I did was ask a simple question: if trust in Rust is hard, how do you handle trust in other ecosystems? The details on exactly how big or how small the standard libraries are (or why they are that way) are less important for this particular question. Consider the size of Python's standard library vs C or C++ or Rust. The size of C or C++ is much closer to Rust's size than Python's size.

Also, while some of your historical context is interesting, I don't really appreciate your editorialization, which I often find is off the mark personally.

pjmlp · on Aug 22, 2019

My point is that although many only think about ISO C libc when talking about C's standard library, the reality is that with a few exceptions, libc always goes alongside POSIX across the large majority of platforms with a C compiler.

So in reality POSIX complements libc as C's "runtime platform", even though ISO C never considered to make libc that big.

My historical context is how I experienced how things went, throughout the media we had available at the time.

I surely welcome factual corrections when I fail off the mark.

Everyone benefits from learning proper history facts.

burntsushi · on Aug 22, 2019

> My point is that although many only think about ISO C libc when talking about C's standard library

I wasn't. Even with POSIX, it is spartan by today's standards. Look at the standard libraries of Python and Go. POSIX doesn't have JSON (de)serialization, HTTP servers, XML parsing and a whole boatload of other crap. So I don't think there is anything wrong with my characterization.

orf · on Aug 22, 2019

> not all of rand though, just the "give me a random number" functionality that comprises 99% of the use cases but 10% of the implementation complexity

Are you suggesting including the core `rand` crate with some traits and a basic pseudo-random generator in the stdlib, and allowing other crates to use those traits to implement the many other[1] RNGs?

I'm honestly not sure this is a good idea, it seems the rand crate has undergone a few iterations, crystalizing this inside the stdlib might lead to problems?

1. https://rust-random.github.io/book/guide-rngs.html

raphlinus · on Aug 22, 2019

I admit, design of a proper random number library API is tricky, and experience from deploying rand will no doubt be invaluable. The point I was trying to make is that some use cases require sophistication such as being able to choose different algorithms, but a lot of the times rand shows up in a build time histogram, it's just because the user wanted some pretty good random numbers; a much simpler API would suffice.

anp · on Aug 22, 2019

I agree that in many cases a much simpler API would suffice, but it strikes me as odd to suggest adding a convenient but insufficient-for-security API to the same standard library that protects HashMap against DoS.

vunie · on Aug 22, 2019

The request here is for a simpler API. I and many others would be more than happy to get a function that returns a random u64. No traits, no crippling commitments to a specific API design.

anp · on Aug 22, 2019

The API you describe, IIUC, can’t be serviced faithfully on the OSes targeted by std while being backed by real entropy.

ChrisSD · on Aug 22, 2019

But the std hashmap implementation[0] already depends on a PRNG via the rand crate[1], and thus OS random (for the initial seed). So it's in the standard library even if you can't use it directly. Which is what makes me think this is an API issue, not an implementation one.

[0]: https://doc.rust-lang.org/std/collections/hash_map/struct.Ra...

[1]: https://docs.rs/rand/0.7.0/rand/rngs/struct.ThreadRng.html

anp · on Aug 22, 2019

It does depend on this yes, but the random seed isn’t part of the public API. An RNG API in std would need to pick an algorithm and make that part of its stability guarantees.

ChrisSD · on Aug 22, 2019

In this case I'd quibble that it wouldn't need to guarantee a specific algorithm, just a minimal set of properties. If the user can't manually set the seed then they'll be no expectation that it's deterministic or follows any particular algorithm. So the precise implementation can change with versions or even platforms.

Javascript's Math.random and Crypto.getRandomValues works this way.

anp · on Aug 22, 2019

Id say that in Rust’s case, we’ve already seen several painful examples of Hyrum’s law.

http://www.hyrumslaw.com/

Macha · on Aug 22, 2019

Which is fine. Not everything needs SecureRandom. I'd wager more code using random numbers is for A/B tests, sample data generation or video games than people implementing crypto.

I get that security people don't want non-secure RNGs to exist out of fear that someone might try implement security related functions out of them but why should I care if I just want to choose between 3 types of enemies this wave?

Sean1708 · on Aug 22, 2019

Why can't a simple API be backed by real entropy?

anp · on Aug 22, 2019

(past edit window) This was written with hostility I don’t feel. Sorry Raph!

raphlinus · on Aug 22, 2019

No worries, it was read with offense I didn't take. I agree, designing a good random number API is in fact hard. I just think we can give developers a better out-of-the-box experience.

Dylan16807 · on Aug 22, 2019

Having a very simple API is not at odds with security, though. A simple API could automatically seed from the OS and then run a tiny loop with sha2 or sha3, for example.

vunie · on Aug 22, 2019

SHA2 is not designed to be used as a RNG. SHA3 might be a little bit better with its "streaming" modes, but there are many far better and faster ways to get random bits.

Dylan16807 · on Aug 22, 2019

It doesn't have to be the absolute fastest, it just has to work.

What is far better and faster than SHA3?

And while SHA2 wasn't designed for that use, it's easy to make simple and provably correct constructs that turn a secure hash into a secure RNG.

heavenlyblue · on Aug 22, 2019

You’re speculating to prove a point.

I would definitely not want my std to be designed this way.

Dylan16807 · on Aug 22, 2019

1. It was just an example.

2. You definitely don't want your std to be designed with a simple API to a fast, secure, and popular crypto primitive? Pretend I named your favorite one, to avoid bikeshedding issues.

edflsafoiewq · on Aug 22, 2019

ChaCha20 is widely used for CSRNG.

Dylan16807 · on Aug 22, 2019

Agreed, that would work fine. But for a generic RNG function, I feel like the difference between 3 and 7 cycles per byte isn't too impactful.

ChrisSD · on Aug 22, 2019

I would like getrandom[0] to be in the standard library. It simply abstracts over the platform random and is even used by the standard library itself.

[0]: https://crates.io/crates/getrandom

est31 · on Aug 22, 2019

Inclusion of parts of rand might be a good idea, but syn seems to need breaking changes as for long as the Rust language is getting new features:

> Be aware that the underlying Rust language will continue to evolve. Syn is able to accommodate most kinds of Rust grammar changes via the nonexhaustive enums and Verbatim variants in the syntax tree, but we will plan to put out new major versions on a 12 to 24 month cadence to incorporate ongoing language changes as needed.

https://github.com/dtolnay/syn/releases/tag/1.0.0

raphlinus · on Aug 22, 2019

Since stdlib and compiler releases are tightly coordinated, that shouldn't be a problem, no?

est31 · on Aug 22, 2019

My core argument is that libraries usually do need breaking changes, and including them into std turns them into zombies that can never be removed. I think python has issues with this. And the syn example that I pointed out above shows that Rust isn't immune from this either.

anp · on Aug 22, 2019

How do you accommodate breaking changes to the public interface of std?

raphlinus · on Aug 22, 2019

Right, I see your point here. There are non-trivial issues that need to be worked out, which I'm sure is one reason it's not in the library yet. Another (mentioned elsewhere) is that it takes time to converge on what a good API might look like.

heavenlyblue · on Aug 22, 2019

Well, with non-std I can always downgrade or upgrade the library.

With std I am bounded by the current language version.

snovv_crash · on Aug 23, 2019

You can only down or upgrade if uour dependencies don't depend on it too.

Having a defacto standard library has more issues IMO than a real standard library.

ende · on Aug 22, 2019

I feel like a reasonable middle ground to these issues is for communities to perhaps embrace “metapackages” that serve as community maintained “standard libraries”. In the R community we have tidyverse, which is practically a full mirror of R’s standard libs at this point.

These metapackage communities can then focus on interoperability of constituent packages w/o overburdening the standard libraries, and core language can remain lean and concise.

eden_h · on Aug 22, 2019

Tidyverse is written by a strong contributor to R, but it isn't used regularly by a large percentage of the community, so I'd be averse to calling it part of the standard libraries - although I appreciate this is what you're referring to with the speech marks.

On point of the post, Tidyverse is a great deal of bloat by loading unnecessary packages instead of specifics and creating namespace issues where two functions have the same name. It's generally fine for interactive work, but causes so many issues in development as you can't pick and choose what's loaded into the environment.

In line with what you're saying, meta-libraries make sense in terms of developing a line of packages towards a singular vision, but only make sense when the core libraries aren't expanded regularly. Maybe in these cases more of push needs to be made to supporting the existing tools?

R is an odd example though, as the standard libraries are loaded by default and I think only really comprise of basic stats/graphics/data.frame tools. I don't really believe a great deal of extra tools have been added to base R in the last few years, just improved in terms of speed and memory use (e.g 3.5.0's change to compiled packages)

steveklabnik · on Aug 22, 2019

Several attempts have been made at this in Rust in the past, but few people use them. It just adds even more dependencies that you’re not actually using.

ende · on Aug 22, 2019

Sure, but that's why it's optional.

jchw · on Aug 22, 2019

You mention Python, but I am pretty sure you've written some popular Go packages, so what is your opinion on the Go standard library? I think it's a good example of 'batteries included' in the right sense. Though, some of the reason its good may just be virtue of the fact that it's newer and there's less rotting packages; I guess only time will tell for sure, but it definitely feels right to me in many cases.

burntsushi · on Aug 22, 2019

I like Go's standard library. It's well designed. The number of pitfalls is pretty small.

I have more thoughts, but they are very hand wavy and ill-formed, so please take them with a grain of salt. One of my theories for why the Go standard library has had as much success as it has, is that it doesn't necessarily provide implementations that go as fast as reasonably possible, and that tends to give more flexibility for exposing simpler APIs. A good microcosm of this idea is JSON (de)serialization. Without even blinking, I can think of three reasonably popular third party JSON (de)serialization libraries in the Go ecosystem. The one provided by the standard library is pretty slow compared to some of them. It's not clear to me that it can be fixed without changing the API. But it's a good example where the standard library has provided something, but it isn't good enough in a lot of cases, so folks wind up bringing in a third party dependency for it anyway.

But even that alone isn't necessarily a bad thing. encoding/json is likely good enough for a really large number of use cases. On top of that, it's very convenient to use. (I'd still take serde in Rust over Go's system any day, but that's a different conversation.) And this kind of fits within Go norms pretty well. Go was never built to be the fastest, so the fact that some of its standard library has perhaps sacrificed performance for some API simplicity is totally consistent with that norm. And I don't think that norm is a bad thing.

There are some other examples where Go's standard library is slower than what it could be, for example, CSV parsing and walking a directory hierarchy.

Overall, I think the balance struck by Go's standard library was very nicely done. However, I'm not convinced it could have been replicated by Rust. The reasons for that are just guesses, and wander too far into musings about how the language is itself developed. But even putting that aside, Rust is going to have stricter requirements, because people tend to gravitate toward Rust when performance is important. So if std doesn't provide the fastest possible thing, then it's going to be a bigger deal than if Go does the same thing.

Again, above is super hand wavy and just a bunch of opinions from my own personal perspective.

jacques_chester · on Aug 22, 2019

> I like Go's standard library. It's well designed.

It feels to me more like code incidental to other purposes than a library-as-library. Some folks like that and extracting from a concrete use is often instructive and efficient.

On the other hand, there is a truly maddening inconsistency in whether you get an interface or a physical struct.

One of these lends itself to easy replacement, injection and mocking.

The other lends itself to writing the nth-tillion interface wrapper for the parts of the standard library which use physical structs. Which are, naturally, slightly different from and therefore incompatible everyone else's bangzillionth interface wrapper for the parts of the standard library which use physical structs.

Then there's errors. But that's another day's rant.

jchw · on Aug 22, 2019

> On the other hand, there is a truly maddening inconsistency in whether you get an interface or a physical struct.

That’s an issue, though it’s also one with most Go libraries too.

The thing is though, unnecessary interfaces also kind of suck. It makes code harder to follow, when the concrete type is hidden behind an interface. Also, one of the things that’s common in Go is testing with real implementations, rather than mocking - and I used to do just that. Even with Redis, I had a tiny shim server that implements the Redis protocol, that I would use in tests. Not absolutely everything can be done efficiently this way, but the virtues of testing with real clients are hard to ignore. Mocks and stubs can hide a lot of bugs that you would need to hope are caught by slower integration or e2e tests... And by virtue of being slower, they generally would cover less branches, too.

> Then there's errors. But that's another day's rant.

Have you kept up with the latest? I think they’re headed in the right direction with errors. Specifically with Is and As, along with the %w directive. The %w directive is a bit weird, but honestly, it’s a clever solution, and it seems like it would work.

jchw · on Aug 22, 2019

Thanks, that seems reasonable. I still hope Rust can strike a better balance than it has now, which perhaps will become easier as champions emerge in the ecosystem. (Crossbeam stuff in standard library would be nice.)

sansnomme · on Aug 22, 2019

A better choice would be enterprise titans like Java and .NET, more bases are covered. Go's GUI story for example is just endless fragmentation compared to e.g. Java.

jchw · on Aug 22, 2019

Apples and oranges. It is a feature that Go, Rust don't have "enterprise" standard libraries like Java and .NET.

pcwalton · on Aug 22, 2019

This comment indicates the problem! Taking the scare-quotes "enterprise" as a synonym for "bloat", one person's necessary feature is another person's bloat.

.NET is an interesting case, because of WinForms, which is a de facto part of the standard library. I think few C# developers think of WinForms as bloat; it's very convenient for making simple UIs. Yet putting, say, GTK+, in the Go standard library would doubtless be considered bloat. I don't think there are easy answers to these questions.

grumpydba · on Aug 22, 2019

> it's very convenient for making simple UIs.

It's not that convenient on Linux though...

GordonS · on Aug 22, 2019

Agree, the "enterprise" jibe here is misguided - as an OSS developer, I find the dotnet standard library to be fantastic. I don't find it to be at all bloated or 'enterprisey'.

sanderjd · on Aug 22, 2019

Yeah, see, I think it's anachronistic and odd that Java comes with GUI stuff included! That seems niche and better relegated to a library to me.

jclulow · on Aug 22, 2019

It wasn't, though, in 1995. Though it is perhaps less fashionable now, it was certainly something a lot of programs used at the time.

sanderjd · on Aug 22, 2019

This is why I chose the word "anachronistic". It seems of another time, because it is. It's definitely hard to figure out what will and won't be timeless, but it isn't hard to look back with hindsight and point out things that definetely weren't.

sansnomme · on Aug 22, 2019

Yes, and instead now we have devs forced to build SPAs and architect every single thing as a client-server app with a database attached. I get it, SaaS is great for vendor lock-in. But not every in-house tool needs to be run as a service.

quickthrower2 · on Aug 22, 2019

The culture of the programmers comes into it too. In the Java/.NET world devs are happier to take what the core libraries provide.

A case in point is how the ORM Entity Framework that comes with .NET has made the older NHibernate (a separate package) obsolete.

.NET developers love using the standard libs, but OTOH Microsoft has a lot of resources to create very complete libraries.

chc · on Aug 22, 2019

Even in Java, if you're looking to do anything with dates, everyone will tell you to use Joda-Time over java.util.Date.

sanderjd · on Aug 22, 2019

I think not anymore since java.time - which is incredibly similar to Joda - came around? This is actually an illustrative example of the process I like and hope rust will develop over time: let the community reach a consensus on the best third party libraries things, then consider pulling them in to or at least taking the best parts of their APIs for the standard library.

bluejekyll · on Aug 22, 2019

But won’t the community converge on that anyway, and in a similar amount of time? So why add it to the stdlib?

tikkabhuna · on Aug 22, 2019

Other Java stdlib packages can't depend on Joda-time. If it wasn't added and I used joda-time I'd have to convert to the old datetime classes if I wanted to use it with stdlib packages.

Another example was CompletableFutures which were inspired by ListenableFutures from Guava.

I can now use these with guarantees that they will be stable as Java has strong commitments to backwards compatibility.

erik_seaberg · on Aug 22, 2019

The JRE should be self-sufficient. By bundling java.time, it can finally start offering methods that take and return those types, instead of the current jumble of millis, nanos, long+TimeUnit pairs, Dates, and Calendars.

sanderjd · on Aug 22, 2019

There's a cost to discovering the consensus choices that I think inclusion in a standard lib minimizes. But there may be other similarly good ways to accomplish this. If I'm remembering correctly, doesn't Rust have a set of libraries that aren't in the standard library but are somehow vouched for? Maybe that's a similarly good approach to solve the discovery problem, I'm not sure.

steveklabnik · on Aug 22, 2019

There are libraries that are maintained by the project itself, but are not part of the standard library, yes.

sanderjd · on Aug 22, 2019

Ok thanks. Is the intention for that to grow, as a curated set of libraries that isn't quite the standard library? Or so you think some of those will move into the standard library if they become canonical or stable enough?

steveklabnik · on Aug 22, 2019

We’ll see; it’s not a simple thing.

In theory they can move into the standard library, but none ever have. The process exists though.

raphlinus · on Aug 22, 2019

I can think of a bunch of stuff that has moved or is moving from third party crates into the standard library: parking_lot, hashbrown, (minimal) Future trait. But I agree that no crate has moved wholesale into the standard lib.

steveklabnik · on Aug 22, 2019

parking lot and hashbrown moved their internals, replacing ones that already existed. This is probably a distinction that doesn’t actually matter but in my head it’s different for some reason, thanks for pointing it out :)

kd5bjo · on Aug 22, 2019

Is there a list of these somewhere? Crates.io doesn’t appear to let me sort by author.

throwupaway123 · on Aug 22, 2019

This is the GitHub repository

https://github.com/rust-lang-nursery/

steveklabnik · on Aug 22, 2019

We’re in the process of deprecating the nursery, so they’ll end up in the rust-lang org at some point.

quantummkv · on Aug 22, 2019

Stability? Guarantee that the api will continue to work and that no one will wake up one morning and left-pad the entire ecosystem?

darksaints · on Aug 22, 2019

That's true if you're still on Java 7. java.time.* is the go-to now, and it is very high quality.

pjmlp · on Aug 22, 2019

Only if one is stuck in pre-Java 8.

GordonS · on Aug 22, 2019

> A case in point is how the ORM Entity Framework that comes with .NET has made the older NHibernate (a separate package) obsolete

This did happen over time, but NHibernate was still really popular for a long time after EF came out, because of limitations it had.

I also don't think it was entirely because EF existed - over time, EF implemented more and more features that NHibernate had, yet at the same time it seemed like the NHibernate team had given up - there were no updates to it for a long time. It was the lack of updates that moved me to EF, but I always preferred NHibernate.

tjalfi · on Aug 24, 2019

My goal for choosing .NET libraries is to minimize the total complexity of our software stack. [0]

I won’t add a library to our stack if we have an existing solution.

Most of our apps are written in C# and use Entity Framework with either ASP.NET MVC or Windows Forms.

I would need an excellent reason to use something else.

[0] https://mcfunley.com/choose-boring-technology

stuartd · on Aug 22, 2019

> .NET developers love using the standard libs

I may not be a typical .Net developer, but my personal feeling is that this is too general and somewhat glib.

If you're inexperienced you will (and should!) choose the default option, if there's one available, and the most commonly used option if there isn't a default. So, if you wanted an ORM, then before EF there was NHibernate. But after a while you become able (from painfully gained experience) to determine what you want from a tool and what trade-offs you want to make, so you might use something like Dapper or no ORM at all.

Personally, I have found some of the MS provided implementations - shall we say - less than optimal. The Unity Framework. EF (which I dearly wish I have never had to suffer using). MSTest. Enterprise Library (oh god ugh).

So I make other - informed - choices about what to use. Some things I keep using because they are 'good enough' and I have a library of utilities and a mental map of how they work - log4net, NUnit - and some things I find I just don't need any more (mocking libraries, for one).

MS still support their provided implementations - as they should - but even their own projects can and do use third-party frameworks rather than the MS provided implementation (for example, Bot Framework used AutoFac (back when I was using it, anyway) [1]) - because their developers have been released from the requirement to exclusively use MS provided frameworks, and are making their own choices about what to use in their projects, and consequently what their users should use when using those tools.

Eventually, of course, if a tool is so crucial that it becomes part of .Net itself - the best example I can think of is dependency injection in .Net Core, which is in Microsoft.Extensions.DependencyInjection - you'd have to be mighty stubborn to use anything else.

One other point: I wouldn't describe NHibernate as 'obsolete', but given their historic and chronic inability to keep their documentation sites live and working ([2] referenced from [3]) it's easy to get that impression. But people are certainly still using it [4]. Just not as many as there used to be.

[1] https://github.com/microsoft/botframework-sdk/issues/938 [2] http://www.nhforge.org/doc/nh/en/index.html [3] https://ayende.com/blog/4139/nhibernate-documentation [4] https://stackoverflow.com/questions/tagged/nhibernate

quickthrower2 · on Aug 22, 2019

I'm not saying .NET developers will never use other libraries as an alternative, but they wont invent wheels (or used reinvented wheels) where the .NET provided one is solid. I am speaking generally of course. And there are many developers who are .NET + something else. I'm .NET and dabbled with Haskell and Node JS for example. I'm looking at Lisp as it is interesting. But let's write a web app in .NET. How many .NET developers will think "which framework should I use?". That is a valid question in JS/Haskell/Lisp. It's quite interesting, and its a positive for .NET in many ways. You can get stuff done, and also come in on a project and it be familiar.

stackzero · on Aug 22, 2019

I think this reflects the enterprise culture and auditing practices more than anything

dhbradshaw · on Aug 22, 2019

The interesting thing about the way things have developed with using third party dependencies early on is that these dependencies provide us with data.

We don't have to just guess or make biased claims about what would be useful to move to the standard library. We can look at the numbers and see what people are actually using.

atoav · on Aug 22, 2019

I am a Rust user myself and I think one major problem is that beeing on crates.io say nothing about the quality of code. I never published a single crate myself because in my eyes to be worth published a crate should work decently.

That beeing said, I think the whole Rust userbase would benefit from having some sort of collection of well tested crates and strictly divide between private, work in progress and production crates.

chartpath · on Aug 22, 2019

I only know Node and Python well enough. I find with NPM that dependencies are indeed hell because anything goes. With Python it's slightly less bad because packages have a fixed tree of versions (although I don't know what's up with Conda or others yet), not to mention the more extensive standard lib. I suppose if there are real conflict workarounds required you could always use virtual environments all the way down, but I have generally just done whatever upgrading is required to resolve incompatibilities (which can sometimes preclude using third-party deps or require forking and upgrading their own deps).

I also remember from years ago that Bundler in Ruby allowed version mismatches to coexist and would install both deps in the same tree, punting any runtime issues (same as NPM).

Any Gentoo or NIX users might have something to add here, as I can remember being regaled at conferences by them about this topic.

All that said, it would seem that Rust could have some kind of super-intelligence about dependencies due to all the static goodness. So my questions are:

a) who cares how much is userland and how much is std lib if everything is equally safe and documentable?

b) if "too much" becomes std lib, could any feelings of overwhelmingness not be mitigated with more namespacing?

ajxs · on Aug 22, 2019

The canonical example I was thinking of was this: https://doc.rust-lang.org/nomicon/ffi.html We only get as far as the second paragraph before the official manual is recommending that we use a 3rd party crate. I understand that this crate is made by the Rust developers, but this doesn't seem like a good approach to me.

sgentle · on Aug 22, 2019

I feel like the role of a standard library has become kind of overloaded. Back when you had to manage all your dependencies yourself, having that baseline of functionality bundled with the language and maintained by its authors was a logistical necessity. You just don't have time to chase deps for every little thing.

But in the modern world, I agree with the small language people: the logistical problem is dead. Almost every modern language has push-button dependency management, so the difficulty overhead of using a third-party library is basically zero. And, as you pointed out, this means that the stdlib now competes with third-party libraries, and the third-party libraries proliferate wildly. So there's still a problem, it's just not logistical anymore.

The other problem a stdlib solves is making decisions. This is the npm nightmare: which of these fifty libraries do I use? They're easy to install, but hard to evaluate. This compounds because every library is also making those decisions with their dependencies and so on, so you can end up with 8 different implementations of basically the same thing. You don't have this problem with a stdlib because if there's a std::string, everyone expects your code to work with std::string.

So perhaps the modern stdlib is better off as a standards library: no code, just a mapping from name (ie "option_parser") to package@version (ie "clap@1.1.0"). The work of the stdlib would then be to curate this mapping in a cohesive way, so the packages are high quality individually, but also work well together and reflect the general direction of the language and its community. Whether or not these packages are actually bundled with the compiler is really just an implementation detail. The library isn't code; it's decisions.

Updates would be pinned to release versions, so backwards-incompatible changes coincide with a major version/edition/etc of the language. This would be an expected process, because, as with urllib{n+1}, the first answer isn't always the best. Third-party packages are just as available as ever, of course, and a destandardised package is only a (trivially automated) rename away.

Aside from providing a layer of simplicity and stability over the third-party ecosystem, the standardised libraries would serve as reference implementations for a common interface, like Node's express/connect, or Rust's tokio/futures. In other words, it could help concentrate community effort around emerging standards.

I grant that this is a very, uh, political approach to what has historically been a code problem, but if anything I think the current situation owes itself largely to treating community problems ("how do we agree on a foundational layer of library code?") with technical solutions (package registry + sort by stars).

atoav · on Aug 22, 2019

The role of the standard library is entangled with the role of the language as a whole. Many people use tbe python interpreter as a sort of advanced calculator. If batteries weren’t included there this whole usecase would suffer.

For a language like Rust this looks different..

Amezarak · on Aug 22, 2019

There are a lot of us out there with dev and prod environments that aren’t allowed to be connected to the Internet or otherwise strictly controlled. Languages that rely on me downloading packages are mostly dead in the water at my workplace.

stouset · on Aug 22, 2019

Virtually no modern language or package system requires this. All of them allow you to cache dependencies locally, so they can be committed to version control or otherwise put in place during deployment however you see fit.

Amezarak · on Aug 22, 2019

Maybe I didn't communicate my point clearly. I cannot just go out and get whatever dependencies I want to use. It isn't possible. What I have on the system is what I've got, and I can't add anything. I cannot put the dependencies on the system by any means.

Thus a lot of things get done in Python and Java, because of the ample standard libraries. I was able, after a year of lobbying and procedures and approvals, to get a Rust compiler, but there is zero chance of me ever getting anything off crates.io.

steveklabnik · on Aug 22, 2019

Firefox, Debian, and many other build systems aren’t either. That’s orthogonal from all of this; the use case is well supported and has been for years.

majewsky · on Aug 22, 2019

My main gripe with crates.io is that they allowed everyone to take every name. The "foo" crate just belongs to the first person who took (or squatted) that name, regardless of whether that person actually implements a nice library under that name.

What they should've done is allow uploads only to "<username>/<cratename>". So if I decide to make a regex crate, it's called "majewsky/regex" at first and Alice can make "alice/regex" and Bob can make "bob/regex". Then at some point, the community (through some open process) decides that Alice's crate has the best API, so "alice/regex" gets aliased to just "regex".

That way, everything in the main namespace adheres to some sort of quality standard and has community support behind it. Because there's some explicit process gate to getting stuff into the main namespace, you could attach any number of beneficial requirements to it, e.g.:

- test coverage

- documentation coverage

- at least 3 people having committer rights to the repo, at least 2 of which must not be affiliated with the same company

fnord123 · on Aug 22, 2019

Apparently that was done in the Ruby community and Github "helpfully" uploaded all the Ruby packages to the gems repository meaning all 217 forks of https://github.com/httprb/http became potential gems.

But the underlying idea is sound: namespace and federation. Rust should ideally follow Java's solution. And it's completely forward portable: when namespacing is introduced, the crates.io/regex crates become io.crates.regex. Then you can have your com.github.majewsky.regex.

majewsky · on Aug 22, 2019

I don't particularly like URL-derived package names, esp. with Go where the repo URL is auto-derived from the package name, because that makes it an absolute pita to move the repo to a new canonical location.

fnord123 · on Aug 22, 2019

That's not an issue in Java.

Macha · on Aug 22, 2019

That basically just moves the problem to a land grab for the group name. Because people don't think Alice/regex looks professional enough, there will be a rush for regex/regex.

apta · on Aug 23, 2019

You don't see that with Java/Maven.

Macha · on Aug 23, 2019

Thats because it's based on domain names and the land grab has already occured prior to names being chosen

acheron9383 · on Aug 21, 2019

The language is still 'fairly new'. I'd rather the lang team let popular things cook as a 3rd party package for awhile until a good default solution shakes out. For instance hashbrown was just pulled into the stdlib. Some people are taking a look at bringing Crossbeam into the stdlib. [1] https://internals.rust-lang.org/t/proposal-new-channels-for-...

dralley · on Aug 22, 2019

And some libraries that were included are now recognized to have been mistakes, e.g. std::sync::mpsc

roca · on Aug 22, 2019

This issue gets discussed over and over again.

Rust does need a better package curation story, a Rust "expanded universe" of recommended packages whose provenance is more carefully tracked than the average package and whose APIs are stable. The Rust community needs to encourage other packages to depend on stable versions in that recommended set to reduce duplication.

But going further and actually making that set to be part of the "standard library", and thus released at the cadence of the Rust language itself, and managed under the same umbrella, would be harmful.

pcwalton · on Aug 22, 2019

Rust used to have getopt in the standard library. It was removed because it was bad. Clap only came later. I don't think Rust would have been a better language if the old getopt were kept around.

falcolas · on Aug 22, 2019

One very quick comment in support of a strong first party library: it’s proven to be much simpler, and much more common, for third party libraries to be compromised and everything from intentional security holes to full on malware to be bundled into the distributed packages.

I could pin every dependency, and monitor CVEs for all my dependencies, and monitor the ownership/code changes for all my dependencies, and hope for updates in a timely manner should there be CVEs... or I can use the standard library and move on with my life.

Personal opinion of course. Some people enjoy monitoring CVE lists. :)

Myrmornis · on Aug 21, 2019

Isn't it at least partly explained by wanting to allow implementations of this functionality to evolve more rapidly than they could if they were in the stdlib and thus reach good solutions more rapidly? getopt in particular seems like something that programmers haven't reached a consensus on despite 40 years of experimentation.

ajxs · on Aug 22, 2019

I would beg to differ with this. The kind of functionality expected from Getopt is fairly standard these days. I also disagree that features like getopt really need to 'evolve' much at this point in time. At the risk of courting controversy here, I'm going to confess that often I just don't want the community involved in the development of a language. On a project I inherited recently ( with the mission to save ) I unfortunately had to suffer using Typescript. If you look at the community discussions regarding the language's future direction you'll be treated to the uninformed arguing with the ignorant about the language's direction. Every time I download an external crate, I have to learn a different developer's way of doing things, suffer their idiosyncratic ways of creating an API and potentially expose myself or my application to a new set of vulnerabilities. The more I can get away with not doing this, the better.

sanderjd · on Aug 22, 2019

Who is "the community" and who isn't? Are you sure you trust the core language developers more than people writing libraries?

ajxs · on Aug 22, 2019

More often than not, absolutely. It requires a much higher level of competence to design a language and develop a functional compiler than it does to design libraries. There are packages in Node with millions of downloads, packages that are basically ubiquitous in certain domains that are riddled with bugs, with terrible interfaces and documentation. I can even think of libraries I've worked with in Java/C/.net/etc that are just horrific... If the languages themselves were as badly designed as the average library, they'd never succeed.

sanderjd · on Aug 22, 2019

Language design and compiler implementation are definetely high-competency skills, but they don't necessarily correlate with library design skills.

For instance, http client and server libraries are often in this gray area of uncertainty about whether they should be in the stdlib or not. Is this something language designers or compiler implementors have a lot of experience in? I would say not; sending and serving http requests are not something compilers need to do. Or take GUI libraries, what do language designers know about that?

I also know of many bad third party libraries, but there are tons of examples of really awful parts of standard libraries. The original date/time APIs in Java are a mess (they finally fixed this by in essence bringing in a third party API). The ssl bindings in the Ruby stdlib were a common source of bugs back when I was paying attention to this (maybe they've fixed it), same thing for the built in http stuff. Someone else mentioned the similar weakness of python's built in http, such that most people use a third party library instead. Even the java collections APIs are pretty poor such that people often augment them with things like guava or apache libraries.

My point is just that developing good libraries is a hard thing and I don't see any reason to think language designers or compiler maintainers are any better (or worse!) at it than other people. There isn't really a shortcut, you can't just cede authority to the powers that be on the core language teams, you just have to evaluate the quality of libraries for your use case yourself.

ajxs · on Aug 22, 2019

You've made a very good point that proficiency in developing compiler infrastructure does not imply that you're qualified to develop every specific aspect of a standard library. Date/Time, as you pointed out, is a very good example of this. It's a very complex domain that requires specialised knowledge. I'll counter this by saying that one aspect of language design is choosing the scope of the project, and deciding how best to implement a standard library targeting the language's intended domain. If your language is designed to implement web servers, then developing a GUI library might be a poor investment. Consequently, if your language is designed for implementing system applications, then investing time and talent into developing things like FFI, Filesystem and GUI functionality are just the prerequisites to the language being useful in its intended domain.

Myrmornis · on Aug 22, 2019

Yes I do sympathize with this in that, put bluntly, I'd also prefer that a programming language be developed by a smaller team of highly skilled people, as opposed to making the process as accessible as possible. But I really think there are ways for everyone to help out while retaining the feature that the most important and challenging parts are contributed to by the appropriate people, without upsetting anyone. And I think it's possible that Rust might be a shining example of such a thing. Clearly, in modern western society, it's hard to avoid this discussion acquiring a political dimension. And honestly that is something the open source community might need to address openly and attempt to do a better job of schism-avoidance than other areas of society.

hinkley · on Aug 22, 2019

> Its package ecosystem isn't nearly in the horrible state that node's is

I propose that the dictionary entry for 'faint praise' be updated to include this example.

vvanders · on Aug 22, 2019

On the flip side Lua is a wonderful language because its small lib lets it go anywhere.

We used to run a whole gamestate of a shipped title on PSP in a 400kb block allocation. I've yet to see that in any other dynamic language of consequence.

hinkley · on Aug 22, 2019

When I was still in that 'I should write a programming language' stage of career development, I worked on a pretty sophisticated (for the era) mobile app. PyPy was getting quite a bit of press around that time and my brain connected some dots.

One of the ideas I wanted very much to explore is scaling the API, both up and down. For building something akin to PyPy, you might want a 'kernel', a small set of libraries that were available everywhere, and several other levels that include more or different things.

Mobile takes the base and adds a few things suitable for mobile (storage, UI, broader networking). Desktop has a real UI, and then there's the kitchen sink like .Net and Java have.

But you have the same problems you always have with decomposition - if you didn't guess the right boundaries when you built the thing then removing or rearranging bits is a serious PITA. Sometimes I think the best we can hope for is to leave clear messages for the next language so that it doesn't organize things the way we did.

nonbirithm · on Aug 22, 2019

LuaJIT is incredibly fast. It might just be the fastest scripting language out there.

I think this was only possible because Lua itself is such a simple language. It says a lot that LuaJIT's implementation of this simple language is extremely complex in comparison to vanilla 5.1. Imagine the added complexity for something like Ruby that offers more than one associative data structure.

With LOVE you get a complete game engine runtime with all the boring OS abstractions taken care of and you can just start working.

I tried a few game development bouts with Rust since it seemed like a natural step up from C++ but the compile cycle kills a lot of my creative drive.

I think this is just a consequence of the language being compiled, as with C++. With Rust having a wide variety of language features the compile time is increased also. At least we're lucky to have languages today that offer far shorter turnaround times for building prototypes.

pitaj · on Aug 22, 2019

Statically compiled languages like Rust can have a large a standard library as they want, because only the code that is actually used will get included in the binary.

giancarlostoro · on Aug 22, 2019

> but having a package system shouldn't be a substitute for designing a useful standard library for a language.

Completely agree. I hold firm to the believe that a strong standard library that considers modern software development goals will drive the success of a language.

Look at Go. Which is usually my example since not many do what Go does. You can do a whole web application in Go with minimal use (if any) of external libraries.

Back to Rust and their defense: they intend to adopt popular / quality packages from Cargo into the standard library or fill the gap between packages.

I secretly wish D had similar things to Go in the standard library. HTTP and SMTP etc which Python seems to have at least HTTP and sadly Go ditched their SMTP package for whatever reason making it awkward.

I hate having to learn a new package manager per new language. I love programming so I try a lot of languages out of love. Package managers and build systems are horrible UX in every language. I prefer to not rely on third party oh look now deprecated packages and start out with whats out of the box.

ameixaseca · on Aug 23, 2019

Given there are no infinite resources, one has to choose where to focus. Rust team seems to be focusing on the language and solving hard-to-deal-with but relevant problems - and on evolving the language, which is truly a remarkable feat when you are pushing the boundaries.

Libraries can be implemented by 3rd parties and maybe later adopted as standard libraries or become de-facto standards. That's how it has been done for most popular languages out there, and I think it's a smart decision.

spieglt · on Aug 22, 2019

If Rust had a standard library like Go's I'd love it even more.

majewsky · on Aug 22, 2019

Go's standard library works for Go because it has a rather sharp focus on implementing web services. It's worthless for implementing a GUI application, or a particle physics simulation, or a PID-1 daemon.

Rust has a much broader aim, so a stdlib accommodating all of its usecases would be as comically huge as Python's, with all the problems that come from that.

spieglt · on Aug 22, 2019

It's also good for crypto, image processing, logging, file compression, and other commonly useful things. Go does have an emphasis on network services but is not limited to that.

"a stdlib accommodating all of its usecases would be as comically huge as Python's, with all the problems that come from that."

That's a straw man. I don't want a stdlib like Python's, I want one like Go's.

pcwalton · on Aug 22, 2019

Even if that library had slow serialization like that of Go, forcing the use of packages to make it faster?

ninkendo · on Aug 22, 2019

So we'd be no worse off than we are now? And just in that one case?

TheCoelacanth · on Aug 22, 2019

Except for all of the wasted work for maintainers to maintain the crappy library that has a better replacement outside of the standard library.

baby · on Aug 22, 2019

This, Rust std library is ridiculous compared to Golang’s one.

Qasaur · on Aug 21, 2019

Use polymorphism sparingly

I think it is a little ironic that he speaks of performance culture but simutaneously advises to use dynamic dispatch and avoid polymorphism. I can see the justification in non-critical code paths, but serialisation is a pretty important part of most networked software nowadays so I do not think that smaller binaries and faster compilation times (better developer experience) justifies a performance hit in the form of dynamic dispatch through crates like miniserde.

raphlinus · on Aug 21, 2019

Performance culture has you measure the actual performance implications, then make an informed decision. Is the code on a performance-critical path? Maybe some of your serialization code is, but it's extremely unlikely that a dynamic dispatch when parsing command line args is the reason your app is slow. Also be aware that highly inlined code does nicely in microbenchmarks but might have significantly negative performance implications in a larger system when it blows out the I-cache.

pcwalton · on Aug 21, 2019

> Also be aware that highly inlined code does nicely in microbenchmarks but might have significantly negative performance implications in a larger system when it blows out the I-cache.

I see this assertion a lot, but I have never actually seen a system in which inlining that would otherwise be a win in terms of performance becomes a loss in a large system. LLVM developers seem to agree, because LLVM is quite aggressive in inlining (the joke is that LLVM's inlining heuristic is "yes").

I'd be curious to see any examples of I$ effects from the effects of inlining specifically in large systems mattering in practice.

Jasper_ · on Aug 21, 2019

Fiora refactored the MMU code emitted by Dolphin to a far jump, which had significant performance improvements over inlining the code [0]. She had an article about it in PoC || GTFO [1].

[0] https://dolphin-emu.org/blog/2014/09/30/dolphin-progress-rep...

[1] https://github.com/angea/pocorgtfo/blob/master/contents/issu...

pcwalton · on Aug 22, 2019

Interesting, that's a good case. Though it's a bit of an extreme one, because it's jitcode for a CPU emulator. I'm not sure how relevant that is to Rust, though it's certainly worth keeping in mind.

Jasper_ · on Aug 22, 2019

In my experience, i$ is much bigger than everyone thinks, and they over-emphasize optimizing for it whenever someone brings up code size. It can soak up a lot. That said, for JITs, where code is not accessed very often and in weird patterns, it can matter quite a lot.

jcelerier · on Aug 22, 2019

hm, I've run a lot of profiling of various software through the years and never once instruction cache misses have been a problem, in large template-rich boostful C++ codebases

ajtulloch · on Aug 22, 2019

Systems that JIT large amounts of code (HHVM, etc) deal with this trade-off all the time. See e.g. https://qconsf.com/sf2012/dl/qcon-sanfran-2012/slides/KeithA... for an old discussion of some of the issues (e.g. inlined/specialized versions of memcpy were slower overall than a standalone-slower outlined version).

barrkel · on Aug 22, 2019

You're asking for something which is a bit awkward to find, because it requires a bunch of code in a loop to pressure the cache, and then have someone notice the effects of inlining one thing vs not makes all the difference.

The most likely people to be able to answer this one would be game devs or video codec hackers, at a guess.

I do know that inlining choices can have massive effects on executable size. I've seen more people complain about this kind of thing. It's most noticeable when controlling the inlining of a runtime library function in a language a bit more high level than Rust - I'm thinking of Delphi, with its managed strings, arrays, interfaces etc.

ufo · on Aug 21, 2019

One somewhat related example I can think of was how the v8 javascript implementation switched from a baseline compiler to a baseline interpreter. The interpreter version has less startup latency (because compiling to bytecode is less work) and uses less RAM (because bytecodes are more compact).

It isn't exactly about inling but it is an example where optimizing for size also optimized for speed at the same time.

pcwalton · on Aug 21, 2019

Good example, but that's not I$ specifically.

jnordwick · on Aug 22, 2019

Any time you have an error/exception/abort path, you always never want to inline it (LLVM prob has attributes to prevent that, but I'm not sure if they are used by rust). Also, LLVM does get a little too aggressive with things like unrolling so I wouldn't be surprised if it inlined too aggressively too.

Jweb_Guru · on Aug 22, 2019

They are used by Rust.

jnordwick · on Aug 25, 2019

How does it tell an unimportant error path from a needs to be optimized common case (this is where exceptions would be great).

exacube · on Aug 22, 2019

I've been told Chromium saw some measurable benefits from reducing binary size over code-inlining. (sorry, I'm not able to provide a citing).

Qasaur · on Aug 21, 2019

I do agree with you with the measuring aspect. Part of building high-performance systems is being able to measure performance in an accurate and actionable way, and consequently optimise code paths that have significant performance impacts.

However, I do believe that a competent engineer would have the judgement to be able to see, roughly, where performance hits would likely arise and optimise accordingly. Command-line arguments would likely not fall under this mandate, but serialisation to stdout is likely a good candidate for well-designed and well-optimised code. A nice side-effect is that this also avoids significant refactors down the line when you need to, in this case, change your serialisation from dynamic dispatch to static dispatch.

andolanra · on Aug 21, 2019

One thing I like about Rust a lot is that it lets you choose which hit you want to take when it comes to polymorphism. You can manually (and without much difficulty!) prefer dynamic dispatch if it's important to you to keep your binary size small, but you can also choose static dispatch and allow some replicated copies of your parametric code if that's what you want to optimize for.

It's also worth noting that if you are using polymorphic functions that are only ever called in your code with a single known type parameter, then your program should be just as efficient as if you wrote a monomorphic version with that fixed type in both runtime _and_ binary size, which means the only disadvantage to the polymorphism there is compile time.

bogdanoff_2 · on Aug 22, 2019

It would be nice if Box<dyn Trait> implemented Trait. Then we would be able to write the function only once with parametric polymorphism and then at callsite decide if we want to monomorphize for the given type or not.

grumdan · on Aug 22, 2019

You can provide this implementation yourself easily enough though. I agree it's maybe not ideal that this needs to be done for every Trait you want this behavior for.

Koshkin · on Aug 21, 2019

I’ve been using C and C++ (sorry Rust), like, forever, and I think should hate them with all fibers of my soul. A high-level programming language is “supposed” to let me forget, for the higher being’s sake, all machine-specific details and focus on the logic of the problem at hand. (Hello FORTRAN.)

dingo_bat · on Aug 22, 2019

If you, as the programmer, would prefer not to know and remember machine specific details, you can use rust or c++ easily. Just don't think you will always get the best performance possible for your hardware. And I think that is justified and reasonable.

maxdamantus · on Aug 22, 2019

I've always thought the fact that parametric polymorphism always results in monomorphised code is just a temporary deficiency in the language/compiler.

I think if a problem can be solved with either, the default choice should generally be parametric polymorphism rather than subtype polymorphism, simply from a logical point of view.

Haskell is often described as passing around "dictionaries" corresponding to class instances. Presumably Rust could add the same functionality depending on how a type parameter is declared (eg, `fn foo<T: Foo>() ..` denotes a monomorphised function whereas `fn foo<%T: Foo>() ..` could denote a non-monomorphised function which at runtime takes arguments specifying the size and alignment of `T` as well as a vtable corresponding to the `Foo` trait). This would also make polymorphic recursion possible.

dbaupp · on Aug 22, 2019

Swift takes this approach: monomorphisation is an implementation detail/optimisation.

In languages like Rust and Swift where values don't have a uniform representation (that is, not always a pointer), this takes a lot of infastructure, and a lot of performance/optimiser work to get reasonable performance for common code: the Swift compiler has quite a bit of code devoted to making generically-types values behave mostly like statically-typed ones, with minimal performance cliffs.

Rust's approach is that this sort of vtable-/dictionary-passing has to be done explicitly (with a trait object), and such values have restrictions.

raphlinus · on Aug 22, 2019

To a very large extent, we already have this with `fn foo(foo: &dyn T)` (or `foo: Box<dyn T>` for the owned version). What I would find even more interesting is the compiler much more aggressively factoring out the common code from the multiple instances, ideally compiling it only once and putting only that one version in the binary.

maxdamantus · on Aug 22, 2019

`T` there would be a trait though, not a type. You should still be able to have, for example, `fn foo<%T>(v: Vec<T>)` in which case you can still call the function with a regular `Vec<i32>`, since if the size/alignment is simply passed as an argument at runtime, you can operate on the existing non-boxed representations of data. The only thing that's different is the specialisation of the instructions essentially happens at runtime rather than at compile time.

I have however thought that it should be automatically done based on heuristics, but since the notion of "zero-cost abstraction" is considered the default, I don't think this would be desirable.

idubrov · on Aug 22, 2019

Well, I have this anecdote. We switched from serde to our own serialization / deserialization scheme (it still uses serde, but only for the JSON part), which is heavily based on dynamic dispatch, and actually got it faster.

Wasn't apples to apples comparison, but it was some times faster at the time (my memory doesn't serve me, but something around 3x to 5x). Also, compilation speed went down (well, at the time :) ). It was mostly due how some of the features work in serde (flatten and tagged enums), though.

I made a separate, cleaner, experiment (https://github.com/idubrov/dynser), which does not show that dramatic improvement (again, wasn't apples to apples, there were other factors which I don't remember), but shows some.

_nhynes · on Aug 21, 2019

When compiling to Wasm (esp. in the context of blockchain), small bytecodes are less costly than fast but short-lived programs.

However, in terms of maintainability, I’m happy to foist a few extra kB of bytecode on developers to produce an idiomatic (i.e. serde-using) library.

vbezhenar · on Aug 21, 2019

Can you do both? Fast compile time and slightly slower execution time for debug build using dynamic dispatch and long compile time and fast execution time for release build using static polymorphism from the same code base.

Qasaur · on Aug 21, 2019

It could probably be done using conditional builds for which there is support for in cargo, though that would require the programmer to write two versions of the same code. I doubt that it is possible for the compiler to do this optimisation automatically.

Recall that dynamic dispatch does not need you as a programmer to know which implementation is being used for a given polymorphic method or function - I find it difficult to see how the compiler would be able to reason what implementation is being referred to in code to generate static polymorphic code without the programmer being explicit. If that were the case, there would be no need to be explicit at all (and consequently no need for static dispatch in Rust code), and all you would need to do for polymorphism is to use dynamic dispatch. However, although this would be incredibly convenient and ergonomic, unfortunately the Rust compiler is not capable of magic.

swsieber · on Aug 21, 2019

While the compiker might not be able to do thus optimization automatically, you coukd probably write a proc macro to do it.

hinkley · on Aug 22, 2019

I think it's considered old hat by now for JITs in languages with polymorphism to inline a little bit of dynamic dispatch code into the call sites. The branch predictor gets to work its magic and removes the call overhead in a high number of cases.

I think I read somewhere that Javascript engines do something similar, with some extra code to de-optimize when you fiddle with the object prototype.

mhink · on Aug 22, 2019

I’m not super familiar with the details, but all the the major Javascript engines definitely do extremely involved runtime optimizations, and I wouldn’t be surprised at all if the case you described is one of them.

pornel · on Aug 21, 2019

It looks like the author is most bothered by compile times of dependencies.

Cargo needs to do better with shared caches (so you compile each dep at most once per machine) or ability to get precompiled crates (so you don't even compile it).

Incremental improvements of compiler speed or trimming of individual dependencies won't bring the 10x improvement it needs.

leggomylibro · on Aug 21, 2019

One of the biggest issues that I face with Rust is that its builds are enormous, and I often work on machines with limited disc space.

The actual binary sizes are fine - even with embedded devices that have <1MB of program space, there doesn't seem to be much bloat. But I need to remember to clean every project when I finish working on it, because otherwise the build folders will eat 100s-1000s of megabytes each on a 16-32GB partition.

I really appreciate Rust's inclusive approach to learning and teaching, but I can't justify using it for education on ultra-affordable machines for that reason. People often scatter one-off projects all over the place as they learn, and when they run out of space it takes a long time to clean everything up.

So I would also be very in favor of some sort of simple shared package cache.

asgeir · on Aug 22, 2019

Have you considered doing something like this:

  cargo build --target-dir ~/build-artifacts/$(basename $(pwd))

Then you can have something like this in your ~/.config/user-tmpfiles.d/clean_build.conf

  d /home/user/build-artifacts - - - 1d -

You can then do cleanup by calling

  systemd-tmpfiles --user

Or you can use the tmpfiles timer

  systemctl enable --user --now systemd-tmpfiles-clean.timer

leggomylibro · on Aug 24, 2019

That sounds like an interesting idea, thanks! Can that '--target-dir' argument be added in the build.rs script?

asgeir · on Aug 24, 2019

It looks like you can set it using a CARGO_TARGET_DIR environment variable or build.target-dir config value. So I guess you could create a $HOME/.cargo/config file with the following in it (letting you skip the --target-dir part):

  [build]
  target-dir = "/home/user/build-artifacts"

But I don't know if the different projects will trample each other that way. If that's the case you could just go with a .cargo/config per-project targeting a subdirectory of the build-artifacts directory.

https://doc.rust-lang.org/cargo/commands/cargo-build.html

https://doc.rust-lang.org/cargo/reference/config.html

holy_city · on Aug 21, 2019

You can kind of hack a shared cache using cargo workspaces (put all your rust projects in one directory as part of a workspace) but that is far from ideal. All your projects will share a target directory and Cargo.lock.

I agree though. I don't mind binary sizes in the single megabytes for a release build, and I don't mind clean/rebuild build times that are a few dozen seconds. I do mind the 500MB of build artifacts in a single target directory, multiplied by all the projects I have that share the same version of serde/winapi/syn/quote/insert-common-dependency-here.

_nhynes · on Aug 21, 2019

Have you tried sccache [0]? It doesn’t always choose to cache a dependency, but it helps about 70% of the time. Anecdotally, it hastened a release build of a pretty standard CLI tool (with incremental compilation) by almost 4x.

In the context of resource-constrained machines, one can always host it remotely on S3. (or mount an NFS share as the CARGO_TARGET_DIR, if you’re feeling adventurous or want fast CI)

[0] https://github.com/mozilla/sccache