I wouldn't recommend following the Haskell approach. It hasn't worked well for us. (I took part in creating the Haskell Platform and the process used to add packages to it. I also used to maintain a few of our core libraries, like our containers and networking packages.)
Small vs large standard library:
A small standard library with most functionality in independent, community-maintained packages has given us API friction, as types, traits (type classes), etc. are hard to coordinate across maintainers and separate package release cycles. We ended up with lots of uncomfortable conversions at API boundaries.
Here are a number of examples of problems we currently have:
- Conversions between our 5(!) string types are very common.
- Standard library I/O modules cannot use new, de-facto standard string types (i.e. `Text` and `ByteString`) defined outside it because of a dependency cycle.
- Standard library cannot use containers, other than lists, for the same reason.
- No standard traits for containers, like maps and sets, as those are defined outside the standard library. The result is that code is written against one concrete implementation.
- Newtype wrapping to avoid orphan instances. Having traits defined in packages other than the standard library makes it harder to write non-orphan instances.
- It's too difficult to make larger changes as we cannot atomically update all the packages at once. Thus such changes don't happen.
Empirically, languages that have large standard libraries (e.g. Java, Python, Go) seem to do better than their competitors.
I don't think most of these are applicable to Rust.
> - Conversions between our 5(!) string types are very common.
> - Standard library I/O modules cannot use new, de-facto standard string types (i.e. `Text` and `ByteString`) defined outside it because of a dependency cycle.
We have one string type defined in std, and nobody is defining new ones (modulo special cases for legacy encodings which would not be worth polluting the default string type with).
> - Standard library cannot use containers, other than lists, for the same reason.
> - No standard traits for containers, like maps and sets, as those are defined outside the standard library. The result is that code is written against one concrete implementation.
Hash maps and trees are in the standard library already. Everyone uses them.
> - Newtype wrapping to avoid orphan instances. Having traits defined in packages other than the standard library makes it harder to write non-orphan instances.
This is true, but this hasn't been much of a problem in Rust thus far.
> - It's too difficult to make larger changes as we cannot atomically update all the packages at once. Thus such changes don't happen.
That only matters if you're breaking public APIs, right? That seems orthogonal to the small-versus-large-standard-library debate. Even if you have a large standard library, if you promised it's stable you still can't break APIs.
But if you have a large standard library and want to break the API, you can.
If you have 100 different libs that are basically "standard" (who doesn't have `mtl` in their applications at this point), now you have to coordinate 100 different library updates roughly at the same time. If you forget even one of them, then you've broken everything.
I think the argument for a large Prelude/standard lib is similar to Google's "single repo" argument: Easy to catch usages and fix them all at once. Plus you're making the language more useful out of the box. People coming from Python can understand this feeling of opening a Python shell and being productive super quickly from the get-go.
Arguments for small std lib exist, of course. But giant standard libraries are more useful than not.
EDIT: I think the failure of the Haskell Platform has a lot more to do with how Haskell deals with dependencies, and the difficulties it entails, than with the "batteries included" approach itself.
Standard libraries - types, in particular - are the lingua franca between unrelated libraries. The more that's in your standard library, the easier it is to integrate different libraries.
The higher level the library (e.g. containing content specific to an application domain), the more magical-seeming libraries can be added to the ecosystem. The counter-risk is the standard library growing in undesirable directions that you can never change because you can't remove stuff.
The interstitial glue that lets third party libraries integrate with one another and be usable by your app: that's the single biggest reason for having a bigger standard library than a smaller one. It has very little to do with including the batteries in the box.
If you think it has something to do with including the batteries in the box, you'll be lured into the trap of making it easy to fetch the batteries from across the internet (that's almost the same, right?). The trouble is, the internet has 100 different batteries to choose from, and not only have you offloaded the choice onto the user, but the batteries use mutually incompatible terminals and you have to jerry-rig interfaces between them. Let a thousand flowers bloom, say some people: trouble is, waiting for the biggest flower can take years, and people pick different ones in the early days. A bad choice is better than indecision.
Low effort updates are even less what large standard libraries are about. Large standard libraries are much harder to update, not easier: there's much more surface area, so it's far easier to break an application - and since every application uses the standard library, you could potentially break them all. Easier versioning and updates are a strong argument for extracting out things into third-party libraries.
But even then, languages that have great, thriving, easy-to-use dependency systems and package managers with small standard libraries still run into problems (see: JavaScript).
The issue with comments written this way is that there are no details to support the claim.
Writing "see: JavaScript" doesn't really help without context. Without context, one does no know if you meant "JavaScript in browser" or "JavaScript via Node.JS" or "I simply don't like npm".
I'm not claiming there aren't any problems; however, "problems" are situational and one person's "problem" is another person's meh.
I just think it's irresponsible to not provide detail when making such claims.
The standard library also includes Path/PathBuf and OsStr/OsString. And third-party libraries also use [u8] for bytestrings.
It'd be nice to improve handling for user-supplied text where you can't assume UTF-8. For instance, git2-rs provides the contents of diffs as [u8], because it can't assume the diffed files use UTF-8. That led to this commit today: https://github.com/ogham/rust-ansi-term/pull/19/commits/a0da...
That felt like a lot of boilerplate to abstract between str and [u8]. Is there a better way to solve that problem?
(As much as I'd love to just say "use UTF-8", that would break on many git repositories, including git.git and linux.git.)
I think Rust needs to slow down in this regard. I have been with Python since 1999 and the stdlib has held it back. I have also used Scala and Haskell and have witnessed the mess that platform libs on each have caused.
What Rust has right now is pretty amazing. What needs to happen is a way for devs to easily break the dependency cycle and include multiple versions of the same crate. Something that has plagued Haskell. I dunno what the answer is, trait only crates, struct only crates?
If people want to 'curate' (shop) a set of packages, they can make a meta package that exports its deps.
There is literally no reason to ship libs with the compiler aside from the basic verbs and nouns.
With versioned and properly name-spaced imports, one could use different curated libs.
If you can, could you elaborate more on python's stdlib holding it back? I think batteries-included experience is one of the reasons why so many people (including myself) use python.
It's also one of the features I sorely miss when using Rust. Luckily, Rust's stdlib is starting to tend towards being more practical with recent additions like system time.
The saying 'the std lib is where libraries go to die' was invented by Python. The libs are shallow, don't break backwards compat, and provide a substandard experience. Things that continue to improve are provided out of tree under an alternative package name. Python codebases that are resilient don't use much of "core": arrow for time, requests for http, simplejson for JSON, etc. Using core is an antipattern that will get you stuck on a version of the language, which is ridiculous.
Linking the language and the libraries together is a mistake.
In the enterprise space it is quite common that we only get to use what is on the computer, and access to anything else is strictly controlled by IT.
So if it isn't in the standard library or some internal library mirror, we don't get to use it, as simple as that.
I think it would be terrible for Rust design/evolution/policy to be constrained by that kind of enterprise badness that basically bans crates.io, and crates.io is an awesome aspect of the Rust ecosystem.
I can tell it is lots of "fun" when you can only use a Maven mirror, with approved jars.
To get a jar into that mirror, a request needs to be sent to the legal team describing the license and business case use, after approval the IT team will add the said jar to the mirror.
The same applies to version upgrades of already approved jars.
This is a typical scenario I had already in a couple of projects.
I agree that this sucks, but not doing it that way is dangerous for the company because developers might not care enough about license compliance when they include some stuff into their project.
So maybe there's value in shipping a "standard bundle" that includes popular libraries or some such. But it's not worth distorting the whole language design to accommodate bad policies.
I see where you're coming from, but I feel like it would be a mistake to expect the language or std lib to try to solve problems that are effectively organizational/cultural issues.
That's a failure at the moment of inclusion. I'm guessing it was done for convenience and to increase adoption (getting decent libraries in the standard library faster).
Just as a data point... I like and heavily use the core libs... And not once have I used arrow, requests, or simplejson, while knowing them, because I didn't feel the need.
Arrow seems particularly useless as it just wraps stdlib datetime and its awful 10 byte size rather than moving to an 8 byte representation like np.datetime64 uses.
Just because you haven't found a use for it doesn't mean it's useless.
The stdlib datetime class is terrible and desperately needs to be wrapped. Arrow is a good wrapper. I don't know what you're on about with counting bytes.
I've wrapped datetime for company work (pre pandas, pre datetime64) to make sure it follows the rules of the data analysis platform we developed (adding functions for moving to next month of year based on various financial calendar rules for example). I wish I hadn't done it and had just wrapped a boost_datetime since the performance of datetime is slow when you have a large timeseries of them. The performance is especially unacceptable if you also have timezones attached to your datetimes.
Now we have pandas, yay. But I don't see why one would use arrow. If you're patient enough, could you explain why you would use it? The website doesn't seem to be very convincing.
The thing I like about Python is it gives tools for library writers to build things without going too low level.
Application writers will always write with better libs, but don't have to worry about third party lib compatibility on platforms because of the stdlib serving as a virtual machine (most of the time).
Many libraries in the stdlib have much better alternatives, because libraries with their own release cycle can evolve much quicker. But people get stuck on the "standard" version because it's what's in the stdlib. Worse, people write for compatibility with whatever was in stdlib 2.4 because that's what RHEL6 ships.
Which I guess is normal since it does not create any dependency cycle. A new version might as well be thought of as a completely different package (of perhaps similar functionality).
One of the things I love above all about Python and Ruby are the kitchen-sink standard libraries. The node ecosystem is deeply frustrating in this respect.
It has been a while since I did anything with Python, but I did like its standard library. It was reasonably comprehensive without feeling bloated, and the documentation was pretty good (mostly).
Having a good standard library also makes deployment easier.
(In Go, OTOH, I tend to care less, even though its standard library is quite good, because thanks to static linking, deployment is always easy, no matter how many third-party libraries I use.)
> We have one string type defined in std, and nobody is defining new ones (modulo special cases for legacy encodings which would not be worth polluting the default string type with).
There's also `inlinable_string`, `string_cache`, `tendril`, `intern` if you need inlining for performance.
The bigger problem is with other things like 2D/3D points which can be (f32, f32), [f32; 2] or a custom struct.
I really really would advise having a word with Snoyberg about this. The Haskell Platform has been a pretty deadly experience. It's also ridiculously beginner-hostile (it sounds like it won't be, but it is in practice).
Hash maps and trees: fine. What about database interfaces (e.g. a JDBC/ODBC/whatever equivalent)? What about HTTP servers - even the minimal declaration for what a synchronous request handler might look like? How about threadpooling - if you have multiple libraries that have parallelizable work, you certainly don't want multiple threadpools each thinking they have X many cores to work with, and you don't want the user to have to partition these things either - that's not a happy problem.
All things you can delegate to third parties, but not without lots of cross-talk and confusion until things settle down to winners and losers, which may be a long time in the future. Indecision can be costly.
Consider standard library profiles, with progressively higher levels of abstraction supported. It's the right decision for creating a good ecosystem. C and C++ took decades to build consensus on the more complicated libraries, and C++ eventually grew a pseudo-standard library in the form of boost to centralize efforts, simply because it is more efficient that way.
You were being downvoted, maybe for perceived snark, but I think you raise an interesting point.
To me, C did have a standard library: Unix. It's a runtime system too! Due to the nature of the original C bootstrapping process it just happens to be possible to remove this standard library, and Windows was evidence of this.
There is another interesting potential counterexample: Lua. Its minimalistic standard library is part of what makes it so attractive for embedding, e.g. in game engines. However, Lua's embedding API is so good, you could almost say that it comes with a large standard library too: your existing C code!
I guess my larger point is that languages rarely are able to stand completely on their own. They need some sort of valuable body of code to justify people to choose the language and libraries together. It might have been the case 40 years ago that you'd reasonably choose to build something "from scratch", but today, if you start on an island, you need to build a bridge, lest you remain on an island forever. Better to start on the mainland.
It's one thing to build a layered system with a small core. It's another thing to completely ignore the fact that the libraries and community _are_ the language, in the only ways that actually matter.
Lua's lack of a stdlib is also a curse. I can't imagine how many incompatible versions of string.trim and OOP libraries are out there in the wild right now...
Things have been getting better lately because of Luarocks, but it's still an uphill battle.
Lua OOP always boils down to something like `setmetatable(obj, mt)`, where mt.__index has all the methods. How you assign to mt.__index can vary across modules according to style, but that's a _purely_ aesthetic issue. The mechanics are identical. Using a module to accomplish it creates a useless dependency.
There are many criticisms one could make of Lua, but I don't think those two particular criticisms are legit. They're classic bikeshedding.
The function you presented that trims to the right has quadratic runtime behavior if your string has a long sequence of spaces that is not at the end of the string, for example "a" followed by thousands of spaces and then "b": the pattern is retried at every position inside the run, and each attempt consumes to the end of the run before `$` fails. A similar performance bug was behind a 30 minute downtime at stackoverflow.com, because a code snippet with 20 thousand spaces inside a comment showed up on their frontpage.
Anyway, I wasn't trying to say bad things about Lua with my examples. It's just that if you go to any large Lua project out there, there is a very good chance you will find some "utils" module in there with yet another reimplementation of a lot of these common functions. Ideally we should have people reusing more stuff from Luarocks than they are right now.
If you're reading a pile of string processing code, seeing
s.rstrip()
helps make code self-documenting, compared to
s:gsub("%s*$", "")
I don't want to argue for a massive standard library (for instance, I don't think Python should have shipped modules for dbm, bdb, sqlite, or XML-RPC), but simple string processing seems like a good thing to standardize.
String processing is never simple. Simply identifying "what is whitespace?" is a big undertaking in Unicode.
Lua's philosophy seems to be to include the absolute minimum that is unacceptably painful to omit. This is a perfectly reasonable tradeoff for Lua's primary use case: embedding.
With respect to strings in particular, most systems that Lua is embedded in have their own string type, or inherit one from a framework. This is an unfortunate reality of the C/C++ world.
Returning to my point about language standard libraries: The lack of a traditional "standard library" is a feature for Lua, but only because Lua has a strong FFI and C API that acts as a "bring your own standard library" mechanism. It's less about needing a standard library, and more about admitting a language is only one piece of the puzzle. For a language to flourish, you need to have some story for interfacing with the rest of the world in a rich way.
I'm not sure I'd classify them as a standard library; they're essentially just pervasive global variables. For a comparison, think of Java; the standard library is things like `java.util` and `javax.swing`, which goes far beyond having the `System` and `Math` classes available in `java.lang`.
Well, JS was originally competing with Java applets in the browser, but, like you said, fitness for purpose is pretty significant!
My point (or rather, the point of the parent comment that I'm agreeing with) is that there's a lot more than just the presence and characteristics of a standard library that determines how widespread a language becomes.
HTML doesn't do anything for JS other than provide a way to create visual interfaces. It might be comparable to the role that `tkinter` plays for Python's stdlib, but HTML alone is emphatically not a standard library.
My point is that most languages are totally useless on their own. JavaScript the _language_ doesn't offer any FFI or other mechanism to call outside services. Without a browser or something like Node's libuv, JavaScript wouldn't be useful at all. The capabilities provided in the box are part of the language in terms of what actually matters in motivating people to choose to use the language, no matter what form those capabilities come in.
> You seem to be overlooking the ultimate counterexample: C. :P
I think one reason (of many) that C++ has replaced C almost completely for new development is the STL. Of course, the STL fundamentally depends on the language feature of templates, which you can only approximate in C, but considering that Java and Objective-C, among other languages, lasted pretty long with no generics and only non-type-safe containers, I think C could have benefitted greatly from basic things like resizable arrays, hash tables, trees, better strings, etc. in the standard library. Now it is probably too late for it to matter (which most people consider a good thing).
It had: Modula-2 and Pascal dialects usually had richer libraries.
For example check Turbo Pascal libraries, including Turbo Vision, already on MS-DOS.
C took off thanks to UNIX's adoption, like JavaScript on browsers nowadays; it became the language to use for anyone working in the enterprise on those shiny new UNIX boxes.
In Europe it was just another systems language to choose from, back when CP/M and other 8 / 16 bit systems were common.
Both C compilers and Turbo Pascal already existed in CP/M, which preceded MS-DOS.
Also there were C, Pascal and Modula-2 compilers available for ZX Spectrum.
And in my tiny part of the globe I can guarantee that everyone only cared about x86 Assembly, Turbo Basic and Turbo Pascal, with Clipper for business stuff.
I only got to learn C in 1993, after having been a Turbo Pascal 3, 5.5 and 6.0 user.
Being able to compile stuff on CP/M wasn't much help if you wanted to develop MS-DOS applications.
I first used C in 1983 on MS-DOS, I didn't use UNIX until a couple of years later. I bought Turbo Pascal 1.0 when it was released but already had a C compiler at that point.
We only got to buy the compilers that were available at the local computer store (not always 100% original), or find some magazine and order internationally via post.
BBS access was only available to the fortunate few capable of paying the high connection rates, and the modem in the first place.
We had to make do with what was available to us and what we could afford to pay.
Some of my first Assemblers were taken from the Input magazines and typed in, because there was nothing else.
This is exactly the main problem with Haskell. A stunning language with a lousy standard library. In my opinion, Haskell should offer arrays and maps as built-ins (like Go) and ship with crypto, networking, and serialization in the standard library (I know serialization is already there, but everyone seems to prefer Cereal, so...)
> (I know serialization is already there, but everyone seems to prefer Cereal, so...)
This is precisely why shipping things in the standard library is a bad idea. It ends up full of cruft that no-one uses because there are better alternatives.
>Haskell should offer arrays and maps as built-ins
Why? What does that gain?
The standard platform provides Data.Map for maps, Data.Vector for arrays, and Data.Sequence for fast-edit sequences.
It's not even clear what a "built-in" array or map in Haskell would even look like, or what semantics it should have. Especially in a pure functional language, you need to be clearer about what your intentions are. A regular mutable packed array won't work most of the time.
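For anyone who hasn't used them, a quick sketch of the three structures mentioned above (Data.Map and Data.Sequence come from the containers package, Data.Vector from vector):

    import qualified Data.Map      as M
    import qualified Data.Sequence as Seq
    import qualified Data.Vector   as V

    -- persistent balanced-tree map
    ages :: M.Map String Int
    ages = M.fromList [("ada", 36), ("grace", 85)]

    -- immutable boxed array
    squares :: V.Vector Int
    squares = V.generate 10 (\i -> i * i)

    -- finger-tree sequence with cheap edits at both ends
    queue :: Seq.Seq Char
    queue = Seq.fromList "abc" Seq.|> 'd'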
> This is exactly the main problem with Haskell.
> A stunning language with a lousy standard library.
I dream of the day where we can say that the main problem of Haskell is which libraries are included in the standard library. To me, we would already have reached programming nirvana at that point.
Haskell's actual problem isn't the lack of a comprehensive standard library, but rather the presence of core language features that actively hinder large-scale modular programming. Type classes, type families, orphan instances and flexible instances all conspire to make it as difficult as possible to determine whether two modules can be safely linked. Making things worse, whenever two alternatives are available for achieving roughly the same thing (say, type families and functional dependencies), the Haskell community consistently picks the worse one (in this case, type families, because, you know, why not punch a big hole in parametricity and free theorems?).
Thanks to GHC's extensions, Haskell has become a ridiculously powerful language in exactly the same way C++ has: by sacrificing elegance. The principled approach would've been to admit that, while type classes are good for a few use cases, (say, overloading numeric literals, string literals and sequences), they have unacceptable limitations as a large-scale program structuring construct. And instead use an ML-style module system for that purpose. But it's already too late to do that.
How are type families worse than fundeps? That's a pretty ridiculous assertion; the things you can do with fundeps are strictly fewer than the things you can do with type families.
> The principled approach
You're dead wrong. The principled approach here is dependent types and full-featured type-level functions. Fundeps are a hack that let you implement a small subset of such functions (while type families gets us a bit closer to the ideal).
> they have unacceptable limitations as a large-scale program structuring construct.
Such as?
> And instead use an ML-style module system for that purpose.
How about we just use C macros for parametricity?
ML-style modules have their uses, but they aren't nearly as elegant as a clean type-level solution.
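For readers following along, a minimal sketch of the two mechanisms under discussion (class and instance names are made up for illustration), written against GHC with the relevant extensions:

    {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
                 FlexibleInstances, TypeFamilies #-}

    -- Functional dependency: "c determines e" is a constraint-solving hint.
    class Container c e | c -> e where
      empty  :: c
      insert :: e -> c -> c

    instance Container [a] a where
      empty  = []
      insert = (:)

    -- Associated type family: the same relationship as a type-level function.
    class Container' c where
      type Elem c
      empty'  :: c
      insert' :: Elem c -> c -> c

    instance Container' [a] where
      type Elem [a] = a
      empty'  = []
      insert' = (:)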
> How are type families worse than fundeps? That's a pretty ridiculous assertion; the things you can do with fundeps are strictly fewer than the things you can do with type families.
It's not about how much you can do (otherwise, just use a dynamic language, you can do everything, even shoot yourself in the foot!), it's about whether the result makes sense, and how much effort it takes to make sense of it.
> You're dead wrong. The principled approach here is dependent types and full-featured type-level functions. Fundeps are a hack that let you implement a small subset of such functions (while type families gets us a bit closer to the ideal).
You wanna play the dependent type theory card? Type families as provided in Haskell are incompatible with univalence.
type family Foo a
type instance Foo Bool = Int
type instance Foo YesNo = String   -- where YesNo is some type isomorphic to Bool (say, data YesNo = Yes | No)
Please kindly provide the isomorphism between `Int` and `String`.
Case analysis only makes sense when performed on the cases of an inductive type, which the kind of all types is not.
> Such as?
The insistence on globally unique instances?
> How about we just use C macros for parametricity?
What does this even mean?
> ML-style modules have their uses, but they aren't nearly as elegant as a clean type-level solution.
> You wanna play the dependent type theory card? Type families as provided in Haskell are incompatible with univalence.
Hi. As someone that knows type theory and knows homotopy type theory and also knows Haskell well I would pose the following question to you: what purpose on god's green earth would be served by introducing univalence directly to haskell?
(Oh, and furthermore, you realize that fundeps have precisely the same issues in this setting?)
Contrariwise, don't you find it _useful_ that we can have two monoids, say And and Or, which have different `mappend` behaviour?
Now, can you imagine having that feature and _also_ respecting the idea that set-isomorphic things should be indistinguishable? How?
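For concreteness, a sketch of what that looks like with newtypes (this is essentially what `All` and `Any` in `Data.Monoid` do), written for a current GHC where `Semigroup` is a superclass of `Monoid`:

    -- two monoids over the same carrier (Bool), told apart by newtypes
    newtype And = And Bool
    newtype Or  = Or  Bool

    instance Semigroup And where And a <> And b = And (a && b)
    instance Monoid    And where mempty         = And True

    instance Semigroup Or  where Or a <> Or b   = Or (a || b)
    instance Monoid    Or  where mempty         = Or False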
> what purpose on god's green earth would be served by introducing univalence directly to haskell?
Generally, when I want to reason about tricky data structures, what I do is:
(0) Define a set-isomorphic auxiliary type that's easier to analyze, and whose operations are easier to implement, but have worse asymptotic performance.
(1) Prove that transporting the operations on the auxiliary type along the isomorphism yield the operations on the original tricky type.
I need univalence for this argument to hold water.
> (Oh, and furthermore, you realize that fundeps have precisely the same issues in this setting?)
Type classes are already Haskell's controlled mechanism for adding ad-hoc polymorphism “without hurting parametricity too much”. I consider it healthier to reuse and extend this mechanism (which is what functional dependencies do) rather than add a second one for exactly the same purpose (type families).
> Contrariwise, don't you find it _useful_ that we can have two monoids, say And and Or, which have different `mappend` behaviour?
Sure. In ML, I'd just make two structures having the MONOID signature. Haskellers have this wrong idea that the monoid is just the type - it's not! A monoid is a type plus two operations. Same carrier, different operations - different monoids.
> Now, can you imagine having that feature and _also_ respecting the idea that set-isomorphic things should be indistinguishable? How?
Yes. Acknowledging that an algebraic structure is more than its carrier set.
> I need univalence for this argument to hold water.
No, you don't. Univalence is the axiom that transporting operations across such equivalences _always_ works. If you're doing equational reasoning directly it doesn't arise.
Furthermore, all you need to do is to establish that the _type operations_ regarding one type respect the equivalence to the other type as an additional step.
As you say "a monoid is a type plus two operations" -- so fine, we can treat the monoid And as the type bool and the dictionary of operations on it, and all this still works out.
> No, you don't. Univalence is the axiom that transporting operations across such equivalences _always_ works.
Sure, but the strategy I outlined is risky (as in “may lead to getting stuck and having to undo work”) in a language where this isn't guaranteed to work.
> As you say "a monoid is a type plus two operations" -- so fine, we can treat the monoid And as the type bool and the dictionary of operations on it, and all this still works out.
Yup, but Haskell doesn't let you define types parameterized by entire algebraic structures. It only lets you define types parameterized by the carriers of algebraic structures.
> otherwise, just use a dynamic language, you can do everything, even shoot yourself in the foot!
Type classes allow huge flexibility while maintaining type safety, to a much greater degree than fundeps allow.
> it's about whether the result makes sense
Which they do. Perhaps you have some examples of when type families confused you or made you perform an error?
> Type families as provided in Haskell are incompatible with univalence.
TFs aren't dependent types. However, they are on the right track. Fundeps are farther away from the right idea. Could you explain to me what's wrong with your example? I'm not up to date on HoTT, but it seems like there's nothing in principle wrong with pattern matching on elements of *. That seems like an important feature of type-level functions.
>The insistence on globally unique instances?
Why is this a problem? It makes sense from a theoretical perspective (we don't associate multiple ordering properties with the things we call "the integers"), and it's very easy to use newtype wrappers to create new instances if needed.
> What does this even mean?
ML modules are flexible, but backwards from a theoretical perspective. Parametricity is something that should be embedded in the type system, not the module system.
> See here
Interesting example. However, I doubt that the syntactic cost of using such a system is less than the syntactic cost of enforcing global instance uniqueness and using newtype wrappers.
> TFs aren't dependent types. However, they are on the right track.
Dependent types are a good idea. The way Haskell attempts to approximate them is not. Parametricity is too good to give up. With the minor exception of reference cells (`IORef`, `STRef`, etc.), if two types are isomorphic, applying the same type constructor to them should yield isomorphic types.
You know what type families actually resemble? What C++ calls “traits”: ad-hoc specialized template classes containing type members.
> Fundeps are farther away from the right idea.
Functional dependencies are a consistent extension to type classes, which don't introduce a second source of ad-hoc polymorphism, unlike type families.
> Why is this a problem? It makes sense from a theoretical perspective (we don't associate multiple ordering properties with the things we call "the integers"),
What if I want to order them as Gray-coded numbers? In any case, the integers are far from the only type that can be given an order structure, and many types don't have a clear “bestest” order structure to be preferred over other possible ones.
> and it's very easy to use newtype wrappers to create new instances if needed.
Creating `newtype` wrappers is easy at the type level, but using them is super cumbersome at the term level.
> ML modules are flexible, but backwards from a theoretical perspective.
> Parametricity is something that should be embedded in the type system, not the module system.
It's type families, as done in Haskell, that violate parametricity! Standard ML has parametric polymorphism, uncompromised by questionable type system extensions.
> Interesting example. However, I doubt that the syntactic cost of using such a system is less than the syntactic cost of enforcing global instance uniqueness and using newtype wrappers.
I can't imagine it being more cumbersome than wrapping lots of terms in newtype wrappers just to satisfy the type class instance resolution system.
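To make the trade-off concrete, a small sketch (the names, and the exact ordering, are made up for illustration): the alternative instance costs only a few lines, but every call site pays the wrap/unwrap tax.

    import           Data.Bits (shiftR, xor)
    import           Data.Word (Word64)
    import qualified Data.Set  as Set

    -- compare Ints by their binary-reflected Gray codes (computed on the
    -- unsigned representation so the mapping stays one-to-one)
    newtype GrayCoded = GrayCoded Int deriving (Eq, Show)

    instance Ord GrayCoded where
      compare (GrayCoded a) (GrayCoded b) = compare (gray a) (gray b)
        where
          gray :: Int -> Word64
          gray n = let w = fromIntegral n in w `xor` (w `shiftR` 1)

    -- sort (and deduplicate) by Gray-code order; note the wrapping noise
    -- at both boundaries
    sortByGrayCode :: [Int] -> [Int]
    sortByGrayCode =
      map (\(GrayCoded n) -> n) . Set.toAscList . Set.fromList . map GrayCoded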
>Um, aren't functional dependencies an add-on to multiparameter type classes?
You're right, I meant "type families".
> I defined two type instances that violate the principle of not doing evil:
We're not doing abstract category theory; we're writing computer programs (well, I am). Have you ever run into a problem with type families in that capacity?
>if two types are isomorphic, applying the same type constructor to them should yield isomorphic types.
Agreed, but there's a difference between type functions and type constructors. TFs are (a limited form of) type functions. Value-level constructors admit lots of nice properties that value-level functions do not, and I see no reason to be uncomfortable with this being reflected at the type level.
> What if I want to order them as Gray-coded numbers
Use a newtype wrapper. Even if a language allowed ad-hoc instances, I would consider it messy practice to apply some weird non-intuitive ordering like this without specifically making a new type for it.
> Creating `newtype` wrappers is easy at the type level, but using them is super cumbersome at the term level.
And using ML-style modules is easy at the term level, but cumbersome at the type level.
It's a tradeoff, and I suspect that newtypes are usually the cleaner/easier solution.
> ML modules are plain System F-omega
I hadn't seen the 1ML project. That's pretty cool.
> It's type families, as done in Haskell, that violate parametricity!
How so? I really don't understand your argument here, if you just take TFs to be a limited form of type function.
> We're not doing abstract category theory; we're writing computer programs (well, I am). Have you ever run into a problem with type families in that capacity?
I like being able to reason about my programs. For that to be a smooth process, the language has to be mathematically civilized.
> Agreed, but there's a difference between type functions and type constructors. TFs are (a limited form of) type functions.
By “type families”, I meant both data families and type families. Case-analyzing types is the problem, see below.
> And using ML-style modules is easy at the term level, but cumbersome at the type level.
Actually, ML-style modules are more convenient at the type level too! If I want to make a type constructor parameterized by 15 type arguments, rather than a normal type constructor in the core language, I make a ML-style functor parameterized by a structure containing 15 abstract type members.
> How so? I really don't understand your argument here, if you just take TFs to be a limited form of type function.
“In programming language theory, parametricity is an abstract uniformity property enjoyed by parametrically polymorphic functions, which captures the intuition that all instances of a polymorphic function act the same way.”
> I make a ML-style functor parameterized by a structure containing 15 abstract type members.
You can do this in Haskell with DataKinds (you just pass around a type of the correct kind which contains all the parameters). Admittedly, it is quite clunky at the moment. I did this to pass around CPU configuration objects for hardware synthesis a la Clash, as CPU designs are often parametrized over quite a few Nats.
> parametricity is an abstract uniformity property enjoyed by parametrically polymorphic functions
Whenever one introduces a typeclass constraint to a function, one can only assume that the function exhibits uniform behavior up to the differences introduced by different instances of the typeclass. There is no particular reason to assume that (+) has the same behavior for Int and Word, except insofar as we have some traditional understanding of how addition should work and which laws it should respect. The same is true for type families. It is not a problem that they introduce non-uniform behavior; we can only ask that they respect some specified rules with respect to their argument and result types.
Case-analyzing types in type families is no worse than writing a typeclass instance for a concrete type. Would you say that the fact that "instance Ord Word" and "instance Ord Int" are non-isomorphic is a problem? After all, the types themselves are isomorphic!
> Whenever one introduces a typeclass constraint to a function, one can only assume that the function exhibits uniform behavior up to the differences introduced by different instances of the typeclass.
Of course.
> Would you say that the fact that "instance Ord Word" and "instance Ord Int" are non-isomorphic is a problem? After all, the types themselves are isomorphic!
It's already bad enough, but at least the existence of non-uniform behavior is evident in a type signature containing type class constraints. OTOH, type families are sneaky, because they don't look any different from normal type constructors or synonyms.
>OTOH, type families are sneaky, because they don't look any different from normal type constructors or synonyms.
That is fair.
I think we're on the same page at this point. You have made me realize that ML-style modules are useful in ways I did not realize before, so thanks for that.
Question: How would you feel if the tradition was to do something like
insert :: Ord a f => a -> Set f a -> Set f a
That is, "f" is some type that indicates a particular ordering among "a"s. Then, "Set"s are parametrized over both "f" and "a", and one cannot accidentally mix up Sets that use a different Ord instance.
Seems a lot more cumbersome than the direct ML solution:
signature ORD =
sig
type t
val <= : t * t -> bool
end
functor RedBlackSet (E : ORD) :> SET =
struct
type elem = E.t
datatype set
= Empty
| Red of set * elem * set
| Black of set * elem * set
(* ... *)
end
structure Foo = RedBlackSet (Int)
structure Bar = RedBlackSet (Backwards (Int))
(* Foo.set and Bar.set are different abstract types! *)
> Parametricity is too good to give up. With the minor exception of reference cells (`IORef`, `STRef`, etc.), if two types are isomorphic, applying the same type constructor to them should yield isomorphic types.
You know that's not what parametricity means, right? Like, at all?
Here's a challenge.
`foo :: forall a. a -> a`
Now, by parametricity that should have only one inhabitant (upto iso). Use your claimed break in parametricity from type families and provide me two distinct inhabitants.
I should have specified "modulo bottom" because I somehow didn't cotton I was talking to someone more interested in pedantry than actual discussion.
That said, constructing an inhabitant of false a _different_ way (when we can already write "someFalse = someFalse") is not particularly interesting, and again doesn't speak to parametricity in any direct way.
The lack of a standard library can be fixed relatively easily: write libraries! OTOH, the existence of anti-modular language features that are extensively used in several major libraries, is a more serious problem, because:
(0) It means that libraries in general won't play nicely with each other, unless they're explicitly designed to do so.
One of the common mantras I've heard among Rust core devs is "std is where code goes to die". Where do you feel the line should be drawn between standard lib and external libraries?
The counterpoint being Python simplejson vs json. Most working Python developers I know try simplejson first and fall back to stdlib json (when they are not controlling dependencies in the environment), because simplejson got much faster as it evolved outside of the standard library[0]. Most who don't know this go the other way[1].
There are a number of counterpoints in Python, in fact, which epitomizes the "standard library is where code goes to die" thing. Adding modules to the standard library in Python is, more often than not, overall a bad thing for the module. Python has not historically been awesome with standard library quality, either; see Java-style logging and unittest (I mean naming, not "Java idiomatic," which I think is fine for both).
This comes down to release cycles for the language, mostly. So I think API stability is a bit of a red herring when discussing Python, at least.
I tend to appreciate languages where I can remove the entire standard library and "start over," like C. (Yes, you can.) This can be good for a number of things: porting, embedding, frameworks, and so on.
Case by case. If you absolutely need the speed, go with what gives you the boost. Otherwise, I always encourage people to use json and not have to have an additional external dependency. One of the reasons some people were using simplejson isn't really speed IMO, but because the json module was not in the stdlib until what, late 2.6?
I try to keep my dependency list as tiny as possible, and use what makes sense for my development and for future maintenance. Also, look at the result: in Python 3, the json module beats simplejson.
It wasn't "late" 2.6 (that's not how Python releases work for changes like that), it was 2.6, which was October 2008. Nearly eight years ago. Most distributions are even on Python 2.6 now.
Anyway, my point isn't the specific example. That you and I even have this discussion at all and that there are hundreds of thousands of caught ImportErrors on that specific example on GitHub is my point regarding standard library stability; folks seem to think the standard library is the end-all (wherein we wouldn't be having this conversation at all), but Python has shown it is anything but when not carefully maintained. I think Rust is wise to approach this with caution.
Honestly, I'm not extremely familiar with Rust, but it seems it elected the C approach where you can gut the language. A+. Good. How it should be for a systems language like that, because now it can be ported, embedded, and so on.
> That you and I even have this discussion at all and that there are hundreds of thousands of caught ImportErrors on that specific example on GitHub is my point regarding standard library stability;
The fact that thousands of files catch ImportError does not necessarily imply folks are questioning the stdlib's stability. That merely means some people are deliberately choosing to prefer simplejson over json. The benchmarks demonstrated the json module before Python 3 could be slower than simplejson, but the json module since Python 3 has beaten simplejson in terms of speed of execution. Furthermore, there are old Stack Overflow threads on ujson vs simplejson vs json regarding performance. All the above would naturally suggest folks who choose to prefer simplejson over json do so due to the concern of speed, rather than opinion on stability.
Also, stability is the wrong term for the problem you are describing. Agility is probably the better word. Python releases tend to be backward compatible (of course except Python 2 vs Python 3 and a few other modules like asyncio). Python core developers try not to break applications. If anything, non-core libraries will break compatibility more frequently without having to face larger opposition; I could break simplejson if I were the maintainer of simplejson. The consequence is maybe a couple of angry GitHub issues and a few blog posts, unlike Python 3 which still gets a lot of angry media coverage to this day.
The problem with the stdlib is absolutely about agility. The core community is extremely small. It can take many weeks and sometimes months to get your commit merged. The reason I like to keep the stdlib around is good citizenship. I would love to have requests in the stdlib, but with a more agile and more frequent release. Python isn't the only player. OS distros are also responsible for the slowness. There's been discussion on python-dev regarding more frequent releases, and even potentially breaking up the stdlib could be an option for the Python community.
The way I see it, the packages that support both do so because they know over 90% of users are satisfied with the performance of the standard library package and don't want to install extra dependencies to get the library or utility to work.
Even more code just uses the standard json package without any fallback. The ease of development or deployment is clearly worth more to them than what small speed advantage they can get from going with the external dependency.
The calculus will be different for Rust, of course, with different build and deployment system.
In the Ruby world, very few people use the standard library because it's got so many flaws, and they can't be fixed. So you end up with Nokogiri rather than REXML, all the various HTTP libs rather than net/*, etc. So it just ends up being bytes sent over the wire, wasting disk and bandwidth...
I wonder if identifying the atomic aspects of what you intend your language to be used for ultimately helps in narrowing down what should be in std lib.
Go prioritizes network programming and bundles the necessary components, like http & rpc servers and json.
The http libraries are extensible enough to allow for customization where it's wanted (like http mux) while still creating a canonical implementation that's still viable.
Has Rust identified the core demographics of who they're targeting in order to provide the most applicable platform? Is the target everyone and all application types, therefore there is no default platform?
Edit: To put it another way, is there a set of packages that is either necessary for Rust, Rust development, or most development in Rust? If the std lib includes everything necessary, then who are you targeting with the default platform?
> Has Rust identified the core demographics of who they're targeting in order to provide the most applicable platform? Is the target everyone and all application types, therefore there is no default platform?
Our target audience is still a bit too broad; "systems programming" can mean a lot of things. Application developers build a _lot_ of different applications, those who embed Rust in other languages have a different set of requirements, OS/embedded devs have another. There's a lot of stuff in common, but there's also significant differences.
Well, the trick is to actually get it right before standardizing it - much easier said than done. Keeping the standard library small helps with that since the bar is higher.
Go isn't immune to the problem either. See the `flag` package, which is something that new users are encouraged to avoid in favor of e.g. https://github.com/jessevdk/go-flags .
If you want to write command line apps that conform to the GNU flags convention you can't use the "flag" library. I wrote my own simple getopt implementation (github.com/timtadh/getopt) years ago so I could just get some work done. It works fine and has no dependencies. I write a lot of complicated command line applications and having a small simple getopt implementation makes it a lot easier. Sometimes a higher level tool would be nice but I have never found one I actually like.
Being able to tightly control how the sub-commands chain together is important to me. Support for both short (-s) and long (--long) options make it easy to write both one off commands and self documenting commands in scripts and makefiles.
I write programs in more languages than just Go, and the programs in Go need to work the same way the programs in other languages work. That means GNU option syntax, which is the superior syntax for my needs in any case.
I think that Go can pull off a good standard library because there's a big corporate sponsor behind it, whereas Ruby may have had difficulty with its standard library for the lack of a sponsor.
Standard doesn't mean completely done. Standard should be able to accommodate things like HTTP2, as Go has done, whether that means expanding the API or whatever.
The "big corporate sponsor" argument often comes up when discussing language success. Google doesn't really put more than a few people's time into Go, the rest is open source. Other languages like Python didn't have any real backing until way after success.
It's too early to draw conclusions about Go's standard library. Python's standard library seemed like a good idea at the time too. Come back in 15 years and let's see how good it looks then.
Sql? It does all the wrong things; singletons, no testability, cgo for implementations, side effects and you have to use every database differently based on their individual semantics.
Virtually everyone I've ever spoken to either uses a high level wrapper around the sql library or a no-sql solution.
That's the definition of 'stdlib is where packages go to die'.
It's not that the API is unusable, it's just basically not used by the community because there are other better things out there...but you're stuck with it forever, because it's there and some people do use it, and changing or removing it would be a breaking change.
Anyhow, we're just speculating. Does anyone actually collect metrics about the usage of different parts of the stdlib for any language?
Without hard data to back it up, you couldn't really make a strong argument either way.
I didn't see anyone actually mention sql so I'll just assume your first line is to be interpreted as "sql is the counterexample of why Go's standard library is not as great as it may seem."
>Virtually everyone I've ever spoken to either uses a high level wrapper around the sql library or a no-sql solution.
How does that reflect the quality of the std lib implementation? All the high-level wrappers I've seen still utilize database/sql, they just provide convenience methods on top of the existing functionality. Are people using NoSQL databases because database/sql is so bad or merely because that technology fits their project's requirements?
>That's the definition of 'stdlib is where packages go to die'.
steveklabnik's example of Ruby XML parsing libraries is a better example of this, if only because the std lib implementations are almost completely ignored by all other gems. Go's database/sql is actively used outside of the std lib to great effect, whether in wrappers and ORMs or in implementing other SQL databases (like Postgres).
> "Sql?
It does all the wrong things; singletons, no testability, cgo for implementations, side effects and you have to use every database differently based on their individual semantics."
SQL has its flaws, but it is testable. The testing approaches available vary depending on the implementation. For example, you can write unit tests for SQL Server (using tSQLt, to give one example: http://tsqlt.org/).
String: Linked list of Char. Nice for teaching, horrible in every other aspect.
Text and lazy Text: modern strings, with unicode handling and so on.
ByteString and lazy ByteString: these are actually arrays of bytes. Used to represent binary data.
Because Haskell is lazy by default, and sometimes you want strictness (mostly for performance), there are two variants of Text and ByteString, and going from one flavor to the other requires manual conversion.
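To make the conversion overhead concrete, here is a small made-up boundary function, assuming the text and bytestring packages: a legacy API hands over a String, a template step produces lazy Text, and the socket layer wants a strict ByteString.

    import qualified Data.ByteString    as BS
    import qualified Data.Text          as T
    import qualified Data.Text.Encoding as TE
    import qualified Data.Text.Lazy     as TL

    sendGreeting :: (BS.ByteString -> IO ()) -> String -> IO ()
    sendGreeting send name =
      let greeting :: TL.Text
          greeting = TL.fromStrict (T.pack ("hello, " ++ name))  -- String -> strict Text -> lazy Text
      in  send (TE.encodeUtf8 (TL.toStrict greeting))            -- lazy Text -> strict Text -> ByteString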
At the risk of going a bit off-topic, I think the lazy versions of Text and ByteString wouldn't have been needed if we had nice abstractions for streams (lists are not one, they cause allocation we cannot get rid of), so that you don't need to implement a concrete stream type (e.g. lazy Text and lazy ByteString) for every data type.
The problem is that streams actually have very complicated semantics when they interact with the real world. What does it mean to traverse an effectful stream multiple times? Can you even do that?
Data.Vector provides a very efficient stream implementation for vector operation fusion, but it's unsuitable for iterators/streams that interact with the real world. Pipes, on the other hand, combined with FreeT, provides good, reasonable semantics for effectful streams.
As with many other things, Haskell forces you to be honest with what your code is actually doing (e.g. streaming things from a network) and this means that there's no one-size-fits-all implementation we can stuff everything into.
Just sticking with the pure types there's currently no generic stream model that works well. No stream fusion system fuses all cases (even in theory) and they also fail to fuse the cases they're supposed to handle too often in practice.
I haven't looked at pipes, but I'm guessing it doesn't all fuse away either.
You're right, I believe Haskell's fusion framework could be greatly improved (although it is the best production solution I'm aware of). However, how would you go about solving this? I don't think there's any generalized solution to the problem of creating no-overhead iteration from higher-level iterative combinators.
> Haskell's fusion framework could be greatly improved (although it is the best production solution I'm aware of). However, how would you go about solving this?
Given that we're in a rust thread... are you familiar with rust's iterator fusion [0]? Basically there are three components: iterators (something like a source), iterator adapters (where all manipulations happen), and consumers (something like a sink). LLVM will compile all the iterator adapters into a single manipulation such that the underlying stream/vector/whatever only goes through it once.
I personally like it much better than Haskell's. With rust the fusion is guaranteed to happen, although it makes the types a little verbose and tricky to work with, but with, e.g., Haskell's Text's stream fusion I was never really sure that it was working, or if I could do something to prevent it. It seems like in Haskell it's more of a behind the scenes optimization that you hope kicks in, rather than designed into the types. Or do I misunderstand? I only dabbled in Haskell.
Yes, I have used Rust a bit. Basically the primary difference is (and correct me if I'm wrong) you can't re-use an iterator in Rust without cloning it. On the other hand, you can use a Haskell pure stream object as many times as you want (without explicit cloning, because "draining" an iterator is stateful), so fusion becomes a bit of a more complicated problem.
If I had some Haskell code that was like
map f . map g . filter x . map y $ stream
It would almost certainly get fused into a single low-level loop without extraneous allocations. However, I can also do something like
foo = map y $ stream
bar = map f . map g . filter x $ foo
baz = map z $ foo
And now what do you do?
Haskell's fusion is also more general, because it allows you to do pretty much arbitrary syntactic transformations.
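The "arbitrary syntactic transformations" here are GHC rewrite rules; the classic map/map rule from the GHC user's guide gives the flavour:

    module Rules where

    -- tell GHC that two consecutive maps may be rewritten into one pass
    -- (it fires only when the optimizer can see both maps)
    {-# RULES
    "map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
      #-}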
Unfortunately, this means it's somewhat fragile and is easy to prevent from functioning. Rust can guarantee fusion because you're restricted in the kinds of things you can do with iterators.
On the other hand, Haskell's Pipes restrict you from doing things like re-using an iterator, and I'm not sure what the optimization story is there.
And they work! It's not stream fusion, but the composed functions are applied per value to whatever container or stream of values, so (map (comp xf1 xf2)) applied to [1 2 3] applies (xf2 (xf1 1)), (xf2 (xf1 2)), and so on, with similar allocation savings to stream fusion.
>Conversions between our 5(!) string types are very common.
All five of those string types do different things. This isn't a problem; we just have increased expressivity. We couldn't fix this by having a more coordinated standard library. 5 is also a very manageable number IMO.
>It's too difficult to make larger changes as we cannot atomically update all the packages at once.
If a language is to play the long game, it must be conservative about what it adds. Even a minimal runtime like Node is still wounded by the addition of a few broken interfaces into the core platform (event emitters, streams, domains to name a few). These cannot be stripped out of the platform because they're "blessed" and now everyone will forever have a bad time. I suggest we don't do that for Rust.
For a language to remain relevant in the long term, it must be capable of evolving. Going out there and blessing potentially conceptually incorrect packages such as Mio is therefore not a good idea. The notion of "platforms" best resides in userland, where collections can compete and evolve.
By keeping platforms / package collections in userland we can get organic collections of stuff that make sense to bundle together. Imagine a "server platform", an "event loop platform", a "calculus platform", a "programming language platform". It would be detrimental to creativity to have all these collections live in the shadow of the "rust committee blessed platform".
But so yeah, I'm not opposed to the idea of platforms - I just don't think blessing userland stuff from the top down is the right play in the long term.
Tl;Dr: package collections sound like a cool idea, but if you care about the long term, imposing them from the top down is bad.
> These cannot be stripped out of the platform because they're "blessed" and now everyone will forever have a bad time. I suggest we don't do that for Rust.
It sounds like the proposal in the OP avoids this problem by having versioned platforms that are independent of the Rust version. So if something turns out to be a bad idea, it can be stripped out of later platform versions and replaced with something better without disrupting users of the older platforms.
Yeah, I really liked how DirectX did versioning in that respect. It lets you improve the API while preserving backwards compatibility for the old APIs and signatures. If you can give Microsoft credit for one thing, it's that they do backwards compatibility well.
I don't see why that couldn't be worked into this idea.
> By keeping platforms / package collections in userland we can get organic collections of stuff that make sense to bundle together. Imagine a "server platform", an "event loop platform", a "calculus platform", a "programming language platform". It would be detrimental to creativity to have all these collections live in the shadow of the "rust committee blessed platform".
That's exactly what we want to do. We don't have the domain expertise in many of these areas, but we want to enable those ecosystems to help develop their own platforms. It's one of our goals to make sure any infrastructure we develop for the "Rust Platform" is usable throughout our community.
I'm a bit worried when a language develops a "platform" and an "ecosystem". This usually means you need to bring in a large amount of vaguely relevant stuff to do anything. It adds another layer of cruft, and more dependencies.
Write standalone tools, but don't create a "platform". Don't make the use of the language dependent on your tools.
C++ does not have a "platform". Nor does it need one.
C++ has innumerable de facto "platforms" and "ecosystems". You have to choose one to get anything done, whether that be various Boost libraries, POSIX, Win32, Cocoa, Qt, GTK(mm), even stuff like XPCOM…
Helpfully, many of these platforms reinvent basic things like strings [1] and reference counted smart pointers [2] in incompatible ways.
Wouldn't it be better if there were just one platform?
And notably, C++11 actually moves in this direction, standardizing things like smart pointers [0][1]. Doing the same early on is a very smart move for Rust: core or near-core library wars in the early days of a language's adoption lead to duplication of effort, and for those invested in seeing Rust gain a set of libraries to rival other languages', this is a great thing.
Actually, C++'s STL is in a weird situation compared to the standard library in other languages, because the STL is a spec, not an implementation, and there are as many implementations of the STL as there are compilers. This might arguably happen if there were multiple Rust compilers, though. Anyway, the result for the C++ STL is that in many cases the same types have different performance characteristics on different platforms, or worse, different behavior/bugs.
I work on multiple medium-sized projects that disagree. If you're not writing GUI code, it's quite possible to write 99% platform-agnostic code without the help of a 3rd-party standard library supplement, especially with C++11.
It's possible, yes. But (a) large pre-existing industry codebases make use of their legacy libraries that predate "modern C++"; (b) that directly contradicts the parent poster's point, in that you're saying that having a standard platform is a benefit.
Some languages have also seen great success with having a standard platform while still allowing users to replace it as needed. The many Haskell Preludes and Jane Street's OCaml Core are two such examples.
I think the ability to opt out of the rust-platform metapackage is a great feature.
We already have `std` for that. The point of a "Rust platform" is that you can have confidence that the libraries you are using are of decent quality, reasonable popularity, and will be maintained.
I get the feeling that you didn't really read the post. Nothing in the platform is required to use Rust, and you can trivially write Rust packages that don't use the platform. The point of the platform is for convenience. In most cases it will make sense to use it because it provides a convenient set of libraries that are known to work well together, but you could also choose to just ignore the Rust platform entirely and continue to use Rust the same way we've been using it up to now.
"In general, rustup is intended to be the primary mechanism for distribution; it’s expected that it will soon replace the guts of our official installers, becoming the primary way to acquire the Rust Platform and all that comes with it."
Then, of course, the other installers will gradually break and be abandoned. The effect is that users must run the "Rust Platform", unless they have the resources to build their own distro.
Is there a monetization scheme behind this? Does someone aspire to be the Canonical of the Rust ecosystem?
Rustup being the mechanism to install the Rust Platform does not mean that installing the Rust Platform is required to use Rustup. Rustup is an existing tool that installs Rust for you, and it's highly likely that you'll be able to use it in the future to install just Rust or to install the whole Rust Platform at your discretion.
> Is there a monetization scheme behind this? Does someone aspire to be the Canonical of the Rust ecosystem?
This seems like a complete non-sequitur. I have no idea what you're trying to suggest here.
I don't see anything in this proposal which indicates that rustup would by default install any of this, and it's already the primary target of development efforts, Rust Platform or not.
What other installers are you referring to, exactly? The old rustup.sh which couldn't support multiple toolchains installed alongside each other? multirust which didn't work on Windows? rustup is a massive improvement over both, IMO.
I'm sorry, but this is incoherent FUD. Do you expect the Rust project to maintain multiple installers? Why? What does this have to do with monetization?
Go watch recent talks by Herb Sutter and Bjarne Stroustrup: they both lament the fact that C++ never developed as strong a standard library as Python, etc. With C++14 and beyond, the C++ working committee is actively trying to make the language and libraries more complete and comparable to the larger libraries out there.
In a low-level language like C there's _never_ a right solution out of the box. Instead, you use the language because it permits you to tailor the solution to the problem.
In a high-level language like Python there's always a right solution. It's just rarely well-tailored to the problem.
I'm not at all surprised C++ is still muddling around in the middle somewhere.
Qt / Boost / .net are C++ platforms. The difference is you can choose one or none, and the OP actually explicitly talks about how important it is not to have one absolute blessed platform like Java has.
Haskell Platform is the last thing you should take inspiration from. Many of us have been doing our best to kill it off. Maybe the downsides involved wouldn't affect Rust in the same ways.
My suggestion, look at how Stack (the tool) and Stackage (the platform) work:
Most of your arguments linked in a comment below are unrelated to the application of metapackages to Cargo. Cargo already includes a good portion of the behaviors found in Stack and Stackage. The ideological battle of Stack vs Haskell Platform is irrelevant to this proposal.
Haskell Platform is a perfectly adequate example from which to take high-level inspiration for the core of this idea: use Cargo (which is like Stack/Stackage for Rust) to help bootstrap Rust libraries when using rustup to install and upgrade (mostly equivalent to `stack setup`).
The one thing Stack does do nicely that this proposal doesn't is allow curated, compatible versions without forcing coarse-grained dependencies. In plainer English, Stack can pick the versions while you still opt in package by package. I believe this is crucial to not slowing down the evolution of the ecosystem.
In Cargo jargon, a solution would be for metapackages to double as sources: `foo = { metapackage = bar }` to use foo from bar.
I like the general theme of having the platform be just a set of known-compatible versions, but on the other hand this feels like it loses out on many of the ease-of-use advantages of just specifying a platform version and knowing that you have all the crates inside of it.
I don't think we need to pick one way exclusively. My specific use case is making PRs for packages to work in kernelspace / in unikernels. The library might initially be packaged the easy way, but then I'd use this. I don't want the PR recipient to also worry they might get out of sync with the platform as a side effect.
If I'm understanding your concern correctly, that's totally a part of the proposal:
> But we can do even better. In practice, while code will continue working with an old metapackage version, people are going to want to upgrade. We can smooth that process by allowing metapackage dependencies to be overridden if they appear explicitly in the Cargo.toml file.
I actually cross-posted this to /r/haskell to get explicit feedback. Someone else mentioned stack/stackage. In my understanding, Cargo already does this specific behavior. Can anyone who's more familiar with both confirm this?
I checked that other thread, and I agree completely with jeremyjh's summary.
The problem with Haskell Platform came down to a set of intersecting issues:
- A culture of setting very strict & narrow version bounds. (Based on known-good rather than based on avoiding known-broken)
- Tools (Cabal) that enforce dependency version bounds. If there's a mutual/transitive incompatibility in version bounds, the build fails, period. You had to figure out the problem and fix it yourself if there was a truly irreconcilable issue.
- The recommended installer on the website (the Platform) unnecessarily installed packages into the global package database, making you "stuck" with those versions for _all_ packages you attempted to build: Cabal was restricted to finding build plans whose dependency versions matched the ones provided by the global package database.
These problems led to beginners being confused by seemingly spurious build failures because Platform would fall out of date with the rest of the ecosystem. Cabal would be unable to find sets of compatible dependencies and say it couldn't build the package.
What non-beginners were doing to avoid these problems was:
1. Install the bare compiler, no Platform
2. Use package database sandboxes for each project
All of these (UX and technical) problems were solved by Stackage and Stack, without compromising dependency conflict enforcement.
Speaking hypothetically, if Cargo behaves like Maven or Ivy and pulls in two dependencies that want conflicting versions (1.1 and 1.2, say) of a particular library and just picks a winner, then you'll never see something like this.
From my perspective, this is one of the biggest issues in Cargo right now. I know it's not the same as the HP problem, but current Cargo is definitely not good at solving this.
My email is what finally moved the committee and GHC devs on including Stack with the Platform and on some other decisions concerning the website.
Like I said, not all the downsides may be applicable to how Cargo works or what aturon has in mind, but please don't cite it as an exemplar of anything.
That's quite the email! Without a TLDR, I'm not sure how to assess how susceptible Cargo is to those problems. My personal experience with Rust has been that there are very few instances where versioning or dependency issues cause me problems. The specific instances I have dealt with would be resolved by the proposal in the blog post (trying to use a version of serde which differs from what another core-ish library wants to use, etc.).
IMO the root problem is that the Haskell Platform is both a set of curated packages and a bandaid for the fact that so many things Haskell are hard to build/install for no-good reason.
On the first front I guess it's alright (but Stackage is better), and on the second front it's totally inadequate, maybe even harmful, in that it probably made the problem just painless enough to encourage procrastination.
I think most of this lesson doesn't apply to Rust, but it's still good to be aware of. It certainly gave me a strong aversion to coarse-grained dependency management as a bandaid.
>so many things Haskell are hard to build/install for no-good reason.
Were. Were hard. It's quite fine now with Stack. I know users of all kinds of languages that all miss Stack when they're working on their non-Haskell projects.
Not sure why you're being downvoted. Haskell tooling definitely started turning around with Stack. I've used a fair amount of package managers (including cutting edge stuff like Nixpkgs) and Stack is by far my favorite.
EDIT: To elaborate on why I like Stack:
+ Fully declarative. I don't run commands to edit my Stack environment; instead I modify the relevant stack.yaml file. This means that the current environment can always be easily examined, committed to Git, etc.
+ Easy NixOS integration which means I can also describe the non-haskell dependencies for a project, and enforce that _absolutely_ nothing else on my system gets used. This is amazing.
+ I like that it uses Stackage by default, and that I can pin projects to LTS Stackage releases. This means that many users of my projects will probably already have most dependencies installed -- especially cool for small things like tutorials where users won't have the patience for long build times.
+ It reuses the already existing `.cabal` file format so it's easy to make libraries compatible with cabal-install, the other package manager in the Haskell ecosystem.
Hey! So I'm deeply appreciative towards cabal-install for historical reasons. The situation before cabal-install was . . . not the best (https://www.gwern.net/Resilient%20Haskell%20Software) so I think we can credit cabal-install with a lot of the success the ecosystem has had over the last decade.
That said, nix-style builds don't actually address my issues with cabal-install. My (very personal) preference is to always have the current dependencies being used reflected exactly in a local file. I picked up this attitude from NixOS, but it's an extremely useful way of doing things -- the knowledge that as long as I don't lose that file I can always rebuild my current environment exactly is just too awesome.
I think that if you need to track version bounds as well (say you're developing a library), they should be kept in a separate file, and you should always have a master file that describes what you're currently using locally. Happily, this is exactly what Stack does with my-project.cabal and stack.yaml.
That said, I appreciate that others like cabal-install, so I do what I can to make my code easy to use with both package managers (the main thing is keeping aggressive upper bounds on the few libraries I maintain, which especially helps cabal-install users since they do dependency solving more).
Having used Java and having experienced how you learn to replace the JDK URL parser with something else, the JDK HTTP client with something else, the JDK encoding libraries with something else, etc., I'm worried about entrenching first-published libraries beyond their natural first-mover advantage and making it even harder for better later-comers to be adopted.
OTOH, compared to e.g. C++, where even strings aren't standard across multiple real large code bases (e.g. Gecko and Qt) having standard types especially for foundational stuff can be very useful if the foundation is right or right enough. Java having a single (though not optimal) foundational notion for Unicode (UTF-16 even though UTF-8 would be optimal) and a common efficient and spec-wise correct (though not design-wise optimal) XML API (SAX) was a key reason to develop Validator.nu in Java as opposed to e.g. Python (whose notion of Unicode varied between UTF-16 and UTF-32 based on how the interpreter was compiled!).
Still, at this early stage, I'd point to crates.io in general as the "batteries" instead of creating a layer of official approval on top. But I'm biased, because lately, I've been writing a second-to-market crate for a topic for which a crate already exists.
Good work! The idea of dropping extern crate is worrying, however. Most of the ways that could be done would add more irregularity, complexity, and implied environment to the language (the Rust Platform is always there, whether or not you want it), all of which are opportunities for bugs to creep into code.
You have modules, use, extern, Cargo.toml, Cargo.lock. It looks quite verbose and redundant to me. I expected it to get simpler, but moving more into config files seems a backwards step. I may be biased, but I find Go's import/packaging vastly superior in terms of usability (i.e. using a single import statement, you make the import explicit and ready to use). I think the trick is to build the tools on top of the source code, not on top of config files (i.e. à la C/C++).
Would it still work correctly when invoking rustc on its own or from another tool? I've seen a couple of people on IRC asking about driving rustc from a non-Cargo tool.
rustc already requires passing an --extern flag for each 'extern crate' in the source, so in some sense, yes. You'd be passing that stuff along. You already have to know where those deps are on disk; the difference is that you wouldn't have a list of the deps in the source code. But if you did, it would work.
> rustc already requires passing an --extern flag for each 'extern crate' in the source
Only if it can't find the crate otherwise. Generally it just searches your library path and any directories you specify with -L; e.g., most cargo executables can be recompiled (after a first compilation via cargo to build the dependencies) via `rustc src/main.rs -L target/debug/deps`.
I mean really, why is this necessary?
The reasons for adding this much bloat are tiny and irrelevant. You get a big download full of packages you don't all need and don't know if you need, and all for what?
We all have Google, if I want an HTTP library for Rust I'll google whatever the best one is and make a judgement call myself.
This is all optional. You can just ... not add the one line to the cargo.toml and manually specify your favorite http library. Folks using many of these libraries or people new to the language can just specify rust-platform deps. This has the added benefit of bringing in versions that are known to work well together -- because of semver this isn't usually a problem, but people aren't perfect so sometimes things break. An added guarantee against that is nice to have.
You can still have a tiny install if you want to. But most people not knowing what to do will be able to have batteries included setup.
Basically: people knowing will still have freedom, people not knowing will have an easier life.
And in the end, in 2016, you generally don't care if you download an extra few hundred MB. If you do, you'll just spend the time to set up something smaller.
So if you don't know what to do, the solution is to download all the possible packages? Is there a user story / use case solved this way? The documentation for each package is available online, so wouldn't it make sense to actually read it before downloading a package?
One of the great things about Python is the stdlib, with a lot of things provided for you. And one of the most terrible things in JS is the total lack of one. Rust chose a middle ground: provide a platform by default, yet let you choose not to use it. Win-win.
People who don't know what to do exactly? Downloading a package? If they can't download a package, then they have an entirely different problem to solve, which they'll have to solve anyway if they ever want to productively program in Rust.
If the aim of the Rust Platform is to provide a 'blessed set' of 3rd-party libraries, why not have something akin to an official, curated 'Awesome Rust', where, depending on what you want to achieve, you can get a streamlined list of well-maintained libraries, perhaps with user-submitted usage examples, comments, alternatives and the like? That way you're not constraining Rust itself to be in sync with all the 3rd-party libs (and thus by necessity stagnating to a certain degree), nor are you making authors of libraries that are not in the Platform feel essentially invisible.
How is that different than the proposal? The only part that's maybe different is "feel essentially invisible", which is something that might happen, I'll grant you that. But I don't think that people feel invisible when some package is in a standard library, which is roughly equivalent here.
I just don't think that going beyond a simple website with 'blessed' crates has much benefit, but it certainly has a lot of disadvantages as outlined in this whole thread. I'm simply saying that if the goal is to have a central place to point to when somebody asks, 'how do I do X', then perhaps making a website/improving crates.io in this regard is the better approach for the continued growth of the still very much evolving Rust ecosystem, rather than creating a Rust "distribution" as such.
P.S. Thanks for all your work Steve, it's really appreciated.
Combine this with my stdlib deps RFC[1], and I think we could shrink the number of standard library crates versioned with the language! The standard library metapackage would pull crates from crates.io as needed.
Can someone summarize what's going on here for lay people? I know this "Rust" thing is pretty popular on HN nowadays but not sure what the difference is between a language and a platform. I was trying to read the article, but I don't have enough knowledge about rust itself to understand what it's talking about...
Rust is the language itself. The platform in this context is the standard libraries you use for things like string manipulation, network connections, etc.
Pedantically, the standard library (strings, basic networking, containers/collections, etc.) is already well defined and quite small. The Rust Platform idea seems more aimed at providing access to curated, stabilized versions of community-developed libraries for common higher-level tasks (async I/O, serialization/deserialization, etc.).
Personally, I think that letting users (especially beginners) opt in to a larger set of starting libraries could be very beneficial for adoption.
Wouldn't it be enough to have a list of curated libraries documented somewhere in the official Rust documentation? It seems to me that "platform" support would encourage monolithic designs (i.e. packages that only work within a specific "platform").
The relationship between Boost and the C++ standard library might provide a good example (especially in recent years, as there's been sustained focus on expanding stdlib). Boost has a stringent peer review and comment period. Sometimes I think they let in designs that are too clever, but they rarely let in the clunkers that are seen in Python's standard library.
People then gain experience with it and sometimes subtle issues emerge. Those issues can be fixed, either in Boost itself or when libraries move from Boost to stdlib (and sometimes the issues inform language features).
While C++'s standard library is not "batteries included" like Python's, it's been very gratifying to see it expand slowly and surely over the last few years.
Couldn't much of the function of such a "platform" be automated?
One of the main constraints that would guide selection of crates for this platform metapackage is that they must have compatible dependencies. So package A depending on Bv1 and package C depending on Bv2 wouldn't work, because Bv1 and Bv2 would both have to be included, leading to a conflict.
But this information is (in theory) encoded in semantic versioning. Assuming proper semantic versions for crates, a target set of crates to be included could be specified, and then the various sets of crate versions that do not have conflicting dependencies could be calculated automatically.
These compatible crate/version sets could be automatically generated and published as metacrates.
Consider the following crate/version dependencies:
By imposing an ordering on the crate versions, these compatible sets could be automatically identified: compatible_A_C_0 is {'Av1','Cv1'}, compatible_A_C_1 is {'Av1','Cv2'}, and so on.
Obviously the semver could be wrong and unexpected incompatibilities could crop up. But couldn't these just be autogenerated and then voted on? Then the best compatible sets will filter to the top and, de facto, the Rust Platform has been autogenerated.
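As a toy sketch of that enumeration, with entirely made-up crates and a deliberately simplified rule (two crate versions are "compatible" only if they pin the same major version of a shared dependency B; real constraints would come from semver ranges on crates.io):

    use std::collections::BTreeMap;

    fn main() {
        // Hypothetical data: (crate version, major version of B it requires).
        let a_versions = [("Av1", 1), ("Av2", 2)];
        let c_versions = [("Cv1", 1), ("Cv2", 1), ("Cv3", 2)];

        let mut sets = BTreeMap::new();
        let mut n = 0;
        for (a, a_needs_b) in &a_versions {
            for (c, c_needs_b) in &c_versions {
                // Compatible only if both agree on which major version of B gets pulled in.
                if a_needs_b == c_needs_b {
                    sets.insert(format!("compatible_A_C_{}", n), [*a, *c]);
                    n += 1;
                }
            }
        }

        for (name, members) in &sets {
            println!("{} = {:?}", name, members); // e.g. compatible_A_C_0 = ["Av1", "Cv1"]
        }
    }

A real version of this would need full semver resolution and integration testing of each candidate set rather than this toy rule, but the enumeration itself is mechanical.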
> So package A depending on Bv1 and package C depending on Bv2 wouldn't work because Bv1 and Bv2 would both have to be included, leading to a conflict.
As mentioned in this thread, this already works just fine. Rust can handle both versions.
Furthermore, it's more than just a constraint problem; there's also integration issues, testing the whole thing together, etc.
I love seeing Aaron's work within the Rust community. I had the pleasure of studying under his father and his family's gifts are clear in both their work.
Are metapackages going to be available for others to utilize? If so, how will conflicts be resolved if packages require two different versions of the same package?
Yes, metapackages are intended to be a general cargo feature, available to all. I suspect the design has not progressed far enough to definitively answer your question about conflict resolution, but I'd imagine you have to override that dep explicitly to fix it.
The TeXLive of the Rust world? Could be helpful. But the small std probably implies a constantly revolving cast of "best practice" libraries that's hard to keep up with. I know in TeXLive there are no stability guarantees regarding the collection as a whole. It's more of a collection than a platform.
Historically, there have been a number of proposals by the Rust team that were received badly. We've always benefited from outside eyeballs on things; usually the final proposals end up much stronger. We've only sought to _increase_ this kind of thing over time.
For contrast, with .NET Core, Microsoft decided to cut up .NET's quite extensive stdlib (BCL + FCL) into packages. On paper, it looks pretty good, but it remains to be seen how well the versioning will work in the long run.