
I feel like this is an example of what Jonathan Blow calls a "Big Idea" or a "100% solution". His thesis is that when you make a feature of a language too abstract and usable in many different contexts, eventually there will be so many corner cases that the result will almost certainly be clunky and full of footguns.

He claims that language designers should aim for "80% solutions" instead, which cover most common usages but limit themselves enough to avoid complexity. This runs in contrast to a lot of commonly accepted language design wisdom.



> He claims that language designers should aim for "80% solutions" instead, which cover most common usages but limit themselves enough to avoid complexity. This runs in contrast to a lot of commonly accepted language design wisdom.

This is easy enough to say, and indeed I do think it's a good approach, but the problem is identifying that 80% in the first place. The reason that language designers tend to favor general approaches is because they presume not to know how people are going to want to use certain things. It's an approach borne out of humility, not ideology. You need time observing how things are used in the wild before you can identify which 20% not to support; get this wrong and people will be more frustrated than if you had saddled them with the baggage of the general approach.

In the specific case of Rust's Range API, we can observe this problem acutely. Rust hugely benefited from the period between 2011 and 2015 where it was able to iterate aggressively on design and observe what opinionated stances were worthwhile. But the Range type came relatively late to the party: it was devised and stabilized only months before 1.0 as a replacement for an old, hardcoded slicing syntax that worked with no types other than plain integers, and only in very limited syntactic contexts. With little time to observe use in the wild (and with all the other madness and work that was going on in the run-up to 1.0), the reasonable approach was to not over-constrain. Now that we have experience with it one could devise ways to do it better, certainly, and with luck Rust may be able to move the type in that direction, but other than that it may just be a lesson for those languages that are yet to come.


> With little time to observe use in the wild … the reasonable approach was to not over-constrain.

Given a new language feature and limited time to observe actual use, IMHO the reasonable approach would be to constrain it as tightly as possible. It's much easier to relax constraints to enable new uses later than it is to rein in inadvisable uses of an underconstrained interface. For example, if the original Range interface had simply consisted of two private, immutable fields with Copy + PartialOrd constraints and an implementation of the IntoIterator trait, then it would be trivial to add setters (or public fields), an internal Iterator implementation, and looser type constraints later on if these were deemed necessary. Going the other way, however, breaks programs that have come to depend on these dubious features.
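
As a rough sketch (hypothetical, not the actual pre-1.0 design), such a locked-down Range might have looked like the following. Every constraint here could be relaxed later without breaking code, while none could be added after stabilization:

    // Hypothetical, tightly constrained Range: private fields, read-only
    // accessors, Copy, and IntoIterator instead of Iterator.
    #[derive(Clone, Copy, Debug)]
    pub struct Range<T: Copy + PartialOrd> {
        start: T,
        end: T,
    }

    impl<T: Copy + PartialOrd> Range<T> {
        pub fn new(start: T, end: T) -> Self {
            Range { start, end }
        }
        pub fn start(&self) -> T { self.start }
        pub fn end(&self) -> T { self.end }
    }

    // Iteration only via conversion, so the range itself can stay Copy.
    // (Reusing std's iterator here just to keep the sketch short.)
    impl IntoIterator for Range<usize> {
        type Item = usize;
        type IntoIter = std::ops::Range<usize>;
        fn into_iter(self) -> Self::IntoIter {
            self.start..self.end
        }
    }

    fn main() {
        let r = Range::new(0usize, 3);
        for i in r {
            println!("{i}");
        }
        // Because this Range is Copy, it is still usable after the loop.
        assert_eq!(r.start(), 0);
    }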


But it's an 80% solution, just not the 80% the author likes.

One of the main usages of `Range` is to be an iterator.

The other is to casually slice data structures.

For both use cases, the proposed changes would lead to major usability regressions and breakage. You can't enforce valid ranges at compile time, because many ranges are not created at compile time (e.g. `a.start()..b.mid()`), and making range creation fallible would in practice be a massive usability nightmare. E.g. consider `for x in (start..end).unwrap() { .. }` instead of `for x in start..end { .. }`.

The current solution, while imperfect, was chosen to best fit its most common use cases.

For some very performance-sensitive use cases, where you need range-based slicing and the way the std Range does things is too slow or otherwise bad, you can have alternatives which are faster but have usability drawbacks. But that's the exceptional case, not the normal one.

Many of the other examples shown also seem kind of strange. E.g. `get_unchecked`, framed as "an out of bounds array index", is well defined (it's only defined for Range<usize>, and it's also an unstable, experimental API...).

Range needs clones => only in use cases it was not primarily designed for.

Range is unsure when it's valid => No, every Range is valid, but not all valid ranges can be used in all places without errors. Indexing a slice with a range can panic anyway (out-of-bounds access), so moving the error handling there is fairly sane. Also, you really can't have fallible Range creation.

Range hides a footgun => any exclusive from-to range in any language has this problem; it's why in mathematics there are four kinds of intervals.

A Recipe for Rearranging Range => the author somehow assumes you can magically ensure that start <= end without error handling; without that oversight on their part, these changes would make Range a usability nightmare.


I don't think Range is intended to be a 100% solution at all. It only supports iteration over integers and slice indexing with a specific integer size, which are its main use cases. Also, the author is exaggerating the footgun: Range is explicitly end-exclusive and regex character ranges are explicitly end-inclusive (e.g. [a-z] includes z), so reaching for RangeInclusive should be a straightforward decision.

Yes, I think it was a rushed type which should have been a lot more constrained at 1.0 to allow for modifications later, which is how stuff is usually done in Rust. It probably also should have implemented IntoIterator instead of Iterator directly. It may be my least favorite Rust type, but I don't think it's really that bad, and I'm thankful to whoever designed it, as I still find it much better than slice-specific indexing syntax.


The footgun is not just the exclusivity but the fact that `RangeInclusive` has an extra bool field and so is wastefully large for use cases that need the inclusivity. i.e. it's a perf footgun.
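
For reference, the difference is easy to see (sizes are printed rather than asserted, since the exact numbers depend on compiler version and target):

    use std::mem::size_of;
    use std::ops::{Range, RangeInclusive};

    fn main() {
        // Range<u32> is just start + end.
        println!("Range<u32>:          {} bytes", size_of::<Range<u32>>());
        // RangeInclusive<u32> also has to remember whether iteration has
        // already yielded `end` (otherwise start..=end could never
        // terminate when end == u32::MAX), which costs an extra flag
        // plus padding.
        println!("RangeInclusive<u32>: {} bytes", size_of::<RangeInclusive<u32>>());
    }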


Then again, TFA analyzes Range in a vacuum, as if some Range<T> must make sense in arbitrary context A but also arbitrary context Z.

In reality, Range of some T generally makes sense in a local API or program. Even if that same Range<T> doesn't necessarily make sense in every other place T might be used.


Now that I know the details of Rust's range type it is extremely weird. The constraints need to be separate types. Why should range be so general as to support things that are obviously not ranges, only to return values indicating the range is malformed?


> The constraints need to be separate types. Why should range be so general as to support things that are obviously not ranges, only to return values indicating the range is malformed?

I'm not sure what malformed return values this is referring to, because I can't think of any. Is it referring to the fact that ranges where the start is greater than the end will result in an empty range? Without dependent types, which Rust doesn't have, there's no way to detect that; even in the subset of cases where the range bounds are computable at compile time, back at 1.0 Rust didn't have the compile-time evaluation machinery necessary to make that happen. You could instead choose to interpret a range whose start is greater than its end as a descending range, but plenty of other people would regard that behavior as a flaw.


I assume the issue is with Rust letting you create such a range and then having the bad stuff happen when you try to use it, rather than failing fast.


Whether 5..0 being an empty range is "bad stuff" or "good stuff" is a matter of perspective. It is often "good stuff" for me, when computing some indices to slice with. Panicking on construction would force one perspective on every use case.


> But why isn't it panicking on len()? How is 0 the right answer there?

- len is `ExactSizeIterator::len()`, which is the length of `Range` as an iterator, i.e. the number of items yielded by `next`. Which is 0.

- When slicing with 5..0, it is treated not as an empty iterator but as an out-of-bounds access. This is without question slightly inconsistent and not my favorite choice, but it was decided explicitly this way because it makes it much easier to catch bugs wrt. wrongly constructed slices. Also, it only panics if you use `Index`, which can panic anyway; it won't panic if you use e.g. `get`, which returns `None`. So treating the "bad" empty case differently for slicing doesn't add a new error path, but doing so for iteration and `len` would, especially given that `ExactSizeIterator::len()` isn't supposed to panic, as it's a size hint.
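
A small demonstration of that split (behavior as of current stable Rust; the exact panic message may vary):

    fn main() {
        let v = [10, 20, 30];
        let backwards = 5usize..0;

        // As an iterator the range is simply empty, so
        // ExactSizeIterator::len() is 0.
        assert_eq!(backwards.len(), 0);

        // The non-panicking slicing path treats it like any other
        // out-of-bounds index and returns None. (The clone is needed
        // because Range is not Copy.)
        assert_eq!(v.get(backwards.clone()), None);

        // The panicking path (Index) treats it as out of bounds too;
        // uncommenting this line panics with something like
        // "slice index starts at 5 but ends at 0".
        // let _ = &v[backwards];
    }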


But why isn't it panicking on len()? How is 0 the right answer there?


Because that’s the length of the iterator? The range is empty, its exact size is 0.


The range is invalid, not empty; someone had to do a validity check to return 0 to prevent it from returning -5 or trying to count up from 0 (depending on what it was willing to assume). A big point of the article is that a range of size 0 should always be iterable, but it somehow isn't, because it isn't actually of size 0.


No, the author misrepresents the facts.

If you index a slice with an out-of-bounds index, it will panic regardless of whether the index is a usize or a Range<usize>.

If you use `get` with an out-of-bounds index, you always get `None`.

Sure, it's open for discussion whether a range with start > end should be treated the same as an out-of-bounds index or as an empty slice. But doing the former makes it easier to catch errors.

Enforcing start <= end would mean that range construction is fallible, which would be a major usability nightmare: you would need two syntaxes, one for normal error handling and one for panicking, or you would need to add a lot of unwraps or similar.

Ranges are mainly used ad hoc (e.g. `slice[start..=mid+2]` or `for x in x..y {...}`) and are optimized for those usage patterns.

For other usages they might not be optimal. But you can always define your own types.


> obviously not ranges, only to return values indicating the range is malformed?

That's not the case. The only thing affected by Range being generic is that `contains` takes a reference instead of a copy (which, btw., can likely be eliminated by the optimizer). That is necessary to allow things like `Range<BigNum>`.
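
A contrived example of why the by-reference signature matters, using String as a stand-in for something like a `BigNum`:

    fn main() {
        // For Copy types the reference is essentially free and is easy
        // for the optimizer to see through.
        assert!((0..10).contains(&3));

        // For a non-Copy element type, taking a reference means contains()
        // neither consumes nor clones the value being tested.
        let range = String::from("a")..String::from("m");
        let probe = String::from("hello");
        assert!(range.contains(&probe));

        // Both the range and the probe are still usable afterwards.
        assert_eq!(probe.len(), 5);
        assert_eq!(range.start, "a");
    }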

All the other things have nothing to do with it being generic, but with which use cases it was designed for.

In the end, in Rust a Range is mainly an iterator.

If, and only if, it's a Range<usize>, you can also use it to slice arrays/vectors/slices.

Which means that e.g. the unstable experimental `get_unchecked` function is actually very well defined.

Lastly, the reason you can't enforce `start <= end` is that it would make the creation of a range fallible, which would be a horrible usability nightmare, a thing the author somehow misses completely.

The thing is, indexing a slice can already panic, so moving the panic there is generally a good idea. Similarly, you always want to have a non-panicking path, which would be e.g. `<[T]>::get()`: in the case of a "bad" range it does the same as for a "bad" index and returns `None`.

In the end, both `Range` and `RangeInclusive` are compromises focused on the most common use case of a range, which is ad-hoc creation "just around" the place you consume it for iteration or for slicing. This also means that e.g. the fact that `RangeInclusive` is bigger is no problem, as at the place it's used you would otherwise need to turn it into an iterator with its own state anyway, adding even more overhead than the current `RangeInclusive`. Sure, if you want to store a lot of `RangeInclusive`s, then that is not the use case it was designed for, and you are better off defining your own inclusive range type.


But shouldn't len() panic instead of returning 0? I don't even understand how it could return 0 without having already done all the work to determine it should have returned a negative number.


Reminds me of D at times. It has many fancy features and powerful metaprogramming. But it also comes with drawbacks: many simple language improvement proposals are shot down because they break in the presence of some advanced usage of those features.


But that kinda isn't the case here.

`Range` was discussed a lot before being stabilized, and its drawbacks were well known when it was stabilized.

The reason it was stabilized that way anyway is that it happens to work out best for the most common use cases.

It's more of a "practically useful but theoretically imperfect compromise" thing.

The main usages of a range are:

1. To iterate over it

2. To slice things using it

That is what its design is focused on.

Sure, ranges could be `Copy`, but one of their main purposes is to be an iterator, so it's reasonable not to make them `Copy`, as that would be a usability nightmare.

Sure, it's strange that you can construct an "invalid" range and then panic when you use it to slice something. But the alternative would be to make range creation fallible, which is a usability nightmare. Furthermore, validity depends on what you use it on, and a backwards range might be a very reasonable thing for some use cases, so practically it's best to make every `Range` valid, but not necessarily every usage of one.

Sure, exclusive ranges based on start+end can't contain the maximal value, but that's a fundamental property of exclusive ranges defined through start and end. There is a reason mathematics has four kinds of intervals (differing in whether each endpoint is included).
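
Concretely, using u8 so the maximum is small:

    fn main() {
        // An exclusive range can never contain the type's maximum value,
        // because the end bound itself is excluded and there is nothing
        // above u8::MAX to use as the bound.
        assert!(!(0u8..u8::MAX).contains(&u8::MAX));

        // The inclusive form exists precisely for this case.
        assert!((0u8..=u8::MAX).contains(&u8::MAX));
    }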

Sure, `.contains` takes a reference, but that's down to Rust not being able to specialize trait methods to take `Copy` types by value. Taking the argument by value instead would rule out reasonable usages like ranges of `BigNum`s.

Sure, `RangeInclusive` could be made smaller, but that would mean you can't have empty inclusive ranges and you can't use it directly as an iterator, which is probably the most common use case for inclusive ranges.

All in all, the `Range` types are a compromise optimized for their most common use cases. That makes some parts sub-optimal for other cases, but you can always use your own types, so it's not really a problem in practice.

Also, the author makes some mistakes:

- You can't enforce `start <= end` at construction time without making the constructor fallible, which would be an ergonomic nightmare. That means that neither `<[T]>::get()` nor `Range::len()` would get faster, nor would `is_empty` get simpler.

The last point sadly also means that for certain high-performance arithmetic tasks it can make sense not to use the Rust-provided range type but a custom one.
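
For what it's worth, a rough sketch of what such a custom type could look like (hypothetical names, not a std API): it pays for the `start <= end` invariant with a fallible constructor, which is exactly the ergonomic cost described above.

    // Hypothetical range type that enforces start <= end at construction.
    #[derive(Clone, Copy, Debug)]
    struct CheckedRange {
        start: usize,
        end: usize,
    }

    impl CheckedRange {
        // The price of the invariant: construction can fail.
        fn new(start: usize, end: usize) -> Option<Self> {
            if start <= end { Some(CheckedRange { start, end }) } else { None }
        }

        // With the invariant established, len() needs no branch.
        fn len(&self) -> usize {
            self.end - self.start
        }

        fn as_std(&self) -> std::ops::Range<usize> {
            self.start..self.end
        }
    }

    fn main() {
        let r = CheckedRange::new(2, 5).expect("start <= end");
        assert_eq!(r.len(), 3);

        let v = [0, 1, 2, 3, 4, 5, 6];
        // Slicing goes through the std Range, so the bounds are still
        // checked against the slice length here.
        assert_eq!(&v[r.as_std()], &[2, 3, 4]);

        // Every call site now has to handle the Option, which is the
        // usability cost the comments above describe.
        assert!(CheckedRange::new(5, 0).is_none());
    }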


I feel that Golang makes a lot of these 80% solutions.


Honestly it feels more like they make a lot of 40% solutions.


Well, as others have said: Hitting exactly 80% is pretty much impossible. And we've established that hitting more than 80% produces languages that are often bad, in some way or another.

So, logically it follows: aim for less than 80%. You can always add things; you can't take things away.


> And we've established that hitting more than 80% produces languages that are often bad

No we haven't; we've established (for very dubious values of "established") that aiming for more than 80% is frequently not worth the trouble.


Sometimes by aiming for less than 40% you add the _wrong_ thing, which you need to take away before you can add the right thing.


I've noticed this as well. It also attracts a lot of criticism. People are really insistent that a language should be perfect in exactly one area and terrible in others. They tend to support this kind of thinking with arguments about how you should pick the right tool for the application, as though most applications only care about one criterion or another (and need a language that trades everything for that one criterion).

That said, I think Rust does an impressive job at squeezing efficiency out of these tradeoffs. Sure, it trades off some developer productivity for extreme performance and safety, but its developer productivity story is still markedly better than other systems languages (and probably on par with some of the more cumbersome managed languages). Similarly, the tooling story is pretty great while every other systems language has pretty awful tooling (especially build systems). Moreover, Rust is getting better at a remarkable pace. I don't think it will ever close some of these gaps, but I think it will get close enough to pose a real threat.


This statement seems very true, even to the point of “duh” for people who have designed and maintained semi-widely used APIs or applications. I wonder where the language design wisdom to the contrary comes from.


Similar to the marketing/advertising axiom: 80% of all marketing spending is waste, but you will only know which part after it has been spent.


What many companies seem to fail to realize is that for a significant portion of advertising you'll never know if it was a waste or not.


I'm unfamiliar with marketing, but I suspect the situation is much better now with all the tracking. And I can imagine that for many APIs, people do know what the most frequently used features are, and what typical users are like.


This is crazy, but I kind of like the way the web does new features.

They start off with a name that obviously isn't meant to be the final one (e.g. "moz-blur") and work with that for a while until it becomes apparent what "blur" should be.

If the Rust developers had named "Range" "RustRange" or something else weird to start with, then they could come back later and rename it to the more desirable name. This seems like a good tactic whenever you're still trying to figure something out but intend to put it in production anyway.



This is true, but a lot of users will clamor for a 30% solution as well. A user may be writing a back end in your language and ask for first class SQL. After all, they write a lot of SQL and being able to have typechecked, safe SQL statements sounds great, right? Except, not everybody writes SQL. Indeed, SQL may be dead in 10 years (I'm not making this claim, but it is a possibility) and replaced by a different language.

A good language designer will see that users want a general way to query over data, and create something like LINQ.

Of course, you're right that language designers shouldn't go for a 100% solution. Monads are kind of the classic 100% solution. You can do anything with monads, but that means you can do anything with monads.


I'm not sure that this is true. Common Lisp shoots for "100% solutions" a lot and it never ends up with the kind of crap that this article describes. This could just be a result of the fact that a lot of clunky-looking code can be handed off to macros and a human will never have to see it or touch it.


Haskell’s Foldable and Traversable typeclasses represent these use cases, are more general, and have none of the clunky edges mentioned in the article.


But Range is a concrete type, not an interface, right? And most of the discussion here is about the implementation of Range, not about the interface it exposes to users.

Note also that the clone issue and the borrow issue are not applicable to Haskell, and that the performance characteristics of Range may be hard to replicate while implementing Foldable or Traversable.


Right, the trick is to not use concrete types when unnecessary. As the OP makes clear, it doesn't make sense to stuff all these use cases into a single concrete type.

> Note also that the clone issue and the borrow issue are not applicable to Haskell

No, but Rust switching to a typeclass-based iterator syntax should help with this too.

> the performance characteristics of Range may be hard to replicate while implementing Foldable or Traversable.

I don't see why - rustc generally does (and must do) a great job of specializing parametric code.



