
I too have suffered Rust's religiosity on the floating point issue.

In Python, given an array xs = [3.1, 1.2, 4.3, 2.2], I can write

    xs.sort()
and get [1.2, 2.2, 3.1, 4.3]

In Haskell

    sort xs
In Swift

    sort(&xs)
In Rust you have to spew this monstrosity

    xs.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Less))
The Rust position appears to be that sorting an array of floats is unreasonable and so you must be "punished" by not being allowed to use the built-in .sort() function.



Whenever I've seen people complaining about Rust religiosity, it's mostly because other languages trained them to be careless about certain things.

Rust cares a great deal about PartialEq and Eq because otherwise you can get things like non-deterministic sorting of float NaNs, or run-time errors, as demonstrated by previous posts.

Or Rust cares about doing manual memory management safely, but since you are arriving either from C/C++, which cares little about doing it safely, or from a GC language, which cares little about memory management, you find it obnoxious.

Hell, even the TFA mentions how hard it is to use Hash, I'm pretty sure that's because Rust is aiming for actually making the hash work correctly and not allow you to just randomly compromise yourself.


Perhaps in some cases, but how does that apply here?

People won't stop needing to sort arrays just because the syntax is made heavier and obscure.

It's fine to punish people for coding in the wrong way, but here I have to accomplish a task (sorting the array) and so the punishment serves no purpose at all except for making my experience worse.


> It's fine to punish people for coding in the wrong way, but here I have to accomplish a task (sorting the array) and so the punishment serves no purpose at all except for making my experience worse.

Then the Rust way is fine. It's merely 'punishing' you for not thinking things through at compile time (whether floats are sortable) instead of at run time (dependent on input). Maybe you prefer to have it fail dynamically, and that's fine, but that's a preference, and the goals of Rust do not align with it.

If you find yourself relying on it often, why not write a macro? So:

       xs.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Less))
becomes:

      sort!(xs)
Also, honestly, if this is a very common use case, bring it up on the Rust forums; maybe people will add a macro. They added a macro for generating N elements for a vector.
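
For what it's worth, a minimal sketch of such a macro could look like the following. The name `sort_floats!` is made up for illustration (it is not something the standard library provides), and the comparator is the one from the thread, so it is only well-behaved when the input contains no NaNs; with NaNs present the resulting order is unspecified and newer Rust releases may panic on an inconsistent comparator.

```rust
// Hypothetical convenience macro (not part of std): sorts a mutable
// Vec<f64> or slice of f64 in place using the comparator from the thread.
// Assumption: the input contains no NaNs.
macro_rules! sort_floats {
    ($xs:expr) => {
        $xs.sort_by(|a, b| {
            a.partial_cmp(b).unwrap_or(::std::cmp::Ordering::Less)
        })
    };
}
```

With this in scope, `sort_floats!(xs)` on a `let mut xs: Vec<f64>` expands to the longer `sort_by` call.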


Why a macro? Why not just define that comparator as a function?

    fn value_nans_last<T: Float>(a: &T, b: &T) -> Ordering {
      match (a, b) {
        (x, y) if x.is_nan() && y.is_nan() => Ordering::Equal,
        (x, _) if x.is_nan() => Ordering::Greater,
        (_, y) if y.is_nan() => Ordering::Less,
        (_, _) => a.partial_cmp(b).unwrap()
      }
    }

    xs.sort_by(value_nans_last);


I haven't done any sorting in Rust, so I don't know the sort interface, but maybe a macro so that it works across different sortable collections?


There's no need to use a macro: you can abstract over different kinds of sortable collections using a trait. (That said, there's no real reason to generalize IMO: by far the most common case is sorting slices.)
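
A rough sketch of the trait approach, assuming an extension trait with the invented name `SortNansLast` and a concrete `f64` version of the `value_nans_last` comparator posted above. Neither name exists in the standard library.

```rust
use std::cmp::Ordering;

// Concrete f64 version of the comparator from the thread: NaNs compare
// greater than every other value and equal to each other.
pub fn value_nans_last(a: &f64, b: &f64) -> Ordering {
    match (a.is_nan(), b.is_nan()) {
        (true, true) => Ordering::Equal,
        (true, false) => Ordering::Greater,
        (false, true) => Ordering::Less,
        // Neither side is NaN here, so partial_cmp cannot return None.
        (false, false) => a.partial_cmp(b).unwrap(),
    }
}

// Hypothetical extension trait (not part of std).
pub trait SortNansLast {
    fn sort_nans_last(&mut self);
}

// Implementing it for slices also covers Vec<f64> and arrays, because
// both can be used as slices at the method-call site.
impl SortNansLast for [f64] {
    fn sort_nans_last(&mut self) {
        self.sort_by(value_nans_last);
    }
}
```

Since the impl is on `[f64]`, one impl is enough for the common cases, which fits the point that generalizing beyond slices is rarely needed.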


> If you find relying on it often, why not write a macro?

Why not a function? I think people turn to macros in Rust sometimes because of the necessity to think about the semantics of function calls. Should I use references or values? What are the trade-offs?

The amount of thinking you have to do to perform a good "extract method" refactoring is one of my least favorite things about the language, but falls out of some of my favorite things about it. All in all, I think it's worth the trade.

edit: Expanded a bit.


I'm curious why you characterize it as a "punishment"? Do you actually think that is why anyone designed the system this way, as some kind of aversion therapy just to get you to avoid using floating point numbers?

All Rust is doing here is implementing IEEE 754 floating point semantics as specified, and implemented in lower level hardware. Part of the IEEE 754 specification that you need to deal with if you want to use floating point numbers is NaN, which represents not an actual value but the absence of a value, an indication that your computation did something that could not be represented. Because NaN is one of the possible values of a floating point number, a typesafe interface for comparing floating point numbers must, by definition, be a partial function; and as a partial function, it cannot be relied upon for implementing sort.
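
To make the "partial function" point concrete, here is a small sketch. The helper name `describe` is invented for illustration; it just bundles the results of the built-in comparison operators and `partial_cmp` for two floats.

```rust
use std::cmp::Ordering;

// Hypothetical helper: returns (a == b, a < b, a > b, a.partial_cmp(&b)).
// When either argument is NaN, every comparison is false and
// partial_cmp has no ordering to report.
pub fn describe(a: f64, b: f64) -> (bool, bool, bool, Option<Ordering>) {
    (a == b, a < b, a > b, a.partial_cmp(&b))
}
```

`describe(1.0, 2.0)` reports `Some(Ordering::Less)`, while `describe(f64::NAN, 1.0)` and `describe(f64::NAN, f64::NAN)` both come back with three `false` values and `None`. That `None` is what a sort routine would have to make a decision about.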

The Rust way of doing this does not punish you; it just ensures that you actually think about, and deal with, edge cases like this. That can seem cumbersome in a small, one-liner example like the above, or restrictive when you're writing a simple personal project where you know you will never encounter a NaN and just want to sort the values without thinking about it. But it can be quite valuable when programming in the large: when you are working on a program too large and complex for any one person to know and reason about the whole thing, type safety allows you to encode certain constraints in the type system that ensure that you don't make mistakes.

For instance, if you write code that depends on sorting a list of floating point values in Python, like in your example, and write all of your unit tests and design using non-NaN floating point values, then use a third party library that winds up producing a NaN value, you are likely to be quite surprised by the outcome:

  >>> nan = float('nan')
  >>> xs = [1, nan, 0, 3, 5, 2]
  >>> sorted(xs)
  [1, nan, 0, 2, 3, 5]
Now, deep in the middle of production code, that result might not be so apparent. Most of the values that you care about are ordered correctly; but eventually you'll hit the fact that the 1 is sorted before the 0, so you'll have some strange, hard to reproduce bug, that depends on the precise ordering of the original array.

What Rust is doing is not punishing you, but instead just making you make that decision about what to do about such a case up-front, before you accrue that technical debt that comes to bite you later on.

There are several possible ways you could deal with it; one is the one you mentioned, where you just define some way of comparing NaN so that you now have a total order. Another would be to panic any time you try to do an undefined operation like comparison on a NaN. Or you could work with a type that is restricted to non-NaN values, and deal with the issue only at the boundary of components which convert between arbitrary floating point values and your restricted subset (and any operations that may produce NaN values).

In order to not have to write those cumbersome sort expressions by hand every time, if you need to work with floating point numbers with one of the non-standard semantics described above, you could define any of the above behaviors by creating a newtype that wraps floats but provides the semantics you want. Since they are static types, there will be no overhead on the values; they will just be represented as floats. You may have some overhead on your checked operations that wrap the underlying float operations, but that's the price you pay for going with semantics which are not the standardized IEEE 754 semantics as implemented by the hardware. In a larger project, if you need such a type, it's not all that much work to just define that type once with the semantics that you want, and then just use that everywhere rather than using one of the native floating point types.
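
As a sketch of the first option (define where NaN goes so the order becomes total), a newtype could look roughly like this. The name `NansLast` and the choice to put NaNs at the end are assumptions made for the example, not anything the standard library defines.

```rust
use std::cmp::Ordering;

// Hypothetical newtype: an f64 whose ordering is total because every
// NaN is treated as equal to other NaNs and greater than all numbers.
#[derive(Debug, Clone, Copy)]
pub struct NansLast(pub f64);

impl Ord for NansLast {
    fn cmp(&self, other: &Self) -> Ordering {
        match (self.0.is_nan(), other.0.is_nan()) {
            (true, true) => Ordering::Equal,
            (true, false) => Ordering::Greater,
            (false, true) => Ordering::Less,
            // Neither value is NaN, so partial_cmp cannot return None.
            (false, false) => self.0.partial_cmp(&other.0).unwrap(),
        }
    }
}

impl PartialOrd for NansLast {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Equality is derived from the ordering so that Eq and Ord agree.
impl PartialEq for NansLast {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for NansLast {}
```

A `Vec<NansLast>` can then be sorted with plain `.sort()`, and the wrapper has the same in-memory size as the `f64` it holds.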

So, what Rust is providing is a type-safe, low-overhead implementation of IEEE 754 floats, without providing certain conveniences that would make your life easier when dealing with a subset of them but cause problems when working on the full range of values.

Can it be convenient to accrue technical debt in order to get things done quickly? Sure. Shell scripts are a classic example; almost every non-trivial shell script will have some kind of quoting bug, delimiter bug, confusion between arguments and flags if an argument value ever contains a "-", or the like. But because they are familiar and allow people to get things done quickly, they can be really useful for little one-off hacks, especially when you're working with data that you know is simple enough not to hit one of those edge cases, like filenames where you know that none of them contain spaces.

However, you need to be really careful about that sort of thing. That kind of quick and loose reasoning can quickly come to bite you if it gets deployed in production in an uncontrolled or even hostile environment. All of a sudden, the things you thought could never happen will happen. I've seen a seemingly innocuous shell script for cleaning up a few particular types of files turn into an "rm -rf *" due to a bug in handling of spaces in filenames (and yes, an actual customer lost actual data due to this bug).

So, is Rust appropriate for that kind of fast-and-loose exploratory programming that the shell or dynamic languages like Python allow you to do? No. If I were working with known inputs, interactively, where I could easily tell that I didn't have NaNs and could check the outputs to make sure they were sane, I would choose Python and numpy, or Julia, or something of the sort that was more appropriate for rapid and loose prototyping.

But for software that will be deployed in the wild, where I need to write modules that will work with values provided by other modules that I don't control, or the like, making you think about this kind of thing up-front can help you avoid having weird, obscure, hard to debug problems, or even security vulnerabilities, down the line.


Bob needs to sort an array of floats.

Bob tries xs.sort()

Bob gets an error message.

Bob googles "how to sort an array of floats in Rust".

Bob pastes in

    xs.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Less))
Bob continues on his merry way.

No safety has been added, no technical debt has been avoided. It's not any less "quick and loose".

The need to sort arrays of floats doesn't disappear simply because the Rust designers will it to. The code will still exist, but it will be longer and less maintainable. This is what I mean by punishment.

If anything, Rust has given you a false sense of security. The modules and other code you work with will still be handling NaNs incorrectly.

My preference, all considered would be a .sort() that pushes the NaNs to the front or back (but is slightly slower), and a .sort_unsafe() that assumes no NaNs but is faster.
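
For later readers: this thread predates it, but current Rust has `f64::total_cmp` (stable since 1.62), which gets fairly close to the first half of that wish. A minimal sketch, with `sort_total` being a made-up helper name:

```rust
// Hypothetical helper: sorts floats using the IEEE 754 total order
// exposed by f64::total_cmp. NaNs with the sign bit set end up before
// all other values, NaNs with the sign bit clear end up after them.
pub fn sort_total(xs: &mut [f64]) {
    xs.sort_by(|a, b| a.total_cmp(b));
}
```

Note that this is still a `sort_by` call rather than a bare `.sort()`, and that the total order also places -0.0 before +0.0, which plain `<` does not distinguish.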


I'm still reeling from how cynical and dismissive this comment is. The Rust developers don't like IEEE 754 any more than you do, but it doesn't change the fact that IEEE 754 is what hardware implements. I encourage you to take your complaints up with hardware vendors and the IEEE, as well as with the popular programming culture of blindly pasting SO answers into your programs in frantic attempts to get them to compile at any cost. As for me, I very much appreciate that Rust is conscientious enough to throw up a red flag and force me to realize that this seemingly-simple task is actually quite complex, rather than implicitly imposing a leaky abstraction.


Well, hopefully the first result will be a Stack Overflow answer that addresses this topic :P The Rust community has been hard at work answering a lot of Rust questions. I don't think I've seen many unanswered so far.

Well, when Bob found sort_by and a weird partial_cmp, he should have paused there and looked at the docs if he didn't understand them. In the case of Python, he is right to blame Python for silently doing stuff for him that he doesn't like.

It's like a speed bump. If you go fast over a speed-bump, ignoring it, you suffer the consequences. It's different than a dark unlit road that just has a sign THIS WAY.


> Or you could work with a type that is restricted to non-NaN values, and deal with the issue only at the boundary of components which convert between arbitrary floating point values and your restricted subset (and any operations that may produce NaN values).

How often do NaNs appear in practice? I think it makes perfect sense to handle IEEE 754 floats like this, but maybe Rust should follow your suggestion here and provide a new totally-ordered float. Maybe `f64` should _be_ this totally-ordered float and IEEE 754 could be imported if you need to use that one?


That would likely make operations on the default float slower than they need to be. After all, the CPU is probably implementing IEEE 754: How do you handle it when a NaN bubbles up from below?


When you write programs for robots, you can't afford such "maybe" things. I think Rust will be perfect for robots.


And what should be the result of a divide by 0 in your magical totally ordered float type?
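
To illustrate why this question matters for a NaN-free type: under IEEE 754, dividing a nonzero number by zero gives an infinity, but 0.0 / 0.0 gives NaN, so arithmetic on non-NaN inputs can still produce one. A wrapper type would need checked operations along these lines (the name `checked_div` and the choice to return `Option` are assumptions for the sketch):

```rust
// Hypothetical helper: divides two floats and reports None when the
// result is NaN (for example 0.0 / 0.0) instead of passing the NaN on.
pub fn checked_div(a: f64, b: f64) -> Option<f64> {
    let q = a / b;
    if q.is_nan() {
        None
    } else {
        Some(q)
    }
}
```

The extra `is_nan` check after every operation is the overhead mentioned a few comments up.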


You are cool. I'd like to read your twitter or blog (link please). I don't see here any personal messages to ask it another way. I just respect professionals with knowledge.


I don't know much Rust. But your comment just made me appreciate Rust a lot more!

To me, this signals that they value long-term safety over short-term convenience. Any other choice, IMO and IME, is short-sighted.


Let's see, after you try a big project in Rust, if you come back with the same appreciation.

Most of the time I think it is overburdening. I know people like Haskellers like to suffer and are masochistic coders, and may like this, but I don't. I already have to deal with C++, and Rust did achieve the impossible: it's even more over-engineered than C++, and it's not even faster.


> It's even more over-engineered than C++,

Can you explain more about what you mean by 'over-engineered' here?

> and it's not even faster

We are faster sometimes, and we haven't even put time into optimizing things. We're also slower sometimes.


I think you should replace 'time' with 'a lot of time'. I've seen several commits that optimize things.


IME, us "masochists" suffer much less in the end. We "suffer" a few seconds here and there when we get a compiler warning, forcing us to think edge cases through.

Then, later, we suffer much less through runtime debugging and QA, and get a lot fewer calls at 2AM.


You mean a large project like Servo, or large project like Rust compiler itself?


What is the expected result if you have NaN in your list?


In Python at least, it's a bit funky:

    >>> sorted(map(float, ['1', '2', 'nan', '4', '3']))
    [1.0, 2.0, nan, 3.0, 4.0]
    >>> sorted(map(float, ['1', '5', '2', 'nan', '4', '3']))
    [1.0, 2.0, 3.0, 4.0, 5.0, nan]


Ruby has it about right (imho):

  > [1.0,2.0,Float::NAN].sort
  ArgumentError: comparison of Float with Float failed


That's definitely the right behavior for Ruby—it catches a contract violation and fails at runtime, which is idiomatic. What's nice about Rust's approach is that it catches the possibility of that contract violation at compile time, and forces you to decide what to do about it before the code ever runs. The right thing to do in Ruby would be to catch that exception and handle it in some fashion, but there is no indication at the time you're writing the code that the possibility exists, so you're unlikely to handle it unless you have a strong awareness of the issue with floats and NaN. Rust encodes that awareness into the language itself, which actually limits the expertise you need to write good code.
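
A small sketch of where that compile-time check comes from. The function name `sorted` is invented for the example; the relevant part is the `Ord` bound, which is the same kind of bound the standard `sort` method places on its element type.

```rust
// Hypothetical helper: sorts any vector whose element type has a
// total order, expressed through the `Ord` bound.
pub fn sorted<T: Ord>(mut items: Vec<T>) -> Vec<T> {
    items.sort();
    items
}

// sorted(vec![3, 1, 2]) compiles, because integers implement Ord.
// sorted(vec![3.1, 1.2]) is rejected at compile time, because f64
// implements only PartialOrd and not Ord.
```

Nothing has to run for the float case to be caught; the caller has to pick a comparator or a wrapper type before the program builds.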


Rust's way is more flexible though: You can choose how to treat NaNs, in Ruby you have to fail.


> in Ruby you have to fail

Oh, really? ;)

  > [1.0,2.0,Float::NAN].sort
  ArgumentError: comparison of Float with Float failed

  > # let's make NaN sortable
  > Float::NAN.class.send(:define_method, '<=>') { |x| -1 }

  > [1.0,2.0,Float::NAN].sort
  => [1.0, 2.0, NaN]


That's just horrible. You just changed behavior globally.


I didn't say it's pretty.

Merely chose that example to point out the absurdity of challenging Ruby on the grounds of flexibility (of all things).

Obviously, in a real program you'd rather write a custom sort-comparator, use a wrapper-class, or monkey-patch only the specific NaN instances that you want to change the behaviour of.


Or use refinements.


No, it's beautiful. Being a Python programmer it took me a long time to appreciate the power to do things like this. Learning Elisp, Smalltalk, Io and JavaScript (well) certainly helped.

Also, of course you can scope such a change however you want, for example to a single block (and with threadlocals it's going to be mostly safe).


He was just demonstrating that it is possible. You can in fact alias the method, do the sorting, and then restore the previous functionality of throwing errors. Plus, someone already mentioned that you can also use refinements.


Right, and the Rust equivalent would be to define a newtype that wraps your floating point type and defines the ordering semantics you want; in Ruby, you have changed the behavior for everything that uses Floats (including other libraries that may depend on this behavior), while in Rust you can use your newtype, other libraries can use standard floating point behavior, and you won't have any confusion about who wants which semantics.


> and the Rust equivalent would be to define a newtype

Which you can do in Ruby as well, as I pointed out just two comments below the one you're replying to.

The real advantage of Rust here was imho best explained by sanderjd; Rust can perform this check at compile time whereas in Ruby it's a runtime exception.


I don't have NaNs in my list. That's an invariant I would be happy to express in the type system.


Then you need another type. A float has NaNs and Rust knows it and prevents you from possibly shooting yourself in the foot.


A 'newtype wrapper' has your back in that situation, which lets you do exactly that.


Yep, and that type can have a total order and work with the `sort` method with no further ceremony. That actually might be a nice type to have in the standard library. It seems like it would be widely useful, but I'm not sure where it would fit in on cargo.


I had a crack at this. This is about the third Rust program I've ever written, so it's probably chock full of noob mistakes:

    #![feature(std_misc)]
    mod natural {
      use std::num::Float;
      use std::iter::IntoIterator;
      use std::iter::FromIterator;
      use std::cmp::Ord;
      use std::cmp::Ordering;

      #[derive(PartialEq, PartialOrd, Debug)]
      pub struct Natural(f64);

      impl Natural {
        pub fn new(value: f64) -> Option<Natural> {
          match value {
            x if Float::is_nan(x) => None,
            _ => Some(Natural(value))
          }
        }
        pub fn new_all<'a, A: IntoIterator<Item=&'a f64>, B: FromIterator<Natural>>(values: A) -> B {
          let b: B = values.into_iter().map(|f| Natural::new(*f).unwrap()).collect();
          b
        }
      }

      impl Eq for Natural {
      }

      impl Ord for Natural {
        fn cmp(&self, other: &Self) -> Ordering {
          self.partial_cmp(other).unwrap()
        }
      }
    }

    use natural::Natural;

    fn main() {
      let fs = [3.0, 1.0, 1.0];
      let mut xs: Vec<Natural> = Natural::new_all(&fs);
      println!("before = {:?}", xs);
      xs.sort();
      println!("after = {:?}", xs);
    }
In particular, the assignment of the return value of new_all to a local is ugly, but I couldn't figure out how to please the type checker without it.
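
For anyone reading this later: the code above targets a pre-1.0 nightly (`#![feature(std_misc)]`, `std::num::Float`) and will not build on current stable Rust. Below is a sketch of the same idea for stable Rust. The type name `NotNan`, the `get` accessor, and the decision to have `new_all` return `None` when any input is NaN (rather than unwrapping) are my own assumptions, not the original poster's design.

```rust
use std::cmp::Ordering;
use std::iter::FromIterator;

// Hypothetical wrapper: an f64 that is checked to be non-NaN on creation.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct NotNan(f64);

impl NotNan {
    // Returns None for NaN, Some(wrapper) for every other value.
    pub fn new(value: f64) -> Option<NotNan> {
        if value.is_nan() {
            None
        } else {
            Some(NotNan(value))
        }
    }

    // Wraps every value, yielding None if any of them is NaN.
    // Collecting into Option<B> removes the need for a local binding.
    pub fn new_all<'a, A, B>(values: A) -> Option<B>
    where
        A: IntoIterator<Item = &'a f64>,
        B: FromIterator<NotNan>,
    {
        values.into_iter().map(|f| NotNan::new(*f)).collect()
    }

    pub fn get(self) -> f64 {
        self.0
    }
}

impl Eq for NotNan {}

impl Ord for NotNan {
    fn cmp(&self, other: &Self) -> Ordering {
        // Still a runtime check: the constructor rules out NaN, but the
        // type system itself does not know that.
        self.partial_cmp(other).expect("NotNan never holds a NaN")
    }
}
```

Usage is the same shape as before: build an `Option<Vec<NotNan>>` with `NotNan::new_all(&fs)`, handle the `None` case, then call `.sort()` on the vector.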


Neat! My version with some mostly superficial changes[0].

Note that we haven't actually removed the panic in the `Ord` implementation! That's because we've eliminated what we believe is the source of the ordering uncertainty (the NaN), but the type system still doesn't know that.


Whoops, didn't post the link: http://goo.gl/7AZxa6


In Python you basically get undefined behavior when encountering NaNs while sorting; Rust lets you choose. This seems much more reasonable to me.


To be fair, you also have the choice of implementing your own comparison function in Python (and likely Haskell, Ruby, etc) if the position of NAN actually matters to your algorithm.


You're unlikely to know there's a problem and do so.

Rust is reminding you of this fact, which is very nice.



