I don't have an account there so I'll comment here:
> In particular, allocating a new object and returning a reference to it from a function is common in C++ but difficult in Rust, because the function doing the allocation doesn't know the expected lifetime of what it returns.
This is what boxes are for. A Box is a unique pointer to a value on the heap and can be used without knowing compile-time lifetimes. References and lifetimes allow you to safely return pointers to stack allocated objects. In C++, you'd have to do this:
MyType value;
my_function(&value);
When returning references, Rust uses lifetimes instead of explicit declarations to figure out where (on the stack) `value` needs to be allocated.
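For the Box case, a minimal sketch in current Rust might look like the following. `MyType` and `make_boxed` are hypothetical names picked for illustration; the point is only that the function hands ownership of a heap allocation to the caller without any lifetime annotation.

```rust
// `MyType` and `make_boxed` are hypothetical names used only for illustration.
#[derive(Debug, PartialEq)]
struct MyType {
    id: u32,
}

// The value lives on the heap. Ownership of the Box moves to the caller,
// so the signature needs no lifetime parameters.
fn make_boxed(id: u32) -> Box<MyType> {
    Box::new(MyType { id })
}

fn main() {
    let value = make_boxed(7);
    println!("{:?}", value);
    // `value` goes out of scope here and the heap allocation is freed.
}
```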
> Declarations are comparable in wordiness to C++.
Only at interfaces where the declaration also serves as documentation. Elsewhere, types can generally be inferred.
> Rust has very powerful compile-time programming; there's a regular expression compiler that runs at compile time. I'm concerned that Rust is starting out at the cruft level it took C++ 20 years to achieve. I shudder to think of what things will be like once the Boost crowd discovers Rust.
Unlike C++,
1. Macros from one crate aren't imported into another unless the user explicitly requests that they be.
2. Macro invocations are clearly macro invocations. You never have to wonder if something is a function or a macro.
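To illustrate the second point, here is a small sketch with a hypothetical `square!` macro next to an ordinary `square` function. The names are made up for this example; what matters is that the `!` makes the macro call visible at the call site.

```rust
// `square!` is a hypothetical macro defined only for this sketch.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

// An ordinary function with the same name; macros and functions
// live in separate namespaces, so the two do not clash.
fn square(x: i32) -> i32 {
    x * x
}

fn main() {
    let a = square(4); // function call
    let b = square!(4); // macro invocation, marked by the `!`
    println!("{} {}", a, b);
}
```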
> The lack of exception handling in Rust forces program design into a form where many functions return "Result" or "Some", which are generic enumeration/variant record types. These must be instantiated with the actual return type. As a result, a rather high percentage of functions in Rust seem to involve generics.
How is this a problem?
> There are some rather tortured functional programming forms used to handle errors, such as ".and_then(lambda)". Doing N things in succession, each of which can generate an error, is either verbose (match statement) or obscure ("and_then()"). You get to pick. Or you can just use ".unwrap()", which extracts the value from a Some form and makes a failure fatal.
I agree that this is less than ideal. However, IMHO, this is better than Java and C++.
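For reference, a rough sketch of the two styles being compared, with made-up helper functions (`parse`, `half`) that can each fail. Both versions compute the same thing; which one reads better is a matter of taste.

```rust
// `parse` and `half` are hypothetical fallible steps used for illustration.
fn parse(s: &str) -> Option<i32> {
    s.trim().parse().ok()
}

fn half(n: i32) -> Option<i32> {
    if n % 2 == 0 { Some(n / 2) } else { None }
}

// The verbose form: one nested `match` per fallible step.
fn with_match(s: &str) -> Option<i32> {
    match parse(s) {
        Some(n) => match half(n) {
            Some(h) => Some(h + 1),
            None => None,
        },
        None => None,
    }
}

// The combinator form: each step runs only if the previous one succeeded.
fn with_and_then(s: &str) -> Option<i32> {
    parse(s).and_then(half).map(|h| h + 1)
}

fn main() {
    println!("{:?} {:?}", with_match("8"), with_and_then("8"));
}
```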
Java:
Libraries tend to bubble everything. This leads to long throws clauses in function signatures with unexpected exceptions. A user of these libraries often catches and ignores these exceptions when writing the first draft of his or her program because they don't make sense (why handle IO errors when using a collection?). And then, because the program works, he or she forgets about the ignored exception cases, turning them into silent errors.
On the other hand, in Rust, you can only return one error type. When writing a function that has multiple failure modes, this forces the programmer to think about the set of failures that can happen and come up with a new error type. This doesn't force the programmer to come up with a meaningful error type but it gives them the opportunity.
Additionally, like in Java, Rust programmers can ignore errors (`unwrap()`). However, unlike in Java, these ignored errors are not silent; they are fatal.
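As a sketch of what "coming up with a new error type" can look like, assuming current Rust (the `?` operator postdates this discussion) and a hypothetical `parse_port` function with three failure modes:

```rust
use std::num::ParseIntError;

// `ConfigError` and `parse_port` are hypothetical names for illustration.
// One enum lists every way the function can fail.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing,
    BadNumber(ParseIntError),
    OutOfRange(i64),
}

fn parse_port(input: Option<&str>) -> Result<u16, ConfigError> {
    let text = input.ok_or(ConfigError::Missing)?;
    let n = text.parse::<i64>().map_err(ConfigError::BadNumber)?;
    if n < 1 || n > 65535 {
        return Err(ConfigError::OutOfRange(n));
    }
    Ok(n as u16)
}

fn main() {
    println!("{:?}", parse_port(Some("8080")));
    println!("{:?}", parse_port(None));
    // Calling `parse_port(None).unwrap()` instead would panic rather than
    // continue silently.
}
```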
C++:
Exceptions are unchecked and everyone I've talked to avoids them like the plague. In the end, C++ exceptions act like Rust's `panic!()` because programmers don't check them, yet they are used like Java's exceptions because programmers could check them.
> There's a macro called "try!(e)", which, if e returns a None value, returns from the enclosing function via a return you can't see in the source code. Such hidden returns are troubling.
I agree that hidden returns can be troubling. However, in Rust, only macros can lead to hidden returns, macros use a special syntax (`macro_name!(args...)`), and macros have to be explicitly imported.
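To make the hidden return concrete, here is a sketch of how such a macro can be written. `my_try!` is a hypothetical stand-in defined for this example, not the standard library's definition; it works on `Result` and returns early on `Err`.

```rust
// `my_try!` is a hypothetical macro written for this sketch.
macro_rules! my_try {
    ($e:expr) => {
        match $e {
            Ok(v) => v,
            // This `return` exits the function that invoked the macro.
            Err(err) => return Err(err),
        }
    };
}

fn double(s: &str) -> Result<i32, std::num::ParseIntError> {
    let n = my_try!(s.parse::<i32>()); // may return from `double` right here
    Ok(n * 2)
}

fn main() {
    println!("{:?}", double("21"));
    println!("{:?}", double("not a number"));
}
```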
> All lambdas are closures (this may change), and closures are not plain functions. They can only be passed to functions which accept suitable generic parameters. This is because the closure lifetime has to be decided at compile time.
The first sentence is correct but the last two are just wrong:
The `Box` allocates the closure on the heap and the `move` causes the closure to capture by value. This means that this closure (`f`) can be moved freely without lifetime restrictions because it doesn't reference the stack. However, most functions that accept closures use generics and do any necessary boxing internally to make the user's life easier.
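A minimal sketch of a boxed `move` closure, written in current Rust syntax (`Box<dyn Fn>`; the syntax at the time of this thread differed). `make_adder` and `apply` are hypothetical names. Note that `apply` takes the closure through a plain, non-generic parameter.

```rust
// `make_adder` and `apply` are hypothetical names used for illustration.
// `move` copies `n` into the closure, so it holds no reference to the stack.
fn make_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}

// No generic parameter here: any boxed closure with this signature fits.
fn apply(f: Box<dyn Fn(i32) -> i32>, x: i32) -> i32 {
    f(x)
}

fn main() {
    let f = make_adder(3);
    println!("{}", apply(f, 4));
}
```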
> Rust has to do a lot of things in somewhat painful ways because the underlying memory model is quite simple. This is one of those things which will confuse programmers coming from garbage-collected languages. Rust will catch their errors, and the compiler diagnostics are quite good. Rust may exceed the pain threshold of some programmers, though.
Rust is a systems language. It exposes a lower-level (not simple) memory model because systems programmers need it. If you want garbage collection, you are free to roll your own (yes, you can actually do this in rust).
> Despite the claims in the Rust pre-alpha announcement of language definition stability, the language changes enough every week or so to break existing programs.
Re-read those claims. Alpha means fewer breaking changes and no "major" breaking changes, not stability.
> References and lifetimes allow you to safely return pointers to stack allocated objects. In C++, you'd have to do this:
> MyType value;
> my_function(&value);
> When returning references, rust uses the lifetimes instead of explicit declarations to figure out where (on the stack) `value` needs to be allocated.
OMG, thank you for including this. I spent several months reading every bit of documentation that was available for Rust, and programming in it daily. Made some good progress. But I never, never came across this explanation. Very enlightening.
Rust desperately needs documentation covering these kinds of details. How on earth is someone supposed to make serious use of the language without knowing this?
I believe the Klabnik documentation hinted at this (something like, "The Rust compiler is smarter than that" and therefore you don't need to overuse pointers), but by no means did it actually spell it out. And you only needed a few sentences to cover it.
I know the Rust community is aware that more documentation is needed and has a todo list a mile long. But I don't know if technical details such as this are high enough on the priority list.
> But I don't know if technical details such as this are high enough on the priority list.
There are actual features which still have no documentation. It's hard being a single person trying to keep up with changes from tons of other people, many full time and some community. I may be the person who is most looking forward to Rust being stable...
And I should say, for the record, that you deserve a ton of credit for doing a nearly superhuman job. The doc may be lacking, but it would be far worse off without you.
Still, I wish the team/community could find a way to shore up the coverage of this kind of information. I think we've given beginners enough for now, that the focus should shift to "details you need to know about what the compiler does."
Thank you, I appreciate it. These kinds of comments are the ones I try to re-read when I'm feeling down about stuff.
Yes, I agree, I'd love more help :) I think you'll see a shift after 1.0.0-beta happens, since then, the focus will be on polish, rather than shipping every breaking change.
> Unlike C++, 1. Macros from one crate aren't imported into another unless
> the user explicitly requests that they be. 2. Macro invocations are
> clearly macro invocations. You never have to wonder if something is a
> function or a macro.
Given the reference to Boost, the author is almost certainly talking about template metaprogramming, not C macros. TMP is obviously a lot more limited in scope than Rust macros, but it could hardly be called dangerous; I doubt anyone's ever invoked it by accident.
Boost has all kinds of preprocessor macro stuff, like a loop construct that works through recursive includes, and the foreach macro (hopefully obsoleted with C++11).
True, but since the context of the comment was about how crufty he worries Rust would get once the "Boost crowd" discovers Rust macros, I doubt that it's really the specific topic of "relative merits of language constructs given the name 'macro'" that's under discussion (the word 'macro' doesn't even appear until later in the post). Certainly C macros aren't the main source of cruft in Boost.
> References and lifetimes allow you to safely return pointers to stack allocated objects.
This is explicitly called out as non-idiomatic behavior in the documentation, however. The preferred action is to allocate on the caller's heap and pass a mutable reference down to the callee.
In fact, in general it's recommended not to use Box, because it complicates human reasoning about the code. And while it gets around a lot of the compiler's restrictions, it's akin to writing <language of choice> in Rust, which is frowned upon in any language. Recommending its use so broadly is doing a disservice to people who want to learn Rust.
> Only at interfaces where the declaration also serves as documentation. Elsewhere, types can generally be inferred.
Except where they can't, and those locations aren't terribly consistent. The Rust designers have publicly announced their preference for explicitness over inference, and the language reflects that.
> On the other hand, in rust, you can only return one error.
This is not unique to rust, or any language really. You can only throw one exception at a time. You can only set one errno at a time. You can only return one `error` at a time.
> macros have to be explicitly imported
Except for the built in ones, which are the only ones referenced by the OP. Also, by placing the macro delimiter `!` between the name and the parenthesis, it makes the macro harder to scan for visually. I imagine that any editor will want to set up special rules to highlight these distinctly, and having special highlighting for the ones known to change the program flow would be beneficial.
> It exposes a lower-level (not simple) memory model because systems programmers need it.
Low level memory is simple: write to, read from, write to referenced, read from referenced. The OS adds one more major operation: get heap memory. Everything else is added by languages or libraries.
That said, Rust's restrictions on memory lifetimes result in more simplistic memory related code. When you have to jump through extra hoops to create a pointer which may be used beyond a single scope, and the compiler creates so much friction when you want to do anything with them in that greater scope, people will fall back to simplistic memory code.
I'm not certain if this is good or bad; it just is at this point.
> Alpha means fewer breaking changes and no "major" breaking changes not stability.
Any breaking changes affect stability, affect documentation (Rust's library documentation is behind the actual code as of a week ago), and affect 3rd party libraries. The result of this is that if you're not Mozilla, there are significant barriers to writing Rust code right now, and I would not personally recommend learning or writing Rust right now to anybody.
> Except where they can't, and those locations aren't terribly consistent.
Why aren't they consistent? The Rust type inference is generally very good, and the places where you have to annotate are places where any typechecker would force you to annotate, because the types are simply underconstrained (e.g. the return type of Iterator::collect or mem::transmute).
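A short sketch of the underconstrained case, using `collect` in current Rust. The function names are hypothetical. The same iterator chain can build many different collections, so something has to say which one: either an annotation on the binding or the enclosing function's return type.

```rust
use std::collections::HashSet;

// `unique_count` and `doubled` are hypothetical names for illustration.
fn unique_count(items: &[i32]) -> usize {
    // The annotation on `set` tells `collect` to build a HashSet.
    let set: HashSet<i32> = items.iter().copied().collect();
    set.len()
}

fn doubled(items: &[i32]) -> Vec<i32> {
    // No local annotation needed: the return type constrains `collect`.
    items.iter().map(|x| x * 2).collect()
}

fn main() {
    println!("{} {:?}", unique_count(&[1, 2, 2, 3]), doubled(&[1, 2]));
}
```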
> The Rust designers have publicly announced their preference for explicitness over inference, and the language reflects that.
As the original author of the typechecker, I can state that the idea that we intentionally made the type inference less powerful than it could have been is totally false. It's always been as powerful as we could make it, except for interface boundaries (where type annotation is needed because of separate compilation anyway).
So, I went back to do a bit of research, and it's gotten better since this first bothered me, my apologies. My beef was with the `let x = vet::Vector::New::<i32>()` vs `let x: Vec<i32> = vec::Vec::New()`. Perhaps not the best way to word it, so consider this objection retracted. :)
> the idea that we intentionally made the type inference less powerful than it could have been is totally false
Except for function definitions, where the types could be inferred from the function bodies, but are not:
Plus (and this is more related to the complete lack of implicit type conversions), there are types everywhere in the program. I frequently can't write a number without having to append a type, even when the type has been explicitly defined previously.
Here's one of my favorites from a recent attempt to write a ray tracer:
> Except for function definitions, where the types could be inferred from the function bodies, but are not:
That's an interface boundary, as I mentioned. You would have to write the types in many cases anyway for separate compilation to work. In languages where you have whole-program type inference like ML and Haskell, people frequently end up writing the types for functions because of this issue.
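A small sketch of that split, with a hypothetical `scale` function: the signature spells out every type, while bindings and closure parameters inside a body are inferred.

```rust
// `scale` is a hypothetical function used for illustration.
// Parameter and return types must be written out at the interface boundary.
fn scale(v: (f64, f64), k: f64) -> (f64, f64) {
    // Inside the body, the types of `x` and `y` are inferred from `v`.
    let (x, y) = v;
    (x * k, y * k)
}

fn main() {
    // A closure is local, so the type of `p` is inferred from its use.
    let double = |p| scale(p, 2.0);
    let result = double((1.0, 2.0));
    println!("{:?}", result);
}
```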
> Plus (and this is more related to the complete lack of implicit type conversions), there are types everywhere in the program. I frequently can't write a number without having to append a type, even when the type has been explicitly defined previously.
This has nothing to do with implicit type conversions, but is rather because numeric literals have no type. It is not a type inference problem; it is just the way that numeric literals are defined.
To summarize: bare FP literals default to f64, bare integral literals default to isize. (NOTE: isize was recently renamed from int. It is a pointer-sized integer.)
(EDIT: The default for integer literals may be superseded by a later RFC. I seem to recall that the default is actually i32 now, but I can't find a PR to back up that claim.)
So you could easily get a Vec3<f64> like so:
let mut s = Vec3 { x: 0.0, y: 0.0, z: 0.0, w: 0.0 };
let mut t = Vec3 { x: 0f64, y: 0.0, z: 0.0, w: 0.0 };
The key here is that integral literals and floating point literals are distinct.
A bare literal of the form `0` is an unconstrained integral literal.
Whereas a literal of the form `0.0` or `0.` is an unconstrained float literal.
In practice it is very rare for me to annotate my numeric literals. If the variable escapes the stack frame it will be constrained by the signature of the function anyways. If not I constrain the type inline (`let x: T = ...`) and use the appropriate bare literals.
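The defaults can be checked with a quick sketch. This assumes current stable Rust, where an unconstrained integer literal falls back to i32 and an unconstrained float literal to f64. The `Vec3` below is a hypothetical three-field struct, not the one from the example above.

```rust
use std::mem::size_of_val;

// Hypothetical struct: its field types constrain bare literals to f32.
struct Vec3 {
    x: f32,
    y: f32,
    z: f32,
}

fn default_float_size() -> usize {
    let x = 0.0; // unconstrained float literal: falls back to f64
    size_of_val(&x)
}

fn default_int_size() -> usize {
    let n = 0; // unconstrained integer literal: falls back to i32
    size_of_val(&n)
}

fn constrained_size() -> usize {
    // No suffixes needed: the field declarations fix the type.
    let v = Vec3 { x: 0.0, y: 0.0, z: 0.0 };
    size_of_val(&v.x)
}

fn main() {
    println!(
        "{} {} {}",
        default_float_size(),
        default_int_size(),
        constrained_size()
    );
}
```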
You are right that this changed, but I can't find it in the RFCs either. https://github.com/rust-lang/rust/pull/20189 implemented it. And it is what everyone agreed upon.... hmm
> The preferred action is to allocate on the caller's heap and pass a mutable reference down to the callee.
Care to link/elaborate this? I thought the recommendation was to return by value in this case -- making it easy for the caller to decide where to store the value. The move semantics would then optimize the copy away, so you end up using either the caller's stack or the heap, depending on how the call was made.
Actually, I got curious and tried it out. Turns out Rust didn't optimize the heap case: it used the caller's stack, and only then copied the value to the heap.
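For anyone who wants to try the same thing, a sketch of the return-by-value pattern with a hypothetical `Big` type. The callee just returns the value and the caller picks the storage. Whether the intermediate copy in the heap case is elided is up to the optimizer, and this sketch makes no claim about it.

```rust
// `Big` and `make` are hypothetical names used for illustration.
#[derive(Debug, PartialEq)]
struct Big {
    data: [u8; 32],
}

// Returns by value; the function does not decide where the result lives.
fn make() -> Big {
    Big { data: [1; 32] }
}

fn main() {
    let on_stack = make(); // stored in the caller's stack frame
    let on_heap = Box::new(make()); // caller chooses the heap instead
    println!("{}", on_stack == *on_heap);
}
```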
Because it would change the semantics of the program. Where heap allocations happen is considered a side effect, and forcing a function to be inline(never) also makes it have unconstrained side effects (in some cases). Those side effects cannot be reordered.
> In fact, in general it's recommended not to use Box, because it complicates human reasoning about the code.
Really? I've always understood it was because when possible that decision should be left to the caller and boxing by default just made the interface less flexible/convenient for callers. How does Box complicate reasoning about the code?
To me, because Boxed objects are a weird combination of a pointer and a stack value. Boxed values have a lifespan all their own, and you need to understand the details of a box's scope to understand how they will behave, and when they will be deallocated.
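For what it's worth, the deallocation rule can be observed with a small sketch. `Noisy` is a hypothetical type that counts its own drops; the box is freed when its owning binding goes out of scope, the same as a plain stack value.

```rust
use std::cell::Cell;
use std::rc::Rc;

// `Noisy` is a hypothetical type that records when it is dropped.
struct Noisy {
    drops: Rc<Cell<u32>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the drop count seen inside the scope and after it ends.
fn drops_after_scope() -> (u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let before;
    {
        let _boxed = Box::new(Noisy { drops: Rc::clone(&drops) });
        before = drops.get(); // the box is still alive here
    } // `_boxed` goes out of scope: the heap value is dropped and freed
    (before, drops.get())
}

fn main() {
    println!("{:?}", drops_after_scope());
}
```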