As someone who hasn't actually used Rust for anything yet, the example on the root of the site is surprisingly legible: https://tokio.rs/
Question for the more knowledgeable Rust folks: Does this code (the echo server example) not handle the situation where the port is already in use? What's the return type of the socket bind in that situation?
EDIT: Wow six responses in as many minutes. I think I've created a new objective measure of how popular a programming language will be!
This is actually something about the code samples in Rust docs that irks me a bit.
My view is that sample code such as this should be as idiomatic as possible and that means providing a sample demonstrating a typical real use case. So seeing "unwrap" in this context doesn't sit well with me.
In fairness, this isn't specific to Rust. I find sample code like this in many projects regardless of language. However Rust is billing itself as a safe systems programming alternative to C and C++. It would help its marketing efforts, in my opinion, by having more robust samples than the competition. And let's be honest: given the competition is C and C++, that's a low, low bar--and this comes from a guy who's about as big a fanboy of those two languages as possible.
Edit for more disclaimer: documenting Rust code is a real pleasure. The team has done a stellar job at making documentation an easy thing to do while writing the code. It's already better in most cases than many other projects I can think of.
If the bind fails, panic seems totally appropriate. Bonus points for telling me it failed for EADDRINUSE. This is head and shoulders above anything C could deliver IMO. It's a fair debate whether we should prefer the simplicity of panic over the elegance of unwinding and handling.
So, to be clear, unwrap is safe. It is never going to cause the sorts of memory issues that an uncontrolled crash is.
That is, in terms of safety, this is the same thing as explicitly handling the error and then terminating the process. Which is the only real way you're going to handle this error anyway, unless you wanted some sort of retry logic. Which you might!
I agree that in terms of safety it's equivalent. However, in terms of how an actual program would be written it's... less than optimal. I almost never simply allow a panic-like termination in a program like this, if it can be trapped, without doing something else, even if it's just to spit out a log/stdout message with a descriptive reason. Then again the plural of anecdote isn't data, and maybe I'm the oddball here.
Yeah, unwrap()s in examples should be written to use expect() (which spits a message and then panics).
But for most example programs there's no easy way to handle errors further than just using expect() unless you want there to be more error handling code than actual useful example code.
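To make the difference concrete, here's a minimal sketch using the standard library's blocking TcpListener for brevity (not tokio's, which also takes a handle). Port 0 just asks the OS for any free port, so the expect here shouldn't actually fire:

```rust
use std::net::TcpListener;

fn main() {
    // With unwrap(), a failure prints only the generic
    // "called `Result::unwrap()` on an `Err` value: ..." message.
    // With expect(), your own message is printed along with the error details.
    let listener = TcpListener::bind("127.0.0.1:0")
        .expect("could not bind listener (is the port already taken?)");
    println!("listening on {:?}", listener.local_addr());
}
```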
Rust n00b here (I read a lot about it but haven't written a line yet), but wouldn't `.or_panic("error message")` be a better name than `.expect("error message")`? The latter seems backwards.
I understand the intent, but usually the programmatic object is the subject of the sentence, the method is the verb and the parameter/argument is the grammatical object.
Here, the programmatic object is the grammatical object. That's what I meant by 'backwards'.
could_fail().or_panic_with("message") may be even better.
A lot of people don't seem to agree with me, but I think even expect is too long of a name, and terseness for commonly used names seems more important to me than reading like natural language.
unwrap should be confined to test and temporary code. It somewhat makes sense in example code, but expect is better for that.
There are sometimes cases when you very easily know that the unwrap will never fail and if it does something has gone very horribly wrong, in which case you might use it.
Libraries should keep usage of unwrap/expect to a minimum. Applications can be more liberal with it, but they should try to use expect or better error handling.
Not everyone perfectly follows this, sadly. But most do.
It's at least partly a matter of affordance. You could soft-deprecate .unwrap() and .expect() (emit warnings at compile time, it will be annoying but not fatal), and provide a new version of expect with a name that's more cumbersome to type (or_panic_with(...) isn't bad in that regard, but you could do even worse... or_panic_with_error_message(...) :-).
> even if it's just to spit out a log/stdout message.
This will print an error out already.
$ ./target/debug/tokiotest
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { repr: Os { code: 98, message: "Address already in use" } }', ../src/libcore/result.rs:837
Printing out a _better_ error might be helpful, though, I'll agree :) You could use expect for that, which lets you supply your own message easily.
This really shows a fundamental tension though, in documentation. Is this example supposed to be demonstrating error handling? Or just get you going? Does adding more complex error handling distract from the point it's trying to teach? These are sort of open-ended questions.
Wasn't there an attempt to get a variation of main that returned a Result at one point (which would presumably print the error and set the exit code on error)? Back before `?`.
If Rust had that, examples would be even shorter, using `?` instead of try! or unwrap.
FWIW there are third-party libraries which include that feature e.g. error_chain has a `quick_main` macro (working with its `error_chain!`-generated error, but I'm not sure it actually depends on it):
quick_main!(|| -> Result<()> {
// insert Result-returning code here
});
Alternatively it takes a few lines to bootstrap it by hand e.g. ripgrep uses this code to bootstrap:
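(The ripgrep snippet isn't reproduced here, but a hypothetical minimal version of the hand-rolled pattern looks roughly like this -- `run` and its error type are placeholders:)

```rust
use std::process;

// Placeholder: real code would do the actual work here and use `?`
// to propagate errors up to main.
fn run() -> Result<(), String> {
    Ok(())
}

fn main() {
    // Print the error ourselves and set a non-zero exit code,
    // instead of panicking via unwrap().
    if let Err(e) = run() {
        eprintln!("error: {}", e);
        process::exit(1);
    }
}
```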
`-> impl Trait` seems useful in that context, so that main could have a variety of appropriate return value types: `()`, `i32` (for an exit code), or `Result<T, E>` where T implements ReturnValue and E implements Error.
`quick_error` uses a trait like that to determine the exit code of `main()`, allowing it to return either () or i32.
The output of panic! seems to be useful only to the programmer, not the user. As a user, getting an error message like that would lead me to think that the application is defective, as it arguably is if it cannot provide better user experience in a completely expected error condition like a port being already in use.
The output of panic will only ever be seen by the programmer. Unwrap exists to ease prototyping and to make simple code examples. IME, the first thing you do when you take a Rust application from the prototype phase to the production phase is to grep for unwraps and insert proper error handling.
This is invalid since nothing in the compiler forces you to remove the .unwrap(), so it's safe to assume it will not be done before production.
The whole "but this is just for prototyping" is a logical fallacy, as you know we have tons of prototypes in production ;)
I admit that I'm having a hard time seeing this criticism as anything but overblown. Finding usage of unwrap is trivially easy via textual search. Furthermore, Clippy can be used to automatically forbid usage of unwrap for your entire team ( https://github.com/Manishearth/rust-clippy/wiki#option_unwra... ). Furthermore, even when you hit a panic, it tells you which line of code in which file triggered it so that you can fix it immediately. Furthermore, the Rust community has a strong and long-entrenched proscription against libraries which unwrap rather than handle errors.
We can agree in the cynical interpretation of the laziness of programmers, but the mitigations in this case are so trivial, and the stakes so low, that focusing on unwrap as a point of contention is a poor use of energy.
> Your users aren't likely to be seeing a server-side process like this fail, though, so in this situation, seems fine.
Whoever maintains the server and runs the service is also my user, though, in the general case.
> As I said in the post you're replying to, a nicer error message would be a good thing.
Yeah - I don't mind if unwrap panics with a dev-oriented message as it's basically an assertion, but I guess I expected expect() (no pun intended) to give a more user-friendly error. Maybe the format of the panic! output could be changed to bring the message to the front and the technical details after that.
In the world of server systems programming once an organization gets beyond a certain size it's uncommon to have programmers administering server daemons or other server applications. Those folks are indeed users.
See my sibling comment to the above. By the time your software has matured enough that it's been deployed to non-developers, unwraps have no place in the code. It's not an error-handling strategy, it's just "// TODO: Add error handling" that the compiler understands.
As I said elsewhere, exemplary of what? The concept, or robust error handling? Even with the latter, it's unclear what the _right_ error handling is without knowing what you're actually doing.
Doing anything other than a crash is often sub-optimal, in my experience.
Such error handling code is usually untested, which is another way of saying 'buggy'. It almost always swallows useful information, like the backtrace. It sometimes lets program execution continue in a messed up state, causing very strange and hard to debug errors later on.
Certainly Rust makes it a lot harder to mess up error handling code than the languages I'm used to but in general I'm definitely in the 'all exceptions fatal' camp.
An exception to the exception rule IMO is a program that is managing many internal tasks at once, and the failure of one should not bring down the others. For example, a program that is coordinating many IoT devices should not fail if one of those devices cannot be contacted.
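A sketch of that idea (the device names and `contact` function are made up for illustration): one unreachable device is logged and counted, but the loop over the rest keeps going.

```rust
// Hypothetical: contacting a device can fail without being fatal overall.
fn contact(device: &str) -> Result<(), String> {
    if device == "sensor-2" {
        Err(format!("{}: connection timed out", device))
    } else {
        Ok(())
    }
}

fn main() {
    let devices = ["sensor-1", "sensor-2", "sensor-3"];
    let mut failures = 0;
    for d in &devices {
        // A failure for one device is logged, not propagated as a panic.
        if let Err(e) = contact(d) {
            eprintln!("warning: {}", e);
            failures += 1;
        }
    }
    println!("{} of {} devices unreachable", failures, devices.len());
}
```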
Agreed. Looking for examples of how to handle errors, results, etc and only ever finding unwrap has frustrated me in the past, and as a noob I would have preferred fleshed out rather than shortcuts.
The mention of memory safety is kind of weird. Reading or writing to -1 in C is memory safe too. It's even sometimes done deliberately as a debug technique if you want to insert messages in your IO traces. (Wrt your post that rust is more than just safety, people will get that impression when you introduce memory safety into unrelated threads.)
While Rust is more than memory safety, it's really important that people accurately understand what safety actually means in Rust, because otherwise they oversell it. Common misunderstandings here:
* Rust prevents deadlocks
* You can't leak memory in Rust
* Rust prevents race conditions
* Panic is not safe
etc. In my mind, bringing up memory safety here isn't a red herring; the parent said this:
> However Rust is billing itself as a safe systems programming alternative to C and C++.
The way in which we are safer is memory safety, nothing more. And knowing that is crucial.
Rust code will have bugs. Rust code will have security vulnerabilities. Rust is not a panacea.
> The way in which we are safer is memory safety, nothing more.
I know you want to not overstate Rust's claims given recent articles, but I think you're actually underselling a little here. For example, a Rust `enum` makes it much easier for the compiler to enforce code correctness. It's hard to go back to similar code in C or Go once you've gotten used to `match`.
Yes, I guess in my mind, "correctness" and "memory safety" are two different things. I like that Rust can help you write more correct software, but it's not nearly as strong of a guarantee as our memory safety guarantees are.
This is a relatively uninteresting quibble: the guarantees of any language only apply outside their equivalent of unsafe blocks (e.g. Python is memory safe... until you use ctypes). Pretty much everything has a way to FFI down to interact with the machine; in Rust, it's just called `unsafe`.
(One way to look at `unsafe` is that it's a tightly integrated FFI to another language called "unsafe Rust", and it benefits from having zero performance or semantic overhead.)
His point is that in the world of programming languages, "true memory safety" isn't really something achievable without making it impossible for your language to interact with things like native libraries or the OS. Given this, the concept of "true memory safety" isn't a useful one. In general when you say a language is memory safe, there's an implicit caveat that there may be explicit escape hatches. No language (except perhaps web-based JS) is "memory safe" by the strict definition so it's not a useful parameter to use when talking of languages. You instead use the weaker version.
Can't tell if trolling or not. Python doesn't market itself as memory-safe because it's completely expected that scripting languages are memory-safe. The only domain where memory safety isn't taken as given is in systems programming, which is why Rust's safety guarantees are a big selling point.
You seem to be saying that because Rust is able to provide stronger guarantees, it should make weaker claims on its main page. But just because some of Rust's benefits are verifiable, unlike those of some other languages, doesn't mean the constraints on them need to be enumerated every time they're expressed. This is not "lying."
By the same token, you should be requiring that every subjective statement on the other langs' pages have the caveat "in the opinion of the $LANG developers." Obviously no one would wear that.
"thread safety" and "memory safety" are vague terms. "thread safety" is clarified there. "memory safety" has a particular meaning in that context which is true for Rust.
The guarantees Rust makes are always only outside of unsafe blocks. After all, you can call C code in unsafe blocks. :)
The page clarifies later on that the definition of thread safety in question is "threads without data races." The 16 words on the front page of the website are not the appropriate place to go into caveats and nuances.
We've tried to come up with something succinct to replace "thread safety", but it's tough.
> Only outside unsafe blocks.
This is just true of all of our guarantees. Given that the vast, vast, vast, vast majority of Rust code is safe code, I don't feel this is misleading. Do you say that Ruby doesn't prevent segfaults due to its C FFI?
1/ It says on the tin "prevents segfaults". It does not prevent segfaults in all cases, but OK, let's debate the other claim.
2/ It also says "guarantees thread safety".
What is the Wikipedia definition of thread safety? It varies, so let's take the most common "freedom from race conditions."
Steve came and said Rust can have race conditions https://news.ycombinator.com/item?id=13376485 and that it's a common misunderstanding to think it prevents race conditions. Surely it would be great if the front page did not promote it!
I disagree with the segfaults bit (which is what you had been focusing on so far, and which is what I've been arguing against) but yeah, "guarantees thread safety" is iffy.
The wikipedia definition is pretty vague, it relies on a concept of "safe" that isn't defined there. It's acceptable to say that "data race safety" is "thread safety", though confusing. Rust's homepage does clarify what it means in the bullet points below that statement, so I wouldn't call this a lie. It may be misleading though, and this is a common misunderstanding as steve mentioned, so I submitted a PR to fix it https://github.com/rust-lang/rust-www/pull/685
Yes I think "data race freedom" when I see "thread-safety" (I despise this one term for being so vague).
In the punchline, "prevent segfaults" is framed in negative terms. If you put "provides memory safety" it will be a bit less evocative of specific pain, but will give a warm feeling.
Yes, to second Manish, I would love to find something that is accurate, succinct, and understandable. Concrete suggestions (from you or anyone else) very welcome.
While teaching Rust it's a recurring theme among students to believe that all crashes are created equal and that an exit from a panic must be equivalent to a segfault, despite this being significantly false. So it's common when discussing panics to novice audiences to sprinkle in "by the way, this isn't a crash, it's a controlled exit".
File descriptor -1, as in the same thing we're unwrapping in the rust code. In C, if socket() returns -1, that's actually a monad that you can pass to bind() and listen() and accept() and then check for errors. :)
> My view is that sample code such as this should be as idiomatic as possible and that means providing a sample demonstrating a typical real use case. So seeing "unwrap" in this context doesn't sit well with me.
I've starting trying to use `if let Some/Ok(...) = ...` in examples for my projects rather than `unwrap` for pretty much this reason. I feel like it strikes a pretty good balance between providing a relatively terse example without the risk of implying that `unwrap/expect` is considered normal usage for the library.
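For instance, the bind example rewritten in that style might look like this (using std's blocking TcpListener for brevity, and port 0 so the OS picks a free port):

```rust
use std::net::TcpListener;

fn main() {
    // `if let` keeps the example terse while making it visible
    // that bind can fail, instead of hiding it behind unwrap().
    if let Ok(listener) = TcpListener::bind("127.0.0.1:0") {
        println!("listening on {:?}", listener.local_addr());
    } else {
        eprintln!("could not bind listener");
    }
}
```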
That code is easy to understand. However I tried to play a little bit with tokio and my first idea was to extend the example to send all bytes 2 times back, which means I tried to avoid the copy helper method. In C/C++ or Go with a blocking socket API that would have been a 5 minute task, but with tokio I was unfortunately quite lost (didn't complete it in 2 hours). Some things that caused headaches were that buffers are consumed by the futures and how to reuse them, getting an infinite read loop running, and trying to compose different types of futures (all the time got errors that different branches have different types, which I finally silenced with using .boxed() everywhere). Maybe I should have tried to implement a custom future with a poll method instead of trying to use composition for that.
My intermediate impression is: the basics and the whole idea behind it look very good and suit Rust's general concepts. However, from a productivity point of view it can't yet match the ergonomics of Go or C# with TPL and async/await for IO tasks. But I think the Rust contributors are well aware of this and working on some things (async/await-like sugar and that impl Trait thing which might fix my composition problems).
Perhaps something like that should also be on the homepage to show that, while you can do low-level, it has already scaled up to a higher abstraction for you.
On a related note, how does the `parse()` in `let addr = "127.0.0.1:12345".parse().unwrap();` work? I am assuming that this parses a IP:port string into its components, but it looks like this is just calling a method of the basic String class. So how does it infer what `parse()` is supposed to do?
parse() is generic over its return type (which is Result<F, F::Err> where F: FromStr) [1], the value of which can be inferred.
The result of parse().unwrap() is of type F, which means addr is of type F. Afterwards, when we call `let sock = TcpListener::bind(&addr, &handle).unwrap();`, the compiler can infer from the signature of `TcpListener::bind` [2] what F has to be, namely `std::net::SocketAddr`.
Where is string to address parse implementation defined? I took a peek at SocketAddr and see a from_str function. Does that name have special significance?
The method `from_str` is from the FromStr trait. [1] The `parse` function only works on types that implement FromStr. It's "special" in that it's the conventional way to do things, but the compiler does not treat FromStr specially.
If you look closely, you'll see that the from_str function is part of the implementation of the FromStr trait for the SocketAddr struct. It's this FromStr trait that the parse() method depends on.
Rust uses type inference to figure that out. Because `addr` gets passed to `TcpListener::bind` which requires a `SocketAddr`, the type inferencer knows that `addr` should be a SocketAddr and so it all works :)
You might also add that it only works because SocketAddr implements FromStr. Type inference will only determine which version of from_str to call, it won't actually do the parsing.
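Both ways of pinning down the type can be seen in a short sketch:

```rust
use std::net::SocketAddr;

fn main() {
    // The annotation on `addr` (or, in the tokio example, the later call to
    // TcpListener::bind) tells type inference which FromStr implementation
    // `parse` should use -- here, SocketAddr's.
    let addr: SocketAddr = "127.0.0.1:12345".parse().unwrap();
    assert_eq!(addr.port(), 12345);

    // Equivalently, the "turbofish" syntax names the target type inline.
    let addr2 = "127.0.0.1:12345".parse::<SocketAddr>().unwrap();
    assert_eq!(addr, addr2);
}
```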
The example code does not handle a failure in that situation.
The TcpListener::bind call returns a Result enum, which can have the values Ok(SomeType) or Err(SomeError). The compiler forces you to decide how you want to handle that; for the sake of brevity the example code calls unwrap(), which assumes the result is Ok and will either return SomeType or else panic.
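Handling both variants explicitly would look something like this (sketched with std's blocking TcpListener rather than tokio's, and port 0 so the OS picks a free port):

```rust
use std::net::TcpListener;

fn main() {
    // The explicit alternative to unwrap(): match on both Result variants.
    match TcpListener::bind("127.0.0.1:0") {
        Ok(listener) => println!("bound to {:?}", listener.local_addr()),
        Err(e) => eprintln!("could not bind: {}", e),
    }
}
```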
Glad to hear it, especially since this isn't really the code you'd write most often; this is a library that frameworks will use. So it can be even more ergonomic...
> Does this code
So
let sock = TcpListener::bind(&addr, &handle).unwrap();
Here, the bind method [1] will return an error if it can't bind to the port. unwrap will cause a panic to happen.
If the port is already in use, the bind call will return an Err, the unwrap() will panic, and the program will exit.
It's generally considered bad form to use unwrap() (and by extension panic) in production code, but for examples where you just want to show something it's fine.
bind would return either an Option type or a Result type; unwrap is defined for both.
edit: Others have looked through the docs and confirmed it's a Result
This code does not handle the situation - the program will crash.
TcpListener::bind(&addr, &handle).unwrap()
Rather than raising exceptions, the paradigm in Rust is to return a Result, which has Ok and Err variants. unwrap() here assumes that the result is Ok and just gives back the data inside, or panics if it's an Err.
Is Rust like Ruby in that devs and library authors can add methods to and redefine built-ins? And that merely including a library can alter the definition of those objects everywhere in your project's code?
I'm not trying to start a language war, but IMO that's too bad: this feature is, in my experience, the single largest source for difficult-to-detect errors that characterize the Ruby programming experience, e.g. anything that uses Rails. It also contributes to deep stack traces which are essentially useless, because the departure point from your actual code is some 20+ frames above the site of the error.
One can't add methods dynamically like Ruby nor redefine built-ins. However, one can implement traits for built-in types, and the functionality in such traits is often provided as methods, giving the appearance of adding/overriding built-in functionality. Rust has the important distinction that functionality from a trait is only available when the trait is in scope, which requires an explicit import. Additionally, Rust has a static type checker, which helps lessen the risk of any features that seem implicit.
In this case, both of those methods are in the standard library.
So in this case, someone has created a trait, I guess as part of the standard library, that provides parsing of network addresses from strings?
From reading the docs, something provides a method called `FromStr` and the `str.parse` method does some magic to figure out which `FromStr` of all the registered traits is the one which we want? So then, it's not arbitrary monkey-patching, but using a defined protocol for defining and extending string parsing?
The parse method is an inherent method on the string type (i.e. not connected to a trait, and doesn't need an import), but added by the standard library, yes.
> From reading the docs, something provides a method called `FromStr` and the `str.parse` method does some magic to figure out which `FromStr` of all the registered traits is the one which we want?
The method isn't doing any magic: the language's type inference is deducing that the return type has to be a SocketAddr based on how the return value is used (it is passed to a function expecting a SocketAddr), and this forces the compiler to use the SocketAddr implementation of the FromStr trait (which has a from_str method), all at compile time.
(If the compiler deduced that the return type was something that didn't implement FromStr, i.e. no idea how to parse that value from a string, one would get a compile error.)
> So then, it's not arbitrary monkey-patching, but using a defined protocol for defining and extending string parsing?
Yes, parse is entirely driven by the FromStr trait.
So, specifically. Rust has "traits", which, for these purposes, work like Ruby modules. But. Rust has "coherence rules." That is,
You can only implement a trait for a type if you defined either the trait, or the type, or both.
So, because libstd (well, Rust itself, but you get the idea) defines all three of FromStr, &str, and SocketAddr, it's allowed to do this. You, in your own package, could not define this, since you wouldn't have defined any of them.
However, I was thinking of a slightly different case when I said that, which is methods. That is, you couldn't call methods unless they've had their traits brought into scope. But parse isn't on a trait; it's defined on `str` itself. In that definition, it also has to bring FromStr into scope.
So basically, this is very muddled by the fact that this particular example uses the standard library heavily, which already implicitly brings a lot of things (like &str) into scope for you. External packages don't have that.
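What you *can* do in your own crate is define a new trait and implement it for str, since you own the trait; callers then have to import it for the method to resolve. A sketch (the `ParseAddr` trait here is made up for illustration):

```rust
use std::net::{AddrParseError, SocketAddr};

// You can't add an inherent method to `str`, but you can define your
// own trait...
trait ParseAddr {
    fn parse_addr(&self) -> Result<SocketAddr, AddrParseError>;
}

// ...and implement it for `str`, because you defined the trait.
// This satisfies the coherence rules.
impl ParseAddr for str {
    fn parse_addr(&self) -> Result<SocketAddr, AddrParseError> {
        self.parse()
    }
}

fn main() {
    // The trait must be in scope at the call site for this to compile.
    let addr = "127.0.0.1:12345".parse_addr().unwrap();
    assert_eq!(addr.port(), 12345);
}
```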
> So, because libstd (well, Rust itself, but you get the idea) all three of FromStr, &str, and SocketAddr, it's allowed to do this. You, in your own package, could not define this, since you wouldn't have defined any of them.
It's unclear to me what you're saying here. It's true that you can't make an inherent method `parse` on str like the one being called here, it's also true that one couldn't implement FromStr for SocketAddr in an external package, but one can create a custom generic function that uses FromStr and have that generic type be SocketAddr for some call sites. Of course, if this function wanted to have the syntax of a method call on str, it would have to be defined in a trait that users then import.
To be clear, this isn't necessary for calling methods (including inherent methods on SocketAddr), unlike the methods in a trait. The only reason to do this is that it lets one abbreviate the name: "SocketAddr" instead of "::std::net::SocketAddr".
This is all very impressive, especially since AFAICT Tokio is only about 5 months old and consists of tens of thousands of lines of code spread across a half dozen different repositories. I'm anxious for the day when I can use Rust at work. All I'm really missing to justify using it is a bit more mature ecosystem. Especially a full featured, stable AWS SDK.
It was announced roughly in August, so yeah, five months. Though some work had to happen for that initial release, of course. There was a joke though, about this rapid development. Someone said something like "That they keep re-writing tokio is really annoying as a user, but I guess if Rust is so productive that you can throw out huge chunks and re-build it that quickly, well, that says good things about the language." Of course, as the post says, things will be more stable now.
Rusoto seemed like a good effort last time I looked, but it didn't have very full coverage of the AWS API. Amazon just has so many APIs that it's a huge dev effort to implement it all and keep up with changes. This is a problem in other languages where Amazon doesn't contribute to keeping the client up-to-date. I'm hopeful that with macros 1.1, the situation will improve since it will enable compile-time code generation based on the official json files in botocore. Just add that repo as a git submodule and then:
Becomes up-to-date with the latest changes from Amazon the moment they're released (recompile required).
Procedural macros are really exciting with regard to writing/consuming APIs as they enable both the client and the server interface to be implemented in an API specification language (Amazon uses custom JSON, but Swagger/RAML could work too) instead of Rust with zero performance penalty (gotta love those zero-cost abstractions :-)
We're tracking the missing services: https://github.com/rusoto/rusoto/issues/436 . There's plenty of work to do and we're concentrating on getting them implemented. Sometimes progress is slow since it's a side project, but it's still moving forward.
If we can make the derive statements work as your code snippet shows, I'd be really happy. We'll take a look at how we can improve codegen when new features are available. Some of our codegen is dated, using what was available when it was written.
I've used rusoto a bit. A lot of it feels very "generated" in that there are a bunch of different types for everything, but you can ignore most of that and just use the Client methods. It does make browsing the docs a bit difficult though.
It seems like many things are in flux. There's a new `rest_xml` codegen thing that just got implemented which will allow the AWS API to be supported. They're also moving from Hyper to Reqwest. From what I saw, async stuff is planned for after 1.0 by adding new AsyncClient impls backwards-compatibly.
Yes, there's a lot of types we generate from the botocore service definitions. Any suggestions on reducing the noise in our docs to make usage clearer? I'd love an issue on Github with thoughts. https://github.com/rusoto/rusoto
We also have a crate that's designed to provide higher level abstractions. Rusoto is analogous to botocore and rusoto_helpers (https://github.com/rusoto/rusoto/tree/master/helpers) will be closer to boto3. It's on the back burner as we focus on completing the core functionality.
Docs wise, a big help would be to highlight 'important' structs. When I was first reading the docs, I saw a great big list of structs including ones like SpecificMethodInputArguments, and completely missed the KinesisClient structure that all the actual methods are implemented on!
I've made https://github.com/rusoto/rusoto/issues/519 for this particular issue, along with a sample way of calling out the important Client struct for each service. Thanks for the idea!
I used its S3 module for a project a few months ago. I remember the API feeling a bit awkward and non-idiomatic, but I also got the impression that the API was emulating another language's AWS API. I don't know if that is true or not.
Overall I was super happy. I finished the feature I was working on in maybe a day or two without any issues in the rusoto library. Can't ask for more than that.
I really want to know whether the full-blown web framework that eventually arises in Rust can still claim "zero cost abstractions", and whether this would translate into basically "the most performant/efficient way to write a web app".
So, when what was to become tokio was first announced, we did a check to see how it compared to writing mio by hand. The two were within 0.3% (not a typo, one third of a percent) of each other. That was before any optimization work was done.
One of the core premises of tokio is that this compiles down to the state machine you'd have to write by hand if you wanted to do asynchronous stuff. So yes, this is very much intended as a zero-cost abstraction.
Yes, I got that. So to build a "small" web framework like express.js, one could build a routing layer based on macros (like Phoenix/Elixir or rocket.rs do), plus JSON-serialization (Serde might do it) and a compile-time HTML-templating-engine, all of those leveraging zero-cost abstractions once Macros 1.1 land in stable, right?
Then stuff like spinning up one OS thread for every CPU core, scheduling requests in round-robin (?) style, and then wrapping it all up into a developer-ergonomic thing.
Since you probably have much better insight into Rust development and the ecosystem, can we expect that this will eventually happen, or are there still showstoppers somewhere?
You could see how that match block functions as a router...
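A toy illustration of the idea (the routes and return values are made up):

```rust
// A match block dispatching on (method, path) pairs is already
// most of what a minimal router does.
fn route(method: &str, path: &str) -> &'static str {
    match (method, path) {
        ("GET", "/") => "home page",
        ("GET", "/users") => "user list",
        ("POST", "/users") => "create user",
        // The catch-all arm plays the role of a 404 handler.
        _ => "404 not found",
    }
}

fn main() {
    println!("{}", route("GET", "/"));
    println!("{}", route("DELETE", "/users"));
}
```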
I was literally playing around with this last night. There's a lot of experimentation going on in the server-side Rust web framework space right now, and I expect it to heat up even more now that tokio has had a release.
> once Macros 1.1 land in stable, right?
So to be clear, macros 1.1 gets you custom Derive, which is Serde/Diesel. It'll be stable in the next release in ~3 weeks. It won't get you the full ability to use any custom attribute, like https://rocket.rs/ uses.
> Then stuff like spinning up 1 OS-thread for every CPU core
This is not implemented in tokio yet, but in my understanding, it's coming.
> a lot of experimentation going on in the server-side Rust web framework space
This alone is awesome news! Having worked with literally dozens of web frameworks, I have a sincere interest in seeing what emerges from the Rust space. I can't pinpoint or prove it, but many developers seem to care about doing stuff "the right way" instead of getting something out the door that works maybe OK. I'm really waiting for the "this is it"/"we couldn't possibly solve this better" moments that might arise from such experimentation.
> macros 1.1 gets you custom Derive, which is Serde/Diesel
... basic building blocks for all things web I'd say. Looks like the next stable release will arrive just-in-time for things to emerge.
And then there might be the next wave of Rust users/developers that want to build upon such rock-solid foundations, like me. Really, I'm excited like I haven't been in many years!
Really minor thing, but I for instance like how in Hyper.rs types are used to enforce (at compile time) that one does not try to write headers after having started or sent the body of a response.
http://hyper.rs/hyper/v0.10.0/hyper/server/index.html#an-asi...
Good sign that we are starting to leverage the compiler to verify such invariants.
> > Then stuff like spinning up 1 OS-thread for every CPU core
> This is not implemented in tokio yet, but in my understanding, it's coming.
I just started full-time work on something like that (waiting out a noncompete) but more similar to the LMAX disruptor than a round-robin scheduler. That sort of heavyweight pinned actor architecture has a lot of performance and simplicity benefits especially for something like Rust which already has a solid handle on good memory use.
No, because its creator is not willing to compromise on ergonomics in order to get on stable. I would imagine that they would reject this on these grounds alone.
Agree. The documentation would make for a great talk. I'm going to spend a lot of time reading it, because the concepts it introduces are surely going to be useful no matter which tech i'll be using.
Futures-rs and Tokio were started by prolific Rust contributors so the community is gravitating towards making these crates the canonical async libraries for Rust. Since async primitives are so important and used in so many different ways, the maintainers are taking it slowly so that the community can provide as much feedback as possible. It may not be in the standard library but futures/Tokio will be the basis for most high level async in Rust so it's critical to get it right.
The Rust community also takes semver quite seriously. 0.1.0 can break API compatibility when it moves to 0.2.0, so prototypes that want to iterate on API stay pre-1.0 until they feel confident in their API. The Cargo ecosystem has shockingly few 2.x or 3.x versions.
By comparison to the long tail of the C library ecosystem (SONAME handling), and the package ecosystems of many other languages, most of which do not use sufficiently well-defined versioning schemes to allow expressing dependencies like "1.4 or any compatible version". In other languages, I've done the equivalent of "cargo update && cargo build" and encountered build errors due to API changes. In Cargo, I've found that exceptionally rare.
To clarify, I was specifically looking forward to the first production Rust web framework based on futures. I'd like to write web applications in Rust, to integrate with many other libraries and parts of the ecosystem, and I look forward to doing so via a futures-based framework that integrates well with other futures-based Rust code.
I enjoyed visiting Tokio (Tokyo) the city and I liked the "io" suffix and how it plays w/ Mio as well. I don't know... naming is hard so I didn't spend too much time thinking about it.
"Tokio" is one way to romanize 東京, the capital of Japan. [1] "Tokyo" is more accurate in a sense, and so is more popular today, but doesn't contain the I/O pun. The logo also references its metropolitan crest. [2].
I edited my post with a link to some history, you might want to check it out. As someone with very limited 日本語 skills, I find this stuff very fascinating. (And yes, un-learning kyo vs kee-ooh was tough...)
Yeah, I think I still say it wrong sometimes. To me, it's a word I learned in English, and so I think of it the English way. In that way, loan-words are troublesome in both English and Japanese for words from both languages.
Don't even get me started about learning French words in either of them and then learning their pronunciation in the other, and then the actual French. As an example, the Atelier game series.
I don't know if it's entirely correct to call it wrong. The nearer a word comes to its equivalent in its source language, the higher the expectation that it's pronounced the same as in the source language. This expectation is arguably correct when the word is not commonly used in the host language. Some words are very different: Germany/Deutschland, Florence/Firenze. Some are closer, even having the same spelling: Mexico/Mexico. I don't think it's wrong per se to pronounce it the way it's typically pronounced in the language you're speaking.
shrug, aren't the cues provided by transliteration mostly for the sake of the English speakers (or at least non-Japanese speakers)? If so, the distinction won't matter.
It's not really as far off as you would think, since /j/ and /i/ are the same semivowel. As a single syllable, it's /kjo/ or /ki̯o/. As two syllables, it's /ki.o/.
Sorry, no. Tokyo is not a correct romanization of the metropolis' name. It is rather an exonym.
Romanization is a way of unambiguously encoding Japanese into the Roman alphabet. One method gives us "toukyou". Another system uses diacritics for long vowels.
"Tokio" is a different word.
tokio -- ときお (to ki o)
toukyou -- とうきょう (tō kyō: this is Tokyo)
tokyo -- ときょ (to kyo)
Note that ō loses an orthographic nuance, since it can denote either おう or おお.
Tokio is also an exonym for 東京 – in many languages other than English. While the spelling overlaps with how 「ときお」 (whatever that means) would be romanized, the author himself said the name got its inspiration from the metropolis of Tokyo.
Edit: By the way, your idea of "unambiguously" encoding Japanese doesn't hold water. Written Japanese is a mixture of kanji and kana, and no romanization system round-trips the kanji. You could say that (good) romanization systems unambiguously encode the pronunciation, but that isn't true either: in addition to the fact that encoding the "pronunciation" of written language is problematic in principle, there are the pitch accents that aren't captured by kana. But we're digressing.
If you give up the "unambiguous", "tokyo" is a perfectly good romanization for 東京.
I might have been using "romanization" a bit loosely, thank you. I was taught with Hepburn, IIRC, and was basically taught "hey there's a bunch of ways to do this, and people give them names" not "a way of unambiguously encoding Japanese into the roman alphabet", but am very willing to be wrong here. Thanks for the note.
I understand that your parent may be misusing the word romanization, though the intent of his comment is clear. The name of the capital of Japan (東京) is spelled differently in different languages, Tokyo and Tokio being two of them.
If you're going to be strict with respect to romanization and use accents, isn't the circumflex used to indicate long vowels, at least in Kunrei? So, Tôkyô? Or in Hepburn, the macron? Tōkyō? I've never come across acute being used for this.
Fixed that now. I was being lazy and hadn't noticed that Hacker's Keyboard provides ō, right there in the same long-press menu for o where I had obtained ó.
They don't want to add abstractions to the language (e.g. Green threads). So, they are doomed to add abstractions to the libs instead, and one way to do cheap concurrency without the help of the language is to use async I/O.
And by cheap I mean cheaper than OS threads. The reactor pattern still adds a lot of CPU overhead to each I/O operation.
What's interesting is that Rust had green threads at one point. They implemented that by making std::io async-capable under the hood. But they didn't like it: too much abstraction cost, and it prevented them from easily adding more native I/O features.
So they ditched it and now they are doing exactly the same thing to I/O, just in user space, and without green threads.
> And by cheap I mean cheaper than OS threads. The reactor pattern still adds a lot of CPU overhead to each I/O operation.
You need to pay for concurrency one way or another; there is always going to be some bookkeeping overhead. Whether you pay the price in your language's runtime (like Go) or in a library, you're not getting it for free. However, it looks like the design of Rust's futures will let it reach very high throughput.
The term "zero-cost abstraction" is a bit annoying. By definition an abstraction has an obfuscation and compilation cost even if it doesn't add to the runtime cost. Too many of these and the program is no longer understandable and takes hours to compile.
Asynchronous programming is hard, even when hidden behind abstractions like futures or promises.
It was cool and trendy when libevent and Nodejs came out, but now we should all have the experience and knowledge that tell us to stop doing asynchronous I/O.
Go, Haskell, and Erlang do a great job at concurrency, without the async I/O craziness.
No, this is not true, and you got a couple of things wrong. Erlang's model is more like asynchronous event-driven programming on steroids, while idiomatic Go is synchronous and is merely traditional multithreading, which is known to be unusable for non-trivial concurrent problems. Here's the thing though: people don't really understand these things. They always want something easy that solves the simple problem they have in mind, and always fail to grasp how much flexibility they sacrifice and how much harder or even impossible it will get for more complex problems. And believe me, if you get into concurrency you're going to have a lot of problems that are very hard or even impossible to solve synchronously. The world is an asynchronous place.
I personally find Node.js-style async much more intuitive than goroutines with synchronous code in them, but this is a highly subjective matter.
Node.js used to be pretty annoying in terms of code legibility when dealing with complex workflows back when it was fully callback-based, but with async/await you get the best of both worlds IMHO.
async/await is a huge improvement over callbacks, but you still have to make sure that everything in your code is async, or else you block everything. Also, it doesn't magically work with all functions; the functions have to support async/await.
And it's still less natural than writing sync code
Go certainly does not do traditional threading. It has goroutines, which are lightweight "threads" that get multiplexed on top of real OS threads. It's basically implementing async IO at the runtime level.
In terms of programming interface, you deal with them as if they were traditional threads, with the pros and cons that entails:
pro: all your code is sequential; it's intuitive to understand what the programmer wanted to do in a specific scenario.
con: data races are hiding in every corner :/
Being a green thread with a growing stack is an implementation detail for the developer writing Go code.
Explicit async IO is the building block which allows the things you mention to be efficient, and Rust's aim is being great at low-level code, e.g. implementing the runtime systems for those languages. Another use of explicit async IO for Rust is building nice abstractions for other Rust code to use, and these abstractions may or may not be so explicit about the asynchronicity.
Like with a lot of Rust development at the moment, these libraries are building blocks, they're not the end of the story.
When the ecosystem is synchronous like the Rust ecosystem, you are basically going to rewrite everything. All your network code, all your client libraries.
Async code cannot use synchronous code, because that would block it and prevent it from returning to the event loop.
This is a tedious task, and you end up with less tested, less complete code (at least during the first few years) compared to the sync libraries provided by vendors and std libs.
Even Node.js still hasn't ported the world to async, and still uses a thread pool under the hood for a number of things (e.g. name resolution), which defeats the promise of async I/O.
Synchronous code cannot use async code either, because, well, in order to get anything from async code you have to be async yourself.
Async code is also more difficult to reason about compared to classical blocking code.
The idea behind async code is to avoid the cost of context switches and the memory usage of OS threads. But they are not the only way to avoid these costs. Go, Erlang, Haskell do a great job at this, without forcing the world into async.
> The idea behind async code is to avoid the cost of context switches and the memory usage of OS threads. But they are not the only way to avoid these costs. Go, Erlang, Haskell do a great job at this, without forcing the world into async.
You're not avoiding the cost, you're just moving the runtime and language/code complexity costs around. Each of the techniques used by Go, Erlang, and Haskell to implement coroutines has trade-offs, and there are two that Rust simply cannot make and still fulfill its goals: losing low-overhead bidirectional C interop, and adding a language runtime.
Performant coroutine implementations (AFAIK) all require moving the stack pointer around which makes it very expensive to have code call FFI functions. C makes certain assumptions about the stack and invariants need to be upheld, especially when the foreign library takes a function pointer from the host language. These features require a runtime which is out of the question for a low level language.
We're definitely aware of all of this. Tokio is made by two Rust core team members and the person who wrote the most widely used async IO tool; it's all but provided by Rust itself. And the ecosystem is aware of this too; the other people who were working on async IO have backed tokio as well, and the Rust community in general is interested in not having this split.
> Synchronous code can not use asyn code either because, well, in order to get anything from async code you have to be async yourself.
This is solvable through a threadpool, which tokio provides. In other words, it lets you make a blue function red.
> But they are not the only way to avoid these costs
These do not avoid all costs. For example, they pay the overhead of green threads, which means that you can't interoperate with C code at zero cost. That's a price a language like Rust cannot pay.
> How will you avoid having two variants of each and every lib ? E.g. redis-rs, sync, and redis-tokio, async ?
By having one, the async one. If someone doesn't care about asynchronicity, there's always some sort of "wait until completion" functionality. (This is practically what the "normal" synchronous IO functions are doing anyway, just internally.)
This is actually what Go got very wrong. They implemented net completely synchronously, instead of doing it in an event loop and providing synchronous wrappers that communicate with that event loop for those who need them.
Having a sync interface would make it easy to use in sync code, yes. I feel that we are far from zero cost abstractions now, though.
> switch stacks
Is it because the stacks in green threads are smaller than what C would expect?
An interesting approach taken by Go here is to avoid calling C as much as possible. They don't call libc for system calls, for instance. This is also what allows them to switch execution to another goroutine just before the syscall.
> An interesting approach taken by Go here is to avoid calling C as much as possible. They don't call libc for system calls, for instance. This is also what allows them to switch execution to another goroutine just before the syscall.
Go and Rust don't have the same goals. Rust has been designed as a replacement for C and C++ that can be progressively integrated into an existing code base (like Firefox's, or librsvg's). Go is Google's replacement for Java and Python, for building independent micro-services.
Go does a great job in its niche, but won't work at all where Rust shines. They are different languages, meant for different use cases, and if people could stop comparing them every time one is mentioned, I think we'd have made a great step forward…
> They are different languages, meant for different use-cases and if people could stop comparing them every time one is mentioned
Are their use cases really that different? Only Rust is aimed at systems programming, but it seems like it could fill the application programming role quite well, where it competes with Go.
People are welcome to use Rust for applications programming, but systems programming is where Rust brings the most to the table (memory safety without a GC was barely thought possible), and where development focus is: trade-offs are made with systems problems in mind, not application problems. This is reflected in many APIs through-out the ecosystem, which give fine control but require a lot of manual explicitness. IO is no different.
> Having a sync interface would make it easy to use in sync code, yes. I feel that we are far from zero cost abstractions now, though.
Why do you say that? Taking an async zero-cost-abstraction API and calling wait() on it doesn't magically make it more expensive. It just blocks the current thread until the async operation is done. Said operation is just as fast and zero-cost as it was before.
You had a call to read(). Now you have a thread, an event loop, a polling mechanism, and synchronization with the thread. That's many more system calls and CPU cycles.
If you accept that using the async call in an asynchronous nature doesn't have overhead, then you can turn it into a synchronous call by saying something like `.wait()`.
steveklabnik was not saying that it costs Rust to avoid libc; he was saying that "calling C has high overhead" (i.e. the main underlying reason for avoiding libc) is not a cost Rust can accept, given its goals. This also means there's not nearly as much reason to put the effort into reimplementing the libc abstractions on every platform.
About the cost of calling an async function synchronously:
You had a call to read(). Now you have a thread, an event loop, a polling mechanism, and synchronization with the thread. That's many more system calls and CPU cycles, for a synchronous read() built on an async one.
Can you explain how you run an async function synchronously, and how it has no overhead compared to calling a synchronous implementation of the same function?
As far as the kernel-side implementation goes, IO is always asynchronous. The CPU is not involved in the actual movement of data between memory and the network interface.
When you make a synchronous syscall, the kernel initiates the operation, saves the state of your thread, and starts another one. When the network interface is done, it signals the kernel, which then marks your thread as runnable and schedules it for execution.
When you make an asynchronous syscall, the kernel initiates the operation but does not block your thread. This is usually done in the context of an event loop, which makes a synchronous syscall (like epoll_wait) when it runs out of tasks to run.
Thus, converting a single async syscall to a sync one means two syscalls: initiate the operation, then wait for its result. The extra round trip between user and kernel mode is basically free in this case because you're blocking on IO, and any logic it implements has to happen in the synchronous case anyway.
> How will you avoid having two variants of each and every lib ?
The python community is also struggling with this problem. The emerging approach, which seems to me to be the right approach, is to write the libraries such that they do not do any IO; then you can use them anywhere, with some integration. So basically it's just the principle of separation of concerns.
Incidentally, this is sort of one of the design goals of tokio/finagle: that you can write your code in a transport-agnostic way. A timeout future works no matter what protocol you want to implement a timeout for, etc.
Go has different memory layout and calling conventions in part because of its green threads implementation. It has to switch stacks to call C code because the Go stacks are small and relocatable, to make growing more efficient, which is only possible because of the GC.
> This is solvable through a threadpool, which tokio provides. In other words, it lets you make a blue function red.
I'll add that once you have this, the async/sync distinction (functions that return Future and those which don't) in Rust becomes exactly the same as fallible/infallible (functions that return Result or don't) and gets handled pretty much the same way.
Futures aren't enums. I meant that you have the ability to handle it there and then (block on it via a threadpool), or defer handling (chain it to the next async calls in your async function). You have this same pair of abilities with Option, which bridges the basically nonexistent gap between fallible and infallible functions.
I really dislike the "what color is your function" article, because it pushes the idea that Go's userspace M:N threading is somehow different from everything just being synchronous and using threads. It isn't. Go just doesn't have async I/O, with a particular idiosyncratic implementation of threads.
It IS different. In async code every single line of code must be async (or be very fast and i/o free), else you block everything.
In Go, you can decide that a call frame and all its descendants will live their own life in a separate thread of execution. But inside that call frame, the code is just usual, synchronous code. It doesn't even need to know that it's a goroutine. It is just executing concurrently with other code, at a very low cost to the computer and to the programmer.
Go doesn't have async I/O by choice; it just doesn't need it. Though I'm pretty sure there must be a libevent or libuv binding somewhere.
pcwalton said Golang's userspace threading model with Golang's I/O is equivalent to using "normal" threads with synchronous I/O, not that it's different from using threads and "async" I/O.
>> somehow different from everything just being synchronous and using threads
> It IS different. In async code every single line of code must be async
I think you misread the gp. He was not saying that Go code is not different from async code (what you understood), he said it's not different from synchronous code using threads.
I agree that there is no visible difference in the code, and that's exactly what I like. What's different, however, is how Go routines have much less overhead than OS threads.
It's not as much as you think, and goroutines also have significant overheads that OS threads don't. Most of the time, when people talk about goroutine overhead, they're referring to the small stacks, which are actually a property of GC--there are language implementations that are 1:1 that also have small stacks, such as SML/NJ.
> Async code can not use synchronous code because this would block it, and prevent it from returning to the event loop.
This sounds like a JS-specific issue, where you're not allowed to spawn new threads for historical/implementation reasons? Even in Python and Ruby, where the GILs prevent threads from really running in parallel most of the time, you can still use them to unblock an event loop around a long-running function.
Using threads to make sync code async defeats the advantages of doing async code in the first place.
You are writing async code to avoid threads. If you bring threads into your async code, you get the worst of both worlds: convoluted code, plus thread issues (pool exhaustion and/or thread overhead).
Of course. Switching back and forth between async and threads can be a bad sign. I didn't mean to say that you should do it all the time, rather just to put some context around this:
> You can only call a red function from within another red function.
That's really really true in JS. There's no way to call an async function from a sync function, if you need to return its result. You are Capital-S-Screwed if you need to do that.
But it's going too far to apply that absolute rule to other languages. When you have threads, it's pretty easy to mix sync and async code. It can be a bad idea, just like having a codebase that's half exceptions and half error returns is usually a bad idea, but you're certainly allowed to do it when it makes sense.
Threads and async are orthogonal. Threads allow you to do work in parallel. Async allows you to increase CPU utilization in a single process/thread.
writing async code to avoid threads
Async can be used in single-process systems.
It can make sense to use async techniques to maximize work being done on each process/thread. Working with threads and using async methods can both be complex if you don't have good abstractions for doing so. They can be used together to great effect if you have the right tools.
I tend to agree that high-level languages that are already paying the cost of a pervasive runtime (e.g. for GC) should abstract away async vs. sync operations. But Rust is a low-level language, and with that comes the expectation of explicitness and direct programmer control.
You can't really do IO in coroutines without async IO in the implementation, and (depending on your baseline) you can't make them zero-cost without something like async/await.
> They just let other coroutines execute during the syscall.
They do this by using async IO under the hood (for network sockets at least; convincing disk-based read() calls to be async is much trickier, and Go doesn't do it).
Because synchronous IO will block the OS thread. If you're multiplexing a bunch of green threads on that OS thread, they'll all stall.
That's why Go, for example, does use async IO. But it makes the use of it look like sync IO for simplicity. In the runtime, however, it is doing all the async event management so that you don't have to.
From a programmer POV this is a good thing, but it comes at the price of a runtime that has to exist. Something that Rust eschews.
Sibling comment by lucozade is good. To be more specific, it's because OS-level threads are the only way the kernel gives applications to run any code at all. When an OS-level thread makes a syscall, it ceases to run any application code until that syscall completes- it's just an ordinary function call that also switches to kernel mode.
The OS kernel itself provides both blocking and asynchronous syscalls. So if your OS-level thread is switching between several coroutines, and one of them makes a synchronous read() syscall, the whole OS-level thread blocks and can't run any other coroutines until the read completes. If it instead makes an asynchronous syscall, the kernel returns immediately and the OS-level thread can switch to another coroutine while the first one waits.
You don't have to rewrite the (vast) majority of crates to be asynchronous because almost all of them can run from a thread pool with zero problems. Worst case scenario, you spawn a new OS thread and return a future which integrates very well with Tokio. Since it's built on top of futures-rs it will play nicely with other asynchronous crates and wrapping a synchronous library is much easier than rewriting it.
It would be great to have every I/O library use OS-specific polling mechanisms, but that places a significant burden on library developers to not only know their domain, but also have experience with each platform's quirks and limitations.
> all of them can run from a thread pool with zero problems
Until you exhaust the thread pool, in which case everything relying on the thread pool will take much longer than usual to complete.
Nodejs does this for name resolution, and it's a nightmare. Basically, with the default pool of 4 threads, four slow resolves suffice to DoS everything that needs to do a name resolution in the same process.
I hope that Rust will not have the same problem when mixing async and thread pools.
If the thread pool is not limited in size, you avoid this problem, but you lose all the benefits of async.
> If the thread pool is not limited in size, you avoid this problem, but you lose all the benefits of async.
You can't have both: either you have a limited pool and can block it on long running tasks or you don't and can wind up with a large number of threads. Go currently doesn't give you this choice and you're limited by whatever you set GOMAXPROCS to before you start your program. If you have GOMAXPROCS set to 4 and you have 4 goroutines that take a long time (not waiting for IO) you've blocked the ability to do any other work. This isn't entirely true of course because they have a runtime in-process scheduler (which adds more overhead) but you could easily avoid this particular problem with tokio by using an async DNS resolution solution, which is what Go is doing in their stdlib for you.