Hacker News | awused's comments

>It's the problem with classical stage-oriented compilers. A compiler designed for diagnostics from the beginning

I'm not sure any compiler, even one "designed for diagnostics," can gather the "necessary metadata" from inside my brain based on my original intentions. If the compiler could unambiguously interpret what I meant to write, it wouldn't have needed to fail with a compilation error in the first place. Whenever I see something like "perhaps" or "maybe" in a compilation error, the information just didn't exist anywhere the compiler could possibly get to; the message is just a suggestion based on common mistakes.


Honestly I found myself coding very much the same way in Rust as I did in Python and Go, which were my go-to hobby languages before. But instead of "this lock guards these fields" comments, the type system handles it. Ownership as a concept is something you need to follow in any language, otherwise you get problems like iterator invalidation, so it really shouldn't require an up-front architectural planning phase. Even for cyclic graphs, the biggest choice is whether you allow yourself to use a bit of unsafe for ergonomics or not.

Having a robust type system actually makes refactors a lot easier, so I have less up-front planning with Rust. My personal projects tend to creep up in scope over time, especially since I'm almost always doing something new in a domain I've not worked in. Whenever I've decided to change or redo a core design decision in Python or Go, it has always been a massive pain and usually "ends" with me finding edge cases in runtime crashes days or weeks later. When I've changed my mind in Rust it has, generally, ended once I get it building, and a few times I've had simple crashes from not dropping RefCell Refs.


Lifetimes and borrowing are very much a correctness thing and aren't just for tracking when memory is freed. While you won't have use-after-free issues in a GCed language, you will still have all the other problems of concurrent modification (data races) that they prevent. This is true even in single-threaded code, with problems like iterator invalidation.


>Trying to solve the problem by frequently invoking signal handlers will also show in your latency distribution!

So just like any other kind of scheduling? "Frequently" is also very subjective, and there are tradeoffs between throughput, latency, and especially tail latency. You can improve throughput and minimum latency by never preempting tasks, but it's bad for average, median, and tail latency when longer tasks starve others, otherwise SCHED_FIFO would be the default for Linux.

>I read the blog about this situation at https://tokio.rs/blog/2020-04-preemption which is equally baffling

You've misunderstood the problem somehow. Nothing in that post is about tokio (which uses epoll on Linux and can use io_uring) failing to respond. io_uring and epoll have nothing to do with it and can't avoid the problem: the problem is with code that can make progress and doesn't need to poll for anything. The problem isn't unique to Rust either, and it's going to exist in any cooperative multitasking system: if you rely on tasks to yield by themselves, some won't.


> So just like any other kind of scheduling?

Yes. Industries that care about latency take some pains to avoid this as well, of course.

> io_uring and epoll have nothing to do with it and can't avoid the problem: the problem is with code that can make progress and doesn't need to poll for anything.

They totally can though? If I write the exact same code that is called out as problematic in the post, my non-preemptive runtime will run a variety of tasks while non-preemptive tokio is claimed to run only one. This is because my `accept` method would either submit an "accept sqe" to io_uring and swap to the runtime or do nothing and swap to the runtime (in the case of a multishot accept). Then the runtime would continue processing all cqes in order received, not *only* the `accept` cqes. The tokio `accept` method and event loop could also avoid starving other tasks if the `accept` method was guaranteed to poll at least some portion of the time and all ready handlers from one poll were guaranteed to be called before polling again.

This sort of design solves the problem for any case of "My task that is performing I/O through my runtime is starving my other tasks." The remaining tasks that can starve other tasks are those that perform I/O by bypassing the runtime and those that spend a long time performing computations with no I/O. The former thing sounds like self-sabotage by the user, but unfortunately the latter thing probably requires the user to spend some effort on designing their program.

> The problem isn't unique to Rust either, and it's going to exist in any cooperative multitasking system: if you rely on tasks to yield by themselves, some won't.

If we leave the obvious defects in our software, we will continue running software with obvious defects in it, yes.


>This sort of design solves the problem for any case of "My task that is performing I/O through my runtime is starving my other tasks."

Yeah, there's your misunderstanding, you've got it backwards. The problem being described occurs when I/O isn't happening because it isn't needed, there isn't a problem when I/O does need to happen.

Think of buffered reading of a file, maybe a small one that fully fits into the buffer, and reading it one byte at a time. Reading the first byte will block and go through epoll/io_uring/kqueue to fill the buffer and other tasks can run, but subsequent calls won't and they can return immediately without ever needing to touch the poller. Or maybe it's waiting on a channel in a loop, but the producer of that channel pushed more content onto it before the consumer was done so no blocking is needed.

You can solve this by never writing tasks that can take "a lot" of time, or "continue", whatever that means, but that's pretty inefficient in its own right. If my theoretical file reading task is explicitly yielding to the runtime on every byte by calling yield(), it is going to be very slow. You're not going to go through io_uring for every single byte of a file individually when running "while next_byte = async_read_next_byte(file) {}" code in any language if you have heap memory available to buffer it.


Reading from a socket, as in the linked post, is an example of not performing I/O? I'm not familiar with tokio so I did not know that it maintained buffers in userspace and filled them before the user called read(), but this is unimportant, it could still have read() yield and return the contents of the buffer.

I assumed that users would issue reads of like megabytes at a time and usually receive less. Does the example of reading from a socket in the blog post presuppose a gigabyte-sized buffer? It sounds like a bigger problem with the program is the per-connection memory overhead in that case.

The proposal is obviously not to yield 1 million times before returning a 1 meg buffer or to call read(2) passing a buffer length of 1, is this trolling? The proposal is also not some imaginary pie-in-the-sky idea; it's currently trading millions of dollars of derivatives daily on a single thread.


You're confusing IO not happening because it's not needed with IO never happening. Just because a method can perform IO doesn't mean it actually does every time you call it. If I call async_read(N) for the next N bytes, that isn't necessarily going to touch the IO driver. If your task can make progress without polling, it doesn't need to poll.

>I'm not familiar with tokio so I did not know that it maintained buffers in userspace

Most async runtimes are going to do buffering on some level, for efficiency if nothing else. It's not strictly required but you've had an unusual experience if you've never seen buffering.

>filled them before the user called read()

Where did you get this idea? Since you seem to be quick to accuse others of it, this does seem like trolling. At the very least it's completely out of nowhere.

>it could still have read() yield and return the contents of the buffer.

If I call a read_one_byte, read_line, or read(N) method and it returns data past the end of the requested content, that would be a problem.

>I assumed that users would issue reads of like megabytes at a time and usually receive less.

Reading from a channel is the other easy example, if files were hard to follow. The channel read might be implemented as a quick atomic check to see if something is available and consume it, only yielding to the runtime if it needs to wait. If a producer on the other end is producing things faster than the consumer can consume them, the consuming task will never yield. You can implement a channel read method that always yields, but again, that'd be slow.

>The proposal is obviously not to yield 1 million times before returning a 1 meg buffer, is this trolling

No, giving an illustrative example is not trolling, even if I kept the numbers simple to make it easy to follow. But your flailing about with the idea of requiring gigabyte-sized buffers probably is.


> You're confusing IO not happening because it's not needed with IO never happening. Just because a method can perform IO doesn't mean it actually does every time you call it. If I call async_read(N) for the next N bytes, that isn't necessarily going to touch the IO driver.

Maybe you can read the linked post again? The problem in the example in the post is that data keeps coming from the network. If you were to strace the program, you would see it calling read(2) repeatedly. The runtime chooses to starve all other tasks as long as these reads return more than 0 bytes. This is obviously not the only option available.

I apologize for charitably assuming that you were correct in the rest of my reply and attempting to fill in the necessary circumstances which would have made you correct.


Actually, no, I misread it trying to make sense of what you were posting so this post is edited.

This is just mundane non-blocking sockets. If the socket never needs to block, it won't yield. Why go through epoll/uring unless it returns EWOULDBLOCK?


For io_uring all the reads go through io_uring and generally don't send back a result until some data is ready. So you'll receive a single stream of syscall results in which the results for all fds are interleaved, and you won't even be able to write code that has one task doing I/O starving other tasks. For epoll, polling the epoll instance is how you get notified of readiness for all the other fds too. But the important thing isn't to poll the socket that you know is ready, it's to yield to the runtime at all, so that other tasks can be resumed. Amusingly, upon reading the rest of the blog post I discovered that this is exactly what tokio does. It just always yields after a certain number of operations that could yield. It doesn't implement preemption.


Honestly I assumed you had read the article and were just confused about how tokio was pretending to have preemption. Now you reveal you hadn't read the article, so I'm confused about you in general; it seems like a waste of time. But I'm glad you're at least on the same page now, about how checking if something is ready and yielding to the runtime are separate things.


You're in a reply chain that began with another user claiming that tokio implements preemption by shooting signals at itself.

> But I'm glad you're at least on the same page now, about how checking if something is ready and yielding to the runtime are separate things.

I haven't ever said otherwise?


"Zero-cost abstractions" can be a confusing term and it is often misunderstood, but it has a precise meaning. "Zero-cost" doesn't mean that using the abstraction has no runtime cost, just that the abstraction itself causes no additional runtime cost over writing the equivalent code by hand.

These can also be quite narrow: Rc is a zero-cost abstraction for refcounting with both strong and weak references, with the counts allocated alongside the object on the heap. You cannot implement the same thing more efficiently, but you can implement something different-but-similar that is both faster and lighter than Rc. You can make a CheapRc that only has strong counts, which will be lighter and a tiny bit faster, or a SeparateRc that stores the counts separately on the heap, which offers cheaper conversions to/from Rc.


I am very aware of the definition of zero-cost.

We're talking about the comparison between using an abstraction vs not using an abstraction.

When I said "doesn't have a runtime cost", I meant "the abstraction doesn't have a runtime cost compared to not using the abstraction".

If you want your computer to do anything useful, then you have to write code, and that code has a runtime cost.

That runtime cost is unavoidable, it is a simple necessity of the computer doing useful work, regardless of whether you use an abstraction or not.

Whenever you create or use an abstraction, you do a cost-benefit analysis in your head: "does this abstraction provide enough value to justify the EXTRA cost of the abstraction?"

But if there is no extra cost, then the abstraction is free, it is truly zero cost, because the code needed to be written no matter what, and the abstraction is the same speed as not using the abstraction. So there is no cost-benefit analysis, because the abstraction is always worth it.


The way you used it in your parent comment didn't make it clear that you were using it properly, hence my clarification. I'm honestly still not sure you've got it right, because Rust abstractions, in general, are not zero-cost. Rust has some zero-cost abstractions in the standard library and Rust has made choices, like monomorphization for generics, that make writing zero-cost abstractions easier and more common in the ecosystem. But there's nothing in the language or compiler that forces all abstractions written in Rust to be free of extra runtime costs.


I never said that ALL abstractions in Rust are zero-cost, though the vast majority of them are, and you actually have to explicitly go out of your way to use non-zero-cost abstractions.


Are you sure about that?

>Rust embraces abstractions because Rust abstractions are zero-cost. So you can liberally create them and use them without paying a runtime cost.

>you never need to do a cost-benefit analysis in your head, abstractions are just always a good idea in Rust

Again though, and ignoring that, "zero-cost abstraction" can be very narrow and context specific, so you really don't need to go out of your way to find "costly" abstractions in Rust. As an example, if you have any uses of Rc that don't use weak references, then Rc is not zero-cost for those uses. This is rarely something to bother about, but rarely is not never, and it's going to be more common the more abstractions you roll yourself.


>but Tokio forces all of your async functions to be multi thread safe

While there are other runtimes that are always single-threaded, you can do it with tokio too. You can use a single-threaded tokio runtime and !Send tasks with LocalSet and spawn_local. There are a few rough edges, and the runtime internally uses atomics where a from-the-ground-up single-threaded runtime wouldn't need them, but it works perfectly fine, and I use single-threaded tokio event loops in my programs because the tokio ecosystem is broader.


You don't even need other runtimes for this. Tokio includes a single-threaded runtime and tools for dealing with tasks that aren't thread safe, like LocalSet and spawn_local, that don't require the future to be Send.


There's nothing buggy about a future that never yields because it can always make progress, but people prefer that a runtime doesn't let all other execution get starved by one operation. That makes it a problem that runtimes and schedulers work to solve, but not a bug that needs to be prevented at a language level. A runtime that doesn't solve it isn't buggy, but probably isn't friendly to use, like how Go used to have problems with tight loops and they put in changes to make them cause less starvation.


I think most of the stuff in this repo is too much, trying to force a square peg into a round hole, but the little things like Option and Either patch a hole in Go. I don't think I'd use them without them being in the stdlib, though, which is where such fundamental types belong. Those aren't even really functional, unless you count null pointers as necessarily imperative.

>... read some blog post about how functional programming and type theory will save the world, instead of actually being productive

I see the opposite side in Go a lot, where without testing or trying anything they dismiss everything they aren't already using right now as useless ivory tower academia, which is its own set of popular blog posts. Seems both sides have a lot of time to argue on the internet though, oddly the people who actually write code tend to be the productive ones regardless of philosophy.


Yeah, there are counterexamples, but the only way to know is to read the comments or source code of the function you're calling. (T, err) doesn't convey any useful information and, in the overwhelming majority of cases, err != nil means T is a meaningless default value that should be ignored or a null pointer.

By and large I think the stuff in this repo is too much and doesn't fit Go. I don't particularly want Go to pretend to be functional, but Either and Option at least would be nice to have in the stdlib and help prevent this exact issue where there are rare exceptions to normal practices. I don't see them getting widespread use without being part of the stdlib though. If Either/Option were common in Go but io.Reader was one of the few APIs returning (T, error), that would convey a lot more information.


> means T is a meaningless default value

Go Proverb #5: Make the zero value useful.

> that should be ignored or a null pointer

nil is the zero value of a pointer, so it should be made useful per the above, but it is also inherently useful even if you put no thought into it. It allows you to know that there is an absence of a value out of the box.

And this is actually why the vast majority of (T, error) cases in idiomatic code sees T be a pointer, despite the computational and programmatic downsides of using a pointer, so that nil can be returned when the value is not otherwise useful – exactly to ensure the value is as useful as possible, denoting the absence of a usable value.

If you read through idiomatic code, you'll notice that only when the underlying type is more meaningful is a pointer not used. Returning a slice is one such example. An empty set upon error is more meaningful than nil, usually. Another common instance is when 0 is meaningful, like in the aforementioned io.Reader interface. Idiomatically, one will always strive to return the most meaningful value they can.

> Either and Option at least would be nice to have in the stdlib

And if it were, then this Either wrapper in question would become useful as an overlay to it, as they would then share the same intent and meaning. But it does not match the current semantics of idiomatic Go code using the (T, error) pattern.

You can probably make it work, but code is about communicating ideas to other programmers. Either implies a dependence between variables. (T, error) has no such dependence. There is an impedance mismatch here which fails to properly communicate what is happening.


>Go Proverb #5: Make the zero value useful.

Yeah, it's a nice quip, but that's all it is. It sounds nice on first read to someone who doesn't program much. But it is inaccurate and not followed by Go, and is explicitly against the Google style guide.

The sophistry trying to paint a nil pointer as "useful" is just trying to defend a position you've dug yourself into in the process of this argument, so it doesn't really need to be addressed again.

>An empty set upon error is more meaningful than nil, usually.

This in particular is just a mistake in Go. Writing to a nil map panics, unlike appending to a nil slice, so people try to avoid ever returning nil maps.

>But it does not match the current semantics of idiomatic Go code using the (T, error) pattern.

But it does match how (T, error) is actually used the majority of the time. The impedance mismatch is that code that currently has the semantics of Either, which is the vast majority of idiomatic Go, needs to use (T, error).


> The sophistry trying to paint a nil pointer as "useful" is just trying to defend a position you've dug yourself into in the process of this argument

You are right. I concede nil is not useful. Therefore, we agree that (T, error) cannot exist. As we see in the style guide: "Returning a nil error is the idiomatic way to signal a successful operation that could otherwise fail." This means there is no way to check the error condition. One might be tempted to write `if err != nil`, but because nil is not useful that obviously won't work. That would make nil a useful value, just as I once thought – incorrectly, as you helpfully made me realize – a nil T would be.

And as the Go style guide indicates that you cannot use the T value without first touching the error value, which is, for all practical purposes, impossible since the error value may not be useful, there is just no way this pattern can be used in any actual program.

> But it does match how (T, error) is actually used the majority of the time.

Right. As you have pointed out – of which I was reluctant to admit to, but you said it enough times that it must be true! – (T, error) cannot be used. Period. Its values do not convey the useful information required to be useable. Either does, then, indeed, match how (T, error) is used most of the time... which is to say not at all!

You mentioned something about MarshalText in the slog package showing how an idiomatic function with errors might actually be written, but then we realized that there are multiple implementations with the same name. Which one were you referring to?


>You are right. I concede nil is not useful.

You say this sarcastically, but it is actually true. A nil pointer is not useful. Once you have determined that a pointer is nil, you have confirmed that the function returning it at all was a waste of both space and time. Though it's actually only half true: nil pointers are worse than useless, providing negative utility, because they allow invalid code to compile. A better design, which is also more efficient even considering the overhead of tagged unions, is to not return the pointer/value at all if it would be useless. Other languages allow for this; even C allows for it with manually tagged unions. Go is unusual, especially among modern languages, in not providing any mechanism for it, so people use what is available to emulate one.

For the rest of it, well, you've contorted yourself into some really interesting positions.


Exactly. (T, error) simply cannot be used in any real program. Like you say, if you try to use nil your program will crash, and error is asserted to be nil when there is no error, and you have to use that useless nil value in order to utilize T per the style guide, therefore your program will crash essentially all the time. It is impossible to write code that is valid if (T, error) is present.

It's not a question of what is or isn't better. That's off-topic. It's just a question of how we can actually deal with the situation using the tools that Go gives us. MarshalText no doubt contains the answers, but we aren't sure where to find it given the ambiguity. You went to all the trouble of looking it up to tell us about it, but now want to keep it a secret?

