While the implementation is straightforward enough, this post clearly shows me why I don’t like Rust.
So, the function accepts two values that implement the Mid trait. And there is a blanket implementation of Mid that applies to anything that can be added, subtracted, divided, etc. and can be derived from a u8. So now this makes every numeric type a Mid. (Which, as the author states, is only an example, I guess.)
I can read and understand how this works here. But when I’m in a larger code base, I cannot find what trait implementation made what possible.
With all the trait bounds and implicit conversions happening all around, and macros implementing traits on things, it becomes a shit stew in a hurry.
And everyone always reaches for the most clever way to do things. I mean, does Mid need to be a trait? I guess it works beautifully, yeah. I’d prefer to pass a midpoint calculator function instead.
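Something like this is what I mean (a rough sketch with an invented signature, not the article’s):

```
// Assumes l < r, pred is monotonic, pred(l) == false, pred(r) == true,
// and that mid(l, r) returns a value strictly between l and r.
fn search(
    pred: impl Fn(u64) -> bool,
    mid: impl Fn(u64, u64) -> u64, // caller supplies the midpoint calculator
    mut l: u64,
    mut r: u64,
) -> u64 {
    while r - l > 1 {
        let m = mid(l, r);
        if pred(m) { r = m } else { l = m }
    }
    r // first index where pred is true
}
```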
I guess this turned into a rant. I actually really like Rust the language but I think I don’t like Rust codebases basically.
It is the least readable language I’ve seen not because of the syntax but because of all the cleverness going on around the traits and macros.
Having the function take a midpoint calculator is fine and there are some advantages to it (e.g. it makes it easy to turn into a linear search if you want to do that for some reason). However since writing a midpoint function isn’t always totally straightforward (an incorrectly implemented midpoint will cause correctness issues or even infinite loops), I think it’s nice to have it as a trait. That way there’s little risk of someone using a bad `|x,y| (x + y) / 2` implementation out of laziness.
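For example (the classic overflow hazard, not from the article):

```
// The lazy midpoint: (l + r) can overflow even when the result fits.
fn mid_naive(l: usize, r: usize) -> usize {
    (l + r) / 2
}

// The safe version.
fn mid_safe(l: usize, r: usize) -> usize {
    l + (r - l) / 2
}

fn main() {
    let (l, r) = (usize::MAX - 2, usize::MAX);
    // mid_naive(l, r) panics in debug builds and wraps around in release.
    assert_eq!(mid_safe(l, r), usize::MAX - 1);
}
```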
However I 100% agree with you that traits make it really hard to figure out what actual function you’re calling. I wish there was a way to see something like “in your code, this generic function is called with types T1, T2, etc.” and see the trait implementations for those types as well. I’ve actually wanted to contribute this to rust-analyzer for a while.
I think your parent’s point is something along the lines of “method lookup being complex doesn’t inherently limit the popularity of a programming language.”
I like Rust a lot, but I strongly agree. “Advanced” Rust becomes as inscrutable as C++ template crap.
I really wish Rust had a way to define “template style” generic functions rather than “trait style”. Some things simply can not be done gracefully with trait spaghetti.
Rust macro hell is even worse than Rust trait spaghetti. I regularly refuse to use crates because they’re macro based and there’s a no macro alternative.
I would assert it is actually worse than C++, because at least templates and constexpr execution still look like plain C++, while macros in Rust, with two additional levels of syntax, add even more complexity.
Yeah, trait-heavy code is one of Rust's weaker points in practice. I find tower and some of the database crates to be some of the worst offenders: even knowing the language pretty well, I've often gotten lost in a maze of projections and blanket impls when an error occurs. There's no easy way to trace the implementation back to the intended behavior, to compare the expected and actual outcomes.
One of my annoyances is something like `impl<T, U> Foo for T where T: Deref<Target = U>, U: Foo`. I.e., a Foo impl for anything that can deref to another type which impls Foo, e.g. an Arc, Box, etc. So now you impl that to support, say, Box - well, now if something fails to impl Foo, you'll instead get an error about it not being a Box (IIRC, it's been a while). My memory is old, so forgive me, but the error says something to the effect of your `T` type not being `Box<T>`, instead of simply saying that your T doesn't impl Foo.
I ran into this confusing error so many times that I just stopped doing blanket deref impls and instead impl it for `Box<T>` directly. Doing that avoids the weird error entirely.
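Roughly, sketched from memory (not the exact code):

```
use std::sync::Arc;

trait Foo {
    fn foo(&self);
}

// The blanket version I stopped writing (it can't coexist with the
// direct impls below, since Box and Arc both implement Deref):
//
//     impl<T, U> Foo for T
//     where
//         T: std::ops::Deref<Target = U>,
//         U: Foo,
//     {
//         fn foo(&self) { (**self).foo() }
//     }

// What I do instead: implement directly for the pointer types I care about.
impl<T: Foo + ?Sized> Foo for Box<T> {
    fn foo(&self) {
        (**self).foo()
    }
}

impl<T: Foo + ?Sized> Foo for Arc<T> {
    fn foo(&self) {
        (**self).foo()
    }
}
```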
Still, I love Rust. However it, like any lang, is not perfect. Nor should it be.
There already exist many affine-typed languages with fewer features than Rust. When you phrase it like "Rust but with weaker type safety and higher-overhead abstractions" one might see why this premise fails to gain much traction, though.
I very much do not want to go back to the days of "read up to 4 bytes off the wire to see how big the next chunk is. Oh I only read 3 bytes? Guess I'd better save those 3 + the fact I need to read 1 more, then drop back out to the event loop"
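All the bookkeeping that implies, roughly (a sketch, names invented):

```
// Hand-rolled state for resuming a partial read of a 4-byte length prefix.
struct LenPrefix {
    buf: [u8; 4],  // the prefix bytes assembled so far
    filled: usize, // how many of the 4 we already have
}

impl LenPrefix {
    /// Feed whatever bytes arrived; returns (bytes consumed, Some(len) once
    /// all 4 are in). Until then, drop back out to the event loop.
    fn advance(&mut self, input: &[u8]) -> (usize, Option<u32>) {
        let n = input.len().min(4 - self.filled);
        self.buf[self.filled..self.filled + n].copy_from_slice(&input[..n]);
        self.filled += n;
        let len = (self.filled == 4).then(|| u32::from_be_bytes(self.buf));
        (n, len)
    }
}
```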
> I very much do not want to go back to the days of ....
Not what I recommend. The C library does much better than that.
In asynchronous programming you can only deal with the data you have, so you buffer it. I have not counted bytes like that, for the purposes of reading from a file, in decades.
But that's where non-blocking leads; it can't block until it reads all the bytes you asked for, that has to be handled somehow.
A fancy C library could buffer the partial read for you rather than you needing to do it, and it could maybe even deal with turning your function into the state machine required to resume it at the partial read.
But then you look around and realise you've created another async/await.
I've been doing asynchronous programming for decades, and async/await is a notable improvement for many use cases. Event loop + callback hell, or condition variable/mutex/semaphore approaches are great in some cases, but they suck in others.
> I've been doing asynchronous programming for decades, and async/await is a notable improvement for many use cases.
I think that goes to taste.
> Event loop + callback hell
That was always due to undisciplined programming. Async/await offers up other hells.
I think it makes some sense in memory-managed systems. But in a system with static compile-time management like Rust, async/await forces all sorts of compromises. Mostly these fall on library authors, but it is its own hellscape. `Pin`?
I believe this is a reference to Go's approach, where code can be written in a linear, straightforward fashion with normal function calls, but if they are syscalls, they become async automatically.
I don't think it's appropriate for the level of coding that Rust is targeting... But... It would be nice.
I suppose that makes sense given OP's definition of "no dependence on run times", but I think I'd have to agree that that approach probably wouldn't work well for the niche Rust is trying to target.
That's technically POSIX, not standard C, isn't it?
But in any case, I'm pretty sure Rust has supported the nonblocking functionality you want for quite a while now (maybe since 1.1.0)? You'll need to piece it together yourself and use the libc crate since it's a platform-specific thing, but it doesn't seem too horrendous. Ultimately it comes down to the fact that the fundamental method of the std::io::Read trait is
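```
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize>;
```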
which is perfectly compatible with non-blocking reads. For example, std::fs::File implements Read and std::os::fd::FromRawFd, so if you can get a non-blocking file descriptor you should have non-blocking reads for files.
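A rough sketch of what that might look like on Unix (untested; assumes the libc crate as a dependency):

```
use std::fs::File;
use std::io::{ErrorKind, Read};
use std::os::fd::FromRawFd;

fn main() -> std::io::Result<()> {
    let fd = 0; // e.g. stdin; any raw file descriptor works
    // Flip the descriptor into non-blocking mode via the libc crate.
    unsafe {
        let flags = libc::fcntl(fd, libc::F_GETFL);
        libc::fcntl(fd, libc::F_SETFL, flags | libc::O_NONBLOCK);
    }
    // Safety: we take ownership of `fd` from here on.
    let mut file = unsafe { File::from_raw_fd(fd) };

    let mut buf = [0u8; 4096];
    match file.read(&mut buf) {
        Ok(0) => println!("EOF"),
        Ok(n) => println!("read {n} bytes"),
        // Nothing available yet: go back to the event loop and retry later.
        Err(e) if e.kind() == ErrorKind::WouldBlock => println!("would block"),
        Err(e) => return Err(e),
    }
    Ok(())
}
```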
> It is the least readable language I’ve seen not because of the syntax but because of all the cleverness going on around the traits and macros.
Definitely agree about macros. A lot of crates use them extensively, but they are inherently less intuitive to reason about, tend to be less well documented and produce less helpful error messages when you use them incorrectly.
I think a lot of trait misuse probably stems from the fact that structs kind of look like classes and traits kind of look like interfaces, so people may try to do Java-style OOP with them.
(As an aside, the other thing that makes it difficult for me to grok Rust code is when everything is nested in 3+ levels of smart pointers.)
The modifications that this article makes to the original version [1,2] are not a good idea. The modified version in this article cannot properly be used to search an array, for example (due to the extra "sanity checks", which would cause out-of-bounds errors). The API is also made more complex, returning a pair of options of indices rather than a pair of indices. You really just want the pair of indices!
If you do sanity checks in Rust then you should panic instead of returning options. However, in this case you cannot do the sanity checks, because the contract of the function is that it will only call the predicate strictly between l and r (i.e., only on indices returned by mid). Further, given the initial sanity checks, the sanity checks inside the loop body are redundant, as the loop invariant ensures that those checks never fail.
> You want one that doesn't require a consultation with a manual to remember if it returns the first value that's greater or the last value that's lower. ... No more needing to remember "does this return the highest index whose value is lower than the value I'm searching for, or the lowest index whose value is greater than the value I'm searching for?". This one returns both, so you can use whichever you need for your use case.
This (italicized part) feels like a weird requirement. Function/API documentation (which this author's code lacks) is important, and I don't think I ever write a call to an unfamiliar function without checking the docs for it. Assuming that you know what a function does based on the name alone is a recipe for buggy code.
This is also a kind of unusual formulation of binary search which searches across a monotonic function space rather than a sequence. I get that this is more general and can be used to implement the latter, but a less general interface that only works on sequences is IMO more intuitive: in that case, I expect it to return either the index of the target number, or the index where it should be inserted if not found. It's somewhat telling that the author doesn't offer any code examples demonstrating how to use this wonderful general function to solve the common use case of finding an element in a sorted vector, or inserting a new one into the correct location.
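For comparison, that's exactly how the standard library's slice interface behaves:

```
fn main() {
    let v = [1, 3, 5, 7];
    // Found: Ok(index of the element).
    assert_eq!(v.binary_search(&5), Ok(2));
    // Not found: Err(index where it could be inserted to keep the slice sorted).
    assert_eq!(v.binary_search(&4), Err(2));
}
```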
The interface is nicer for keyframe interpolation, e.g. the problem where you have a list of (X,Y) pairs and you want to find the two pairs that straddle a given x and interpolate between them. In that case, `search(|&i| v[i].0 >= x, 0, v.len() - 1)` handles the three cases "before the beginning", "after the end", and "between two pairs" fairly naturally.
The code itself looks very straightforward, two preconditions and a loop. I don't see any significant difference in terms of complexity compared to any other low-level systems programming language. It's obviously a bit more verbose than python or haskell, but those are not targeting the same audience.
It looks extremely simple to me. I definitely have never seen a simpler binary search implementation (though in fairness it punts the midpoint calculation). Why do you think it's a joke?
I really like Rust, and what it enables, but this seems like a wonderful counterexample.
For a practical programmer, bsearch is on a sorted array. Implement that, with the necessary type abstractions. So: build the objects into an array of references, sort that, then bsearch it already. If you need to keep that array around, do that. Way easier to look at the code and be reasonably confident that it's ok.
Anything else is fobbing off the abstractions to something not-nearly-obvious (e.g., a 'Mid' trait?). I disagree that this is a defect in Rust.
But I suppose that enabling this kind of misabstraction may indeed be a defect in Rust.
Sounds an awful lot like Fermat: "the truly marvellous proof which the margin is too narrow to contain".
> This one also makes that assumption, but in some cases if that assumption is violated it returns an error rather than silently returning a meaningless [...]
Sometimes returning an error and sometimes returning an incorrect value doesn't sound like a binary search that is "absolutely reliable".
It seems to me that it's impossible for a binary search to detect all errors. The precondition is that the list must be sorted, and to fully verify that assumption would be O(n). That would make your binary search no better than a linear search. So I can forgive the best-effort error checking.
In theory, at least, that precondition could be guaranteed by a newtype like SortedList without requiring any linear scan; that would work at least for most cases where lists are sorted by an algorithm. It wouldn't necessarily cover custom data that is constructed sorted or maintains its sorting outside the algorithm, but you could perhaps have an unsafe constructor for your SortedList type. That should avoid a linear scan...
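A minimal sketch of what that newtype could look like (all names invented):

```
/// A vector whose elements are guaranteed to be sorted.
pub struct SortedVec<T: Ord>(Vec<T>);

impl<T: Ord> SortedVec<T> {
    /// Sorts on construction, so the invariant always holds.
    pub fn from_vec(mut v: Vec<T>) -> Self {
        v.sort();
        SortedVec(v)
    }

    /// Caller promises `v` is already sorted; no O(n) scan needed.
    pub unsafe fn from_sorted_unchecked(v: Vec<T>) -> Self {
        SortedVec(v)
    }

    /// Binary search is safe to offer here, since the invariant holds.
    pub fn binary_search(&self, x: &T) -> Result<usize, usize> {
        self.0.binary_search(x)
    }
}
```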
I am not sure I see a use case where that precondition is anything but mandatory (and at least a requirement in app space for the related data), if you're intending to make practical use of the algo.
If there's a risk that your input is unstable in ordering, then reaching for bsearch seems like a "premature optimization", and you'd be better off with a straightforward linear scan (or your language of choice's BST/heap/RBT equivalent) until requirements change, or the data is preprocessed upstream in a way that makes sense for the contexts wanting to use it.
Binary search only works if you suppose the data is well structured. If you perform the search and in the meantime you spot a passing inconsistency, you can interrupt everything and throw an error. But other than that, checking the entire structure negates the entire point of the algorithm
Agree with basically everything here, plus I would add that this method also has weird ergonomics. For example, there is technically a midpoint between these two: `0b10u64.mid(&-20)`. But you get a weird compiler error ("unsigned values cannot be negated") which will be confusing to debug when you're just trying to find a midpoint between two numbers.
What I don't quite understand is why the opposite (`-20.mid(&0b10u64)`) still errors out with the same exact "unsigned values cannot be negated" error. Why is the unsigned type, and not the -20, taking precedence here? Doing `-20i64.mid(&0b10u64)` I get the expected error: "expected `&i64`, found `&u64`."
This is a case where less abstraction would make the bug a lot more obvious. Binary search is simple enough that I don't think a generic implementation of it is worth introducing the possibilities of errors at a distance like that one.
I would normally agree, but the author went out of their way to check for error conditions in their binary search. I think a silent infinite loop is antithetical to that.
Oh god. A trait to average two numbers? As a separate function that nobody will ever find in a larger codebase since it can live anywhere? I am yet again convinced that C is better. Thanks.
> The only case where it returns (None, None) are those when it detects non-monotonicity in your predicate (returning false when a lower input returned true). So if you know your predicate is monotonic, feel free to hit this case with a panic!("should never happen").
All good points, thank you. A range parameter would be especially good because I think you could use RangeInclusive to emphasize that it's an inclusive range, which is not obvious from the type signature.
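Something like this hypothetical signature tweak (return type guessed from the article's description):

```
use std::ops::RangeInclusive;

// The inclusive bounds are now explicit in the type itself.
fn search(pred: impl Fn(u64) -> bool, range: RangeInclusive<u64>) -> (Option<u64>, Option<u64>) {
    let (l, r) = (*range.start(), *range.end());
    // ... same algorithm as before, searching pred over l..=r ...
    let _ = (pred, l, r);
    todo!()
}
```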
You can omit the runtime sorted checks via proper types. Only accept sorted arrays or lists, and enforce that with such a class. This would make it convenient, faster and safer. It would also tie the sortedness guarantee to the object itself, rather than taking it as an extra argument.
In theory I like this, but it raises the question of how it would work in practice; in practice people create non-sorted lists and then sort them in-place. Maybe you could have the sort method return a sorted-token that certifies that a given array is sorted. And the token would read-only borrow the array, to ensure no one can modify the array as long as the token is valid.
Is it even possible for a rust function to take a mutable borrow as a parameter, decay it to an immutable borrow and store that somewhere, such that other people can borrow the parameter after the function has returned?
> Is it even possible for a rust function to take a mutable borrow as a parameter, decay it to an immutable borrow and store that somewhere, such that other people can borrow the parameter after the function has returned?
I think this will stop working once the compiler can no longer see that the lifetimes work out, but I think that makes sense - there's a mutable object underlying the mutable borrow, and the compiler needs to verify that no one else can get another mutable borrow while the derived immutable borrows are live.
You might also want to add methods on Sorted for creating an Option<Sorted> by verifying the sortedness of an immutable reference, and a method that copies an array and sorts it (that one would need to return a struct of a Box for managing the lifetime plus a Sorted pointing into the box, or something, I guess; probably need to use Pin?). Something like the sketch below.
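Putting the borrowing pieces together (all names and signatures invented):

```
/// Hypothetical proof-of-sortedness token; it holds a shared borrow,
/// so the underlying data can't be mutated while the token is alive.
pub struct Sorted<'a, T: Ord>(&'a [T]);

impl<'a, T: Ord> Sorted<'a, T> {
    /// Sorts in place, then hands back a token; the &mut "decays" into
    /// the token's shared borrow. Note that the original binding stays
    /// borrowed for as long as the token lives, so access goes through
    /// the token.
    pub fn sort(v: &'a mut [T]) -> Self {
        v.sort();
        Sorted(v)
    }

    /// Verifies sortedness of an existing shared borrow; O(n), but explicit.
    pub fn check(v: &'a [T]) -> Option<Self> {
        v.windows(2).all(|w| w[0] <= w[1]).then(|| Sorted(v))
    }
}
```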
---
On a completely separate note, this will actually not work for the binary search in the article in question, as it uses binary search on a predicate. Having a sorted array doesn't guarantee that the predicate will be monotonic.
Wow, this binary search implementation is sure to reduce code duplication. It's so elegant! Can I try?! Here's my implementation of a rust program that can do anything!
```
pub trait Program {
    fn run(self);
}

pub fn run_program<T: Program>(program: T) {
    program.run();
}
```
Boom. Just look at the code. How could it be simpler? This works for any struct that implements the `Program` trait.
/s
While the code in the original post is elegant, and cool from a "hey look what I can do!" perspective, I really feel like at some point we lost the plot on templates with Rust. When you try to finagle an implementation of a general algorithm, you just end up with a solution that works poorly in all cases, instead of one that works well in most cases.
I've written a lot of Rust, and I definitely know what it's like to get high off of your own template fumes. I once wrote a function with five generics and two trait constraints and I thought it was the coolest shit I'd ever done. But really, I could have just written, like, three discrete functions, and it would have been clearer and smaller.
If someone used this binary search with anything other than the Mid implementation shown in the post, that would categorically be bad code from someone who was trying to be too clever. If someone has a unique binary search to do, they should write it themselves. That way, the whole algorithm will be on display, and it will be easier to see ways to make it simpler or improve it in the context of the unique search operation. It will also be much clearer.