> These algorithms use the data received from userspace in order to index into a lot of arrays and thus benefit from Rust's memory safety.
If everything is done with arrays and indices (apparently so, from looking at the code: https://gitlab.collabora.com/dwlsalmeida/for-upstream/-/blob...), it seems like Rust's borrow checker doesn't really help at all, and the only thing Rust really does for you is bounds checks on arrays (with additional runtime overhead)... So I'm not sure how this can really improve the state of things compared to C with the equivalent bounds checks.
> So I'm not sure how this can really improve the state of things compared to C with the equivalent bounds checks.
The simplest answer here is that a compiler-introduced bounds check is almost always better than a human one. Humans make bounds errors, compilers generally don't.
The longer answer is that the current state of the code does not guarantee its future state. The fact that bounds checks constitute the current majority of safety guardrails does not mean that future refactors won't benefit from Rust's temporal memory safety guarantees. Or more abstractly: it's easier to perform safe refactors when your safety properties compose natively, rather than having to bolt another layer of checks onto a pre-existing language that doesn't support them natively.
Edit: Forgot to mention: another benefit of bounds checking in the language itself is optimization: when humans bounds-check, the compiler needs to recognize human-written patterns in order to safely remove or merge redundant bounds checks. When the language specifies its own bounds checks, the compiler knows exactly what they'll look like and can optimize accordingly. Modern optimizing compilers are very good at detecting human-written bounds checks, but a fully compiler-controlled optimization is going to beat a human-augmented optimization >95% of the time.
The bounds checks in Rust are implicit (i.e. part of the std implementation), and get removed by the compiler if they're unnecessary. I think that's a pretty great improvement over the state of things in C.
And if you are convinced you don't need a bounds check and the compiler does not remove it you can explicitly remove the bounds check, provided you mark the access as unsafe. So Rust is a strict improvement over C in this regard.
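To make that concrete, here's a minimal sketch of the three flavors of slice access (hypothetical function, not from the driver code):

    fn sum_first_three(data: &[u32]) -> u32 {
        // Checked indexing: the implicit bounds check; panics if out of bounds.
        let a = data[0];
        // Fallible access: returns an Option instead of panicking.
        let b = data.get(1).copied().unwrap_or(0);
        // Unchecked access: no bounds check emitted at all.
        // SAFETY: guarded by the explicit length test just before it.
        let c = if data.len() > 2 { unsafe { *data.get_unchecked(2) } } else { 0 };
        a + b + c
    }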
shnatsel wrote a post on bounds checking performance implications in Rust. Money quote:
> The real-world performance impact of bounds checks is surprisingly low.
> The greatest impact I’ve ever seen on real-world code from removing bounds checks alone was *15%,* but the typical gains are in *1% to 3% range,* and even that only happens in code that does a lot of number crunching.
> You can occasionally see greater impact (as we’ll see soon!) if removing bounds checks allows the compiler to perform other optimizations.
> Still, performance of code that’s not doing large amounts of number crunching will probably [not be impacted by bounds checks](https://blog.readyset.io/bounds-checks/) at all.
Through value tracking. It's actually LLVM that does this, GCC probably does it as well, so in theory explicit bounds checks in regular C code would also be removed by the compiler.
How it works exactly I don't know, and apparently it's so complex that it requires over 9000 lines of C++ to express.
The idea is pretty simple. You can build a list of known facts based on control flow, explicit __builtin_assumes, and undefined behavior relations. For example, if you've got this code:
    if (x < N) {
        // In this block, we know that x < N
    } else {
        // ... and in this block we know that x >= N!
    }
And on top of that, we can do some basic algebra. If we know that x < N and N < 5, then we can infer that x < 5. So if we see a comparison x < 5, we can then rewrite that to true.
> and apparently it's so complex that it requires over 9000 lines of C++ to express
The two main reasons for that are that a) there are a lot of rules covering cases like "we know the result of count_leading_zeroes can be no more than the number of bits in an integer" and so forth, and b) this is doing a lot more logic than just tracking integer comparisons: there's tracking of known bits of integers, maximum possible values, floating-point comparisons, pointer object references.
> And on top of that, we can do some basic algebra. If we know that x < N and N < 5, then we can infer that x < 5. So if we see a comparison x < 5, we can then rewrite that to true.
Even better: if `x` and `N` are integers, then we can infer that `x` < 4. :)
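To illustrate with a hedged Rust sketch (hypothetical function; the folding happens in the optimizer, not the language):

    // Inside the `if`, the optimizer knows x < n and n < 5, so it can
    // infer x < 5 (in fact x <= 3 for unsigned integers) and fold the
    // inner comparison to a constant, deleting the branch entirely.
    fn example(x: u32, n: u32) -> bool {
        if n < 5 && x < n {
            x < 5 // always true here
        } else {
            false
        }
    }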
My case is a bit outside that, so I don't think the compiler can deduce it. I have a file format which tells me the expected number of fields for a category, and I throw an error & abort if the number is not exactly that.
Also, these data structure fields are always passed in as const variables, so they are never modified (making them "sealed" in a sense), hence I don't need to bounds-check the arrays and vectors storing them.
That sounds trivial enough that the compiler would remove the bounds checks, assuming I'm understanding correctly that you have a condition that validates the number of fields at some point before an invalid access would occur.
But if it's possible for someone to muck with the file contents and lie about the number of fields which would cause a bounds error, that's exactly what bounds checking is supposed to avoid. So either bounds checks will be removed, or they're necessary.
I think it won't be able to because the creation of these data structures and consuming them is 3 files apart.
> But if it's possible for someone to muck with the file contents and lie about the number of fields.
You can't. You can say you'll have 7, but provide 8. But as soon as I encounter the 8th one during parsing, everything aborts. Same for saying 7 and providing 6. If the file ends after parsing the 6th one, I say there's an error in your file and abort. Everything has to check out and be sane for the run to be able to start. Otherwise you'll get file format errors all day.
The rest of the pipeline is completely unattended. It's bona fide number crunching (material simulation, to be exact), so speed is of the essence. We're talking about >1.5 million iterations per second per core.
> I think it won't be able to because the creation of these data structures and consuming them is 3 files apart.
Strictly speaking I don't think the distance between creation and consumption matters. It all comes down to what the compiler is able to prove at the site where the bounds check may go.
For example, if you're iterating over a Vec using `for i in 0..vec.len() { ... }` then the amount of code between the creation and consumption of that Vec doesn't matter, as the compiler has all the information it needs to eliminate the bounds check right there.
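A small sketch of both shapes, assuming a simple element-wise loop:

    // Index-based: the range proves i < v.len(), so the check in v[i]
    // is typically elided.
    fn scale_indexed(v: &mut Vec<u32>) {
        for i in 0..v.len() {
            v[i] *= 2;
        }
    }

    // Iterator-based: there's no index at all, hence no check to elide.
    fn scale_iter(v: &mut Vec<u32>) {
        for x in v.iter_mut() {
            *x *= 2;
        }
    }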
If it's a vector which you basically iterate over, yes. However, thinking about what I developed, I have offset- or formula-determined indices that I hit constantly, and not strictly in a loop. Those might prove harder. I need to implement these and see what the compiler(s) do in those cases.
The code I have written is 3D materials software which works on >(3000x3000) matrices, and I do a lot of tricks with what I get from them. However, since everything creating them is validated during creation, nothing breaks and nothing requires checks, because most of the data is read-only (enforced by const correctness throughout the code).
> However, thinking about what I developed, I have offset- or formula-determined indices that I hit constantly, and not strictly in a loop. Those might prove harder.
I think at that point it'll come down to the compiler's value range analysis as well as how other parts of the program affect inlining/etc. Hard to say exactly what will happen.
>> So I'm not sure how this can really improve the state of things compared to C *with the equivalent bounds checks*. [emphasis added]
As with most things Rust, you can do the same thing in C, but that requires extra effort and more source code. If Rust adds bounds checking for you automatically, that's a step up from C.
If you can ensure that bounds checks are not necessary (either by construction, because it's a statically sized array, or by runtime check, because you do a length check once at runtime), then doing those same things will tell rustc enough to know that bounds checks aren't needed[1][2]. If you think you don't need bounds checks but can't communicate that in code, such as with an assertion (or if rustc had a bug that misses those checks -- unlikely, but it could happen), then yes, you'll end up with bounds checks unless you use get_unchecked in an unsafe block.
I'm failing to see how this is an onerous difference.
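For example, a sketch of the assertion pattern (whether the later checks actually disappear is up to the optimizer, but this shape is well understood):

    // One up-front length check teaches the optimizer that the four
    // constant indices below are in bounds, so their individual checks
    // can be merged into (or subsumed by) the assert.
    fn word_le(data: &[u8]) -> u32 {
        assert!(data.len() >= 4);
        (data[0] as u32)
            | ((data[1] as u32) << 8)
            | ((data[2] as u32) << 16)
            | ((data[3] as u32) << 24)
    }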
Who would you rather trust to do bounds checking: computers, which are zealously good at doing what they are told to the point of absurdity, or humans, who are notoriously bad at following rigorous procedures? I mean, we've seen from several other fields that the only way to get the safety standards of human procedures up is to introduce checklists and get people to rigorously follow them [1].
If you want to elide bounds checks for performance reasons, which is easier: manually verifying for yourself that every single array access is guarded by a bounds check somewhere and ensuring that no subsequent code changes break this verification, or getting the compiler to prove for you that every bounds check can be safely elided?
[1] And of course we still have several issues in fields like medicine where practitioners refuse to adopt this methodology because they find checklists to be an insult to their intelligence.
> getting the compiler to prove for you that every bounds check can be safely elided?
I'd prefer to delegate that to the compiler if it can do it for the piece of code at hand. I've written about a case where it'd be very hard for a compiler to eliminate a bounds check because the guarantees are made elsewhere in the code.
On the other hand, I'd rather add my bounds checks voluntarily (it's very simple with C++ vectors, for example: use ".at()" instead of "[]", that's all), because I generally design my code in a way which doesn't need bounds checks, by failing hard and early at the places where I build/fill the arrays/vectors and where they're prone to malformation. So you need to be well-formed to pass these checks, and these data structures are never modified further down the pipe. If they are modified, they'll be bounds-checked, of course.
What I'm saying is, I'm not naive enough to believe that I'm perfect, but I'm not naive enough to believe that the compiler is perfect, either. So I do my part, and leave the parts I can't be sure about to the compiler.
I'm not a hard-liner. I just want finer control over my code, and I take full responsibility if it crashes and burns in a way it shouldn't, so I plan and implement accordingly.
> If you don't need to do bounds checking, and do it anyway, then that's a step down from C.
The Rust compiler tries to optimize away unnecessary bounds checks.
In practice, it works well. The real-world cost of Rust bounds checking isn't very significant in most benchmarks, aside from some synthetic micro-benchmarks designed to emphasize the issue.
If you come across a hot loop in Rust where bounds checking is an actual overhead, you can manually optimize it out if you so desire. It's important to really check first, though, because it's often surprising how little difference it makes, or it may have been optimized out already.
But isn't the point that the number of times a C programmer thought they didn't need bounds checks and the number of times they actually didn't are very different? Also off-by-one errors and such. Rust won't let you make these mistakes.
Very little Rust code actually does all the safety checks that you would expect a debug build of that same program to do, especially in the kernel.
You can write safe Rust (check the Option<T> returned by vec.get(i)), but code like `p[0] = update_prob(d[0].into(), p[0].into()) as u8;` (https://gitlab.collabora.com/dwlsalmeida/for-upstream/-/comm...) can panic at three different places. Such a panic would become a kernel oops, which wouldn't be the end of the world, but it would probably kill whatever program was trying to decode video. With additional optimisation options, the bounds checking may even be omitted entirely.
Rust does generate more accurate bounds checking warnings thanks to all the metadata it has, but that should not be solely relied upon. Rust will still let you make those mistakes, just only sometimes, not routinely as in old C or C++.
I think it's important to know the difference, because feeling invulnerable to these bugs may lead you to write buggy code because you stopped thinking about common C bugs entirely.
Also worthy of note is that because of a compiler bug, it's possible to leak memory and cause other weird memory bugs in perfectly safe Rust at the moment. It involves messing with lifetimes and semi-unsafe code so I doubt that bug would just sneak in, but the language doesn't make your code completely bullet proof.
Remember that some of the point here is that it _will_ reliably kill the program if that happens; in C, you might be silently reading or writing to the wrong address 3 times.
The `single_ref` field is a fixed-size array in both of the objects referenced in this line, so this line can't panic, and no bounds checks are involved (since the compiler sees that index < length at compile time and doesn't even need to emit one -- although I think it still does, and it's actually LLVM that gets rid of it).
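A minimal sketch of why that shape is check-free (hypothetical signature, not the actual driver code):

    // With a fixed-size array and a constant index, index < length is
    // known at compile time; the panic branch is trivially dead, so no
    // runtime check survives optimization.
    fn first(single_ref_like: &[u8; 4]) -> u8 {
        single_ref_like[0]
    }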
Causing memory leaks is possible in safe Rust even without any arcane invocations: you can construct a cycle of reference-counted Rc<T> objects. There's even a perfectly safe Box::leak in the standard library that gives you a &'static reference to any object by leaking it.
Preventing leaks is outside of the scope of Rust's safety system.
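Both of those fit in a few lines of entirely safe code, e.g.:

    use std::cell::RefCell;
    use std::rc::Rc;

    struct Node {
        next: RefCell<Option<Rc<Node>>>,
    }

    fn main() {
        // Two Rc nodes pointing at each other: the reference counts
        // never reach zero, so the cycle is never freed. Safe, leaked.
        let a = Rc::new(Node { next: RefCell::new(None) });
        let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
        *a.next.borrow_mut() = Some(b);

        // Or leak explicitly: Box::leak returns a &'static mut.
        let forever: &'static mut i32 = Box::leak(Box::new(42));
        *forever += 1;
    }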
I personally don't think that a programming language should save me from myself unless I explicitly ask for it. For every case, I first prove to myself that I can get away without bounds checking; if I'm in doubt, I put the check in and profile.
I think Rust is a nice language, but when it's presented as an "antidote" to "evils of C/C++", and is "indeed made to kill those" I lose my interest.
Also, while I think learning Rust is worthwhile, I think promoting Rust's barriers as saviors is funny. We say that Apple's against general-purpose computing with all its walled gardens; then Rust is against "general-purpose programming". I want to be able to write programs which crash and burn like meteors entering the atmosphere, because this allows me to understand the hardware under me. Promoting a walled programming language as "the one and only" is wrong.
Why should I embrace a programming language as one and only if that doesn't allow me to do what I want with my computer?
Edit: No, unsafe doesn't count, because it doesn't remove all checks.
I'm a big proponent of choosing the right tool for the job at hand. On the other hand, I'm a big opponent of throwing stones because of emotions.
C/C++ can be made memory safe. You need to use a couple of data structures, be a little more vigilant, and run periodic tests. If developers can't be bothered to learn them, that's fine.
However, saying something is impossible and being adamant about that without research is just as harmful. I sometimes say things about Rust, people correct me, and I learn. Some people see evidence contrary to their beliefs and get triggered because they were wrong in the first place. This is what bothers me.
I don't use C or C++ only. I'm not against Rust either. I'm against positioning of Rust and C/C++ from Rust community's perspective. That's all.
Lastly, I'm not a proponent of reckless coding either. Quite the contrary: I take pride in writing robust code. However, I want to be able to write code which breaks on purpose, to see what the hardware does, to understand the failure modes or gotchas of the architecture I'm running on.
In theory, in small toy projects or with a massive NASA budget -- yes. In any other project, with normal (aka too short) time constraints and average-skilled developers, it's not.
35 years of trying has proven that to us humans, several times over.
I think it's much simpler than that, because I did it myself, on HPC-scale, high-performance materials simulation code. Is it simple? Yes. Is it easy? No, because you need to design for it and be mindful during implementation (const correctness, guarantees by design, Valgrind tests, unit sealing, etc.). I think it can be made much simpler with smart pointers, etc. if speed is not that important.
We push humans too much to develop things fast. C++ is not very conducive to that, yet it's the only tool which works in some cases.
I won't retype my views about Rust because they're all over this thread. I'll just say that I'm against vilifying C/C++ as evil because they can be held wrong. I believe things can and shall be able to be held wrong. Knowing failure modes and shortcomings is a plus, because you can then hold dangerous things right, and appreciate things which promote holding things right.
> I think it can be made much simpler with smart pointers, etc. if speed is not that important.
Why all the complex dances, and you even lose performance by doing them as well? Use Rust and it's going to solve most of those troubles for you.
Though granted, you can make a mess in Rust as well (by using various escape hatches) but you really have to try in order to do so.
> We push humans too much to develop things fast.
I agree, most programmers agree in fact. So when are you talking in front of the UN and when do the worldwide regulation against this problem begin? When will companies start getting a 10% revenue fine for not adhering to the regulation?
...You're preaching to the choir here. We all know the ideal theoretical reality. I too wish I had more generous deadlines.
> I'll just say that I'm against vilifying C/C++ as evil because they can be held wrong.
It boils down to: Rust can be held wrong in far fewer ways than C/C++. That's the core argument. Everything else is more or less a distraction, my own comment here included.
Computers should work for us, not we for them. Rust helps my human brain not to misstep. You'd likely label me as incompetent, but I'd disagree and say "none of the gnarly details of the bounds checking and lifetimes are interesting to me; I'm writing the code, and the compiler will correct me if I assumed wrongly". Me and the compiler are a team, and we both do what we do best.
> I believe things can and shall be able to be held wrong.
As another poster said: do it in your hobby work. If I have to deal with subtle memory safety errors in your professional code just because you have such a life credo then I will curse at you and label you as incompetent.
I love tinkering. I keep it to myself and to other hobbyists. Professional work should be, you know, professional.
> Why all the complex dances, and you even lose performance by doing them as well?
Simple. Because Rust doesn't have BLAS, at least not yet. Though I doubt that I'll ever migrate my high performance code to Rust.
> So when are you talking in front of the UN and when do the worldwide regulation against this problem begin?
No need to snark. I write my comments as streams of consciousness, and I just wrote what I feel. I know it's a simple fact, but I'm not here for stone-throwing matches.
> Rust can be held wrong in far fewer ways than C/C++. That's the core argument.
Yep, and I'm very aware of this one. I think I need to clarify once more: I'm not against Rust. I'm against weaponization of Rust (via LLVM, via RIIR w/MIT, radical evangelism, etc.).
> Computers should work for us, not we for them.
I agree. I want the freedom to explore, because I'm a hardware geek, and I like to PEEK, POKE, and understand.
> You'd likely label me as incompetent...
No, I don't do that. You have a very wrong image of me in your head. I'm only pro-choice in programming languages. If they are not interesting for you, that's OK, plus your interests are not my business. I have no interest in dictating anything to you or to form you to my liking.
On the other hand, I operate from a slightly different perspective. When I write code, I execute it in my mind at the same time, like a computer. Lifetimes, etc. -- the whole circus. This helps me write better code. Then I pass it to the compiler, with -Werror nonetheless. The code I write compiles without warnings; otherwise it's unacceptable.
> As another poster said: do it in your hobby work.
The thing is, my work is a hobby that pays. I always select the tools before starting a job. What are we writing? Is it speed critical? Is it security critical? What happens if it fails, etc.? Write the specs, define the required tools for the job, then start the blueprint in that language. Is that Rust? Go? C++? Erlang? Bash? Malbolge? Select the appropriate one. Don't know it? Doesn't matter. Go learn it.
This is how I do it. I'm not a zealot living with a single language. There are no bad languages, only bad solutions to problems at hand. My beef is always with the community of a language, not the language itself.
That being said, I'm waiting for gccrs before starting Rust, because I have a single rule: the toolchain must be GPL licensed. I don't like to depend on a toolchain a vendor can fork and close. The code must stay buildable.
> Simple. Because Rust doesn't have BLAS, at least not yet. Though I doubt that I'll ever migrate my high performance code to Rust.
Perfectly valid, though in light of this I'd say your arguments against Rust come off as biased in favor of a single use-case, and I am sure you're aware of this.
> No need to snark.
Okay, though I am guessing you understood that the snark was because you criticized something that's beyond the control of at least 95% of all working programmers everywhere. I get the stream-of-consciousness part; I just didn't feel that that part of your comment was helpful to the discussion. Sorry if I came across as rude.
> I think I need to clarify once more: I'm not against Rust. I'm against weaponization of Rust (via LLVM, via RIIR w/MIT, radical evangelism, etc.).
I've read all your other comments and I am not disputing that you are arguing in good faith; IMO that much is visible, I just find it slightly puzzling how your stance on it is kind of going up and down. :)
The second part of your quote above is a bit bizarre because it's been quite a while since I've seen Rust zealotry on HN (at least 2.5 years, IIRC). Every community has bad apples; IMO you don't need to preemptively defend against the a-holes that every community has (and they haven't shown up on HN for a long time, as far as I can tell at least).
> The thing is, my work is a hobby that pays.
While that likely does wonders for your mental health (and makes me envy you, and saddens me that it's not the case for me), it also comes with this unique drawback that you already observed: you get called out by programmers who don't work in the area(s) they love for not utilizing guard rails that are good practice in a team setting. This is also likely made worse by the fact that you work solo.
Again, good for you, for real, but that also makes you biased in a way that seems... not very team-cooperative. I don't stamp PRs with an approval if they don't cover a certain baseline of testing, type specs (for dynamic languages), some docs / comments, and the like.
Tinkering is nice and all but I'd probably make your life living hell if I had to review your PRs. :D (Have to stop here for a few secs to giggle.)
> I'm not a zealot living with a single language.
Nor is any Rust dev that I knew (dozens). ¯\_(ツ)_/¯
> My beef is always with the community of a language, not the language itself.
OK, that's valid but I'd still urge you to reassess your opinion of the Rust community. I haven't met a legit a-hole in years. There are always some dismissive people on this or that forum but meh, vanishing minority to the point that I don't remember the last time I received a semi-snarky response on a subreddit or on the Rust Users forum.
And finally:
> You have a very wrong image of me in your head.
That is quite likely, written text does not carry tone and spirit at all. Though that always has two sides: I misconstrued some of your comments more negatively than I should but maybe you also didn't come across as neutral as you are saying that you are (example: you seem to be negatively biased towards the Rust community).
No harm done IMO, and I don't think that I disagree strongly with anything you said. I am mostly saying that you working on what you love and working solo is keeping you quite disconnected from good team practices. Maybe you have a compiler in your brain as you have alluded to but I will trust a good test suite over a human brain every day.
Yes, I'm pretty aware of this, and I keep this bias voluntarily, because Rust is generally touted (hyped?) as a Silver Bullet, and I'm wary of Silver Bullets. So, I want to highlight that fact.
> Sorry if I came across as rude.
No hard feelings. As I said, I'm not under delusion of a perfect life. I just said it.
> I just find it slightly puzzling how your stance on it is kind of going up and down. :)
I have a habit of mirroring the tone of the person I'm talking with. Combine that with the fact that English is not my native language, and it's possible that the "tone adjustment" is far from perfect.
> it's been quite a while since I've seen Rust zealotry on HN...
Yes, there's no zealotry here, and it's great, but I'm exposed to a wider community than HN people. Unfortunately, HN represents a very small percentage of the programmers out there. A great talk I like about this issue is Robert Martin's "What Killed Smalltalk Could Kill Ruby, Too" [0]. Let's say I'm scarred and bitter from tons of zealotry and flamewars over the years (yes, I'm not young folk).
> While that likely does wonders for your mental health... [Snipped for brevity]
First of all, thanks. I pray that you get to work at a place you love, so it never feels like work again. I'm very aware of how being solo makes me different and biased, and I think I highlighted that in a couple of places.
On the other hand, I understand that teams and team-related activities are important for code quality. Let's say that I work as a team of two people: current me knows everything about the code, and my future me inherits that code. So I always code and comment for my future self, who doesn't know anything about the code. I always sharpen my axe and try to improve myself as a "team of one". My personal state of the art of this practice is at [1]. I'll reflect what I have learnt from this project onto the next one. The fact is, while I do extensive C++ testing, I have yet to pick up Go's testing abilities. That'll probably be the next step.
> Tinkering is nice and all but I'd probably make your life living hell if I had to review your PRs. :D
Give [1] a look, and maybe [2], and let's have a chat again. I'm not perfect, for sure, but I think I'm no basement dweller when it comes to code and docs quality. :D
> OK, that's valid but I'd still urge you to reassess your opinion of the Rust community.
As a recovering grump, I'll do that. I don't like to be a bitter grampa (to be). I'm fine with the grampa part, but not with the bitter part.
> No harm done IMO, and I don't think that I disagree strongly with anything you said.
Same, no hard feelings here. I can say that you're good sport even, and I don't use this lightly. I enjoyed what you wrote and writing this comment.
> I am mostly saying that you working on what you love and working solo is keeping you quite disconnected from good team practices.
Of course, but I'm trying to incorporate the good practices I can use as a team of two (as aforementioned). Also, I'm mingling with more Free Software teams than I might show.
> Maybe you have a compiler in your brain as you have alluded to but I will trust a good test suite over a human brain every day.
I'll trust my brain first, then evaluate that with the compiler, then evaluate the end product with a good test suite. Then feed what I learnt to myself, so I don't repeat the same mistakes as much as I can. My intuition gives me perspective, but I always operate on the assumption that I'm the worst programmer in the universe (no kidding).
> I keep this bias voluntarily, because Rust is generally touted (hyped?) as a Silver Bullet, and I'm wary of Silver Bullets.
Hrm. We're not progressing on that front so maybe it's time to stop. But I don't think Rust is "touted as a silver bullet". Rust is simply better than many other languages if you have criteria X, Y and Z -- and it just so happens that those X, Y and Z are actually things that fix problems that exist quite often out there. This is not "touting".
Forgive the generalization but you do seem to belong to a group of people who get annoyed if others praise something and if they hear about it often. Granted, popularity is not a signal of quality (and we as senior techies should possess and practice this kind of critical thinking much more than many other people!), but this is also tech, and I'd like to pretend that at least part of all decision-making processes is based on merits. ;)
I'll give you a non-tech example. "Game of Thrones" was hugely popular and I was seemingly acting like you do now -- I kept hearing about it so much that I actually made it a goal to NEVER engage with it.
Obviously that's a bit irrational, if not a bit damning for the person practicing such methods in their lives.
But again, in tech it's not exactly the same. If you keep hearing about Rust, then critical thinking and objectivity demand that you ask yourself why and assess it with a clear head and without bias.
You seem to claim to have (mostly) done so, but reading into your seeming rebelliousness against Rust, I have some doubts. Hence this call-out (and the thread between me and you itself).
> Yes, there's no zealotry here, and it's great, but I'm exposed to a wider community than HN people.
I understand, yet IMO you should make up your own mind and not get irritated just because something gains popularity.
Sometimes there are good reasons for the popularity. Now in my day work I am pissed that the languages I work with don't have sum types. :\ Fixes so many bugs...
> Current me knows everything about the code, and my future me inherits that code.
You are surely aware that that is not enough unless you actively go out of your way to check how other people do things, right? But I won't pursue this further; I believe we're in agreement and only the degree of things is under question from my side.
> As a recovering grump, I'll do that. I don't like to be a bitter grampa (to be). I'm fine with the grampa part, but not with the bitter part.
Well that's all that I asked for really. Thanks.
> Of course, but I'm trying to incorporate the good practices I can use as a team of two (as aforementioned).
And one of them should IMO be: "when you need something that you'd write in C/C++ and it is not highly specialized (meaning no good Rust library support) then try writing it in Rust".
Several huge organizations came forward and said memory unsafety contributes to at least 70% of all zero-day vulnerabilities. Surely at this point being religiously fanatical about memory safety should be common sense, I'd think.
> My personal state of the art of this practice is at [1]
You didn't have to show me, appreciate it. I do like the well-commented code. But I'd definitely remark that a huge if/else should be broken down into two functions. Maybe the else clause even further -- talking about `if isInputFromPipe(logger)...` in `initFlags`. I'd like the code to tell me what it is doing, not describe to me how it's doing it (and I still don't know the "what" part while reading such code). But that kind of stems from Golang's imperative nature and not everyone wants to write code in mostly FP style. I get that.
> Same, no hard feelings here. I can say that you're good sport even, and I don't use this lightly. I enjoyed what you wrote and writing this comment.
> Hrm. We're not progressing on that front so maybe it's time to stop.
I think we don't have to progress on anything immediately to discuss productively. You put your ideas forward, I put mine. I get yours to ponder later, and (I assume) so do you. I think I agreed that my ideas may be colored by what I have gone through, and accepted to revise them. I'll see where I can go after taking that step. :)
> if you have criteria X, Y and Z -- and it just so happens that those X, Y and Z are actually things that fix problems that exist quite often out there. This is not "touting".
Yes, and I agree that Rust is good for fixing these X, Y, Z. However, as I said what I have gone through is different. Just hold that thought.
> Forgive the generalization but you do seem to belong to a group of people who get annoyed if others praise something and if they hear about it often.
I don't mind being generalized; we're humans after all. However, your generalization is wrong, sadly. I don't get annoyed by praise or popularity. I get annoyed by being pushed to accept an idea without thinking about it. What I have gone through is akin to being surrounded by people chanting something, and being expected (or even bullied) to chant the same thing without understanding what I'm buying into.
I prefer to do my own research and make my own decisions. Trying to take my freedom from me annoys me to no end. For the record, I have a Vagrant VM which installs a turnkey Rust development environment (IOW, I started to dig into it), yet I put it on ice until gccrs is released, and learnt Go instead, because it proved to be a better tool for what I was trying to build.
> I understand, yet IMO you should make up your own mind and not get irritated just because something gains popularity.
Since that's built on the previous generalization, I'll just pass on this one, if you don't mind.
> You are surely aware that that is not enough unless you actively go out of your way to check how other people do things, right? But I won't pursue this further; I believe we're in agreement and only the degree of things is under question from my side.
We're in total agreement. I'm not a team powerhouse, since I work solo. I just try to do my best, but I can't build up my missing parts without a team, and I'm willing to do that if I work in a team. I like developing tools, but I'm not the person who upends team dynamics because he wants to be the only one. I like the craft itself, and will happily adapt to the environment I work in (sans the bullying part, ofc).
> And one of them should IMO be: "when you need something that you'd write in C/C++ and it is not highly specialized (meaning no good Rust library support) then try writing it in Rust".
I understand and I agree, but I just can't port a library developed by a team of specialized researchers and hand-optimized for a ton of hardware/processor architectures [0] by myself. I can only write what I'm planning to develop in Rust, and maybe develop a couple of libraries along the way, but please don't ask me to license them under MIT. That's not gonna happen. I'm not a GPLv3 zealot, and I contribute to MIT projects, but my projects are for the users of these projects, not for the developers who want to build upon them and forget to add credits because of deadlines.
> You didn't have to show me, appreciate it.
Just wanted to share it for feedback, not as proof, because I value comments and feedback, that's all :)
> ... talking about `if isInputFromPipe(logger)...` in `initFlags`
That's not a part of the code I'm very proud of, but it's not a hot path, it's relatively self-contained, and it works.
> But that kind of stems from Golang's imperative nature and not everyone wants to write code in mostly FP style. I get that.
I'm coming from the C64 era. I think imperatively (or how the computer thinks), and write my code accordingly. Starting FP with a Lisp is on my roadmap, and the notes are in my backpack, but that road is a bit bumpy right now. :)
> I get annoyed by being pushed to accept an idea without thinking about it. What I have gone through is akin to being surrounded by people chanting something, and being expected (or even bullied) to chant the same thing without understanding what I'm buying into.
Yeah, happened to all of us. Sorry that it happened to you. My discussion with you was sitting on a much more reasonable ground, I believe: namely assess the thing (in this case thing == Rust) by its own merits and niche. And you said you are open to that so my work here is done, so to speak.
> yet I put it on ice until gccrs is released, and learnt Go instead, because it proved to be a better tool for what I was trying to build.
On your point 1 here I'll only say that software licenses don't bother me at all. As long as the license does not say "you cannot use that OPEN source software in your CLOSED work or else you owe me" then I'll use it and not give it a second thought.
Morality and legalese concerns in software is something I just don't want to deal with for now, likely forever too.
As for your point 2... actually I did the same. I do love Rust, but I had a number of mini projects, commercial and for my own use, where I really wanted to get something off the ground in literal minutes. Golang served me better there than Rust. When I get a little more free time I plan to achieve the same frictionless experience with Rust, because it is not like that out of the box, but I know for a fact that it can be made to be like that.
> I understand and I agree, but I just can't port a library developed by a team of specialized researchers and hand-optimized for a ton of hardware/processor architectures [0] by myself.
I understand. We touched on this already; suffice it to summarize that in this case you're fully aware that your assessment is also a bit niche, you know that for less specialized work Rust can be a fantastic fit, and you are open to trying it and getting intimate with it in the future. As I mentioned already -- that's all I wanted to achieve by discussing with you.
...but it's much harder to prove your work is memory safe. seL4 is memory-safe C, for example. The safety is achieved by a large external theorem prover and a synced copy written in Haskell. https://github.com/seL4/l4v
Typechecks are a form of proof. It's easier to write provably safe Rust than provably safe C because the proofs and the checker are integrated.
I have never claimed that it's easy. I said doing it is simple, but prone to errors, and it needs to be verified either by design or by tests -- ideally both. I'm aware of seL4. They're doing an amazing job of verifying what they have written.
However, the reason I'm so adamant is that I have done something similar myself, albeit in a weaker form. At the least, I verified that every part of my code is not doing funny things by rigorously testing it in Valgrind in different scenarios, both in units and end to end.
Again, I'm not against Rust. I'm against vilifying languages.
Which do you think is more efficient in terms of CPU resources, user-time, developer-time, money? Both development and the final program taken into account. On average.
Depends on what you're building. If the code you're running is not resource intensive and relatively short-running, developer time is more expensive.
However, if your program is long-running and requires high performance (number crunching, simulations, HPC in general), a 1% difference in tight loops affects your total runtime by hours, if not days. Then user-time is much more expensive, hence you need more speed. Also, in this case you're maxing out ~100 servers in terms of power and TDP, so a shorter runtime has a bigger impact on your energy bill, and on global warming.
If I can run more users' code for the same power and time budget, and conclude more research, developer time be damned. They can spend as much time as they like.
Tech people tend to say developers are expensive and hardware is cheap. No it isn't, if you're using it at its max capacity.
I hear a lot of people complaining about supposedly degraded performance due to bounds checking; but IME, even on number-crunching HPC code, I have never been able to get a signal greater than noise regarding bounds checks, which can be explained by: (i) branch prediction doing its job, (ii) iterators eliding bounds checks at compile time, (iii) bounds checking being dwarfed by the actual computations within the tight loop.
Remember to measure what you're optimizing for first, before going on intuition.
Disclaimer: I'm an HPC admin and both develop code on these things and manage them.
The code I have written was doing ~1.7M iterations per core, per second when I implemented it without bounds checking and locks. It was designed to be fast from the start, so I never tried bounds checking.
I'm restarting the work on the code soon-ish, so I'll be writing a benchmark module for the thing. If you can provide me an e-mail address, I'll implement both, do the tests, and provide you the results, and we can discuss on it, too.
Also, I'll see whether GCC-14 (or whatever comes next) is intelligent enough to eliminate bounds checks in these cases.
The following part of the code [0] was running with much higher iteration numbers inside the "tight loop", but I never benchmarked it, because its iteration count is both inconsistent (due to its adaptive nature) and meaningless in the bigger picture (which is where the 1.7M/sec/core number comes in).
That code was never optimized before measurement, and the biggest bottleneck was memory controller at the end. I needed to reorder matrices to pass that hurdle, yet the Ph.D. was complete, and speed was adequate, so we didn't bother, TBH.
It's worth noting that HPC is a different problem space from, say, kernels, drivers, codecs, and browsers, in that HPC is not generally expecting to deal with adversarial input, and those other spaces are. So the safety-performance trade-off is truly different between you and many of the commentators, who are rightly pointing out C/C++'s atrocious track record in the security space.
Also, I've done a small amount of scientific HPC, and as I see it, "correct" is infinitely more important than "fast". If you look at the number of incorrect scientific papers that trace back to accidentally and silently corrupted data, I think it might make sense to consider using any and all possible tools to avoid corruption, of which Rust is but one example (and you've named others, like Valgrind).
However, this doesn't mean that an HPC application has no need to verify its input, fail gracefully if something goes wrong, or be absolutely rock solid -- it's running for days (or weeks in some cases) and has to be dead-on about the results of the problem being evaluated.
You're right. Correctness is king in HPC, but it's a king which doesn't exile speed. What we do is solve a couple of known cases correctly in MATLAB or something similar, then try to hit the same results first (or do better if we can), then make it iteratively faster without deviating from them one bit (same results to 1e-32 precision, IOW).
This "comparing with a known truth" process is a great indicator of calculation sanity. Other parts are iteratively tested with Valgrind and custom test suites, from function level to end to end, automatically after each build. End to end tests are very expensive in the Valgrind memory profiler (esp. if you're testing multi-thread code), so we do these weekly generally, but I think the idea is clear.
I do think, when it comes to professional-grade software, programming languages should save programmers from themselves -- even the ones who don't want to be saved.
Vulnerabilities happen where programmers thought the bounds checks were redundant, and they weren't. This overprotectiveness by default turns out to be useful.
Look how many of the crashes are panics and unwraps that could have been buffer overflows or wild pointer dereferences otherwise. And there are plenty of arithmetic overflows, which are much less dangerous when they can't cause out-of-bounds access.
The code in this particular codec seems to be a direct translation of C code. Idiomatic Rust code would use iterators more, which work better for optimizing out redundant checks. It's easily fixable.
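For instance, a hedged sketch of the kind of rewrite meant here (made-up helpers, not the actual codec code):

    // Direct C translation: an index-based loop with checked accesses.
    fn update_indexed(p: &mut [u8], d: &[u8]) {
        for i in 0..p.len().min(d.len()) {
            p[i] = p[i].wrapping_add(d[i]);
        }
    }

    // Idiomatic version: zip pairs the slices up front, so there are
    // no indices left to bounds-check inside the loop.
    fn update_zipped(p: &mut [u8], d: &[u8]) {
        for (pi, di) in p.iter_mut().zip(d.iter()) {
            *pi = pi.wrapping_add(*di);
        }
    }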
As already mentioned, bounds checks won't necessarily cause that much overhead. When I rewrote my small image processing library from C to Rust ([1]), I only had to use unchecked array access in one hot loop to get overall performance equivalent to C code.
Then sit back and watch others try it out. Let's see what comes out of it, the Rust code can always be discarded if it turns out to be inferior to the current implementation. It's not like breaking a glass, we can reverse it.
I also don't think that we should be writing everything in Rust blindly. If you can guarantee that you won't be accessing outside of an array before entering a critical section, not having bounds checking is actually a plus.
I have similar code where I can guarantee that I won't ever access outside the boundaries of the arrays and vectors, and that gives me a great performance boost.
Yes, I'm aware. However in most cases I know the size of the array in the beginning and it's not modified by any means (which is guarded by const correctness throughout the code).
If I can't guarantee that, I use vectors and ".at()", which does bounds checking at runtime.
Generally I'm developing solo, so people mucking with what I do is very rare; however, I don't blindly believe myself either.
It makes sense that you don't see the value of Rust if you spend most of your time developing solo.
The value of these checks is not just to reduce bugs in production code. It's nice that that happens, but that's not even the primary value of the borrow checker. The primary value is that having these checks (lifetimes, bounds checks, -- everything that makes Rust annoying to write) makes refactoring on a shared codebase significantly easier. This means that you are not introducing bugs in a refactor, so you can refactor faster, which means you can ditch bad architectures sooner, which compounds and saves enormous amounts of developer time and $$. And it's a knock-on effect that increases in value as the team grows larger.
Keeping track of lifetimes and bounds for a solo dev is quite easy as you say. Keeping track of lifetimes and bounds for the other dozen devs on my team? In all external dependencies? Extremely difficult. Impossible in a large codebase, actually, given the number of mem safety bugs that appear even in mature C codebases. It's collaboration that is the source of these bugs, that these checks work to mitigate.
I sort of wondered why C did not have a package management system, until I started working on a large C codebase. There is a reason Rust has cargo and C does not; it has nothing to do with whether or not someone decided to write cargo and everything to do with Rust's language features.
That's a different and refreshing perspective to look from, thanks.
I'm aware that my view is somewhat biased, because I'm a solo dev who works on small to large projects by myself, and things get exponentially harder as more people mangle the same code base. That's very true.
What I was trying to highlight is that neither C nor C++ is a "free-for-all without recourse". Esp. C++ has many safety features, but they're opt-in, whereas in Rust they're opt-out.
Also, many newer developers don't understand that compilation used to take way longer in olden times, even with simpler languages and compilers. Hence, the things Rust is doing today were "impossible" in the older days.
For the last time: I think Rust is a nice language, and I won't be annoyed by its limitations. What bothers me to no end is vilifying other languages and pushing Rust as a silver bullet and savior. Other than that, Rust is just another tool which works very well for some things, and not so well for others.
Hate to be the optimization nerd, but that's not really how any of this works. The overhead is likely two instructions in the common case (compare and branch), and there's absolutely no profiling being done here to show whether this causes a measurable performance difference. It's quite likely that nobody here is profiling this, and is just guessing based on "more code = slower", but there is really no way to predict this in such a general sense. It may even be that these bounds checks let the compiler do more optimizations later, since it can make more assumptions.
At the same time, if performance turns out to be a problem, there are probably bigger fish to fry than removing guardrails (maybe memory layout, maybe access patterns, maybe reordering things).
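To make the "two instructions" concrete: a checked slice access like the sketch below compiles to roughly `cmp index, len; jae panic_path; load` on x86-64, with the panic path out of line -- so the hot path pays for a compare plus a normally never-taken, well-predicted branch (exact codegen varies by target and optimization level):

    pub fn get(data: &[u8], i: usize) -> u8 {
        data[i] // compare against data.len(), branch to panic, then load
    }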
You would be surprised how much impact the additional instructions can incur. I found UB in a dependency of a popular video encoder. My fix involved adding compare and branches. I thought it wouldn't be significant but then I was instructed to benchmark it and it was significantly slower.
That's the Rust "secret" in many high-performance computing applications, like games. You write everything using arenas and handles, which effectively side-steps the borrow checker. Everyone sane has been doing that for decades in C and C++.
Obviously Rust has far better guarantees in general, but the pervasive usage of this borrow checker anti-pattern suggests that perhaps we need a more comprehensive way to guarantee memory safety.
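A minimal sketch of the arena-and-handles pattern (hypothetical types, not from any particular crate):

    // Handles are plain indices into a backing Vec. Links between nodes
    // are handles rather than references, so no lifetimes connect the
    // nodes and the borrow checker has nothing to object to -- the
    // bounds check on the Vec is the only safety net left.
    struct Handle(usize);

    struct Node {
        value: u32,
        parent: Option<Handle>,
    }

    #[derive(Default)]
    struct Arena {
        nodes: Vec<Node>,
    }

    impl Arena {
        fn alloc(&mut self, value: u32, parent: Option<Handle>) -> Handle {
            self.nodes.push(Node { value, parent });
            Handle(self.nodes.len() - 1)
        }

        fn get(&self, h: &Handle) -> &Node {
            &self.nodes[h.0]
        }
    }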
I remember that Zig is able to convert arrays of structs to structs of arrays at compile time [1], which effectively sidesteps all the need for the user to worry about array indices and having them in the right range.
> which effectively sidesteps all the need for the user to worry about array indices and having them in the right range.
How does converting AoS to SoA eliminate the need to worry about array indices? If you have an array of structs with N entities and convert that to a struct of arrays each array would also have N entities, so out-of-bounds accesses in one would be equally out-of-bounds in the other.
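A quick sketch of the equivalence (hypothetical layout):

    // Array of structs: index i selects one whole particle.
    struct Particle { x: f32, v: f32 }
    type Aos = Vec<Particle>;

    // Struct of arrays: the same N elements and the same valid index
    // range -- an i that's out of bounds for one layout is equally
    // out of bounds for the other.
    struct Soa {
        x: Vec<f32>,
        v: Vec<f32>,
    }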
Interesting. I suppose the use case is so that there is some code exercising this path.
While the algorithm is cool, of course, the code itself is quite straightforward as it is. An interesting thing I didn’t know is that the kernel code avoids recursion (this one has a depth parameter to prevent recursion past some point).
I can see why this was picked as a candidate. Straightforward implementation. Good test suite. Self-contained and not a moving target.
Lots of the code is using the coefficients array against the framecontext but I don’t know how to enforce the bounds invariant that the two are the same. For the fixed size arrays I could see how it’s done, but otherwise it seems like the bounds checker will trigger. But I’m reading on my phone so maybe that’s just a misread.
Not a big performance hit even if it does. But perhaps it doesn’t, and I’d be curious why.
There was opposition to building interfaces for toy drivers, and the last thread had suggestions for a Rust filesystem interface rebuffed by saying they should try to rewrite the ext2 driver in Rust to prove that it was usable for real filesystems rather than toy ones. I'd guess similar thought processes fuelled this decision.
> These algorithms use the data received from userspace in order to index into a lot of arrays and thus benefit from Rust's memory safety.
Yeah, that's not a good argument for Rust specifically. Maybe Zig would be interesting for array-specific things like sentinel-terminated arrays, and its C interop is amazing. Further, Zig code is unmatched in how explicit everything is: there is no abstraction hiding complexity, and it's built on the idea of custom allocators from the ground up.
This seems like someone looking for a use case for Rust over C, rather than someone looking for a better solution altogether. That's okay, I guess, but Rust does not do array bounds checking very well, and the borrow checker doesn't help there at all. The large amount of hidden complexity in a lot of the language doesn't make it any safer.
The VP9 codec I imagine would heavily benefit from SIMD? Just SIMD should massively outperform any advantage gained by putting this into the kernel. Why does the kernel need an unoptimized codec?
I don't think that the driver implements VP9 in software. It uses hardware acceleration to perform the actual encode/decode, but is managing preparing of coefficients and DMA of data buffers in the driver.
SIMD can be pretty architecture specific. For example, does the CPU support AVX-512 or SSE3? So, you have to have a few code paths if you're going to support a wide variety of hardware.
I don't have an answer to your question; it just occurs to me that maybe the Linux kernel doesn't allow a lot of SIMD for this reason, or maybe they require a fallback/slow code path to be available if you're submitting a patch that includes SIMD operations?
It's nightly-only for now, but I've used it and it's lovely. I've added `Simd` to projects that I never otherwise would have, just because this made it so easy and accessible.
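For reference, a minimal sketch on nightly with the `portable_simd` feature (the backend lowers this to SSE/AVX/NEON or a scalar fallback, depending on the target):

    #![feature(portable_simd)]
    use std::simd::f32x4;

    fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
        // Lane-wise addition across all four elements at once.
        (f32x4::from_array(a) + f32x4::from_array(b)).to_array()
    }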
> This patch ports the VP9 library written by Andrzej into Rust as a proof-of-concept
...
> this library will not need any further updates for the same reason we have never touched its C counterpart
What was being proof-of-concepted? What's the metric of success for introducing Rust in a case where no one was doing any sort of active work anyway?
From what I understand, the need for a "proof of concept" comes from the fact that these codecs/drivers often use memory-unsafe "tricks" to increase performance, and therefore need to be properly tested on a myriad of hardware to make sure the conversion to memory-safe code doesn't have a significant performance impact.
> No. It's all C code in the stacktrace. And kernel devs still don't have to deal with Rust, if they don't want to.
Thank you for the answer. I was wondering since the conversion to Rust is a large undertaking & rewrites can have surprises. Part of the question was whether this was an argument to use Rust, since NULL pointers are something that Rust goes to great lengths to guard against.
And wow...my original post got flagged? It was a sincere question that I was curious about & didn't have an answer for. Either everyone needs to chill...or is mentioning NULL pointer references in the Linux Kernel some sort of unspoken mortal sin or was I somehow offensive? Please, someone let me know if I'm out of line cause I just see zealotry of some type that I cannot begin to imagine. I want to avoid such land mines in the future...for the love of all that is holy.
> I was wondering since the conversion to Rust is a large undertaking & rewrites can have surprises.
There is no concerted effort to rewrite Linux in Rust, so there is no large undertaking. In spite of this VP9 rewrite, things are not being rewritten in Rust for the most part. Some individuals are rewriting some things but more importantly, some new stuff (such as drivers) are being written in Rust. It's this new stuff written in Rust (and not previously written in C) that is the focus IMO.
And most C stuff isn't going to be rewritten. Many maintainers don't know Rust and couldn't review Rust code anyway. The impact of Rust in the kernel so far is pretty modest (the most important code being the driver for Apple silicon GPUs, I think)
> And wow...my original post got flagged? It was a sincere question that I was curious about & didn't have an answer for. Either everyone needs to chill...or is mentioning NULL pointer references in the Linux Kernel some sort of unspoken mortal sin or was I somehow offensive?
I didn't flag your post, but I would guess it was flagged because you immediately assumed that Rust was to blame.