Probably not much of a backstory: it's common to make copy-and-paste mistakes when implementing multiple similar things. For instance, when implementing both the Add and Sub traits, it would be normal to start with one of them and duplicate it to make the other, making the relevant changes in the copy (Add -> Sub, add -> sub, + -> -); it's easy to forget one of these changes.
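For example, a hypothetical sketch of exactly this failure mode (Meters is a made-up type):

    use std::ops::{Add, Sub};

    struct Meters(f64);

    impl Add for Meters {
        type Output = Meters;
        fn add(self, rhs: Meters) -> Meters { Meters(self.0 + rhs.0) }
    }

    // Duplicated from Add: the trait and method names were updated,
    // but the `+` inside the body was forgotten.
    impl Sub for Meters {
        type Output = Meters;
        fn sub(self, rhs: Meters) -> Meters { Meters(self.0 + rhs.0) } // should be `-`
    }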
Yes, which also serves as a way to explicitly point out that the "surprising operation" is intended. Hopefully it even reminds you to put an explanatory comment there.
Clippy is a linter. You don't use linters by mechanically following its warnings as gospel, you use it by double-checking the code wherever it warns and removing the warning from that function if it's a false positive.
I think the rust making you aware of what you need to do is a good thing. I have a vague memory of writing C code when I was a teenager where I directly manipulated a string from argv[] doing some concatenation or other and it worked fine until it didn't because I was merrily stomping on memory I didn't own (I was thinking in Pascal terms while writing C code). The whole String vs &str thing in Rust can feel a bit fiddly at times, but it does make you aware of what's happening (and I've found that it has improved my JVM coding as well although I do have to occasionally remind myself that I don't need to worry about borrows and lifetimes).
But C doesn't let programs fail. (or at least not loudly)
That's the problem, to me at least.
I had this code yesterday on an embedded ARM Cortex-M0:
    bool tx_byte(char msg) {
        for (int i = 0; i < 8; i++) {
            tx_bit((msg >> i) & 1);
        }
    }
This function wouldn't return.
What's even stranger, the variable i would increment past 7. The for-loop just continues.
I got a warning about the missing return statement in that function, but didn't think there could be a connection to my bug.
It turned out that adding that return after the for loop fixed it.
The warning was one among about 100 others, as I was refactoring quite some code at the time.
I only started working on the warnings after I had decided that me and two of my colleagues were just overlooking something and would fix the bug another day.
All that I wanted to say is that, no, C doesn't make you aware of what is happening. It has this giant pit of UB that does weird things when you fall into it.
I was lucky that in my instance, I got a warning. I know other UB just works most of the time and goes unwarned.
It's undefined behavior, the compiler can do whatever it wants.
Here the compiler assumes the function never returns (because it would be UB since there's no return statement), so it looks like it just didn't bother to add a loop exit at all.
Actually it's only undefined (per C99 anyway) if the return value is used:
> If the } that terminates a function is reached, and the value of the function call is used by the caller, the behavior is undefined.
So if the return value is never used then there is no undefined behavior and a standards-compliant compiler isn't allowed to optimize away the loop exit condition.
That reminds me of my favorite bug in a C program I ever dealt with. We had a dorm floor's worth of "Intro to C" students poring over our friend's buggy program.
With enough printf()s I was able to whittle the problem down to this: a local variable "b" was changing after some function call that looked like "call(&a)". It turns out that because our program looked like "{ int a[3]; int b; ... call(&a); }", an off-by-one error accessing "a[4]" in the function was overwriting variables that were never passed to it!
Felt like the champ figuring it out. (This is a 15-year-old story from freshman year, so I hope I got the details right.)
Update, since this got attention: I remembered slightly wrong: "b" would need to be declared before "a", not after and you'd access it with "a[3]" not "a[4]". So after this, "b" would be 7: "int b=5; int a[3]; a[3]=7;". Looks like I'm still making the same mistakes.
But still: If you passed the array and referenced it wrong it would change local variables in some remote stack frame.
> The thing is, C also makes me aware of what is happening, by letting programs fail when I forget about certain things.
Not necessarily. Many C programming errors invoke Undefined Behaviour, and then "appearing to work correctly" is a permissible option. So you will miss the programming error, until a change seemingly unrelated to that part of the system will trigger some segfaults or wrong results, or allow a determined attacker to use your program to launch calc.exe.
If that was true then nobody would be fretting about unsafe memory access, and the primary source of bugs used in exploits wouldn’t exist.
Unfortunately it isn't true: C lets you write plenty of broken programs that don't fail at run time, despite making a complete dog's breakfast of your system's memory space. Instead we get crap like Heartbleed.
It's also perfectly possible to write broken programs in languages that take care of all memory-related issues:
    var ch chan bool
    for i := 0; i < 10; i++ {
        go func() {
            // do something amazing
            ch <- true
        }()
    }
Oops, I just leaked 10 goroutines, because I forgot to `make` the channel instead of merely declaring it, so it's nil and sending on it will block indefinitely. Unless the program encounters a complete deadlock of all goroutines, this will not crash anything, and the caller of this code has no way of knowing it happened.
Broken code can be written in any language, no matter how much checks & convenience it offers.
This depends on deciding that "broken code" is all the same, but it isn't.
Suppose I screw up writing the GIF decoder for avatars on my forum web site and you are a bad guy who can create accounts and upload an avatar.
If I write the decoder in C it is very possible that your bad files can seize control of the web server, spill user credentials, post nonsense, mine crypto-currency on my servers, anything.
If I write the decoder in safe Rust, some of these things are much harder to pull off, and I need to be really incompetent to cause the worst harm - but you can likely cause a lot of mayhem still if I screwed up.
But if I write the decoder in WUFFS, the best you can achieve is to have the decoder chew CPU in an infinite loop or something. There are no boundary misses, integer overflows, or anything like that in WUFFS. My decoder might render your weird input as a giant orange splodge, or, as I said, spin forever wasting CPU, but it can't accidentally become a reverse shell server, or send you credentials from my password database, such things are entirely impossible.
This is because WUFFS is a special purpose language. There doesn't need to be a way to write these nonsense programs in WUFFS, whereas it must be possible in a language like C to achieve the "general purpose" designator. But this should encourage us to write as little as possible with these unnecessarily powerful languages, like the way you don't use a chainsaw to sharpen pencils.
I don't see any memory getting unexpectedly stomped on.
C is a great language, but it is absurdly common to accidentally write code that writes garbage to nearby memory, or can be exploited to do so. Such bugs are the cause of a vast number of security problems. This characteristic of C is simply not true of most other languages.
C++ has even formalised a concept of compilation false positives (due to issues like separate compilation, which make it impossible for the compiler to know that you’re linking incompatible artefacts).
The Sanitizers only catch errors that happen while they're running.
So in many situations the Sanitizer gives you a false sense of security that your code is correct because in the test environment, with the Sanitizer, it works, and then in Production, without a Sanitizer, under some circumstances it blows up because those circumstances never arose in your test environment.
Compare (safe) Rust which doesn't have uninitialized variables, and even WUFFS which doesn't have array bounds errors. In both cases the entire problem doesn't exist, programs which exhibit this problem are not valid programs in those languages, which is great because we did not mean to write these programs anyway. Hooray.
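For instance, a minimal sketch of what "doesn't have uninitialized variables" means in practice:

    fn main() {
        let x: i32;
        // println!("{x}"); // rejected at compile time: `x` isn't initialized (E0381)
        x = 5;
        println!("{x}"); // fine: definitely initialized before use
    }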
If I have 10 different errors, but the root cause for all of them is that I forgot to do something I should have done (initialize a variable), then the cause is the same across all 10 different errors. If I don't start initializing the variables, then I will continue to have errors (which may or may not be similar to each other), thus making the problem reproducible.
The errors are not the problem, but the result of the underlying problem.
If you have an error that only occurs on your computer, on your compiler version, on your particular optimization level, on this particular phase of the moon, how do you determine that the root cause is forgetting to initialize a variable? In the context of debugging, "reproducible" means "can reproduce that exact error in order to better identify the root cause", not "frequently occurs as a root cause".
I don’t think that a simpler language and a helpful compiler are mutually exclusive. In the article, the author even points out multiple examples where Go (a simpler language, relative to Rust) could output more helpful error messages.
I assert it’s possible to have a simple language where it’s difficult to shoot yourself in the foot. C just happens to be a simple language where it’s exceedingly easy to shoot yourself in the foot.
Rust is built to be 100% memory safe, not just "hard to misuse in practice". Of course the claim comes with some provisos about how you're always relying on 'unsafe' code, but there is real value to it. Zig is interesting but doesn't have that USP.
That’s an oversimplification. There are many reasons to choose one programming language over another. Strong guarantees of memory safety is only one design parameter.
Zig refers to itself with a version number < 1 and says that the language is not yet stable. That's going to filter the less adventurous crowd out right away. And that's probably ideal - zig can work out the kinks with the adventurous crowd and be better prepared for its post-1.0 life.
I haven't used Zig, but my understanding is that Zig is more comparable to C, whereas Rust is more comparable to C++. Zig is a much smaller and simpler language, but Rust has more high level constructs and abstractions.
I can't comment on either, but I do enjoy C and you can build relatively safe C programs using tools like Valgrind and static analysis. So unless you are building the next OpenSSL it could very well be worth sticking to C11.
> I am not looking for a language that is too theoretical, like Haskell.
Rust is not "theoretical", but it has some theory applied to great effect. The comparison to Haskell is more in the use of functional idioms, algebraic types, pattern matching, and bits here and there...
Elm is a good example of what you’re talking about, though it “cheats” a bit by being quite domain specific.
Anyway, it's simultaneously a very simplified cousin of Haskell (similar syntax but most of the advanced syntactic and type stuff dropped), and where the current movement to make compiler error messages more readable, conversational, and helpful (e.g. decoration, explaining the "reasoning", providing possible solutions) originated.
In my experience (and to generalize a bit), people who shoot themselves in the foot with C are the same people who will shoot themselves in the foot with any other language.
Those who aren't careful enough to free up allocated memory are the same types of people who leak memory in Java.
You can't fix lack of knowledge with a compiler. You can only catch accidental mistakes, but that assumes that the norm is for the person to know what they are doing and a lot of people have only superficial knowledge of the tools they are using.
We already have accident catching tools for C. What C doesn't have is a helmet, training wheels and padded walls, which is by design.
In my experience, people who shoot themselves in the foot with a magnetized needle are the same people who will shoot themselves in the foot with any other programming technique.
Those who aren't careful enough to only flip the bits they intend to when programming with a magnetized needle are the same types of people who overflow buffers in C.
You can't fix lack of knowledge with a compiler. You can only catch accidental mistakes, but that assumes that the norm is for the person to know what they are doing and a lot of people have only superficial knowledge of the tools they are using.
We already have accident catching tools for magnetized needles. What a magnetic needle doesn't have is a helmet, training wheels and padded walls, which is by design.
> AS IT TURNS OUT, good craftspeople don't shy away from questioning, criticizing, and trying to improve their tools (or even switching to other tools entirely!)
C is the best tool for a lot of use cases, but no tool can be a perfect fit for every use case. This is the reason that multiple programming languages exist and the world has not converged on using only C everywhere. Also, there are properties not intrinsic to the language that also affect the fit.
Saying "it's easy to shoot yourself in the foot with C" is not constructive criticism because what's easy for you could be difficult for me.
If the Rust fanboy brigade could articulate actual criticism of C, such as the lack of imports, or eschewing the type system with void*, or the signedness of char, or any of the other topics beaten to death in online "C gotchas" PowerPoint presentations, then we could have a constructive conversation amongst craftspeople.
I believe that Rust users are past caring about that.
If you feel that C is a good language for your needs, then by all means, use it.
I know that Rust brings me a very efficient safety net that lets me concentrate on the algorithm [1] without having to also spend quite as much time auditing, fuzzing, valgrinding and {thread, memory, ...} sanitizing my code. In addition to the safety added by the type-system, I also enjoy the convenience of tools such as `Drop`, or the ability to derive (de)serializers for most formats, etc.
This is not about criticizing C. This is about picking the best tool for the job. Sometimes, it is Rust. Sometimes, it is TypeScript. I haven't encountered a job for which C was the best tool for me in a while. I'm sure that there are cases, mind you, but I don't encounter them anymore.
[1] With caveats, of course. As all Rust users, I need to please the borrow-checker gods.
> Saying "it's easy to shoot yourself in the foot with C" is not constructive criticism because what's easy for you could be difficult for me.
That’s a fair point. Unfortunately I think that in order to really talk about the pros and cons, we’d need a relatively specific situation. Toy examples are just that… toy examples. FWIW most of the time I will make language choices based on the expected lifetime of a particular project, the relative expertise of team members working on a project, and the ease of delivering bug fixes.
Note that rust doesn't claim to catch all errors, and the article points out places where it doesn't. In rust it is a lot harder to get into those situations accidentally, but you can.
Any non-contrived example would be thousands of lines of code. Rust is meant for those areas where you are forced to do weird things; it does its best, but nobody yet knows how to provide what's needed there in a way that can't be misused.
Seeing how contrived the examples are makes me more interested in rust.
I use it professionally, and can say: it's worth it. Even the warts are minor compared to how polished and intuitive the language feels after you accept its dogmas (or die fighting the borrow checker).
The error messages are very good, cargo is awesome, rust-analyzer is already good but still needs some time to flourish, and the performance in production is really predictable. It's for sure not perfect, but in many aspects it feels like a step forward, so much so that sometimes I'm wary of how much I'm relying on the compiler to check my decisions for me.
> The thing is, C also makes me aware of what is happening, by letting programs fail when I forget about certain things.
That isn't true at all. The majority of the bugs that I and others hit working with C (I did C for several years at a large company) existed precisely because the programs did not fail when people forgot things. If you learn C iteratively, you pick up tons of bad habits, because C does not tell you when things are broken.
Deadlock detection is an interesting problem. As the author points out, Rust will not detect this at compile time, but there are run time analyzers.
It's worth looking at deadlock detection at compile time today. Not just to prevent bugs. Sometimes, on some CPUs, the compiler could potentially turn some mutexes into fence operations and become "lock free".
In Rust, code tends to do a lot of scope-based locking. There's a lot of
    item.lock().unwrap().dosomething()
Now, if the compiler detects that 1) every lock on item is scoped, and 2) everything done inside the lock is short and non-blocking, then that lock could potentially be optimized into fence instructions. Worth thinking about.
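To sketch what "scoped" means here (with a hypothetical `item`):

    use std::sync::Mutex;

    fn main() {
        let item = Mutex::new(vec![1, 2, 3]);
        // The MutexGuard is a temporary: it is dropped (unlocking)
        // at the end of the statement, so the critical section is
        // exactly this one call.
        item.lock().unwrap().push(4);
    }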
There's an analogous case for RefCell items. RefCell is a lot like Mutex, except that you call "borrow" instead of "lock", and if the item is ever "busy", waiting won't help, because your own thread made it busy. If every use is a scoped borrow and there are no inner borrows of the same object within the scope of the outer borrow, that's the good case. Reference counts aren't needed at all.
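A minimal sketch of the good case and the bad one:

    use std::cell::RefCell;

    fn main() {
        let cell = RefCell::new(1);
        {
            let _a = cell.borrow(); // scoped shared borrow
            // Calling cell.borrow_mut() here would panic ("already
            // borrowed"): waiting can't help, since our own thread
            // holds `_a`.
        } // `_a` dropped here
        *cell.borrow_mut() += 1; // fine: no outstanding borrows
    }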
I think that somewhere in this space lies the solution to the back-reference problem, but I haven't gotten there yet.
Compile-time deadlock prevention very, very quickly begins butting up against Rice's theorem, unless you do something silly like having channels be the only way for threads to communicate. Even then, deadlock conditions are still perfectly possible and very common.
You mean channels like Erlang-style message passing? Where the send is async, and the receiver can pick things out of the queue out-of-order? That is indeed hard to deadlock. It's easy to get other pathological behavior out of them (endlessly-backing-up messages being the one I hit with some frequency) but deadlocks are hard. I think they're doable, but it takes work.
Deadlocking Go-style synchronous channels isn't hard. In practice they're practical to use because they're not particularly hard to not deadlock, either, but I can certainly deadlock them in code if I tried. But I'd be using patterns I'd normally be suspicious of.
(By contrast, conventional mutexes are hard to use without deadlocking, because as soon as you have more than one you have troubles. I use mutexes a lot in Go, but my rule is never take more than one at a time. As soon as you are tempted, do something else. IMHO, the conclusion from the 1990s and early 2000s is wrong... it is not that multithreading in general is impossibly hard, it is that multithreading with multiple mutexes is impossibly hard. If you stay away from that, rather than "impossibly hard" it is merely a challenge, but one humans can rise to meet. I think there's an unjustified residual fear of multithreading to this day that stems from this misunderstanding. And I mean "threads" just as multiple independent simultaneous execution, without regard to any particular technology used to implement them.)
So does "compile-time memory safety". The way Rust does it is by restricting valid programs to a much narrower set than valid C programs (at least in safe Rust). We may yet find a way to make a programming language that guarantees zero deadlocks at compile time without being too restrictive, in the same fashion that "aliasing XOR mutability" solved the memory-safety challenge. Maybe not, but nobody knows.
That language will need to be built around this limited set of programs though, and is really unlikely to be retrofitted in Rust, like C++ can't realistically add a borrow checker at this point.
Probably not. This would have to be extremely limited and restrictive in order to avoid false positives (which would be outright miscompilations). The more restrictive an optimisation is, the less valuable it is, and the lower the ROI.
This would be an unlikely-to-achieve-anything optimisation around atomics, which compilers already don’t bother much with because they’re hard to get right and have a low ROI.
I don't see how fences are ever enough for mutual exclusion. Maybe you are thinking of transforming a mutex into a spinlock, but that would be catastrophic for realtime programs, and generally an optimisation that compilers shouldn't make.
What I assume they’re thinking of is, if the compiler can determine that the lock is scoped and the scoped code only reads from shared memory, then it only needs a sequence point (a read barrier).
I don’t think it’s true because if you don’t “hold” the mutex there’s no guarantee that the mutex-protected data is consistent, unless it’s atomic at which point you’re probably better off optimising to an atomic read. Which compilers are unlikely to bother with either way, because optimising atomic operations would be both risky and low-return.
Right. Even if the mutex is protecting a single memory location, holding a mutex is still observable behavior (there is a total order of all critical sections at the very least), and eliding it is questionable.
Isn't that exactly what SeqCst atomic access gives you in this case? Assuming that the critical sections are restricted to doing read-modify-write of an atomic-sized memory location - which would have to be enforced at compile time or require whole-program analysis.
But again, if you only have an atomic-sized memory location, and you bother writing an optimisation for atomic stuff, why bother with barriers when you could just perform an atomic update?
Also SeqCst is way overkill for a mutex, that is literally the use case for acquire and release semantics, the two barriers were named after the lock operation.
there is a total order of all critical sections on a specific mutex, but I don't think it is technically part of the same total order of seqcst atomics (although it might in practice). This sort of stuff is extremely subtle.
BTW, Mutex::get_mut exists and safely gives access without locking, when you can prove to the borrow checker that you have the only reference to the Mutex.
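A minimal sketch:

    use std::sync::Mutex;

    fn main() {
        let mut m = Mutex::new(0);
        // `get_mut` takes `&mut self`, which proves no other thread can
        // be holding a reference to the Mutex, so no locking is needed.
        *m.get_mut().unwrap() += 1;
        assert_eq!(*m.lock().unwrap(), 1);
    }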
An example of one of the little reasons I expect large-scale Rust code to be faster than C++ in the long term. There are a lot of little things like this that Rust can do but C++ can't, which can add up to a lot over the entire program. But it's really hard to see these in benchmarks, because benchmarks are almost never complicated enough to show these differences; and even if someone writes one, in the resulting benchmark competition someone may come in and hand-optimize the C++ code. But C++ can't scale that sort of attention, so the benchmarked performance won't be anything that real code can obtain.
The rule is that FooInner doesn't know about the lock (so calling other methods is safe), while Foo methods can't call other Foo methods (so you can't lock anything twice). You can even move Foo and FooInner to different modules to hide FooInner's contents from Foo, though I rarely find that necessary.
I know this is subjective, but for me, this works better than what the blog post suggests -- at least, I haven't gotten it wrong by accident yet.
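A minimal sketch of the pattern described above (names are of course placeholders):

    use std::sync::Mutex;

    struct FooInner { count: u64 }

    impl FooInner {
        // Inner methods know nothing about the lock, so they can
        // call each other freely.
        fn bump(&mut self) { self.count += 1; }
    }

    struct Foo { inner: Mutex<FooInner> }

    impl Foo {
        // Outer methods lock exactly once and only call FooInner
        // methods, so the lock can never be taken twice.
        fn bump(&self) {
            self.inner.lock().unwrap().bump();
        }
    }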
Kudos! I've been successfully applying this pattern in a bunch of places and am happy to report that it is very useful and easy to reason about. It works well in any context where your object access is mediated... Mutex & RwLock, Rc and Arc, etc.
I can recommend the ambassador crate to go with this pattern -- it can be very useful for avoiding boilerplate for the newtype struct.
Really the hardest part is naming... I've also settled on FooInner, but I'm still not entirely happy with that convention.
"I guess that's why the Go compiler is so fast - it barely checks for anything!"
Any serious Go programmer ought to have golangci-lint or equivalent running somewhere. I prefer pre-commit, but CI/CD will do too.
An advantage of golangci-lint is that it's not in the core compiler, so you can get some really opinionated linters from the community, some of which you're going to love and some of which you're going to hate, or linters written really quickly and hammered out in the real world rather than waiting for the full release cycle.
A disadvantage of golangci-lint is that it's not in the core compiler, so you have to actually fetch it yourself, maintain it (configure it), etc.
With golangci-lint, Go is still not Rust, of course, but it gets it a pretty decent way towards being a serious, quality language. Most languages need some augmenting to be serious, quality languages, because it isn't generally appropriate to put all the requirements for code into the compiler, for many reasons. Rust is somewhat exceptional in how much they've put in their compiler. Things like C and C++ need immense, high-quality support from static analyzers to be even remotely safe to use on a network. (So, yes, I don't consider C or C++ "serious, quality" languages in the absence of that static support. I don't even consider it close, honestly. I don't consider very many languages to be serious, quality languages out of the box.)
I suspect you know this, but for anyone not aware: Rust's equivalent to golangci-lint is Clippy, which is distributed as part of Rust (at least, with rustup, it comes with the compiler in the default configuration). You have to run it separately, but it contains a ton of lints[0]. The default configuration is pretty good (Most of the warnings I've got with it were either bugs or nudging me towards better APIs to do what I wanted).
And what's really nice about clippy is that it re-uses a lot of the compiler's infrastructure instead of having to duplicate it, so it has the same high-quality error messages, and stays in sync with rust's new language features.
I don't think that was intended as a serious remark, because the compiler-checks part of Rust compilation is not that slow. Rust compiles are slow when generic monomorphization or macros generate a lot of code to be compiled, or when complex macro packages are used as a kind of (slow) compile-time computation. The former can probably be addressed by some version of non-MVP const-generics; we will likely be able to "forward" implementations of generic functions and traits to code that only has to be built once for each set of type-derived parameters.
This question may be a bit off topic, so I apologize. I don't work with Rust on anything except homebrew projects, so time is never really of the essence. At work, I use scripting languages (Python and Javascript - mostly Python). Is compile time anything except an annoyance, in the real world? Has it really run you up against such close deadlines that an hour of compile time is going to break the company so far as the projects that Rust is being used on?
Now, I will fully grant you, I enjoy Rust and like programming in it for fun, but have not kept up on what projects have been made in the wild and professional worlds with it, past Mozilla, TBH. I know a lot of companies are recruiting, but haven't seen much coalesce. Could be my ignorance, though.
It's not about missing deadlines, it's about staying focused and not being interrupted while I'm working.
At work I use both C++ and go. My experience working on a C++ project is that it takes several minutes just to build, so when I make a change and try to run the unit tests, I find myself wasting a lot of time. I'll often get bored and go on HN for example :)
When working on go it's like night and day. Projects mostly compile in less than 30 seconds (unless it has a lot of C++ dependencies). With this pattern I can maintain my focus and continue the modify-build-test loop, I find I end up being a lot more productive.
> when I make a change and try to run the unit tests, I find myself wasting a lot of time
You might want to adjust your workflow and make it fit the language better. Avoid relying on unit tests for semantics that can be trivially checked by the type system. Get into the habit of launching builds as part of your work, then check back on the test results after they're done. A build doesn't have to "interrupt" your flow if you plan it correctly.
I love Rust, but Rust being slow annoys me to no end, MUCH more than its complexities.
I have worked in Pascal (Delphi), which is a bit better at catching issues at compile time.
Being able to iterate quickly is also a way to fix errors without losing the train of thought!
--
And answering your question: slow compile times translate directly into burning more energy (not green, costs more). And they impact the cost of CI pipelines and all that.
And yes: I have been hit by slow build times with Rust during serious bugs/problems with customers, being unable to move fast enough.
One of the worst involved a combination of several issues that were slow to pin down because each one took a slow compile-ship-observe cycle; it was bad enough that we almost lost a few customers...
So, in my mind, the #1 thing I wish Rust had, more than anything, is speed!
(Author here) Absolutely! I love TypeScript and refuse to write JavaScript without it (even just in JS-checking mode with jsdoc annotations), but I'm already fighting, like, 4 different battles in that article, I didn't want to bring a fifth one into the mix.
I wonder if there's a future where you have a language model that actually inspects the name of your functions, does an automatic code-gen of them from the context and then compares your implementation and gives you lint errors like:
    fn add() { ... }: misleading name or incorrect implementation `a - b`
You know, once upon a time I would have thought that was categorically impossible, but now, you know, it actually seems like something that might not be that far off.
> It's just not something Rust is designed to protect against - at least, not today.
> fn add() { ... }: misleading name or incorrect implementation `a - b`
Advice like this is frustratingly non-general: you might be doing addition over GF2[1] in which case `fn add(a, b) { a ^ b }` is a perfectly sound function. I've worked on a codebase, in Rust, where that's the case[2]!
Linters should be frustratingly non-general. They should alert for the common case, and special cases should have a "#[allow(xxx)]" or "#pragma warning disable xxx" (or whatever does that in your language) and, possibly, a comment.
This, incidentally, also helps fellow humans reading the code who'll see that it's all intentional and won't flip with "wtf why does it XOR in an add method?" even if just for a second.
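For instance, a sketch of the GF(2) case from above; I'm assuming clippy's `suspicious_arithmetic_impl` is the lint that fires here, which neither commenter named:

    use std::ops::Add;

    #[derive(Clone, Copy, PartialEq, Debug)]
    struct Gf2(bool);

    impl Add for Gf2 {
        type Output = Gf2;
        // Addition in GF(2) really is XOR (1 + 1 = 0, no carry), so the
        // lint would be a false positive and we say so explicitly.
        #[allow(clippy::suspicious_arithmetic_impl)]
        fn add(self, rhs: Gf2) -> Gf2 { Gf2(self.0 ^ rhs.0) }
    }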
Only up to a certain point. If you need a few pragmas before each line of code, your code becomes completely illegible.
As an example, those linters that have a message for each programming style and a rule to convert between them are stupid. They might be useful as some other kind of tool, but as linters they are not just useless, they are actively harmful.
Then the function name, class, and/or comments should be making that clear! In this case, the BatchGF2 class does so. Maybe somewhere else a Fourier class has `fn add() { a * b } // a and b are in time domain`
but a standalone add() function with no comments or class should probably not be doing a ^ b, a - b, or a * b.
Some GitHub Copilot-like linter could then draw from other GF2, Fourier, etc. code references, as opposed to vanilla add functions.
That’s abstraction breaking. The code calling add should generally not need to be aware of the implementation details of add. The users of the add API should generally not need to be aware of the implementation details of add.
In the case of a Galois field I think you're always intentionally reaching for the tool, you use it because of those implementation details. Making sure the struct's name clarifies that that's how the arithmetic works is appropriate.
In that case you definitely want to implement the standard Add trait, so the user can naturally write their maths. They've already opted in to the behavior with a call to Gf2::from.
It's quite fitting that this sort of thing is dubbed "autopilot", because in all likelihood it will have the same problem as gradually more self-driving cars. There will be a level of automation where the computer still makes errors frequently enough, but the user is sufficiently disconnected from the action that they lose all attentiveness.
If you are willing to accept quite a few false positives and false negatives, I think current techniques in machine learning can deliver on this promise. (If you put in all the work to train a good model, etc.)
Btw, have a look at hlint. It sometimes gives rather involved hints about how you can restructure your code.
This kind of linter would be useful for teams whose native language isn't English (such as ours). They often choose bizarre names for their functions, it's not always clear what a function does by reading its name alone.
Just run the output of autopilot in a symbolic execution environment and diff it against the code paths of the user code. This is one (admittedly convoluted) way of implementing what OP wants.
Seeing the RwLock deadlock surprised me. I tried making the standard library's implementation deadlock instead and I'm unable to reproduce it (even with 1 million iterations). The parking_lot implementation does seem to deadlock however.
I suppose the parking_lot implementation elides some checks for the sake of performance.
EDIT: It seems it can happen with the standard library's RwLock too[1]. Perhaps it's a Linux peculiarity?
These things are often tricky to reproduce, but the deadlock should happen from the spec, right? Essentially, a pending write lock should block new read locks from being acquired -- this is a feature, if it didn't do that, the write lock would starve -- so if you try to acquire the second read lock while the write lock is pending, you get a deadlock.
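A sketch of that sequence (whether it actually deadlocks depends on the lock's bias, as discussed below):

    use std::sync::{Arc, RwLock};
    use std::{thread, time::Duration};

    fn main() {
        let lock = Arc::new(RwLock::new(0));
        let l2 = Arc::clone(&lock);
        let _r1 = lock.read().unwrap(); // first read lock
        let writer = thread::spawn(move || {
            let _w = l2.write().unwrap(); // queues behind _r1
        });
        thread::sleep(Duration::from_millis(100)); // let the writer queue up
        // With a write-preferring lock, this blocks behind the pending
        // writer while we still hold _r1: deadlock.
        let _r2 = lock.read().unwrap();
        writer.join().unwrap();
    }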
> These things are often tricky to reproduce, but the deadlock should happen from the spec, right?
Pretty sure the spec is entirely silent on the subject.
> Essentially, a pending write lock should block new read locks from being acquired -- this is a feature, if it didn't do that
That is an implementation detail of the lock. I don't think POSIX even specifies the bias of its rwlock, and Rust explicitly documents that it does not and delegates to the OS.
Plenty of RwLock implementations will, in fact, starve writers under heavy contention (both read-bias and write-bias have their pros and cons).
You’re right, I misremembered this completely, that’s very good to know. I’ve recently mostly used parking_lot, which guarantees prioritizing writers. That’s also the behavior where the post I replied to wondered if parking_lot was missing a check — in fact no, it has an additional feature.
> It seems it can happen with the standard library's RwLock too[1]. Perhaps it's a Linux peculiarity?
I would not call it a “linux peculiarity” as any system could have any priority policy and could change it unless explicitly guaranteed, which this rarely is.
I'd just like to take a moment to compliment the author on repeatedly producing content that is deeply technical yet very entertaining. As someone who has used both Rust and Go in anger, I enjoyed this one and its predecessors a lot. A big step up from a lot of the typical blogspam.
> It's a pretty radical design choice, to force you to be aware that, yeah, since you're creating a new value (out of the two parameters), you will have to allocate. And so it forces you to allocate outside of the Add operation itself.
Forcing you to think about allocation is good for many situations.
Making you do an extra allocation is a bad way to do that. And the only way I can think of to avoid that would give up having the function be generic.
>Making you do an extra allocation is a bad way to do that. And the only way I can think of to avoid that would give up having the function be generic.
Sure, if you want to optimize that, then create your "foo" string with the extra capacity instead of doing `"foo".to_string()`, eg `{ let foo = String::with_capacity("foo".len() + "bar".len()); foo.push_str("foo"); foo }`. It has nothing to do with the generic `add` function; it can stay generic as it is just fine. Even a function specialized for adding String and &str could not go back in time and change how much capacity the String was created with.
Edit (1h later): It's also worth noting that:
- This desire of `Add::add(&str, &str) -> String` only seems to come up with people learning the language and bringing expectations from other languages. I've never seen production code that would've benefitted from it.
- As much as the ship has sailed with libstd functions hiding allocations via the global allocator where you may not expect them, the less new cases get added the better. So I'd rather not have `impl Add<&'_ str> for &'_ str { type Output = String; }` added to libstd.
- If you really do have two str literals that you want to concatenate, just use `concat!()`. Not only does it work, but also it works at compile-time and without any allocation (it evaluates to a `&'static str`).
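For example:

    fn main() {
        let s: &'static str = concat!("foo", "bar"); // evaluated at compile time, no allocation
        assert_eq!(s, "foobar");
    }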
> It has nothing to do with the generic `add` function; it can stay generic as it is just fine. Even a function specialized for adding String and &str could not go back in time and change how much capacity the String was created with.
The only reason we're converting a parameter to a String outside the function is because it has that awkward signature.
I was suggesting a specialization for adding &str and &str that requires minimal code at the call site and handles the capacity issue inside the function.
But that means we're no longer just applying + to types that have Add, we've given up all that convenience because it's not compatible with performance. This wouldn't have to be a tradeoff if Add worked differently; having to make this decision is a flaw with Rust. And you could still make it explicit that allocation is happening.
> - This desire of `Add::add(&str, &str) -> String` only seems to come up with people learning the language and bringing expectations from other languages. I've never seen production code that would've benefitted from it.
It's not a big problem that strA.into_string() + strB tends to allocate twice, and you wouldn't notice it in production, but it's still a waste of cycles caused by Rust's abstractions. Not even the abstractions, really, they left that implementation out as a reminder to the coder. It's a bit of an icky tradeoff.
What do you propose: that fn add goes back in time, taps the allocator on the shoulder as it's allocating what will eventually be the first argument, and asks it to reserve a bit of extra capacity?
This is explained in the official docs: to avoid too many allocations when repeatedly concatenating strings.
> Implements the + operator for concatenating two strings.
> This consumes the String on the left-hand side and re-uses its buffer (growing it if necessary). This is done to avoid allocating a new String and copying the entire contents on every operation, which would lead to O(n^2) running time when building an n-byte string by repeated concatenation.
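Concretely, a sketch of the case the docs describe:

    fn main() {
        // Each `+` consumes `s` and reuses (growing) its buffer, so
        // building the string is amortized O(n), not O(n^2).
        let mut s = String::new();
        for part in ["foo", "bar", "baz"] {
            s = s + part;
        }
        assert_eq!(s, "foobarbaz");
    }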
Sure, but it would have been a legitimate choice to let + do that, and use other functions to handle those use cases where it matters. After all, I expect to get a brand new value from an addition. If
    a = 2
    b = 3
    c = a + b
I certainly do not expect "a" to be modified. So why would I expect it with strings? It would be no problem to warn the user about repeated concatenation with +, as I recall Java with its immutable strings does just that.
>After all, I expect to get a brand new value from an addition. [...] I certainly do not expect "a" to be modified. So why would I expect it with strings?
To be clear, your expectations are not betrayed when doing String + &str -> String in Rust. The String addend is consumed by + and attempting to reuse it will produce an error. So it's not like you notice that the String you used to have has implicitly been modified.
We call this being consumed, "moving". And in Rust actually your integer 'a' was the special case. Every value gets moved when you add it to something else, but this integer 'a' is still available to be used after it was moved.
What happens there is, the integer types implement Copy which is a special Trait that says, "I promise I don't have any meaning beyond my actual pattern of bits in a memory location or a register or whatever". If the type implements Copy then when 'a' gets moved, it is logically still fine to keep 'a' around anyway in case anybody still wanted it, as you can freely make or destroy copies of the bit pattern and this type has no meaning beyond the bit pattern.
You can derive Copy (or implement it) on your own types if the Rust compiler can see why that's a reasonable claim to make for your type, otherwise you can't. So String couldn't be Copy, because it clearly has a reference to a vector (of bytes) inside it making it work and so the bit patterns are just pointing at the memory address of that vector of bytes.
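A minimal sketch of the difference:

    fn main() {
        let a = String::from("foo");
        let c = a + "bar"; // `a` is moved (consumed) here
        // println!("{a}"); // rejected: borrow of moved value `a`

        let x = 2;
        let y = x + 3; // i32 is Copy, so `x` is still usable
        println!("{c} {x} {y}");
    }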
My expectations are betrayed, but I do appreciate that Rust at least stops me if that expectation matters.
Really, whenever I concatenate something variable-length like a string, I know there's potential allocation under the hood. There has to be. What I don't get is why that allocation has to be hidden as a mutation of a, rather than just be explicit as yeah, c is a new thing. I expect functions to return new things rather than mutate things via side effects. Isn't it better to let addition stay a function, rather than overload of + being slightly more convenient/efficient for building up strings?
"Sure, but it would have been a legitimate choice to let + do that"
Yes, it is a legitimate choice indeed. Many fine languages have made that choice.
But it is also a legitimate choice not to. We need at least some languages that don't do that. We really don't have very many that are this careful about allocation. Most of programming language history since C and C++ has been trying to "fix" this problem, and make allocations easier. Rust has chosen to be one of the languages that stays at the highest level of care about allocations. As a result, it will not just casually allocate memory for strings.
That's what this is; a design choice. If you don't like it, don't use Rust. I generally don't use Rust, because my tasks don't call for this level of control. But if I ever do have a task with that level of control, I know where to go get the tools to solve that problem, and I'm glad that someone made that choice. I'd be in real trouble if nobody chose that.
(Another example of a common pattern in programming language history can be seen happening here, too. C and C++ were this careful about allocations for the most part, but they made it really hard to use at all, and borderline impossible to use safely. Then, people incorrectly attributed that difficulty to being careful about allocations, and subsequent programming language design went in the direction of getting away from that level of care. But what we see in Rust is that if you are that careful, and add in a better toolset to deal with it, you can blunt the disadvantages while reaping the advantages. The disadvantages don't entirely go away, but if you reduce the "costs" in the costs/benefits analysis, the benefits become more attainable for engineers, especially when tooling can be developed to also increase the benefits. Haskell doesn't do this with allocation, but it has a similar story for things like how hard strong typing is. IMHO the mid-1990s to 2000s in programming languages were a story of running away from hard problems, but the big story in the 2010s has been the running towards hard problems, and I'm liking the results.)
> We really don't have very many that are this careful about allocation.
You're missing the point. I don't think this is being careful about allocation. There is potential allocation either way - when your string (or vec) grows, allocation under the hood may be necessary too.
Do you see my point? Surely doing the allocation inside "a" isn't better than just allocating a new "c" for explicitness' sake. It may be better for performance's sake, but that's not justification enough for making (+) an impure function as I see it. We could always just write it out as .append(b) or whatever if that's what we wanted.
You're welcome to dislike it because it's "impure". Actually, depending on how rigidly you define purity, either both implementations are pure (both return the same output for the same input) or both implementations are impure (both modify global state via the implicit global allocator).
But you're mistaken in thinking it's something obvious such that others should agree with you. As I said previously, the reality is that you cannot write a program that lets you differentiate whether String + &str reused the same String or allocated a new one. So your hangup is entirely over the documentation revealing the extra information that it has the efficient implementation, not the inefficient one.
The entire point of passing in a String, according to the article, is to "force you to be aware that you will have to allocate".
But you're saying that an extra hidden allocation doesn't matter and you shouldn't have a "hangup" about it? Trying to follow the same rules as the article seems reasonable to me, not a "hangup".
>The comment you replied to was talking both about the negatives of hidden allocations and about purity
That is incorrect. The comment was requesting that `Add::add(String, &str) -> String` should be impled by creating a new String instead of mutating an existing one. My comment pointed out why this request was meaningless.
That can not typecheck in Rust, unless you use something really odd like `Into<Cow<'a, A>> where A: Add<B>` as your input types.
And in languages where it does, you need either ubiquitously mutable strings (which everyone’s moved from because it generally screws you up) or refcounting.
I think this is right. I haven't written Rust before.
I copied Add to RAdd because you can't implement traits you don't own on types you don't own. And I didn't bother implementing it for numbers.
There's a trait implementation that takes String and &str (directly copied from Add), and there's a version that takes &str and &str. Both of them return String.
The generic function is just as generic as it always was, but of course you can't do this with actual Add without changing the standard library. You'd have to make the function less generic instead.
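A sketch of what that might look like (my illustration of the idea, not the commenter's actual code):

    trait RAdd<Rhs = Self> {
        type Output;
        fn radd(self, rhs: Rhs) -> Self::Output;
    }

    // Mirrors the standard library's `impl Add<&str> for String`.
    impl RAdd<&str> for String {
        type Output = String;
        fn radd(mut self, rhs: &str) -> String {
            self.push_str(rhs);
            self
        }
    }

    // The impl that Add lacks: &str + &str, allocating once with the
    // right capacity inside the function.
    impl RAdd<&str> for &str {
        type Output = String;
        fn radd(self, rhs: &str) -> String {
            let mut s = String::with_capacity(self.len() + rhs.len());
            s.push_str(self);
            s.push_str(rhs);
            s
        }
    }

    // The generic function stays just as generic:
    fn concat2<A: RAdd<B>, B>(a: A, b: B) -> A::Output {
        a.radd(b)
    }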
> And in languages where it does, you need either ubiquitously mutable strings (which everyone’s moved from because it generally screws you up) or refcounting.
It's the same String as ever. Owned and mutated by adding.
No, they won't in standard implementations. We are not exactly allocating on every operation: the allocated buffer doubles each time it grows, so it's amortised and not O(n^2). There is a downside for non-trivial example code.
If you want a function to do that, you're absolutely at liberty to write one. You could even write a zero-overhead wrapper around the references to allow for overloading. Rust won't let you implement Add<&str> for &str though, you'd need to use your wrapper.
I wouldn't recommend doing that, though, and not just because of "surprise" allocations: in my experience writing Rust code, it's fairly rare to need to add strings together. I'm much more likely to use a format macro.
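For example, rather than reaching for `+`:

    fn main() {
        let (a, b) = ("foo", "bar");
        let s = format!("{a}{b}"); // one allocation, no operator overloading needed
        assert_eq!(s, "foobar");
    }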
Wow, what a horrid and cool way to create a function that accepts generic types in Go :-)
    func add(a interface{}, b interface{}) interface{} {
        if a, ok := a.(int); ok {
            if b, ok := b.(int); ok {
                return a + b
            }
        }
        if a, ok := a.(string); ok {
            if b, ok := b.(string); ok {
                return a + b
            }
        }
        panic("incompatible types")
    }
You're a bit (~10 years) late to the game of complaining about having to use `interface{}` as a substitute for generics in Go (although it has worked pretty well in most cases). Go 1.18 will have "real" generics, so stuff like this can be expressed more elegantly...
brouhaha go bad. Why are you writing an add function when 1 + 1 will do?
I mean I know there's better examples out there, but in practice I've not missed a lack of generics in Go yet. Mind you, I'm not writing libraries for e.g. json parsing.
Actually there is one place in my application where I use interface{}, it's a bit of code that converts values to XML strings, but it needs different logic for different types. I don't think generics will solve that: https://go.dev/play/p/TtwyNZ7oOPx
I think constraints, which are part of generics, will help; at least you'll be able to limit the types your functions accept. But otherwise no, generics do not solve sum types by themselves.
This is a very long post! The bit that actually addresses the title, after all the "here's a thing I tried, here's the weird error message, here's the mistake I made" business, starts way down towards the bottom, with this text:
Before we move on, I'd like to congratulate you on reading this far. Believe it or not, you're in the minority!
The title is very misleading when the content is a very, very general discussion of compile-time versus runtime errors, mostly focusing on problems that Go doesn't catch at compile time.
TL;DR: the mistake Rust doesn't catch is recursive mutex locks.
It does seem like it would be good to catch that one, and that the borrow checker in some form ought to be able to do it -- it should be able to track the static scope of a mutex lock, and block further attempts to lock it inside that scope.
As others have noted, that's just how I write articles. More specifically, I have an end goal in mind, and to get there I need to establish a scene: make sure that we share a lot of context so that we can consider a problem together from the same angle.
Most articles don't do that, and so they're "short and to the point", but only to a tiny audience that already has the same shared culture. There's always readers complaining that they knew half of it and so I "wasted their time", but they forget about the other half, and also other readers.
Ideally, to please everyone, I would have to write a tailored article /just for them/, filling just the gaps they have. I do that on occasion, but I charge a lot of money for it :)
Absolutely -- sorry if I came across as overly negative! I was trying to stay reasonably neutral. Your approach isn't one I'm keen on but I can see your subscribers get a ton of value from it.
Rust will catch problems that would be UB in C and would continue executing in an invalid state. Crashing is preferable in such scenarios because you avoid potential security issues stemming from the UB and the developers can find out about the problem.
I bet Rust programs crash more frequently than C programs overall. Undetected UB is far less desirable than clean panics though.