I think readability is a major part of why java was hugely successful: C programmers could read java code and immediately understand what the code did and were able to make minor modifications to the code.
This is very difficult with Rust code. Rust "learned" too much from Perl in this regard.
This isn't some abstract "readability", though; this is "how close are your languages in the evolutionary tree", where ALGOL is the Proto-Indo-European of quite a lot of languages, but not all.
Now like I said, I'm a Rust programmer, so of course I understand PartialEq's definition there, but, is it really hard to see what's going on as a C programmer? There's a lot of magic you don't have, you don't have traits, or references, or the Self type or the self keyword, or really any of the mechanics here, and yet it seems pretty obvious what's going on, doesn't it?
I literally don't remember any unpleasant surprises of this form when learning Rust. Actually mostly the opposite, especially for !=. I thought, from experience with other languages like Java, C++ or Python: OK, obviously at some point the other shoe drops and they show me that I need a deep comparison feature. Nope, Rust's comparison operators are for comparing things; they aren't used to ask about some surface implementation detail like the address of an object in memory unless you specifically ask for that. If Goose implements PartialEq, so that we can ask if one Goose is the same as another Goose, then HashSet<Goose> also implements PartialEq, so we can ask if this set of geese is the same as another, and HashMap<Goose,String> likewise, so we can ask if these two maps from geese to strings are mapping the same geese to the same strings.
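To make that concrete, here is a minimal sketch. The Goose struct and the helper functions are made up for illustration; the point is only that equality on the element type carries over to the standard collections. (Note that HashSet and HashMap keys need Eq and Hash as well as PartialEq.)

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical type for illustration; the derived PartialEq compares field by field.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Goose {
    name: String,
}

// Hypothetical helper: build one goose from a name.
fn goose(name: &str) -> Goose {
    Goose { name: name.to_string() }
}

// Hypothetical helper: build a set of geese from a list of names.
fn flock(names: &[&str]) -> HashSet<Goose> {
    names.iter().map(|n| goose(n)).collect()
}

fn main() {
    // Same geese inserted in a different order: the sets compare by content.
    let a = flock(&["Ada", "Bo"]);
    let b = flock(&["Bo", "Ada"]);
    println!("sets equal: {}", a == b);

    // Maps compare keys and values, not addresses.
    let mut m1: HashMap<Goose, String> = HashMap::new();
    m1.insert(goose("Ada"), "grey".to_string());
    let mut m2 = m1.clone();
    println!("maps equal: {}", m1 == m2);

    m2.insert(goose("Ada"), "white".to_string());
    println!("maps differ: {}", m1 != m2);
}
```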
Also, it has often been surprising to me how code that looks like it's entirely read-only will try to move stuff.
E.g, this works:
let numbers = [1, 2, 3];
for n in numbers {
    println!("{}", n);
}
for n in numbers {
    println!("{}", n);
}
This doesn't work:
let numbers2 = vec![1, 2, 3];
for n in numbers2 {
    println!("{}", n);
}
for n in numbers2 {
    println!("{}", n);
}
I understand why that is and I know how to fix it, but I don't like it. There is a long list of things in Rust that I find understandable but also unpleasant. In that sense it really is a worthy successor to C++. Smart people making unpleasant decisions for good reasons.
I think your mental model is wrong if you thought a consuming operation like IntoIterator::into_iter is read only.
The reason the first one works is that [T; N] is Copy if T is Copy, and i32 is Copy so therefore [i32; 3] is Copy. So it's actually consuming two arrays, each of those for loops consumes the array to make an iterator, but since it's Copy it just leaves a perfectly good copy of the array behind in the variable. The compiler can see what we're doing here and won't pointlessly duplicate the array (presumably) but that's conceptually what's happening.
The second one doesn't work because the Vec<i32> isn't Copy, so, we consume it and now it's gone, when we reach the second for loop the variable numbers2 is gone already.
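For anyone following along, the usual fix is to iterate over a reference, so the loop borrows the Vec instead of consuming it. A small sketch (the total helper is hypothetical, added only to show that a borrowed slice can be looped over repeatedly):

```rust
// Hypothetical helper: borrows the numbers, so the caller keeps ownership.
fn total(numbers: &[i32]) -> i32 {
    let mut sum = 0;
    for n in numbers {
        sum += n;
    }
    sum
}

fn main() {
    let numbers2 = vec![1, 2, 3];

    // `&numbers2` yields references to the elements; the Vec is not consumed.
    for n in &numbers2 {
        println!("{}", n);
    }
    for n in &numbers2 {
        println!("{}", n);
    }
    println!("{}", total(&numbers2));

    // Only this loop takes ownership; numbers2 cannot be used after it.
    for n in numbers2 {
        println!("{}", n);
    }
}
```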
There is a difference between a variable and a value. The value is definitely read only or you would need a mut modifier. The variable binding changes in a similar way to shadowing it with a new 'let' on the same name.
This would imply that the variable can get its value back once the new binding is out of scope. But that is not the case as this doesn't compile:
let s1 = String::from("abc");
{
    let s2 = s1;
}
println!("{}", s1); // borrow of moved value: `s1`
But I do understand what you're getting at. In a sense, a move is not a runtime concept at all. The compiler simply considers the variable as no longer readable unless and until a new value is actually written to it.
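A quick sketch of that last point, assuming the variable is declared mut so it can be written again. The function name is made up for illustration.

```rust
// Hypothetical helper showing that a moved-from variable becomes usable
// again once a new value is written to it.
fn move_then_refill() -> (String, String) {
    let mut s1 = String::from("abc");
    let s2 = s1; // the String moves into s2; s1 is now unreadable
    // println!("{}", s1); // would not compile: borrow of moved value
    s1 = String::from("def"); // writing a new value makes s1 readable again
    (s1, s2)
}

fn main() {
    let (s1, s2) = move_then_refill();
    println!("{} {}", s1, s2);
}
```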
I find Rust extremely unreadable and can detail exactly how:
I know ripgrep is written in Rust and works really well and "does one thing", so I'll go to its repo ( https://github.com/BurntSushi/ripgrep ) and read through a bit. First, I'm looking for a "src" folder, but that does not exist, so right off the bat I am a little uncomfortable. Nothing in the root folder looks like it would be the source. What are "complete" and "scripts" and "pkg" folders? Because those (in that order) would be what I check.
"complete" looks like command line completion. The "scripts" directory has one file, "copy-examples". "pkg" contains folders "brew" and "windows". Still lost...
I thought "crates" was like modules for Python, so I assumed it would only contain dependencies and stayed out of there.
Finally, I relent and open "build.rs" which I suspect is something like "build.ninja" and I can figure out which node has source files. Stupid me. "build.rs" IS the source file.
Oh. No, actually, it's not all of ripgrep, in fact. It's the tip of the iceberg. The source is in crates/core/app.rs, and build does manpage compilation, registers shell completions, etc.
So, line one. I don't know what the return value App<'static, 'static> is. I understand it's an App object, and the tick (no idea what it's called; this is starting to feel like a fracture mechanics lecture) I know has to do with ownership. It's tough to see how there are two subtypes in App<> when the assignment of app has ten functions that assign attributes. I can certainly READ what is being assigned to the App object. I wouldn't be able to WRITE any of this so far. Unimportant, it's not the meat and potatoes of ripgrep.
Now I'm scrolling, looking for something cool to analyze and run across Vec<&'static str>. Okay, I thought tick was ownership, but I also thought ampersand was ownership-related. Maybe one is mutability? (This is like knowing how to play music really well but now learning a totally foreign instrument.)
Skimming, it looks like this whole file is the argc/argv parser. I'll look for something interesting in main.rs, search.rs, and subject.rs (the last because it's an unusual name).
struct Config {
    strip_dot_prefix: bool,
}
Well. I can read that! Next line...
impl Default for Config {
...
...not that. It's probably something like a class method.
As I'm reading through, it feels like a LOT of kicking the can down the road and verbosity. There's a function binary_detection_explicit(self, detection) that just assigns detection to self.config.binary_explicit and returns self. binary_detection_implicit() does basically the same with a different assignment member. getters and setters, roadblocking readability since the dawn of computer science... (a few lines later, the function has_match() returns the value of has_match. ugh. This is why I prefer crudely written C to textbook C++.)
I see .unwrap() in a line. This tripped me up during the tutorial; still no idea what it does. Sure I can look it up, but there are a hundred other terms to learn: unwind, map, Result (OH! The two App return types are probably Ok and Err ... maybe?), convert, concat, const, continue, ...
subject.rs, btw, can probably be written in three lines of really dense Python or ten lines of really well-written Python with a few other lines of Doxygen. Instead, it's about 90 lines of Rust with another 50 lines of comments.
Finally, I find something noteworthy.
fn search(...) ...
fn iter(...) ...
    for subject in subjects {
        searched = true;
        let search_result = match searcher.search(&subject) {
        ....
This is all extremely readable and just feels like a standard queue in any language. But then there are lines like:
if let Some(ref mut stats) = stats {
    *stats += search_result.stats().unwrap();
}
I give up. Are you assigning the variable stats to itself? by reference but mutable? If Rust doesn't have pointers, what is *stats? unwrap() is no longer the most confusing part of this.
-----
Postface, I'm originally a mechanical engineer who eventually got roped into writing driver software, so C is my comfort zone and I don't need anything much more complicated than "set bits here, read registers, use a 40-year-old communication protocol, handle errors".
There's simply no easy analog between Rust and Python ... or Rust and C ... or Rust and SQL ... or Rust and PRACTICE ... or Rust and any other language I've learned. I understand a lot of the CWE Top 25 software errors are mitigated by Rust and it's not just a theoretically correct but unusable language, but there hasn't been a strong reason to learn it, and the learning curve is made steeper by preconceived notions of every keyword.
-----
I forgot to compare with grep.c
Extremely readable. I know right away which part is args, where memory allocation happens, where file handling occurs, and then the "hard part"---where the magic happens---is the last page on my monitor. I can just stare at that and mentally deconstruct until pieces fall into place... Look up regcomp()? check. Look up grep_tree()? check. Read the kernel.org discussion on why grep_tree() was written and how it relates to ensure_full_index(), which explains some of the other variables.
The whole thing sits nicely in a few pages with terse but clear comments (unlike this post) and looks familiar even though I've never seen the source before.
Only because of the comments did I realize that calling drink.unwrap() will panic (throw an error?) if that argument is empty. So, unwrap is just the "if (... == nullptr) { return ...; }" of Rust, it seems.
But it's somehow an alternative to .expect() and has its own alternatives, unwrap_or/unwrap_or_else/unwrap_or_default, and it also seems optional. And my time in the rabbit hole is over; my children need to go to bed.
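For what it's worth, here is how that family seems to fit together, as a small sketch on an Option. The helper names are made up. One caveat on the nullptr comparison: unwrap does not return early from the function, it panics, so it is closer to an assert than to an early return.

```rust
// Hypothetical helper: the drink if there is one, otherwise a fallback.
fn drink_or_water(drink: Option<&str>) -> &str {
    drink.unwrap_or("water")
}

// What `unwrap` amounts to, spelled out by hand: the value, or a panic.
fn unwrap_by_hand(drink: Option<&str>) -> &str {
    match drink {
        Some(name) => name,
        None => panic!("called unwrap on a None value"),
    }
}

fn main() {
    let tea: Option<&str> = Some("tea");
    let nothing: Option<&str> = None;

    // Value present, so neither of these panics.
    println!("{}", tea.unwrap());
    // Same as unwrap, but with a custom message if it ever does panic.
    println!("{}", tea.expect("a drink was required"));
    println!("{}", unwrap_by_hand(tea));

    // The non-panicking alternatives supply a fallback instead.
    println!("{}", drink_or_water(nothing));
    println!("{}", nothing.unwrap_or_else(|| "juice")); // fallback computed by a closure
    println!("{:?}", nothing.unwrap_or_default()); // the type's Default; empty string for &str

    // nothing.unwrap() would panic here.
}
```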
This is really helpful for understanding what you mean. If you would be interested, I could go through all that and explain each piece. But the normal answer of “spend a few hours reading the Rust book, it will be faster than trying to muddle through” is fine but misses the point.
You are absolutely right. The foreign-ness you are seeing is the ML side of Rust’s roots. Rebinding, restructuring, small wrapper types, and of course all the functional programming bits and bobs that people coming from, say, Haskell might see as being quaintly simple, but arcane to anyone else.
ripgrep is way more abstracted than most greps because ripgrep splits a whole bunch of its functionality into crates. The idea being that others can then re-use the reasonably sophisticated infrastructure that ripgrep uses to write their own bespoke grep tools. ripgrep internals are more like a bare-bones and under-developed grep framework. If you go back to the initial version of ripgrep (0.2.0), you'll probably find it smaller and significantly easier to read. There's a lot less abstraction. FWIW, I generally regard my abstraction attempts here to be a partial failure. The hit to code readability has been quite large. I also fucked up at least one abstraction boundary. On the other hand, it has enabled people to maintain very small patches to make ripgrep work with other regex engines.[1] (And of course, ripgrep uses the abstraction to make it work with both Rust's regex crate and PCRE2.)
I wouldn't call Rust an easy language to learn. It could be easier depending on your background. For example, if you have an ML/Haskell and C++ background, then Rust will probably introduce very few things you haven't seen before (the main one being the borrow checker). But if your background is, say, Python, Java and C, then there are going to be oodles of things in Rust that will be novel to you. That basic vocabulary will be more difficult to acquire.
I'm somewhat surprised you feel confident enough in your understanding of Rust to declare you could replace 50 lines with 2 or 3 lines in Python though. Meh.
> getters and setters, roadblocking readability since the dawn of computer science...
They are tools of encapsulation. I don't treat them as boiler-plate. I treat them as things that I use when I care about encapsulation. And when I don't care about encapsulation, I don't use them. My personal style is to lean heavily on encapsulation.
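As a rough illustration of the pattern under discussion (not ripgrep's actual code; every name below is hypothetical), a builder whose setters take self, change one field, and return self looks like this:

```rust
// Hypothetical types for illustration only.
#[derive(Debug, Default)]
struct Options {
    binary_explicit: bool,
    strip_dot_prefix: bool,
}

#[derive(Debug, Default)]
struct OptionsBuilder {
    options: Options,
}

impl OptionsBuilder {
    // Each setter takes `self` by value, changes one field, and hands `self`
    // back, which is what makes the calls chainable.
    fn binary_explicit(mut self, yes: bool) -> OptionsBuilder {
        self.options.binary_explicit = yes;
        self
    }

    fn strip_dot_prefix(mut self, yes: bool) -> OptionsBuilder {
        self.options.strip_dot_prefix = yes;
        self
    }

    fn build(self) -> Options {
        self.options
    }
}

impl Options {
    // A getter: code outside this module could read the flag through this
    // method without having access to the (private) field itself.
    fn binary_explicit(&self) -> bool {
        self.binary_explicit
    }
}

fn main() {
    let options = OptionsBuilder::default()
        .binary_explicit(true)
        .strip_dot_prefix(false)
        .build();
    println!("{:?} {}", options, options.binary_explicit());
}
```

The fields are private by default, so once a type like this lives in its own module the methods are the only way in. That is the encapsulation being traded against verbosity.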
love your code (erm, the product of your code, at least), use it every workday, literally
also, true about rewriting subject.rs, that was a bit of hyperbole about how everything spreads out vertically and seems to do one task per screen-full of text. I'd need at least as many lines as functions and classes and returns, so about 40 keeping everything clean and not combining/deleting classes. which is what you have if you combined your multiline statements.
(aside, i don't need to know how to read rust to understand subject.rs. The comments describe everything in plaintext)
Do you expect there to be an easy analog between any two random languages? Even two languages as similar as javascript and python I would not expect to immediately jump in and understand something written in one by just kinda guessing as to what something I didn't recognise meant, or assuming that just because something is written the same as in one language it means the same as in the other.
Yep. Honestly, I don't find it too difficult to parse new languages even if they aren't C-like or Python-like (or Matlab-like or BASIC-like). I don't know Java, but I can read the Falstad circuit simulator source. It's just... very recognizable code patterns---almost a dialect of C++.
I'm not a database expert, but I became comfortable with SQL this past year so I could better understand how my company's office software is structured. I can cobble together enough Javascript to create my own online sheet music book using abcjs, a directory full of ABC files, and jquery. I was able to barge through enough C# to write a crude 2D game during a game hackathon in college.
In fact, AFTER learning Python, a lot of C became easier because I knew HOW a data structure should behave, which then influenced how I would create it from structs and functions.
However.
I get hung up on Kotlin. Kotlin reminds me a lot of Rust in its syntax, and Android has the same habit of renaming every single concept.
In general, the more keywords and syntaxes two languages have in common, the easier to read. (In SQL's case, I think the relative lack of keywords and rigid query structure made this a nonissue.)
Well there it is. Rust is difficult not because it's inherently difficult but because it's not the spitting image of the languages you've learned and you're not interested in learning it (which is totally fine). Things like putting dependencies (instead of program source) in 'modules' seem intuitive not because they're inherently intuitive or common but because you've taken the time to learn Python.
IMO ripgrep is a poor example of project layout both because it does "non-standard" things and because it encompasses more than the C-based grep you're comparing it to. A simple project will almost always have its source in a directory named "src", and a project with multiple subprojects (like ripgrep) will typically not shove them into a "crates" directory.
Compared to GNU grep, the ripgrep project serves as the repository for a bunch of libraries as well as the ripgrep binary. So there's already a lot more moving parts than the name "ripgrep" might otherwise suggest.
> I don't know Java, but I can read the Falstad circuit simulator source.
If you can read Java, I'm surprised impl blocks threw you for a loop, as the first comparison that comes to mind is with Java interfaces. A quick look at the Falstad simulator suggests it's using a pretty small subset of the Java language (e.g. no generics, interfaces, or exceptions), which would certainly make it seem a lot like C. Not all Java is like that.
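To spell out the interface comparison, here is a minimal sketch. The struct has the same shape as the one quoted earlier in the thread, but the default value and the Describe trait are assumptions made up for this example.

```rust
// Same shape as the struct quoted in the thread.
struct Config {
    strip_dot_prefix: bool,
}

// Reads roughly as "Config implements the Default interface".
// The value chosen here is an assumption, not taken from ripgrep.
impl Default for Config {
    fn default() -> Config {
        Config { strip_dot_prefix: false }
    }
}

// Hypothetical trait, comparable to declaring a Java interface.
trait Describe {
    fn describe(&self) -> String;
}

// Comparable to `class Config implements Describe` in Java, except that
// the implementation lives in its own block instead of in the type's body.
impl Describe for Config {
    fn describe(&self) -> String {
        format!("strip_dot_prefix={}", self.strip_dot_prefix)
    }
}

fn main() {
    let config = Config::default();
    println!("{}", config.describe());
}
```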
& and *? Akin to reference and dereference in C, and given that you can overload both the comparison to C++ seems pretty apt.
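Applied to the if let line that caused the confusion, a simplified sketch might look like the following. The accumulate function and the use of plain u64 counters are assumptions for illustration; the real code adds up a stats type, not integers.

```rust
// Hypothetical simplification: `stats` is an optional running total.
fn accumulate(mut stats: Option<u64>, results: &[u64]) -> Option<u64> {
    for r in results {
        // Pattern match: if the outer `stats` holds a value, bind a mutable
        // reference to that value. The inner `stats` shadows the outer one
        // inside the braces; nothing is assigned to itself.
        if let Some(ref mut stats) = stats {
            // `*` dereferences the reference to reach the number behind it,
            // much like `*p += x` through a pointer in C.
            *stats += *r;
        }
    }
    stats
}

fn main() {
    println!("{:?}", accumulate(Some(1), &[2, 3]));
    println!("{:?}", accumulate(None, &[2, 3]));
}
```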
In any case, for more direct explanations the API reference is a better place to look. Two short sentences explain what unwrap and expect do, and what the differences are.
Oh interesting, I'd never seen that before. However I'm not a monorepo kinda guy and to me the ripgrep repo already looks way too busy. By the time you've enough crates to shove them in a 'crates' directory I think it's worth splitting them up into different repos.
Blech, no. That would make the maintenance burden of dealing with those crates even worse than it already is. And just imagine having issues spread out all across different repos. Oh my goodness. It would be absolute madness.
I have split out some crates that obviously stand alone. The `termcolor` crate was born inside of ripgrep but now it's in its own repository. The `ignore` crate is another candidate for splitting out because it has broad utility, but it needs a lot of work before it's something that can stand alone as its own project IMO. But most of the rest of the crates, i.e., grep-{matcher,searcher,printer,regex,pcre2} are essentially what ripgrep is. Those will never get split out. If I did, the ripgrep repo wouldn't actually contain the most interesting parts of ripgrep! Not good.
With all that said, I am a monorepo guy. Not for ideological reasons. For practical reasons. I also maintain dozens of crates, most of which are in their own repositories. The only reason I do a different repository for each is because that's what the custom is and it makes collaborating with others easier. If the code was just for me, it would all be in one big mono-repo. No question. It would be A LOT less work.
> With all that said, I am a monorepo guy. Not for ideological reasons. For practical reasons.
Yep. It's a balance. The big problem I have with large(r) repos is that everything git related is just slower and takes way more space. I've got some git prompt magic (using gitoxide of course) and it just drags with the rust repo.
At one point I wanted to make some changes to a man page in FreeBSD. Sure, it's neat that there's thirty years of history. But it's thirty years of everything's history. Shallow cloning made it a bit more bearable, but the UX for trying to grab just a directory or two from a repo is still pretty gnarly.
But it's all just emacs vs vi all over again (and I've settled on helix and sublime anyhow).
Cargo is quite bad at true monorepos, so in some sense I agree, but I think there’s often a sweet spot between the two where this makes sense. Some projects are just inherently larger, and breaking them down helps, and spreading the repos out hurts. Just depends.
I've been learning Spanish for about the last six years. I'm reasonably effective at it. Not fluent, but I can speak and understand it well enough that I often don't notice how easy it's gotten for me. A while back, I tried reading something written in German and was actually surprised at how little I understood. Clearly I can understand not-English, why was German so hard? I spent a little while trying to parse out each word, guess common roots, vaguely remember the little bits of German grammar I've absorbed over the years, and so on. It was illuminating. I could not just force myself to understand German by relying on my success learning Spanish. (This is a real story, btw. I really did this.)
This comment reminds me very strongly of myself, trying to read German. You know C, why shouldn't you be able to understand Rust? But they're not the same thing, and they communicate different things in different ways. There are concepts that simply do not map directly. Someone could very easily write the exact same post from the perspective of a Rust programmer starting with C. Where is the equivalent to Cargo.toml? What is this Makefile thing, and why doesn't it look like a configuration file? Etc. No amount of just forcing yourself to try is going to provide clarity unless you already understand the idioms and structure at play.
With that said, I have written embedded Rust and it is an amazingly freeing experience. Writing C right now feels like trying to balance a knife on my finger. With Rust I can just get work done. I strongly recommend you spend the time to learn it. Even if it doesn't convince you that C's days are numbered, it will make you a better programmer by training you to think about safety and correctness up front.
My mom was a Spanish teacher in a reasonably hispanic part of Florida. I'm now a permanent resident of Germany. Your example resonates.
-----
I tried Embedded Rust about 4 to 6 years ago and found the experience underwhelming, as I was installing a TON of packages just to get Blinky and seeming to bypass a lot of standard Rust (no_std, no_main, use panic_halt as _, everything GPIO is unsafe, etc). The last straw was that imported Cargo packages had their absolute path somehow hardcoded in the symbol table, so my debugger would try to load, like... C:\Users\jacob\stm32-rust-thingy\app\solve.rs (and fail, because I'm not jacob and don't have his source code)
I'll give it another shot. I'm sure the "many eyes" principle (and better/more tutorials in the past few years) make the process easier now than then.
> I spent a little while trying to parse out each word, guess common roots, vaguely remember the little bits of German grammar I've absorbed over the years, and so on.
And German has a particularly special trap for "trying to parse out each word": trennbare Verben (separable verbs). Unless you know enough of the grammar to know when a piece of the verb has been unceremoniously punted to the end of the sentence, trying to use a dictionary to find the meaning of the verb of a sentence can be an exercise in futility.
> Clearly I can understand not-English, why was German so hard?
I'm reminded of a Mexican friend's mother: she and her aunts went to Europe and loved, loved Italy, partly because if an Italian spoke slowly and clearly they could understand what the person was saying.
My experience with C#, for instance, was that it was really easy to pick up, because snippets of code are generally C-like and procedural.
I've heard the same said about Go: if you know Java and JavaScript, you probably know Go already.
I found python not too hard because I knew perl sort of. Mostly enough experience with it to avoid it.
I suspect if you know C++ and an ML language, Rust is a breath of fresh air. But if you don't, it's not. It's gross. And the iteration times are very, very slow.
This self taught Mexican genius had an amusing description of ML languages.
You look at the code and go what? Then you cock your head and it's still what. And then you turn your head sideways and suddenly it makes sense. Then you look at code in another language and go what?
I fail to see how this is any worse than a whole bunch of sophisticated Python, C++ or Java applications I've tried to browse. It isn't as simple as a single code unit with a main entrypoint directly in a `src` folder, but I don't expect such things from reasonably complex software projects.
As for the source code itself, I agree, Rust does rely more on library functions than language constructs for some things (e.g. unwrapping optional values), but I'd argue this is just different from C rather than strictly worse when it comes to readability. I wouldn't complain about how difficult it is to read Greek as a person who only knows English.
Not sure I follow. How did you ever read or learn anything more than pseudo code or, at best, Python? Any language at all will have hurdles like those.
Rust has a starkly different vocabulary and syntax from other languages I've learned. The error handling is different from Python, C, Matlab, etc. Objects suddenly have ownership that has to be respected.
If I see Java source, it makes sense even though I don't know Java... at least up until how javax.swing layouts are constructed or I see a really obscure annotation without comments. If I had to add something to the Arduino 1.0 IDE, I'm confident I could figure it out in a day or two, because a class is a class, lines end with semicolons and comments look /* like this */, I can guess the existence of sort() or random() or an int being either int, Int, or Integer.
In contrast, a Rust "trait" being roughly equivalent to a C "struct" would not have been in my top 20 guesses. I can't recall trait being a keyword in ANY language I've used. I have no idea what Box and Arc are, because to me, those are "what you put tools in" and "plasma from too much potential difference".
Yes, I can check the reference (constantly) but at some point looking up every word or tick mark gets in the way of absorbing the story. The punctuation in particular bothers me, because it's not searchable via engine. There's no special Google result for
rust '
I can't be more clear. My ability to read/write code extends beyond pseudocode. Rust is just not an easy transition for a "primarily C" user, in my experience and opinion.
Arc is the most confusing name out of these for sure; most folks talk about “reference counting” in general and assume the count is atomic; with Rust’s performance and safety focus, Rc is the non-reference counted version and Arc is the atomically reference counted version. Additionally, there is a feature called “automatic reference counting” that some languages use, that uses atomic reference counting in the implementation.
While this syntax is unusual it is also not unique, OCaml uses it for similar purposes. Nobody loves this one, but also nobody was able to come up with something that could be agreed upon to be better.
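For readers who haven't met the tick before, a minimal sketch of what it marks. Both functions and their contents are made up for illustration. A name starting with a tick is a lifetime: 'static is the built-in one meaning "valid for the whole program", and 'a is a parameter you name yourself, much like a generic type parameter.

```rust
// String literals are baked into the binary, so references to them are
// valid for the whole program: their type is &'static str.
fn names() -> Vec<&'static str> {
    vec!["alpha", "beta"]
}

// 'a is a lifetime parameter. This signature says the returned reference
// is valid only as long as both inputs are.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let list = names();
    println!("{}", longer(list[0], list[1]));
}
```

The ampersand is the reference itself (a borrow); the tick only names how long that borrow is allowed to last.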
> Rc is the non-reference counted version and Arc is the atomically reference counted version.
Typo here I think Steve, Rc is reference counted, but it's not atomically reference counted. Hence the lack of an A. You've managed to instead say that it's not reference counted.
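A short sketch of the difference, with made-up function names. Both types count references; only Arc updates the count atomically, which is why only Arc can be handed to other threads.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Rc: reference counted, but the count is a plain (non-atomic) integer,
// so the compiler refuses to send an Rc to another thread.
fn rc_count() -> usize {
    let a = Rc::new(String::from("shared"));
    let b = Rc::clone(&a); // bumps the count; does not copy the String
    let count = Rc::strong_count(&a);
    drop(b);
    count
}

// Arc: the same idea with an atomic count, so clones can cross threads.
fn arc_sum() -> i32 {
    let data = Arc::new(vec![1, 2, 3]);
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let data = Arc::clone(&data);
            thread::spawn(move || data.iter().sum::<i32>())
        })
        .collect();
    // Each of the two threads sums the same shared vector.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("{}", rc_count());
    println!("{}", arc_sum());
}
```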
> C programmers could read java code and immediately understand
Uh... maybe? But Rust is more like a replacement for C/C++. So a better question is whether C#/Java programmers could read C/C++ code and immediately understand those & and *.
No. C existed a long time before java, so the natural career progression was for C programmers to become java programmers.
It is not at all as easy for C programmers to become Rust programmers as you need to learn a lot about Rust before you are able to understand the Rust code.