Interestingly, because Firefox uses stable Rust exclusively, this is the version of Rust that will be used to deliver Quantum when Firefox 57 releases on November 14. Rust 1.21 will be out by then (it releases October 12), but Firefox 57 will already be in beta by September 20.
They do plan on replacing it component by component, so I'm not sure how that will go in the case of Gecko. If Gecko is already designed with pluggable parts then it would make sense; otherwise they might have to replace a lot of moving parts (I'm not sure how large Gecko is or how much of it needs to function), but I'm guessing it'll still be done in chunks if it's too large.
Hi, I'm a Servo developer who worked on some of the Rust code that's in Firefox. Calls between C++ and Rust code in Firefox all go through `extern "C"` functions. Some of the code involves reference-counted smart pointers. We use RAII wrapper types in both Rust and C++ to ensure that refcounts are incremented and decremented correctly on either side of the FFI boundary.
P.S. This old blog post is not about Rust-in-Firefox, but it does cover a related topic: how the Servo browser engine (written in Rust) interacts with the SpiderMonkey JavaScript engine (written in C++ and embedded in both Gecko and Servo), including garbage-collected JavaScript objects:
Rust doesn't have any more runtime than C, and also doesn't have a GC. Servo has bindings into SpiderMonkey's GC so that the JavaScript stuff works properly, but that only applies to that part.
That said, I don't think there are any direct blog posts about it; Firefox's other code just sees Rust as C code, as far as I know. (I don't work on Firefox though so I could be wrong about some details.)
To clarify, Rust has its own ABI, just like C++ has its own ABI. And just like C++, you can expose a C ABI if you want by defining special functions. In C++, it's an `extern "C" { ... }` block, and in Rust, it's an `extern "C" fn foo() ...` function declaration. You can see an example here: https://github.com/rust-lang/regex/tree/master/regex-capi
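For the simplest possible illustration, here's a hedged sketch of what exporting a C ABI from Rust looks like (the function name is made up, not taken from regex-capi):

#[no_mangle]
pub extern "C" fn add_numbers(a: i32, b: i32) -> i32 {
    a + b
}

// The matching declaration on the C/C++ side would be roughly:
//     extern "C" { int32_t add_numbers(int32_t a, int32_t b); }

`#[no_mangle]` keeps the symbol name predictable so the linker on the other side can find it.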
> To clarify, Rust has its own ABI, just like C++ has its own ABI. And just like C++, you can expose a C ABI if you want by defining special functions. In C++, it's an `extern "C" { ... }` block, and in Rust, it's an `extern "C" fn foo() ...` function declaration.
Ohh, now I get it. But it's a pretty good thing to have anyway.
The C ABI is the lingua franca for talking to any language anyway, even C++.
> Adding C++ ABI support is a significant effort.
OK. So if it has its own ABI, I'm sure it would be a pretty hard undertaking to be compatible with C++.
I was wondering if maybe Rust had managed to squeeze in and reuse the C++ ABI... but sure, given that Rust is not that much like C++, this would probably be a bad decision for small gains.
Yup, with bindgen and rusty-cheddar you can build interfaces both ways that just talk over a C ABI. It's really, really nice.
In fact Rust in general is awesome for targeting a wide range of platforms. I've got a project right now that runs on MSVC-x64/x86, Linux-x64, OSX-x64, Android-armv7, Android-x86, Linux-armv7 and Emscripten. Single codebase, interacting with various languages (C#, C++, Java) via the C ABI. Rust even cross-compiles my C source via the gcc crate so I can use C libraries and build for any of those targets from my host (win32) machine.
Also, having just spent a few hours fucking around with linker flags in Qt, I can't stress enough how awesome Cargo, crates.io, rustup, and Rust's sane defaults are. It really is an incredible ecosystem.
bindgen actually has the option to generate wrappers for C++ methods and other stuff for Rust to call. We don't use this, but we do use its ability to generate templates and stuff.
The other way around -- getting Rust to use the C++ ABI -- needs compiler changes and also has questionable benefits.
I don't know the precise history, but at one point, it did have a "gc" type, which I believe was demarcated by the `@` sigil. So, for example, `@T` was a garbage collected pointer to `T`. IIRC, the actual garbage collector was simplistic, and was mostly just reference counting under the hood. At some point (in 2014, I think), the GC type went away. Since then, there has never been any serious talk of adding an optional GC to Rust proper, although many folks have worked/written about writing their own.
Wow, I can't believe I have been reading up lately on official Rust documentation and still I was under the impression that Rust had optional GC. Thanks for sorting me out...
Well, I think it sort of does when you use the Rc types, right? It's just much more explicit than a type modifier: it's a type wrapper that requires some additional code ornamentation in some cases when you use the value later as well. It's not really any different from using a library in C++ that offers the same thing, except that in Rust's case the library is part of the stdlib.
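For anyone following along, a minimal sketch of what that looks like with just std (illustrative only):

use std::rc::Rc;

fn main() {
    // Rc is library-level reference counting, closer to C++'s shared_ptr
    // than to a built-in GC'd pointer type.
    let a = Rc::new(String::from("shared"));
    let b = Rc::clone(&a); // bumps the refcount; no deep copy
    println!("{} {} count={}", a, b, Rc::strong_count(&a));
} // both handles drop here; the String is freed once the count reaches zero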
It's not a blog, but Rust Belt Rust 2017 [1] (a conference I help organize) will have a talk "The Story of Stylo: Replacing Firefox's CSS engine with Rust" [2]. The conference is in Columbus, Ohio, and is reasonably priced.
Are there any stats yet on improvements in memory safety within Firefox attributable to Rust? In theory, it could be as much as 50% fewer memory-safety bugs, based on the original premise of Rust removing whole classes of programmer errors, but are there stats from Rust being in the wild? The CSS replacement reminded me that there should be something to compare.
I don't know, but if I recall correctly, nobody has ever reported a memory safety bug to ripgrep, or even the underlying regex engine. I don't really know how many people use ripgrep, but it's not zero.
Well, everyone who's using VS Code is using ripgrep, and a year ago Microsoft said it had half a million active users, and the number has presumably grown since then.
Doesn't ANY rewrite (regardless of whether there is a change in language) reduce bug-count, simply because all the requirements are more clear in advance and the engineers have had more time to think about a suitable architecture?
Nope. Churn (= number of lines changed) is actually pretty well correlated with bug count.
I don't have a precise reference off-hand, but I believe I read it in "Making Software: What Really Works, and Why We Believe It".
Obviously you'll probably get closer to what the software is intended to do, but it probably won't reduce bug count in the short term. In the long term, you might end up with fewer lines of code overall (which is also correlated with overall bug count).
I would assume a rewrite counts as a bunch of line changes -- absent evidence to the contrary. Obviously, if you're changing the language that changes the equation, some bugs which can happen in C++ simply cannot happen in Rust, for example.
I've been having a little trouble using Rust for a little project: I need a tree with uplinks (meaning there are cycles). I asked on IRC a couple of times and I think what I need is a weak reference inside a RefCell, but it's not very easy to make it work cleanly. For one thing, it doesn't look like RefCell works well with traits (the nodes in the tree are trait objects, not plain structs).
I'm a bit frustrated because this is so easy in C. Is there any way this can be easier?
I am imagining a special way to construct cyclical structures where everything inside would have the same lifetime and be destructed at once. We may need to hide the references from destructors though, otherwise a destructor could see a dangling reference.
Have you read "Learning Rust With Entirely Too Many Linked Lists"[1]? I think it will be quite helpful for these kinds of situations. It walks you through all the possible tools the language makes available to you, and at the end, if you just want to write it how you would in C, you can always do it "unsafely" with raw pointers (which is no worse than C).
It's also really no better than C. Or at least no better than C++.
This is my main issue with Rust. It doesn't seem to really solve the right problem. I feel like it solves the easy problems that I already know how to solve easily, but as soon as I get to something that really, truly feels like I want it, the best solution is unsafe.
A complicated circular linked data structure is exactly where I want the language to be screaming at me if I make a silly error. But Rust doesn't even consider memory leaks to be errors...
I don't know; ensuring memory safety at compile time and safe concurrency is a pretty big win for me over C. I know many people who claim that they can write/debug C programs to be memory safe, but the real world would respectfully disagree.
Rust doesn't ensure memory safety or safe concurrency. It ensures memory safety - including data race safety, just one part of safe concurrency - assuming you never use any unsafe code and that the standard library is free of bugs. I'm happy to assume the standard library is free of memory safety bugs, because you have to trust something.
But I'm not happy trusting that dependencies aren't using unsafe code, and I'm not happy claiming that Rust ensures safety, when it ensures safety only if you assume that unsafe blocks aren't unsafe.
The problem is that you can't check unsafe blocks locally. Checking that each individual unsafe block doesn't have undefined behaviour requires checking the entire programme.
It's better than nothing, without a doubt, but it isn't safe.
Unsafe blocks are infectious, that's true, but it's possible to write safe APIs that limit that infectiousness to a single module. For example, even though Vec's implementation is crazy unsafe, you don't have to audit all the uses of Vec in safe programs -- a local audit of the Vec code can prove what we need to prove. This is the biggest benefit of the lifetime system and the borrow checker, that when we write piles of unsafe code, we can force safe callers to maintain our invariants.
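As a rough illustration of the idea (a made-up type, not how Vec is actually written), the unsafe part can sit behind a safe interface whose invariant is argued locally:

pub struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    pub fn new(len: usize) -> Buffer {
        Buffer { data: vec![0; len] }
    }

    // Safe callers can't cause undefined behaviour through this API:
    // the bound is checked right here, so the unsafe access below is
    // justified by a purely local argument.
    pub fn get(&self, i: usize) -> Option<u8> {
        if i < self.data.len() {
            Some(unsafe { *self.data.get_unchecked(i) })
        } else {
            None
        }
    }
}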
You can prove it, but you can also prove that a C++ programme has no memory safety bugs. And there are a lot of languages where you don't have to, where it's simply impossible to get memory safety bugs (assuming the runtime is safe).
For nontrivial libraries that use a lot of unsafe, it really is very difficult to know that all the uses of unsafe don't interact in some way to create unsafety. The scoped thread API (thread::scoped) that had a problem in Rust 1.0 (or just before it?) is an example.
You can force callers to maintain your invariants in C++ too, simply by using some basic safety. Yes people can still do things that are obviously visually unsafe in code and undefined, but that's not a serious issue.
I still think Rust is better here. Don't get me wrong. But it's very hyped as 'safe and fast' when it just isn't safe.
If freedom from data races and race conditions is among the "easily solved problems", I'll take it.
The problem with engineering solutions that approach the perfect end of the spectrum is that they get undue criticism for not being perfect enough. Rust is hopefully a stepping stone along the path towards more correct, less error-prone computation. Let's not throw the baby out with the bathwater.
I don't really have a rebuttal, more of a dismissal.
Almost all of the code I write just uses prebuilt data structures (other than structs to group things), and when writing this code I find the safety measures that Rust provides very convenient, because I don't have to worry about things such as lifetimes. It is nice knowing that the compiler will let me know if I make an error.
However, yes, it doesn't solve the hard problem of complex circular structures. I don't see this as a major issue, because when I am writing these I am carefully thinking about the structure anyway. So yes, while it would be nice to have these verified as well, I wouldn't want to take the tradeoff if it made the language much more complex.
Most of us write new, complex data structures that aren't part of the stdlib or a crate maybe once a year, at most. Those are hard in Rust if they involve circular pointers. They're hard in C/C++ too, but in a different way (easier to write the code, harder to be sure it's correct).
The idea that Rust would be no better than C/C++ because of the latter parts doesn't make much sense. This kind of work is unusual for most programming. To say that other programming work is easy does not seem to bear out in practice.
And as has been becoming clear in this thread, if you're inventing new data structures, the odds are you overlooked an already existing better alternative.
It doesn't matter that it's 'kind of unusual', even though I contend that it isn't. Even if, for the sake of argument, we assume that it is, that doesn't change my point.
My point is that the whole point of Rust is supposedly that it
>is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.
except that when you look at any of the examples of code that really would benefit from the compiler's help, the compiler just throws its hands in the air and goes 'it's all up to you now'.
The problem is that Rust doesn't let you make a single assumption and let the compiler prove the safety of the code using that assumption. It just has a valve that you can hit that removes all guarantees.
If you could say 'this code is safe assuming that this FFI function doesn't exhibit undefined behaviour, please check that for me' or write a proof that says 'this actually is safe, because this pointer can only ever point into this valid memory or this valid memory, and this is why' then the compiler would still be useful.
Whether 'this work' (which is not just creating data structures, but anything that the compiler doesn't understand, which is much broader than just creating data structures) is unusual or not, IMO the whole appeal of Rust is that it makes doing that work easy. But it doesn't.
Rust just doesn't seem worth it, doesn't seem worth rewriting whole ecosystems of code. It doesn't give any actual safety.
> examples of code that really would benefit from the compiler's help
This seems to be the point of disagreement here, and I think evidence clearly shows that you are wrong. Sure, Rust doesn't help you when writing the implementation of e.g. circular data structures. But what it does do is provide, far beyond C or C++, the tools for the author of that data structure to enforce that it's used correctly.
And as mentioned upthread, most memory/concurrency (especially concurrency) bugs are not in the implementations of these structures, but in their use. So Rust is a fantastic win here, empirically speaking. Look at the rate of memory safety bugs in Rust programs vs C++ programs- Ripgrep vs grep, Servo/Quantum vs Firefox, etc.
* Most developers are not writing data structures, so optimizing for that seems unnecessary.
* There is work and research going into verifying unsafe code
* I think historically we can see that most memory safety vulnerabilities are not going to be in some lower level data structure, which is well encapsulated and likely already built by someone else, but in the use of that data structure. In particular - sharing references and also invalidating data safely without leaving references to that data. Rust helps you here, and this seems like the far better target.
* Even if your rust code uses unsafe, you still have benefits - you know where to audit for unsafety, you know where to pay extra close attention, and you can still write a large portion of your code in safe rust.
> I am imagining a special way to construct cyclical structures where everything inside would have the same lifetime and be destructed at once.
The simple way to do that would be to allocate an array, and use indices into the array rather than pointers/references.
Doing it with pointers isn't so much harder in Rust than C as it is that Rust is making you deal with how hard it is to get this right, whereas in C the compiler is happy to let you think it's easy while you accidentally shoot yourself in the foot.
If you want the Rust compiler to accept your mistakes, you can always wrap it in an unsafe block ;)
But in my example this is not hard to get right in C. The tree is constructed (on the stack would be fine), then used for a while without mutating it, then freed all at once.
The thing that makes this hard in rust is destructors. If there's a cycle between A and B, and you destruct A first, then B, then B's destructor would see a dangling reference to A. And vice versa if you destruct B first.
But I don't need destructors, or at least ones that can see these references, so it's frustrating.
> But in my example this is not hard to get right in C. The tree is constructed (on the stack would be fine), then used for a while without mutating it, then freed all at once.
It's still "hard to get right" in that at any time nothing is stopping you from violating any of the assumptions that make this "easy". It's never easy to write a solution that's "guaranteed to be safe" in C, but that's what you're trying to do by writing such a solution in Rust. To give but one example, nothing in C will stop your nodes from containing some resources which needs to be manually destructed and which get leaked when the stack frame is reclaimed.
Rust is going to require you to make those assumptions explicit, so that it can enforce them -- in this case, you need to explicitly restrict your solution to dealing with Copy types, which by construction don't implement Drop, and therefore have no destructors.
But at the end of the day, if all you want to do is swear to the compiler that you know what you're doing and you promise to not be stupid, wrap it in an unsafe and get C-style consequences if you got things wrong. You get C-Style easiness only by explicitly abandoning the attempt at guaranteeing safety for every type and scenario your solution could be used with.
Nothing stops you from leaking memory in Rust either; it is still considered memory safe, see std::mem::forget. Rust only saves you from use-after-free and double-free.
The tree itself will need destructors to clean up allocations, and that's what's problematic here: if the tree is destroyed top down, the destructor of the children may access the parent which has already been invalidated, and similarly destroying bottom up risks the destructor of the parent accessing the children.
For this purpose, no, I don't think so. That said, I might be missing something, and maybe the compiler doesn't understand this, so I'm not saying it's a panacea in your case, just that it might be worth looking at.
> The simple way to do that would be to allocate an array, and use indices into the array rather than pointers/references.
I've done that in code for a collision detection engine. The object descriptions for convex polyhedra have lists of faces, lists of edges, and lists of vertices, all referencing each other. The original implementation (I-Collide) actually used lists for them. When I re-implemented that in C++, I used arrays with indices for each of those. When you're done with a polyhedron, all those structures, which are owned by the Polyhedron object, go away more or less simultaneously.
While it may be simpler, it also removes any safety that rust adds. I'd imagine you'd get more mileage out of just using pointers and unsafe blocks liberally to get lifetime checking where you can.
It is still safer than C, as it will not allow any segfaults. The only thing that can happen is a runtime error, but no memory corruption, so it is more akin to, say, Java or Go.
Regarding traits: you can have Rc<Trait> (by casting from Rc<your concrete type>), but not RefCell<Trait>. The reason is that Rc is a pointer, so the size of Rc<Trait> can be constant regardless of the size of the type implementing the trait. But if you actually want a weak reference inside a refcell (as opposed to the other way around), RefCell<Weak<Trait>> should work fine. Also consider the Cell type, which has a more limited API than RefCell but zero overhead.
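A minimal sketch of that "weak reference inside a RefCell" pattern with a plain struct (names are made up for illustration):

use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,      // uplink: Weak, so it doesn't keep the parent alive
    children: RefCell<Vec<Rc<Node>>>, // downlinks own the children
}

fn main() {
    let root = Rc::new(Node {
        value: 0,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });
    let child = Rc::new(Node {
        value: 1,
        parent: RefCell::new(Rc::downgrade(&root)),
        children: RefCell::new(vec![]),
    });
    root.children.borrow_mut().push(Rc::clone(&child));

    // Walking back up through the uplink:
    if let Some(p) = child.parent.borrow().upgrade() {
        println!("child {} has parent {}", child.value, p.value);
    }
}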
Regarding everything being destructed at once: that’s called an arena, and there are crates for it:
You're correct, that crate doesn't solve the cycle problem. There is a different way of doing arenas in Rust that does though.
What you do is you put all of your tree nodes in a big Vec<Node> and instead of referring to children and parents via pointers, you do so via indices. It's less convenient because you have to pass around a reference to your "arena" everywhere (the Vec or a slice of it), and it incurs bounds checks (pretty cheap though). But, it solves the problem in a way that is guaranteed safe.
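A rough sketch of that approach (illustrative names only):

struct Node {
    value: i32,
    parent: Option<usize>, // the "uplink" is just an index into the arena
    children: Vec<usize>,
}

struct Tree {
    nodes: Vec<Node>, // the arena; every node shares its lifetime
}

impl Tree {
    fn add(&mut self, value: i32, parent: Option<usize>) -> usize {
        let id = self.nodes.len();
        self.nodes.push(Node { value, parent, children: Vec::new() });
        if let Some(p) = parent {
            self.nodes[p].children.push(id);
        }
        id
    }
}

fn main() {
    let mut tree = Tree { nodes: Vec::new() };
    let root = tree.add(0, None);
    let leaf = tree.add(1, Some(root));
    assert_eq!(tree.nodes[leaf].parent, Some(root)); // the cycle, expressed safely
}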
In a sense, they solve the cycle problem in Rust. The trick is that once something is allocated to an arena, it has the same lifetime as the whole arena, and that allows cyclical links because none of the nodes "outlive each other".
However, you must have interior mutability (achieved using Cell types), because otherwise you can't mutate the parent nodes to add links to child nodes.
Check the typed arena crate in crates.io and the Cell type in the standard library.
It'd likely be the other way, that is, you'd put a RefCell inside of an Rc/Weak.
> it doesn't look like refcell works well with traits
It should; I bet you had problems since you did it the other way around.
> I'm a bit frustrated because this is so easy in C.
You could do it the same way as you do it in C, if you're willing to resort to `unsafe`.
FWIW, Rust sort of changes the equation for what's easy here; thread-safe concurrency? Simple! Data structures? Hard! C is the other way around. So feeling a bit frustrated is normal; your C skills won't carry over, but it feels like they should.
> I am imagining a special way to construct cyclical structures where everything inside would have the same lifetime and be destructed at once
This sounds like an arena to me, which is another option, for sure.
If you post to https://users.rust-lang.org/, someone might be willing to whip up an example, or if you post your in-progress stuff, someone might be able to fix it for you.
$ ghci
GHCi, version 8.0.2: http://www.haskell.org/ghc/ :? for help
Prelude> let ones = 1 : ones
Prelude> take 50 ones
[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
You can take as many as you like. (The 'ones' list contains a tail which links back to its head, producing an infinite list.)
Obviously, this is a trivial example, you can do much more interesting things with mutually recursive bindings.
If you see data structures as cyclic, it is not "in large part due to laziness", but _only_ due to laziness. If you look at the memory representation I think (but I am not sure) there will be no cycles between allocated objects. Only a thunk.
Using strict data types, I think you'd agree that cycles cannot be created. For non-strict data structures, I agree they can be seen as cyclic, but I prefer seeing them as infinite.
Frankly, I'd just use unsafe pointers for the backrefs, and wrap the tree API up in a typesafe layer, and build on top of that.
RefCells seem to add unnecessary overhead here. You'll take a hit for the runtime borrow check on every pointer chase up the tree. If walking from a leaf to the root is important, you don't want to add an extra compare/branch/set to every pointer chase. It turns a single memory read into a branch, a write, and two reads. Probably about 5x slower at least.
This is why, as a user of GC languages, I find it a bit sad that the ergonomics are so bad for cyclic data structures that this solution is most likely what the majority will turn to.
The answer isn't generics, Coq, or some fancy type system nobody will understand or use properly. There are only a few inherent trouble spots in pure Rust code safety. The big two are:
- Partially initialized arrays. "Vec" has to be unsafe because growing an array involves uninitialized slots. You just need a way to say "this array is initialized from 0..N only", where N is in a data structure associated with the array. Then you need an operation that says "initialize entry N+1 and update the count". That's all it takes.
"Map" could be implemented on top of "Vec", instead of using unsafe code. It would be worth trying this and seeing what the performance penalty is. That may be a premature optimization.
- Backpointers. Backpointers have an easily checked invariant relationship with the forward pointer that owns their containing object, but there's no way to tell the language that something is a backpointer.
> The answer isn't generics, Coq, or some fancy type system nobody will understand or use properly.
The solution is a formal semantics for unsafe Rust, so that programmers can prove that their unsafe Rust code is safe to use by whatever means they prefer. (Mine would be by hand.)
---
Reply to dmix:
A formal semantics doesn't have to be particularly fancy, although in Rust's case, it will in most likelihood not be straightforward either.
Probably. I used to do formal proof of correctness work and headed a project to build a verifier.[1] That stuff is very hard.
The partially initialized array thing is an issue of expressive power. You can't talk about that in Rust yet. This is a classic issue. The three big headaches in C around memory safety are "how big is it", "who owns it", and "who locks it". The language lacks the syntax to even talk about those issues. Rust can talk about those, which is a huge step forward.
Before you can even consider verifying something, you have to be able to talk about it in some formal language. Preferably the one you're programming in. Having to do formal specifications in a separate language is a huge headache. Been there, done that.
There are a few standard trouble spots. I've listed two of them. Most other unsafe code comes from
1) Foreign functions, which can be expected to decline over time as more libraries are implemented in Rust. (How's SSL/TLS in Rust coming along?)
2) "Optimization", which may be premature. This usually consists of bypassing subscript checks. I'd rather have the subscript checks on all the time, and see effort put into hoisting subscript checks out inner loops. (Subscript checks that aren't in inner loops usually aren't significant overhead items.)
3) replicating C/C++ code in Rust. (An early attempt was a transliteration of Doom into Rust, with lots of pointer arithmetic.)
Remember, it can blow at any seam. It only takes one buffer overflow to allow an exploit.
> Before you can even consider verifying something, you have to be able to talk about it in some formal language. Preferably the one you're programming in.
I'd personally be excited to see a modern language implement this. I saw the potential for this type of verification in my (hobbyist) dabbling with Haskell, which subsequently inspired me to relearn math, including a great book on proofs recommended on HN which really changed the way I viewed/approached math.
The use-case analogies for formal verification can probably best be drawn from automated testing and TDD, which is another "optional" part of programming with varying degrees of usage, although with a lower barrier to entry.
Types/proofs seem to have a positive influence on the "best practice" part as well, as they promote a programming style which forces you to really consider the implementations you're coding. Very similar to testing.
At the moment, formal proof tools seem to me like a rabbit hole with questionable practical ROI, so I've been hesitant to try out the current state of the art.
Assuming it does get built into the language, even if users don't use it directly, they will likely benefit just by being able to build on layers beneath that were proven in the standard/popular libraries. I felt a similar feeling of assurance when building on top of well-typed Haskell libraries.
We built verification into the language in Pascal-F, 30 years ago.[1] That's rarely been done since in real-world imperative languages. It's much easier to keep the verification statements correct if they're in the same file and the same language as the program.
But Pascal was a small language. Getting this into today's bloated languages is tough. Back then, we looked at Ada, sized the project, and realized it was comparable to building an optimizing compiler for the language.
> We built verification into the language in Pascal-F, 30 years ago.
You sound like a very interesting person to buy a beer for, assuming you'd be patient enough for my questions :p, I'll check out the paper instead when I get the time.
> It's much easier to keep the verification statements correct if they're in the same file and the same language as the program.
Agreed. Hell even using Dialyzer in Erlang/Elixir for static type checking (which I make the effort to do often in my daily side project hacking) just doesn't feel right, even though it's still appended to function definitions in the original files.
This is one of those things that need to be a core part of the language design IMO, not just a tool built on top - 3rd party, by the core team, or otherwise. But, that said, tooling can still be superior compared to the complete absence of it.
Maybe you can answer a question I've been struggling to find the answer to via Google. Do you know the name of this popular older language/tool used by Microsoft for doing formal verification? For C/C++-style code. It sounds like tkk or something similar? I can't seem to find it.
My other question I'd ask is if you think the testing analogy applies here as I mentioned in my comment above or do you see it as an entirely different paradigm? Basically changing how you program rather than adding on a tool/skillset.
> Before you can even consider verifying something, you have to be able to talk about it in some formal language.
Sure, but it doesn't have to be a programming language.
> Preferably the one you're programming in.
Who says so? Programming languages (justifiedly) optimize for the ability to express computation, not the ability to express proof.
In spite of Curry-Howard, there exist important differences between proofs and programs:
(0) Proofs are primarily for humans to understand. Programs are primarily for computers to execute.
(1) Proofs are sometimes allowed to be non-constructive. Even when they aren't, an easily understandable proof beats one that corresponds to an efficient program.
(2) Allowing non-terminating programs is actually a good thing: it allows for proofs of correctness that use techniques not anticipated by the language designer. OTOH, if your logic lets you prove a contradiction, it's the end of the world (for your logic).
Proofs and specifications are different things. Users need to be able to read specifications. For example, if you want to talk about definedness for arrays, a predicate
defined(arrayname, lowbound, highbound)
is helpful. That goes in assert statements. It's run-time checkable, if you have some extra state in the form of "is defined" flags. But you'd rather prove it once so the run-time checks are unnecessary. With a few simple theorems, such as
defined(a,i,j) and defined(a[j+1]) implies defined(a,i,j+1)
and a simple automated prover, you can deal with most of the issues around partially defined arrays.
Think of proof support as being an extension to optimization of assertions. Most assertions in programs can be proven easily with an automated prover. Users need never see those proofs. Some will be hard and require more proof support.
defined(a,i,j) and defined(a[j+1]) implies defined(a,i,j+1)
is a theorem. It looks like this in Boyer-Moore theory:
(PROVE-LEMMA arraytrue-extend-upward-rule (REWRITE)
  (IMPLIES (AND (EQUAL (arraytrue A I J) T)
                (EQUAL (alltrue (selecta A (ADD1 J))) T))
           (EQUAL (arraytrue A I (ADD1 J)) T)))
Name the conjecture *1.
We will try to prove it by induction. There are three plausible
inductions. They merge into two likely candidate inductions. However, only
one is unflawed. We will induct according to the following scheme:
(AND (IMPLIES (LESSP J I) (p A I J))
(IMPLIES (AND (NOT (LESSP J I))
(p A (ADD1 I) J))
(p A I J))).
Linear arithmetic informs us that the measure (DIFFERENCE (ADD1 J) I)
decreases according to the well-founded relation LESSP in each induction step
of the scheme. The above induction scheme leads to three new formulas:
Case 3. (IMPLIES (AND (LESSP J I)
(ARRAYTRUE A I J)
(EQUAL (ALLTRUE (SELECTA A (ADD1 J)))
T))
(ARRAYTRUE A I (ADD1 J))),
which simplifies, rewriting with ARRAYTRUE-VOID-RULE and SUB1-ADD1, and
opening up ARRAYTRUE and LESSP, to the following eight new conjectures:
...
That finishes the proof of *1. Q.E.D.
"arraytrue" is defined recursively:
(DEFN arraytrue (A I J)
  (IF (LESSP J I) T                          -- the null case is true
    (AND (EQUAL (alltrue (selecta A I)) T)   -- next element is true
         (arraytrue A (ADD1 I) J))))         -- and rest of array is alltrue
And yes, there's a machine proof that this terminates.
> defined(a,i,j) and defined(a[j+1]) implies defined(a,i,j+1) is a theorem. (proof)
Sure. My point is just that a theorem is a proposition equipped with a proof. If the user just enters a proposition into the system, then they aren't entering a theorem. They're entering a proposition that the system can turn into a theorem.
> > But you'd rather prove it once so the run-time checks are unnecessary.
> No. I'd rather prove it to rule out the program being wrong. The runtime check is totally besides the point.
That's the same thing. It's just a matter of it being automated.
> > Think of proof support as being an extension to optimization of assertions.
> I never use runtime-checked assertions, so I never need to optimize them away.
If you've ever used a language that uses bounds checked array access, you have. IIRC, you've used Rust some, so you've used bounds checked array access, except where Rust was able to prove that out of bounds access was impossible and elide the code that checks.
> > Users need never see those proofs.
> But, you see, I want to see the proofs. How am I supposed to maintain a program I don't understand?
There's a difference between need and ability. Just because it's stated you don't need to see something doesn't mean you can't. If you want to see a proof, you dig in and find it, or find where it's accepted canon that we don't need to reproduce it (do we need proofs for integer addition? That seems of limited use to me, but maybe a language formally proved as much as possible, all the way down to that level, has some interesting benefits).
> That's the same thing. It's just a matter of it being automated.
Re-read what he said. He presented runtime checks as an acceptable but non-optimal scenario for performance reasons. My reply was that runtime checks don't prevent a wrong program from being wrong, and it's this wrongness itself that I consider unacceptable.
> IIRC, you've used Rust some, so you've used bounds checked array access, except where Rust was able to prove that out of bounds access was impossible and elide the code that checks.
Runtime bounds-checked array manipulation is literally the single thing I hate the most about the languages I like the most (ML and Rust). It introduces a control flow path that shouldn't be reachable in a correct program, and hence shouldn't exist. This is particularly painful because beautiful array-manipulating algorithms have existed since, like, forever, yet in 2017 I still can't express them elegantly.
> If you want to see a proof, you dig in and find it, or find where it's accepted canon that we don't need to reproduce (do we need proofs for integer addition)?
I don't need to rebuild arithmetic from scratch again, because I've already done it at some point in time, and once is enough as long as you understand the process.
On the other hand, when I'm first confronted with an already existing program, I don't understand why the program is correct (if it is even correct in the first place), so I do have to set some time aside to properly study it.
IIRC, Rust's HashMap uses a single raw vector containing for each entry its hash code, its key, and its value; empty entries have a special hash code as a marker, with the key/value left uninitialized. Also, the hashes are kept together (separate from the rest) for better cache behavior during the linear probing.
Wouldn't that be two parallel vectors? (Side note: nobody ever thinks to bring up cache behavior in interviews where I ask about how a hash map could be implemented - it's nice to know that the library writers care :) )
How does that special hash marker work? What happens when something actually hashes to it? Just silently increment the hash?
Yes, but it's a single contiguous memory allocation. (It used to be three parallel vectors in a single allocation, with keys and values also kept separate to avoid padding between them, but experiments showed that had worse cache behavior.)
> How does that special hash marker work? What happens when something actually hashes to it? Just silently increment the hash?
Looking at the code, it always sets the most significant bit of every real hash value (since the least significant bits select the bucket, it makes no difference), and the marker has the most significant bit clear (in fact, all bits of the marker value are clear).
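In other words, something along these lines (a simplified sketch of the trick, not the actual std code):

const EMPTY_BUCKET: u64 = 0; // all bits clear marks an empty slot

fn stored_hash(real_hash: u64) -> u64 {
    // Force the top bit on. The low bits that select the bucket are
    // untouched, so probing is unaffected, and no stored hash can ever
    // collide with the empty marker.
    real_hash | (1u64 << 63)
}

fn main() {
    assert_ne!(stored_hash(0), EMPTY_BUCKET);
    assert_eq!(stored_hash(0xABCD) & 0xFFFF, 0xABCD);
}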
Yes. This optimization is expected to be expanded in the future, but there are currently "NonZero" types. Rust notices this and uses the zero value as the enum discriminant.
So in this case the generated code is identical to that of a nullable pointer.
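You can see the effect with size_of; a hedged sketch using std::num::NonZeroU32 (which wasn't stable yet at the time, but shows the same layout trick):

use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    // The forbidden zero value is reused as the None discriminant,
    // so the Option wrapper costs no extra space.
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());
    // Same idea for references and Box: the null bit pattern encodes None.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
}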
Quite often in Rust, the way to make it easier is to go one step back: instead of thinking about how to make a tree right, you might want to think about why you want a tree in the first place. Sometimes it's the right thing to do, sometimes there is another approach that fits Rust's paradigm better.
Speaking of paradigms, although Rust definitely looks "imperative", the ownership system makes it bloody different. Try to implement a tree "as in C" using Haskell or Prolog and you will lose time and energy for a result that does not use the language to its fullest.
Have you considered leveraging crates.io? I'm not sure of your exact constraints, but I've heard some good things about petgraph: https://crates.io/crates/petgraph . It sounds like you might also want to check out some implementations of arenas.
It's fairly easy to do this kind of thing using an arena and indices instead of pointers. Here's a simple splay tree with uplinks implemented this way:
I think I would just put all the nodes into a Vec and use indices instead of references. This results in every node having the same lifetime, like you wanted. It is what the specialized graph libraries like petgraph are doing, and it is memory safe due to bound checks.
Is there a book on Programming Data Structures in Rust, like from scratch? Trees, Graphs, etc. I would expect the first 2 to 3 chapters to be on Rust's pointer system and the rest of the book to be on implementing data structures using that pointer system.
Similar to Tanenbaum's book for Data Structures in C, or Kruse, Leung and Tondo's book for Data Structures in C.
In Rust, you almost always want to use a library for things like that. If you want to write the library, you may want the Rustonomicon: https://doc.rust-lang.org/nomicon/. Also, of course, the "Too Many Lists" article in the other reply. Rust definitely pushes a different way of thinking about and using data structures.
What happened to the exhaustive list of unnamed Github contributors? I thought that was a really powerful marketing method: dear unnamed Github user x1230134384, your contribution has been acknowledged! Feedback without requiring identification; it's kind of very cool in a project like this.
If you click on the final link to thanks.rust-lang.org, you'll find it there!
There have been talks about moving it back, but it's complicated. You can do the "duplicate it" strategy, but that has downsides. You could do an iframe, but then you're using an iframe. You could use JavaScript, but then a whole different crew of people will Get Mad.
Obviously the solution is to turn the thanks page into an API, add CORS headers, and then a bit of JS on the announce page to fetch the list from thanks.rust-lang.org and build the list on load or fall back to the current behavior.
Good luck determining whether I'm joking or not. I'm not even sure myself...
Not a Rust programmer, so the answer to this question may be in TFM -- Rust has floats, but it doesn't make available constants for inf and NaN? Are you not guaranteed IEEE754 floats?
The consts are currently defined in modules with the same name as the type, e.g. `std::f32::NAN`. This feature allows them to be associated with the type itself - `f32::NAN`. A small convenience.
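Roughly (assuming the float constants do end up exposed this way; the module constants already exist):

fn main() {
    let a = std::f32::NAN; // module constant (the long-standing form)
    let b = f32::NAN;      // associated constant on the type itself
    assert!(a.is_nan() && b.is_nan());
    assert!(f32::INFINITY > f32::MAX);
}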
That's something I'd like to understand better -- MAX is defined as a literal number, but NAN and INFINITY are computed by the compiler. I looked at the header files on my system: GNU nan.h uses a GCC builtin and falls back on a magic bit-pattern (a NaN as defined by the IEEE754). Is 0.0/0.0 more portable? Clearer? Are magic bit-patterns just not in the spirit of the language? Sorry to be so particular about this -- floats are tricky and I'd like to know exactly what's going on with them.
Then we can actually implement that trait for our data:
impl Bar for Foo {
    const BAR_CONSTANT: i32 = 42;

    fn some_function() {
        println!("foo's associated function, and the const is {}", Self::BAR_CONSTANT);
    }

    fn some_method(&self) {
        println!("foo's method, and the const is still {}", Self::BAR_CONSTANT);
    }
}
Then we can use it like so:
Foo::some_function(); // foo's associated function, and the const is 42
let foo = Foo;
foo.some_method(); // foo's method, and the const is still 42
And now we can take it further. Imagine that you have another piece of data, `struct Qux`. Then you can do the same and `impl Bar for Qux`. And now you can write a generic function like so:
fn bar_taker<T: Bar>(something_that_impls_bar: T) {
    T::some_function();
    something_that_impls_bar.some_method();
    // And of course we can refer to T::BAR_CONSTANT in here as well.
}
And call it like so:
bar_taker(foo);
bar_taker(qux);
AFAIK, the big deal with associated consts is specifically that it allows generic code like that to refer to different values on a per-type basis.
Thanks for the nice example. I've heard of traits but hadn't read enough to know what they are. Looks like a form of multiple inheritance. The example you gave shows polymorphism via traits. That was helpful to me.
It may be easier to see their significance when they're associated with a trait, rather than a struct.
For a simple example, imagine I've got a trait Number, and every implementation of Number is supposed to have a zero constant. Then I could let that constant be Number::ZERO, and refer to it as such in generic code, with each implementation of Number having a different value for Number::ZERO.
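For instance (Number and ZERO are made-up names here, not anything from std):

trait Number: Sized {
    const ZERO: Self;
    fn add(self, other: Self) -> Self;
}

impl Number for i32 {
    const ZERO: Self = 0;
    fn add(self, other: Self) -> Self { self + other }
}

impl Number for f64 {
    const ZERO: Self = 0.0;
    fn add(self, other: Self) -> Self { self + other }
}

// Generic code can refer to the per-type constant.
fn sum<T: Number + Copy>(xs: &[T]) -> T {
    xs.iter().fold(T::ZERO, |acc, &x| acc.add(x))
}

fn main() {
    println!("{}", sum(&[1, 2, 3]));  // 6
    println!("{}", sum(&[1.5, 2.5])); // 4
}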
That's interesting, I always assumed parametric polymorphism was a requirement of ad hoc polymorphism. I suppose I'm using a more narrow definition. On the opposite end,
Elm for instance, has parametric polymorphism:
type Parametric a = Parametric a
f : (a -> b) -> Parametric a -> Parametric b
f fn (Parametric a) = Parametric <| fn a
but no support for ad-hoc, like you said, in Haskell:
data Parametric a = Parametric a
f :: Ord a => Parametric a -> Parametric a -> Parametric a
f (Parametric a) (Parametric b) = Parametric (min a b)
May be too late in this thread for a response, but does the completion of the 1.20 release free up some time for the "State of Rust 2017" survey results blog post before the 1.21 release? I know there was a huge response, and you don't want to step on the release news, but I'm waiting to see where the user community stands.
Associated functions and associated constants sound so yum! That's one thing I would love to have in Flowtype. But classes seem to serve that use case well enough.
Edit: Sorry, this is actually a Stage 3 TC39 feature. Not sure if you can use it with flowtype (I think you can), but you might be able to if you enable babel with stage-3 features:
Yeah, either way it is objects. Traits feel so much more flexible though, and a more natural layer over JS object-orientedness. It also makes it a pain that there are some important semantics missing when using classes in JS. Sometimes Java feels much better.
The real problem is that ES classes map onto a paradigm of prototypal inheritance rather than traditional inheritance (as in Java). The discrepancies between the two cause a leaky abstraction, like how a class still has a prototype chain, for example.
Note, I think in spite of being in stage 3 the syntax and obvious cases seem to be largely unlikely to change. It certainly improves react quite a bit: you can define the proptypes and default values inside the class rather than after.
And here I am, still waiting for the second edition of that book on learning Rust to be finished. I want to learn Rust for fun and I know I should probably just start on reading the book, but I keep making up excuses not to. I want my ++, --, ?:, SIMD, etc.
Is f32 the only suffix for float literals? That seems really noisy compared to just f in C. I'm mostly concerned about vector and matrix declarations, but even in the examples given, 1.0f32 / 0.0f32 looks like a mess of numbers compared to 1.f / 0.f;
Yes, it is, for 32-bit floats. However, most of the time you don't need to specify that, because it will be inferred from the context. (And if it isn't you can introduce hints like `let growth_rate: f32 = 50.0/140.0`) Also, Rust supports having _ anywhere in number literals. Some may argue that it makes things even noisier, but it helps to separate the suffix from the value itself: `100.0_f32`.
#include <stdio.h>

int main() {
    float a = 1e7 + 0.5 + 0.5;
    float b = 1e7f + 0.5f + 0.5f;
    printf("%s\n", a == b ? "true" : "false"); // false
}
That's awesome. I'm totally fine with an annoying suffix like f32 if I never actually need to use it. While f might be a nicer suffix than f32, no suffix is even better.
Isn't this kind of inference dangerous? It seems that whichever call comes first is used to infer the type, so a single added line of code can change the type of a variable…
fn foo32(x: f32) { println!("{}", x); }
fn foo64(x: f64) { println!("{}", x); }

fn bar64() {
    let x = 5.0; // f64
    foo64(x);
    foo32(x);
}

fn bar32() {
    let x = 5.0; // f32, not f64
    foo32(x);
    foo64(x);
}
Rust doesn't have coercions like C, so both of those fail to compile because using x: f64 with foo32 is an error for the first one and similarly x: f32 with foo64 for the second.
The reason this works the way it does is that Rust does not provide any cross-type arithmetic operators; you can't multiply a signed int and an unsigned int, or a float and a u8, without casting one of them to the same type as the other.
That means the example you posted works because the only way that line of code can be resolved as-is is for all untyped literals to be inferred as the same floating-point type by the inference engine.
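A quick illustration of that restriction and the inference behaviour:

fn main() {
    let a: f32 = 1.5;
    let b: u8 = 2;
    // let c = a * b;     // does not compile: there is no `f32 * u8` operator
    let c = a * b as f32; // an explicit cast is required
    println!("{}", c);

    let x = 5.0;          // inferred as f32, because of the line below
    let y: f32 = x * 2.0;
    println!("{}", y);
}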
> While f might be a nicer suffix than f32, no suffix is even better.
Well, isn't float in C and C++ platform dependent, and just usually 4 bytes on common platforms? I like the idea of being explicit in the storage type, even if it's slightly more verbose.
In particular, everything but MSVC treats long as 64 bits, but MSVC treats it as 32 bits for backwards compatibility.
However, I haven't used int or long or long long in ten years or so. Thanks to stdint.h support finally making its way to all platforms, it's exclusively sized types like int32_t and uint16_t for me.
I might otherwise agree with you, but I find it hard to read when extra numbers are interspersed with the numbers that are actually part of the calculation. I feel the readability matters more than the explicitness.
Nope. `f32` is for single-precision floats, and `f64` is for double-precision floats. There have been some people lobbying for `f16` and `f128` as well.
Also, you don't need to use those suffixes. I imagine the OP is using them for maximum explicitness, but you can just write `1.0` and it will be inferred to a floating-point type as necessary (contrast `1`, which will be inferred to an integral type).
That's ok. I presume you weren't a C/C++ dev before; that's old-school shorthand for C coders who don't want to type out suffixes or cast numeric values.
> An unstable sort could provide this result, but could also give this answer too:
It might just be me, but using "this" twice in the same sentence to refer to two distinct items, one a prior example and one an upcoming example, is somewhat odd. That said, I understood it perfectly fine, it just caused me to stop and ponder the wording for a moment. The following might be more clear:
An unstable sort could provide that result, but could also give this answer too:
In case anyone’s curious, this is called discourse deixis[1]. It’s a frequent source of errors for non-native English speakers, because in many languages, you use “that” to refer to an example that follows, but English is unusual in that it generally uses “this” for the future and “that” for the past.
So that[2] sounds wrong:
This[3] probably screwed you up.
The somewhat confusing thing is that “this” is also used for the present or recent past, especially in more formal writing.
> English is unusual in that it generally uses “this” for the future and “that” for the past.
I don't know that I'd describe it as "future" and "past"; it's more that English uses "this" for "current" and "that" for "other", whether past or future.
For instance, "this situation is broken, that solution looks promising" uses "this" to refer to the present and "that" for the future.
I was referring to past/future in the text. Deixis is about how words and phrases like “this”, “that”, “here”, &c. are contextualised. Your example is unrelated to discourse deixis because it’s not referring to a piece of the surrounding discourse. We use different forms of deixis and spatial metaphors for time, like “the end is near” or “the past is behind us”—in some languages, the past is in front of you (mnemonic: you can see it) while the future is behind (you can’t).
Since you apparently know stuff about language, maybe you can help me out:
Sometimes novice programmers try something like "foo == bar && baz" when they really should have "foo == bar && foo == baz". This is because, in English, "Foo is equal to bar and baz" means (and is more common than) "Foo is equal to bar and foo is equal to baz". There's some name for this rule, but I haven't been able to remember it for a while. I think it's "right hand ______", but I can't remember that final word.
First, I have a little anecdote about that. I was at a summer camp where we were taking game programming classes. I had been programming for longer than the other students, so I would help them out. One of my peers asked me for help with a part of his program. In the course of that, I noticed that he had written “if (x == 1 || 2 || 3 || …)” in another part of his code, and explained that this condition would always be true (in C++). He became defensive and dismissed me, saying “nah, I tried it and it works”…because he had only been trying the success cases. It absolutely infuriated me!
Anyway, IIRC that pattern is called “conjunction reduction”: “foo equals bar and foo equals baz” becoming “foo equals bar and baz”.
The term you might be thinking of is “right node raising”[1], which is when the elements of a conjunction “share” the stuff that follows them, as in “bar equals, and baz equals, foo”.
You've misread the announcement; Rust has had associated functions since time immemorial. Associated consts aren't class variables, because constants can't vary (that's sort of the whole point of constants...). Rust also doesn't have classes in any recognizable sense (we can argue all day about whether Rust supports "OOP", but the separation of data and behavior into structs and impls respectively pretty thoroughly subsumes classes themselves).
>Associated consts aren't class variables, because constants can't vary
That's why I wrote "limited version of". Rust's "associated constants" are a subset of C++'s class variables feature. Namely you can only have variables qualified "const" i.e. constants.
>Rust also doesn't have classes in any recognizable sense
What? If you have instantiatable abstract data types with associated methods you have a "class". Calling them "structs" does not change that. There are classes defined with "struct" in C++ and D too.
Oh and calling an interface a "trait" does not change its fundamental nature either.
Having aggregate data types and syntactically having methods seems like a very facile definition of "OOP", versus the conventional uses referring to Smalltalk style message passing or inheritance and overrides. Other than being able to call functions as x.foo() rather than foo(x) or foo x or similar, languages like Haskell and C seem to satisfy the requirements for being OOP, which seems to make "OOP" completely useless as a category.
In any case, this point has been argued at length for every language, including Rust.
I actually think having syntactic methods is an important (if not the most important) part of "OOP". You compared x.foo() with foo(x), but that's the wrong comparison. The correct comparison is that when x is of type Tree, it's x.foo() vs tree_foo(x). Otherwise you get name clashes.
That is, I think the essence of "OOP", OOP-as-used, not any theoretical OOP, is function name resolution depending on type. That and syntax. So first, you write x.tree_foo() instead of tree_foo(x) which is purely syntactic. And then you shorten x.tree_foo() to x.foo(), because x is a tree, which is a great semantic help.
According to this definition, Haskell and C are not "OOP", which matches common understanding.
Function name resolution isn't OOP, it's attaching a namespace to a type. The _really_ important part is dynamic dispatch. You can have proper OOP without method syntax, it just looks awkward to our eyes. In fact, in Objective C,
I already agreed function name resolution isn't OOP. My argument was that it is "OOP", with quotes. My supporting evidence is that people consider C++ without any dynamic dispatch as "OOP", with quotes.
> What? If you have instantiatable abstract data types with associated methods you have a "class". Calling them "structs" does not change that. There are classes defined with "struct" in C++ and D too.
So if I just take C's structs and add the ability to associate structs with functions using a "struct.function" notation, I now have classes? If so, a class isn't a very powerful concept, is it?
Here's an easy way to see the difference between traits and interfaces, and simultaneously see why some people find OOP completely inadequate for their purposes.
Imagine you're implementing several different types that all need to be able to be added. You've got integers, floats, mathematical vectors, and possibly other types. All of them should support an `add` method. But you should only be able add objects of the same type: an integer to an integer, a float to a float, and a vector to a vector. This incredibly simple idea, to my mind almost the simplest thing a person might want to do with a type system, cannot be expressed in most OOP languages using an interface with an `add` method.
But it can easily be expressed using traits (or using Haskell's type classes).
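A hedged sketch of that in Rust (Addable is a made-up trait name; the real-world equivalent is std::ops::Add, which uses the same Self-typed idea):

trait Addable: Sized {
    fn add(self, other: Self) -> Self;
}

#[derive(Debug, Clone, Copy)]
struct Vec2 { x: f64, y: f64 }

impl Addable for Vec2 {
    fn add(self, other: Self) -> Self {
        Vec2 { x: self.x + other.x, y: self.y + other.y }
    }
}

impl Addable for i64 {
    fn add(self, other: Self) -> Self { self + other }
}

fn main() {
    let v = Vec2 { x: 1.0, y: 2.0 }.add(Vec2 { x: 3.0, y: 4.0 });
    let n = 2i64.add(3);
    println!("{:?} {}", v, n);
    // Vec2 { x: 1.0, y: 2.0 }.add(5i64) would not compile: both operands
    // must be the same concrete type.
}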
And this is why I never understood why anyone bothers with OOP.
Nobody bothers with "OOP" or "FP" as such, because there is not a single canonical implementation of these that everyone agrees on.
What you're talking about has nothing to do with either.
Any language that has generics can have self types or whatever you want to call them. It's a matter of whether the language has a strong static type system, and most FP-style languages have one, as does a minority of OOP-style languages.
The reason why I use C++ has nothing to do with a preference for OOP. It's because all the widely used alternatives have terrible performance.
I hate dealing with C++ projects. I hate dealing with memory leaks and segfaults. But that doesn't stop me from creating more of them and working on existing ones.
All of that because the resulting memory footprint and performance are worth it.
You seem to be trying to argue with me, but the points you're making are pretty much orthogonal to the ones I made. If you think OOP is not a meaningful concept, you should be arguing with the comment I was replying to, not me.
By that definition, Haskell is an OOP language too, because Haskell has type classes (traits/interfaces) too, and you can have associated methods/associated types too.
It's not, though; AFAIK the x.foo() syntax is just sugar for a plain function call:

    let a = Foo;
    Foo::bar(&a); // what a.bar() desugars to, if bar takes &self
Plus, there is no concept of inheritance. I don't see how that meets any definition of OOP unless you make the definition so weak as to be meaningless.
Depending on the language, interfaces can be a lot less powerful than traits. Traits can have default implementations for some methods (meaning you don't have to explicitly implement all of them), and in some cases they can be completely derived automatically. They're really more similar to Haskell's typeclasses than to traditional OO interfaces.
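A small sketch of both points, with made-up names (Greet/User); Clone and Debug below are derived rather than written by hand:

    trait Greet {
        fn name(&self) -> String;

        // Default implementation: implementors get this for free,
        // but may override it.
        fn greeting(&self) -> String {
            format!("Hello, {}!", self.name())
        }
    }

    // Clone and Debug are derived automatically instead of hand-written.
    #[derive(Clone, Debug)]
    struct User { name: String }

    impl Greet for User {
        fn name(&self) -> String { self.name.clone() }
        // greeting() comes from the default implementation.
    }

    fn main() {
        let u = User { name: String::from("Ada") };
        println!("{}", u.greeting()); // "Hello, Ada!"
        println!("{:?}", u.clone());  // uses the derived impls
    }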
I, at least, don't really think of Rust as having classes. Rusty design patterns don't really look like many traditional OOP design patterns. This is often a hurdle for new Rust programmers. Chapter 17 of the book (https://doc.rust-lang.org/book/second-edition/ch17-00-oop.ht...) is trying to grapple with this question.
Part of the difficulty here is nailing down what "class" even means, exactly. Rust doesn't fit into any of the three major schools of OOP, which I personally nickname the "smalltalk school", the "java school", and the "javascript school".
Obviously Rust is in the 4th school, which I like to call "The Rust School".
All kidding aside, I think for any regular working day programmer Rust is obviously OOP. The debates are really just which parts of which favorite school of OOP you think Rust is inspired by.
But what really matters is that Rust gives you:
* Encapsulation
* Polymorphism
* Code reuse
Those are the only three things anyone who reaches for OOP is really looking for anyway. As fun as the PL-theory discussions are (kind of a hobby for me, at least), I think those are the bits that actually matter to the people who genuinely need an answer to the question "Is Rust OOP or not?"
I agree that it's better to focus on encapsulation/polymorphism/reuse, but the reason why I personally object to the idea that "for any regular working day programmer Rust is obviously OOP" is because inheritance is usually taught above all of these as the fundamental property of OOP.
I doubt I'm the only one whose high school and college exposure to OOP was to model "Dog is-a Mammal, Cow is-a Mammal, Mammal is-a Animal", and if a newcomer to Rust tries to do the same as their "hello world" then they're going to be greatly frustrated. Again, this isn't to say that the phrase "Rust is OOP" is necessarily incorrect, only that it's going to backfire if we go around shouting it, due to misaligned expectations.
Agree 100%. And I think that what is commonly thought of as OOP (the Java model) suffers greatly from real-life analogies like that. Not everything fits into neat hierarchies – not in real life and not in code.
I also think that the concept of "class" suffers from a great amount of non-orthogonality. A class packages together a memory representation, an encapsulation boundary, an interface that defines functionality, and a unit of type parametrisation. I find the way Rust separates modules as the unit of encapsulation, traits as the unit of functionality, and structs as the unit of memory representation very elegant.
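As a concrete (made-up) illustration of that separation: the module is the encapsulation boundary, the struct is the memory representation, and the trait is the unit of functionality.

    mod geometry {
        use std::f64::consts::PI;

        // The field is private to this module: encapsulation.
        pub struct Circle { radius: f64 }

        impl Circle {
            pub fn new(radius: f64) -> Circle { Circle { radius } }
        }

        // The trait describes functionality, independent of the struct.
        pub trait Area {
            fn area(&self) -> f64;
        }

        impl Area for Circle {
            fn area(&self) -> f64 {
                PI * self.radius * self.radius
            }
        }
    }

    use geometry::{Area, Circle};

    fn main() {
        let c = Circle::new(2.0);
        println!("{}", c.area());
        // Accessing c.radius here would not compile: it's private to the module.
    }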
Is-A does not and should not imply inheritance. It merely implies polymorphism. I didn't get formally trained, so I can't comment on what colleges and high schools teach, but if they teach that Is-A relationships imply inheritance, then claiming Rust is OOP seems like a good way to educate the mis-educated regarding the difference.
What kind of polymorphism? Traditional OOP is subtype polymorphism. You can't really do that in Rust. Its main polymorphism is parametric and ad-hoc polymorphism a la Haskell. If that's considered OOP, then OOP is so broad it's not a useful term.
I don't think that "is-a" is formally defined anywhere. In my school experience, using Java specifically, "is-a" was used to teach the `extends` keyword, while "has-a" was used to teach the `implements` keyword.
> Obviously Rust is in the 4'th school which I like to call "The Rust School".
Ha! But yeah, if you're going to claim Rust is OOP, I would argue that makes the most sense.
> what really matters is that Rust gives you:
Right, this is the "Java School" definition. However, when people talk about OOP this way, when they say "polymorphism", they mean "subtype polymorphism", not "parametric polymorphism." Take https://docs.oracle.com/javase/tutorial/java/IandI/polymorph... as an example. Rust's only subtyping is for lifetimes (a tiny sketch of what that looks like is at the end of this comment).
> As fun as the PL theory discussions are.
I definitely agree that in some senses, this is academic. But at the same time, practitioners can be misled by using terms in a different way than they're used to. This is sort of the argument I'm making above; since many practitioners see "polymorphism" as being equal to "subtype polymorphism", calling other types of polymorphism "polymorphism" is a more academic, less user-focused distinction.
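For completeness, a tiny (contrived) sketch of that lifetime subtyping: a &'static str can be passed wherever a shorter-lived &'a str is expected, because 'static outlives every other lifetime.

    fn first_word<'a>(s: &'a str) -> &'a str {
        s.split_whitespace().next().unwrap_or("")
    }

    fn main() {
        let local = String::from("hello world");
        println!("{}", first_word(&local));        // ordinary, shorter-lived borrow
        println!("{}", first_word("static text")); // &'static str coerces via subtyping
    }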
Oh dear. That Java tutorial just reads wrong to me, but that's probably because I've been burned too many times by conflating polymorphism with code reuse in the form of inheritance.
I think the working programmer doesn't know or care, though. They care about whether you can reuse code in some way, and whether multiple implementations of an object can be substituted for one another as one particular type.
The specifics of how you reach either of those goals are all they want to know, and the pedantry that we language geeks love to engage in just blows right past them.
It's possible we've just interacted with different people; I've spoken to a lot that assume inheritance, specifically, and we get questions about it often.
These are three incredibly vague terms that pretty much every programming language can be said to provide. I can't understand how you think OOP is defined by this.
Do you think Haskell programmers are unable to reuse code? That they don't have any form of polymorphism? That they can't hide implementation details of functions, data structures and modules?
This is the kind of argument that convinces me that OOP is completely bunk. Its adherents can't even explain what it is!
> These are three incredibly vague terms that pretty much every programming language can be said to provide. I can't understand how you think OOP is defined by this.
I do not want to defend OOP, but I would like to add that object-oriented languages are often expected to support these 3 things (encapsulation, inheritance and polymorphism) - at least, many sources list these properties as must-haves to be considered object-oriented.
> This is the kind of argument that convinces me that OOP is completely bunk.
Agreed. E.g. functional programming has a sound foundation (λ-calculus), but in object-oriented land it often sounds a little bit hand-wavy.
I wouldn't call myself an adherent. If anything I'm an adherent to the idea that Objects are just another Closure and vice versa. The interesting distinctions are in how they achieve those same goals.
But most working programmers don't care about the debate at all, in my experience. When they say they are looking for an object-oriented language, what they are really saying is:
I am looking for a language that has a syntax where you
can declare an object and call methods on that object
usually with a dot but sometimes with an ->.
They don't actually care about much more than that most of the time.
> I'm looking for a language with "objects" and a syntax like <this>
But why? That's such a weird thing to look for. Clearly there's a reason they think they want "objects", but what is it? And why the focus on what syntax these objects have?
I'm not arguing this is a good thing. But I am saying that for the vast majority of programmers out there they use that pattern as a way of determining if they have those three attributes. It's cargo cult programming in a sense but it's also the only signal they've ever known.
I think calling Rust that would confuse people more than anything. They will look for their traditional inheritance-style relationships and be frustrated that there is not really any support for them. Better to teach the features that Rust actually has than to try to fit some uber-generic definition of "Objects" that's not going to serve newcomers.
> Which are the only three things anyone who reaches for OOP is really looking for anyway.
Those are 3 characteristics of modularity, not object or class-specific in any meaningful sense. Rust isn't really a classic object-oriented language, but it is a modular language with object-like abstractions.
I'd argue that Rust is pretty close to being "javascript school" OOP, except with the addition of strong typing.
JavaScript code bases use a lot of classes, but while JavaScript supports inheritance, its use isn't particularly idiomatic. What is idiomatic is duck typing (checking for the presence of a method or property on an object). This is basically the same as using traits, except without the compile-time correctness guarantee. You even have things like `Symbol.iterator`[0], which allow an arbitrary object to work with a `for..of` loop, exactly like `impl Iterator` in Rust.
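To illustrate the Rust half of that comparison (Countdown is a made-up type): implementing Iterator lets any type drive a `for` loop, much like defining Symbol.iterator does for `for..of`.

    struct Countdown(u32);

    impl Iterator for Countdown {
        type Item = u32;

        fn next(&mut self) -> Option<u32> {
            if self.0 == 0 {
                None
            } else {
                self.0 -= 1;
                Some(self.0 + 1)
            }
        }
    }

    fn main() {
        for n in Countdown(3) {
            println!("{}", n); // 3, 2, 1
        }
    }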
If you wrote that in C++ or Java or Python or whatever, it would/could look fundamentally the same. You create a window object and then call a bunch of its methods (mutating its state). That's OOP.
I think the Rust people just desperately try to disassociate themselves from the OOP label because OOP is not hip anymore. "So 90s" and all that.
I think that's childish. If you use a broad definition of OOP according to which C++, Smalltalk, Java, and Python are all OOP languages, that definition certainly covers Rust too.
To be fair, even the C++ crowd doesn't like to use the term OOP anymore and instead says things like "generic programming".. but at the end of the day they too are instantiating classes.
Yeah, the usage does, but the definition does not, the implementation does not, and the features are very different.
For example, "new" is just a convention here, not an actual constructor, as Rust does not have constructors.
> You create a window object and then call a bunch of its methods (mutating its state). That's OOP.
It depends on what you mean by "object". If structs are objects, then OOP boils down to only "you can use x.y() instead of y(x)", which I don't think is a good way to think about programming languages or their features. YMMV, of course.
> I think the Rust people just desperately
I can assure you, that's not the case for me at least; I love OOP languages enough to have both the Ruby and Perl logos tattooed onto my body.
I don't like saying Rust is OOP because of said definitions elsewhere in the thread, as well as people struggling to map their OOP patterns over. If they hear "Rust is OOP", they expect to be able to do OOP-like things, and when they can't, that's a big frustration. Enough so that we had to add that book chapter.
"OOP" has suffered since the 90s of being a somewhat vague term in practice (much to the chagrin of Smalltalkers), but making it even more overbroad just to make it apply to Rust will only drive the term further into uselessness. It's not about being "hip", it's about using terminology that's usefully precise. (For the record, I also think "functional programming" is a uselessly broad category these days, having become a victim of its own success.)
It's uncontroversial to state that Rust has methods, which are usually associated with OOP. It's also uncontroversial to state that Rust is utterly incapable of defining inheritance hierarchies, which are also usually associated with OOP. If we're going to argue about it, we might as well try to agree on terms that will let us say things that are meaningful.
It actually is the consensus in the OOP world these days that composition is preferable to inheritance in most cases.
Declaring class inheritance to be the defining feature of OOP just because it was fundamental to 1990s Java courses makes about as much sense as saying extreme late binding is a fundamental aspect of OOP, like Alan Kay does, and thus neither Java nor C++ is an OOP language. Yippee! Suddenly they are no longer boring, old OOP anymore either! Thanks Alan!
You mention FP being a broad category. Indeed it is. Haskell is all about expressive static types, meanwhile Erlang couldn't care less about types. Yet I dare to argue that this broad definition is not "useless" because both Erlang and Haskell default to immutable data and pure functions, and that defines FP more than anything.
In practice, this means that Erlang and Haskell tend to be massively closer to each other than they are to imperative languages.
The same is the case here. Rust is massively closer to C++ than to Haskell. Mutable ADTs + methods + interfaces, as opposed to free functions and (immutable) "raw" data, define the code style far more than questions like subtype vs. parametric polymorphism do.
If you think that the fact that your dog does not "inherit" from mammal but instead has the mammal "trait" means you are doing a fundamentally different kind of programming, you are wrong. Such differences are of lesser importance in practice.
> It actually is the consensus in the OOP world these days that composition is preferable to inheritance in most cases.
No argument there. :)
> Yet I dare to argue that this broad definition is not "useless" because both Erlang and Haskell default to immutable data and pure functions, and that defines FP more than anything.
I think this illustrates what I'm trying to get at: if one's task is to further the tenets of functional programming, then going around espousing "functional programming" as a general concept is less direct than just cutting to the chase and evangelizing for immutability and/or purity directly, especially when one considers that e.g. Common Lisp is neither immutable nor pure, but will be what plenty of people's minds jump to when they think of FP.
> I think the Rust people just desperately try to disassociate themselves from the OOP label because OOP is not hip anymore.
No, it's mostly driven by the differing semantics despite happening to have similar syntax. Syntax is a surface polish that people focus on a lot, but doesn't drive the core language behaviour.
Associated consts allow constants to be derived from our trait-constrained parametric polymorphism (a small sketch follows at the end of this comment). C++ currently has no analog to this system (templates can be used, but don't provide a constraint system; concepts are not in C++17 afaik).
At an introductory level, like this announcement, we teach this system by analogy to OO, but it's not OO at its core, and snide comments like this only reveal the limits of your knowledge.
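Here's the sketch mentioned above: a made-up Bounded trait whose associated const is usable through a generic bound.

    use std::fmt::Display;

    trait Bounded: Sized {
        const MAX: Self;
    }

    impl Bounded for u8 {
        const MAX: Self = 255;
    }

    impl Bounded for i16 {
        const MAX: Self = 32_767;
    }

    // The constant is reachable generically, via the trait constraint.
    fn print_max<T: Bounded + Display>() {
        println!("{}", T::MAX);
    }

    fn main() {
        print_max::<u8>();  // 255
        print_max::<i16>(); // 32767
    }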
I used to be a huge fan of C++. I loved how you could work with a somewhat OO language while still wielding dark powers like manual memory management, mix-n-match polymorphism (a.k.a. virtual inheritance), templates, and type erasure.
Then I got a full time job working with a large C++ codebase. Since then I've been dealing with:
- uninitialized variables / members
- NULL checks
- buffer overflows, especially with C-strings
- (mutable) global variables
- exception safety
None of these things exist in (safe) Rust. I'm sold.
That's my main beef with C++. You read a book and it's beautiful - no uninitialized memory, safety, no C arrays/strings, RAII everywhere, it's clean, fast... then you get to the "real world" and realize that 99% of C++ programmers are pretty f###ing horrible at their job and should've just stuck to naked C, because at least then you'd be able to pinpoint the bugs more easily.
So C++ is a wonderful language - in a world that only exists in the creators' heads, or on committee tables, or in brand-new, post-C++03 codebases. Basically - nowhere.
But isn't that the case with basically ANY programming language? Once its use becomes widespread, escaping the confines of the highly skilled early adopters, you can be sure a whole lot of dodgy code will be written in it.
Not really, because the language doesn't make it difficult to write dodgy code and gives you no easy way to find such dodgy code. In other, safe languages, either you simply can't write this kind of dangerous code, or it's easier to catch with just a keyword search. In C++ you need powerful static analysis and some dynamic instrumentation to hopefully catch it.
"class methods" are specifically static methods, meaning they don't have a self/this object. Rust's closest equivalent to classes has always allowed normal methods and variables (as well as class/static methods, just not class/static "variables").
Rust isn't inherently thread-safe; not in the way Java is thread-safe because of how the JVM's memory model is defined.
Rust makes it really difficult to share mutable data. But once you do that, such as by smuggling a pointer _through_ an unsafe block, all bets are off, even for code outside an unsafe block (a contrived sketch of this is at the end of this comment).
Also, as with Java, there will always be bugs in the compiler and runtime. Rust programs were susceptible to StackClash just as much as C applications.
No matter what language you use, if you minimize shared mutable data you'll minimize thread-safety issues.
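To make the "all bets are off" point concrete, here's a contrived, single-threaded sketch (make_aliases is made up for illustration) of how one bad unsafe block poisons the "safe" code around it:

    fn make_aliases(x: &mut i32) -> (&mut i32, &mut i32) {
        let p = x as *mut i32;
        // The unsafe block itself compiles fine, but it breaks Rust's
        // aliasing invariant by handing out two &mut to the same value.
        unsafe { (&mut *p, &mut *p) }
    }

    fn main() {
        let mut v = 0;
        let (a, b) = make_aliases(&mut v);
        // No unsafe in sight here, yet the program now has two live
        // mutable references to the same i32 - undefined behaviour.
        *a += 1;
        *b += 1;
        println!("{} {}", a, b);
    }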
If you use a C FFI in any language, including Java, all kinds of safety are off. Unsafe Rust is equivalent to C (with a lot of mandatory lints) in terms of safety, so Rust is not really less safe.
Not necessarily. There is no actual reason that you couldn't be required to prove to the compiler that your code is safe.
That proof might be parameterised by a proof that some external FFI function was safe, which you might not be able to actually prove and have to assume, but then you would have your assumptions well-documented.
As it is, you have to justify the safety of your unsafe blocks to other programmers using comments, which kind of sucks.
Still better than every other fast language in this area though so I can't complain much.
You are correct there. Though we can quibble over the definition of "thread safety", concurrent memory safety is a subset of memory safety, which safe Rust enforces (or at least intends to enforce, modulo bugs).