"we model a scenario where the original code is memory-safe; the ported code is memory-safe; and we consider memory safety and undefined behavior that may arise across the FFI layer between the two pieces of code."
I may be stating the obvious, but that's a bit of a strawman. Yes, writing good FFI code is hard; yes it could result in security/soundness issues; yes, we could use better tools in this space.
But nobody rewrites C code in Rust if they believe the existing codebase is free of memory safety hazards; they rewrite it because they think the result will contain fewer hazards, even accounting for the potential problems at the FFI boundary.
If I could remove tens of thousands of lines of hard-to-analyze C code, and replace it with tens of thousands of lines of safe Rust, paired with a few hundred lines of hard-to-analyze FFI adapters, that sounds like a pretty good tradeoff to me. I now know exactly where to focus my attention, and I can have confidence that the situation will only improve with time: better tooling may allow me to improve the dangerous FFI layer, and in the meantime I can recklessly improve the safe Rust module without fear of introducing new memory unsafety bugs, unsound behavior, or data races.
Exactly. No one is saying "rewrite it in Rust because it already works". They typically say it because the thing in question is a bug farm, or it's really difficult to maintain.
I looked for the author's (whoever they are) proposed solution and it's this:
"This is because many of the FFI bugs surveyed are fundamentally cross-language issues. Instead, we propose that both C and Rust must conform to a shared, formally-based domain-specific language, which provides a safe-by-construction abstraction over the FFI boundary."
Such a thing is a "new thing", and isn't going to retroactively apply to the legacy C code written before this new thing... so how does that help?
(Full disclosure: I'm a professional programmer who has written in C, C++, C# for many years, and now I choose to write new things in Rust.)
> Exactly. No one is saying "rewrite it in Rust because it already works". They typically say it because the thing in question is a bug farm, or it's really difficult to maintain.
Well some people are. Is there a word for attacking the weakest real version of an opposing argument? Strawman usually implies you are attacking a fake version of the argument, but on the internet you can usually find someone who actually holds an easily refuted point of view, just because of the law of large numbers...
> nobody rewrites C code in Rust if they believe the existing codebase is free of memory safety hazards
I'd offer that even if the existing codebase is free of memory safety hazards, low confidence in future changes being able to keep it free of memory safety hazards in a cost efficient manner is a motivation to migrate.
The idea that a program can approach some optimal bug-free state, never to be modified or refactored again, doesn't resemble any project I've ever encountered.
> But nobody rewrites C code in Rust if they believe the existing codebase is free of memory safety hazards; they rewrite it because they think the result will contain fewer hazards
What if someone makes a strawman out of memory problems so they can rewrite it in Rust?
> But nobody rewrites C code in Rust if they believe the existing codebase is free of memory safety hazards; they rewrite it because they think the result will contain fewer hazards, even accounting for the potential problems at the FFI boundary.
That's pretty generous. They rewrite it in Rust because "Show HN: Thing-X, in Rust" gets upvotes.
Agree. I just wanted to add that this paper might be right for projects that are not actively developed anymore, something like bash or coreutils, etc., as this is fairly well-tested code and there aren't that many added features that could introduce issues.
For anything that is actively developed it's a whole other story: even if you are confident that the current codebase is safe, each added feature carries the risk that it breaks some unwritten contract somewhere and introduces security issues.
E.g., look at the recent vulnerability in sudo; at first and second sight it was safe and secure, and triggering it required an unobvious corner case.
How many similar issues could be lying dormant in your codebase for years?
> Bash looks like it didn't have any commits this year yet
Does anyone know why this is? Is it because doing anything would cause POSIX divergence? Nobody wants to (because it's an "ugh experience")? It's considered effectively complete?
Genuinely curious.
I don't actually use Bash as my personal shell, though I do write bash scripts semi-regularly.
As a hobbyist with only limited C++ and Rust experience, reading things like the abstract of this paper pushes me towards Rust a bit.
The snark is really offputting. As a user of software trying to protect my personal information in an increasingly threatening world, I would like memory safety in more places. From the outside looking in it feels like there are a lot of C and C++ programmers trying to deflect from how poorly their attempts at memory safety have gone and just snarking at the people trying to fix it.
I will say that the point about FFI boundaries needing more care is a valid one I had not considered.
Is there a name for people who don't work in C++ and never, ever hear "rewrite it in Rust" except from programmers who seem like they're a bit hurt that their language has a competitor?
Not long ago, surgeons didn't have a rule of washing their hands before performing invasive surgery, and lots and lots of people died because of that.
At some point, one surgeon discovered that, with incontestable proof, and started advocating for hand-washing. When almost all surgeons noticed that something so simple and easy had that much impact, they immediately adopted it and saved their patients a lot of suffering...
Ok, of course they didn't. Almost everybody fought the few hand-washing advocates, tried to shame and discredit them, tried to generate bullshit evidence saying that it didn't help, tried to convince everybody that it didn't matter, all the time denying to themselves that it worked. Surgeons only started unanimously washing their hands when that generation of them retired.
> Despite various publications of results where hand-washing reduced mortality to below 2%, Semmelweis's observations conflicted with the established scientific and medical opinions of the time and his ideas were rejected by the medical community. He could offer no theoretical explanation for his findings of reduced mortality due to hand-washing, and some doctors were offended at the suggestion that they should wash their hands and mocked him for it. In 1865, the increasingly outspoken Semmelweis allegedly suffered a nervous breakdown and was committed to an asylum by his colleagues. [...] His findings earned widespread acceptance only years after his death, when Louis Pasteur confirmed the germ theory, giving Semmelweis' observations a theoretical explanation, and Joseph Lister, acting on Pasteur's research, practised and operated using hygienic methods, with great success.
This analogy seems pretty disingenuous. Static memory safety isn't an unambiguous good like handwashing; it has tradeoffs. There are many projects where control over memory is critically important and using a language like Rust is a non-starter. Also, the implication that developers who don't switch to Rust are negligently harming people is unnecessarily moralistic and divisive.
You're right overall. But I don't think there's any project that needs control over memory that Rust can't provide. Maybe you mean it would be inconvenient and not provide significant benefit?
Projects where you need to think about memory are a pretty small percentage of projects. Even on embedded systems, I wouldn't be surprised if there comes a time when nobody even knows what a register is.
For projects where you do need rust, obviously "This is literally impossible to do any other way" takes precedence over "It's best practice to do it this way", but the rest of the time.... I'm not sure I see the tradeoffs unless you're doing some crazy hand optimized low level library, or you just have a strong preference for direct control.
The existence of Redox OS in my mind is enough to prove that Rust can do just about anything and lower level stuff is needed less than people seem to think
Then again, I haven't worked much with Rust, it's only just now supported on any platforms where I'd have a reason to use something lower than JS/Python.
>Projects where you need to think about memory are a pretty small percentage of projects.
Performance matters in every project, and the way you get performance on modern processors is by using the cache effectively. That requires a certain amount of control over memory layout and access.
It's not so much about what's possible in a given language, it's about what features you're using and what features are getting in the way. Sure, you can write a bunch of unsafe Rust, but at what point are you fighting against the language and adding unnecessary friction?
The way you get performance on modern processors is to use libraries that use the cache effectively for the stuff that actually eats CPU.
High level languages are good at enabling high level optimizations like caching and special case handling of things, which might be a bigger gain than anything to do with memory layouts except in a few inner loops.
High performance and reasonably focused low complexity stuff might be a good fit for C, until compilers get even better.
But in a web browser? I guess time will tell, but my guess is the ability to more effectively manage complexity will let people do more high level optimizations with less fear of buffer overflows from more code.
Things like "In this specific case I can fall back to this very simple plaintext rendering for this one section" might save more CPU than "We made all rendering a bit faster" especially if you just keep doing the performance critical stuff in C.
"More control over memory than C" is nonsense. C allows you to do anything the machine's instruction set is capable of, all the way down to inline assembly.
Undefined behavior is a completely separate conversation. UB is necessary to output performant code. If dereferencing a null pointer were defined behavior, then the compiler would need to insert a check for every dereference, which would slow down your code.
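As a side note from the Rust angle (a sketch of mine, not something from the paper): Rust sidesteps this particular cost because references can never be null, so no implicit check is ever needed; a nullable pointer is spelled `Option<&T>`, which stays pointer-sized thanks to niche optimization, and the "null check" becomes an explicit, typed branch.

```rust
use std::mem::size_of;

fn main() {
    // Option<&u32> uses the forbidden null value as its None representation,
    // so it costs no extra space over a bare reference.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());

    let x = 5u32;
    let maybe: Option<&u32> = Some(&x);
    if let Some(r) = maybe {
        // The check is explicit in the source, and the compiler can elide it
        // where the value is provably Some.
        assert_eq!(*r, 5);
    }
}
```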
> C allows you to do anything the machine's instruction set is capable of
Hum, no, unless you count inline asm, it simply doesn't.
And for more control over memory than C, Rust allows mapping high-level types into fixed memory, so the compiler can actually use the correct asm primitives and make your access fast. In C you either have to use a general purpose array abstraction or try to map your types over the memory and pray the compiler resolves all the UB the way you want. Most C developers do the latter.
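A rough sketch of what this can look like in Rust (the `Header` type here is hypothetical, just for illustration): `#[repr(C)]` pins down the layout, and reading a value out of a raw byte buffer is done with an explicit, well-defined primitive rather than by punning pointers and hoping.

```rust
// Hypothetical wire-format header with a defined, C-compatible layout.
#[repr(C)]
#[derive(Debug, PartialEq, Clone, Copy)]
struct Header {
    magic: u32,
    len: u16,
    flags: u16,
}

fn parse_header(buf: &[u8]) -> Option<Header> {
    if buf.len() < std::mem::size_of::<Header>() {
        return None;
    }
    // read_unaligned does a bitwise copy without assuming the buffer meets
    // Header's alignment; the unsafe obligation is confined to this one line.
    Some(unsafe { std::ptr::read_unaligned(buf.as_ptr() as *const Header) })
}

fn main() {
    // Native-endian bytes, matching what read_unaligned will reassemble.
    let mut buf = Vec::new();
    buf.extend_from_slice(&0xDEADBEEFu32.to_ne_bytes());
    buf.extend_from_slice(&5u16.to_ne_bytes());
    buf.extend_from_slice(&1u16.to_ne_bytes());
    println!("{:?}", parse_header(&buf));
}
```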
I have no idea what you're talking about with "general purpose array abstraction" and "map your types over the memory". A struct in C is just a fixed size chunk of memory where the fields have a fixed offset. Access is a simple memory read/write. An array in C is also just a chunk of memory, where subscripting is some simple pointer math. There is no UB involved. If you could give a concrete example of what you're talking about and why it's better in Rust than in C, that might be helpful.
Only if you use unsafe, and even then there are restrictions. The general idea of Rust is that it will only let you do things it can verify are safe; otherwise it won't compile.
> Only if you use unsafe, and even then there are restrictions.
I can't think of any. The borrow checker still checks your code, but you can forge, twiddle, and cast pointers; there's really nothing you can't do that I can think of.
> In both unsafe functions and unsafe blocks, Rust will let you do three things that you normally can not do. Just three. Here they are:
>
> 1. Access or update a static mutable variable.
> 2. Dereference a raw pointer.
> 3. Call unsafe functions. This is the most powerful ability.
>
> That’s it. It’s important that unsafe does not, for example, ‘turn off the borrow checker’. Adding unsafe to some random Rust code doesn’t change its semantics, it won’t start accepting anything. But it will let you write things that do break some of the rules.
That doesn't change anything I said. Those unsafe functions include things like casting pointers. There's nothing you can't do through some composition of those 3 things.
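For the record, the three abilities from the quoted list can be sketched in a few lines (names here are mine, purely illustrative):

```rust
static mut COUNTER: u32 = 0;

// An unsafe fn: callers need an unsafe block (or unsafe fn) to call it.
unsafe fn bump() -> u32 {
    // Ability 1: access/update a static mutable variable. (The inner block
    // keeps this valid in editions where unsafe fn bodies are not
    // implicitly unsafe.)
    unsafe {
        COUNTER += 1;
        COUNTER
    }
}

fn demo() -> (u32, u32) {
    let x = 42u32;
    let p = &x as *const u32; // creating a raw pointer is ordinary safe code
    let deref = unsafe { *p }; // Ability 2: dereference a raw pointer
    let count = unsafe { bump() }; // Ability 3: call an unsafe function
    (deref, count)
}

fn main() {
    let (deref, count) = demo();
    println!("deref = {deref}, count = {count}");
    // Note the borrow checker is still on: even inside unsafe, two
    // simultaneous &mut borrows of the same local are rejected at compile time.
}
```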
That is a story to fit a theory.
Generations in a profession are not discrete units; they are a blend of people born every year between the eldest and the youngest.
Those who fought against a new idea are colleagues of those who fought for it, and colleagues of those who are ambivalent to it.
There isn't a name for you, but what you're experiencing is strawman attacks
Like the strawman argument, which creates a flawed argument to declare victory over, the strawman attack picks out the most extreme real case to declare victory over.
You'll see this done on left/right political debates where instead of recognizing that most people can find agreeable common ground on 80% of issues, extreme advocates will pick out the opposing extreme advocates to vilify each others' groups
I think a lot of Rust programmers appreciate where C is the right tool, & a lot of C programmers can admit that memory errors are a pain in the ass
>I think a lot of Rust programmers appreciate where C is the right tool, & a lot of C programmers can admit that memory errors are a pain in the ass
That's true, however IME C++ programmers are the most in denial. And yes, RAII has helped a lot. But still we see way too many memory errors coming from C++ code bases.
"Rewrite it in Rust" is another class of "I use Arch, BTW"; one of those phrases no one has ever actually heard unironically, and that is only used to attack the strawman who supposedly said it.
It's important to note that memory safety is only one kind of bug. Plenty of Go and Java programs are memory safe with a litany of bugs - and are probably directly protecting an order of magnitude or two more of your personal data than Rust and C++ - even without discussing relatively safe languages (interpreted, but with various implementations) like Python, PHP, and Ruby.
The thing is Rust also does great on the "other bugs" side.
The standard library and ecosystem follow the philosophy of making it easy to do the right thing, and preferably impossible to do the wrong thing; made possible by a very expressive type system. Of course logic bugs are possible in any language, and I wouldn't suggest rewriting a battle-tested Java application in Rust. But if you are already committed to rewriting (or starting a greenfield project) then Rust is a good choice for more reasons than just memory safety.
In my experience a memory error is often the result of a logic bug - that is the "first failure" of an existing issue in C++ is a memory error, but the originating logic error still exists and would just cause a different issue under "memory safe" languages. It would still be a bug, and may cause security issues.
IMHO the big advantage is not so much languages that are memory safe as languages that catch many of those misunderstandings at compile time, or at runtime in a better way than "Crash" or worse - silent corruption. (Much of this is available for C++ via static analysis, sanitizers, and other runtime checks, but their being non-default options seems to make them oddly ignored.) I feel their being non-default also means there isn't the same focus on performance - it's either everything-debug-and-slow-as-hell, with a UX purely designed around the dev debugging their own code, or no-checking-at-all-no-brakes, rather than anything in between.
The classic "memory errors" that are purely local - just not checking the bounds of an array, or similar - have been pretty much eliminated if you actually use the features (containers, iterators, etc.) of the C++ standard library.
The problem with memory unsafe languages isn't that they can crash. It's that they won't crash. And instead allow an attacker to load up malicious code into the application.
> better way than "Crash" or worse - silent corruption
My point is that the current crop of C++ compilers and standard library implementations have the ability to check for the vast majority of these issues; they're just relegated to non-default "debug" options.
Just flipping that, so you had some level of checking on by default and had to explicitly ask for "go fast and break things" mode, would help - it feels like the vast majority of native code doesn't need that last few percent. And even that is questionable: I remember doing some (micro-)benchmarks on simple stuff like array bounds checking, and the performance difference on modern big CPUs was pretty much within noise. Maybe it's a big deal on microcontrollers or small embedded systems, but that's what the "I Know What I'm Doing" non-default mode is for.
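For comparison, this is roughly the flipped default being argued for, as it ships in Rust: indexing is bounds-checked unless you explicitly opt into the unchecked form at the call site.

```rust
fn main() {
    let v = vec![10, 20, 30];

    // Checked by default: an out-of-range index panics instead of silently
    // corrupting memory.
    assert_eq!(v[2], 30);

    // Checked, non-panicking variant.
    assert_eq!(v.get(10), None);

    // Explicit "I Know What I'm Doing" opt-out, clearly marked with unsafe.
    let fast = unsafe { *v.get_unchecked(1) };
    assert_eq!(fast, 20);
}
```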
> better way than "Crash" or worse - silent corruption
My apologies, I could have sworn that your text was 'better way than "Crash" <end>'. And not in fact, 'better way than "Crash" or worse - silent corruption'.
Although silent corruption is a possible problem in memory safe languages too. True, memory safety gives you some protections, but threads and/or setting the wrong value can still corrupt data in a memory safe program.
No, I was talking about the ability for an attacker to craft malicious input such that a memory unsafe program begins executing that input as if it were the program. Something that you generally do not have to worry about in a memory safe language.
I think I edited it within seconds of posting, it was just "Or worse" IIRC, and noticed I'd missed it when rewording it just before posting. I guess you were lucky enough to see it immediately :P It's a bad habit of mine to post then re-read with fresher eyes and notice problems in my comments - exactly like this clarification.
Security of technology is really a process, not a boolean. "Memory Safe" doesn't mean "Safe", and "Not the current crop of Memory Safe" languages doesn't mean "Unsafe". There's so much we can do with the tools we have right now to help with current codebases and apps, too many people seem to have the idea you have to do a ground up rewrite using the latest and greatest language, or you're an idiot and/or lazy.
Honestly, the whole world isn't going to be rewritten in rust tomorrow, imagine the time and effort that would take and all the other errors a ground-up rewrite of well-tested systems would cause. But we can probably rebuild 90% of the existing world in a "Safer" manner right now. We should be promoting that too, and not let perfection be the enemy of better.
> But we can probably rebuild 90% of the existing world in a "Safer" manner right now
I strongly agree with this mindset. The only caveat I would offer is that some architectural structures that C and C++ allow probably don't lend themselves to fixing, even if you can detect them.
Although just like you can write any language horribly, I think you can write almost any language well.
It's hard to see how tools that allow you to do that can be a bad thing. I'm certainly not going to be paying to rewrite everything from the ground up in rust or otherwise.
>> it’s important to note that memory safety is only one kind of bug.
Is it? Large companies with huge code bases (meaning lots of statistical significance) have said 70 percent of their CVEs could not have happened if that code was written in Rust. Not only are those kinds of errors very common, they can be IMHO a pain to debug.
So given the prevalence of memory safety issues, why do you find it important to remind people that other kinds of bugs exist?
There's a theory that C programs are better designed from an implementation perspective and from a feature perspective precisely because it's so difficult to work with. The cost of implementing architectural complexity is so high that people put in extra thought and effort to discover the simplest design that can possibly work, which means better-understood and more reliable systems.
Same thing with features. Fewer, better-vetted features mean more secure systems.
C encourages better, simpler designs, but at the cost of making it nearly impossible to write safe code. Garbage-collected languages remove entire classes of errors, but they make it easier to implement lazy, unnecessarily complex designs.
Maybe Rust offers the best of both worlds? (I say this mostly but not entirely tongue in cheek.)
> Maybe Rust offers the best of both worlds? (I say this mostly but not entirely tongue in cheek.)
AFAIK Rust's solution for trees/graphs is to replace pointers with indexes. IMHO this is worse, because if you get it wrong with pointers you may get a core dump, while with indexes you just silently get the wrong node.
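A minimal sketch of the index-based pattern being described (the names are hypothetical, not from any particular crate): nodes live in a Vec and refer to each other by index instead of by pointer.

```rust
struct Node {
    value: i32,
    children: Vec<usize>, // indexes into Arena::nodes, not pointers
}

struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    fn new() -> Self {
        Arena { nodes: Vec::new() }
    }

    // Push a node and return its index; the index plays the role of a pointer.
    fn add(&mut self, value: i32) -> usize {
        self.nodes.push(Node { value, children: Vec::new() });
        self.nodes.len() - 1
    }

    fn link(&mut self, parent: usize, child: usize) {
        self.nodes[parent].children.push(child);
    }
}

fn main() {
    let mut arena = Arena::new();
    let root = arena.add(1);
    let child = arena.add(2);
    arena.link(root, child);
    let first_child = arena.nodes[root].children[0];
    println!("{}", arena.nodes[first_child].value);
}
```

Which is exactly the complaint above: a stale or miscomputed index either panics on a bounds check or, worse, silently lands on a different but still-valid node. It's memory-safe, not logic-safe.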
How does memory safety allow you to protect your personal information? How many binary exploits have actually led to the exposure of your personal information?
I can count on my fingers the number of times that has happened to me: exactly zero. The times my personal information was compromised were when I downloaded malicious software. Malicious software doesn't care whether your process has "memory safety"; it'll extract anything from the process's memory it wishes. Sure, memory safety would at least rule out simple binary exploits, but being targeted by one is not exactly common.
I don't get the narrative of an ever increasingly threatening world. If anything things have only been getting better, at least for the individual users. I remember individuals getting pwned very frequently in the old days. Now, not so much, with DEP, ASLR, and stack canaries. The attacks on businesses have increased, but those attacks are mostly just phishing or otherwise human error related.
Attack vectors for actual PI retrieval rarely incorporate some obscure memory bug in a program allowing RCE or remote access. Unless you're a high value target.
> How many binary exploits have actually lead to the exposing your personal information?
It happens with some frequency. WhatsApp was exploited due to a memory safety vulnerability, for example. Chrome 0days exploiting memory safety vulns are definitely used in the wild.
Typically companies have much much larger attack surface, so a technique like phishing is going to be far cheaper to execute. But even still, I've seen a memory safety vulnerability used in an attack against a company.
The thing is that most companies' attack surface isn't in C/C++ because... that would suck, so they don't do it. Or if they do they use a specific codebase that's been heavily invested in over decades and they sandbox and isolate the services.
So on the one hand, yes, most attacks on companies are not due to memory safety issues but that's in part because of the investments into memory safety.
> I remember individuals getting pwned very frequently in the old days.
Yep, significant efforts were made to make the internet a safer place. Primarily the sandboxing and disabling of third party plugins in browsers.
But this doesn't really matter. Yes, there are other issues like phishing and those are being addressed with other techniques. There are issues like sql injection and those are also being addressed. That doesn't mean that memory safety isn't an issue.
I take issue with the notion that languages can have competitors. Languages are tools, and some languages are more suited to some tasks than others, but each has their place.
Saying languages compete with each other is a bit like saying different kinds of hammers compete with each other. They don't, each is just optimized for a different kind of hammering.
I would be surprised if hammers didn't succumb to exactly the same type of competition. Here are a few hammer review pieces, pitting metal against metal.
Those are comparing the same types of hammers from different manufacturers. My analogy is that different kinds of hammers don't compete with each other. A claw hammer, for instance, isn't competing with a ball peen hammer. They each are used for different kinds of tasks.
Just ask people what the round end of a ball peen is for. Few people know what peening is, fewer know how to do it, yet a ball peen hammer is still one of the most common hammers around. It's competing against other specialized hammers and winning because people simply have it already.
This is also replying to nordsieck, who made the same point.
You are both correct, but I was using "competition" in a different sense. I was thinking in terms of how people often talk about these issues: as if there was a One Language To Rule Them All. That is the notion that I object to.
When considering the best tool for a particular job, the different contenders are competing in a sense for that job. But I see that as not being the same as when people are asserting that a given language is always the superior choice independent of the task at hand.
The responses to my initial comment really surprise me. I had no idea I was saying anything controversial, but apparently I am!
Saying that languages compete isn’t the same thing as saying one language is objectively better than another. It just means that people might consider both options and weigh up the pros and cons of both. When people say that companies compete, they aren’t saying that one is objectively better than another, if anything it’s the opposite
Thinking of languages as hammers or screwdrivers is too simplistic. Instead, think of languages as platforms.
Your productivity in any language is directly proportional to your time and effort investment in it. It's in your best interests to pick the language that's likely to thrive and spend time learning its ins and outs. On the flip side, betting on a horse that doesn't win could mean the loss of months or years of effort. This is why people evangelize the platforms they're invested in - convincing other people to join improves the health of the platform, increasing their return on investment. More adoption => more libraries => more projects => more jobs => more adoption. A virtuous cycle.
This evangelizing can sometimes become contentious if others perceive it as an attack on their platform. People defend their language mostly because they don't want to see it lose popularity. If it did, their language's viability is threatened and their investment is in jeopardy.
It's also partly because they've spent so long working with this language that it's become a part of their identity. A critique of the tool is perceived as an attack on the person.
I suppose my views are skewed because I have become competent or better in a large number of languages. I have my favorite general-purpose language, of course (it's C++-as-a-better-C), but I view choosing a language for a particular project to be an engineering decision that I make according to the demands of the project, not based on what language I personally enjoy the most.
Developers do get emotionally invested in their tools, and I've seen an uncountable number of holy wars because of that. But all of them strike me as being ludicrous.
If another dev uses a language I would not have chosen, that's not a personal affront to me. I assume they made an engineering decision that made sense for their situation. It's all good.
> If it did, their language's viability is threatened and their investment is in jeopardy.
I sorta see this, but in my experience, it's very easy to overstate this case. If a language is viable enough to develop in initially, it's viable. How popular it is or becomes isn't really that important except in terms of being a business decision ("is it possible to hire people in the future who know this language?")
Imagine you only have a rock. A rock can be used for many things. You can use it to open a can, you can use it to smash things, you can use it to drive a nail.
Then someone invents a hammer. You no longer use a rock for driving nails.
Then someone invents a can opener. You no longer use a rock for opening cans.
Then someone invents gigantic sledge hammers. You no longer use a rock to smash things.
Languages definitely have competitors. Java took a lot of projects that would have otherwise been written in C++. Rust will take a lot of projects that would have otherwise been written in C or C++, but languages like Java weren't suitable for. Eventually, those two languages might be completely supplanted (but it would take a very long time). Because they lost to competitors in each space.
Yeah, I'm not so sure. Certainly there are many cases where a language was used for a type of project when it isn't ideal -- but it was the best option. A new language can come along and replace the old for that sort of project. But I don't really see that as "competing" in any meaningful sense. It's just that someone invented a better kind of screwdriver for that kind of screw.
Some languages can certainly become obsolete (although even then, I'm hard pressed to think of a language that is no longer used at all) or relegated to very niche uses, of course.
Just stop. Programming languages aren’t like carpentry tools at all.
For a programming language to survive you need a critical mass of users so that they can support each other by teaching each other, writing libraries, and influencing management in private and public industries to adopt the languages that are wanted. That naturally leads to at least some competition for mindshare among developers, even (especially?) among the grassroots.
This isn't true, there are different hammer manufacturers, and they market their products to steal market share from each other. There is a history of languages as locked in walled gardens with real competition. Java v .NET ring any bells?
Ah yes, the "* Considered Harmful" tradition in computer science. A tradition so widespread that it has even wrapped around upon itself and yielded the "'Considered harmful' considered harmful" paper.
Not the best start already. Hmm, written by 'Anonymous Authors'. So is this supposed to be a joke? Is the implication that speaking out against rust makes you the target of unjust persecution? Is it written by people known to have an axe to grind?
Abstract: This is a free-verse poem setting up a rust advocate strawman. Do such people exist on the internet? Yes. But time cube also exists on the internet.
Introduction: Incrementally rewriting a C/C++ system in rust is bad because FFI can have problems. Which can be simplified with a different spin to be, "C/C++ are so bad that they contaminate even the mighty rust."
It feels like the paper might have some good information that would be useful to people who want to augment their codebases with rust. However, it starts off so combative, that I don't think I could seriously recommend anyone actually read it. Maybe just skip the abstract and intro and see what you get out of it?
> Which can be simplified with a different spin to be, "C/C++ are so bad that they contaminate even the mighty rust."
I've written C and C++ for a long time, and... Alas, you're not completely wrong lol.
At work there's a still-unresolved ticket of me trying to get cross compilation working, and it's basically impossible because of the way most C/C++ tooling is built and works. Clang works a bit nicer here than GCC, but I ran into other issues with it too. Basically, it's nearly unsolvable due to the number of hoops one needs to jump through to properly link against a dynamic library produced for the target platform (if the host platform is different).
Don't you think you're a little too emotional/biased just because of the title (which isn't even really weird, just a meme among HNers) and because the authors aren't using their real (or fake) names?
Follow up question then: Is it normal for pre-prints that go out for blind peer reviews to be publicly posted like this? Should I be interpreting this as a leak of some sort, someone trying to get their work public eyeballs without messing with the peer review, etc? Or is this just par for the course in academia?
There's a certain tendency to take anything written in a double column PDF followed by a long list of citations as gospel. Maybe that's what the authors were hoping for? The poor quality of writing, contrived examples and unnecessary snark could be why the authors didn't to put their name to this paper.
I'm possibly misunderstanding, but their first example doesn't make sense to me:
1. C can make aggressive aliasing optimizations based on observable behavior, meaning that the compiler may or may not optimize `add_twice` based on the results of any alias analysis it chooses to do. In other words: the author's assumption that a C compiler won't optimize `add_twice` in the same way that Rust will is not a guarantee, and assuming that it won't is relying on unspecified behavior.
2. The author claims that the result of the optimization can result in memory unsafety, and it's not immediately clear to me that that's true: it might be true in an interprocedural sense due to another function's assumptions about how `add_twice` affects its parameters, but this is the same incorrect assumption as in (1). In that sense, Rust is not really involved at all here.
> In other words: the author's assumption that a C compiler won't optimize `add_twice` in the same way that Rust will is not a guarantee, and assuming that it won't is relying on unspecified behavior.
In C, objects of the same type are presumed to alias unless the compiler can prove otherwise. Unless add_twice is compiled as a file-scoped (i.e. static) function or with equivalent compiler-specific options, the compiler cannot prove that a and b don't alias, and therefore it cannot make such an optimization. This is why C99 added the "restrict" keyword, to tell the compiler that two parameters of the same type do not alias.
This is indirectly why type punning is problematic in C--because C is pessimistic about aliasing, at least for objects of the same type, you normally can't run afoul of aliasing issues as the compiler bares the burden of ensuring safe behavior. But for objects of different types, the compiler can assume they don't alias. When type punning, at the point of usage it appears to the programmer and the compiler that they might alias (because they're the same type), but at some point earlier or later when operating on the original type of the type-punned object(s), the compiler can assume lack of aliasing, generating optimizations that atadistance break the type-safe code. Which is precisely the issue here with Rust FFI, and almost for the same reason--conflict between different aliasing rules in different contexts as Rust presumes lack of aliasing even for objects of the same type. In effect, with Rust FFI you're always type punning arguments passed into a Rust function, which is absolutely a noteworthy issue that absolutely should be addressed by adjusting the default semantics Rust applies to extern'd functions. (C isn't the only language that could run afoul of such aggressive semantics for foreign functions.)
In Rust mutable references are exclusive. They're statically guaranteed to never alias with any other pointer.
However, when you call such function from C, the C compiler has no clue about exclusive references, so obviously it can't enforce their invariants. The C side can provide aliased pointers and break Rust's assumptions. The Rust side assumes it's called correctly, and the C side can't know what is correct.
I don't think anybody expects incorrect code to behave correctly, but as usual the issue is about mistakes. The C compiler can't help you preserve Rust's invariants, so when you make wrong assumptions or declare a wrong interface, instead of a compilation error, you can get UB silently creeping in.
The FFI-glue language they propose is meant to prevent these types of mistakes.
I think some folks are misreading the authors intentions(possibly due to aforementioned snark): they’re not advocating for sticking to existing C codebases over rust.
They identify FFI as an area where it is easy to make mistakes which lead to undefined behavior, argue that it is the result of competing assumptions between C and rust, and then propose a formal system implemented in both in C and in Rust to resolve those issues.
Sure, the problems they identify wouldn’t exist if everything was rewritten in Rust, but on a large enough / old enough codebase, that may impossible to do all at once. This paper attempts to find a way to make rust better within those organizational constraints.
The audience here is not engineering managers deciding in your next tech stack, but programming language researchers.
> I think some folks are misreading the authors intentions(possibly due to aforementioned snark)
Yes. It looks like the paper has some useful things to say. However, that abstract + intro is like walking into a wedding and splashing red paint on the bride and then announcing a bunch of very good reasons why the wedding should not continue. Nobody is going to be listening.
Perhaps another way to say it is that nobody is misreading their intension. What they're doing is ignoring reasonable concerns said in an unreasonable way.
it seems under this view, the conclusion that can be drawn for the paper is “replacing parts of a C codebase with the conceptually equivalent Rust code without any thought as to the nuances of Rust ffi may lead to bugs if the C code is already contrived” which doesn’t seem too useful a conclusion to me.
Because no-one would want to put their name to this.
There may be some valid arguments in there (though it seems to be just "FFI is hard ergo noone should never attempt phased migration of any system codebase"), but even content aside, the tone is that of a moody teenager. The abstract in particular is one of the worst things I've ever read - the introduction doesn't read fantastically eitherr.
This is not literally from 2017. It’s a preprint; the 2017 bit in the footer is from the default article template. Per the footer, it’s actually from 2023. You can tell because the abstract mentions SBF and Madoff in the same sentence!
There's plenty of good content in this paper. FFI safety is a very important topic, and there are unique concerns with Rust FFI. Such a shame it devolves into unprofessional and inflammatory remarks. The authors could have put their names on the paper and gotten a decent credit if they could have kept their emotional responses to the topic out of it.
The problem I have with this article is it's pointing at a universal problem (FFI safety) and narrowing that to a rust problem.
Yes, FFI is dangerous, that's true regardless of language (including C). FFI is inherently platform, language, and even compiler specific. It's super tricky to get right.
"Rewrite it in ____" should pretty much always be a last restort. Rewriting takes a ton of time and effort, has its own downsides, and just generally doesn't have as obvious a value add to customers/consumers (outside of pretty specific scenarios) to justify the cost. Plus, rewriting anything in anything else just invites a whole new suite of problems with the new thing that probably/definitely weren't accounted for.
Now, for personal software I say hell yes. If a person is writing code for the fun of it, or simply wants to learn about concepts, rewrite whatever in whatever and have a great time. That's the point! But for professional/commercial software, rewriting is probably more expensive than is realized.
Wow, this is extremely relevant to my current work (porting from Go to Rust). Adding on to the paper, some other issues are:
- Modeling concurrency across boundaries, like if you're p using goroutines but also tokio, how the heck does that work?
- Persisting data across FFI, like if you have some Go code that calls Rust, and you want the Rust code to persist stuff in memory, that gets tricky fast.
- Assumptions around strings. Go will give you a bag o' bytes and say "it's a string, trust me!" while Rust expects UTF-8.
We definitely ran into some of these issues, like remembering who should deallocate which memory.
Although the paper's title and abstract seem to be intentionally trying to stir controversy and annoy people, the core issue of FFI unsafety is real.
C has many subtle behaviors, and a type system that doesn't describe them well. It also can't express its API's ownership or thread-safety, and there's no true immutability. Correctness of the ABI isn't enforced in any way, so you just smush symbols together and hope for the best. That is ripe for errors, especially when you try to add extra guarantees on the Rust side that C may or may not actually provide.
now here we have "X considered harmful" - a common Trope.
This seems more like a "First!" to title their paper this, as a meme of the whitepaper meta, that title came first, then the arguments came.
I say this because I joked about a paper being released with this name about a week ago and now here it is, but Its a bit lower quality than expected.
We might need another one later...
A lot of the commenters don't seem to understand the thrust of this paper. First, the use of "considered harmful" here is tongue-in-cheek or at least self-aware. Second, the point of the paper is not to say that rewriting in Rust is bad, but to propose a safer approach to incremental rewrites that addresses (what the authors argue is) the risk of introducing new bugs at the C/Rust boundary.
I really don’t understand the people who have a problem with rust. Do you not value the increased memory safety? Now that Microsoft and Google are adopting rust and reporting significant decreases in memory related bugs it’s pretty clear that rust does make a difference.
Sure, but the way that rust does it comes with a heavy cost, including slow compile times and confusing lifetime semantics (and, in my subjective opinion, poor syntax). There are other techniques to achieve memory safety that are simpler and more ergonomic (to me at least) than borrow checking. Vale, for example, has some interesting ideas in this space.
One of the main reasons that people complain about rust is because it has an extremely loud group of evangelists who often shame other people for taking a different approach and essentially refuse to acknowledge that others have a point about its weaknesses. It is always off-putting when a community goes around telling everyone else that they are doing it wrong, particularly when they are ignoring the very real flaws in their own way of doing things.
> One of the main reasons that people complain about rust is because it has an extremely loud group of evangelists who often shame other people for taking a different approach and essentially refuse to acknowledge that others have a point about its weaknesses.
Name them. Maybe you don’t have real names but you can probably cough up some Internet handles.
The “fanatical rustacean” is a bit of a 2018 thing. Out of the few who are actually evangelists I an fact think that they can be way too nice and conflict-averse (“right tool for the job”).
You realize I was responding to a comment in which the OP asked why we don't like memory safety? That is exactly what I an talking about. The coder's internet is filled with this type of attitude. You can disavow these types as not real rustaceans or whatever, but this language definitely attracts purists, which makes sense given that its main selling point is a form of purity.
The loudest voices in Rust that I know (pcwalton, steveklabnik, burntsushi, on HN for example) clearly acknowledge that Rust makes tradeoffs.
No one denies that there are "simpler ways to achieve memory safety". Stop-the-world garbage collection and reference-counting are exactly just that. Vale uses generational references, which, IIUC, has both memory and runtime costs when compared to the borrow checker.
* Rust does not have slow compile times compared to C++. They are often faster. Linking tends to be the slow part. C++ compilation speeds suffer heavily from header files.
* Lifetime semantics can be confusing in any language. In C++, you ignore it, you get crashes from dangling pointers. In Rust, you ignore it, you don't get it compiling - which forces you to think about them. The lifetime concerns are always present unless you're using a garbage-collected/reference-counted language.
My experience is that the Rust community is very tolerant and patient. Can you cite an example where a "loud evangelist" shames other people? Also, in my experience, when that happens they are criticised by the same Rust community.
Every single community has the 1% bad apples. As a Rust dev I refuse to be associated with zealots and fanatics. Every single other Rust dev I've ever worked with was a normal programmer, namely a pragmatic analytical type.
Stop parroting memes. "Rust evangelists strike force" and "Rewrite it in Rust" do not actually exist.
Point me at your local hobby or professional club and I bet my neck I'll find you at least one fanatic. But I will not use that to deride your hobby activity. So don't do that for Rust, please. The community is huge and doesn't subscribe under the fanaticism of a few loonies.
Rust restricts what you can do when it can't reason about your code. Some developers don't like those restrictions and feel like they're "fighting the borrow checker". Others don't think it's worth it and go back to managed languages.
It's usually something like "I don't need the compiler holding my hand, I know what I'm doing", or "I'll just write in Go/Java/etc. so I don't have to worry about memory".
From my experience its not the language that is the issue, its usually people with little to no experience in the language parroting what they have heard online.
Whenever anyone that gives it a serious try to build something and does not become a convert that touts its greatness as the one and true go...language they are usually given the no true scotsman treatment or a variation of the emperor's new clothes.
TBH, I don't have a problem with Rust so much as I have/had a problem with a section of the rust community.
The shouting fanbeings of rust put me off looking into it for years, because when I kept getting "rewrite it in rust!" as the answer to "there's a problem with $THIS_CODE" when talking with colleagues, _even when those colleagues had minimal rust experience_, all I could conclude was that the whole thing was an empty promise and that no-one knew how to solve the problem, but everyone "knew" that the New Cool Language was the way to fix everything.
Generalisation from incomplete data - no doubt there was a sensible majority in the rust community, but the fanbeings were _loud_.
FWIW I was wrong: I'm getting into rust now and I like what I see, and the discussions around it online and with colleagues are pretty sensible. But, it's taken a while to get there and when you've been in tech for a couple of decades you see this hype cycle and get jaded to it. Erlang is the new hotness ... OCaml is the new hotness ... Java is the new hotness, rewrite everything in Java, wait C# is the new hotness...
I suspect rust is here to stay, and I'm gonna learn more about it, and I regret some of my past words about it. But my problem was never with the increased memory safety, or the language at all, pretty much, just the early community.
TL;DR: Other humans are the worst, bug reported, fix unlikely :)
C offers an appropriate level of memory safety for the problems it solves.
I take managing my own memory over unreadable code and dependency hell any day. The fact that C will run on DSPs with 27-bit pointers is just an added bonus.
The ease of manual memory management in C is an advantage, not a downside.
And the tooling for C will exist long after rust has been abandoned.
I would like Rust a lot more if it got rid of methods. Instead of writing something like...
args.into_iter().skip(1).and_so_on...
Which I can't stand to look at, I'd vastly prefer something like...
for arg in args[1..*] { ... }
Though my true preference would be S-expressions, but I realise that people lose their marbles when they see something like...
(for-each do-something
(slice args #:from 1))
Of course Common Lisp is one of the faster slow-as-molasses languages, but having a language that uses S-expressions which can compile down into a small, fast binary would be the dream. Carp would be interesting if it didn't use Clojure syntax.
Slow as molases? I will assume you have actually written quite a bit in it since you seem to like s-expressions and I really can't think of any other language other than lisps that use them.
With that in mind, what other lisps are running circles against common-lisp? I would love to give those languages a try. From my experience sbcl common-lisp is about equal to Go and Java which I consider pretty fast languages.
Its not as fast as C,C++,Rust but when I think of slow I think of Python, not lisps.
> With that in mind, what other lisps are running circles against common-lisp?
Well, I haven't tried any of the Linear Lisps, but as far as I know it's the fastest Lisp. That said, well-written C or Fortran will run circles around it.
> From my experience sbcl common-lisp is about equal to Go and Java which I consider pretty fast languages.
SBCL is amazing, it's really fast (though LW beats it in certain situations), but it's in the same class as Go and Java, which I consider dog slow. Modern computers are really, really, stupendously, ludicrously fast; but many programming languages don't give our computers a chance to really shine, which is a shame.
Basically it's like... an F1 car is really fast as long as you don't compare it to a fighter jet.
I may be stating the obvious, but that's a bit of a strawman. Yes, writing good FFI code is hard; yes it could result in security/soundness issues; yes, we could use better tools in this space.
But nobody rewrites C code in Rust if they believe existing codebase is free of memory safety hazards; they rewrite it because they think the result will contain fewer hazards, even accounting for the potential problems at the FFI boundary.
If I could remove tens of thousands of lines of hard-to-analyze C code, and replace it with tens of thousands of lines of safe Rust, paired with a few hundred lines of hard-to-analyze FFI adapters, that sounds like a pretty good tradeoff to me. I now know exactly where to focus my attention, and I can have confidence that the situation will only improve with time: better tooling may allow me to improve the dangerous FFI layer, and in the meantime I can recklessly improve the safe Rust module without fear of introducing new memory unsafety bugs, unsound behavior, or data races.