Date: 2014-06-06

Sandboxes have significant performance costs of their own: the cost of spawning and communicating with a separate process, in addition to the extra memory use of a multiprocess architecture. They also come with high engineering overhead: creating a Chromium-like sandbox is a lot of work (which is why so few apps go to the trouble), and it's easy to mess up and accidentally introduce sandbox escapes. There is the attack surface of the kernel and IPC layer. Finally, what seems at first to be a simple IPC interface between the image decoder and the host application becomes a lot more complex when hardware acceleration (GPU or otherwise) becomes involved. Securely providing a decoder access to hardware (say, to perform iDCT or colorspace conversion) is a nontrivial task.
This comment didn't deserve to be downvoted, as it's a legitimate argument. But when you add up all of these costs compared to just having a system that prevents memory safety issues in the first place, the Rust approach is rather appealing.
(Oh, by the way: Preventing image DoS isn't as hard as it sounds. Refusing to decode images larger in area than some limit gets you most of the way there.)
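To make the area-limit idea concrete, here is a minimal sketch of a check a decoder could run on the header dimensions before allocating anything. The names `MAX_PIXELS`, `SizeError` and `check_dimensions` are made up for illustration, and the 64-megapixel cap is an arbitrary assumption, not a value taken from any real library.

```rust
/// Hypothetical cap for this sketch: 64 megapixels.
pub const MAX_PIXELS: u64 = 64 * 1024 * 1024;

#[derive(Debug, PartialEq)]
pub enum SizeError {
    /// Width or height is zero.
    Empty,
    /// The image area exceeds `MAX_PIXELS`.
    TooLarge { pixels: u64 },
}

/// Decide from the header dimensions alone whether decoding should proceed.
/// Returns the pixel count on success so the caller can size its buffer.
pub fn check_dimensions(width: u32, height: u32) -> Result<u64, SizeError> {
    // The product of two u32 values always fits in a u64, so this cannot overflow.
    let pixels = u64::from(width) * u64::from(height);
    if pixels == 0 {
        Err(SizeError::Empty)
    } else if pixels > MAX_PIXELS {
        Err(SizeError::TooLarge { pixels })
    } else {
        Ok(pixels)
    }
}
```

A real decoder would probably also want a cap on time or on compressed input size, but this covers the "huge dimensions in a tiny file" case.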
If you can get rid of half of the security risks by using software that is free of memory-born vulnerabilities, that's a win. That's what Rust is about.
Things are a bit more complicated with image libraries though, because a) performance is usually critical and b) even if you have the casual use case where it's not yet critical, memory safety alone is still not going to protect you from accidental DoS (not even talking about malicious users here!), unless you apply those mitigation techniques and deploy supporting infrastructure to prevent the process from using all memory and all CPU, taking too long, etc.
Absolutely. Bounds checking is a mitigation technique to avoid a potential footgun when you want/have to use manual array indexing.
In most situations in Rust, you use iterators instead of array indexing. Those are the «actual freedom from memory-born vulnerabilities» and have no associated overhead. Since it is known at compile time that you are not going to access invalid memory, there is no need for bounds checking in this scenario.
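A small sketch of the difference, using two hypothetical helper functions. The indexed version goes through `data[i]`, which is bounds-checked in principle (the optimizer can often remove the check in a simple loop like this one, so treat the cost as "possible", not "guaranteed"). The iterator version never produces an index at all, so there is nothing to check.

```rust
/// Sum with manual indexing: every `data[i]` is a bounds-checked access
/// (the compiler may be able to elide the check here, but that is an optimization).
pub fn sum_indexed(data: &[u32]) -> u64 {
    let mut total = 0u64;
    for i in 0..data.len() {
        total += u64::from(data[i]);
    }
    total
}

/// Sum with an iterator: the iterator only ever yields valid elements,
/// so no index and no bounds check is involved.
pub fn sum_iter(data: &[u32]) -> u64 {
    data.iter().map(|&x| u64::from(x)).sum()
}
```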
Most of Rust's safety comes from clever compile-time checks with no additional cost at runtime. The two main counter-examples I can think of are:
- bounds checking on manual array indexing
- usage of a secure hash function by default in HashMaps
The whole story about safe multi-threading and dangling pointers prevention is totally free performance-wise: everything is checked at compile time by the borrow checker.
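On the second counter-example: the default hasher is a choice, not a requirement, since `HashMap` takes the hasher as a type parameter. Below is a sketch that plugs in a hand-written 64-bit FNV-1a hasher. `Fnv1a`, `FastMap` and `word_lengths` are names invented for this example, and FNV-1a is only an assumption about what a "faster, non-DoS-resistant" hasher might look like. Don't do this for maps keyed by untrusted input.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hand-rolled 64-bit FNV-1a: fast for short keys, but not resistant to
/// hash-flooding, which is the trade-off the default hasher avoids.
pub struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A HashMap that uses the hasher above instead of the default one.
pub type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Tiny usage example: map each word to its length.
pub fn word_lengths(words: &[&str]) -> FastMap<String, usize> {
    let mut map = FastMap::default();
    for w in words {
        map.insert(w.to_string(), w.len());
    }
    map
}
```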
> memory safety is still not going to make it free from vulnerabilities to accidental DoS attacks (not even talking about malicious users here!)
You're right, Rust doesn't solve all the problems at once. Is that a good reason to give up the benefits of solving the worst 50% of problems?
Again, not for images. I believe the real reason is different: Rust people simply want to have an image library in their ecosystem, just like Go. Nothing wrong with that goal; it makes the language more appealing to new users. But that's it.
"Memory-born vulnerabilities" sounds like a broad term, and if it includes DOS issues then Rust code is as vulnerable as anything else. But it's worth being precise: Rust's security model is about preventing undefined behavior.
If you have an out-of-bounds array write in C, that could do anything to the program. But in safe Rust code, all it can do is panic. Like you said, that's not necessarily any different from a DoS point of view, but it rules out a huge class of remote-code-execution bugs and leaks like Heartbleed. Of course it's still possible to write bugs and leak something you didn't mean to leak, but it shouldn't be possible to do that in a totally unrelated part of your process, and that's a big deal.
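A quick sketch of what "all it can do is panic" looks like in practice. The helper names are hypothetical; `catch_unwind` is used only to observe the panic in a demo, and it assumes the default unwinding panic strategy (with `panic = "abort"` the process would simply stop).

```rust
use std::panic;

/// Plain indexing: an out-of-range index panics instead of touching other memory.
pub fn read_at(data: &[u8], index: usize) -> u8 {
    data[index]
}

/// Checked access: an out-of-range index becomes `None`, with no panic at all.
pub fn try_read_at(data: &[u8], index: usize) -> Option<u8> {
    data.get(index).copied()
}

/// Returns true if an out-of-bounds read on a 3-byte buffer ended in a panic.
/// The panic message is still printed to stderr by the default panic hook.
pub fn out_of_bounds_panics() -> bool {
    let data = vec![1u8, 2, 3];
    panic::catch_unwind(|| read_at(&data, 10)).is_err()
}
```

Either way the worst case is a controlled failure of that operation, which is the DoS-but-not-RCE distinction made above.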
What kinds of DoS attacks are you thinking about that aren't covered by simply limiting the size of the images you need to parse?
I mean, I've written a PNG decoder and the only dynamic allocation I can think of that I needed was the actual decoded image data.
Theo's quote is BS by the numbers and pentest results. Sandboxing done right is way more secure.
Whether you want to do image processing on Linux, BSD, or one of my kernels is a separate topic. However, it's worth noting that even stuff like I cited could be useful for that: it stashes the image processor into its own protection domain with read access to the memory containing the input and write access to the memory that will hold the output. The code itself isn't allowed to do or touch anything else. At this point, exploits must be chained in clever ways. Modifying this to preserve pointer integrity, with the lowest overhead range being around 1-10%, would choke those kinds of attackers further.
Using Rust is yet another tactic that can work well with or in lieu of a sandbox. Ideally with, to handle the issues the language itself can't cover. Three precedents for that are right on my list: GEMSOS's safety was improved using a Pascal subset with call-by-value semantics; ASOS was a combo of security kernel and Ada runtime to host embedded Ada applications (lang-safety + sandbox); separation kernels like INTEGRITY-178B usually support runtimes for Ada and a Java subset to get safety within components. Now, there are some OSes like Tock and Redox, plus low-level libraries, written in a new, safety-enhancing language. Such designs can benefit from the old strategies, a few of which we see in Redox. On the other end, Genode uses many old and recent tricks in architecture but not a safe language. It can also run a Linux desktop with your image processing code. ;)
Note: "A bit academic" is also misleading. Several separation kernels are deployed commercially for things like secure web browsing. INTEGRITY and LynxSecure are among the oldest. The Nizza and Perseus architectures got turned into Sirrix's Turaya Desktop. These things aren't just academic.
All the other stuff you say is on point, of course, and there's an argument or a hundred to be made that a lot of these problems are already solved, and that we (as users and developers) are missing out on a massive amount of amazing OS research that could have made everything better than our current insecure, 1970s-tech monolithic kernel OSes.
I just named multiple tools available to use in commercial sector and FOSS. Their predecessors were used by commercial and government sector for a decade before that. That's 100% not academic even if a few started there. If anything, more work just needs to be poured into FOSS stuff so it gets more deployment and impact.
"All the other stuff you say is on point, of course, and there's an argument or a hundred to be made that a lot of these problems are already solved, and that we (as users and developers) are missing out on a massive amount of amazing OS research that could have made everything better than our current insecure, 1970s-tech monolithic kernel OSes."
We're definitely in agreement there. Fortunately, we're seeing some exploration into that including on the Rust front with Redox.
Then, by implication, sandboxing might be effective if
it's done like in my examples. And there are CompSci
designs doing stuff like that.
Which defeats the exercise completely.
Recent efforts did it formally for things ranging from kernels to application-layer sandboxes. In the case of VCC and SPARK, they do annotations on the code itself. In things like seL4, they do proofs on the code connecting it to a model. So, one can do that. It's usually not necessary since we're talking about improving the security of things over the status quo. You seem to be adding an additional requirement of totally unhackable with mathematical proof down to the hardware. AAMP7G and SSP are production CPUs close to that, esp. AAMP7G. VAMP with CodeSeal & CHERI might do it, too. One academic one was proven down to the gates. They're all irrelevant as we're talking about reducing the damage of an image library on regular CPUs & OSes.
You trust that the Rust type system, compiler, and optimizations were implemented correctly enough that they'll knock out a bunch of problems. Certifying compilers like FLINT & CompCert took mountains of verification effort to pull off a fraction of that. Whereas, I'm saying one can trust that 4-12 KLOC, or a compiler transformation connected to an abstract model fitting on a page or two, might do its job with fewer problems than trusting just complex, native code in C or whatever. I also encourage mixing such TCBs with language-based security. However, such a tiny, clean, ASM-style program will have way fewer defects than a full OS or application.
You could argue against safety of Rust [...] with the same strawman.
Rust does a mapping of input symbols to output symbols. The input symbols have no 1:1 mapping with machine instructions on any one platform. All Rust does is attempt to ensure that the constraints of the input symbol language still hold in the output symbol language.
Lastly, no compiler claims that its model is 100% consistent and will always work correctly in all circumstances.
Hell, C doesn't even bother defining how most integer operations work (in older versions).
This is impossible for them too. But for an attacker it is trivial to exploit these bugs.
Saying you've proved a system means the system is rigorously true beyond the shadow of a doubt. If you can't rigorously show that your state machine or Hoare logic correctly maps 1:1 to the underlying machine, you haven't proved anything.
Okay, so let us pretend I prove P=NP for a collection of symbols which I state map 1:1 to the set of all possible algorithms in both P and NP space. But I don't prove that second statement. I'll get laughed out of my advisor's office.
What proving software correct does is effectively no different than my anecdote from a rigorous perspective.
Done a damned good audit? Yes.
Proved? No.
You are auditing software, not proving it.
This isn't a straw man. Or a logical fallacy. It is a failure of logic. You prove A is true in system B. But never prove system B is correct... You've proved nothing.
"here are some examples of containers & virtualization that were either proved secure against a policy with realistic model, survived tons of pentesting, or both. Sandboxing done right is way more secure"
"you can't prove your Hoarce Logic maps to existing hardware... Which defeats the exercise completely."
"Most models do not cover all hardware, or inter-generational bugs for example the recent skylake 0xF000000000000000F bug."
"let us pretend I prove P=NP for a collection of symbols which I state map 1:1 to a set of all possible algorithms in both P and NP space."
It's like you're talking about a whole different conversation. Theo's claim is thoroughly refuted whether it was done by eye, hand proofs, model-checking, or full verification down to the gates. Done by each of these, actually. The use of smaller systems with simpler APIs plus assurance techniques that worked to reduce defects got results no monolithic OS or UNIX got. Hence, why people are recommending investigation or use of such approaches to sandbox applications, given those approaches might similarly result in fewer vulnerabilities or less severe damage.
Then, you counter by talking about how they aren't proven to code, in all circumstances, the hardware, P = NP, etc. What does all that have to do with the claim about reduction of efforts required or damage created using these approaches vs what building or relying on a whole OS would entail? You could throw all that same stuff at the alternative approaches. It would still be true. Whereas, people acting on my recommendations instead of Theo's would provably reduce their attack surface and the number of hits they take, based on the results gotten in the past by the examples I gave. That's why your reply is a strawman to the discussion of whether sandboxing can reduce the impact of problems or not, with Theo's criticism being destroyed. Even by his own OS, which had way more kernel bugs that needed mitigation than the stuff I cited, which ran such crap deprivileged or eliminated it at the design level where possible.
Not if you're going to take advantage of hardware acceleration.
https://github.com/rust-lang/rfcs/blob/master/text/1774-road...
https://blog.rust-lang.org/2016/09/08/incremental.html
I want to like Rust and understand that there are benefits to it, but why is this on the front page? What important information does this post contain that justifies this, besides the fact that some project now has their image library in pure Rust (does that even make sense? Not every rewrite is justified; replacing a rock-solid C lib with a brand new Rust lib may not be a good idea)? I just loosely follow Rust and I really don't see how I benefit from reading this.
I think we need to step back and realise that while there might be benefits to a certain technology, it's not a miracle drug.
1. Rust is gaining much acclaim for writing safer code, a good package management system, a very friendly community and other things. So it's natural that people would be interested in knowing more about libraries, frameworks and tools that would allow them to use Rust for whatever they're working on now or plan to work on soon. Not many people would want to develop their own HTTP handlers or image decoders or message queues or anything else just so they could write their primary project in Rust and gain the advantages it provides. Libraries are a huge part of that bridge that people need to use a language more (one big reason why Python is preferred across different subjects is because of the libraries).
2. This could inspire someone to go to these or other repositories and contribute to them.
3. Considering that images (and JavaScript) make the bulk of web pages nowadays, I'm sure many people would want to know about pure-rust libraries that handle images so that they could now have their web backend projects use more of Rust with better guarantees on a few aspects (like stability of execution, for one). Outside of web, this could also appeal to people who write imaging related applications.
Yes, but wouldn't those people want libraries that focus on images, as opposed to libraries that focus on games? (Piston is a game engine).
In the context of image-processing libraries, which are often used in a web-facing environment, being written in Rust with almost no unsafe code is also a major improvement in terms of security.
> it's not a miracle drug.
I agree with you, but I don't think anybody claims it is, either.
Rust is a massive improvement over the state of the art in systems programming. It's not a panacea, and I'm not going to rewrite all my web or mobile code in Rust, but I'm still really happy Rust exists.
Projects like this one, or the recent announcement that librsvg is replacing C code with Rust, give me hope that we will see fewer and fewer memory-related security bugs in the future, and it's about time!
If this image handling can be extracted into its own library (maybe it already has), then it can be used by other projects, and maybe even by C and C++ projects.
"One of the vulnerabilities can lead to remote code execution (RCE) if you process user submitted images."
If you code a command that reads and then deletes a file on the command line and people allow remote users to invoke that command remotely, the programming language is not going to help you.
ImageMagick doesn't have the design that is necessary to be exposed to remote users. So it shouldn't be.
Larry Ellison made the same observation many years ago:
> The interesting thing about cloud computing is that we've redefined cloud computing to include everything that we already do. I can't think of anything that isn't cloud computing with all of these announcements. The computer industry is the only industry that is more fashion-driven than women's fashion. Maybe I'm an idiot, but I have no idea what anyone is talking about. What is it? It's complete gibberish. It's insane. When is this idiocy going to stop?
I usually don't agree with these sentiments, personally, or at least not with the implications that are usually drawn from them. But it is something worth thinking about.
A lot of Rust stories _don't_ get upvoted on Hacker News as well.
Because it indicates that there were no edge cases that Rust couldn't handle, at least for this project. It's nice to know that it's a suitable tool for 100% of a project and not just 95% like so many others.
Can you clarify what you mean by this? Which rock-solid C libraries are there for working with images? I'm not talking about libpng and libjpeg but higher-level stuff, because most people do not use those libraries directly: they are too low level. So any higher-level library is a welcome addition.
Most programming languages currently have some agreed upon image library (like PIL/Pillow in Python) and they are not standing on good foundations either.
I don't have anything against Rust and I would read a blog post that details what they gain from switching to a Rust-based library. But I am a bit frustrated by my subjective perception that recently there were a lot of posts on the front page that just do something in a fancy language.
- Data parallelism. This is where you have (for example) a giant array of data to process and want to use as many cores as possible. For this, rayon is gorgeous: https://github.com/nikomatsakis/rayon Seriously, I just can't say enough good things about programming with rayon, and how nice Rust's "no mutable aliasing" makes coding. At some point, Rust will also need good libraries for talking to the GPU.
- I/O concurrency. This is where you have a server that needs to respond quickly to 10,000+ sockets, for example. For this, it looks like everyone is standardizing on futures and tokio: https://github.com/alexcrichton/futures-rs https://github.com/tokio-rs/tokio
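For a feel of the data-parallelism bullet without pulling in any crates, here is a std-only sketch that splits a slice into chunks and sums them on scoped threads. This is a stand-in, not rayon's API: rayon does the splitting and scheduling for you with a work-stealing pool. `parallel_sum` and the `workers` parameter are made up for the example, and `std::thread::scope` needs Rust 1.63 or newer.

```rust
use std::thread;

/// Sum a slice by splitting it into roughly `workers` chunks and summing
/// each chunk on its own scoped thread. The threads only borrow `data`,
/// which the compiler accepts because the scope guarantees they finish first.
pub fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    if data.is_empty() {
        return 0;
    }
    let workers = workers.max(1);
    // Ceiling division so that every element lands in some chunk.
    let chunk_len = (data.len() + workers - 1) / workers;

    thread::scope(|s| {
        // Spawn every worker first, then join them all.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}
```

The shared, read-only borrow of `data` across threads is exactly the kind of thing the "no mutable aliasing" rule makes safe to write.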
But it seems like you're most interested in Erlang-style actors with M:N green threads? The issue here is that—as best I remember the history of Rust—the Rust maintainers decided that green threads were simply too high a level of abstraction for Rust. I think Rust will eventually move back in that direction, first using futures+tokio+async I/O, and then by eventually adding async/await sugar on top. That way, you'll only pay for coroutines/green threads when you explicitly opt in. But it will take a year or so for all this to start maturing, I think.
In the meantime, tokio's method-chaining syntax should actually be somewhat tolerable, especially because Rust has a `#[must_use]` attribute that makes the compiler warn you if you forget to use a future. If you're lazy, just run everything through rustfmt to get the indentation right.
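For anyone who hasn't seen `#[must_use]`: it is an ordinary attribute you can put on your own types, and by default it produces a warning (the `unused_must_use` lint), not a hard error. A minimal sketch with a made-up lazy type, `PendingJob`, standing in for a future:

```rust
/// A hypothetical lazy value: nothing happens until `run` is called.
#[must_use = "a PendingJob does nothing unless you call `run` on it"]
pub struct PendingJob {
    input: u32,
}

/// Builds the job but does not execute it.
pub fn double_later(input: u32) -> PendingJob {
    PendingJob { input }
}

impl PendingJob {
    /// Actually performs the work.
    pub fn run(self) -> u32 {
        self.input * 2
    }
}

// Writing `double_later(21);` as a bare statement still compiles, but rustc
// emits an `unused_must_use` warning that includes the message given above.
```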
There's an RFC for that! https://github.com/rust-lang/rfcs/blob/master/text/0230-remo... (A reminder that this was written in 2014, so statements are about Rust at that time period)
> I think Rust will eventually move back in that direction
Note that tokio is not "green threads" as they're conventionally thought of, though they are similar if you squint, kinda.
Well, I'm leaping ahead to the end of the story, here, and speaking far too loosely about technical details. ;-) Futures + tokio + a possible future async/await proposal would provide a reasonable enough syntax for actors/coroutines in Rust, I suspect.
Relevant links: https://github.com/erickt/stateful and https://github.com/rust-lang/rfcs/pull/1823
It seems like you're talking specifically about async I/O here? Because Rust does have parallel programming primitives both in the stdlib and in the form of libraries like rayon and crossbeam. Rayon, in particular, is amazing for parallel programming.
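As a small illustration of the stdlib side of that, here is a sketch using plain `std::thread` and an `std::sync::mpsc` channel. The function name is invented for the example, and spawning one OS thread per input is only sensible for a demo.

```rust
use std::sync::mpsc;
use std::thread;

/// Square each input on its own OS thread and collect the results over a channel.
/// Results arrive in arbitrary order, so they are sorted before returning.
pub fn squares_via_channel(inputs: Vec<u32>) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();

    for n in inputs {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            tx.send(n * n).expect("receiver was dropped");
        }));
    }
    // Drop the original sender so the receiving iterator ends once all workers finish.
    drop(tx);

    let mut results: Vec<u32> = rx.iter().collect();
    for h in handles {
        h.join().expect("worker thread panicked");
    }
    results.sort_unstable();
    results
}
```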
Maybe I didn't understand everything on this project though, so please correct me if I'm wrong.
I'm not sure I get your point here, from other comments it seems like you're annoyed that Rust didn't choose a solution, and when tokio (which is the "chosen" solution for server side async stuff) is talked about you wish it had chosen a different one.
As I mentioned in my other comment I'm also not sure what you mean by "concurrency primitives" here, from your mention of server-side programming it would appear that you want something that handles async I/O, but this comment seems to indicate otherwise, and Rust does have pretty nice primitives (in the stdlib, and in rayon/crossbeam) for general parallel programming.
Since they mention Go and Erlang, I think the GP just wishes Rust had a built-in solution for M:N threading.
Whether M:N threading is nicer than async APIs is highly subjective though. (I'm talking about ergonomics here. I think it's widely accepted that async is more efficient when performance really matters)
Yeah, but then they say that they don't want concurrency to be based on async I/O primitives, which is exactly what Go does IIRC.
Under the hood, I guess yes. But in Go, the developer always uses blocking APIs and the runtime does the magic with the async stuff. Some people find it easier to reason about code written this way.
I mentioned async I/O but I should have added "+ event loop", because they can indeed be used for different things.
Now maybe the Rust team could officially endorse one programming model for the server side, but keep the language itself agnostic, so that other people could try and do different things in the future. But I don't think leaving it all to the community to decide is good at the very beginning. You need everyone to go in the same direction because there aren't enough talented people to run multiple races at the same time (but honestly I don't have that much experience running communities, so it's just my feeling).
I think you're asking Rust to be a high-level language like Erlang or Go, and not a near-the-metal systems programming language that may (someday) have an "official" story about using actors for people writing servers.
Rust's threads are raw OS-level threads, because using green threads by default (or as an option) imposed serious overhead elsewhere:
http://stackoverflow.com/questions/29428318/why-did-rust-rem...
https://github.com/rust-lang/rfcs/blob/0806be4f282144cfcd55b...
https://aturon.github.io/blog/2016/08/11/futures/
From the third link:
> The problem is that green threads were at odds with Rust’s ambitions to be a true C replacement, with no imposed runtime system or FFI costs: we were unable to find an implementation strategy that didn’t impose serious global costs.
This is (or, will be) the "officially endorsed" async model for the server side. All the libraries you'll be using (hyper, etc.) are going to be integrated with it. It's not yet finished, but when it is, it will be.
In the future if we get async/await that would be an even more ergonomic thing that can be used with futures (and hence, tokio), but most of the machinery is already planned for tokio itself.
Check out https://github.com/dpc/tokio-fiber/ for co-routines being re-envisioned on tokio, and https://github.com/rust-lang/rfcs/issues/613 acknowledges that the actor story is still a work-in-progress.
I have seen no indication that either coroutines or futures will be less ergonomic or performant due to being built on top of tokio's futures.
Start with low-level OS system calls and show what pieces one needs to write to reach higher-level abstractions like futures, coroutines, async/await, a Node.js-style event loop, or full-blown actors.
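One rung of that ladder can be sketched with the standard library alone: a hand-written future plus a tiny `block_on` executor that parks the current thread until it is woken. Big caveat: this uses the `std::future::Future` trait and `std::task::Wake`, which were stabilized well after this discussion, so it is not the futures-rs/tokio API being talked about here. `ThreadWaker`, `block_on` and `CountDown` are names made up for the sketch, and there is no I/O or OS-level event loop in it.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread which is blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor: poll, and park the thread while pending.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Returns immediately if the waker has already been invoked.
            Poll::Pending => thread::park(),
        }
    }
}

/// Toy future that reports `Pending` a fixed number of times before finishing.
pub struct CountDown(pub u32);

impl Future for CountDown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 {
            Poll::Ready("done")
        } else {
            self.0 -= 1;
            // Ask to be polled again; a real I/O future would instead register
            // the waker with the OS event source (epoll, kqueue, ...).
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}
```

An event loop is roughly this `block_on` loop generalized to many futures, with the wakers driven by OS readiness notifications instead of by the future itself.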
That would also make for a nice roadmap to whoever would like to build OTP or Akka style framework in Rust.
The goal of Rust is to be a general systems programming language.
In fact, Rust in the past had a runtime that included green threads as primitives; that ended up being a problem. The runtime and stdlib were only usable for some systems programming tasks (defeating the whole goal).
Having the language decide on one method of concurrency would make it useless for lots of people's purposes (including ours).
Either the split is "every use-case finds the primitive that works best for it" (Rust), or the split is "here's the blessed primitive, other use-cases can pound sand" (Go)
For example, while tokio is being integrated with hyper, it does not affect the "usual" users of hyper at all, both in terms of API and in terms of what's going on under the hood, cost-wise.
Having a blessed solution for async like tokio that is not in the stdlib seems to encourage folks to integrate with it, but not in an irrevocable manner; the core library stays intact and able to support potential other models in the future.
Rather, it is important for subcommunities to rally around particular sets of libraries. E.g. the web and I/O related libraries seem to all be going in the direction of integrating with tokio. This means a web programmer can use tokio and it will work smoothly. OTOH, a different subcommunity that uses a different set of libs may use stuff built on rayon and crossbeam.
Ultimately you'll get libraries or frameworks that are not compatible with each other because they use different concurrency models. That's a mistake Rust will pay for in the future.
If you think of something like OTP, I don't see any equivalent anywhere else. Even Akka isn't unanimously used as the foundation for server-side dev on the JVM.
EDIT: to be clear, that means one is on both. The third member is very well-known as well; having built the async primitives that this is built on, as well as initially implementing Cargo. Bona fides aren't an issue here.
Golang was created within Google in 2007, went public in 2009, and has only gotten widespread traction over the last couple of years.[1][2]
Rust was created within a nonprofit in 2009, with a first pre-alpha release in 2012, so let's give it a break on popularity contests for a few years yet. ;)
While it's true that it was started then, it's also important that a stable release wasn't until 2015, so in some senses, that's also where the clock towards becoming popular starts; very few people are willing to put up with what we were doing pre-1.0, and for good reason!
I first started looking at Rust before 1.0, but the runtime and the many different pointer types/sigils drove me off. I came back in the lead-up to 1.0, and saw how much the language had improved and simplified, and started getting into it at that point.
We're seeing a huge uptick in production users. And a lot more who are pretty much waiting on Tokio to deploy more. Many of them read HN, but many of them don't.
Bugs happen, and they will happen in Rust too. Please file a report via https://www.rust-lang.org/security.html so we can get a CVE if you find one.
> as are yours.
Not sure what that means, exactly.
Is it faster? Is it shorter? Has it fixed bugs? Does it have more functionality? What reason does a user or potential user of this library have to care that it's now written in another language?
The thing is - no one is going to be impressed if they hear it's been re-written in F# because I assume doing so is going to make the software easier to maintain, easier to extend and more reliable. Not management, not users, no-one.
Now, if I did the re-write and LOC went down by half, or performance improved, or it fixed a lot of bugs, or I was able to add features in much more quickly because abstractions were tighter and easier to create and the compiler was able to check more constraints for me - then people might take notice.
There is, however, no inherent value in the source language being changed in itself.
So your point here is valid, that is, "as someone _not_ in that audience, why should I care?" That's what the other thread, with its replies, is getting at. Which is why I pointed you to it.
Again, for an example of a language X-to-language-Y rewrite story with some substance, that did a lot to boost the image of the language in many people's eyes, see:
http://roscidus.com/blog/blog/2014/06/06/python-to-ocaml-ret...
I'd love to see something like that the next time "X rewritten in Rust" etc. gets to the front page of Hacker News. A rewrite in and of itself tells me nothing.
Like Steve said, this post wasn't written for a general audience. A Rustacean will read that post and will be very happy to know that they can do certain things using pure Rust libraries. That was the audience of the post, the post makes total sense in that context, and has inherent value in that context. It got shared more widely, and HN seems to like it, which means that possibly folks who weren't the original intended audience are reading it, which is okay too, just that they may not find it that interesting.
I do agree that that post is pretty great; I loved the series. But it's trying to accomplish something different than this post is.
However, if something can be improved there is no reason to not improve it, if that is what the goal is. Perhaps changing the source language integrates it better into their ecosystem. It may not work, be justifiable, or have inherent value for you or me, but it works for them. To each their own.
It is possible a rewrite in F# could help you, down the road, to add features faster than in C#. This could be a benefit to your users (shiny new useful feature) and management (revenue). So the discussion for moving to a new X involves showing benefit from the perspective of the company and not our personal wants. How do you mitigate risk? Perhaps start with internal tools, where you have to mimic a lot of the real business objects, and validate speed of feature (for the tools) development. You will get familiarity with your data and F#, independent of the user facing production code and come up with a better understanding of risk.