Perl's "taint" capability is pretty interesting in this space. Do other languages have something similar?
"You may not use data derived from outside your program to affect something else outside your program--at least, not by accident. All command line arguments, environment variables, locale information (see perllocale), results of certain system calls (readdir(), readlink(), the variable of shmread(), the messages returned by msgrcv(), the password, gcos and shell fields returned by the getpwxxx() calls), and all file input are marked as "tainted"."
Unrelated rant: Sometime recently, mobile Chrome started omitting any part of a URL after a # when you copy/share it. Grrr.
Just have two separate types, e.g. UnsafeString and regular String, and some kind of `convert` function that takes a validation function as an argument. You'd get compile-time checking that way.
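A minimal Python sketch of this two-type idea (the names `UnsafeString` and `convert` are illustrative, not from any real library). With a static checker such as mypy, passing an `UnsafeString` where a `str` is expected gets flagged before runtime, which approximates the compile-time checking described:

```python
from typing import Callable

class UnsafeString:
    """Tainted input. Deliberately NOT a str subclass, so it can't
    be used anywhere a plain str is expected."""
    def __init__(self, raw: str) -> None:
        self._raw = raw

def convert(tainted: UnsafeString, validate: Callable[[str], bool]) -> str:
    """The only path from UnsafeString to str: a caller-supplied validator."""
    if not validate(tainted._raw):
        raise ValueError("validation failed; data stays tainted")
    return tainted._raw

user_input = UnsafeString("alice42")        # e.g. read from a form field
clean = convert(user_input, str.isalnum)    # untaint via an explicit check
assert clean == "alice42"

try:
    convert(UnsafeString("alice; rm -rf /"), str.isalnum)
except ValueError:
    pass  # shell metacharacters never reach the trusted str type
```

The key design choice is that no implicit conversion exists: the validator is the single choke point, so code review only has to look at `convert` call sites.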
People don't tend to use such things in practice, though, and you would also have to ban a portion of most languages' standard libraries to enforce it (because they already return regular strings for inputs).
We use something similar where we have a BadNumber class in our code (python). Any operation with another number will also create a BadNumber. It allows us to make sure that these tainted numbers are always obvious.
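A stripped-down sketch of that BadNumber idea (the class name comes from the comment; the implementation here is a guess, showing only `+` and `*`). Subclassing `float` means Python's reflected-operand rules make even `2.0 * bad` return a `BadNumber`:

```python
class BadNumber(float):
    """A float whose 'taintedness' survives arithmetic."""

    def __add__(self, other):
        return BadNumber(float(self) + float(other))

    __radd__ = __add__  # covers plain_float + BadNumber too

    def __mul__(self, other):
        return BadNumber(float(self) * float(other))

    __rmul__ = __mul__

x = BadNumber(3.0)
y = x + 2.0          # BadNumber(5.0): taint propagates left-to-right
z = 2.0 * y          # reflected __rmul__ propagates right-to-left as well
assert isinstance(y, BadNumber)
assert isinstance(z, BadNumber)
assert z == 10.0
```

A full version would also override `-`, `/`, `**`, and so on; the point is that once every operator is covered, a tainted number can never silently turn back into a clean one.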
Though I don't have experience with it, maybe one reason it isn't used more is false positives?
For efficiency reasons, Perl takes a conservative view of whether data is tainted. If an expression contains tainted data, any subexpression may be considered tainted, even if the value of the subexpression is not itself affected by the tainted data.
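That conservative rule can be modeled in a few lines of Python (Perl's real mechanism lives inside the interpreter; the `Tainted` class and `concat` helper here are purely illustrative):

```python
class Tainted(str):
    """A str flagged as coming from outside the program."""

def concat(a: str, b: str) -> str:
    result = str.__add__(a, b)  # ordinary string concatenation
    # Conservative propagation: if ANY input is tainted, the whole
    # result is tainted, regardless of how much it contributed.
    if isinstance(a, Tainted) or isinstance(b, Tainted):
        return Tainted(result)
    return result

env_value = Tainted("/tmp")           # e.g. read from the environment
path = concat("prefix-", env_value)
assert isinstance(path, Tainted)      # taint spreads through the expression

safe = concat("a", "b")
assert not isinstance(safe, Tainted)  # purely internal data stays clean
```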
Has anyone used this? Is the runtime overhead always there, or only when you turn the taint mode on? It seems like it would have to occupy some extra space in the string objects all the time? (Although I guess if it's literally a single bit, it can come for "free" because of padding)
FWIW here are some references on static taint analysis:
Perhaps not much on purpose, but it kicks in automatically if the Perl script is setuid. So you'll find questions about it where people are struggling with it.
It's probably sharing the canonical link. That's the "correct" behaviour as defined by browsers. Definitely not ideal in your example, though.
> * untrustworthy inputs;
> * unsafe implementation language; and
> * high privilege.
> Security engineers in general, very much including Chrome Security Team, would like to advance the state of engineering to where memory safety issues are much more rare. Then, we could focus more attention on the application-semantic vulnerabilities. That would be a big improvement.
Very nice. At the end of the process, Google might adopt Rust in Chromium. As much as I use and love Firefox, it's only realistic to say that Chrome has higher chances of being around in 10 years.
I wonder why the list doesn't include their Wuffs language.
An interesting list. They left out C# and PHP if it's supposed to be the most popular languages.
(Legal line noise: my opinions are not those of my employer.)
If you compare where Firefox was 10 years ago to where it is now, you can see the trend, and it still continues. In the last year, Firefox lost more than 10% of its market share. A component of this is probably Firefox failing to capture the growth of the overall market, but the trend also holds for the absolute number of users: 890 million YAUs in Jul 2018 vs 809 million YAUs in Jul 2019. In the long-term view, Firefox is dying.
There are enough of them to keep Mozilla going indefinitely, and they are doing some truly amazing stuff, like using Rust to get massive performance boosts. Mozilla and Firefox have set the technical agenda for close to two decades. Everybody does tabs now; I remember when that was a Mozilla-only thing. Extensions were a Mozilla-only thing for a long time; even Safari has them now. The new focus on security and privacy started at Mozilla and is now being copied by others (Brave, Edge, Safari), while Google is moving to kill ad blockers and continues to sell users out to their advertisers.
When you observe someone using Firefox do you interject, "You know that Firefox is dying, right?"
When you participate in meetings, do you open with "I'd like to state for the record that Firefox is doomed, doomed I say"?
At the coffee shop in the morning, do you order "Grande Mocha, hold the Firefox because all hope is lost"?
It's interesting to get a sense of how deeply unrealistic they think it is, to write a safe parser for a typical data format in an unsafe language.
It is so unrealistic that Android is following in Solaris's footsteps.
Google has announced that ARM memory tagging extensions will be required in future Android versions.
According to an interview they gave to Ars in 2017 about changing the Play Store contract.
And the new project announced at I/O to require support for GSI images.
It's not clear that all these bugs can be turned into an attack, but that sure is a lot of bugs.
2. Avoiding high privilege often means going out of process, which is challenging on resource-constrained devices.
- untrustworthy inputs
- unsafe language
- high privilege
- big, ball-of-mud codebase
We can avoid the first three, if the thing is small and simple.
Pretty much every OS kernel out there in wide deployment has all four of the above, though.
Looking at all the trivial exploits against web applications, which are basically never written in memory-unsafe languages (they're in Ruby, Python, PHP, ...), shows that it doesn't really matter much. While having the same implementation in a memory-unsafe language would be slightly less safe, it's very unlikely that a heap corruption could be exploited remotely.
About 70% of CVE reported exploits are due to memory corruption.
Living with the remaining 30% would already be a huge security improvement.
Memory corruption is much harder to exploit beyond a DoS (and in most cases realistically not exploitable at all), and a DoS is what you would get with "safe" languages such as Rust or Python as well.
Heartbleed, Shellshock, Dirty COW etc. would all happen exactly the same way in different programming languages.
Yes, there is clearly a benefit in using something that makes it much harder to introduce memory-safety issues, but it's not nearly as big as many here on HN think.
I believe the same is true (roughly) with the Coinbase attack that went after Firefox.
In short, memory safety is not only responsible for the majority of reported vulns, but also the exploited ones, at least in the case of browsers.
> Heartbleed, Shellshock, Dirty COW etc. would all happen exactly the same way in different programming languages.
Heartbleed is impossible in a memory-safe language, at the least. Same with Cloudbleed, for that matter.
DirtyCow and shellshock, sure.
It's a bit of a moot point though - human energy is finite, consider if we could spend energy on problems like DirtyCow and shellshock instead of memory safety issues that simply don't exist in many languages.