
On WebKit Security Updates - raimue
https://blogs.gnome.org/mcatanzaro/2016/02/01/on-webkit-security-updates/
======
TazeTSchnitzel
Devices with old WebKit versions are fun. A lot of hacks used for homebrew on
the Nintendo 3DS have relied on unpatched WebKit bugs. Vendors never update
their WebKit versions, and Nintendo is no exception.

------
ufo
It's kind of sad that text and image documents today are so complicated that
security vulnerabilities in software that reads them are considered
inevitable.

~~~
pcwalton
Complexity is inevitable. The problem (well, one of the largest problems) is
that we have collectively failed to come up with effective mitigations for
these basic memory safety issues (other than sandboxing). At best we don't
invest enough in effective solutions; at worst we incorrectly convince
ourselves that "good programmers" never create exploitable security
vulnerabilities.

Just look at the list:
[http://webkitgtk.org/security/WSA-2015-0002.html](http://webkitgtk.org/security/WSA-2015-0002.html)
"Memory corruption and application crash" appears more times than I can count.

~~~
sebcat
Sandboxing is great. Isolate certain functionality in a separate process,
sandbox it with OS-provided solutions like seccomp-bpf or capsicum, make it
terminate on a violation, design its API to be idempotent so you can just
restart the process on failure. Have rlimits on process memory usage. There
are technical solutions and development processes to write better, more secure
software.

The problem is of course that software has to be designed for this, it's not
something that can easily be slapped on as an afterthought. It demands more of
the design, and that's not easy to motivate in most organisations. Forcing the
developer to take security into consideration would be better, but the root of
the problem is that most organisations don't consider security that important,
even if they say they do.

~~~
mike_hearn
It'd be much, much nicer if we could just use memory safe languages all the
time, everywhere. Sandboxing is ultimately a kludge: a recognition that our
tools are so dangerous that failure is not just a possibility but an
inevitability. It comes at a huge cost in developer productivity and code
complexity, and sometimes runtime performance suffers too.

pcwalton works on Rust, I believe. A rendering engine written in a memory-safe
style is something humanity just needs to build and I applaud Mozilla in at
least starting down that long road with Servo.

That said, HTML5 has sprawled to the point that it appears (to me) essentially
unimplementable from scratch unless you are Google, Microsoft or Apple. That
seems like a problem. If web servers indicated what HTML features a page they
were serving needed, it'd let browsers pick different rendering engines on the
fly, but of course ... malicious web pages would simply always ask for the
least secure engine. There doesn't seem to be a good workaround for that. It's
the downside of trying to force the web into being an app platform.

It's also unclear to me how the balance between safe and unsafe code works out
in Rust. Looking at Servo it appears that unsafe code is used all over the
place including in very core parts:

[https://github.com/servo/servo/search?utf8=%E2%9C%93&q=unsaf...](https://github.com/servo/servo/search?utf8=%E2%9C%93&q=unsafe+filetype%3Arust&type=Code)

My guess is that Rust's memory model simply requires unsafe code more often
than a traditional GC'd, range-checked language does. A part of me wishes the
Java guys would hurry up and get value types done already (it's years away),
as combined with a lot of the auto-vectorisation optimisations being added at
the moment by Intel it feels like at that point a performance competitive
rendering engine written in entirely memory safe code would become realistic.
Perhaps it even already is, unfortunately none of the above mentioned
companies are interested in trying and nobody else can afford HTML5.

This line of thought is how I ended up concluding we need an entirely new
mobile code oriented app platform.

~~~
pcwalton
> Looking at Servo it appears that unsafe code is used all over the place
> including in very core parts:

That's a very misleading characterization. It's less than 1% of the code. And
it _should_ be in core parts: those are the parts that are carefully audited
and the parts that safe abstractions are built around.

We need unsafe primarily for the FFI layer, just as you do in any language.
Bindings to a JIT would be no less safe in Java. You just have to write
"unsafe" to call FFI functions in Rust, because we want to make it easier to
audit that code.

> combined with a lot of the auto-vectorisation optimisations being added at
> the moment by Intel it feels like at that point a performance competitive
> rendering engine written in entirely memory safe code would become realistic

Any compiler developer will tell you that pinning your hopes on
autovectorization is a lost cause. And you need a lot more than SIMD to write
a competitive browser engine. (SIMD isn't the reason we write unsafe.) What
I'd like to know is how you propose to write a JIT in safe code, without
trusting that the JIT is implemented correctly. There _are_ ways to
potentially do that in theory (proof carrying code), but they're way beyond
anything you've mentioned.

~~~
mike_hearn
It may be less than 1% of the code by lines, but it's a large codebase. My
point was that simply searching for the word "unsafe" in the code threw up 21
pages of results. That's not a rarely used feature.

By "core code" I meant that the first results are things like layout code and
files called string_list.rs, string_multimap.rs, etc. That doesn't sound like
stuff at the edges to me.

 _Any compiler developer will tell you that pinning your hopes on
autovectorization is a lost cause_

I didn't mean that's the only thing that's needed, I think value types would
be by far the biggest needle-mover there. But there's quite a lot of auto-
vectorisation going on in the latest JIT compilers, a surprising amount (some
of it not released yet).

 _What I'd like to know is how you propose to write a JIT in safe code,
without trusting that the JIT is implemented correctly_

Obviously if you're compiling code, the compiler has to generate safe outputs.
But take a look at how the Graal/Truffle frameworks do it. Graal is a JIT
compiler implemented entirely in Java. It generates machine code that's then
handed off to the JVM for execution. Truffle sits on top and uses aggressive
partial evaluation to convert AST interpreters for dynamic scripting languages
into JIT compilers automatically. They have a JavaScript implementation that's
comparable to V8 in speed, and you can interop with it through safe APIs.
Again, the entire compiler/JIT stack is written entirely in a safe language:
eventually the CPU jumps to the result, but the code itself is all written in
a safe way.

~~~
pcwalton
I wrote the layout code you're talking about. There is no unsafe in the layout
algorithms, just in the work-stealing deque and the logic to share DOM nodes
over to the layout thread. You can think of both as low-level generic memory
management primitives, essentially as a GC. We do _not_ allow the layout logic
to be unsafe.

As for string lists and such, those are core collections. Again, those are
low-level primitives and relatively easy to prove safe. We should be migrating
more of that code to the safe language, but it never causes problems in
practice. That's because the "build simple abstractions out of unsafe
primitives and keep the complicated code safe" approach _works_. (We have
empirical evidence that it works, you know. There are incredibly few segfaults
in Servo relative to any other browser engine we've worked on.)

I am convinced that writing in Java would not make the browser any safer. All
you'd do is move some unsafe code from the engine to HotSpot. The only real
effect it would have is that the word "unsafe" would not appear as often in
the code.

> It generates machine code that's then handed off to the JVM for execution.

And there is absolutely no difference between that and what Servo/SpiderMonkey
does, because you're trusting the millions of lines of code in HotSpot. Servo
is just being explicit about the lack of safety when touching the JIT.

------
ryan-c
Many of the obscure WebKit browsers do not validate certificates on
subresources, which allows MitM attackers to replace third party scripts with
malicious ones. Some of them do this without even any UI indication that there
is a problem. I have a test page here:
[https://rya.nc/https-script.html](https://rya.nc/https-script.html)

------
ianlevesque
"WebKit does have a Linux sandbox, but it’s not any good, so it’s (rightly)
disabled by default."

I'd love to know more about why, that seems mad in 2016.

~~~
creshal
Disabling a broken feature, or not fixing it?

The latter is easily explained: WebKit (especially on Linux) is woefully short
of manpower now that Google et al. have pulled out of it.

I'll call myself lucky if WebKitGTK+ doesn't crash on any of the Alexa top 100
any more.

~~~
josteink
Just in time for Emacs, XWidget support and Webkit embedded in Emacs then.

And here I thought we'd finally get a quality browser-engine for our OS. Seems
we'll have to wait some more :)

~~~
creshal
Chromium Embedded Framework (used by e.g. Qt, Android, and browsers like
Vivaldi) is okay, and Servo (Mozilla's new rendering engine in Rust) is aiming
to be API-compatible with it.

Seems it's where we're headed for now.

------
awalGarg
> Web engines are full of security vulnerabilities, like buffer overflows,
> null pointer dereferences, and use-after-frees.

Off-topic to the issue at hand, but I am curious: would Servo[1], the browser
engine written in Rust, be less prone to issues like this? I have written very
little Rust, but it seems that the above issues are extremely unlikely to
occur with Rust.

[1]: [https://servo.org/](https://servo.org/)

~~~
notalaser
Never underestimate what misguided cleverness can do. Heartbleed, the most
famous of these issues in the last few years, would not have occurred on
systems like OpenBSD, were it not for OpenSSL's developers' quest for
optimization in their malloc wrapper.

I've seen buffer overflows in Java, too. It's immune to writing past the end
of an array, but it's not immune to, say, smart programmers using one large
array to hold multiple buffers that they manually manage, because who knows
when the GC kicks in and wreaks havoc. Python may be immune to use-after-
frees, but the OpenGL bindings that it calls through the FFI aren't.

To err is human, and languages like Rust don't make it less likely to err,
they just give you new opportunities to do it.

Moving to a safe language like Rust would help, absolutely, but to a far
lesser degree than people hope. It would close far fewer cans of worms than
the hive mind hopes, and it's so new no one really knows how big its own can
of worms is.

~~~
orf
> and languages like Rust don't make it less likely to err

It doesn't make the person less likely to err overall, but it makes it less
likely that those errors get compiled and released.

~~~
notalaser
It makes it less likely that _those errors that we're familiar with from C_
get compiled and released. There's no debate there. It's all those errors that
we're _not_ familiar with from C that it won't do anything against.

Not to mention errors stemming from our own arrogance and short-sightedness,
which no language can ever protect us against.

~~~
mikeash
If it makes it less likely that some errors get out, and doesn't do anything
against other errors, then overall it makes it less likely that errors get
out.

~~~
notalaser
What makes you think that it does not introduce any additional errors?

~~~
mikeash
What makes you think that it does?

~~~
notalaser
The course of programming so far.

------
sambe
Interesting. I have one major disagreement. This:

"This seems to me like a good reason for application maintainers to carefully
test the updates"

Doesn't seem remotely feasible. Every security release should be manually QA'd
by the distribution? If they are already worried enough not to take such
updates, I doubt they have the manpower to cover everything not covered by
integration tests (a lot?).

~~~
greggman
Well, there are hundreds of thousands of tests checked into various repos.
WebKit has > 45k tests, and there are open test suites for many web APIs. Of
course it's work to get some testing infrastructure set up, but once set up
it's mostly automated?

I suppose not, it just adds something else that needs to be maintained :P

~~~
makomk
One would hope that whoever committed the patches already ran those tests, so
the bugs that get through will mostly be ones that the tests couldn't catch -
for example, issues with applications that integrate WebKitGTK+ not
functioning properly with the new version. Incidentally, GTK+ as a whole has a
terrible record when it comes to backwards-compatibility these days.

And as the post makes clear, the security upgrades don't even help that much
because the original WebKitGTK+ API isn't receiving them anymore and moving to
the new API requires massive changes that distros certainly can't ship in an
update to a stable release. Apparently they gave applications "a full year to
upgrade", which is shorter than the lifespan of distro releases and not
realistic given that it requires upgrading to GTK+ 3 and more major changes on
top of that. Just the GTK+ 3 switch alone is a multi-year project for complex
software, and not helped by the fact they keep breaking their API in new
versions. Case in point:
[https://bugzilla.gnome.org/show_bug.cgi?id=757503](https://bugzilla.gnome.org/show_bug.cgi?id=757503)

------
kangaroo
Why is this so bad? Because the WebKit security group doesn't let new members
in.

It's a self-perpetuating cycle.

