
The Security Implications Of Google Native Client - ilitirit
http://www.matasano.com/log/1674/the-security-implications-of-google-native-client/
======
amalcon
Wow. I didn't even realize it was an x86 bytecode verifier. I was under the
impression that NaCl was some kind of a virtualization trick, given that it
came out about the same time as CPUs with virtualization instructions started
to get popular.

Oh well. Maybe they'll get it to work in a secure manner, but I don't see it
happening any time soon.

~~~
ilitirit
The research paper says it's a sandbox.

[http://nativeclient.googlecode.com/svn/trunk/nacl/googleclie...](http://nativeclient.googlecode.com/svn/trunk/nacl/googleclient/native_client/documentation/nacl_paper.pdf)

Here's an interesting comment I read on Slashdot a while back that summarizes
the paper:

 _This is really a little operating system, with 44 system calls. Those system
calls are the same on Linux, MacOS (IA-32 version) and Windows. That could
make this very useful - the same executable can run on all major platforms.

Note that you can't use existing executables. Code has to be recompiled for
this environment. Among other things, the "ret" instruction has to be replaced
with a different, safer sequence. Also, there's no access to the GPU, so games
in the browser will be very limited. As a demo, they ported Quake, but the
rendering is entirely on the main CPU. If they wanted to support graphics
cross-platform, they could put in OpenGL support.

Executable code is pre-scanned by the loader, sort of like VMware. Unlike
VMware, the hard cases are simply disallowed, rather than being interpreted.
Most of the things that are disallowed you wouldn't want to do anyway except
in an exploit.

This sandbox system makes heavy use of some protection machinery in IA-32
that's unused by existing operating systems. IA-32 has some elaborate
segmentation hardware which allows constraining access at a fine-grained
level. I once looked into using that hardware for an interprocess
communication system with mutual mistrust, trying to figure out a way to lower
the cost of secure IPC. There's a seldom-used "call gate" mechanism in IA-32
that almost, but not quite, does the right thing in doing segment switches at
a call across a protection boundary. The Google people got cross-boundary
calls to work with a "trampoline code" system that works more like a system
call, transferring from untrusted to trusted code. This is more like classic
"rings of protection" from Multics.

Note that this won't work for 64-bit code. When AMD came up with their
extension to IA-32 to 64 bits, they decided to leave out all the classic x86
segmentation machinery because nobody was using it. (I got that info from the
architecture designer when he spoke at Stanford.) 64-bit mode is flat address
space only._

[http://tech.slashdot.org/comments.pl?sid=1056231&cid=260...](http://tech.slashdot.org/comments.pl?sid=1056231&cid=26048381)
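
To make the "safer sequence" for "ret" a bit more concrete: as far as I recall from the NaCl paper, an indirect jump is preceded by masking the target address down to a 32-byte boundary, so control can only land at the start of an aligned bundle that the verifier has already checked. A minimal sketch of just the masking arithmetic follows; the names `BUNDLE_SIZE` and `mask_target` are mine, not NaCl's.

```python
# Toy model of the address masking applied before an indirect jump.
# BUNDLE_SIZE and mask_target are illustrative names (assumptions), not NaCl identifiers.
BUNDLE_SIZE = 32  # bytes; bundle alignment as I recall it from the paper


def mask_target(addr: int) -> int:
    """Clear the low 5 bits so the jump lands on a 32-byte boundary."""
    return addr & ~(BUNDLE_SIZE - 1) & 0xFFFFFFFF


if __name__ == "__main__":
    for addr in (0x10027, 0x10020, 0xFFFFFFFF):
        print(hex(addr), "->", hex(mask_target(addr)))
```

The point of the design is that a return address taken from the stack is data the untrusted code controls, so it cannot be jumped to directly; forcing it onto a bundle boundary means it cannot point into the middle of an instruction.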

~~~
tptacek
It's a sandbox implemented with a verifier.

Some other notes:

I'm not sure I'd call X86 segmentation "elaborate", at least in the context of
X86 programming (sure, it's very elaborate compared to MIPS).

I don't think I've heard the word "call gate" used with the definite article
before, as if there was just one of them... but I'm an X86 autodidact and that
could be my mistake. My understanding is that a call gate is anything that
vectors a program from one context to another. In most X86 operating systems,
there are 2-3 basic call gates that will get you from userland to kernel: the
INT instruction (the interrupt handler will check your program state and
dispatch the right system call) and the SYSCALL instruction (which does the
same thing without the interrupt overhead).

NaCl disallows both of these instructions, along with the FAR CALL opcode that
would let you jump between segments and the segment override prefix that does
the same (note this was the epic fail Dowd found in the contest).
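
A rough sketch of what that rejection rule amounts to, assuming the instruction stream has already been reliably decoded (which is the hard part, and is not shown here). The `Insn` type, the mnemonic strings, and `verify` are hypothetical names for illustration, not anything from the NaCl source.

```python
from dataclasses import dataclass

# Instructions named in the comment above as disallowed. The mnemonic
# spellings are an assumption of this sketch, not NaCl's own tables.
FORBIDDEN_MNEMONICS = {"int", "syscall", "call far"}
SEGMENT_OVERRIDES = {"cs", "ss", "ds", "es", "fs", "gs"}


@dataclass(frozen=True)
class Insn:
    """A pre-decoded instruction: mnemonic plus any prefixes."""
    mnemonic: str
    prefixes: tuple = ()


def verify(insns):
    """Return a list of (index, reason) violations; empty means accepted."""
    violations = []
    for i, insn in enumerate(insns):
        if insn.mnemonic in FORBIDDEN_MNEMONICS:
            violations.append((i, f"forbidden instruction: {insn.mnemonic}"))
        for prefix in insn.prefixes:
            if prefix in SEGMENT_OVERRIDES:
                violations.append((i, f"segment override prefix: {prefix}"))
    return violations


if __name__ == "__main__":
    program = [Insn("mov"), Insn("mov", ("fs",)), Insn("int")]
    for index, reason in verify(program):
        print(index, reason)
```

Unlike an interpreter, nothing here tries to emulate the hard cases: one violation is enough to refuse to load the module.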

The trampoline mechanism that NaCl uses is not at all dissimilar from how
Win32 and BSD libc issue system calls; the library exports a stub interface
and hides the mechanics of actually issuing a system call.

Note: not trying to be pedantic here. Just love geeking out on this stuff.

~~~
limmeau
The x86 instruction set has a mechanism called "call gates" for system calls.
Basically, the OS puts the entry point of the system call handler into a
segment descriptor with the call gate bits set. The unprivileged user program
then performs a far call to an address consisting of a segment selector for
that descriptor and an offset which does not matter. Execution resumes at the
system call handler, with a privilege level as encoded in the call gate
descriptor.
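
For the curious, here is a small sketch of how a 32-bit call gate descriptor is laid out, going by my reading of the Intel manuals: eight bytes holding the target selector, the entry-point offset split in two halves, a parameter count, and an access byte with the type field set to 0xC. The function names are made up for this example.

```python
import struct

CALL_GATE_32 = 0xC  # type field value for a 32-bit call gate


def pack_call_gate(selector: int, offset: int, dpl: int, param_count: int = 0) -> bytes:
    """Build the 8-byte descriptor for a present 32-bit call gate.

    dpl is the privilege level a caller needs in order to use the gate.
    """
    access = 0x80 | ((dpl & 0x3) << 5) | CALL_GATE_32  # P=1, S=0, type=0xC
    return struct.pack(
        "<HHBBH",
        offset & 0xFFFF,          # entry point, bits 15..0
        selector & 0xFFFF,        # code segment selector of the handler
        param_count & 0x1F,       # dwords copied to the new stack
        access,
        (offset >> 16) & 0xFFFF,  # entry point, bits 31..16
    )


def unpack_call_gate(raw: bytes) -> dict:
    """Decode the fields packed by pack_call_gate."""
    lo, selector, params, access, hi = struct.unpack("<HHBBH", raw)
    return {
        "selector": selector,
        "offset": (hi << 16) | lo,
        "param_count": params & 0x1F,
        "dpl": (access >> 5) & 0x3,
        "present": bool(access & 0x80),
        "type": access & 0xF,
    }


if __name__ == "__main__":
    gate = pack_call_gate(selector=0x08, offset=0xC0101234, dpl=3)
    print(gate.hex(), unpack_call_gate(gate))
```

Since the entry point lives in the descriptor, the offset part of the caller's far pointer is ignored, which is why it "does not matter" in the description above.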

That way, you could have thousands of system call entry points and avoid the
overhead of an int instruction and the syscall-number dispatch. I believe OS/2
used that mechanism extensively (and all the other elaborate segmentation
stuff).

And I call x86 segmentation "elaborate" :)

~~~
tptacek
You're right, I'm being imprecise.

------
jrockway
The general impression I got from this article is "never write software, since
the software could have a security problem". Technically true. But sometimes
the new functionality is worth the (minimal) security risk. Worry about making
something interesting first, and making it secure enough to run on your
banking computer later.

Also, if your parsers allow remote code execution, you are doing them wrong.
Don't write application code in C! Use a safe language like Haskell or Lisp or
Java or ... anything else. Then your bugs are bugs, not major security
problems.

~~~
Confusion
_The general impression I got from this article is "never write software,
since the software could have a security problem"_

What he's saying is that _this_ specific piece of software is very tricky to
get right. Moreover, if it becomes popular, then the consequences are enormous
if they get it wrong, as in the case of ActiveX or Flash exploits. The
aggregate damages from viruses and malware must surely run in the tens of
billions of dollars.

------
gojomo
So who'll be the first to port the Java VM itself to NaCl for a "belt and
suspenders" approach to dynamically-downloadable-code security?

~~~
oconnor0
Would that buy you much over installing a JVM + Web Start?

------
mildweed
If Google NaCl comes to fruition, they might have to consider rewriting their
arsenal of Javascript apps as x86 apps! This is certainly one of the more
potent salvos in the war on Microsoft. Come on, M$, fight back! You can do it.
You used to know how to innovate...

~~~
rcoder
I'm failing to see how this could possibly disadvantage Microsoft at all.
Since the sandboxed code runs as a native X86 app, it also uses native system
calls. Games (or any desktop app delivered via NaCl) using DirectX or other
native Windows system APIs would be just as tightly coupled to Windows under
NaCl as they were when running standalone.

If anything, Microsoft should _welcome_ this as a defense against the
encroachment of cross-platform JS and Flash applications.

~~~
tptacek
Wow is that ever wrong.

NaCl sandboxed code does _not_ use native system calls. If it did, running a
NaCl program would be just as bad as running an ActiveX control.

NaCl programs are restricted by the verifier so that they can only run a set
of virtual system calls defined by Google. That system call file is like its
own cross-platform OS interface, like NSPR except for desktop programs.
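
In other words, something shaped like the sketch below: a fixed table of virtual calls, with anything outside the table refused. The call numbers and handlers here are invented for illustration and are not NaCl's actual interface.

```python
def make_dispatcher(table):
    """Return a dispatch function that only honors calls listed in `table`."""
    def dispatch(number, *args):
        handler = table.get(number)
        if handler is None:
            # Not in the table: the sandboxed program simply cannot do this.
            raise PermissionError(f"virtual syscall {number} is not defined")
        return handler(*args)
    return dispatch


# Hypothetical table; real entries would wrap the host OS behind one interface.
VIRTUAL_SYSCALLS = {
    0: lambda: "null",
    1: lambda buf: len(buf),  # stand-in for a write-like call
}

dispatch = make_dispatcher(VIRTUAL_SYSCALLS)

if __name__ == "__main__":
    print(dispatch(1, b"abc"))
    try:
        dispatch(99)
    except PermissionError as err:
        print("rejected:", err)
```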

~~~
rcoder
Okay, good to know. It was unclear from the article _what_ set of syscalls were
exposed to the apps running within the NaCl sandbox.

~~~
tptacek
You build a NaCl program with a special GCC toolchain Google provides, which
creates ELF binaries with a base address of 0 and text starting at 0x10000.

When NaCl loads your program (which, remember, it wants to do without asking
you), it forks a "secure ELF loader" program, which populates the first 64k
bytes with a runtime (think crt0, on steroids), which we call the "trusted
code base". The TCB isn't verified the same way the NaCl binary is, because
it's code that Google wrote that never changes.
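
The address layout described so far is simple enough to write down. This is only a toy classification under the numbers given above (base address 0, text at 0x10000); `TRUSTED_END` and `region` are names invented for the sketch.

```python
# 0x10000 is 64 KiB: the runtime occupies everything below it,
# and the loaded program's text begins at exactly this address.
TRUSTED_END = 0x10000


def region(addr: int) -> str:
    """Say which side of the 64 KiB line a 32-bit address falls on."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address outside 32-bit range")
    return "trusted runtime" if addr < TRUSTED_END else "loaded program"


if __name__ == "__main__":
    for addr in (0x0, 0xFFFF, 0x10000, 0x20000):
        print(hex(addr), region(addr))
```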

That code implements or trampolines to a common set of systemy things NaCl
wants to let NaCl programs do. We made the comparison in that post between the
NaCl runtime/TCB and the JVM applet sandbox, and I think it's apt. In both
cases, you have a wad of code that is trusted by the system and needs to be
bug free, and in both cases that's the first place a security researcher is
going to look for flaws once the system is mature.

~~~
rcoder
I think the reference to gaming in the article threw me off a bit -- looking
over the NaCl API docs, I don't see any references to accelerated 3D of any
sort[1]. Basically, it can handle Quake, because it works with a software-only
rendering pipeline, but Crysis is going to be out of reach for some time to
come.

As I understand it now, NaCl is really just a way to offload CPU-bound tasks
to native code with minimal overhead. It's interesting, but I'm not sure I
like the trade-offs vs. a platform actually designed from the start to ensure
confinement of untrusted code (i.e., Java).

[1] -- Does anyone know of good literature on confinement of GPU-intensive
code? I've read up on x86 and Java security issues, but don't know of any good
work on the security profile of code using OpenGL or DirectX...

------
skwiddor
Another x86 VM is vx32, which runs 9vx, a version of Plan 9 from Bell Labs, as
an application. We promote it to new users as an alternative to finding
compatible hardware or running Qemu.

vx32 : <http://pdos.csail.mit.edu/~baford/vm/>

9vx : <http://swtch.com/9vx/>

I think a Windows port of 9vx is one of our GSoC projects.

