I'm working on an exokernel built around an eBPF VM that'll let you use loops in certain circumstances (i.e. more or less regular threads that happen to be in kernel space and are preemptible).
Are you saying the kernel BPF verifier can be disabled when loading a probe? AFAIK, there is no such option.
TBH, I cannot understand this statement. I've written a few thousand lines of BCC C code as eBPF, and I've never encountered any reference to actually having loops in the code.
I've never heard that one can write a new eBPF VM.
There's no doc on my work, as it's a private, jerk-offy personal project that I work on when I get tired of Jira tasks and process.
Edit: here's an example of one someone did in Rust: https://github.com/qmonnet/rbpf
A userspace VM that does not have the same capabilities as the kernel one is not useful when tracing kernel internals.
If you provided enough of std, or ported that VM to no_std, that Rust VM would work just fine in the kernel too.
In our case, though, our eBPF runs in external customers' environments, and we cannot ask them to patch their kernels with our code.
It's sort of like how Oak was this neat virtual machine for running on an early-90s PDA prototype. Then the writers of that VM realized that they had written a really general-purpose VM, cleaned it up, and released the first Java.
A VM this general (talking about eBPF now) hasn't been a first-class citizen of a mainstream kernel before. The devs are taking a very cautious approach (as they should), but ultimately eBPF is way bigger than a tracing tool. I wouldn't be surprised to see nearly everything you currently do with a kernel module ultimately being allowed in eBPF too. Maybe even things like emulating other OSes' kernels as easily as you'd start another container.
I am focused on existing stuff that can be readily used.
You are probably estimating the future, I guess?
Well, that's the thing I personally mostly use it for via bpftrace and bcc, but that's not the only thing. It's being used for a lot of networking-related things too: XDP, Cloudflare using it for a lot of their DDoS mitigation, etc.
Wasm instructions are not native instructions.
The spec clearly states that WebAssembly is sandboxed.
A non-sandboxing implementation would be either non-compliant or have bugs (which admittedly they most likely do at this point, considering they are still new).
Let's take it from the beginning:
1. Wasm is a set of instructions.
2. Those instructions have to be turned into instructions the hardware understands in order to execute them.
Nowhere in here is there a requirement of sandboxing. With a sandbox, a 'protected memory space' is dependent upon the implementor. There's no such thing as a magic software sandbox that you just drop into software and congrats, secure. You implement it.
Let me quote the WebAssembly spec:
> WebAssembly provides no ambient access to the computing environment in which code is executed. Any interaction with the environment, such as I/O, access to resources, or operating system calls, can only be performed by invoking functions provided by the embedder and imported into a WebAssembly module.
This applies to every single sandboxed language in the world.
The instructions are designed so that sandboxing the actual core is trivial, and the spec says that you have to sandbox outside of deliberate pass-throughs. That's about the best you can possibly do.
If languages can qualify as sandboxed, it sounds like WASM qualifies. (And if they can't, then we're using a broken definition of "sandboxed".)
Then I stand by what I said before. Your definition is broken, and you're making a semantic argument rather than actually discussing eBPF and WASM.
When you see someone say "sandboxed language" read it as "language where conforming implementations are by definition sandboxed". WASM meets that definition, as far as I can tell.
When you see someone say "WASM is sandboxed" read it as "any runtime that implements the WASM spec is sandboxed".
You can argue that a sandbox might be low quality. That's fine. But it doesn't make it non-sandboxed.
You don't get to decide how an instruction set is used, and a 'low quality sandbox' is not a sandbox.
You've denied facts and continue to make both inexperienced and naive claims that are dangerous. Not entertaining it further.
When all your security issues are violations of the spec, then it is not the language in the spec that is insecure.
True, but a conforming implementation has at least some sandboxing, since it prevents arbitrary memory access. But the degree to which it is sandboxed depends on the functions that are exposed by the runtime.
It’s entirely up to the implementation how much access it gets to the outside world.
WebAssembly is designed to be sandboxed and safe to run.
Interaction with the host environment is only possible via the runtime.
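To make the "only via the runtime" point concrete, here is a minimal sketch of the idea in Python. It is a made-up toy stack machine, not real WebAssembly: the opcodes, `ToyInstance`, `Trap`, and `host_log` are all hypothetical names invented for illustration. The two things it tries to show are bounds-checked linear memory and the fact that the guest can only call functions the embedder chose to import.

```python
class Trap(Exception):
    """Raised when the guest does something the toy sandbox forbids."""


class ToyInstance:
    """Hypothetical toy VM: private linear memory plus an explicit import table."""

    def __init__(self, code, imports, mem_size=16):
        self.code = code
        self.imports = dict(imports)       # the ONLY host functions the guest can reach
        self.memory = bytearray(mem_size)  # guest-private linear memory

    def _check(self, addr):
        # Every memory access is bounds-checked against the guest's own memory.
        if not 0 <= addr < len(self.memory):
            raise Trap(f"out-of-bounds memory access at {addr}")

    def run(self):
        stack = []
        for op, *args in self.code:
            if op == "push":               # push an immediate
                stack.append(args[0])
            elif op == "store":            # pops value, then address
                value, addr = stack.pop(), stack.pop()
                self._check(addr)
                self.memory[addr] = value & 0xFF
            elif op == "load":             # pops address, pushes the byte there
                addr = stack.pop()
                self._check(addr)
                stack.append(self.memory[addr])
            elif op == "call":             # call an imported host function with one argument
                fn = self.imports.get(args[0])
                if fn is None:
                    raise Trap(f"unknown import {args[0]!r}")
                stack.append(fn(stack.pop()))
            else:
                raise Trap(f"unknown opcode {op!r}")
        return stack


log = []

def host_log(value):
    """Host function the embedder decides to expose (hypothetical)."""
    log.append(value)
    return 0


guest = [("push", 3), ("push", 42), ("store",), ("push", 3), ("load",), ("call", "log")]
ToyInstance(guest, {"log": host_log}).run()
```

After this runs, `log` holds `[42]`. A guest that tries `("call", "open_file")` or stores to address 999 gets a `Trap` instead, because neither was granted by the embedder. How much the sandbox allows is then exactly the contents of the import table, which is the "degree depends on the runtime" point made above.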
The verifier that exists in the Linux kernel works very hard to make sure that you are not allowed to load programs with unbounded loops, but you most certainly can run such programs once those restrictions are lifted.
The point being that you could rip the eBPF implementation out of the kernel, remove the verifier check, and have a very usable VM.
Here's an implementation that exists because of the GPL: https://github.com/iovisor/ubpf.
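As a rough illustration of "same VM, verifier on or off", here is a toy register machine in Python. To be clear, this is not eBPF: the opcodes, the tuple encoding, `has_back_edge`, and `run` are all invented for the sketch, and the "verifier" is a crude stand-in that just rejects any backward jump. The `max_steps` budget is an assumed stand-in for preemption.

```python
def has_back_edge(prog):
    """Crude verifier stand-in: flag any jump whose target is not strictly forward."""
    for pc, (op, *args) in enumerate(prog):
        # Jump offsets are relative to the instruction after the jump.
        if op == "jlt" and pc + 1 + args[-1] <= pc:
            return True
    return False


def run(prog, verify=True, max_steps=10_000):
    """Interpret prog; with verify=True, refuse programs that contain loops."""
    if verify and has_back_edge(prog):
        raise ValueError("rejected: program contains a backward jump")
    regs = [0] * 4
    pc = steps = 0
    while pc < len(prog):
        steps += 1
        if steps > max_steps:              # stand-in for preempting a runaway loop
            raise RuntimeError("step budget exhausted")
        op, *args = prog[pc]
        pc += 1
        if op == "mov":                    # mov dst, imm
            regs[args[0]] = args[1]
        elif op == "addi":                 # addi dst, imm
            regs[args[0]] += args[1]
        elif op == "add":                  # add dst, src
            regs[args[0]] += regs[args[1]]
        elif op == "jlt":                  # jlt reg, imm, off: jump if regs[reg] < imm
            if regs[args[0]] < args[1]:
                pc += args[2]
        elif op == "exit":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs[0]


# Sum 1..10 with a loop: r0 is the accumulator, r1 the counter.
LOOP_PROG = [
    ("mov", 0, 0),
    ("mov", 1, 0),
    ("addi", 1, 1),       # loop body starts here
    ("add", 0, 1),
    ("jlt", 1, 10, -3),   # jump back to the addi while r1 < 10
    ("exit",),
]
```

`run(LOOP_PROG)` raises `ValueError` because of the backward jump, while `run(LOOP_PROG, verify=False)` returns 55. The interpreter itself never changed; only the load-time check did.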
Pretty wild and cool stuff, seems like.
Yes. Though an LLVM-based eBPF target was already available before this, so this adds the option to use GCC instead. So in principle you could use GFortran to write kernel code; ... profit!
> If so, is there any particular purpose in that, beyond making eBPF an easier place for people to write userspace software that would otherwise be kernel modules?
eBPF is an in-kernel virtual machine, with JITs for popular architectures like x86-64 (arm64 and ppc64le have them too). So you use GCC (or LLVM) to compile code into an eBPF-compatible object format and load it into the kernel with the bpf() syscall; an in-kernel verifier then checks that it doesn't do anything that isn't allowed before it's enabled.
So what can you do with it? Quite a lot, it seems (disclaimer: I haven't used it personally). The big use cases at the moment seem to be:
- network filtering
- seccomp filtering (that is, check syscall arguments)
- tracing (see bcc/bpftrace) for performance analysis
Small correction there: seccomp still uses classic BPF (cBPF), not eBPF. That puts a lot of restrictions on what it can do, and I believe the bytecode is incompatible too.
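On the bytecode point, a quick sketch of why the two can't be mixed: both use 8-byte instructions, but the fields are laid out differently. The layouts below follow the kernel's `struct sock_filter` (classic BPF) and `struct bpf_insn` (eBPF) on a little-endian machine; the helper names `encode_cbpf` and `encode_ebpf` are made up for this example.

```python
import struct


def encode_cbpf(code, jt, jf, k):
    """Classic BPF: struct sock_filter { u16 code; u8 jt; u8 jf; u32 k; }"""
    return struct.pack("<HBBI", code, jt, jf, k)


def encode_ebpf(code, dst, src, off, imm):
    """eBPF: struct bpf_insn { u8 code; u8 dst_reg:4, src_reg:4; s16 off; s32 imm; }"""
    # On little-endian, dst_reg occupies the low nibble and src_reg the high nibble.
    return struct.pack("<BBhi", code, (src << 4) | dst, off, imm)


# Classic BPF "ret #0x7fff0000" (BPF_RET | BPF_K = 0x06), i.e. SECCOMP_RET_ALLOW.
cbpf_ret_allow = encode_cbpf(0x06, 0, 0, 0x7FFF0000)

# eBPF "r1 = 5" (BPF_ALU64 | BPF_MOV | BPF_K = 0xb7).
ebpf_mov_r1_5 = encode_ebpf(0xB7, 1, 0, 0, 5)

# Same size, different field layout: reading cBPF bytes as an eBPF instruction
# yields an unrelated (opcode, registers, offset, immediate) tuple.
misread = struct.unpack("<BBhi", cbpf_ret_allow)
```

Classic BPF has a 16-bit opcode plus two 8-bit jump offsets and no register fields at all, while eBPF packs an 8-bit opcode, two 4-bit register numbers, and a 16-bit offset into the same first four bytes. A loader expecting one format will misinterpret the other.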
I've been waiting to dive into eBPF until the tools mature a bit, so it's great to see eBPF support landing in GCC.