There's also some interest in making eBPF the standard for computational storage [1]: offloading processing to the SSD controller itself. Many vendors have found that it's easy to add a few extra ARM cores or some fixed-function accelerators to a high-end enterprise SSD controller, but the software ecosystem is lacking and fragmented.
This work may be a very complementary approach. Using eBPF to move some processing into the kernel should make it easier to later move it off the CPU entirely and into the storage device itself.
Samsung's SmartSSD comes with a Xilinx FPGA accelerator that can execute code without having to move data outside the device, and their open-source SDK includes domain-specific routines for areas such as databases.
By "vendors", I meant drive and controller vendors rather than server/appliance vendors. They're looking for ways to differentiate their products and offer more value add, but extending an SSD's functionality beyond mere block storage (potentially with transparent compression or encryption) requires a lot of buy-in from end users who need to write software to use such capabilities.
I meant the Seagates and WDs of this world. Around 2014 they also had HDDs with an extra core, which I suspect they got for free from their supplier with the message "look, we don't make these single-core CPUs anymore, here's a dual core for the same price".
In the case of Seagate's Kinetic Drives, one core was used by the controller and the other 'extra' CPU was used to manage a key-value store on the drive. These were ARM processors.
I don't think they make these themselves.
You would be wrong. I work with guys who used to do SoC design for both Seagate and WD.
This is how WD was able to jump to RISC-V so quickly; they didn't have any suppliers they needed to negotiate with, and the market for Cortex-R-style in-order, single-issue, fully pipelined real-time cores is sort of a commodity.
eBPF the bytecode is not particularly limited; you can even parse complex formats like Arrow or Parquet. The Linux kernel overlays a verifier on top, which adds all sorts of draconian restrictions (for good reason). When people talk of eBPF they don't always mean to include the Linux verifier limitations as well. In particular, the nvmexpress working group link in the parent post doesn't say one way or the other.
Because eBPF is designed for verification under constrained circumstances like kernels, while still allowing easy JITing that's almost a 1-to-1 translation to native code: stuff like not doing register allocation, being two-address, etc.
I'm not seeing any new parallels to SCSI, beyond the similarities that have long existed between NVMe and SCSI. What kind of programmable functionality over SCSI are you referring to?
SCSI uses simple fixed command sets, the same as standard NVMe drives. As far as I'm aware, SCSI doesn't have any way to chain commands together, so you can't really do anything more complex than predefined commands like copy or compare and write. It's nothing at all analogous to an actual programmable offload engine like you'd get with eBPF, and all the semi-advanced SCSI commands that are actually applicable to SSDs already have equivalents in the base NVMe command set.
I really loved the idea of Nebulet! Do you think it is possible to reach the same security guarantees as eBPF with a feature set like Wasm's? What other big challenges did you encounter?
I think the main issue is that eBPF cannot have loops, which restricts the programs you can write. Wasm doesn't have that restriction, so you can't prove that a program will terminate.
To be clear, eBPF programs can have loops -- they're just jumps that have negative offsets (which are signed 16-bit numbers), but for security reasons many verifiers do not allow them so they can ensure that the program halts.
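For what it's worth, my understanding is that newer Linux kernels (5.3+) do accept loops the verifier can bound. A minimal sketch, assuming a clang BPF target and libbpf headers (the program name and trip count are mine, not from any of the linked work):

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    SEC("xdp")
    int bounded_loop_example(struct xdp_md *ctx)
    {
        __u64 sum = 0;

        /* Constant trip count, so the verifier can prove the loop halts. */
        for (int i = 0; i < 64; i++)
            sum += i;

        /* Use the result so the loop isn't optimized away. */
        return (sum & 1) ? XDP_DROP : XDP_PASS;
    }

    char LICENSE[] SEC("license") = "GPL";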
I'd like to see more proof-carrying code techniques extended to WASM. Some of the work could be handed off to the compiler, embedding the proof in the binary. This could make expensive proofs, like termination checking, more tractable.
uh..yeah. The halting problem doesn’t mean you can’t do termination checking. It just means there’s no general algorithm for deciding if a machine halts.
Many databases can bypass the filesystem and use a block device (partition) directly. But modifying NVMe drivers is a pretty novel approach. Maybe Redis compiled into the Linux kernel is not a bad idea :)
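For anyone who hasn't seen it, "using the block device directly" usually just means opening the raw device with O_DIRECT and managing your own on-disk layout. A minimal sketch, assuming Linux and a made-up partition path (nothing here is from the article):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        /* Hypothetical partition; a database doing this would own it exclusively. */
        int fd = open("/dev/nvme0n1p1", O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        /* O_DIRECT requires the buffer, offset and length to be aligned to the
           device's logical block size (assumed to be 4 KiB here). */
        void *buf;
        if (posix_memalign(&buf, 4096, 4096)) { close(fd); return 1; }

        ssize_t n = pread(fd, buf, 4096, 0);   /* read the first block, no page cache */
        if (n < 0) perror("pread");
        else printf("read %zd bytes from the raw device\n", n);

        free(buf);
        close(fd);
        return 0;
    }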
Reminds me of the old Auspex file servers, with their functional processing units. For a while CPUs got fast enough to do everything. Then they didn't, and GPUs became a thing (though they had been something of a thing back in the 80s too, like the Amiga blitter), and now we're putting CPUs in storage controllers. The pendulum has swung a few times.
I wonder if Unikernels would also see this speedup, though they're so much harder to implement/use that I assume BPF-powered storage will see mass adoption before unikernels do...
What is disappointing is that even with newer OSes they seem to ignore exokernels and go straight for the older microkernel-style architecture, which is 1980s tech.
[1] https://nvmexpress.org/bringing-compute-to-storage-with-nvm-...