File Expiration Using BPF (hondu.co)
64 points by todsacerdoti on March 27, 2023 | 22 comments



Author here! As some comments mention, there are other approaches that could make more sense.

The aim of this proof-of-concept was to showcase how BPF enables the kernel to be programmable. Many behaviours can be retrofitted without having to submit changes upstream. In some cases the overhead may be lower than using other methods (due to avoiding context switches, etc). Even if you decide that the changes can be useful for the broader community, the new feature can be first implemented in BPF, where it’s faster (and safer!) to evolve it.


The "enable the kernel to be programmable" piece is absolutely the most insightful part of the BPF/eBPF effort, IMHO. Anyone who has ever designed or implemented an OS, or studied OS design, knows that a lot of stuff has to be done in user space for security reasons (in the "don't let a user program accidentally blow up the whole system" sense), and there is just no way around it... or is there?

For example, consider CreateProcess vs fork/exec: at first glance, the first is the more obvious and sensible one. You want a program launched, so you point the OS at the executable and say "execute this". The second API instead makes you create a perfect copy of the currently executing process, then tell the OS to completely scrap and discard its state and launch another executable file inside it. Literally no other object in the system works this way (i.e. to create a new X, you must copy an already existing X first, then modify the copy).

But then consider that you would also like to configure the environment of the newly launched program before it actually launches. That's when the CreateProcess API starts to lose its simplicity: you have to pass all imaginable (and some unimaginable) kinds of configuration, and the API has to be extensible and future-proof, too! Just look at what kind of silliness goes into thread-attribute lists [0]. With fork/exec, you simply run normal user-space code that configures the current process whichever way it pleases, then swap the executable, done. You can't do that with CreateProcess: you can't just ask the kernel to run some arbitrary user-provided code in the kernel context... unless it's not entirely arbitrary: enter eBPF.

[0] https://learn.microsoft.com/en-us/windows/win32/api/processt...
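
To make the fork/exec half of that argument concrete, here is a minimal sketch in Python (not from the article or the thread). The function name `spawn_configured` and the `DEMO_VAR` variable are made up for illustration. It assumes a POSIX system. Everything between `os.fork()` and `os.execvp()` is ordinary user-space code that configures the child before the executable is swapped.

```python
import os
import sys
import tempfile


def spawn_configured(argv, extra_env, workdir):
    """Fork, configure the child with plain user-space code, then exec.

    Hypothetical helper for illustration. Returns (exit_code, stdout_bytes).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: still a copy of the parent, so any normal code can run here.
        try:
            os.close(read_fd)
            os.chdir(workdir)            # configure: working directory
            os.environ.update(extra_env) # configure: environment variables
            os.dup2(write_fd, 1)         # configure: stdout goes to the pipe
            os.close(write_fd)
            os.execvp(argv[0], argv)     # swap the executable; no return on success
        finally:
            # Only reached if something above failed.
            os._exit(127)
    # Parent: collect the child's output and exit status.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        output = pipe.read()
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        code = os.WEXITSTATUS(status)
    else:
        code = -os.WTERMSIG(status)
    return code, output


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    child = "import os; print(os.environ['DEMO_VAR']); print(os.getcwd())"
    print(spawn_configured([sys.executable, "-c", child],
                           {"DEMO_VAR": "hello"}, demo_dir))
```

With a CreateProcess-style API, each of the three "configure" steps would need its own parameter or attribute in the call. Here they are just regular function calls made in the child.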


Interesting! I'm in the process of learning Rust (early days, early days) and whatever I write in Rust in the near future will have to interface nicely with C, AND for various reasons I also need to become far more expert in BPF than I currently am (just at the "uh, that's what happens when I make these seccomp calls" stage right now), so this is a great little project for me to explore. Thank you for that.

That said (you knew there was a "but" coming, right? :->), why ptrace? Was that the only or just the simplest way to do this?

I know this is a PoC, but given disabling ptrace is considered by some (many?) an important security practice (moo), I'm wondering if there is a better alternative?

Thanks!


ptrace is not used in this project, is it possible you meant some other system call or project?


My apologies! I saw the inclusion of ptrace.h in run_bpf and assumed the underlying mechanism was ptrace.


BPF --> Berkeley Packet Filter

Maybe this comment will save you a google. But probably not.

https://en.wikipedia.org/wiki/Berkeley_Packet_Filter


Funnily enough, although this calls it BPF, the technology in the article refers to eBPF (https://en.wikipedia.org/wiki/EBPF): https://docs.kernel.org/bpf/index.html -- which, although rooted in the OG BPF, is only distantly related.


Gregg’s BPF Performance Tools has an extensive note about this usage in the first section:

> BPF stands for Berkeley Packet Filter, an obscure technology first developed in 1992 that improved the performance of packet capture tools. In 2013, Alexei Starovoitov proposed a major rewrite of BPF, which was further developed by Alexei and Daniel Borkmann and included in the Linux kernel in 2014. This turned BPF into a general-purpose execution engine that can be used for a variety of things, including the creation of advanced performance analysis tools.

> [...]

> Extended BPF is often abbreviated as eBPF, but the official abbreviation is still BPF, without the “e,” so throughout this book I use BPF to refer to extended BPF. The kernel contains only one execution engine, BPF (extended BPF), which runs both extended BPF and “classic” BPF programs.¹

> ¹ Classic BPF programs (which refers to the original BPF) are automatically migrated to the extended BPF engine by the kernel for execution. Classic BPF is also not being developed further.


In modern parlance, BPF always means eBPF. The original BPF is sometimes called cBPF.


This seems a little roundabout. Why is this preferable over using `inotify(7)`? That's an existing system which can notify your userspace driver program about any time `setxattr(2)` is called.

I get that it's fun to experiment with BPF, but it's also useful to see how many tools already exist. BPF is powerful, yes - maybe too powerful, and it's often not the best tool for the job.


Here you probably want fanotify, not inotify. The latter would require setting up a lot of inotify entries, recursing through the filesystem.

https://man7.org/linux/man-pages/man7/fanotify.7.html

I think fanotify does xattrs, but I haven't tested.


I would still think that eBPF is a better suited approach. With fanotify, you would need to manage the mount points for which notifications need to be received. This works semi-OK for a static use case, but not for dynamically-created mounts and, worse, mount namespaces. In other words, fanotify is not suitable at all, without a lot of glue, for monitoring events happening in containers. And, for example, clamonacc (on-access file checking for ClamAV) does not work with removable storage, temporary network mounts, and containers for this very reason.


Exactly this. I should have mentioned it in the post. Might amend it.

Would be interesting to compare the chances of race conditions with inotify vs BPF for this contrived use-case.


Just curious why you wouldn’t amend it?

Doesn’t take away from the original point that bpf is cool.


Because it's from 2020/21, not even like it's a factual error, and OP may not have the interest to revisit it?


One other approach is directory expiration.

If a directory has a time-to-live of 8 hours, then any files or subdirectories inside the directory will be deleted 8 hours after their last modification.

That way the applications that work on these files don't need to be setting special attributes on every file.
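
A rough user-space sketch of that idea in Python, meant to be run periodically (for example from cron). The function name `expire_directory` and its parameters are hypothetical, and this is only one possible reading of the proposal. It handles regular files and symlinks only. Subdirectories are left in place, because deleting an entry updates the parent directory's own modification time and a correct directory policy needs more care than fits here.

```python
import os
import tempfile
import time


def expire_directory(root, ttl_seconds, now=None):
    """Delete files under `root` whose last modification is older than the TTL.

    Hypothetical helper for illustration. Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    removed = []
    # os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: judge a symlink by its own mtime, not its target's.
                age = now - os.lstat(path).st_mtime
                if age > ttl_seconds:
                    os.unlink(path)
                    removed.append(path)
            except FileNotFoundError:
                # The file vanished between listing and deletion; ignore it.
                pass
    return removed


if __name__ == "__main__":
    demo_root = tempfile.mkdtemp()
    stale = os.path.join(demo_root, "stale.log")
    open(stale, "w").close()
    nine_hours_ago = time.time() - 9 * 3600
    os.utime(stale, (nine_hours_ago, nine_hours_ago))
    print(expire_directory(demo_root, ttl_seconds=8 * 3600))
```

The TTL lives in the caller (or in configuration keyed by directory), so the applications writing the files don't have to tag anything.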


In case you want to learn to write BPF kernel programs using Rust

https://www.amazon.com/Oxidize-eBPF-programming-Rust/dp/B0BP...


Quite clearly from the "solution looking for a problem" dept.

If one ever happens to casually find 'several petabytes of old logs that were not deleted', whoever's in charge should consider having several people fired over that.


You'd fire people for leaving around some log files?


Depending on what is in those log files (passwords, credit card details, health records, etc), failure to delete could result in fines or technically even jail time for the people responsible.


Maybe even rehire them just to fire them for making a mistake.

Serious business these things.


We needn't be Nazis about technical glitches; these things happen.



