eBPF – Adding functionality to OS at runtime (ebpf.io)
118 points by truth_seeker on Nov 6, 2022 | 22 comments

This is a good write-up and I like the diagrams. What still appears to be notably missing from eBPF is an "off switch". AFAIK there are still no kernel boot-time parameters [0] to disable eBPF entirely. I would have to recompile the kernel to disable it, and most people will not do that.
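A related knob that does exist is the `kernel.unprivileged_bpf_disabled` sysctl. It is not the boot-time off switch asked for here: it only restricts bpf() for unprivileged callers and leaves eBPF available to privileged ones. A minimal Python sketch for checking it, assuming the usual /proc path and the value meanings given in the kernel sysctl documentation (the function names are made up for illustration):

```python
from pathlib import Path

# Assumed location of the sysctl; it may be absent on kernels built without bpf() support.
SYSCTL_PATH = Path("/proc/sys/kernel/unprivileged_bpf_disabled")

# Value meanings as described in the kernel's sysctl documentation.
MEANINGS = {
    0: "unprivileged bpf() calls are allowed",
    1: "unprivileged bpf() calls are disabled until reboot",
    2: "unprivileged bpf() calls are disabled, but an admin can re-enable them",
}


def describe_unprivileged_bpf(value):
    """Return a human-readable description of a sysctl value."""
    return MEANINGS.get(value, f"unrecognized value {value}")


def read_unprivileged_bpf(path=SYSCTL_PATH):
    """Return the sysctl value as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None
```

Calling `describe_unprivileged_bpf(read_unprivileged_bpf())` on a given machine reports which of these states it is in. Root, or a process holding the relevant capabilities, can still load programs in every case.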

eBPF has the potential for file-less malware to run hidden from detection, and I foresee it being used to tickle the ring -3 (and -4?) CPU-within-a-CPU functions while bypassing local firewalls.

Here is some example code showing what people already know how to do today, and this list will grow as people discover more capabilities. [1][2][3][4][5][6] These do require some privileges to insert, but they will remain running and hidden until reboot. Privilege escalation today is easier than ever with the growing misuse and poor configuration of sudo, as well as the growing number of suid/setcap binaries. A common argument I get is "Well if someone ... then it's game over". They are not entirely wrong, but I do not want yet another file-less anti-forensics vector that risks Linux being forbidden in secure zones, nor do I want to play whack-a-mole with commercial tools like sysdig, or with complex tools that people avoid like SELinux, to try to fight this stuff.

[0] - https://www.kernel.org/doc/html/latest/admin-guide/kernel-pa...

[1] - https://github.com/citronneur/pamspy

[2] - https://github.com/h3xduck/TripleCross

[3] - https://github.com/krisnova/boopkit

[4] - https://github.com/pathtofile/bad-bpf

[5] - https://doublepulsar.com/bpfdoor-an-active-chinese-global-su...

[6] - https://blog.doyensec.com/2022/10/11/ebpf-bypass-security-mo...

> eBPF has the potential for file-less malware to run hidden from detection

So do ROP chains. There's a lot of fluff in the industry about "file-less" this and "LOL-bin" that, but really this isn't anything new. The fact that people are still writing post-exploitation stages that write persistent files to disk comes down to sheer laziness, or to attackers not being forced to do anything different. Forcing attackers to write ROP compilers isn't really a worthwhile goal anyway, because as an adversary you only ever have to do that once (and it would take, what, a single person a few weeks?), whereas getting to the point where file detection is that good would take a massive industry-wide effort and billions of dollars.

eBPF-for-evil is not a worthwhile concern because it's not meaningfully different or more powerful than the same old techniques we've all known about since the 90s.

While I do agree generally that we're still experimenting with the guardrails around eBPF, it seems like the more fundamental problem of the "ease of privilege escalation" is what we should be focused on?

But I can see how eBPF itself might be considered too powerful to wait for that to be solved.

I'm honestly more worried about Linux desktop/small startups that can't afford expensive commercial guidance and are likely to not understand/care about the implications of eBPF until it's too late.

'bpftool' provides a list of loaded BPF programs, but playing around with it just now, it is not obvious to me how to get the details of an actual loaded program (e.g., the pre-JIT version of the loaded program [or even the JIT version]). The man page is pretty terse.

'bpftool prog' / 'bpftool prog list' / 'bpftool prog show' all produced the same output (sample below), and adding the index from the list to any of the above commands is an error.


  # bpftool prog
  8: cgroup_device  name sd_devices  tag 1f97e470ec084ee5  gpl
        loaded_at 2022-11-02T08:58:10-0700  uid 0
        xlated 464B  jited 292B  memlock 4096B
  173: cgroup_device  name sd_devices  tag c7286db13d0052fa  gpl
        loaded_at 2022-11-05T23:53:24-0700  uid 0
        xlated 464B  jited 292B  memlock 4096B
  174: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2022-11-05T23:53:24-0700  uid 0
        xlated 64B  jited 58B  memlock 4096B

Does anyone know if it is possible, and how to get the details on the loaded bpf programs?


Loading a bpf program to snoop on mount/umount events, I get:

  189: kprobe  name syscall__mount  tag f618bcdbd5252ac9  gpl
        loaded_at 2022-11-06T10:45:24-0800  uid 0
        xlated 3440B  jited 2478B  memlock 4096B  map_ids 1
        btf_id 200
  190: kprobe  name do_ret_sys_mount  tag 6960d7da9f367709  gpl
        loaded_at 2022-11-06T10:45:24-0800  uid 0
        xlated 600B  jited 442B  memlock 4096B  map_ids 1
        btf_id 200
  191: kprobe  name syscall__umount  tag d332813cc5f79072  gpl
        loaded_at 2022-11-06T10:45:24-0800  uid 0
        xlated 1728B  jited 1223B  memlock 4096B  map_ids 1
        btf_id 200
  192: kprobe  name do_ret_sys_umount  tag a77da47eaebd04a3  gpl
        loaded_at 2022-11-06T10:45:25-0800  uid 0
        xlated 600B  jited 442B  memlock 4096B  map_ids 1
        btf_id 200

So, at least, even without any more details, it is possible to see that something is monitoring mount/unmount. But I am still curious about my original question.

You can see the translated (pre-JIT) instructions, the ones the interpreter would run, through bpftool:

  bpftool prog dump xlated id 173

To see the JITed instructions:

  bpftool prog dump jited id 173

For the translated insns, you can also see the instructions in different forms, which is pretty neat. For example, you can get a control flow graph in DOT format with `visual` specified at the end:

  bpftool prog dump xlated id 173 visual

I learned about these commands through this blog post: https://qmonnet.github.io/whirl-offload/2021/09/23/bpftool-f...
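To go from the listing to per-program dumps in a script, the plain-text `bpftool prog` output pasted earlier in the thread can be turned into records first. Below is a rough Python sketch that parses only the layout visible in those samples (an `id: type  name ...  tag ...` header followed by indented `key value` pairs). The function name is made up, and the exact layout may differ between bpftool versions.

```python
import re

# Header line, e.g. "8: cgroup_device  name sd_devices  tag 1f97e470ec084ee5  gpl".
# The name is treated as optional, on the assumption that some programs have none.
HEADER = re.compile(
    r"^\s*(?P<id>\d+):\s+(?P<type>\S+)\s+"
    r"(?:name\s+(?P<name>\S+)\s+)?"
    r"tag\s+(?P<tag>[0-9a-f]+)(?P<gpl>\s+gpl)?\s*$"
)


def parse_prog_listing(text):
    """Parse plain-text `bpftool prog` output into a list of dicts."""
    programs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        match = HEADER.match(line)
        if match:
            programs.append({
                "id": int(match["id"]),
                "type": match["type"],
                "name": match["name"],
                "tag": match["tag"],
                "gpl": match["gpl"] is not None,
            })
        elif programs:
            # Continuation line: fields are separated by two or more spaces,
            # and each field is "key value" (values are kept as strings).
            for field in re.split(r"\s{2,}", line.strip()):
                key, _, value = field.partition(" ")
                programs[-1][key] = value
    return programs


SAMPLE = """\
  8: cgroup_device  name sd_devices  tag 1f97e470ec084ee5  gpl
        loaded_at 2022-11-02T08:58:10-0700  uid 0
        xlated 464B  jited 292B  memlock 4096B
  174: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2022-11-05T23:53:24-0700  uid 0
        xlated 64B  jited 58B  memlock 4096B
"""

for prog in parse_prog_listing(SAMPLE):
    print(prog["id"], prog["type"], prog["name"], prog["xlated"])
```

Each resulting `id` can then be passed to `bpftool prog dump xlated id <id>`. bpftool also has a JSON output mode, which is likely a sturdier input for real tooling than scraping the text.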

Thanks. Visibility seems to be pretty good. Something like AIDE/Tripwire for BPF might be the next step.

This is effectively what Falco (https://falco.org/) is.
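The AIDE/Tripwire idea above boils down to recording a baseline of loaded programs and flagging whatever differs later. A toy Python sketch of that comparison follows. It is not how Falco or AIDE work internally, the record shape (dicts with `type`, `name`, `tag`) and the function name are assumptions, and it keys on the tag instead of the numeric id because ids change whenever a program is reloaded.

```python
def diff_against_baseline(baseline, current):
    """Compare two snapshots of loaded BPF programs.

    Each snapshot is a list of dicts with "type", "name" and "tag" keys
    (a hypothetical record shape). Programs are identified by
    (type, name, tag); the numeric id is ignored on purpose.
    """
    def key(prog):
        return (prog["type"], prog.get("name") or "", prog["tag"])

    base = {key(p) for p in baseline}
    cur = {key(p) for p in current}
    return {"added": sorted(cur - base), "removed": sorted(base - cur)}


# Snapshot values taken from the listings pasted earlier in the thread.
baseline = [
    {"type": "cgroup_device", "name": "sd_devices", "tag": "c7286db13d0052fa"},
    {"type": "cgroup_skb", "name": "sd_fw_egress", "tag": "6deef7357e7b4530"},
]
current = baseline + [
    {"type": "kprobe", "name": "syscall__mount", "tag": "f618bcdbd5252ac9"},
]

print(diff_against_baseline(baseline, current))
```

Here the mount-snooping kprobe shows up under `added`. A check like this only helps if the listing itself can be trusted, which is exactly what a rootkit with enough privileges would try to undermine.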

FWIW, programs compiled through a modern toolchain can ship their own debug data. For example, here is the restrict_filesystems program loaded by systemd:

    $ sudo bpftool prog dump xlated id 50
    int restrict_filesystems(unsigned long long * ctx):
    ; int BPF_PROG(restrict_filesystems, struct file *file, int ret)
       0: (79) r3 = *(u64 *)(r1 +0)
       1: (79) r0 = *(u64 *)(r1 +8)
       2: (b7) r1 = 0
    ; uint32_t *value, *magic_map, zero = 0, *is_allow;
       3: (63) *(u32 *)(r10 -20) = r1
    ; int BPF_PROG(restrict_filesystems, struct file *file, int ret)
       4: (bf) r1 = r0
       5: (67) r1 <<= 32
       6: (77) r1 >>= 32
    ; if (ret != 0)
       7: (55) if r1 != 0x0 goto pc+59
       8: (b7) r1 = 32
       9: (0f) r3 += r1
      10: (bf) r6 = r10

Unfortunately, most of the programs loaded by systemd are more-or-less hand-generated (the ingress/egress programs specifically) and do not include this information.

It's a surprisingly small group of folks who work in this space upstream, but I know that they're aware of this as an opportunity to improve things :)

The bpftool-prog(8) man page documents the bpftool-prog subcommands.

> "Well if someone ... then it's game over". They are not entirely wrong

They're not at all wrong, and your argument has no merit. As a defender, I'd pray that attackers use toolkits like the ones you linked, which are essentially utter trash and trivially detected (as is anything eBPF-based).

Good kernel/userspace rootkits do not use eBPF and can be very, very hard to detect.

Yep, dunno why you've been downvoted. Kernel rootkits are just normal dkms kernel modules. Why would the attacker use these new eBPF rootkits when the old open source ones are far more sophisticated and need similar privileges?

As far as I know, eBPF today is only usable as root - or, if they expanded the permissions for eBPF, can be limited to only be used by root. The same root user can install a kernel module to do everything that eBPF lets you do. Getting to root on a machine is the endgame, at least as far as that machine is concerned.

In 2020 (Linux 5.8), CAP_BPF was added to allow the use of eBPF without full CAP_SYS_ADMIN.

For most programs, you'll use a combination of CAP_BPF, CAP_PERFMON and/or CAP_NET_ADMIN. For some edge cases (such as hardware offload), you'll need the entire CAP_SYS_ADMIN (basically root).

I wrote more about this here: https://mdaverde.com/posts/cap-bpf/
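To see which of these capabilities a process actually holds, one option is to decode the `CapEff` bitmask from /proc/<pid>/status. A minimal Python sketch follows, assuming the capability bit numbers from the kernel's capability header (CAP_NET_ADMIN 12, CAP_SYS_ADMIN 21, CAP_PERFMON 38, CAP_BPF 39); the function names are made up for illustration.

```python
# Bit positions assumed from include/uapi/linux/capability.h.
CAP_BITS = {
    "CAP_NET_ADMIN": 12,
    "CAP_SYS_ADMIN": 21,
    "CAP_PERFMON": 38,
    "CAP_BPF": 39,
}


def decode_bpf_caps(cap_eff_hex):
    """Map a hex CapEff mask to the BPF-relevant capabilities it contains."""
    mask = int(cap_eff_hex, 16)
    return {name: bool((mask >> bit) & 1) for name, bit in CAP_BITS.items()}


def effective_caps_of_self(status_path="/proc/self/status"):
    """Decode CapEff for the current process; returns None if it is not found."""
    with open(status_path) as fh:
        for line in fh:
            if line.startswith("CapEff:"):
                return decode_bpf_caps(line.split()[1])
    return None
```

On a Linux box, `effective_caps_of_self()` run as an ordinary user would be expected to report all four as absent, while a process started with only CAP_BPF and CAP_PERFMON would show just those two.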

Interesting. That's basically read-only root permission, which should probably be handed out sparingly, but it makes sense for isolation purposes to separate out your BPF programs from root.

An eBPF-enabled kernel makes a great jumping-off point for any exploit chain.

As others have mentioned, eBPF is quite neat software, but its observability in and of itself is quite difficult. It's hard to understand WHAT eBPF programs are loaded and what they're doing. Supposedly Android has a dozen or two eBPF programs running at any time. Is Ubuntu on my laptop running a similar batch? I have no clue, and many of us here probably wouldn't know where to look either without some Googling.

Fedora 36 comes with a bunch of BPF programs loaded (you can see their log messages in journalctl), but I have been unable to find any documentation about exactly what they are. Using `bpftool prog`, mentioned in another comment in this thread, I can see the names, but I still have no idea what they're doing.

In the near term with a recent-enough BTF-enabled kernel, bpftool should be enough for an "ambitious" user to understand what objects are being run on a system.

Unfortunately, "ambitious" here means understanding not just what eBPF is but also the significance of each hook, what effect the BPF program has (most eBPF programs are GPL-licensed), and which processes have access to these objects.

This is not easy, especially considering that the shape of eBPF changes with each kernel release.

What is the advantage of eBPF hooks over the ptrace system call? Can't I do most of the same stuff with it?

Tracing is just one area where eBPF can be used. However, there are a lot of other applications.

E.g. you can do a lot of low level networking (forward packets to another host before they go into the kernel for building a router or load balancer, drop packets for a firewall) using the XDP eBPF hooks. You can have some on-host TCP policies - e.g. which thread handles a new incoming connection - expressed as eBPF hooks. And there's more.

Safety. Kernel level targeting rather than process.

eBPF is also a lot more efficient than ptrace.
