Finding people's use of /usr/bin/python with the Linux audit framework (utcc.utoronto.ca)
24 points by ingve on Jan 6, 2023 | 11 comments



At my previous job, I wrote a simple bpftrace script [1] to track Python2 invocations. The output of this program was sent to a message bus and eventually stored in a columnar database for exploration.

While Python2 got removed from the base operating system image, we wanted to uncover weird edge cases of engineers running special builds. We found more than one :)

I think BPF is underused for safe deprecations. It's super useful to prove that an executable is not loaded, or that a method is not called while being traced :)

[1]: https://github.com/javierhonduco/bpf-playground/blob/master/...


Fascinating. Why was a message bus necessary and what message bus and database (presumably open source) solutions did you use?

Thanks for sharing the use case. Super clever solution.


The script was running on a subset of the production machines, which still accounted for a rather large number. The message bus eventually transported bpftrace's output to the database.

The rollout was done in phases to ensure that the overhead would not be prohibitive and that it wouldn't cause egregious priority inversions.

Scribe [0] + LogDevice [1] + Scuba [2]. I believe Scribe was open sourced a very long time ago, and LogDevice more recently, but they are both archived now.

- [0]: https://engineering.fb.com/2019/10/07/data-infrastructure/sc...

- [1]: https://logdevice.io/

- [2]: https://research.facebook.com/publications/scuba-diving-into...


Very interesting indeed!

Any idea why those projects (Scribe and LogDevice) were mothballed? Did Facebook simply lose interest in maintaining them, or did other OSS projects enter the mix that proved to be a better fit? Thanks again for sharing.


I’m not sure there’s any public write-up on this, sorry


Why not just use the alternatives(8) to create a wrapper script to log use?
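
For illustration, here is a rough sketch of what such a logging wrapper could look like. The interpreter path, the log path, and every function name are assumptions made up for this example, not anything from the blog post. The idea is only to record who called the wrapper and then hand control to the real interpreter.

```python
#!/usr/bin/env python3
"""Hypothetical logging wrapper for /usr/bin/python (a sketch, not a tested deployment)."""
import json
import os
import sys
import time

# Assumed locations; both are placeholders chosen for this example.
REAL_INTERPRETER = "/usr/bin/python2.7"
LOG_PATH = "/var/log/python-usage.log"


def build_record(argv, now=None):
    """Return a dict describing one invocation of the wrapper."""
    return {
        "time": time.time() if now is None else now,
        "uid": os.getuid(),
        "ppid": os.getppid(),
        "cwd": os.getcwd(),
        "argv": list(argv),
    }


def log_invocation(argv, log_path=LOG_PATH):
    """Append one JSON line; a logging failure must never break the caller."""
    try:
        with open(log_path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(build_record(argv)) + "\n")
    except OSError:
        pass


def main():
    log_invocation(sys.argv)
    # Replace this process with the real interpreter, keeping argv[0] as invoked.
    os.execv(REAL_INTERPRETER, [sys.argv[0]] + sys.argv[1:])


# A deployed wrapper would end by calling main(); it is left uncalled here
# so the sketch can be imported and inspected without exec'ing anything.
```

Parent PID and working directory are logged because they are usually what you need to track down the script or cron job that still calls the old name.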


Because /usr/bin/python comes from the python-is-python2 package (or the python-is-python3 package), and it's a plain symlink, not something managed by update-alternatives. So the correct thing to use for a wrapper script would probably be dpkg-divert, instead of alternatives.
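
One way to check which case a given machine is in is to look at where the symlink points. The sketch below assumes the usual Debian layout, where links managed by update-alternatives point into /etc/alternatives. The function name and return labels are invented for this example.

```python
import os

# Default directory used by update-alternatives on Debian-style systems (assumed here).
ALTERNATIVES_DIR = "/etc/alternatives"


def classify_link(path):
    """Return 'missing', 'not-a-symlink', 'alternatives', or 'plain-symlink'."""
    if not os.path.lexists(path):
        return "missing"
    if not os.path.islink(path):
        return "not-a-symlink"
    target = os.readlink(path)
    # Resolve a relative target against the directory that holds the link.
    absolute = os.path.normpath(os.path.join(os.path.dirname(path), target))
    if os.path.dirname(absolute) == ALTERNATIVES_DIR:
        return "alternatives"
    return "plain-symlink"


if __name__ == "__main__":
    print(classify_link("/usr/bin/python"))
```

If this reports a plain symlink, update-alternatives does not manage the name on that machine.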


That is an advantage, not a reason it can't be used.

Yes, alternatives(8) points a symbolic link to a name in the alternatives directory, which means that when your custom package's link to /usr/bin/python is overwritten, it is trivial to restore.

That is a lot less fragile and resource-intensive than using the audit system, which the blog post clearly mentions is a bit crusty.

AppArmor is another method that is a bit less expensive if it is already in use, but it obviously doesn't work for containers.


To expand on the point, /usr still has to support the shared NFS mount model that has mostly been irrelevant for 20 years. The old Perl alternatives implementation was partially written to deal with that case.

I am pretty sure python2-minimal still triggers update-alternatives to support that shared filesystem model.

Putting the named links in /etc allows for local configuration and keeps the link graph a DAG.


> Yes, alternatives(8) points a symbolic link to a name in the alternatives directory, which means that when your custom package's link to /usr/bin/python is overwritten, it is trivial to restore.

If you use dpkg-divert, it won't be overwritten in the first place. If you install any package which contains a file or symbolic link with that name, that file will instead be written to the name you specified in your dpkg-divert command. It's the correct way (in Debian-derived distributions) to locally override a file or symbolic link which comes from a package which does not use the alternatives mechanism (which is the case for the python-is-python2 and python-is-python3 packages; AFAIK, they contain a plain symbolic link, instead of using update-alternatives).
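
As a concrete illustration, the sketch below only builds and prints the dpkg-divert command lines; it does not run them. The helper names are invented for this example, and the `.distrib` suffix is just the conventional name for the diverted copy. The flags used (`--local`, `--rename`, `--divert`, `--add`, `--remove`) are standard dpkg-divert options.

```python
import shlex


def divert_command(path, suffix=".distrib"):
    """Build the argv that diverts any packaged `path` to `path + suffix`."""
    return [
        "dpkg-divert", "--local", "--rename",
        "--divert", path + suffix,
        "--add", path,
    ]


def undivert_command(path):
    """Build the argv that removes the local diversion again."""
    return ["dpkg-divert", "--local", "--rename", "--remove", path]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    print(shlex.join(divert_command("/usr/bin/python")))
    print(shlex.join(undivert_command("/usr/bin/python")))
```

After the diversion is in place, a locally installed wrapper at /usr/bin/python should survive package upgrades, since the packaged symlink lands at the diverted name instead.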


I agree that dpkg-divert is also a good way. It's just that in my experience alternatives is easier for many teams to use, as fpm requires less institutional knowledge.

My point is that, at least in my opinion, using the audit system is the hard way, especially in a container-heavy world and given the system-wide performance impact on open/close system calls.

While there are lots of politics and exceptions with hier(7), those packages should use a link in /usr and not a host-specific link in an /etc subdirectory.

But yes, we mostly agree; my problem was with the audit framework, which is fragile and expensive.



