Usually such PyPI packages with malware are typo-squatting other, well-known packages. They count on people making mistakes in their pip command-lines or requirements.txt or whatever. But "secretslib"? It doesn't ring a bell as a typo for anything. Authors also can't be counting on people installing it organically because the package had no long description of what it supposedly pretended to do. So what was the plan here?
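To make the typo-squatting point concrete, here is a rough sketch of how one might check whether a name sits suspiciously close to a well-known one. The `POPULAR` shortlist and the 0.8 cutoff are assumptions chosen for illustration; a real check would use a much larger list and probably a more careful metric.

```python
import difflib

# Hypothetical shortlist of well-known names; not an authoritative list.
POPULAR = ["requests", "numpy", "pandas", "urllib3", "setuptools", "cryptography"]


def likely_typosquat_targets(name, known=POPULAR, cutoff=0.8):
    """Return well-known names that `name` closely resembles but does not equal."""
    lowered = name.lower()
    candidates = difflib.get_close_matches(lowered, known, n=3, cutoff=cutoff)
    return [c for c in candidates if c != lowered]


if __name__ == "__main__":
    for candidate in ("reqeusts", "secretslib"):
        print(candidate, "->", likely_typosquat_targets(candidate))
```

With this shortlist, a transposition like "reqeusts" is flagged as close to "requests", while "secretslib" does not come close to anything, which matches the observation that it doesn't look like a typo of a popular package.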
I have come across a handful of malicious packages. Based on reading the code, I do not think the authors are very professional; it looked more like script-kiddie quality. Maybe there is no plan. Maybe teens are just fooling around.
If you come across a malicious package you can send a take down request at:
It's a bit ridiculous that setup.py allows you to run custom code on install. Granted, even without this you would run the malware at the latest when you import the module. But I've had all kinds of trouble from packages with overly clever setup.py scripts. The best Python packages are just "xcopyable", meaning self-contained in a directory that you can just drop in the PYTHONPATH. Or if you really need a binary component, include binaries (plus a simple compilation step for other platforms) on PyPI.
> It's a bit ridiculous that setup.py allows you to run custom code on install.
Agreed! The good news is that it's slowly getting better: just yesterday[1], setuptools added support for PEP 660[2], which removes one of the last remaining reasons to use a `setup.py` instead of a fully declarative `pyproject.toml`.
Edit: Also, just to qualify: this is a consistent problem across packaging ecosystems. Even system package managers (reasonably!) allow custom code during installation, e.g. to add service hooks and update locale files.
I actually wrote a Python tool for downloading binary releases from GitHub and Gitea and wrote a “builder” module that can be used in a pyproject file to tell it to download an arbitrary binary and run some commands.
> Perhaps spit a scary sounding security warning for all packages which use a setup.py?
This would be nice, but there are probably too many practical barriers to this. For example, you can't (easily) replace `setup.py`'s dynamism for packages that contain native extensions: they need to be able to spawn compilers, etc. and do so in ways that are hard to describe declaratively.
We probably don't want to provide a security warning on every native non-wheel package, so that's a no-go for the time being.
It should start with a warning, move after one year to users having to confirm that they allow packages to execute setup.py, and ultimately not allow it any more.
This would probably break too much of the ecosystem, since many top Python packages include native extensions that can only be built with a dynamic `setup.py`.
Wheels are solving that problem, but it's likely going to be many years before warning about `setup.py` is reasonable and not disruptive to the majority of legitimate users.
Could we perhaps sandbox any such compilation steps within a container and then copy out the resulting artifact(s), with a hook that does some security scanning?
This is hard to do in the general case: the compilation steps often need system dependencies or to otherwise query system state to do the right thing, which means that the sandbox has to accurately reflect the host.
In terms of hooks, audit hooks and events[1] get us some of the way there. But they can be circumvented as well; I wouldn't rely on them for untrusted code.
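For anyone who hasn't used audit hooks, a minimal sketch looks like this. The set of "suspicious" events is my own pick for illustration, and as noted above, this is observation only, not a sandbox: code running in the same process can work around it.

```python
import subprocess
import sys

# Event names taken from the audit events table in the Python docs;
# which ones count as "suspicious" is an assumption for this sketch.
SUSPICIOUS = {"os.system", "subprocess.Popen", "socket.connect"}
seen = []


def record_suspicious(event, args):
    # The interpreter calls this for every audit event, so keep it cheap.
    if event in SUSPICIOUS:
        seen.append(event)


# Hooks added this way cannot be removed again for the life of the process.
sys.addaudithook(record_suspicious)

# Spawning a child process raises the "subprocess.Popen" event.
subprocess.run([sys.executable, "-c", "pass"], check=True)
print(seen)
```

This needs Python 3.8 or newer, where `sys.addaudithook` was introduced.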
I published my first Python code in a while using a pyproject.toml and it was amazingly simple. As someone who was doing Rust for a while I really feel like they're learning from the modern ecosystems very well.
Follow the setuptools doc [1], it's very easy and there's an example you can start from. It links to PEP 621 [2] which seems to be the specification if that can help.
> It's a bit ridiculous that setup.py allows you to run custom code on install.
In Rust it's worse: you can run custom code at compile time. But IMO the issue is that we are optimizing everything for convenience at the cost of security. The compile/install custom code wouldn't be a problem if each dependency were pinned and audited.
PyPI requiring two-factor authentication on "popular" projects will help, but this will not replace the need for auditing, vetting and pinning.
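As a small illustration of the pinning half of that, here is a sketch that flags requirement lines not pinned with `==`. The regex is a simplification I made up for this example and does not cover the full requirement syntax (URLs, environment markers, and so on); the package names and versions in the sample are placeholders.

```python
import re

# Simplified pattern: name, optional [extras], then an exact '==' pin.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned to an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as -r or --hash
        if not PINNED.match(line):
            problems.append(line)
    return problems


if __name__ == "__main__":
    sample = "requests==2.31.0\nnumpy>=1.20\nflask  # web framework\n"
    print(unpinned_requirements(sample))
```

Pinning alone only fixes the version; pip's `--require-hashes` mode goes a step further by refusing anything whose hash is not listed in the requirements file. Neither replaces actually reading the code.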
I'm convinced that, while it's bad policy, there's no real effect - the first thing you do upon installing a new dependency is import it, and importing it... Runs arbitrary code.
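A harmless way to see this for yourself: the sketch below writes a throwaway module whose top-level code creates a marker file, then imports it without calling anything. The module name `demo_pkg_hypothetical` is made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_side_effect_demo():
    """Import a throwaway module and return what its top-level code wrote."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / "marker.txt"
        # The module body writes a file; no function in it is ever called.
        Path(tmp, "demo_pkg_hypothetical.py").write_text(
            "from pathlib import Path\n"
            f"Path({str(marker)!r}).write_text('ran at import')\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_pkg_hypothetical")
            return marker.read_text()
        finally:
            # Clean up so the demo leaves no trace in the import system.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg_hypothetical", None)


if __name__ == "__main__":
    print(import_side_effect_demo())
```

The marker file exists as soon as the import returns, which is the whole point: blocking install-time code only moves the moment of execution.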
Yes, compiled wheels really are the way to go, but anything.py is going to be able to run custom code; that's the point. And like you said, the code will likely be run on the machine anyway. Minor rant, but I think it's much worse that NPM installs can run custom code: since many (most?) npm modules are only run in a browser, install-time code actually does open up more of a security risk.
Many modules, especially a lot of specialized modules that are not available on PyPI and do not provide wheels, have complex installation steps, including compiling C libraries etc. There is nothing fundamentally wrong with this kind of use case.
Does anyone know what the motivation for mining Monero is? Maybe I'm missing some massive use of it, but I see Bitcoin as having been the original, Ethereum for enabling things like NFTs, and Dogecoin for comedy purposes.
But I've only heard of Monero being used in scams like this, so I'm curious as to if there are legitimate uses of it that allow it to have value.
You list Dogecoin for comedy purposes as a valid reason yet ignore the practical benefits of greater anonymity with Monero.
I'm no coin expert, but if I were to hypothetically buy some illegal psychedelic drops off the dark web, I'd gravitate towards Monero or a Monero-like coin over Dogecoin. There are plenty of scammy coins, but there are also ones that fill a niche.
IIRC it's because GPU mining Monero is a difficult problem and existing GPU miners aren't very efficient, so CPU mining Monero produces more revenue than CPU mining other coins.
> What an analyst might miss though is that the seemingly-innocuous 'tox' covertly drops another ELF file directly in memory—a sign commonly associated with "fileless malware."
Yeah I agree, but since the whole essay is pretty much a sales pitch for their product, that's to be expected.
Also, first-stage droppers are almost always heavily obfuscated to avoid detection. There isn't anything novel about this, and the second-stage malware actually has reasonable detection rates.
I think it's good to remind people that malware happens outside of Windows. And yeah, it is a single temporary file, and it apparently only works on Debian-based systems and maybe other systems that have wget and cpulimit pre-installed. Doesn't seem like they really tried too hard.
edit:
They could have written the malware in Python and, after decoding it, run it with eval; done right, it would even be cross-platform. Now I'm wondering how well I could hide something like this. I gotta stop before I get swept up and start adding easter eggs.
> and maybe other systems that have wget and cpulimit pre-installed
Nope. It's `apt ... && ...`, and as the left-hand side of `&&` would fail on systems without APT, the right-hand side won't ever run, so it wouldn't work because it wouldn't ever download that `tox` binary.
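If the `&&` behaviour isn't obvious, this sketch shows it from Python by running a two-step chain through `sh`. It assumes a POSIX `sh` is on the PATH, and the command name `no-such-command-hypothetical` is deliberately nonexistent to stand in for APT being missing.

```python
import subprocess


def run_chain(first_cmd):
    """Run `first_cmd && echo second-step-ran` in sh and return its stdout."""
    result = subprocess.run(
        ["sh", "-c", f"{first_cmd} && echo second-step-ran"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Left-hand side succeeds, so the right-hand side runs.
    print(repr(run_chain("true")))
    # Left-hand side fails (command not found), so the right-hand side is skipped.
    print(repr(run_chain("no-such-command-hypothetical")))
```

The second call prints an empty string: once the left-hand command fails, the shell never reaches whatever follows `&&`.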
Packj https://github.com/ossillate-inc/packj can easily detect such malicious packages by pointing out if a package accesses sensitive files (e.g., SSH keys), spawns executables, exfiltrates data, is abandoned, lacks 2FA, etc. Disclaimer: I built it.
This is great, is that a good area to move into? Built a few toy projs along these lines on the binary analysis side but never looked further up the chain. How does it handle dynamic analysis?
Thanks! Would love to receive code contributions from rare folks like you who have binary analysis expertise. Currently, it uses strace to track process/net/fs system calls on Linux.
You could maybe reasonably call this “fileless” if it was calling memfd from Python. But even then - memfd does leave the file descriptor dangling and accessible in /proc/, so it’s not much better than just dropping your file in /dev/shm.
In a Windows malware context “fileless” means something entirely different from this.
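To illustrate the memfd point, here is a small Linux-only sketch that creates an anonymous in-memory file with harmless contents and shows how it appears under /proc. The name "demo" and the placeholder bytes are arbitrary; `os.memfd_create` requires Python 3.8+ on Linux, and the function simply returns `None` elsewhere.

```python
import os


def memfd_visible_path(name="demo"):
    """Create an in-memory file and return how /proc shows it, or None if unsupported."""
    if not hasattr(os, "memfd_create"):
        return None  # not Linux, or Python older than 3.8
    fd = os.memfd_create(name)
    try:
        os.write(fd, b"harmless placeholder bytes")
        # The descriptor is listed under /proc like any other open file.
        return os.readlink(f"/proc/self/fd/{fd}")
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(memfd_visible_path())
```

On Linux the link target starts with `/memfd:` followed by the chosen name, so anything inspecting /proc for the process can see (and read) it while the descriptor is open.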
That everything is a file in Linux does not take away from the fact that this trojan only exists in volatile memory. It’s a fileless trojan for all intents and purposes.
The Linux equivalent of fileless malware on Windows would be a Python script you pipe straight into the interpreter. The whole purpose of “fileless” malware is to not get caught by AV, which will alert on new files being created and executed.
If this malware was piggybacking on the actual Python process to do its thing, then maybe you could reasonably call it “fileless”.
Uploading Linux malware to VirusTotal mostly results in it being scanned by Windows AVs. Many will have basic signature detection for common Linux malware, but heuristic detection is essentially nonexistent.
That blinking is the notorious Drift “chat bot”. One of our marketing people convinced the CEO it would generate tons of leads, while everyone else objected to how annoying the whole thing was, especially given the very high costs.
Six months later: only four visitors ever chatted with the bot, zero leads generated, marketing person no longer with company. Drift bot code ripped out after 180 days, but we’re still paying the contract the idiot from marketing signed.
Tip for startups: annoying people is not a viable business model.