A lot of exploits are two-stage. Stage one is usually the vulnerability, usually written in C given the low-level and tightly controlled instructions required. The exploit breaks security to run an executable or otherwise gain control. Stage two is usually downloading a python executable to grab the goods.
There's nothing especially sinister about the selection of Python for this case over other interpreted languages. Malware authors are just regular developers - they don't want to spend hours trying to hack together a C binary to dump a database when six lines of Python will do it. Python just runs on a lot of platforms, has a lot of mature drop-in libraries, and decent documentation. They use it for the same reason we use it.
The article just makes it sound like malware developers are using modern packaging tools to turn that two-stage exploit into a single stage. That doesn't strike me as particularly surprising. Teams tend to gravitate towards specializing in one tool when they can. I'd obviously prefer to write a bunch of Python than do the same in C, when performance isn't a huge concern (It's the other guy's CPU, after all).
Just seems like a minor observation, rather than some doom trend.
> A lot of exploits are two-stage. Stage one is usually the vulnerability, usually written in C given the low-level and tightly controlled instructions required. The exploit breaks security to run an executable or otherwise gain control. Stage two is usually downloading a python executable to grab the goods.
This seems like a gross oversimplification and is commonly incorrect. Oftentimes a "stage one" exploit to gain initial access would be network code written in a high-level language such as Python or Ruby (see Metasploit). And an executable payload to interact with the system would generally be written in a compiled language like C or C++. My article details the uncommon rise, over the past ~5 years, of interpreted languages (especially Python) being used for malware dropped on an endpoint in an attack.
> Just seems like a minor observation, rather than some doom trend.
I wouldn't say this is a minor observation or a "doom trend." I'd say it's a very interesting and insightful observation that is worth keeping an eye on. Malicious actors are no longer operating in a world of slow endpoints and lack of resources. They instead are operating in a world of high-speed internet, very fast endpoints, and have a rich ecosystem of open-source tools at their disposal.
I find it highly interesting that malicious code written in interpreted languages, bundled with its interpreter into an executable, has been finding its way into the arsenal of high-tier threat actors over the past few years. Just as the web browser is slowly eating away at the operating system, interpreted languages are slowly eating away at compiled languages in a variety of domains, including malware.
It used to be that malware authors (virus writers in particular) were characteristically more "hardcore" than the average developer, as in preferring native code (even handwritten Asm) and clever optimisations to make their software smaller and more "tricky", for lack of a better term. But that was when it was as a whole not as commercialised, so it's not so surprising to see that aesthetic disappear with increasing commercialisation.
> It's the other guy's CPU, after all
...and that might be why malware was initially more optimised than average; it spreads more easily when it's tiny and fast, doing its thing without being noticed, than if it causes a noticeable increase in system load that will prompt further investigation and lead to its discovery.
I wonder when we'll see Electron being used for malware...
A subgroup of them still operates like that, but I feel like "it used to be" might be a bit outdated. It doesn't seem new for malware authors to pick low-hanging fruit, from languages to infrastructure. We've had VBA macros that are or spread malware for decades now. In the early 2000s it was a pretty regular sight to see low-effort payloads written in some high-level language and using some random IRC server as a C&C, for example. Not everything out there is some state-actor-level APT nightmare. With more developers in every part of the market, and even more users who simply don't care enough, seeing stuff like this more often seems like a normal development.
It's with Josh Pitts, author of this tool and another payload that caused lots of Go projects to be eaten by Kaspersky
More than that. Easy interop with dlls/shared libs via ctypes
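To make the ctypes point concrete, here is a minimal and deliberately harmless sketch of calling a function in a shared library from Python. It assumes a POSIX system where the C library can be located; the helper names `load_c_library` and `c_strlen` are made up for illustration.

```python
import ctypes
import ctypes.util


def load_c_library() -> ctypes.CDLL:
    """Load the C library (assumption: a POSIX system)."""
    # find_library may return None on some minimal systems; on POSIX,
    # CDLL(None) then exposes the symbols already loaded in this process.
    name = ctypes.util.find_library("c")
    return ctypes.CDLL(name)


libc = load_c_library()

# Declare the C signature so ctypes converts arguments and results correctly:
#   size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Return the length of a NUL-terminated byte string using libc's strlen."""
    return libc.strlen(data)


print(c_strlen(b"hello"))
```

The same pattern (load a library, declare `argtypes` and `restype`, then call) is what makes native interop cheap from Python, with no compiler involved.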
Shameless plug: I wrote a few popular articles on 0x00sec about Python malware on Windows, just to show how simple and easy it is to build, either using ctypes to call WinAPI functions or using the pywin32 wrapper, which makes the whole thing a lot faster.
See part 1 here https://0x00sec.org/t/malware-writing-python-malware-part-1/...
Definitely not the way to go if you have limited memory and need to write tiny shellcode, but it’s good enough for a stage 2 payload.
> Packaging with PyInstaller to create a single (but large) executable is easy and helps avoiding detection as the interpreter is embedded in the PE
If you look further down in the article, it explores detecting PyInstaller-generated executables using simple YARA rules. So, I'd disagree a bit there. I personally think that Nuitka (talked about in the article) in conjunction with a packer would be the best compilation method to use in order to evade detection. It's actually quite surprising to me that so few malware samples using Nuitka have been seen in the wild, but as the title of the article states, it's on the rise.
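For the detection side, a rough Python equivalent of a string-matching rule might look like the sketch below. This is not the article's YARA rule. The marker is the magic value that, as far as I know, PyInstaller writes into its embedded archive cookie, so treat it as an assumption, and the function names are made up for illustration.

```python
# Assumption: PyInstaller's embedded archive cookie starts with this magic.
PYINSTALLER_COOKIE = b"MEI\x0c\x0b\x0a\x0b\x0e"


def looks_like_pyinstaller(data: bytes) -> bool:
    """Return True if the raw bytes contain the assumed PyInstaller marker."""
    return PYINSTALLER_COOKIE in data


def scan_file(path: str) -> bool:
    """Read a file from disk and apply the same byte search to it."""
    with open(path, "rb") as handle:
        return looks_like_pyinstaller(handle.read())


# Synthetic inputs only, so that no real sample is needed.
bundled = b"\x00" * 16 + PYINSTALLER_COOKIE + b"\x00" * 16
plain = b"MZ" + b"\x00" * 32
print(looks_like_pyinstaller(bundled), looks_like_pyinstaller(plain))
```

A plain byte search like this is cheap but brittle, so in practice it would be one signal among several rather than a verdict on its own.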
As for Nuitka, I was not able to make it work, but I will try again. The alternative I tried in the past was using Cython to generate C code and then compiling it, but because it requires packaging the Python standard library DLLs it was too much trouble, and I ran into crashes at runtime.
I also had bad experiences with packers such as ASProtect, because they tend to trigger AV detection just for being packers. Python malware is definitely a topic that deserves a more in-depth dive.
Which can be easily patched out with a simple sed rule as it just uses a text search of the binary.
What I'd want to learn more about is whether or not these Python samples tend to be very large (in terms of actual code, and not just language internals/pyinstaller/boilerplate). I expected the real life samples to be smaller than some of the larger botnets and the like written in these compiled languages, but some of the ones you go in depth on are somewhat surprising.
In short, they shipped a Python interpreter that understood RC4-encrypted pyc/opcode files.
A much more modern, and statistically far less common approach (say, top 15% of malware), wants to bring less to the system. Instead, they leverage existing mechanisms for execution - some of this is covered in LOLBAS (Living Off the Land Binaries and Scripts).
Interpreters such as PowerShell, bash, Python, Ruby, even Perl, are used by attackers to run their payloads.
There are a number of advantages. For one thing, if you're monitoring the system for new binary executions, it'll appear as just a Python interpreter - often quite normal on a number of systems. You also don't need to set any sort of execution rights, or drop executable files - just a regular, plain old Python file.
But the downside is that now you're using the system's interpreter, and you have to follow the interpreter's rules. PowerShell really kicked this approach off since it was a favorite of malware authors, and Python followed suit with this. As a much newer implementation, with 3.8 being an extremely recent release, it's not so surprising that there are bypasses. Still, you'd be surprised how few attackers will take the time to do so (and how few orgs will monitor their Python interpreters anyways).
In fact, I even implemented the variant of audit hooks that informed PEP-551 and deployed those to production for a major platform.
I think that if you were to use audit hooks, you might want to combine them with a code signing mechanism. I worked on two of these for different platforms, and they add another strong layer of defence.
In light of that, the bypass in the article seems somewhat contrived, especially considering that there are at least three alternate techniques you could employ to accomplish the same goal that would be meaningfully more innocuous than using `ctypes.windll`. If you're going to use `ctypes` or `pywin32` or the like, then you might as well write a C-extension module to patch out the audit hooks directly (as Batuhan Taşkaya shows using a simple trampoline toy library I wrote, `libhook`: https://speakerdeck.com/isidentical/hack-the-cpython?slide=3...).
1. Joe Jevnik and my `tuple_setitem` which uses poisoned bytecode and the lack of bounds checks on `LOAD_FAST`/`GETLOCAL`. (Pure Python.)
2. My `tuple_setitem` using `numpy` raw memory access via `numpy.lib.stride_tricks.as_strided`. (Requires only `numpy`.)
3. Using the `/proc` filesystem on Linux, which gives arbitrary intraprocess read/write access (independent of page permissions.) (Requires only `open(..., 'w')`.)
There are also a couple of techniques you could employ to carry one of these payloads past code signing, some of which are very well known, like the insecurity of `pickle` deserialisation, and some of which are… less well known.
(I have also prototyped using the above exploits to "lift" C code into a Python interpreter, in case there are OS-level defences around `dlopen`.)
However, even taking these into account, I'm a big believer in the value of the Python 3.8 audit hooks from PEP-551, but they are a technique that requires quite a bit of extra work to employ effectively.
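For readers who haven't seen audit hooks, here is a minimal sketch using `sys.addaudithook` (available since Python 3.8). The set of watched events and the names `WATCHED_EVENTS`, `observed`, `audit_hook` and `demo` are illustrative choices on my part, not a recommended policy. A real deployment would log somewhere tamper-resistant and, as discussed above, would still have to account for bypasses.

```python
import sys

# Illustrative selection of event names; the runtime raises many more.
WATCHED_EVENTS = {"exec", "compile", "ctypes.dlopen", "subprocess.Popen"}

# In-memory log for demonstration only.
observed = []


def audit_hook(event, args):
    """Record the name of each watched audit event raised by the runtime."""
    if event in WATCHED_EVENTS:
        observed.append(event)


# Note: once installed, a hook cannot be removed for the life of the process.
sys.addaudithook(audit_hook)


def demo():
    """Run a harmless eval() and return the events it triggered."""
    before = len(observed)
    eval("1 + 1")
    return observed[before:]


print(demo())
```

Because hooks run inside the same process as the code they watch, they are best treated as one layer among several, for example alongside code signing.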
If you're interested in trying to implement audit hooks and these other mechanisms for locking down your execution environment (e.g., you want to mitigate exploits in Python systems, which may run as PID 1 in containers, where these exploits may try to bring in malware that could exfiltrate data…) please feel free to reach out to me by e-mail or Twitter.
I would be happy to share more with any organisations that are large enough to consider locking things down at this level.
The section on eval was a little more interesting but still nothing special.
Personally, and this is just my probably uninformed opinion, the biggest thing about Python that makes it useful for malware is its huge, mostly uncurated repository of libraries and add-ons that are easy to install and use without ever looking at them. This aspect of Python seems likely to be the most appealing for would-be malware writers: the ease of making malicious code widely available without a lot of scrutiny.
This is talking about malware payloads themselves. I don't agree that those capabilities (taking screenshots, especially eval) are trivial in other languages. Eval in particular makes things trivial since you can basically do:
`eval(get_payload())`, which is awesome from a staging perspective - the trend in malware is to modularize more and more for a number of reasons (less code to scan for signatures, new monetization strategies, easier to update, etc).
So having the ability to do runtime, reflective module loading to trivially get a capability like screenshotting is pretty huge.
That being said, the security of PyPI and Python packaging in general is certainly another interesting topic. I like to think that so far it hasn't been as bad as npm, but there have been backdoored packages put out onto the internet. It's bound to happen with any public software repo, and with any project that trusts outside contributors without perfect review.
1. Packages tend to be smaller, and the transitive dependency trees of projects correspondingly larger. This means there are more single points of failure.
2. More people are using it.
Python, and for that matter most language package ecosystems, have the same problems as JS, but many of them have gotten away with it for a bit longer due to (lack of) scale.
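To illustrate the point about transitive dependency trees, here is a small sketch that walks a dependency graph and counts everything a project ends up trusting. The graph and package names are entirely hypothetical; a real analysis would read this data from a lockfile or the package index.

```python
def transitive_dependencies(graph: dict, root: str) -> set:
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    stack = list(graph.get(root, ()))
    while stack:
        package = stack.pop()
        if package in seen:
            continue
        seen.add(package)
        stack.extend(graph.get(package, ()))
    return seen


# Hypothetical project: two direct dependencies that pull in two more.
example_graph = {
    "app": ["web", "fmt"],
    "web": ["http", "fmt"],
    "http": ["pad"],
    "fmt": ["pad"],
    "pad": [],
}

direct = set(example_graph["app"])
everything = transitive_dependencies(example_graph, "app")
print(len(direct), len(everything))
```

Here `pad` is never listed by the project itself, yet compromising it affects everything above it, which is the "single point of failure" being described.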
As far as the OpenCV library used by PoetRAT goes, you can choose to bundle third-party packages inside your executable with all the executable generators I mentioned at the beginning of the article, like PyInstaller or Nuitka.
I wonder how large the malware payload will be when packaged with OpenCV :)
However, on every notebook that has one, that code will turn on the little LED indicator right by the webcam, so it's not particularly quiet about it.
I do hope that those reading this will use this knowledge to do good.
Furthermore, keeping red teams up to date is also useful, which might be more surprising to those surprised by this post. Knowing obfuscation techniques in particular is plenty useful to developers and cybercriminals alike.
See PupyRAT, a full-on multi-OS admin tool mainly written in Python (2 unfortunately; it's also buggy and outdated). It's a great example. They use a C wrapper around their remote admin tool, which is written in Python. Their (C) loader downloads the provided Python payload from an HTTP link and stores it at a specific memory address that gets executed right after. Because it's in memory, it doesn't touch the disk, unless you are using the Windows payload (which provides multiple options to hide the program using a set of Windows exploits).
Packages are not targeted for now.
Python’s eval() function reminds me almost of Lisp’s eval/apply feature, which is supposedly at the heart of what makes Lisp so special.
I imagined building a program that I could teach to eventually write its own programs. But I figured I would have it output the code to a separate file and run that file instead.
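As a toy version of that "write it to a file, then run the file" idea, here is a minimal sketch using the standard library's `runpy`. The function names and the generated program are made up for illustration, and the "generator" is just a string template rather than anything that learns.

```python
import os
import runpy
import tempfile


def generate_adder(amount: int) -> str:
    """Produce the source of a tiny program (a stand-in for generated code)."""
    return (
        "def add(x):\n"
        f"    return x + {amount}\n"
        "\n"
        "result = add(40)\n"
    )


def write_and_run(source: str) -> dict:
    """Write source to a temporary file, run it, and return its globals."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "generated.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(source)
        # run_path executes the file as a script and returns its namespace.
        return runpy.run_path(path)


namespace = write_and_run(generate_adder(2))
print(namespace["result"])
```

Compared with calling `eval()` on a string, going through a file leaves an artifact you can inspect, diff, and rerun, which is handy while experimenting.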
geared towards someone who has never coded before
Basically, here's SCYTHE's client architecture: https://www.scythe.io/library/under-the-hood-scythe-architec...
And here's how you would load your python to run on the client: https://www.scythe.io/library/software-development-kit
I know for a fact a lot of cybersecurity automation mind share is in Python. Curious to see if this new wave of Python malware will make it into any big cybersecurity vendors. I've performed due diligence on a number of cybersecurity vendors that I wouldn't qualify as having good security posture for stuff like this.