Ugh. This whole blog post is giving me the heebie-jeebies. Replacing all static accesses with dynamic ones, and with endless function calls everywhere has to destroy performance. You're intentionally making it hard for the JIT compiler to do its work.
And for what? Obfuscating your code so that people don't steal it or whatever? This kind of obfuscation can be easily and programmatically reverse engineered if someone really wants to, so... why do it? Just to screw with people trying to look at the source code of a web page?
People complain that JavaScript minifiers and WebAssembly are making the web less open and hackable, but at least those things have a point to them. There's a performance upside! This is just "naah, let's make the web slower, more closed and less hackable, for... you know... reasons."
The title of his blog implies he is "working on browser fingerprinting" and is proud of his occupation. So obviously the work he does is for ad agencies that frankly never gave two fricks about performance.
Unless you are making a browser game and want to make life sad for cheaters, I don't really see a need or reason for obfuscating, and even then it's not ideal and is a byproduct of a lazy solution (i.e. not running the simulation on a server).
I would guess he was or is looking to defeat obfuscation of some bots (see his previous post https://antoinevastel.com/javascript/2019/08/31/sneakers-sup...) and then got distracted/fascinated by the techniques the obfuscators used. I'd be generous and read the post in the satirical style of "10 ways you can screw up the web".
BTW I totally agree with the sentiment that obfuscation is bad - and I would include shipping bundles without sourcemaps. The web got popular because of transparency, and people learning from each other. (If you really have IP-expressed-in-code you want to protect, don't ship it to clients!)
He's a researcher, he might be working on browser fingerprinting so that browsers can avoid it. That's how I interpreted it, at least; the fact that he interned at Brave seems to confirm it.
If I were to be making a JavaScript obfuscator, I would simply start by rewriting the AST so that Exceptions would become the driver behind the code. That way, it would make it really hard to reverse the code without executing it.
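For anyone wondering what exception-driven control flow could look like, here is a minimal hand-written sketch, not the output of any real AST rewriter. The names `Jump`, `blocks` and `run` are made up for illustration, and a real obfuscator would presumably compute the jump targets at runtime instead of leaving readable labels in the source.

```javascript
// Hypothetical sketch: every basic block ends by throwing the label of the next block.
// A dispatcher catches the throw and continues there, so the order of execution is
// decided by exceptions rather than by ordinary calls and loops.
class Jump extends Error {
  constructor(target, state) {
    super("jump");
    this.target = target; // label of the next block to run
    this.state = state;   // values carried from one block to the next
  }
}

// Original logic being hidden:
//   function sumTo(n) { let s = 0; for (let i = 1; i <= n; i++) s += i; return s; }
const blocks = {
  init(st) { throw new Jump("test", { n: st.n, i: 1, s: 0 }); },
  test(st) { throw new Jump(st.i <= st.n ? "body" : "done", st); },
  body(st) { throw new Jump("test", { n: st.n, i: st.i + 1, s: st.s + st.i }); },
  done(st) { return st.s; }, // the only block that returns normally
};

function run(entry, state) {
  let target = entry;
  for (;;) {
    try {
      return blocks[target](state);
    } catch (e) {
      if (!(e instanceof Jump)) throw e; // genuine errors still propagate
      target = e.target;
      state = e.state;
    }
  }
}

function sumTo(n) {
  return run("init", { n });
}

console.log(sumTo(10)); // 55
```

Every loop iteration now costs two throws and two catches, so the performance worry raised elsewhere in this thread applies here as well.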
Also sprinkle some parts of the code that check how much time it takes to execute it and then takes a different code path if it was interrupted.
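A rough sketch of such a timing check, under my own assumptions: the function name `guarded`, the 50 ms budget and the injectable `now` clock are all hypothetical, and the clock is injectable only so the two paths can be shown deterministically.

```javascript
// Hypothetical sketch of a timing check: if a cheap section takes far longer than
// expected (for example because someone paused on a breakpoint), take a decoy path.
function guarded(realPath, decoyPath, { now = () => Date.now(), budgetMs = 50 } = {}) {
  const start = now();
  let acc = 0;
  for (let i = 0; i < 1000; i++) acc += i; // cheap work, normally far below budgetMs
  const elapsed = now() - start;
  return elapsed > budgetMs ? decoyPath(acc) : realPath(acc);
}

// Simulated clocks: each read advances the clock by stepMs.
function fakeClock(stepMs) {
  let t = 0;
  return () => { const v = t; t += stepMs; return v; };
}

const real = (x) => "real:" + x;
const decoy = () => "decoy";

console.log(guarded(real, decoy, { now: fakeClock(1) }));    // real:499500
console.log(guarded(real, decoy, { now: fakeClock(5000) })); // decoy
```

Note that a slow machine, a background tab or a long GC pause would trip the same branch as a debugger, so a check like this trades stability for annoyance.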
What is done here is child's play, the author is clearly not familiar with old-school assembly obfuscation - this code is one script away from being de-obfuscated.
May I assume that quite a few bluescreen errors etc. were the result of madness like this?
"sprinkle some parts of the code that check how much time it takes to execute it and then takes a different code path if it was interrupted"
I mean, I respect it technically, if someone can do this and not disrupt behaviour or performance, but I doubt it is a smart thing to do if stability and performance are the goal. And I believe that should be the goal of any software ...
I had once worked on an executable packed by Themida. It used:
- 8 layers of decrypting the initial executable
- one of them decrypted the import table from the executable
- each of those layers employed several methods of detecting that you're running under debugger
- each of those layers employed methods of causing exceptions in popular debugging software
- every single memory page was also encrypted while running, and a breakpoint was set up whenever a jump was made to it. The protection mechanism would first decrypt the next page and then encrypt the previous page.
Even back then (15 years ago), the more complex option of Themida would generate a unique virtual machine with a unique bytecode for itself for a given executable, which would then execute itself.
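To make the "unique bytecode per executable" idea concrete, here is a toy illustration in JavaScript. It is emphatically not how Themida is implemented; the four-instruction stack machine, `makeBuild` and the opcode numberings are all invented for the example. The only point is that each build gets its own opcode numbering, so bytecode from one build is meaningless to another build's interpreter.

```javascript
// Toy illustration only: a stack machine whose opcode numbering differs per "build".
const OPS = ["push", "add", "mul", "halt"];

function makeBuild(numbering) {
  // numbering[i] is the byte this build uses for OPS[i]
  const encode = Object.fromEntries(OPS.map((op, i) => [op, numbering[i]]));
  const decode = Object.fromEntries(OPS.map((op, i) => [numbering[i], op]));

  // program is an array of [op] or ["push", value]
  function assemble(program) {
    return program.flatMap(([op, arg]) => (op === "push" ? [encode[op], arg] : [encode[op]]));
  }

  function execute(bytecode) {
    const stack = [];
    for (let pc = 0; pc < bytecode.length; pc++) {
      switch (decode[bytecode[pc]]) {
        case "push": stack.push(bytecode[++pc]); break; // operand follows the opcode
        case "add": stack.push(stack.pop() + stack.pop()); break;
        case "mul": stack.push(stack.pop() * stack.pop()); break;
        case "halt": return stack.pop();
        default: throw new Error("unknown opcode " + bytecode[pc]);
      }
    }
    return stack.pop();
  }

  return { assemble, execute };
}

// (2 + 3) * 4
const program = [["push", 2], ["push", 3], ["add"], ["push", 4], ["mul"], ["halt"]];
const buildA = makeBuild([0x10, 0x22, 0x37, 0x4f]);
const buildB = makeBuild([0x9a, 0x03, 0x51, 0xc4]);

console.log(buildA.execute(buildA.assemble(program))); // 20
console.log(buildB.execute(buildB.assemble(program))); // 20
```

Feeding `buildA`'s bytecode to `buildB.execute` throws on the first opcode, which is the property that forces a reverser to recover the numbering for every protected binary separately.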
The current version of Themida is good, and widely used, but not much of an obstacle to experienced people (although it depends on what you're trying to do).
Denuvo is widely used on AAA titles and seems like a pain in the ass to deal with (i.e. games seem to take a while to pirate when protected with it, and it adds a stupid CPU burden at times), but it doesn't have the edge it once had in the battle.
In my (extremely limited) experience with reversing JS, I'm pretty sure I've already seen these obfuscation techniques before, and common deobfuscators of the time had no problem reversing the transformation. It doesn't stop anyone except the most easily discouraged.
(The JS that's used to detect adblockers and/or coerce you into viewing ads is often obfuscated. Those of you who have played around with this stuff may recognise this keyword: DtsBlkVFQx.)
The proposed scheme trashes the performance while providing a primitive protection that is statically observable (e.g. distinguishable) and thus easily reversible.
I would like to see the performance difference between the original and the obfuscated code. Most compiler optimizations are made impossible by removing static access. Plus, writing a reverse-obfuscator is trivial for all those static-to-dynamic and base64-encoding transformations.
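As a rough illustration of how little work the reversal can be, here is a sketch that undoes one specific pattern, `obj[atob("...")]`. It assumes the obfuscator emits literal `atob("...")` calls, which may not match any particular tool; `deobfuscate` and `decodeBase64` are hypothetical names, and a serious deobfuscator would work on the AST rather than with a regex.

```javascript
// Hypothetical sketch: rewrite obj[atob("<base64>")] back to obj.name.
function decodeBase64(s) {
  // atob is a global in browsers and recent Node versions; fall back to Buffer otherwise.
  return typeof atob === "function" ? atob(s) : Buffer.from(s, "base64").toString("binary");
}

function deobfuscate(src) {
  const pattern = /\[\s*atob\(\s*(["'])([A-Za-z0-9+\/=]+)\1\s*\)\s*\]/g;
  return src.replace(pattern, (match, quote, b64) => {
    const name = decodeBase64(b64);
    // Use dot access only when the decoded string is a plain identifier.
    return /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name)
      ? "." + name
      : "[" + JSON.stringify(name) + "]";
  });
}

const obfuscated = 'window[atob("bmF2aWdhdG9y")][atob("dXNlckFnZW50")]';
console.log(deobfuscate(obfuscated)); // window.navigator.userAgent
```

Once the accesses are static again, the engine's usual optimizations for property access become possible again, which is another way of saying the transformation only ever cost the end user.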
Code obfuscation is idiotic and pointless no matter what form it takes, but this example is particularly egregious. This accomplishes little more than degrading performance across the board for the very real end user (bye bye JIT optimizations) while requiring some purely hypothetical reverse engineer to write one additional script before (gasp) reading the code.
I'm genuinely curious how much time and money was wasted on this imbecilic venture.
Obfuscation may seem a weird topic nowadays but only because it is lagging behind conventional data encryption. Once it achieves more fundamental results the perception will likely change.
The next step of this cat and mouse game would be for the Javascript interpreter to detect intentionally obfuscated code and then give the user an option to stop executing the scripts on the page, just like how browsers do if it detects a script is taking too long to execute, with the default option being to stop it. This could be done heuristically from the JIT based on the CFG and the entropy of the symbol table.
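The symbol-table half of that heuristic could start from something as simple as the Shannon entropy of the identifier characters. The sketch below only shows the calculation; `shannonEntropy` and `identifierEntropy` are hypothetical helpers, no real engine is known to me to do this, and any threshold for "looks obfuscated" would have to be tuned on real scripts.

```javascript
// Hypothetical sketch: Shannon entropy (bits per character) of a set of identifier names.
function shannonEntropy(text) {
  const counts = new Map();
  let total = 0;
  for (const ch of text) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
    total++;
  }
  if (total === 0) return 0;
  let bits = 0;
  for (const c of counts.values()) {
    const p = c / total;
    bits -= p * Math.log2(p);
  }
  return bits;
}

function identifierEntropy(names) {
  return shannonEntropy(names.join(""));
}

// Made-up sample symbol tables, for comparison only.
const readable = ["userName", "itemCount", "renderList", "fetchItems"];
const mangled = ["_0x3fa2b1", "_0x9c7e04", "_0x51d8af", "_0xb26c3d"];

console.log(identifierEntropy(readable).toFixed(2));
console.log(identifierEntropy(mangled).toFixed(2));
```

Minified but otherwise honest bundles also have odd-looking symbol tables, so a signal like this would need to be combined with the CFG check to avoid flagging half the web.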
function test(x, y, ...args) {
  let z = x / y;
  let ret = args.reduce((a, v) => (a * v), z);
  console.log(ret);
}
test(1, 2, 3, 4);
are functionally identical. (I used an off the shelf obfuscator)
No amount of obfuscation will stop a determined reverse engineer from pulling apart your code, but it can increase the cognitive load to the point where most people won't bother.
Also, just in case, nobody should ever run untrusted code, particularly not obfuscated untrusted code. Including the examples I posted above.
Absolutely correct. Yet the only point of obfuscating JavaScript I can see is to execute malign code, which implies that it's targeted at people who "won't bother" deobfuscating or even looking at it anyway, which makes obfuscation redundant. And obfuscation won't stop a determined engineer anyway. So, uh, ¯\_(ツ)_/¯.
Thus, it only looks vaguely interesting from an academic point of view. Even then, it's just a matter of flattening and spamming the AST, and you can only get so far with JS.
I think you're quite correct. If you're investing in obfuscation, you're probably making the wrong investment. Malicious actors have to obfuscate to avoid detection for as long as possible, and benign actors would probably be better served focusing on their core tech. That said, obfuscation is very good at increasing the difficulty of interpreting and understanding the purpose of code.
No one serious will analyse obfuscated JS without passing it through a deobfuscator first, so the point is moot. But if you really want to make your code "look confusing", there are more amusing obfuscators for that (also easily deobfuscated):
When I reviewed different obfuscator products, it was on the premise that obfuscation is only a "speedbump" for our threat actors: it imposes a cost that only someone with both the motive and the means to reverse it will pay.
If you would prefer that a summer student at an enterprise customer doesn't replace your product with an in-house work-alike, it's useful. Similarly, it's useful if buying your product is cheaper than spending several hours reversing it.
If your business model relies on the integrity of a secret (key, derivation component, method, etc), it probably has a single, catastrophic failure mode and obfuscation isn't your solution.