Hacker News
JavaScript Obfuscation Techniques by Example (trickster.dev)
105 points by EntICOnc 6 months ago | 70 comments



You want to see obfuscation? Check out FreeSlots.com. Look at view source on one of the slot machines.[1] Can anyone decode this and figure out the odds generator?

[1] view-source:https://www.freeslots.com/slot515.min.js?v=84


I gave it a 10-minute poke just for fun. My main enemy for the first few minutes was the browser telling me I can't do things like eval. Once I got that out of the way with a policy change, the next issue was that the console isn't really built for non-printable characters, which tripped me up. Those two obstacles (both, I'm sure, a pain on purpose rather than by accident) meant that, as expected, I didn't get very far: I was only a couple of iterations into the first method of obfuscation.

I'm not sure how many levels were applied. The first obfuscation method seemed to be an IIFE that calls eval on a big string that has had some transformations applied, which in turn calls eval on the next level, and so on. My general strategy was to take the IIFE, turn it into a function definition stored in a variable like "decodeFunction1" so I could call it directly and recursively, and change the ending to return the string instead of eval'ing it. There is probably something that could be done with debugger breakpoints here, but I don't know enough about how that plays out inside eval. Anyway, if I were evil I'd make sure that at some point in this chain there is a subtle change that breaks this approach, so I wouldn't be surprised if someone told me that was the case :).
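
A minimal sketch of that "return the string instead of eval'ing it" idea, on a toy payload rather than the FreeSlots script: each layer runs with `eval` shadowed by a function that only records its argument. `wrapLayer` and `peelLayers` are made-up names, and the character-shift encoding is just a stand-in for whatever the real script does.

```javascript
// Toy obfuscator (hypothetical): shifts every char code by one and wraps the
// result in an IIFE that decodes it and hands the text to eval.
function wrapLayer(code) {
  const encoded = code.split('').map((ch) => ch.charCodeAt(0) + 1);
  return '(function(d){eval(d.map(function(c){' +
    'return String.fromCharCode(c-1)}).join(""))})(' +
    JSON.stringify(encoded) + ')';
}

// Run each layer with `eval` shadowed by a recorder, so the next layer is
// captured as a string instead of being executed.
// Caveat: everything in a layer other than the eval call still runs for real,
// and code that reaches eval another way (e.g. window.eval) is not caught.
function peelLayers(source, maxLayers = 20) {
  const layers = [source];
  let current = source;
  for (let i = 0; i < maxLayers; i++) {
    let captured = null;
    const recorder = (text) => { captured = String(text); };
    // Function-constructor bodies are sloppy-mode, so a parameter may be
    // named `eval`; calls to eval inside the layer resolve to the recorder.
    new Function('eval', current)(recorder);
    if (captured === null) break; // no eval call: innermost layer reached
    layers.push(captured);
    current = captured;
  }
  return layers;
}

const payload = 'var spin = 6 * 7;';
const layers = peelLayers(wrapLayer(wrapLayer(payload)));
console.log(layers.length, layers[layers.length - 1]);
```

This only works while every layer calls a bare `eval`; a layer that checks its own source or reaches eval indirectly would need different handling.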

It'd be interesting to see what some real JS devs could get to, both from a "how hard is it" perspective but also just to see what different obfuscations are used in the real world.


JSNice[0] often does a good job deobfuscating js, the statistical renaming isn't foolproof but often useful.

With it I get https://ghostbin.me/62d52999cc217 , from there it's decoding UTF-16 and at least one more decoding step (parts of decoded UTF-16 are mangled) to get the string j and the function o and resolving the original function with it.

[0] http://jsnice.org/


Even easier: open up Chrome DevTools, add a breakpoint at the eval, step into function. Repeat a couple of times.

Comment at the top also points to https://freeslots.com/libs/mersenne-twister.js

Using those two together to decode the RNG function, and searching around for some more local pointers (with some judicious renaming of functions for legibility, and some comments):

    // polyfill for IE??
    function trunc(a) {
        return a < 0 ? Math.ceil(a) : Math.floor(a)
    }
    var genrand_res53 = function() {
        var a = new MersenneTwister((new Date).getTime() + Math.floor(Math.random() * Math.pow(2, 32)));
        return function() {
            // Generate a random float by varying the 53-bit mantissa.
            var b = genrand_int32(a) >>> 5,
                c = genrand_int32(a) >>> 6;
            // (b << 26 | c) / 2 ** 53
            return (b * 67108864 + c) * 1.1102230246251565E-16
        }
    }();
    function rand_int(a, b) {
        var c = genrand_res53();
        if (a === undefined) return c;
        if (b === undefined) return trunc(c * a);
        if (b > a) return a + trunc(c * (b - a));
        return a;
    }
    function Kma() {
        // credits is `Lma`; if you want to cheat, you can edit that variable directly
        credits < 500 && (Ika = Ika - Ika % 1E6 + rand_int(890001) + 1E5)
    }

So, we're generating a random integer from 0 to 890,000 (inclusive, with negligible chance of generating 890001) with each spin, and using that to update the state (seeded with current time; the top 1e6 of the value is determined via mixing of Date.now() and Math.random() and is fixed for a given session).
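
To sanity-check the constants and the range, here is a small sketch that reproduces only the arithmetic from the decoded code above. `res53FromInts` and `spinValue` are made-up helpers that take the two 32-bit values as arguments instead of calling the Mersenne Twister, so this is not the site's actual function.

```javascript
// 67108864 is 2^26 and 1.1102230246251565E-16 is 2^-53, as in the decoded code.
const TWO_POW_26 = 67108864;
const TWO_POW_NEG_53 = 1.1102230246251565e-16;

// Hypothetical helper: combine two unsigned 32-bit values into a float in [0, 1).
function res53FromInts(hi32, lo32) {
  const b = hi32 >>> 5; // keep the top 27 bits
  const c = lo32 >>> 6; // keep the top 26 bits
  return (b * TWO_POW_26 + c) * TWO_POW_NEG_53;
}

// Mirrors rand_int(890001) + 1E5 from Kma().
function spinValue(hi32, lo32) {
  return Math.floor(res53FromInts(hi32, lo32) * 890001) + 1e5;
}

const lowest = spinValue(0, 0);
const highest = spinValue(0xffffffff, 0xffffffff);
console.log(lowest, highest);
```

If the decoding above is right, the low six digits of `Ika` end up between these two values after a spin with fewer than 500 credits.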

There's then more to determine whether your state is a winner (based on the game mode), but I didn't feel like looking into that specifically.


To answer your question, yes. Someone absolutely can decode that and figure out the odds. If they couldn't then there would be less obfuscation used. A browser ABSOLUTELY has to be able to run the javascript. Anyone dedicated enough can de-compile that javascript to a program. Is it easy? No, but people do it all the time.

I have had to deal with clients who thought they could keep some bit of code secret in a browser before. I have had to explain many, many times that anything the browser can do, a human can do. So if a browser can run the code, at some point a human can too.


I think what the parent meant was, can someone looking at it decode what is going on. Not asking whether it is possible in general.


That's why they wrote "Anyone dedicated enough can de-compile that javascript..."

Meaning, effectively, it can be de-obfuscated into code with control flow that's readily understood by a human, even if it would take some patience and practice (and the right tools) to perform the de-obfuscation.

Re: the FreeSlots.com program, https://deobfuscate.io shows that most of the obfuscation is related to decoding characters per some algorithm of their devising and eventually eval'ing the string as a JS program. There are likely several tricky rounds of that technique (and others) used at layers within the obfuscated code.

If the FreeSlots devs are clever, then they likely have a scheme to randomly generate the code they want (producing their desired result in terms of odds), where the random part is w.r.t. how the obfuscation layers are composed. Done well, that could make it rather difficult to mechanically de-obfuscate their code changing over time, i.e. without a human intervening to help identify the distinct layers because... parsers are hard.


As the other commenter said I think everyone understands it is feasible for someone here to accomplish de-obfuscating the code but the actual question was "Can anyone decode this and figure out the odds generator?". As in "can anyone take out the trash" like actually take the time to do the work of taking out the trash. Not as in explain that it is in fact possible for a dedicated person to tie up a bag, lift it out of the bin, take it outside, and put it in a dumpster. One of those quirks of speech.

I gave it a quick shot with some spare time, was selfishly hoping someone else had done the work when I checked back :p.


If the economic incentive (or some other abstruse incentive) is great enough for someone, then that someone will do it, if it's strictly possible and within the scope of their resources, i.e. because they stand to gain / be fulfilled / for the fun of it / experience fame and glory / etc.


Perhaps the original ask is better explained as: a request for someone here to do the actual deobfuscation.


Nice way to put it. :-)

I don't feel so incentivized at present, sorry if I'm letting you down.


nothing special about code in a browser. there are regular reports from bug finders where they detail how they disassembled iOS or some native app etc and worked out how some exploit worked


The Hexrays decompilation plugin for ARM architecture works very fine and produces more readable code than this js :D


I love all the deeply obfuscated junk and weird bitwise stuff... and then `fix_ios6`


Why is the odds generator run on the client?


As long as the generator is seeded by the server and deterministic the only thing running the generator on the server for every spin would do is cost more money. In the rare situation a claim is made the server can run the spins in bulk with the same data it gave the client, if the results are different or the number of spins suspicious the claim is thrown out. If the client doesn't win anything (i.e. almost every client) the server doesn't need to do anything except serve the initial page.
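
A rough sketch of that replay idea, assuming the server hands out the seed and a claim contains the list of spin results. It uses mulberry32, a small public-domain PRNG, in place of the Mersenne Twister, and `spinResults` / `verifyClaim` are made-up names; nothing here is taken from the site.

```javascript
// mulberry32: small seeded PRNG, used here only because it fits in a few lines.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Same code on client and server: same seed and spin count give same results.
function spinResults(seed, count) {
  const next = mulberry32(seed);
  const results = [];
  for (let i = 0; i < count; i++) {
    results.push(Math.floor(next() * 890001));
  }
  return results;
}

// Server side: replay in bulk only when a claim comes in.
function verifyClaim(seed, claimedResults, maxSpins = 100000) {
  if (claimedResults.length > maxSpins) return false; // suspicious spin count
  const replayed = spinResults(seed, claimedResults.length);
  return replayed.every((value, i) => value === claimedResults[i]);
}

const seed = 123456789;
const honest = spinResults(seed, 50);
const tampered = honest.slice();
tampered[10] = (tampered[10] + 1) % 890001; // client edits one result
console.log(verifyClaim(seed, honest), verifyClaim(seed, tampered));
```

The server does no per-spin work here; it only pays for a replay when someone actually files a claim.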


If the odds generator was only seeded by the server, but run on the client, you could test the next run locally before deciding to place a bet. How would that make sense?


To clarify, the claim a client makes on this site is that you have been around long enough to make enough tokens to enter your info for the $500 monthly sweepstakes. Your slot results don't actually net you direct money or even improve your chances beyond being able to enter after a relatively low bar. Any server-side validation would be to check that you are a person who has been seeing ads and giving away a real, sellable email, rather than a bot trying to game the sweepstakes and, more importantly, lowering the resale value of the email list. Doing live server-side hosting of every game spin would probably cost them more than they make; I'd be surprised if they even did the full level of server-side validation available.


The code goes through a few eval steps first. Here is what is finally evaled -- you can replace the whole file with this for the same result: https://paste.ee/p/VTgj8

Figuring out the odds generator...is a task I will leave to someone else :)


A favorite trick of mine is to replace regular length variable names with absolutely massive ones that all share the same first 1024 characters, plays hell with debugger UIs and makes differentiation all but impossible without writing a custom lexer. Add into that a bit of Z̶͚͎̙̭͈͚͚̘͗̑̉̈́͌̆̀̚͝ă̶̡͉̠͍̻͔̯͔͖̪̤̤̫̓̽̏̉̎͌͒̆͘̕ḻ̴̡̡̝̫̠͇̻͎̥̲̜͆͌͑̍ͅg̸͈͒̏̀͂̈͊̂̾̑̈́̑͝o̴̡̙͍͉͓̘̮͗̏̒̂̃̏̓́̕ͅ and you’ve got a stew going, baby!


That reminded me of a base4 encoding using these characters: '0' 'O' '1' 'I'

The code looked like:

    var O11IOOO1I011 = 1<<1, 
        II00001OOO1O = 1<<2;
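
A guess at how such names could be generated, as a small sketch: encode a counter in base 4 with the four look-alike characters and put a letter in front so the result is a valid identifier. `confusableName` is a made-up function, not taken from any real obfuscator.

```javascript
// The four look-alike characters from the comment above.
const ALPHABET = ['0', 'O', '1', 'I'];

// Hypothetical generator: encode `index` in base 4 with the look-alike
// alphabet, then prepend a letter so the result is a valid identifier.
function confusableName(index, width = 12) {
  let n = index;
  let digits = '';
  for (let i = 0; i < width - 1; i++) {
    digits = ALPHABET[n % 4] + digits;
    n = Math.floor(n / 4);
  }
  const lead = n % 2 === 0 ? 'O' : 'I'; // identifiers cannot start with a digit
  return lead + digits;
}

const names = [0, 1, 2, 3, 4].map((i) => confusableName(i));
console.log(names.join('\n'));
```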


No lowercase 'l' in there?


Probably.

---

This one binary encoded with tabs and spaces:

https://www.youtube.com/watch?v=cQY7klANahY


For mangling, I made a proxy that creates meaningful names in dev, and sequential or pre-baked ones in production.

For example, FileFields.js:

    const FF = proxyFieldNames('FF', { foo: null, bar: null })
    // DEV:  FF.foo → FF_foo
    // PROD: FF.foo → 'a'

https://github.com/uxtely/js-utils/tree/main/proxy-fields-ob...

As a bonus, it's helpful for renaming, autocompleting, and finding usages.
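
For readers who want the gist without opening the repo, here is a hypothetical sketch of the general idea. It is not the linked library's code: `createFieldNamer` and its `isProduction` flag are invented for illustration, and the real `proxyFieldNames` may work quite differently.

```javascript
// Hypothetical sketch, not the linked library's implementation.
function createFieldNamer(isProduction) {
  let counter = 0; // shared across calls so production names never collide

  // 0 -> 'a', 1 -> 'b', ..., 25 -> 'z', 26 -> 'aa', ...
  function shortName(n) {
    let name = '';
    let rest = n;
    do {
      name = String.fromCharCode(97 + (rest % 26)) + name;
      rest = Math.floor(rest / 26) - 1;
    } while (rest >= 0);
    return name;
  }

  return function fieldNames(prefix, fields) {
    const mapping = {};
    for (const key of Object.keys(fields)) {
      mapping[key] = isProduction ? shortName(counter++) : prefix + '_' + key;
    }
    // The Proxy turns typos like FF.fooo into an immediate error.
    return new Proxy(mapping, {
      get(target, prop) {
        if (typeof prop === 'string' && !(prop in target)) {
          throw new Error('Unknown field: ' + prefix + '.' + prop);
        }
        return target[prop];
      },
    });
  };
}

const devFF = createFieldNamer(false)('FF', { foo: null, bar: null });
const prodFF = createFieldNamer(true)('FF', { foo: null, bar: null });
console.log(devFF.foo, prodFF.foo);
```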


Before someone asks why you would obfuscate, here's a common use case: there are plenty of paid/proprietary Electron apps these days, and they're not just websites. Some of them do heavy lifting under the hood, and people want to protect that better than what Electron offers out of the box (read: nothing).


Isn’t that just the illusion of security?


Not if it isn't used for security


Yep, no one uses obfuscation for security (I hope!) and given Electron's modus operandi, copying your entire codebase is trivial... unless you make it not trivial.


This actually gave me some new context on why Web technologies can be slow: fast as the runtime might have become, it spends useless cycles converting hex to ASCII, thrashing the stack, and so on. I wonder if it wouldn't be faster to encrypt the JavaScript code, and use something like the existing Widevine DRM to distribute the keys.


Something else that sites do, which is not really obfuscation but an anti-debugger technique, is to run a loop checking whether the DevTools are open and crash the page via catastrophic regex backtracking if they are.

You can get around this by intercepting the request and returning a copy of the js with this check patched out, but it's just another hurdle in the way of casual inspection.


Then it turns out the anti-debugger code was contained in a function that gets stringified, so if you change the source code at all, the script stops working. So you have to manually substitute usages of the function-as-string with the string representation of the original obfuscated function...
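
A minimal sketch of why editing the source breaks things in that setup: the script hashes the text returned by `toString()` on one of its own functions and refuses to run if the hash changed. The names and the hash are made up for illustration; real obfuscators tend to hide the check or use the text as a decryption key.

```javascript
// Simple 32-bit string hash, chosen only because it is short.
function checksum(text) {
  let hash = 0;
  for (let i = 0; i < text.length; i++) {
    hash = (Math.imul(hash, 31) + text.charCodeAt(i)) | 0;
  }
  return hash >>> 0;
}

function protectedLogic() {
  return 6 * 7;
}

// Recorded at build time in a real setup; computed here for the demo.
const EXPECTED = checksum(protectedLogic.toString());

// Runs fn only if its source text still hashes to the expected value.
function runIfUntouched(fn, expected) {
  if (checksum(fn.toString()) !== expected) {
    throw new Error('source was modified');
  }
  return fn();
}

// Stand-in for someone patching the function in the served file.
function patchedLogic() {
  return 6 * 7 + 0;
}

console.log(runIfUntouched(protectedLogic, EXPECTED));
```

Calling `runIfUntouched(patchedLogic, EXPECTED)` throws, which is the behavior described above: behaviorally identical code is rejected because its text differs.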


OP is talking about protected legit JS, but I see malware use similar techniques as well. The latest one I have seen has legit jQuery code on top, but further down, functions with English words as names that do weird string operations are intermixed with the jQuery code. The script is meant to be run by the Windows Script Host to download malware.


Interestingly enough, passing even the most complex example in the link to GPT-3 with the prompt "What does this code output when run?" returns the correct result.


Now I wonder if it's possible to train a neural network to run JavaScript. Or lay out a web page.


I can imagine obfuscation as a useful thing for Chrome extensions where, in some cases, your business logic cannot exist on server.


Why obfuscate JS when there is WASM?


Business people demand it to protect intellectual property without realizing the ease of reversing it / wanting to say they're doing something to protect IP that their own superior will not realize doesn't help. It is making the best of an impossible situation, the paradox of sending your code to every single customer for them to run it while also wishing nobody could see it.

The more aggressive they make patent law the less useful it seems to become for protecting any actual investment, so here we are, clinging to wooden totems...


With how mediocre most developers today are, obfuscation is enough.


Can't tell if you mean they can't deobfuscate, or that their code isn't worth the effort of deobfuscating.


Probably both


Too mediocre to type "deobfuscate" into Google? The first result is a deobfuscator.

https://deobfuscate.io/


Too mediocre to read deobfuscated JavaScript, yes. Many such cases.


It's a supply demand thing.

If you publish your code on github, it's more likely to be compromised than if it's just in the webapp, very well obfuscated.

A sufficiently motivated actor will break it, and frankly break almost anything else, so it's a game of probabilities etc..

Obfuscation probably does make sense so long as it's not obviously getting in the way of development, and with the key understanding that 'it can be broken'.

Physical security at most companies can be thwarted with enough effort, it doesn't mean we don't do it.


You can use wasm disassembler (like https://github.com/JoseFMP/wasm-disassembler) as a starting point to understand what’s happening. It would be much harder if it was obfuscated on top of that.


Why WASM when you can create a full VM with its own custom bytecode implementation complete with nonsense instructions and compile to that.
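
For a sense of scale, a toy version of that idea fits in a few lines: a stack machine with invented opcodes, including a junk instruction that does nothing but pad the program. Everything here (opcode numbers, names) is made up for illustration and far simpler than what real VM-based obfuscators emit.

```javascript
// Hypothetical opcodes; a real obfuscator would randomize the numbering per build.
const OP = { PUSH: 0x10, ADD: 0x21, MUL: 0x32, JUNK: 0x7f, HALT: 0xff };

// Interpret a flat array of opcodes and operands on a simple stack.
function run(bytecode) {
  const stack = [];
  let pc = 0;
  while (pc < bytecode.length) {
    const op = bytecode[pc++];
    switch (op) {
      case OP.PUSH: stack.push(bytecode[pc++]); break;
      case OP.ADD: stack.push(stack.pop() + stack.pop()); break;
      case OP.MUL: stack.push(stack.pop() * stack.pop()); break;
      case OP.JUNK: pc++; break; // nonsense instruction: skips its operand
      case OP.HALT: return stack.pop();
      default: throw new Error('bad opcode ' + op);
    }
  }
  throw new Error('missing HALT');
}

// (2 + 3) * 4 with junk instructions sprinkled in.
const program = [
  OP.PUSH, 2, OP.JUNK, 99, OP.PUSH, 3, OP.ADD,
  OP.PUSH, 4, OP.JUNK, 7, OP.MUL, OP.HALT,
];
const result = run(program);
console.log(result);
```

Someone reversing this has to recover the instruction set before they can even read the program, which is the point of the approach.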


An incredibly cool example of this is the newest iteration of the HIVE malware which does exactly that. They build a custom VM via RCE in a buggy image format parser which allowed them to execute custom code on an iOS device.


This is already what (a lot of) Lua obfuscators do for game cheats, since Lua has a reference implementation that generates bytecode, that can be easily modified to generate the most awful, encrypted, obfuscated mess ever.


This is how ReCaptcha is implemented, right?


Why obfuscate, when you can just follow modern trends and use webpack (or similar) which gives you completely unreadable shit.


I certainly agree with your tone/sympathize with your frustration but I will say that on several occasions I have followed the webpack breadcrumbs to figure out what the hell is going on with a vendor's misbehaving script, knowing it's going to be faster than going through support. Some of these methods would make that much harder.


"Modern"? Webpack/code bundlers is quite an ancient tech by now.

Regard it as an intermediate representation (IR) of your code, a stage between your readable source code and browser bytecode/jit.

The "shit" is still readable since webpack also generates source maps.


Do you mean ancient like something has replaced it? Or old enough to be in widespread use?


That's because Webpack includes a minifier (Terser) by default when running in production mode (I think since version 5, which went a lot in the direction of convention-over-configuration). It is easy to disable if you want to.

Terser transforms non-global identifiers lexically and does some simple substitutions.

Normally you want to bundle modules with their dependencies anyway, maybe transpile code... Then why not minify?

Et voila, some completely unreadable shit.


Oh I love this.

Take it one step further: hire sufficiently terrible spaghetti coders that nobody, not you or even they know what the code does, and any hacker trying to make sense of it will feel ill.


I'm imagining some state or APT engineers who, having reversed the mess, are then having very fraught discussions about what it could mean and getting the boss to bring in a specialist to figure out what they're missing.


So this is closer to reality than we might want to admit. I once had a terrible business idea that we laughed at internally and mentioned it to someone in the industry, who then promptly 'ripped it off'. Our 'secret' was not a great new product, it was actually a ridiculous concept.


If I don’t know what I am doing, surely my enemy cannot know what I am doing either…


end of the day once you can apply a tool like ghidra to a given program, understanding what it is doing is straightforward unless they went to extremely great lengths beyond simple obfuscation

reversing is one of the few classic skills from the home computing era that is alive and well in 2022. it's kind of nice.


Sorry for the slightly offtopic question: this page caused Chrome on my mobile phone to freeze completely. I had to reboot my phone, and even after that, I had to figure out a way to close the tab without opening Chrome. Did it happen to someone else?


Damn, I thought I was crazy. I had to reset Chrome to get it working again, could finally read the article after I installed Firefox.

Android 12 with Chrome v103.


Had the same problem. The browser was essentially soft locked as the tab would be brought back after restarting the app until I cleared Chrome's storage which closes all tabs (and clears history, cookies, etc. too which is a bit annoying).


Yes, actually. Works perfectly fine on Firefox but on both Chrome and Bromite it causes the browser to crash. My phone just let me kill the app after a few seconds but there's definitely something weird going on here.

Interesting, I have Bromite set up to disable JIT by default, so if it's because of weird JS, it's a bug in both the JIT engine and the interpreter.

Edit: I doubt it's JS related because this site doesn't seem to use JS (unless there's some UA sniffing going on). I'm guessing this is a Chrome bug, hopefully not an exploitable one!

Android 11, Chrome 103.


Although I won't ship you my phone, it worked on 103.0.5060.71 chrome on Android


Nope, but I can see the jsfuck example doing something like that as it kinda weirdly lagged on mine (android 12, chrome 103).


If you obfuscate client side javascript that is being served in a browser you should get banned from the internet


Why would serving it in the browser or not matter? It's like saying all code should be open source, which is a valid opinion but I don't see why writing JavaScript and serving it in the browser would be any different. It's not like I automatically agree on making my code fully available just because I happen to target the web.

Sure, you can de-obfuscate JS but you can also reverse engineer other software.


Users of the webpage can't determine whether they want to run the scripts from looking at the scripts if they're (well) obfuscated.

Very few users may even want to do this (perhaps none), but in theory it's a nice thing that has historically been made possible by the web. Unlike binaries or backend code, the user gets the source themselves to run in their browser... fine, if you want to obfuscate it you can, but I think it's fair for users to also dislike websites that do this.

I'm not talking about minifiers/bundlers which are used to make the content more user-friendly, I'm specifically talking about steps taken to make the web less accessible and less free.


I'm curious. Could you explain why you feel this way?

I can understand the desire to be able to vet code that a website wants to run on your device. I don't see why that preference should create an imperative for websites to either accommodate you or be banned from the internet.


I second this proposal.


What?



