Write any javascript code with just these characters: ()[]{}+ (patriciopalladino.com)
289 points by alcuadrado on Aug 10, 2012 | 51 comments



Arg, scooped! I was working on this exact same thing! :D

Since you've beaten me to it, let me offer up a couple of additional tricks you might want to use. If you want to make this completely independent of browser APIs, you can eliminate the dependence on window.location (or atob/btoa as the sla.ckers.org poster did).

Trick #1 is to get the letter "S".

You can extract this from the source code of the String constructor, but you want to be careful to make this as portable as possible. The ES spec doesn't mandate much about the results of Function.prototype.toString, although it "suggests" that it should be in the form of a FunctionDeclaration. In practice you can count on it starting with [whitespace] "function" [whitespace] [function name]. So how to eliminate the whitespace?

For this, we can make use of JS's broken isNaN global function, which coerces its argument to a number before doing its test. It just so happens that whitespace coerces to NaN, whereas alphabetical characters coerce to 0. So isNaN is just the predicate we need to strip out the whitespace characters. So we can reliably get the string "S" from:

[].slice.call(String+"").filter(isNaN)[8]

Of course, to get isNaN you need the Function("return isNaN")() trick, and you know how the rest of the encoding works.
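
A minimal sketch of the trick above, written with ordinary identifiers for readability. The variable names are illustrative, and it assumes the engine prints the constructor as something like `function String() { [native code] }` with no leading whitespace, which the spec does not strictly guarantee.

```javascript
// Obtain isNaN through the Function constructor instead of naming the global directly.
const isNaNFn = Function("return isNaN")();

// String + "" yields the source text of the String constructor (engine-dependent).
const source = String + "";

// isNaN coerces each character to a number: whitespace becomes 0 and is
// filtered out, while letters and punctuation become NaN and are kept.
const kept = [].slice.call(source).filter(isNaNFn);

// "function" occupies indices 0..7, so index 8 is the first letter of the name.
const letterS = kept[8];
console.log(letterS);
```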

Trick #2 then lets you get any lowercase letter, in particular "p".

For this, we can make use of the fact that toString on a number allows you to pick a radix other than 2, 8, 10, or 16. Again, the ES spec doesn't mandate this, but in practice it's widely implemented, and the spec does say that if you implement it its behavior needs to be the proper generalization of the other radices. So we can get things like:

(25).toString(26) // "p"

(17).toString(18) // "h"

(22).toString(23) // "m"

and other hard-to-achieve letters.
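
The pattern in those three examples can be sketched as a small helper. `digitToLetter` is a hypothetical name, and the sketch assumes the engine supports every radix from 11 to 36.

```javascript
// Hypothetical helper: print digit n (10..35) in radix n + 1, the smallest
// radix in which n is still a single digit. Digit 10 is "a", digit 35 is "z".
function digitToLetter(n) {
  return n.toString(n + 1);
}

const p = digitToLetter(25); // same as (25).toString(26)
const h = digitToLetter(17); // same as (17).toString(18)
const m = digitToLetter(22); // same as (22).toString(23)
console.log(p, h, m);
```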

But once you've got "p", you're home free with escape and unescape, as you said in your post.

Dave


Great idea Dave, I considered using String+"" but didn't know how standard it was, so I discarded it.

The slice & isNaN trick is brilliant!


PS I flipped the logic in my explanation; whitespace coerces to 0 and letters coerce to NaN. Which is why filter removes the whitespace and not the letters.


Now that you mention it — V8 has an interesting bug with excessive Number#toString() decimal digits (http://code.google.com/p/v8/issues/detail?id=1627).

For example:

    (1.1536999999997645e-10).toString(33).match(/[a-z]+/g)[81]; // 'oops'
More here: https://gist.github.com/1153826


Funky! Luckily that bug doesn't interfere with this trick, since it's only relying on pretty-printing integers.

Dave


Thanks for this great comment & thanks OP!

Is there some reason not to use 36 as a radix and access the whole lowercase alphabet like

    (10).toString(36) // "a"
    ...
    (35).toString(36) // "z"
? I'm curious why you use varied combinations of radixes & base numbers.

EDIT: Friend pointed out that you are only extending the number set out to what's required for that one character. Makes sense now. :)
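
For comparison, a short sketch of the radix-36 approach. It shows that base 36 covers the whole lowercase alphabet and that the smaller radix gives the same letter; the variable names are only for illustration.

```javascript
// Digits 10 through 35 print as "a" through "z" in base 36, the largest
// radix that Number.prototype.toString accepts.
const letters = [];
for (let digit = 10; digit <= 35; digit++) {
  letters.push(digit.toString(36));
}
const alphabet = letters.join("");

// The minimal radix (digit + 1) prints the same letter as radix 36.
const sameLetter = (25).toString(26) === (25).toString(36);

// 36 itself is no longer a single digit in base 36.
const thirtySix = (36).toString(36);
console.log(alphabet, sameLetter, thirtySix);
```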


I think even though that's not guaranteed by the standard, it's a lot more portable in principle than relying on the DOM.


This is like a bizarro-world lambda calculus, complete with its own Church numerals.
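
To make the "numerals" concrete, here is a rough sketch of how small numbers fall out of type coercion alone. It uses "!", which the thread notes is needed in addition to the characters in the title; the variable names are illustrative.

```javascript
const zero = +[];               // [] coerces to "", and +"" is 0
const one = +!![];              // !![] is true, and +true is 1
const two = !![] + !![];        // true + true is 2
const ten = +([+!![]] + [+[]]); // [1] + [0] is the string "10", then unary + gives 10
console.log(zero, one, two, ten);
```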


I made a little script to extract the original javascript from a script obfuscated with OP's tool (http://patriciopalladino.com/files/hieroglyphy/).

And because I felt it was appropriate, I created this extraction script in an obfuscated form!

Use this to extract obfuscated scripts: http://pastebin.com/raw.php?i=Q9TB4wEF

Just save your obfuscated script in a variable called "original" and then run my code. It'll return with the extracted script.

Oh, and it won't work on itself. That's because I didn't use the obfuscation tool to create it. I made it mostly by hand: http://pastebin.com/9LBWCSJs


There are no words to describe how dirty this makes me feel.


This post title omits "!" which is also necessary.


My fault, not intended


So, basically Javascript is just a superset of an esolang that contains itself.

http://esolangs.org/wiki/Main_Page

(Especially true if you're developing with a Javascript interpreter hosted in Javascript. Really, it's esolangs all the way down.)


If you like reducing programs to basic expressions you should read up on SKI combinator calculus and the X combinator. Here is a paper that describes the construction of an efficient X combinator[1]. Reading the paper gave me insight into how simple yet powerful combinatory logic is.

[1]www.staff.science.uu.nl/~fokke101/article/combinat/combinat.ps


I evalled all pieces of Javascript of <30 characters in Rhino, takes 1 minute on my laptop. 4219 possible values, after stripping out some really uninteresting stuff. Doesn't seem to contain anything interesting, unfortunately.
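
A rough sketch of this kind of brute-force search, not the commenter's actual script. It assumes an 8-character alphabet (including "!") and caps the length at 4 so the naive enumeration stays tiny; how the original search pruned candidates up to 30 characters is not stated.

```javascript
const ALPHABET = ["(", ")", "[", "]", "{", "}", "+", "!"];
const MAX_LENGTH = 4; // kept small: the candidate count grows as 8^length

// Map from source text to the value it evaluates to.
const results = new Map();

function search(prefix) {
  if (prefix.length > 0) {
    try {
      // Indirect eval runs the snippet in the global scope.
      results.set(prefix, (0, eval)(prefix));
    } catch (err) {
      // Most candidates are syntax errors or throw at runtime; skip them.
    }
  }
  if (prefix.length === MAX_LENGTH) return;
  for (const ch of ALPHABET) search(prefix + ch);
}

search("");
console.log(results.size + " candidates evaluated without error");
```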

http://pastebin.com/CM5ac6Xi


I am not sure about those results. I entered (+[][{}]+{})[+[]] into the Chrome console and got N (from NaN[object Object]), while your code lists it as u. If you replace the first +[] with a 0 you get a u (from undefined...). Interesting.
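
The two results can be reproduced side by side. This is just a sketch of the coercion steps involved; the variable names are made up.

```javascript
// +[][{}] is +undefined, which is NaN; NaN + {} is "NaN[object Object]".
const withPlus = (+[][{}] + {})[+[]];

// 0[{}] looks up a missing property on a number and yields undefined;
// undefined + {} is "undefined[object Object]".
const withZero = (0[{}] + {})[+[]];

console.log(withPlus, withZero);
```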


Looks cool, but I couldn't make it work.

I went to http://patriciopalladino.com/files/hieroglyphy/ and put in a script "alert(1);". This provided me with a script of about 8300 characters.

I created a web page to execute the script:

    <body onload="
    [][(![]+[])[!+[] ...
    </body>
Firebug reports:

    ReferenceError:  Unescaee is not defined.
Looks like it's having trouble picking up a "p".


Did you try to do this locally? The article explains that the "p" is picked up from window.location, assuming it's http or https. If you're using "file://...", the character at index 3 is 'e' instead.
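
A small sketch of why the protocol matters, assuming (as described above) that the encoder reads the character at index 3 of the page URL. The helper name and the example URLs are hypothetical.

```javascript
// Hypothetical helper: the character the encoder would use in place of "p".
function charAtIndexThree(url) {
  return url[3];
}

const fromHttp = charAtIndexThree("http://example.com/page.html"); // "http"[3]
const fromFile = charAtIndexThree("file:///tmp/page.html");        // "file"[3]

// With the wrong character, both occurrences of "p" in "unescape" break.
const brokenName = "unesca" + fromFile + "e";
console.log(fromHttp, fromFile, brokenName);
```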


Good catch! You diagnosed my eroblem eerfectly.


The article lists [][+[]] for undefined; you can get away with just [][[]].
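
Both spellings can be checked directly; a quick sketch with illustrative names:

```javascript
const viaZeroIndex = [][+[]]; // element 0 of an empty array
const viaEmptyKey = [][[]];   // [] coerces to the key "", which is also missing
console.log(viaZeroIndex === undefined, viaEmptyKey === undefined);
```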


Some of you may also enjoy aaencode by Yosuke Hasegawa:

http://utf-8.jp/public/aaencode.html

Encode any JavaScript program to Japanese style emoticons (^_^)

And of course jjencode:

http://utf-8.jp/public/jjencode.html

(hint: have a look at "palindrome")


Apparently, he also did the OP's trick: http://utf-8.jp/public/jsfuck.html (but without {}, even)


Unfortunately, his trick no longer works in current JS engines; it relies on using

  [].sort.call()
which I believe used to return the global object but now throws an exception.

AFAICT, you need to add {} to make this work in current JS engines.

Dave
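
The behavior described in the comment above can be probed with a few lines. This is only a sketch of how to check a given engine; the `outcome` variable is illustrative.

```javascript
let outcome;
try {
  // Called with no "this" value; older engines reportedly returned the global object.
  const result = [].sort.call();
  outcome = "returned " + typeof result;
} catch (err) {
  outcome = err instanceof TypeError ? "threw TypeError" : "threw " + err.name;
}
console.log(outcome);
```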


Man, if you didn't care about performance or bandwidth, this would be a hell of an obfuscation technique.


This is pretty easy to reverse. Most JS parsers can print the source code of functions, so you can do that for the generated lambdas.
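
As a sketch of that idea: a function built from a string still carries its source text, so printing it reveals the decoded code. The body used here is a stand-in, not real encoder output.

```javascript
// An encoded program ultimately hands a code string to the Function constructor.
const generated = Function("return 6 * 7");

// Function.prototype.toString gives back the source, including the body.
const printed = generated.toString();
console.log(printed);
```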


Yes of course. And even if they couldn't, it would be trivial to fork an existing JS implementation and make eval spit out its input.
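
A lighter-weight alternative to forking an engine, sketched under the assumption that the encoded program reaches the Function constructor through a property lookup such as `[]["filter"]["constructor"]`. This is a different technique from the one suggested above, and the names `captured` and `RealFunction` are made up.

```javascript
const captured = [];
const RealFunction = Function;

// The lookup []["filter"]["constructor"] resolves to Function.prototype.constructor,
// so replacing that property intercepts every code string passed through it.
Function.prototype.constructor = function (...args) {
  captured.push(args[args.length - 1]); // the last argument is the function body
  return RealFunction(...args);
};

// Stand-in for an obfuscated program: builds code as a string and runs it.
const answer = []["filter"]["constructor"]("return 1 + 1")();

// Restore the original so later code is unaffected.
Function.prototype.constructor = RealFunction;
console.log(captured, answer);
```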


That's not necessary, I've got right-click disabled on my website.


Okay, that made me laugh. :)


Performance might not be too bad actually. My understanding is that he's building up a string with the code you run normally and then evaling it, so the performance might not be bad, aside from the start-up cost. Bandwidth... I don't want to speculate on that one :)


Actually, I did a small test and found that after gzip, the file size only expands by about 10x. Running both input and output through bz2, the obfuscated file only comes out 3x larger. If you were very protective of your code, and you had enough of it to justify loading up a bz2 decoder on the client side, you could actually make that economical bandwidth-wise.

That said, this was a very small test; the original file was a random snippet of JS code less than 500 bytes, and that itself took a considerable amount for hieroglyphy to chew on, so I can't really do a proper test of a larger input file.
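
For anyone who wants to repeat the measurement, here is a rough Node.js sketch using the built-in zlib module. The sample is a synthetic stand-in (one encoded fragment repeated), so its ratio will not match real encoder output or the figures quoted above.

```javascript
const zlib = require("zlib");

// Stand-in for encoder output: (![]+[])[+[]] encodes "f"; it is repeated
// here only to get a reasonably long input. Real output is less uniform.
const fragment = "(![]+[])[+[]]+";
const sample = fragment.repeat(2000);

const compressed = zlib.gzipSync(Buffer.from(sample, "utf8"));
const ratio = compressed.length / sample.length;
console.log(sample.length, compressed.length, ratio.toFixed(4));
```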


I think if you wanted to make this really robust, you could use more techniques to beef it up.

One technique would be to store verbose or commonly-used string constants in accessible locations like Array.prototype.f. Then you could access, say, the string "prototype" by simply writing

    [][(![]+[])[+[]]]
Once you build up a little scratch storage of the most common or hard-to-encode strings, everything starts getting orders of magnitude smaller.

(Technically, this means that you're polluting the space shared with the program being encoded, so for everything to work the program can't make use of it. But that's a pretty simple invariant to ask of the input program: "don't get or set the 'f' property of arrays.")
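
A minimal sketch of the scratch-storage idea, written with readable names. It stores the string directly instead of building it from encoded characters, and it cleans up afterwards, which a real encoder would not do.

```javascript
// Store a long string once under the short, cheaply encoded key "f".
Array.prototype.f = "prototype";

const key = (![] + [])[+[]]; // "false"[0] is "f"
const fetched = [][key];     // equivalent to [][(![]+[])[+[]]]

// Clean up so the rest of the program is unaffected.
delete Array.prototype.f;
console.log(key, fetched);
```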

Another technique would be to break up large statements into smaller substatements, to avoid fixed limits of JS engines on statement size. You can always avoid semicolons, since ASI is guaranteed to work if you start your statement with a !.
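
The semicolon point can be sketched as follows, assuming the substatements are separated by line breaks. The `log` parameter and the two source strings are illustrative.

```javascript
// The second line starts with "!", which cannot continue the previous
// expression, so a semicolon is inserted automatically at the newline.
const withBang = [];
Function("log", "log.push(1)\n!(log.push(2))")(withBang);

// Without the "!", the second line is parsed as a call on the first:
// log.push(1)(log.push(2)), which throws because 1 is not a function.
const withoutBang = [];
let error = null;
try {
  Function("log", "log.push(1)\n(log.push(2))")(withoutBang);
} catch (err) {
  error = err;
}
console.log(withBang, error && error.name);
```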

Dave


But think of how little complexity (in terms of characters) this language has! This makes me think of the small number of primitives needed to make a LISP machine. Or Brainfuck. Or Unlambda.


Eh, I can restrict myself to ones and zeros for even more reduced language complexity and have amazing performance if I'm clever.


Bandwidth is probably not too bad. It seems like it would be quite amenable to gzip.


I mentioned in another reply that it actually does gzip fairly well (and bzips even better). But you still end up with an order of magnitude expansion with gzip (as compared to gzipping the original source), and around 3x expansion with bz2 (again compared to bz2 on the original source).

That's from a <0.5KB test input, so the expansion might be mitigated a little more for larger files. I was going to test on a 3KB microlibrary, but gave up after about 10 minutes of waiting for the conversion to finish.


Cross this with John Horton Conway's notion of "Surreal Numbers" and you might be onto something.


This guy did it with 6 characters by removing {}. But it lacks the detailed description available in this post.

EDIT: I didn't check properly. You only use {} for a minor detail.

http://utf-8.jp/public/jsfuck.html


Is it just me, or does recursing his example break chrome?


Could someone please enlighten me as to how this helps doing an XSS attack?


Some sites "filter" user input instead of escaping it.
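
A toy illustration of the difference, not taken from any real site. `naiveFilter` is a hypothetical filter that strips letters and digits in the hope of removing anything executable.

```javascript
// Hypothetical filter: remove every letter and digit from the input.
function naiveFilter(input) {
  return input.replace(/[A-Za-z0-9]/g, "");
}

// Ordinary script text is mangled by the filter...
const mangled = naiveFilter("alert(1)");

// ...but code written only with punctuation passes through untouched.
const payload = "[][(![]+[])[+[]]]";
const survived = naiveFilter(payload) === payload;
console.log(JSON.stringify(mangled), survived);
```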


I remember something like this a few years ago. They were using it for XSS. http://news.ycombinator.com/item?id=1153383


This is very cool... let me know if you want a job at inaka (we're in BA and have other people in school working for us)


I wonder how well gzip would compress this?


Example: 47,734 bytes to 813. ~98%


Pretty damn well! I wonder what the decompression and parsing time is...


Minor typo:

> "[object Object]" with {}+[]

I believe it should be []+{}
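
The difference comes from how a leading {} is parsed. A short sketch with illustrative names:

```javascript
// In expression position, {} is an object literal either way round.
const arrayFirst = [] + {};
const parenthesized = ({} + []);

// At the start of a statement, {} is an empty block, leaving just +[].
const asStatement = (0, eval)("{}+[]");

console.log(arrayFirst, parenthesized, asStatement);
```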


I pasted the entire JSON library into the field and it just hung. Any tips?


Witchcraft


really cool


Write any Windows application with just the following characters: 0 1


... you'd need some preprocessing first. The ASCII characters `0' and `1' aren't easy to use to write a program, though you could do it with nasm and some '-' and '+'s I suppose.

If someone can prove me wrong, I'd be very happy though. Writing a program using just '0' and '1' (the ASCII characters) would be awesome. (in an established programming language, and no homomorphisms. :) )



