Hacker News new | past | comments | ask | show | jobs | submit login
Stacksort – Searches StackOverflow for sorting functions and runs them (2013) (gkoberger.github.io)
318 points by colinprince 5 months ago | hide | past | web | favorite | 79 comments

Hilariously awesome.

I'm curious whether the multiple warnings about running untrusted code in the browser are necessary. I feel like all websites are already untrusted code, and the browser is quite well sandboxed and protected from anything too bad happening. What is the worst case scenario here for the user within the JS ecosystem, under known avenues of attack, not counting an unknown zero day browser exploit?

Some worst case scenarios:

- Your tab freezes due to an infinite loop for a while until your browser notices and asks you if you want to kill the script.

- The script downloads illegal content, and your ISP notifies authorities.

- CSRF attack against a site you're signed into that's not properly secured.

All of these things are things any site you go to can do. There's nothing special about `eval` that makes this site more dangerous.

Given that this site is not hosted on a domain where you have important session cookies (or where you’ve granted permissions like camera or location usage) I don’t see much problem with eval-ing untrusted code. Most of the common XSS threats just don’t apply here.

It's all fun and games until the next JavaScript sandbox busting zero day drops.

If I had that kind of exploit I surely wouldn't post it on stackoverflow.

Why not? They even have a tag for it


Maybe someone should create a site like this that runs all security exploits on SO in your browser.

Reminds me of "security expert" warnings in the mass media where every time something happens they tell people effectively the same thing "don't do things unless you know / trust it".

Most people have no clue what they're running and can't possibly take the time to know enough to really have a clue.

Trust is fairly worthless as well. Malware devs buy trusted software to turn in to malware so something that was good and trusted for years is now bad with no warning.

Exactly my thought too. Trust,however, is a weakness. This weakness is overcome by the average web surfer due to knowledge and time constraints or unintended clicks on links and ads. The future hopefully will have security audited browsers which increasingly minimize dangers arising from these actions.

I can't speak to the worst case scenario as I'm not very knowledgeable about browser sandboxing or what JS can['t] do in a given context. But as a developer I do think there's a pretty significant difference between running code that someone wrote on their own website and loading arbitrary code from a third party then executing that without any human looking at the code.

That’s more or less what all iframed ads are. Plus most popular web sites load libraries for services where the site owners can’t easily see what’s happening under the covers. Google analytics would be an example of this, but there are many, and lots of sites use multiple such services.

Of note though, cross-domain iframes have intense restrictions on them to prevent them from XSS-ing the host page or its domain.

Of course, if you’re running 3rd party ad scripts on your actual page, you’re at their mercy.

I see what you're saying and I agree for the most part, but all that code is written by someone, reviewed (maybe), and put to some form of a production environment. The code in StackOverflow is usually just typed right into the answer box and submitted. I know I've edited answers where there were compilation errors in the accepted solutions! Not strictly relevant to JS of course.

That's exactly why I would rather trust code from random SO answers than from the ad network.

That's how it should work with ad networks, but you really never know what code you're going to end up serving.


This is unintentionally fucking hilarious. You already concede that review is a “maybe”... so we have to assume it won’t happen. So basically your argument boils down to web code meeting the high standards of being “written by someone” and put in “some form” of a production environment. Any http server on a routable ip address qualifies as “some form” of production environment.

I'd go further. I'd be much more trusting of running and random JS found on stack overflow than the JS found on your average production environment.

Stack overflow is heavily moderated. I think it'd be a challenge to put something malicious on there which lived for more than an hour or so.

Compare that to your run of the mill PHP shared hosting. I know which side I'd rather take my chances with.

You don't think there's the slightest bit of difference between production code written at even very dysfunctional organizations with mediocre developers, and random answers on StackOverflow?

And you've only been here for a few weeks so I'll give you the benefit of the doubt, but calling something "unintentionally...hilarious" is usually taken as just another way of calling someone an idiot. I'd suggest you try to find ways to make your point without deriding those who don't share your view, especially on something so inconsequential.

> You don't think there's the slightest bit of difference between production code written at even very dysfunctional organizations with mediocre developers, and random answers on StackOverflow?

I do; a random SO answer will be miles better than anything coming out of an average organisation.

To write an SO answer, you need to actually care about answering SO questions. To write code in an average company, you need to... have a pulse, it seems.

I'm sure at the very least some of the sorts are incorrect and will brick your browser tab from an infinite loop or something.

Thats crazy talk everything on Stack Overflow is correct because it has the most upvotes.

To be fair, I'd imagine the code in SO answers is less likely to be malicious than the code in adverts on most news websites...

But what if it’s the answer to the question how do you brick your browser tab from an infinite loop?

You've discovered a new unit test.

The standard attack in this case is that someone could steal your cookies (if they are Js cookies) or execute instructions on gkoberger.github.io without your permission, since that is the context.

The way you would do this is create an answer to one of these questions on stack overflow that includes malicious javascript.

However, it's kind of unlikely there is much you can do with it. Sometimes a site will share cookies with subdomains, but this is not likely for github because you are allowed to publish arbitrary js there, so that would be a huge security hole.

This is why GitHub uses a separate domain (github.io as opposed to github.com) for user sites.

Moreover, github.io is in the public suffix list, so it is effectively a TLD (foo.github.io is a different site from bar.github.io; they can't become same origin by means of document.domain).

The risk of eval() is giving control of the site data of foo.github.io to the author of a stackoverflow comment.

The warning is part of the fun, though.

Github hasn't always used a seperate domain for user sites [0]. They do now to directly address the GP's concerns.


I run a Flatpak'd browser with limited privileges, so the worst that can happen is my `~/Downloads` folder gets messed up.

Hey, creator here! I built this a few years ago on a whim, and am surprised how well it still works. Thanks for sharing again :)

(Psst, if you're an engineer and like dev tools, I'm hiring! https://readme.io/careers)

You might want to update the warning:

Is it safe?

Uh… it evals both user input and random code, unchecked, from an external site. This is what the security-minded folks writing anti virus software would refer to as: hey, our unpacker does that too! In kernel mode if we're from symantec! Must be perfectly safe!

Yeah, cheap shot, I know ;-)


Is it terrible that I am highly tempted to use this as an API, forcing the value in via headless chrome then printing the page to a PDF and using OCR & regular expressions to extract a sorted list? I'm pretty sure that's an O(1) (ish, not really) which I always heard was the best kind.

It's not O(1). You are only calling the API once, but if you put in a larger list the time will increase in accordance to whatever the complexity is of the sorting happening behind the scenes. Calling this O(1) would be like saying qsort is O(1) because you are only calling the function once.

dude.its called a joke

do you hire remote?

this reminds me of the 4chan's sleepsort: for each number $n in the array, spin a thread that sleeps $n and then append $n to the result array.

Would the algorithmic runtime of this be O(N)?

The O(N log N) best case only applies to comparison sorts. What's described here is not a comparison sort, and could easily be made O(N) with radix sort if the size of the numeric type is constant.

No, the runtime complexity is just hidden in the scheduler of your OS

I don’t think this fully resolves the problem. If you had n computers and had each of them sleep for one of the numbers, and then append to a shared list, you could sort in O(n) without a scheduler.

My favorite part about this algorithm is that you can speed it up by a factor of k - for any k! - by simply dividing the time you sleep by by k.

Comparison-based sorts have O(1) complexity on the size of the elements being sorted. I.e., comparison between two numbers is a single instruction assuming numbers fit within a word. Even if you assume very very large numbers then comparison would be O(k), where k is number of bits needed to represent a number. So a more accurate runtime of O(nlogn) algorithms should be O(k.nlogn).

The runtime of this algorithm is O(n.2^k).

Is it just me, or do none of them sort this array correctly?


Correction, the page finally found this algorithm that sorts my array[1]. So I am disappointed with the verifier function on this web page and may need to submit a PR

1 - http://stackoverflow.com/questions/3730510/#3730579

Maybe they just have a different definition of "correctly" sorted.

That conversation about for..in - any JS experts that can say whether modules or other new advances will help with this? Not being able to use the standard functions on primitives because random other code messes with them sounds like pogo sticking in a minefield.

    for (let x of xs)
in more recent versions of JS is how you iterate an array without facing that prototype pollution problem.

See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

Good to know, thanks!

the website doesn't verify the accuracy of sort output. it just gives you the first SO answer that a) has a code snippet b) which is error-free c) and has a function that can be extracted and run ...for your input format, and returns the output of passing your input code into it.

It's true to the xkcd mission, which is just to find SO answers until it returns something, not to actually confirm it was sorted correctly.

The alt-text on the linked comic says "...until the list is sorted." Ergo, one must verify that it is sorted before knowing whether to execute the next code snippet.

It only checks that the output is sorted, not that it's a sorted version of the original string.

The first answer that apparently works has a comment stating that it fails with more than 2 duplicates; indeed I tried a list with three 3s and the resulting passing answer only had one of those threes.

So a function that always returns an empty list passes the test?

that's because that answer is for how to find unique values in an array. it works as intended with any duplicates, stacksort just chose an answer that wasnt only sorting

See also previous discussion:


Wow! I was inspired by this back in 2013 - I hacked a version together that also took a second argument, which was a description of the operation to be done (so it wasn't always "Sort"). Worked surprisingly well.


I cringe at the code now, but still think the idea is neat.

Wow. Hadn't seen this before. I clicked the button and laughed audibly and loudly at the alert.


Of course, the first thing that I did was to try it in a safe environment. It worked!!! :)

Doesn't seem to work. Tried with input ["z","b","a","c","d"] and returns the same ["z","b","a" "c","d"] as the output, using StackOverflow answer 4833835

For string sorts, i needed to keep running about 5 or 6 times until a sorted result was found.

Ah you're absolutely right, didn't see the "Didn't work? Try the next answer" button

This project is both Awesome and Awful at the same time. Love it!

Pure eval

Click the button and PRAY this thing doesn't load any sketchy code

I don't think that's so different from normal web browsing. (Normal being javascript enabled, etc.)

I don't think "bad" code could do much more than phishing or crashing a browser, but to be safe, the API only loads code snippets posted (and unedited) before I launched this. So, there's no way to target this with an attack via Stack Overflow.

Some say it's still running bogosort to this day...

Wonder how long until this code pops up in some CTF....

How does this work? I thought CORS prevents this?

ah, I see

this should be tagged [2015]

An attempt was made, but right now it's at "(2103)". Nice to know xkcd will still be around in the next century!

The worst part about this is that there is an existing thread from 2013 directly on the site linked.

This made me squeeee

Well done

More crazy sorting techniques from a recent reddit thread: https://redd.it/9s9kgn


- GenghisKhanSort: delete all elements except for the first, repopulate the list with successors of the first element - HitlerSort: Choose an element in the list you think is the best, then loop through the list removing any element that does not match. - ThanosSort: delete half the array. The arrays may or not be sorted, but it'll help for future sorting - TrumpSort O(0): the array is always sorted. Anyone who says otherwise is fake news.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact