It's really hard to write correct code. Sorting was broken in java and nobody no...

malisper · on July 10, 2020

> OTOH the standard library sort can be improved if you know the distribution of the numbers you're sorting (e.g., small numbers can bucket sort, small lists can use a sorting network, etc). If this thing can efficiently almost-correctly-sort because it's better at these types of pattern matching, we can just run a final pass of insertion sort to make it useful!

I haven't read this paper, but that strategy was exactly the strategy behind "The Case for Learned Index Structures"[0]. The idea was you could train a model to predict the location of a row based on a primary key. After computing the model, you could then go through every row and compute a bound on how far off the model is from the actual row. To perform a lookup, you start off at the model's guess, then look around based on the bound on the model for a row with the given primary key.

[0] https://arxiv.org/abs/1712.01208

eru · on July 10, 2020

> If this thing can efficiently almost-correctly-sort because it's better at these types of pattern matching, we can just run a final pass of insertion sort to make it useful!

Depends on what kinds of mistakes the almost-correct approach makes. If it just puts elements in the wrong order, your suggestion works. But if it makes mistakes like duplicating elements, or dropping them, or completely making up new entries, no post-processing will help.

> OTOH the standard library sort can be improved if you know the distribution of the numbers you're sorting (e.g., small numbers can bucket sort, small lists can use a sorting network, etc).

Even better: the standard library sort usually assumes only comparison. For special cases like sorting on integer keys you can conceivably do better.

ogogmad · on July 10, 2020

> Depends on what kinds of mistakes the almost-correct approach makes. If it just puts elements in the wrong order, your suggestion works. But if it makes mistakes like duplicating elements, or dropping them, or completely making up new entries, no post-processing will help.

The algorithm modifies the original list only by swapping entries. As such, it's guaranteed to result in a permutation.

eru · on July 11, 2020

Yes, in that case the post-processing would work.

But it also limits their system to in-place algorithms only.

gryz · on July 12, 2020

Why is this limitation? Do you think the algorithm can be improved if it was doing a copy to a new location?

im3w1l-alt · on July 10, 2020

If you let the number of objects to sort go to infinity while keeping size of objects in bits capped, then they will behave like integers.

eru · on July 10, 2020

I thought more of being able to do bit manipulation or hashing etc on your keys. Those things don't necessarily require a fixed key length. (And in some languages, even integers don't have a fixed length.)

You are right in some sense, but you'd still have trouble implementing something like bucket sort in your setting, if all you have are comparisons.

mpfundstein · on July 10, 2020

IMO its not about replacing deterministic algorithms used in van Neuman machines (even though that would be a nice side effect).

I feel that if we ever want to build an intelligent problem solver machine, we need some facility for abstract reasoning and work like this papet is a major contributor towards that goal. (if its reproducible and valid ofc...)

pansa2 · on July 10, 2020

> I am not sure I will ever trust an ML algorithm trained on inputs/outputs only

It’s doing TDD!

jtms · on July 10, 2020

Imagine a world where all programmers do is write the tests and the generalized learning algos write all the implementations.

GedByrne · on July 11, 2020

No need to imagine. That would just be Declarative Programming going mainstream?

Its just that algorithms like the Prolog’s WAM or the reet used in rules are replaced with neural nets and the like.

Rules use the same When Then format acting on a Given fact base. Prologs Horn Clauses can easily be translated into that form.

BDD writes all tests in the format Given When Then.

It’s already here.

makapuf · on July 10, 2020

Welcome to my company a few years ago. Except they were not programmers but PM and architects, and algos were subcontractors. Now we have real developers (I'm in one of those teams)

klyrs · on July 11, 2020

That's such a big pile of yikes, I'm morbidly drawn to see that codebase.

vertbhrtn · on July 12, 2020

I'm pretty sure the generated code will be an incomprehensible mess that passes precisely those tests and nothing more, e.g. it would correctly sort arrays [1,3,2] and [3,2,1], but not [2,1,3]; and obviously not [1,4,2], because the specs didn't mention number 4.

nardi · on July 10, 2020

Next step is to prove the ML-produced algorithm is correct. I can see a world where we generate algorithms and then automatically prove they're correct based on a formal description of the problem being solved.

bufferoverflow · on July 10, 2020

It doesn't need to be 100% correct. There's a place for fast sorting algorithms which are correct most of the time - basically anything non-critical, like sorting comments by upvotes.

According to the article, they couldn't find a case where it wasn't correct.

jcelerier · on July 10, 2020

> There's a place for fast sorting algorithms which are correct most of the time - basically anything non-critical, like sorting comments by upvotes.

Nothing saddens me more than this trend of the modern web where everything works semi-probabilistically (even if it's likely for "good" technical reasons, such as, "we wrote our server backend in Ruby which is slow as molasses so now we need 230 CDN and 800 databases instances around the whole world and transformed our simple centralized problem into an horrendous decentralized one).

The central reason for me to use computers is that they are (or at least were) deterministic to a much higher degree that normal life, and so many things in the 10 last years becoming much more non-deterministic in particular on social websites is something that frustrates me every single day as it just makes the whole experience and process of using computers & the web very unreliable compared to what it used to be.

gdoptimizer · on July 10, 2020

Jim Keller once said randomness was baked in the modern computer chip design. The execution routes were almost always different even if you ran the same deterministic code for multiple times. But in the end you got the same result almost every single time (“almost” has many 9s after the decimal point). But this layer of operations has been abstracted away from most of us. So don’t feel too sad... maybe the world has been like this for a long time

diegoperini · on July 10, 2020

Today's perfections are yesterday's "good enoughs". Don't be sad for the trend. In 10 years, you will have new perfections to enjoy.

UncleOxidant · on July 10, 2020

Like the probabilistic bank account balance. "You have between $10 and $1000 with the greatest likelihood being $537 (52% chance)."

FabHK · on July 10, 2020

There was this thread a while ago with HSBC switching to MongoDB... so, yeah, distinct possibility :-)

https://news.ycombinator.com/item?id=23507197

(The article was very light on details though, and it was probably just one team that consolidated to MongoDB, not the accounts itself... one hopes.)

adonese · on July 11, 2020

A friend of mine went to a doctor, and after the medical lab the doctor told him you either have malaria or typhoid.

This trend seems to be everwhere

ojbyrne · on July 10, 2020

I think the good technical reasons are quite often the CAP theorem.

jcelerier · on July 10, 2020

I know, but my argument is that a lot of time (not always, of course) the "distributed" issues could just be solved by not building a distributed system in the first place and just having more efficient tech stacks & a few big servers.

Stackoverflow was just using 4 db servers in 2016 (https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...)

adrianN · on July 10, 2020

We suck really hard at proving code correct that was written explicitly with the goal in mind to be formally verified. The techniques there are still quite immature. I wonder how long it will take until we have a reasonably working pipeline of ML+formal verification. Maybe it's just easier to generate correct code from the spec than to learn it from examples and then verifying it with the specification?

hutzlibu · on July 10, 2020

"We suck really hard at proving code correct that was written explicitly with the goal in mind to be formally verified."

That is why I am not a huge fan of proving code in the first place. If you have a really complicated proof of formal correctness, then who guarantees you, there is no misstake in the proof itself? At least this is what I have seen in university, lots of complicated stuff on the whiteboard, with a result. And then someone figured out, it was wrong.

So I rather do lots of testing, for all known test cases.

adrianN · on July 10, 2020

Your automatic proof checker guarantees that there is no mistake in the proof. That's the easy part! The real question is who guarantees that your specification is what you actually want?

infinity0 · on July 10, 2020

Automatic proof checkers still have bugs in them. However, one nice characteristic is that, since its algorithms are so general, i.e. about the language rather than about the problem you're solving (e.g. sorting), any bugs in the checker are either so common they hit every problem solution and are easily discovered and fixed, or so rare that it affects no real-world problem solutions.

adrianN · on July 10, 2020

The kernel of an automatic proof checker is much easier to test and verify than the programs you verify using a proof checker though.

chrisandchips · on July 10, 2020

+1 verification tends to be the easy part.

infinity0 · on July 10, 2020

Sure, not all bugs are kernel bugs though.

louthy · on July 10, 2020

> So I rather do lots of testing, for all known test cases.

How is this any different? You can still make mistakes in your test code, test data, omit cases that would fail, etc. I think I'd be less confident in a set of unit tests than I would with an automated proof checker.

hutzlibu · on July 10, 2020

Tests are simple to unterstand.

Complicated proofs I do not understand, without much effort.

And I surely know that you can make mistakes with test cases as well. So I surely do not claim my way to be superior. But it works way better for me.

eru · on July 10, 2020

Test cases are not necessarily simple to understand.

Especially the test cases needed to trigger some deep bug.

gnufx · on July 10, 2020

And so aphyr gets paid, for instance. You may not even be able realistically to run a test case. The laws of life say your 10000 core compute job will fail non-deterministically only after a day or so. It would clearly be helpful to have at least some formal guarantees, such as liveness at some level.

rusticpenn · on July 10, 2020

Well designed ML techniques are always correct 99% of the time. Its the 1% that is the problem.

ukj · on July 10, 2020

99% of the time it works all of the time.

Like humans.

How do we deal with problematic humans?

Retraining or replacement.

adrianN · on July 10, 2020

No, we deal with 99% accuracy in humans by designing all our human-based algorithms to be resilient against mistake (and still mistakes in the end result happen very frequently). This is completely different from the way we have designed our computer systems, because it's way easier to do this with flexible agents like humans than with inflexible agents like computers.

ukj · on July 11, 2020

So, the mistakes algorithms make are different to the mistakes humans make?

That’s... equivocation.

Both humans and ML algorithms are flexible. That is the point of “learning”.

Adaptation.

Random_ernest · on July 10, 2020

But the problems where we apply human labour are vastly different from the ones where we apply machine labour. In (most) tasks where we apply human labour a few errors are tolerated.

ukj · on July 11, 2020

This seems irrelevant.

Neither humans nor ML make zero errors.

Ceteris paribus, if an ML algorithm makes fewer errors at a task which with low error tolerance - you would use the algorithm instead of the human, no?

mtrower · on July 12, 2020

I would expect that might depend on what sort of errors each make, no?

rusticpenn · on July 10, 2020

That is not practical when you expect the system to be correct 100% of the time and make decisions based on that. There are many situations where this is critical. You would never want your car's safety system to be correct only 99% of the time.

ukj · on July 11, 2020

If you expect 100% correctness you are not a very practical man.

Perfect is the enemy of good enough.

mtrower · on July 12, 2020

And good enough is the enemy of perfection.

I suppose it depends on your end goals.

ukj · on July 12, 2020

Over-achievement of my goals is not my goal.

fiblye · on July 10, 2020

With humans, we can ask what went wrong and fix the problem.

With black box algorithms, we throw in some new training data and just hope it's enough.

andybak · on July 10, 2020

> With humans, we can ask what went wrong and just hope we've fixed the problem.

> With black box algorithms, we throw in some new training data and just hope it's enough.

A small wording change and they're equivalent again.

To some extent human beings are also a black box - with some very peculiar failure conditions and side-channel weaknesses.

bufferoverflow · on July 10, 2020

No, we don't do anything against the humans who are 99% correct. 99% is good enough in most real world cases.

bryanrasmussen · on July 10, 2020

what's the promote option in this analogy?

ukj · on July 11, 2020

ML took your job.

Cthulhu_ · on July 10, 2020

Call me naive but, wouldn't feeding it random input and expected output sorted by existing and proven (but slower) sorting algorithms do the trick?

Scea91 · on July 10, 2020

For an exact proof of correctness no, because that would only show the algorithm is correct for a selected set of input-output pairs.

However, it is true that validating it for a very large number of inputs and outputs has value. This is not uncommon even in mathematics. For example, Goldbach's Conjecture (https://www.wikiwand.com/en/Goldbach%27s_conjecture) has not been proven but has been shown to hold for all integers less than 4 × 10^18

exdsq · on July 10, 2020

This is done! The existing correct algorithm is called the oracle and it’s used to validate the generated algorithm.

There’s some cool work on using neural program induction in compilers to try generate faster implementations based off the ‘oracle’ program that’s being compiled.

algo_trader · on July 10, 2020

> neural program induction in compilers to try generate faster implementations

Can u provide more details?

I am familiar with garden-variety ML-optimization-of-combinatorical-space in compilers.

How would u even "measure" the expected performance unless these are gradual greedy changes

Shorel · on July 10, 2020

It would be funny if a two phase approach to sorting would result in increased average performance.

First sort using the ML algorithm. Then sort the almost sorted result with a traditional algorithm optimized for almost sorted results.

In fact, I think my own brain does the two phase approach to many problems.

_puk · on July 10, 2020

Could you train the ML algorithm to select the best traditional algorithm based on the distribution of the items being sorted?!

User23 · on July 10, 2020

> The same was true for java's binary search

I used to ask "implement binary search" as an interview question and gave up on using it because not a single candidate could do it correctly in 45 minutes. This was at a FAANG.

vertbhrtn · on July 12, 2020

Ha, I can get an offer from any of the FAANG without even trying and can comfortably say that not only I failed the binary search question in past when interviews were 1 hour long, I would still fail it now.

Der_Einzige · on July 10, 2020

I don't believe you. Surely, after a few hints, the FAANG canidates must have been able to figure out the classic gotcha of this question (why does this sometimes overflow and how do you fix {high + low / 2})

If they didn't, than I believe you did a poor job of guiding them. I refuse to believe that this is some esoteric, tough question. If you can solve dynamic programming problems, you can easily solve this one.

nardi · on July 10, 2020

I interview tons of candidates at a FAANG company, and I can tell you for sure binary search is way too hard to get right on a whiteboard. Even in a proper coding environment I wouldn't ask it. There are lots of better questions with fewer sharp edges.

Der_Einzige · on July 10, 2020

What else is hard besides converting

(high + low) / 2

To

Low + (high - low) / 2

What else about this problem is a "sharp edge"?

nardi · on July 10, 2020

I may have been a little too harsh. Binary search is close to being a good interview question, especially since the algorithm does, rarely, come up in real code. But I prefer questions that are more like what people will be doing in their day-to-day job, and binary search is just a little too fiddly.

CamperBob2 · on July 10, 2020

Me: "Show me a binary search in C on the whiteboard."

Candidate: bsearch(key, base, num, size, compare_fn);

Me: "Your offer should arrive within one to two weeks. Need any help with relocation expenses?"

nardi · on July 10, 2020

100% this. Maybe not everyone knows the specific function you’re looking for, but for the ones that know standard library APIs really well, that’s your hire.

User23 · on July 10, 2020

I wouldn’t say so. In theory you’d expect a FAANG engineer to be able to construct a correctly terminating loop by reasoning about the loop invariant, but in practice every single candidate gets lost in guesswork and messing up the edge cases and termination condition.

DaiPlusPlus · on July 10, 2020

Apparently modern GPUs have sorting-networks implemented in hardware (giving you effectively constant-time sorting for fixed-size input) - which may be preferable for use if you’re processing hostile input which may be contrived to hit QuickSort’s worst-case and the cost of hitting the PCI Express bus is less than the cost of randomizing the input to mitigate QuickSort attacks.

But yes - I agree it’s weird that many standard-libraries (looking at you, .NET!) have only a single `Sort` method oh their vector-types with the details of the implementation buried in the documentation instead of shipping with a first-class sorting-algorithms library.

david-gpu · on July 10, 2020

GPUs having hardware explicitly for sorting sounds a bit odd, when you can efficiently implement sorting networks in software on the same GPU.

If you have a source, I would like to learn more.

KajMagnus · on July 10, 2020

> "just run a final pass of insertion sort to make it useful!"

(And if this insertion sort won't complete in O(n), say <= 3n, then, fallback to Quicksort, so won't turn into O(n^2) just because the ML alg hit a corner case)

mkl · on July 10, 2020

Possibly better would be Timsort (used in Python, Java, V8, etc.), as it's "designed to take advantage of runs of consecutive ordered elements that already exist": https://en.wikipedia.org/wiki/Timsort

eru · on July 10, 2020

Haskell's standard library sort, a variant of merge sort, also makes use of runs.

It runs in O(n log (number of runs)). It's quite a lot simpler than Timsort.

YeGoblynQueenne · on July 10, 2020

>> So I am not sure I will ever trust an ML algorithm trained on inputs/outputs only (which is what I think "neural program induction" means).

According to the authors, the proposed approach is notable for training not only on input/output pairs but also on program traces. In addition, reinforcement learning is used to train e.g. sorting agents on the behaviour of "teacher agents" that comprise hand-crafted implementations of well-known sorting algorithms (bubble- insertion- and quick-sort).

eli_gottlieb · on July 10, 2020

>So I am not sure I will ever trust an ML algorithm trained on inputs/outputs only (which is what I think "neural program induction" means).

Pfffft, just wait until they get it writing code with its own proofs of correctness in dependent type theory. It'll be correct and twice as unreadable as incorrect human-written code!

riedel · on July 10, 2020

Why not check in $n$ on a sorted if the postcondition holds or raise an exception otherwise. You can even generate a nearly sorted list and run a verified bubblesort. Don't see big problems if ML is used carefully.

fdupress · on July 10, 2020

Checking that the output is sorted is the easy part.

Checking that the output is a (optionally stable) permutation of the input is the hard part. You can't do that dynamically if you have, for example, overwritten your input by sorting in place. And making a copy of your input is only going to be affordable in very specific and small instances on which you might as well just run a well-understood algorithm.

ogogmad · on July 10, 2020

It looks like their algorithm can only modify the input list by swapping elements. This is guaranteed to result in a permutation.