
The algorithm discussed in this post isn't actually the Sieve of Eratosthenes; it's an inefficient algorithm with performance worse than trial division. See https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf



The paper discusses the fundamental difference between the Sieve of Eratosthenes and Trial Division, then goes on to discuss a number of ways that the Sieve of Eratosthenes can be implemented in a pure functional style with greater and greater optimizations.

Since I wanted to talk about laziness and not priority queues, I eschewed trying to write the fastest possible implementation in favour of the simplest implementation that is still recognizable as “cross off every nth number.”

If you feel that the implementation is inefficient, I agree. But if you feel it isn’t actually the Sieve of Eratosthenes in some fundamental way, please explain the difference in a little more detail; it would be instructive to share with HN and me.


The Sieve of Eratosthenes is an algorithm that computes all the primes up to n in O(n log log n). It does this by eliminating composite numbers in a clever way: instead of checking each number, when it reaches it, for divisibility by the primes seen so far, it crosses off the multiples of each prime in advance, as soon as it sees that prime.
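For contrast, here is a minimal sketch of that eager, array-based sieve (the code and names are mine, not from the post):

```javascript
// Sieve of Eratosthenes: cross off multiples of each prime in advance,
// starting at p * p (smaller multiples were crossed off by smaller primes).
function sieve(n) {
  const composite = new Array(n + 1).fill(false);
  const primes = [];
  for (let p = 2; p <= n; p++) {
    if (composite[p]) continue; // never crossed off, so p is prime
    primes.push(p);
    for (let m = p * p; m <= n; m += p) {
      composite[m] = true; // cross off every pth number in advance
    }
  }
  return primes;
}

console.log(sieve(30)); // [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

No number is ever tested for divisibility; each composite is reached directly by counting in steps of p, which is where the O(n log log n) bound comes from.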

Your code maintains a nested "stack" of iterators: the outer iterator is a nullEveryNth() that consumes from a nullEveryNth()...that eventually consumes from a range(). There's one for each prime encountered so far. Each iterator passes through a number if it's not divisible by the corresponding prime. That's trial division.
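To make that concrete, here is a hypothetical reconstruction of the structure being described (the generator names are mine; the post's actual `nullEveryNth` differs in detail): each prime wraps the stream in one more filter layer, which is trial division by another name.

```javascript
// Infinite stream of integers starting at `from`.
function* range(from) {
  for (let n = from; ; n++) yield n;
}

// Pass through only the numbers not divisible by p.
function* dropMultiplesOf(p, iter) {
  for (const n of iter) {
    if (n % p !== 0) yield n;
  }
}

// Each prime found adds another filter to the stack of iterators.
function* primes() {
  let iter = range(2);
  while (true) {
    const p = iter.next().value; // the first survivor is prime
    yield p;
    iter = dropMultiplesOf(p, iter);
  }
}

const first = [];
const it = primes();
for (let i = 0; i < 6; i++) first.push(it.next().value);
console.log(first); // [2, 3, 5, 7, 11, 13]
```

Every candidate is checked against one filter per prime as it passes up the stack, which is why this is trial division rather than the sieve.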

You should be able to observe that your lazy implementation is asymptotically slower than the eager implementation (not just slower by a constant factor, but the running time grows more quickly).

The paper by O'Neill is a good read (although a little dense); you should check it out! https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

N.B. I think you added skipFirst to nullEveryNth since publishing this post, but it doesn't help get the algorithm down to the O(n log n log log n) performance that O'Neill describes.


I’m familiar with the O’Neill paper, and although my memory is hazy, I recall writing something along those lines in Scheme in college, when we were learning about its particular flavour of laziness (IIRC it involved lists where the head was a value and the tail was a thunk).

The thing is... I was aiming for the most literal translation of the human-readable instructions, which conveniently set things up to introduce the bug caused by using an eager version of `compact`, which segued into discussing the problem of mixing eager and lazy code in a language that uses the exact same interface for both.

The code I gave mimics what happens when a child is taught the method. They look at the list of numbers and count one TWO (cross off) one TWO (cross off)... one two THREE (cross off) one two THREE (cross off)... one two three four FIVE (cross off) one two three four FIVE (cross off) and so on.
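That counting procedure can be sketched eagerly like this (the code and names are mine, not the post's); crossed-off slots keep their place in the list, so counting “one, two, ..., p” lands on exactly the multiples of p:

```javascript
// Cross off by counting positions, the way a child works through the grid.
function childSieve(limit) {
  const nums = [];
  for (let i = 2; i <= limit; i++) nums.push(i); // null will mean "crossed off"
  for (let idx = 0; idx < nums.length; idx++) {
    if (nums[idx] === null) continue; // already crossed off: not a prime
    const p = nums[idx];
    let count = 0;
    for (let j = idx + 1; j < nums.length; j++) {
      count++; // count every slot, crossed off or not
      if (count === p) {
        nums[j] = null; // "one, two, ..., p (cross off)"
        count = 0;
      }
    }
  }
  return nums.filter((n) => n !== null);
}

console.log(childSieve(30)); // [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Because the crossed-off entries are nulled rather than removed, the positional count stays aligned with the multiples, which is what a lazy `compact` at the end relies on.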

It’s quite right and proper to advance from there to talk about how you can use multiplication to derive FIVE, TEN, FIFTEEN, TWENTY and so on directly, and to line up the grid of numbers so that you have faster access to an arbitrary number than traversing the list sequentially, and then you get the performance discussed.

But all of that would encumber the story I was telling.

As for the skipFirst, I added that when I decided to cite O’Neill’s paper, as it made for a literal translation of the introductory explanation she gives of the algorithm.

---

So now, I’m going to write a ‘proper’ version in JavaScript; I look forward to your feedback.


As someone who's also concerned with precision, I'd say your sieve is about 25% as bad as an O(n^2) quicksort. That's not horrible; whether you think it's okay is up to you.


When writing about X, quite often the algorithm that communicates “X” most succinctly ignores considerations “Y” and “Z.”

Sometimes, Y and Z would just get in the way. But then again, sometimes their omission distracts the reader and provokes a lot of bikeshedding.

There is no easy answer, and I often get this balance wrong.

This is, I think, what makes some writers so brilliant: They find a way to present “X” in a clear and strong way without making an absolute hash of “Y” and “Z.”

Such writing is a treasure.



