Hacker News new | past | comments | ask | show | jobs | submit login
Sattolo's Algorithm (2017) (danluu.com)
100 points by laybak 4 days ago | hide | past | favorite | 20 comments

The motivation for writing this amuses me. The Wikipedia prose "proof" is frustratingly familiar to many math textbooks I had when getting my BS many years ago. Fortunately, I had a couple professors who insisted on properly structuring proofs so that they were, you know, actually readable and understandable.

That Wikipedia prose is like the lazy author's approach, it's a brain dump and not a real attempt to inform or educate. It is nice that there is Python code in the relevant section on the algorithm, that's informative (and why I like Python, it's executable pseudocode). But it would also be nice to have properly structured proofs to go along with the awful prose section (and either remove it or clean it up).

Also, here's the discussion from when this was first posted: https://news.ycombinator.com/item?id=14967697

I don't know if it is math but formal scientific writing can make even simple concepts hard to grasp for me, a developer.

During my studies, I had to assist to a conference by a mathematician explaining some image processing algorithm. It took me half an hour to realize that all that was a good old "make each pixel the weighted average of its surroundings" blur, also known as a convolution filter. And I spent the other half hour completely lost trying to understand what was special about his blur. I think no one actually understood anything, in fact, I probably did better than most because I was familiar with convolution filters from personal experience.

Had he explained it in terms of pixels and loops instead of what was probably (for us) a Greek letter soup, I am sure we would have understood.

There is a chronic problem in audio/image processing where domain experts have developed their own notation and terminology and do a really poor job connecting it to the simple concepts they represent.

> Had he explained it in terms of pixels and loops instead of what was probably (for us) a Greek letter soup, I am sure we would have understood.

Unfortunately concretizing the math to realizations leaves you with a cookbook for implementing the least interesting bits of signal processing. Looking at some matrix multiplication as for loops doesn't teach you how a filter works.

Like just to give an example, I don't think it's possible to come up with an explanation of the Remez Exchange Algorithm in any physically meaningful context since it's fundamentally about recognizing the mathematical properties of a higher order abstraction of the system designed with it. But we can be a bit smarter about how we write, since "an iterative algorithm used to find simple approximations to functions, specifically, approximations by functions in a Chebyshev space that are the best in the uniform norm L∞ sense" is Wikipedia word soup that belies the concepts and connections that make it useful. It's just hard to do it in a sentence and not a semester.

I had the same experience as a math undergrad. Too many professors seem to have forgotten what it was like to be a student. They write books that are great reference material for their colleagues but which are utterly opaque to a student who is still working to build their foundation.

Textbooks often format proofs as a sequence paragraphs, but this really obscures the logical structure of the argument. As a student, I found it very helpful to write down proofs as nested lists [1,2] instead.

Admittedly, now that I have more experience, I do like my proofs to be more terse when I'm communicating them just to myself.

[1]: https://benrbray.com/static/notes/separating-hyperplane-thm_... [2]: https://benrbray.com/static/notes/minimax-approx_nov16.pdf

Your examples are, more or less, how I was taught to construct proofs. It's just much clearer and easier to read/follow than the prose-style in that Wikipedia page. And you can always supplement one with the other, just like the code example. Technically unnecessary since there was a prose description of the algorithm, but from a readability/understandability perspective it's immensely useful.

Wikipedia is notoriously useless for learning anything even vaguely mathematical. It's too bad there's not a "Wikimath" for people who aren't mathematicians.

> Wikipedia is notoriously useless for learning anything even vaguely mathematical. It's too bad there's not a "Wikimath" for people who aren't mathematicians.

The nlab and wiki2.org are nice for the subjects they cover.

Weird, I was actually searching for a function that would do a "permutation of a list with exactly one cycle" an hour ago. I found Sattolo's, but found its use of random() disagreeable - it would depend on external/stdlib code and generate a new sequence every time if the seed is not fixed. Instead I looked into key schedules for cryptographic ciphers and found an easy solution - the key schedule function for RC4 for example. (My use does not care about the flaws/biases). It's a couple of lines of code, generates the same permutation for the same input+key - just use a couple of arbitrary hardcoded bytes as the key and you are done.

If you want a quick, easy and good quality random number generator, you might want to look at xorshifts.

There are a lot of variations, but I particularly like the 128 bit version from the original paper [1].

Here is the entire code

  unsigned long xor128() {
    static unsigned long x=123456789,y=362436069,z=521288629,w=88675123;
    unsigned long t;
    return( w=(wˆ(w>>19))ˆ(tˆ(t>>8)) );
Note: Do not use for security applications! It is not cryptographically secure. But if you need that, or if you are running Monte Carlo simulations and want nothing but the best, you probably should use specialized libraries anyways.

[1] https://www.jstatsoft.org/index.php/jss/article/view/v008i14...

Edit: And since we are in the subject of picking random numbers for Sattolo's algorithm. Using a simple modulo operator for get a range from a single random number going from 0 to 2^n-1 (like most RNGs produce) can introduce a bias. If the list is short and the numbers are large, it doesn't make that big a difference, but it is something to consider.

In case anyone is wondering how to do an unbiased choice of the numbers 0 to N given an RNG that generates numbers 0 to 2^Y-1, where 2^Y-1 >= N, there are two ways, both rejecting some samples:

   1. Find the smallest Z s.t. 2^Z-1 <= N, generate a random number, bitwise-and with 2^Z-1, accept the result if the result <= N, otherwise repeat.  The expected number of times through the loop depends on Y and N, but is strictly less than 2.

   2. Let W = floor((2^Z-1) / N), generate a random number, if the number < N*W, return the number mod N, otherwise repeat.  The expected number of times through the loop depends on N and Y, but is no more than the first algorithm.  The downside is that division and modular division are much slower than bitwise-and on most processors.
If division or modular division (on many processors, the same instruction) is much slower than bitwise-and, and your PRNG is pretty fast (such as xorshift / xorishiro), you're better off with the first algorithm.

Note that finding the mask from the first algorithm on a 64-bit machine is as simple as:

  mask = N; for( i = 1; i < 64; i *= 2) { mask |= mask >> i; }  // Check if your compiler unrolls this loop

Something like this:

  def sattolo(a, key):
      n = len(a)
      keylength = len(key)
      j = 0
      for i in range(n - 1):
          j = (j + a[i] + key[i % keylength]) % n
          a[i], a[j] = a[j], a[i]
(key needs to be a byte string because it's easier that way)


Doesn't work for producing a singlar path:

  >>> x = [0,1,2,3,4,5,6,7,8,9]
  >>> sattolo(x, b'zanzibar')
  >>> x
  [0, 2, 1, 5, 4, 3, 8, 6, 9, 7]

> permutation of a list with exactly one cycle

I'm having trouble parsing the motivation and terminology. When a list has one cycle, does that mean it repeats once? Or is this basically 'i want a random but stable order of events'?

Permutations can be written as the "product" of "cycles", as described here [1].

If you choose an element belonging to a given cycle, then generate new elements by repeatedly applying the permutation, all the elements you generate will belong only to that cycle.

So, if your permutation can be written as the product of more than one cycle, and you choose a random starting element and generate new elements by successive application of the permutation, you will only generate a strict subset of all the possible elements.

On the other hand, if your permutation only has a single cycle, then that procedure will generate all the elements.

[1] https://en.wikipedia.org/wiki/Permutation#Cycle_notation

Hmm. I think my fundamental disconnect here was the idea of permutation as a repeated process. As a formal term it's really ever come up in a discrete mathematics class comparing it with combinations, on the subject of counting. Here we seem to be defining permutation as a function that can be repeatedly applied.

Looks like the first sentence of wikipedia confirms it's an overloaded word:

>In mathematics, a permutation of a set is, loosely speaking, an arrangement of its members into a sequence or linear order, or if the set is already ordered, a rearrangement of its elements. The word "permutation" also refers to the act or process of changing the linear order of an ordered set.

I'm curious what you are doing where you have access to cryptographic key schedules, but not access to a use of "random()"?

Just learning a new language, so the requirement is self-imposed. But the fun thing with RC4 key schedule is that it doesn't have to be provided, one can just implement it based on the Wikipedia code. I have long found RC4 exciting to implement when trying out new languages - I get a quick dopamine hit because I implemented something... real, and it has several other uses besides encryption - especially when just toying. Today I learned I can use it to shuffle arrays.

Does anyone have an example for when you'd want to use this algorithm?

Sometimes you'd like to visit all elements of some logical array in a random order, with each element pointing to the next, e.g., to benchmark the speed of pointer chasing. Using Sattolo, you can set up such a full-sized linked list fairly simply, e.g.:


I have used it a half dozen times in more or less that specific context.

Random ordering of elements in a list is often used in maze generation. It's possible that a permutation that had only one cycle might produce a maze with a particular structure.

Incremental buffer copy?

Applications are open for YC Winter 2021

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact