My first thought was to use linear transformations over Z_2 as a field, as that ...

thaumasiotes · 2024-12-07T05:39:38 1733549978

I did some fooling around with this and it seems promising.

The first problem is that, when your enciphering function is a matrix, f(0) = 0. This looks bad, but we can easily solve that problem by starting the webpage sequence at an index higher than 0.

I tried to work through a much smaller version of the problem* by hand, and it looks like this:

We have our enciphering matrix N:

    [[1 1 1 1 1]
     [1 0 0 1 1]
     [1 1 1 0 0]
     [0 1 1 1 0]
     [1 0 1 0 0]]

and our deciphering matrix D, the inverse of N:

    [[1 1 1 0 0]
     [0 0 1 0 1]
     [1 1 1 0 1]
     [1 1 0 1 0]
     [0 1 1 1 0]]

We want to find the next index whose encipherment ends in -110. This sets up a system of equations Dx = y, where x_3 = 1, x_4 = 1, x_5 = 0, and y tells us the index of x. By multiplying that out, we get:

    y_1 = x_1 + x_2 + 1
    y_2 = 1
    y_3 = x_1 + x_2 + 1
    y_4 = x_1 + x_2 + 1
    y_5 = x_2

So we can freely choose any values for y_1 and y_5, and the rest will be filled in by constraints.

Assuming we want the least possible value for y, this means we will pick y_1 = 0 and y_5 = 0, which then tells us that we want index [0 1 0 0 0], and we can jump to there. If we wanted the least possible value for y above a threshold (such as the current viewport), we'd pick y_n values accordingly.

Instinct tells me that libraries should exist for quickly solving systems of linear equations like this.

(For full full-text search, we'd need to do this several times, also finding solutions for the enciphered values 110xx, and x110x. This multiplies the work we need to do and the storage we need to use by an amount that is linear in the difference in length between the search string and the full UUID, which is still a lot better than trial-and-error.)

* I ended up doing it in 5 bits because every random 4x4 matrix I generated was noninvertible.

thaumasiotes · 2024-12-07T09:02:26 1733562146

Following up, using this encipherment scheme and starting from a 5-bit seed of 17, you produce values in this sequence:

5, 7, 31, 10, 18, 16, 8, 11, 19, 17, 9, 28, 4, 6, 30, 0, 24, 26, 2, 23, 15, 13, 21, 22, 14, 12, 20, 1, 25, 27, 3, 29.

This might pass an eyeball test for random when viewed as decimal numbers. (Although there sure are a lot of cases where adjacent values are 2 apart!) It looks much worse as binary. Here's every ones digit, all concatenated into a hex string, starting from 17 again:

e1e01e1f

Let's note that ~(e1e0) = 1e1f. Start from 16 instead and you'd see f0f00f0f. Here are the eights:

3332bbbc

Individual bits show pretty striking patterns. Since the UUID is reported in hex, that can be mitigated a little by the fact that each hex digit combines four binary columns. But so far it seems pretty likely that this would result in the list of UUIDs looking decidedly nonrandom. There might be quite a bit of shared material between adjacent UUIDs.

thaumasiotes · 2024-12-07T13:36:34 1733578594

correction: the run of the eights is 3332cccd, not 3332bbbc.

subleq · 2024-12-10T10:40:08 1733827208

Can you solve this in general without doing integer linear programming? How else would you know how to find the lowest index greater than the current? In the field GF(2), using 0 might not minimize.

eieio · 2024-12-07T16:14:32 1733588072

Hi! I do not know enough of the relevant math to appreciate what you're doing here (yet); I am going to try to do some reading to understand this better, but if you have any advice on where I should start I will gladly take it.

movpasd · 2024-12-19T15:40:24 1734622824

The important thing about matrices is their linear properties. Most proofs in linear algebra over real vector spaces don't rely on the particularities of the real number system, just the fact that it is a field [0].

To make an analogy with programming, you can think of a matrix as a generic container type. Normally, we think of the contained type as being the real numbers, but you could replace it with anything that supports the multiplication and addition operations -- that implements the "field interface/trait/typeclass". [1]

So you can define linear algebra over any field: the real numbers of course, but also the rational numbers, or the complex numbers. Fields can also be finite. In particular, the integers modulo n is a field if n is prime. My suggestion is to consider the case n=2. [2]

It's a bit hard to get geometric intuition about what matrices on finite fields even mean, but a lot of theorems we can prove in the real case generalize, because they depend only on the field properties of the reals. That's why I suggested it -- it would make things like inverting straightforward. And that natural relationship between linear subspaces and holding certain bits constant might make search possible.

The issue (which others pointed out) is that linearity imposes so much structure that it might not seem pseudo-random anymore. But you could consider something like XORing all numbers with a randomly generated bit string -- it won't hold up if someone looks closely but might fool the cursory human eye.

---

[0] https://en.wikipedia.org/wiki/Field_(mathematics)

[1] The analogy breaks down slightly because not only must the operations be defined, but they must also obey laws, which can't be expressed in (most) programming language type systems. But if you've used Haskell before, you'll have seen this before in the fact that a Monad has to obey the Monad laws.

[2] https://en.wikipedia.org/wiki/GF(2)