
How many reversible integer operations do you know? - espeed
https://lemire.me/blog/2016/08/09/how-many-reversible-integer-operations-do-you-know/
======
rcthompson
Since the domain and range of any integer operation are the same, the only way
for such an operation to be reversible is if it maps every input to a
different output. That means that the set of reversible integer operations is
exactly the set of permutations on the integers (for a specified number of
bits). So how many unique permutations are there for the set of 32-bit
integers, which is a set of size 2^32? That would be (2^32)!, a very large
number indeed.

Incidentally, multiplying by an odd integer works because when working in mod
N space, multiplying by any number that is coprime with N yields a
permutation, and since for 32-bit integers N = 2^32, the only prime factor of
N is 2, which means that N is coprime with all odd integers. (I'm assuming
unsigned integers and wrap-around behavior on over/underflow.)
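A quick sketch of that fact in Python (my code, not from the post) — the multiplier here is an arbitrary odd constant, and `pow(m, -1, M)` computes its modular inverse:

```python
# Multiplication by an odd constant is a bijection mod 2^32, and the inverse
# map is just multiplication by the modular inverse of that constant.
M = 2 ** 32
m = 0xDEADBEEF | 1        # any odd multiplier works; this one is arbitrary
m_inv = pow(m, -1, M)     # exists because gcd(m, 2^32) = 1 (Python 3.8+)

def forward(x: int) -> int:
    return (x * m) % M

def backward(y: int) -> int:
    return (y * m_inv) % M

assert backward(forward(123456789)) == 123456789
```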

------
throwaway080383
As others have pointed out, there are (2^32)! such operations on 32-bit ints.
However, I think it can be shown that the Kolmogorov complexity of most of
these has to be on the order of

lg((2^32)!)

which I think is roughly 32*2^32 bits.

In other words, you'd need about 16GB just to store the program to compute the
permutation! Of course, that is not the case for the operations shown here.

So maybe implicitly the real question is, "How many reversible integer
operations do you know with small Kolmogorov complexity?" Or in more practical
terms, "How many reversible integer operations do you know which don't require
too many lines of code?"

~~~
ecesena
I suspect he’s also implicitly assuming “the inverse is easy to compute”, so
both the operation and its inverse have small K complexity.

~~~
ogdan
If f has small K complexity, then f^-1 has small K complexity.

~~~
nwjtkjn
Is there a formal theorem making this precise?

~~~
gizmo686
I'm not familiar with the formal definition of Kolmogorov complexity, nor its
related theorems, but informally, it appears to be about the _length_ of the
specifying program, not the time it takes to run.

Given that we have a finite domain and a 1:1 function, we should be able to
specify f^-1 with constant overhead by embedding f.

Something along the lines of:

    
    
      // embed the definition of f
      f_inverse(x) = for a in Domain:
          if f(a) == x:
              return a
    

Formalizing this would involve specifying what description language you are
using and how you encode functions.
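A runnable version of that sketch (Python, on an 8-bit domain so the exhaustive search stays cheap; the odd-multiplier bijection is just an arbitrary example f):

```python
# Inverting a bijection on a finite domain by exhaustive search: the
# description of f^-1 is the description of f plus this constant-size loop.
DOMAIN = range(1 << 8)   # an 8-bit domain keeps the search fast

def f(x: int) -> int:
    # arbitrary example bijection: multiply by an odd constant mod 2^8
    return (x * 77) & 0xFF

def f_inverse(y: int) -> int:
    for a in DOMAIN:
        if f(a) == y:
            return a
    raise ValueError("f is not surjective on this domain")

assert all(f_inverse(f(x)) == x for x in DOMAIN)
```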

------
espeed
Scott Aaronson, Daniel Grier, and Luke Schaeffer did a paper on this a few
years back...

The Classification of Reversible Bit Operations
[https://www.scottaaronson.com/papers/gates.pdf](https://www.scottaaronson.com/papers/gates.pdf)

Grier's ITCS presentation on it
[https://www.youtube.com/watch?v=egU5JLbpmkA](https://www.youtube.com/watch?v=egU5JLbpmkA)

BTW see Lehmer codes for permutation sequences
[https://en.wikipedia.org/wiki/Lehmer_code](https://en.wikipedia.org/wiki/Lehmer_code)
(no relation to Lemire ;)

------
js8
That's really a strange question to ask, a bit like: "What is the largest
integer you know?"

You can start with a reversible gate (such as
[https://en.wikipedia.org/wiki/Toffoli_gate](https://en.wikipedia.org/wiki/Toffoli_gate)
or
[https://en.wikipedia.org/wiki/Fredkin_gate](https://en.wikipedia.org/wiki/Fredkin_gate))
and combine these.

~~~
jws
Not explicit in the title is “efficiently on a modern CPU”. They tend to be 1
to 4 opcodes each. I think the list might serve as a jumping off point if you
are looking at an algorithm which depends on an integer permutation and want
to consider how it would change if you used a different permutation.

And missing from the list—the identity function! Very fast to compute and
universally (not) implemented on all CPUs.

~~~
haimez
The identity function is not an interesting opcode because there's already the
NOOP code and because compilers are expected to be able to identify the
identity transform with great accuracy.

------
qmalzp
Fun fact: compositions of the two operations

x -> x + 1

and

x -> (x==0) ? 1 : (x == 1) ? 0 : x

generate all possible operations.

~~~
rbehrends
Well, this is just a roundabout way of stating that the symmetric group S_n is
generated by (1, 2) and (1, 2, ..., n) for all n. :)
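For a small n this is easy to verify by brute force (a Python sketch of my own; the two generators are the increment-mod-n map and the swap of 0 and 1):

```python
# Check that x -> x+1 (mod n) and the transposition of 0 and 1 together
# generate every permutation of {0, ..., n-1}, for a small n.
import math

n = 5
identity = tuple(range(n))
inc  = tuple((i + 1) % n for i in range(n))              # the n-cycle
swap = tuple({0: 1, 1: 0}.get(i, i) for i in range(n))   # the swap (0 1)

def compose(p, q):
    # apply q first, then p
    return tuple(p[q[i]] for i in range(n))

generated = {identity}
frontier = [identity]
while frontier:
    p = frontier.pop()
    for g in (inc, swap):
        c = compose(g, p)
        if c not in generated:
            generated.add(c)
            frontier.append(c)

assert len(generated) == math.factorial(n)   # all n! permutations reached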

------
analog31
I'm thinking there should be n! possible reversible operations on integers
drawn from a domain of size n.

------
nasso
I don't really get the point of this. Why does it matter if it's reversible?

~~~
krastanov
It is a pretty interesting question in fundamental physics (but we are quite
far from considering it in practice, except in some quantum computing
architectures).

See
[https://en.wikipedia.org/wiki/Reversible_computing](https://en.wikipedia.org/wiki/Reversible_computing)

One of the main points is that it would permit significantly lower waste heat
generation.

~~~
plus
You are discussing a very different use of the word "reversible". This blog
post is discussing mathematical operations on 32-bit integers where the
original number can be recovered through an inverse operation. You are talking
about the concept of physical reversibility, which refers to processes in
which the change in entropy is (close to) zero. The first is a mathematical
concept relating to operations over the ring of 32-bit integers, while the
latter is a physical concept related to the second law of thermodynamics. I
guarantee that XORing a 32-bit integer with a fixed value is _not_ a
thermodynamically reversible process on any known architecture.

~~~
krastanov
This was my point when I said "far from considering it in practice". An
operation has to be (a) logically reversible (thermodynamically reversible in
theory, like XOR) before you can make (b) real reversible hardware with it
(thermodynamically reversible hardware implementation). Most operations we use
in typical hardware are obviously not (b), but they are also not even (a).

------
twotwotwo
For folks confused that the blog post doesn't just say "2^INT_BITS factorial",
I think the idea is that you want _efficient_ reversible operations on modern
CPUs. These can be useful in, for example, cryptography and non-cryptographic
checksums and RNGs. Even if you don't actually ever run the inverse function,
using an invertible op ensures that no information from the input is dropped
and every output is possible. Of course, you also need the invertible function
to have whatever other properties you want for your particular use, like
diffusion or nonlinearity. The XOR-three-rotations operation, for example, is
used in the "sigma" functions providing diffusion in SHA-2, though you never
use the inverse.
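The invertibility of that kind of map is easy to check exhaustively at a small word size — a Python sketch with 8-bit words and arbitrarily chosen rotation amounts, not SHA-2's actual constants:

```python
# XOR of an odd number of rotations of a word is a linear map over GF(2);
# brute-forcing all 8-bit inputs shows this particular one is a bijection.
N = 8
MASK = (1 << N) - 1

def rotr(x: int, r: int) -> int:
    return ((x >> r) | (x << (N - r))) & MASK

def sigma(x: int) -> int:
    # rotation amounts 2, 3, 7 are arbitrary for the demo
    return rotr(x, 2) ^ rotr(x, 3) ^ rotr(x, 7)

outputs = {sigma(x) for x in range(1 << N)}
assert len(outputs) == 1 << N   # every output hit exactly once
```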

With a block of >1 integer, there are some additional fun ones:

\- XOR an even number of integers together, then XOR the resulting value, or
any function of it, into all of them. (It's its own inverse, since the XOR of
an even number of integers isn't changed by XORing the same value into all of
them.)

\- Bitslicing S-boxes, a potentially _nonlinear_ operation. You can represent
S-boxes (esp. small ones) as a network of gates, and you can turn those into
AND/OR/XOR/NOT instructions and have your CPU run them a whole register at a
time.
This was core to the AES candidate Serpent using 32-bit registers; now you
have larger regs to do it with.

\- Some matrix multiplications (including in fields). Thinking of AES's use of
MDS matrices.

\- Generalized Feistel-like operations: run some function--not necessarily
reversible--on part of the input, and use the result to munge another part of
the input. a ^= f(b) or (some of the generalizations) a += f(b) or a ^= f(b,
c, d). Used tons of places, of course. Then you can easily build an invertible
function (on the larger block) from a non-invertible round function, and
repeat for a zillion rounds (a ^= f(b, k1); b ^= f(a, k2); a ^= f(b, k3)...).
With a loose enough definition of "Feistel-like", even lots of ARX ciphers
could qualify.
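The even-XOR trick in the first item above is easy to check in code (a Python sketch; the block contents are arbitrary 32-bit words):

```python
# XOR all words together, then XOR the result into every word. With an even
# number of words the block's combined XOR is unchanged by the operation,
# so the operation is its own inverse.
from functools import reduce
from operator import xor

def mix(block):
    s = reduce(xor, block)
    return [w ^ s for w in block]

block = [0x12345678, 0x9ABCDEF0, 0x0F0F0F0F, 0xDEADBEEF]   # even length
assert mix(mix(block)) == block
```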
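And a minimal sketch of the Feistel idea from the last item (Python, 32-bit halves; the round function and key constants are arbitrary toy choices, not any real cipher):

```python
# The round function f is deliberately non-invertible (squaring mod 2^32
# loses information), yet the two-word operation built from it inverts exactly.
MASK = (1 << 32) - 1

def f(x, k):
    return ((x * x) + k) & MASK

KEYS = [0x9E3779B9, 0x7F4A7C15, 0x85EBCA6B]   # arbitrary round constants

def encrypt(a, b):
    for k in KEYS:
        a, b = b, a ^ f(b, k)       # swap halves, munge one with the other
    return a, b

def decrypt(a, b):
    for k in reversed(KEYS):
        a, b = b ^ f(a, k), a       # undo the rounds in reverse order
    return a, b

assert decrypt(*encrypt(0x01234567, 0x89ABCDEF)) == (0x01234567, 0x89ABCDEF)
```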

(You could, of course, do some of these things within a single integer by
thinking of it as two, four, etc. narrower ints, or you could call a whole
block a very large int. Mostly thinking of register-sized integers when I say
integers, FWIW.)

Keccak (SHA-3) has some steps where, like with the XOR-three-rotations
thing, an inverse exists but looks a bit different from the original
operation, which is neat! There's a reference
([https://keccak.team/files/Keccak-reference-3.0.pdf](https://keccak.team/files/Keccak-reference-3.0.pdf))
describing the operations using math notation, and lots of C implementations
([https://github.com/rhash/RHash/blob/master/librhash/sha3.c](https://github.com/rhash/RHash/blob/master/librhash/sha3.c)).
It looks like the team published code for the inverses too; see
[https://github.com/gvanas/KeccakTools/blob/master/Sources/Ke...](https://github.com/gvanas/KeccakTools/blob/master/Sources/Keccak-f.h)

