

Regularly divisible - michaeltwofish
http://www.alexbowe.com/education/regularly-divisible/

======
gjm11
The author constructs his regular expression ad hoc, by trying examples and
spotting patterns. His proof that it's correct consists of trying it for the
first 5000 non-negative integers. Here's a more principled approach for those
who prefer to know that, why, and how something works.

I'll use "@" instead of the usual asterisk to denote "repeated zero or more
times" to avoid HN formatting issues.

First, let's construct a finite state machine that does the job. It needs
three states, one for each value the number-so-far might have mod 3. You can
pretty much write this down immediately. State 0 goes to itself on seeing a 0,
and to state 1 on seeing a 1. State 1 goes to state 2 on seeing a 0, and to
state 0 on seeing a 1. State 2 goes to state 1 on seeing a 0, and to itself on
seeing a 1.

Now, there's a general technique for converting an FSM to a regular
expression. The idea is that you label edges with regular expressions and not
just single symbols. Then you eliminate states one by one, until you're left
with only the start state and the accepting end state. To eliminate a state y,
consider all x,z such that there are transitions x -> y -> z; replace y and
those transitions with a bunch of transitions straight from x to z, of the
form [what it takes to go x -> y] [what it takes to go y -> y]@ [what it takes
to go y -> z]. Repeat. (When you get multiple edges from one state to another,
use the | operator.) Eventually you just have a start state and an end state,
and then the label on the edge from one to the other is the RE you need.
Unless the start and end states are the same, in which case you need to allow
arbitrary repetitions of that RE, so add another star.

OK. So now we eliminate state 2. It happens that the only edge we need to add
goes 1 -> 1, and it's labelled 01@0. In other words: if you have something
that's 1 mod 3, then the bit-sequences that take you back to 1 mod 3 while
visiting only {1 mod 3, 2 mod 3} are those of the form: 0, maybe some 1s, 0.

So now we have states 0 and 1, with transitions 0 -> 0 (labelled 0), 0 -> 1
and 1 -> 0 (both labelled 1), and 1 -> 1 (labelled 01@0). State 0 is both the
start state and the accepting end state.

And now we eliminate state 1. In just the same way, the only edge we need to
add goes 0 -> 0, and it's labelled ... well, we have 0 -> 1 [1], then zero or
more of 1 -> 1 [01@0], then 1 -> 0 [1]. In other words: 1(01@0)@1.

So now we have only one state, which has the 0 -> 0 transition labelled 0 that
it always had, plus a new one, labelled 1(01@0)@1. That's equivalent to a
single transition 0 -> 0, labelled 0|1(01@0)@1. So, finally, our RE is
obtained by starring that: (0|1(01@0)@1)@. Where, to repeat, "@" means what's
usually denoted by an asterisk.

This RE is actually simpler than Alex's: fewer characters and fewer stars. I
don't know which will execute faster in a typical RE engine.

[EDITED to fix asterisk screwage -- I had no idea that HN italicization works
across paragraphs -- and to add: while I was writing this, _qntm_ made a
similar comment. Except that I think constructing the FSM is trivial and
converting it to an RE is nontrivial, whereas _qntm_ thinks constructing the
FSM is nontrivial and converting it to an RE is trivial. Hopefully the
combination of _qntm_ 's comments and mine will make everything clear.]

------
qntm
A simpler solution is to use modular arithmetic (this works for any base).

If we consume a binary number from left to right, let N be the binary value
already expressed. If we then consume a 0, then the new value is now 2N. If we
consume a 1, then the new value is 2N+1.

Create a 3-state machine where state 0 represents a running value of N mod 3.
Starting state is 0. Follow all the transitions. e.g. from state 2 (N = 2 mod
3), if we consume a 0, we are now at 2N + 1 = 1 mod 3, so f(2,0) = 1. Full
transition function is:

    
    
        f(0,0) = 0, f(0,1) = 1
        f(1,0) = 2, f(1,1) = 0
        f(2,0) = 1, f(2,1) = 2
    

This is trivial to turn into the regex (0|1(01* 0)* 1)* (had to add spaces to
stop italicisation). If leading zeroes are a problem you have to modify the
FSM a little which makes the resulting regex a little more complicated.

~~~
qntm
I should probably clarify that it only becomes trivial to turn that FSM into a
regex when you've drawn it out! I encourage HNers to do this and see what I
mean. In my experience, just inspecting the transition function visually isn't
much use at all.

------
patio11
You can construct a regular expression / state machine for these fairly
easily.

Let X be the number we have seen so far.

State A: X mod 3 = 0 State B: X mod 3 = 1 State C: X mod 3 = 2

We'll need a separate start state if you swing that way.

Now, observe what happens if we have interpreted a binary number and then
suddenly stick a 0 at the end of it. It gets multiplied by 2. Not shocking. If
we stick a 1 at the end of it, it gets multiplied by 2 and one gets added.
Still not that hard.

OK, I'm going to rename our states now. Instead of X, we're going to describe
them in terms of k. k is some positive integer which is a multiple of 3 -- we
don't care which.

A: k + 0 B: k + 1 C: k + 2

Now, if we multiple k by 2, we get 2k, which we know is also going to be
divisible by 3 because k is. So a 0 transitions A to A.

If we multiply k by 2 and add 1, we get 2k + 1. 2k + 1 % 3 = 1, because 2k is
evenly divisible by 3. So a 1 transitions A to B.

The next four transitions are left as an exercise for the reader -- they're
pretty trivial.

You can use this same general pattern to construct any "divides by N" state
machine.

Going from a state machine to a regular expression (particularly the smallest
possible regular expression) is a bit trickier.

------
robinhouston
Just to link the whole thing up (in case someone finds this thread via Google,
for example), here's my demonstration of the general algorithm that can find a
regular expression for testing divisibility by _any_ number, in _any_ base:
[http://s3.boskent.com/divisibility-regex/divisibility-
regex....](http://s3.boskent.com/divisibility-regex/divisibility-regex.html)

------
bmm6o
I wrote up something similar a while back:
[http://bmm6o.blogspot.com/2008/03/divisibility-testing-
and-p...](http://bmm6o.blogspot.com/2008/03/divisibility-testing-and-
pattern_27.html) .

------
Syama
has anyone considered that all numbers where the sum of digits are divisible
by 3, the whole number is divisible by 3?

------
vijaydev
Interesting problem. Well explained.

