"regular" has a fairly simple mathematical definition: the set of languages that can be matched with a finite state automaton.
You can think of this as the languages that can be matched with an algorithm that is bounded (i.e., O(1) ) in memory, no matter what is the string to be matched.
The following pseudocode is not bounded in memory -- can you guess why?
bool is_prime(Number n)
{
for (Number i = 2; i < n; ++i) {
if (n % i == 0) {
return false;
}
}
return true;
}
The value i can take an a priori unbounded amount of memory.
It should be noted, so can the value n, but in defining regular languages in this way, we allow our O(1) memory algorithms to be fed the characters of an a priori unbounded input string one by one sequentially.
Yeah, the O(1) refers to the size of the EXTRA memory you need, aside from the input. Defining it any other way would be strange. So, for instance, binary search over a sorted list needs O(log n) of memory (for the same reason as the prime algorithm, the pointer in the array is size O(log n) of the size of the array), even though the input is O(n).
Yup. Note that the method of specifically being fed the input sequentially, one by one, means that, for example, "Strings of odd length whose middle character is 'X'" does not comprise a regular language, even though one might naively reason this to be trivial to detect with O(1) memory ("Just go look at the middle character!").
Yeah, obviously, there are lots of ways to improve that code. First off all, much more efficient to to do (i * i <= n) instead of (i <= sqrt(n)), or at the very least you can cache sqrt(n) so you don't recalculate it every step.
You can also do a separate test for 2, then start at 3 and use i+=2 instead of i++, that cuts the test time in half. Once you've done that, you can observe that all primes are 6k +/- 1 except 2 and 3, which makes your code 3 times faster than the naive version.
At this point, you've basically got a simple wheel primality testing algorithm going, so why not go all the way?
I don't think the parent poster was making the point that his was the most efficient algorithm for testing primes. I think his point was that regular numbers are O(log n) in CS terms, not O(1), which is not obvious at first.
You can think of this as the languages that can be matched with an algorithm that is bounded (i.e., O(1) ) in memory, no matter what is the string to be matched.
The following pseudocode is not bounded in memory -- can you guess why?