There's a shorter way without a bunch of mysterious constants:
// decode a 5-digit base52 number
unsigned long b52(const char *c) {
unsigned long factor = 1;
unsigned long result = 0;
for(int i = 0; i < 5; i++) {
result += factor * b52map[c[i]];
factor *= 52;
}
return result;
}
(52^5 < 2^32 so an 'unsigned long' is plenty; 52^11 < 2^64 so you could get an 11-digit base52 number into an 'unsigned long long'. I'd be tempted to generalize the function to handle 0-11 digits, by requiring the parameter to be null-terminated or passing in a length.)
Yeah, a ull is overkill. This code comes from a furious couple weeks of coding, an attempt at being clever and going for speed. I'm not sure if I accomplished either. The thinking at the time was that the constants might cache better. I know, don't be clever.
How could that possibly make a difference for the intended application? Is the server CPU-maxed and never faces IO waits where more calculation is free?
If speed were a valid goal (and it's not), then:
(1) You would only use overkill 64-bit integers if on a 64-bit processor and profiling had proven them to be faster.
(2) Precalculating the factors would speed the clearer code, possibly making it faster:
const unsigned long factors[5] =
{ 1, 52, 52*52, 52*52*52, 52*52*52*52 };
// decode a 5-digit base52 number
unsigned long b52(const char *c) {
unsigned long result = 0;
for(int i = 0; i < 5; i++) {
result += factors[i] * b52map[c[i]];
}
return result;
}
...or unroll the loop...
// decode a 5-digit base52 number
unsigned long b52(const char *c) {
unsigned long result = b52map[c[0]];
result += (52) * b52map[c[1]];
result += (52*52) * b52map[c[2]];
result += (52*52*52) * b52map[c[3]];
result += (52*52*52*52) * b52map[c[4]];
return result;
}
(Here I'm trusting the compiler to precalculate the powers of 52, but the code is still clearly 'parsing a base' rather than 'mysterious multiplications and divisions'.)
(3) You would strongly consider using a Base64-based approach, so all conversions involve bit-shifts instead of multiplication. If certain characters can't be used -- eg vowels -- you might still prefer large gaps in your ID series. (If you're using a SQL database, it won't care -- but then again, if there's a SQL database anywhere in the system, its operations will make these micro-optimizations even more ludicrous than they are already.)