Problem: Sum all collatz steps from 0 to 10000000
https://en.wikipedia.org/wiki/Collatz_conjecture
First solution: Naive loop
Code: Select all
uint64_t steps(uint64_t n) {
uint64_t steps = 0;
while (n != 1) {
if (n % 2 == 0) {
n = n / 2;
steps++;
}
else {
n = (3 * n + 1) / 2;
steps += 2;
}
}
return steps;
}
Code: Select all
uint64_t steps(uint64_t n) {
uint64_t N = n;
//Make input odd
uint64_t steps = std::countr_zero(n);
if (steps != 0) n >>= steps;
while (n != 1) {
//Repeated expand until even
while (n & 1) {
n = (3 * n + 1) / 2;
steps += 2;
}
//Repeated division by 2 until odd
steps += std::countr_zero(n);
n >>= std::countr_zero(n);
}
return steps;
}
This can be solved via recursive expansion:
n0 = (3n+1)/2
n1 = (3*(3n+1)/2+1)/2 = (9 n + 5) / 4
n2 = (3*(9 n + 5) / 4 + 1) / 2 = (27n + 19) / 8
n3 = (3 * (27n + 19) / 8 + 1) /2 = (81n + 65) / 16
n4 = (3 * (81n + 65) / 16 + 1) / 2 = (243 n + 211) / 32
When looking at the numbers here we get the general collatz pattern:
http://oeis.org/A001047
n = ((3^i)*n + (3^i - 2^i)) / 2^i
in c++ world 3^i cannot be calculated quickly. But since its a normal f(x) => y function we can do a simple lookup table for 3^i and 3^i - 2^i:
Code: Select all
n = (table_pow3[expand] * n + table_pow3minuspow2[expand]) >> expand;
Code: Select all
uint64_t steps(uint64_t n) {
uint64_t steps;
//Make input odd
n >>= steps = std::countr_zero(n);
while (n != 1) {
//Repeated expand(e) 3x+1 until even
uint64_t e = std::countr_one(n);
steps += e * 2;
//General form:
//((3 ^ i) * n + (3 ^ i - 2 ^ i)) / 2 ^ i
n = (table_pow3[e] * n + table_pow3minuspow2[e]) >> e;
//Repeated division by 2(c) until odd
uint64_t c = std::countr_zero(n);
steps += c;
n >>= c;
}
return steps;
}
((3 ^ i) * n + (3 ^ i - 2 ^ i)) / 2 ^ i
The whole collatz problem could be compacted into a single formula.
This is not a solution to the problem if there are loops inside the collatz series or all numbers return to 1.
But it would be a piece of code that starts with a number and returns the number of steps without any loops.
All that because we found a general pattern:
How often can we apply x/2 to a number until its odd? : std::countr_zero(n) - very clear
How often can we apply (3x+1)/2 to a number until its even? : std::countr_one(n) - not so very clear
At the end I can say: The fastest code is code that is not called. So of course a series of numbers are perfect to lookup via dictionary but thats not a solution to the core problem.
For chess its exactly how you would write fast code.
1) implement a slow prototype.
2) replace all conditions with a branchless alternative (n = 1 << countr_zero(n)) would replace a while loop and is branchless.
3) get rid of all complex math and replace with branchless binary logic
4) find general patterns and use arrays to lookup the result directly without any binary logic at all
5) lookahead into a loop or recursive function.
Its so cool because chess is also a recursive problem. That shows that leapfrogging (generating moves + countermoves) in a single function could also be a very interesting approach to make things faster. And you could remove much code because its always "your own" move.