Re: Removing Large Arrays
Posted: Tue Mar 10, 2020 5:52 pm
At least they shouldn't call it "simplification", as this doesn't simplify anything under any definition. Looks like change for the sake of change.
AndrewGrant wrote: ↑Tue Mar 10, 2020 1:41 pm
I've long disagreed with a certain user's string of "simplifications"....
if it succeeds you gain nothing but obfuscate a complex code base even further.

In the past, I've often noticed that C programmers tend to try to make code as brief as possible.
Code: Select all
let mask = if is_rook {
    create_rook_mask(square)
} else {
    create_bishop_mask(square)
};
let bits = mask.count_ones();
let permutations = 2u64.pow(bits);
let end_offset = offset + permutations - 1;
...
Code: Select all
let m = if isr { crrm(s) } else { crbm(s) };
let b = m.count_ones();
let p = 2u64.pow(b);
let e = o + p - 1;
Code: Select all
let m = if isr { crrm(s) } else { crbm(s) };
let e = o + 2u64.pow(m.count_ones()) - 1;
Code: Select all
fn get_stuff() -> u64 {
    let mut number = random::generate();
    number += current_time_stamp();
    return hash(number);
}
Code: Select all
fn get_stuff() -> u64 {
    hash(random::generate() + current_time_stamp())
}
Code: Select all
// gcc -O3 -Wall -m64 test.c -o test -lm -s
// speed test method for different functions
#include <stdio.h>
#include <stdint.h>
#include <time.h>
#define rank_of(sq) ((sq) >> 3)
#define file_of(sq) ((sq) & 0x7)
#define _max(a, b) ((a) >= (b) ? (a) : (b))
#define _min(a, b) ((a) < (b) ? (a) : (b))
const int PushToEdges[64] = {
    100, 90, 80, 70, 70, 80, 90, 100,
     90, 70, 60, 50, 50, 60, 70,  90,
     80, 60, 40, 30, 30, 40, 60,  80,
     70, 50, 30, 20, 20, 30, 50,  70,
     70, 50, 30, 20, 20, 30, 50,  70,
     80, 60, 40, 30, 30, 40, 60,  80,
     90, 70, 60, 50, 50, 60, 70,  90,
    100, 90, 80, 70, 70, 80, 90, 100
};
enum { RANK_1, RANK_2, RANK_3, RANK_4, RANK_5, RANK_6, RANK_7, RANK_8, N_RANK };
enum { FILE_A, FILE_B, FILE_C, FILE_D, FILE_E, FILE_F, FILE_G, FILE_H, N_FILE };
// static inline: a plain C99 "inline" definition emits no out-of-line copy,
// which can fail to link when a call is not inlined (e.g. at -O0)
static inline int edge_distance_f(int f) { return _min(f, (FILE_H - f)); }
static inline int edge_distance_r(int r) { return _min(r, (RANK_8 - r)); }
static inline int push_to_edge(int s) {
    int rd = edge_distance_r(rank_of(s)), fd = edge_distance_f(file_of(s));
    return 90 - (7 * fd * fd / 2 + 7 * rd * rd / 2);
}
uint64_t time_in_ms() {
    struct timespec t;
    clock_gettime(CLOCK_MONOTONIC, &t);
    return (uint64_t)t.tv_sec * 1000 + t.tv_nsec / 1000000;
}
int main() {
    uint64_t c,d,e;
    uint64_t start; // was int: a millisecond timestamp does not fit an int forever
    printf("10 second test ...\n");
    start = time_in_ms();
    e = 0;
    while((time_in_ms() - start) < 10000) {
        for (c = 0; c < 64; c++) {
            d = PushToEdges[c];
            e++;
        }
    }
    // %I64d is MSVC-specific; a cast plus %llu is portable
    printf(" PushToEdges %llu\n",(unsigned long long)e);
    start = time_in_ms();
    e = 0;
    while((time_in_ms() - start) < 10000) {
        for (c = 0; c < 64; c++) {
            d = push_to_edge(c);
            e++;
        }
    }
    printf(" push_to_edge %llu\n",(unsigned long long)e);
    return 0;
}
D Sceviour wrote: ↑Tue Mar 10, 2020 8:30 pm
Often, I test the speed of different function methods. The results indicate the fixed array PushToEdges[] seems to be faster on average:
10 second test ...
PushToEdges 16519006400
push_to_edge 16610155136

Nonsense, the compiler completely eliminates the inner loops, so it boils down to nothing (busy-loop for 10 seconds).
Code: Select all
10 second test ...
PushToEdges 3889291392
push_to_edge 2689210048
mar wrote: ↑Tue Mar 10, 2020 8:39 pm
nonsense, the compiler completely eliminates the inner loops, so it boils down to nothing (busy-loop for 10 seconds).

It is not nonsense. For one thing, you can understand the busy-loop. I only spent about 20 minutes writing this example, and there are better ways to do it. The timer could be checked much less frequently to give the busy-loop more activity. Still, I have been using this type of method for a long time and have no complaints about the methodology. The method does not account for the effects of internal cache optimization, which could affect the results. However, that would be due more to hardware than software.
D Sceviour wrote: ↑Tue Mar 10, 2020 8:30 pm
Often, I test the speed of different function methods. The results indicate the fixed array PushToEdges[] seems to be faster on average:
10 second test ...
PushToEdges 16519006400
push_to_edge 16610155136

In a real chess program with many instances fighting for the caches, things may turn out differently than a tiny loop test suggests.
Gerd Isenberg wrote: ↑Tue Mar 10, 2020 9:16 pm
In a real chess program with many instances fighting for the caches, things may turn out differently than a tiny loop test suggests.
Even small array lookups will throw other cachelines out of L1, computation may be done concurrently with other instructions improving ipc.

I agree that optimization is a problem when comparing small arrays and small loops. However, I used the timer-type test to compare large syzygy reads with coded endgame table base calculations and demonstrated syzygy was about 30,000 times slower.
D Sceviour wrote: ↑Tue Mar 10, 2020 8:30 pm
Often, I test the speed of different function methods. The results indicate the fixed array PushToEdges[] seems to be faster on average:
10 second test ...
PushToEdges 16519006400
push_to_edge 16610155136

The optimizer can remove what we are testing because the results are unused.
Code: Select all
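int main() {
    uint64_t c,d,e;
    uint64_t start; // was int: a millisecond timestamp does not fit an int forever
    printf("10 second test ...\n");
    start = time_in_ms();
    e = 0;
    d = 0;
    while((time_in_ms() - start) < 10000) {
        for (c = 0; c < 64; c++) {
            d += PushToEdges[c]; // accumulate so the result is actually used
            e++;
        }
    }
    // uint64_t is not "long" everywhere; a cast plus %llu is portable
    printf(" PushToEdges %llu\n",(unsigned long long)e);
    printf(" prevent optimization removal %llu\n",(unsigned long long)d);
    start = time_in_ms();
    e = 0;
    d = 0;
    while((time_in_ms() - start) < 10000) {
        for (c = 0; c < 64; c++) {
            d += push_to_edge(c);
            e++;
        }
    }
    printf(" push_to_edge %llu\n",(unsigned long long)e);
    printf(" prevent optimization removal %llu\n",(unsigned long long)d);
    return 0;
}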