Version 0.9.12 was a good, stable version, but it was riddled with memory leaks, was written partly in C++ but mostly in C, and every later version (0.9.14 through 0.9.20) had trouble compiling on Mac and Windows. I therefore decided to restart from scratch in plain C++, hoping to progress faster. Unfortunately, my C++ knowledge was rusty, and it took until version 2.0.8.2 to have something more or less stable that was more than a pure random mover (2.0.0).
Since then, 2.1.0 performed very well, but it was time to start working on the position evaluator. As a result, the NPS in 2.1.2 dropped to only about 40% of the earlier speed... The entire performance loss is located in two functions: calcHash and calcPieces.... On top of this, Gabor Szots reported that version 2.1.2 was faster in 32-bit mode than in 64-bit mode. Strangely enough, 2.1.1 performed better in my tests, while on the CCRL lists it performed much worse than 2.1.0. Time for an evaluation.
Please find below the NPS readings for the different versions:
Performance results
after xboard + new + perft 5 commands
Nodes: 4865609
Linux 64 bit - Mint - LLVM 10.0 compiles
Version - bits - time - NPS
2.0.8.2 - 32 - 4.581 - 1062128 (c)
2.0.8.2 - 64 - 3.819 - 1274053 (c)
2.0.9 - 32 - 4.587 - 1060738 (c)
2.0.9 - 64 - 3.778 - 1287879 (c)
2.1.1.2 - 32 - 5.778 - 841999
2.1.1.2 - 64 - 4.704 - 1034237
2.1.2 - 32 - 12.787 - 380502
2.1.2 - 64 - 10.599 - 459053
2.1.2.2 - 32 - 13.231 - 367726
2.1.2.2 - 64 - 11.356 - 428430
Linux 32 bit - Mint - LLVM 10.0 compiles
Linux 64 bit - Mint - GCC 9.3 optimized compiles
2.1.2.2 - 64 - 7.547 - 644623
VM tests (VBOX 2GB 2 cores)
GCC 4.9.2 compiles
Win7-32 - 2.1.2 - 32 - 1:06.859 - 72773
Win7-32 - 2.1.2.2 - 32 - 1:07.125 - 72485
Win7-64 - 2.0.9 - 32 - 27.546 - 176635 (c)
Win7-64 - 2.0.9 - 64 - 32.500 - 149711 (c)
Win7-64 - 2.1.2 - 32 - 1:08.205 - 71337
Win7-64 - 2.1.2 - 64 - 1:08.828 - 70692
Win10-64 - 2.1.2.2 - 64 - 1:18.437 - 62031
Windows 32 bit - GCC 4.9.2 compiles
Windows 64 bit - GCC 8.1 compiles
Win7-64 - 2.1.2.1 - 64 - 20.562 - 236625
Win7-64 - 2.1.2.2 - 64 - 21.671 - 224512
Win10-64 - 2.1.2.2 - 32 - 1:13.112 - 66550
Win10-64 - 2.1.2.2 - 64 - 18.890 - 257569
VM tests (VBOX 4GB 4 cores)
GCC 4.9.2 compiles
Win7-32 - 2.1.2 - 32 - 1:04.859 - 75362
Wine tests (6.8 staging)
Wine-6.8 - 2.1.2.2 - 32 - 1:09.206 - 70305
I have reposted release 2.1.2 for Windows 64 bit as 2.1.2.2, which is compiled with GCC 8.1.
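For anyone who wants to compare their own runs with the table, the NPS column appears to be the perft 5 node count divided by the elapsed time. Below is a minimal C++ sketch of that arithmetic. The function names toSeconds and nps are hypothetical and not taken from belofte, and truncating to a whole number is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Convert a time from the table, "4.581" or "1:06.859" (m:ss.mmm), into seconds.
double toSeconds(const std::string& t) {
    const std::size_t colon = t.find(':');
    if (colon == std::string::npos) return std::stod(t);
    return 60.0 * std::stod(t.substr(0, colon)) + std::stod(t.substr(colon + 1));
}

// Nodes per second, truncated to a whole number (assumed rounding mode).
std::int64_t nps(std::int64_t nodes, const std::string& time) {
    const double seconds = toSeconds(time);
    assert(seconds > 0.0);
    return static_cast<std::int64_t>(nodes / seconds);
}
```

Calling nps(4865609, "4.581") reproduces the first 2.0.8.2 row. Some of the VM rows differ from this calculation by one node per second, so the engine may be using a slightly different clock than the one shown in the time column.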
Upcoming speedups:
I have to upgrade my Windows toolchain to GCC 9/10, also for 32 bit. (400% performance gain)
I have to run tests with LLVM compiles as well, and experiment with compile options beyond -O3.
I have to change my board representation from uchar[8][8] to int8_t[64]. (11% performance gain)
I have to change my piece representation from 'K', 'Q' to int8_t values. (15% performance gain)
I have to make my hash calculation and piece calculation incremental. (>25% performance gain)
I have to change my move generation. (? gain)
I have to remove the move history and other baggage from the board copy and copy it only when needed: lazy copy. (20% performance gain)
Maybe it is time to use makeMove/unMakeMove in the search instead of copying the board and applying the move.
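To make a few of these items concrete, here is a rough C++ sketch of how an int8_t[64] board, int8_t piece codes, an incrementally updated hash and makeMove/unMakeMove could fit together. It is not the belofte code: Zobrist hashing is an assumption (nothing above says how calcHash works), the piece numbering and all type and function names are hypothetical, and side to move, castling, en passant and promotion are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical piece codes: 0 = empty, 1..6 = white P N B R Q K, 7..12 = black.
constexpr int kPieceKinds = 13;
constexpr std::int8_t kEmpty = 0;

// One random 64-bit key per (piece, square) pair.
struct Zobrist {
    std::uint64_t key[kPieceKinds][64];
    Zobrist() {
        std::mt19937_64 rng(20210501);           // fixed seed keeps keys stable between runs
        for (auto& perPiece : key)
            for (auto& k : perPiece) k = rng();
        for (auto& k : key[kEmpty]) k = 0;       // empty squares contribute nothing
    }
};

// Everything unMakeMove needs to restore the previous position.
struct Undo {
    std::int8_t from, to, moved, captured;
};

struct Board {
    std::int8_t sq[64] = {};                     // a1 = 0 ... h8 = 63
    std::uint64_t hash = 0;
    const Zobrist* z;

    explicit Board(const Zobrist& zobrist) : z(&zobrist) {}

    // Change one square and update the hash with two XORs instead of a full rescan.
    void put(int s, std::int8_t piece) {
        hash ^= z->key[sq[s]][s];                // take the old contents out
        sq[s] = piece;
        hash ^= z->key[piece][s];                // put the new contents in
    }

    // Full recomputation over all 64 squares, kept only to verify the incremental value.
    std::uint64_t fullHash() const {
        std::uint64_t h = 0;
        for (int s = 0; s < 64; ++s) h ^= z->key[sq[s]][s];
        return h;
    }

    Undo makeMove(int from, int to) {
        const Undo u{static_cast<std::int8_t>(from), static_cast<std::int8_t>(to),
                     sq[from], sq[to]};
        put(to, u.moved);                        // overwrites a captured piece, if any
        put(from, kEmpty);
        return u;
    }

    void unMakeMove(const Undo& u) {
        put(u.from, u.moved);
        put(u.to, u.captured);
    }
};
```

With this layout the search keeps a single Board and a small Undo record per ply, so nothing like the move history has to be copied at every node. In a debug build, fullHash() can be compared with hash after every makeMove to catch update bugs early.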
Other changes:
Even with C++, I still have plenty of memory leaks. ... to be continued
In short, belofte is not finished yet...