emadsen wrote: ↑Wed Dec 22, 2021 4:51 am
Chessnut1071 wrote: ↑Mon Dec 20, 2021 11:57 pm
Got my bitboard routine working and was surprised by the results compared to my mailboxes. I was expecting very little improvement in speed, thinking my mailbox engine was pretty efficient. Results as follows:
Average over 10,000,000 runs.
Mailboxes = 2,272,507 nodes per second
Bitboards = 13,843,647 nodes per second
I don't think I have a standard bitboard since I only use bit operations on the sliding pieces.
What bitboard technique are you using for sliders? Have you implemented magic bitboards? If so, 13M NPS is a bit slow. I can't remember exactly, but when I first got move generation functioning (prior to implementing all the bookkeeping necessary to sort moves, prior to staged move generation, etc.), I think I got 50M NPS in C# using magic bitboards for sliders, on an AMD Ryzen 1950X.
federico wrote: ↑Wed Dec 22, 2021 2:43 am
Most of the time is spent on search and evaluation. The better the search and/or eval, the less CPU time is spent on movegen. Movegen is a minor piece of any decent engine, even more so with staged movegen, where most moves are never generated, or in engines that simply try the move from the TT without generating any moves at all.
This is nothing new. Bob Hyatt said it long ago:
"Speed here is not so important. I doubt anyone's move generator takes more than 10% of total search time, which means a 20% improvement in perft numbers is only a 2% overall speed gain. I would not worry about anything but matching the node counts exactly..."
https://www.chessprogramming.org/Perft#Quotes
That's correct. Perft is a regression test.
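For anyone who wants to check the arithmetic in that quote, here is a rough sketch using Amdahl's law. The function name `overall_speedup` is made up for illustration, and the 10% and 20% figures are just the ones from the quote, not measurements.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: overall speedup when `fraction` of the run time
    is made `local_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Movegen assumed to be 10% of search time, made 1.2x faster (20% more perft NPS).
gain = overall_speedup(0.10, 1.20) - 1.0
print(f"{gain:.1%}")  # 1.7%
```

So the "2%" in the quote is a round figure; either way, the gain from a faster move generator is small next to what search and eval improvements can buy.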
I have a very simple bitboard setup: eight precomputed move vectors (one per direction) for each of the 64 squares. I just use the TrailingZeros and LeadingZeros intrinsics to find the next move after I XOR out the last one, and I stop at a blocker.
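A minimal sketch of that kind of ray scan, for illustration only. It is in Python rather than C#, so `trailing_zeros` and `highest_bit` stand in for the hardware intrinsics, and the square numbering (a1 = 0, h8 = 63), the direction order, and all the names are assumptions, not my actual tables.

```python
# Assumed mapping: square = rank * 8 + file, so a1 = 0, h1 = 7, a8 = 56, h8 = 63.
# Each direction is (rank step, file step).
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def build_rays():
    """Precompute, for each square and direction, a mask of every square on that ray."""
    rays = [[0] * 8 for _ in range(64)]
    for sq in range(64):
        for d, (dr, df) in enumerate(DIRECTIONS):
            r, f = divmod(sq, 8)
            r, f = r + dr, f + df
            while 0 <= r < 8 and 0 <= f < 8:
                rays[sq][d] |= 1 << (r * 8 + f)
                r, f = r + dr, f + df
    return rays


RAYS = build_rays()


def trailing_zeros(x):
    """Index of the lowest set bit (x must be non-zero)."""
    return (x & -x).bit_length() - 1


def highest_bit(x):
    """Index of the highest set bit; equals 63 minus the leading-zero count of a 64-bit word."""
    return x.bit_length() - 1


def ray_targets(sq, d, occupied):
    """Walk one ray outward from sq, nearest square first, stopping at the first blocker.

    The blocker square itself is included; the caller decides whether it is
    a capture or a friendly piece.
    """
    dr, df = DIRECTIONS[d]
    increasing = dr * 8 + df > 0  # does the square index grow along this ray?
    remaining = RAYS[sq][d]
    targets = []
    while remaining:
        # The nearest square is the lowest bit on increasing rays, the highest otherwise.
        t = trailing_zeros(remaining) if increasing else highest_bit(remaining)
        targets.append(t)
        bit = 1 << t
        if occupied & bit:
            break  # hit a blocker
        remaining ^= bit  # XOR out the square just visited
    return targets


# Rook on a1 looking up the a-file with a piece on a4 (square 24):
print(ray_targets(0, 2, 1 << 24))  # [8, 16, 24]
```

The loop visits one square per iteration, which makes it easy to collect per-move data along the way, at the cost of being slower than a single table lookup.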
Btw, that 13.8 million was in debug mode. Also, I capture the following data for each move:
1) check, 2) double check, 3) discovered check, 4) pin, 5) en passant, 6) pawn promotion, 7) move direction, 8) capture, 9) defender.
My speed should roughly double in release mode; however, I have to copy all my precomputed tables into the release directory before I can test there.
I haven't tried magic bitboards because I don't know enough about them to program that yet. Once I have the bitboards fully implemented without error, I'll try that next.
We're planning on using a chess engine as one of the algorithms to test on a quantum computer. Translating these C# instructions to machine code is well above my pay grade. Are any of you developers writing an engine at the assembly level?