Heh yea, I perfectly understand with regards to the unknown code execution.
Gigantua: 1.5 Giganodes per Second per Core move generator
Moderator: Ras
-
- Posts: 24
- Joined: Tue Mar 16, 2021 11:11 pm
- Full name: Het Satasiya
-
- Posts: 10874
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Gigantua: 1.5 Giganodes per Second per Core move generator
R. Tomasi wrote: ↑Wed Sep 29, 2021 1:25 am
I feel compelled to tell you again that you're calculating the speed of your movegen in a misleading way. Let's take the start position, for example. A depth 7 perft of that position traverses 3.20 billion leaves and 124 million nodes. If it takes 2736 ms on your machine, that means your movegen is running at 45.3 MN/s. Please stop deceiving people by using a flawed calculation (otherwise one has to wonder if you're doing it on purpose).
Based on my understanding, 124 million is the number of times he needs to call make move. But he does additional work after the last make move, namely counting the number of legal moves, so 124 million nodes is misleading too, because he does some work that is not needed for these 124 million nodes.
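The numbers in the quoted post can be checked against the published perft values for the standard start position. The following is only a sketch of that arithmetic, not anybody's engine code; the names `kPerft`, `leaves`, `interior_nodes` and `per_second` are made up for illustration.

```cpp
#include <cstdint>

// Published perft results for the standard start position, depths 0..7.
constexpr std::uint64_t kPerft[] = {
    1ULL, 20ULL, 400ULL, 8902ULL, 197281ULL,
    4865609ULL, 119060324ULL, 3195901860ULL};

// Leaves of a depth-d perft: the positions exactly d plies from the root.
inline std::uint64_t leaves(int depth) { return kPerft[depth]; }

// Interior (non-leaf) nodes of a depth-d perft: all positions at plies
// 0..d-1, i.e. the positions from which moves are generated.
inline std::uint64_t interior_nodes(int depth) {
    std::uint64_t sum = 0;
    for (int d = 0; d < depth; ++d) sum += kPerft[d];
    return sum;
}

// Convert a count and a wall time in milliseconds to a per-second rate.
inline double per_second(std::uint64_t count, double milliseconds) {
    return static_cast<double>(count) / (milliseconds / 1000.0);
}
```

With depth 7 and 2736 ms, `interior_nodes(7)` is 124,132,537 and gives roughly 45 million per second, while `leaves(7)` is 3,195,901,860 and gives roughly 1.17 billion per second. The two figures being argued about differ only in which of these counts is divided by the time.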
-
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Gigantua: 1.5 Giganodes per Second per Core move generator
A node in a tree is a well-defined concept. If you don't get it, stop trolling in this thread. Movegen is about knowing all positions of a chess board up to depth N.
Just measure the milliseconds against other movegens.
You are being dense on purpose. I don't know why this is even a discussion... it's a well-defined concept, and literally all movegens, including mine, know what a node is...
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
-
- Posts: 10874
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Gigantua: 1.5 Giganodes per Second per Core move generator
dangi12012 wrote: ↑Wed Sep 29, 2021 10:12 am
A node in a tree is a well-defined concept. If you don't get it, stop trolling in this thread. Movegen is about knowing all positions of a chess board up to depth N.
Just measure the milliseconds against other movegens.
You are being dense on purpose. I don't know why this is even a discussion... it's a well-defined concept, and literally all movegens, including mine, know what a node is...
I know from reading source code that chess programs usually increase the node count by one when they make a move, not when they generate moves, so the poster is right that you are not using the usual definition of a node.
From Stockfish's code, part of
Code:
do_move(Move m, StateInfo& newSt, bool givesCheck)
is
Code:
thisThread->nodes.fetch_add(1, std::memory_order_relaxed);
Of course your move generator is faster than other move generators by a big factor, but that is not a reason to use a different definition of a node.
A node is not equivalent to a legal move.
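To make the distinction concrete, here is a minimal sketch of a perft with bulk counting over a toy tree in which every position has the same number of moves. It is neither Stockfish nor Gigantua code: `ToyPosition`, `kBranching`, `Counters` and this `do_move` are hypothetical stand-ins. The counter is incremented where a move is made, which is the convention the Stockfish line above illustrates.

```cpp
#include <cstdint>

// Hypothetical stand-in for a chess position: every position has exactly
// kBranching legal moves, so the shape of the tree is known in advance.
constexpr int kBranching = 3;

struct ToyPosition {
    int ply;
};

struct Counters {
    std::uint64_t moves_made = 0;  // incremented once per do_move call
};

// Stand-in for do_move: the node is counted at the moment a move is made.
inline ToyPosition do_move(const ToyPosition& pos, int /*move*/, Counters& c) {
    ++c.moves_made;
    return ToyPosition{pos.ply + 1};
}

// Perft with bulk counting: at depth 1 the legal moves are only counted,
// never made, so the leaves do not show up in moves_made.
inline std::uint64_t perft(const ToyPosition& pos, int depth, Counters& c) {
    if (depth == 0) return 1;
    if (depth == 1) return kBranching;
    std::uint64_t leaves = 0;
    for (int m = 0; m < kBranching; ++m)
        leaves += perft(do_move(pos, m, c), depth - 1, c);
    return leaves;
}
```

For depth 4 this returns 81 leaves while `moves_made` ends at 39 (3 + 9 + 27). Dividing the leaf count by the elapsed time gives moves/s; dividing `moves_made` gives nodes/s in the make-move sense.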
-
- Posts: 307
- Joined: Wed Sep 01, 2021 4:08 pm
- Location: Germany
- Full name: Roland Tomasi
Re: Gigantua: 1.5 Giganodes per Second per Core move generator
dangi12012 wrote: ↑Wed Sep 29, 2021 10:12 am
A node in a tree is a well-defined concept. If you don't get it, stop trolling in this thread. Movegen is about knowing all positions of a chess board up to depth N.
Just measure the milliseconds against other movegens.
You are being dense on purpose. I don't know why this is even a discussion... it's a well-defined concept, and literally all movegens, including mine, know what a node is...
Indeed it is a well-defined concept. And you're using a definition different from that well-defined one. The funny part is that even by your definition (leaves count as well) you have it wrong: you would then need to count the non-leaf nodes as well, because otherwise your NPS will drop with search depth (because the deeper you search, the higher the ratio of non-leaf nodes to leaves becomes). It's you who quite evidently does not get the concept of "node".
-
- Posts: 3318
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Gigantua: 1.5 Giganodes per Second per Core move generator
AFAIK it is not uncommon to count moves/s for perft, Ankan's perft gpu made >20 GigaMoves/s, I guess people like to discuss how these perft moves/s translate to actual chess engine's nodes/s, and w/o the source code we won't be able to judge about that, cos you claim an alternative, incremental implementation.
--
Srdja
-
- Posts: 307
- Joined: Wed Sep 01, 2021 4:08 pm
- Location: Germany
- Full name: Roland Tomasi
Re: Gigantua: 1.5 Giganodes per Second per Core move generator
smatovic wrote: ↑Wed Sep 29, 2021 1:34 pm
AFAIK it is not uncommon to count moves/s for perft, Ankan's perft gpu made >20 GigaMoves/s, I guess people like to discuss how these perft moves/s translate to actual chess engine's nodes/s, and w/o the source code we won't be able to judge about that, cos you claim an alternative, incremental implementation.
--
Srdja
Well, counting moves/s is perfectly fine. Just don't label it nodes/s in that case.
-
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Gigantua: 1.5 Giganodes per Second per Core move generator
smatovic wrote: ↑Wed Sep 29, 2021 1:34 pm
AFAIK it is not uncommon to count moves/s for perft, Ankan's perft gpu made >20 GigaMoves/s, I guess people like to discuss how these perft moves/s translate to actual chess engine's nodes/s, and w/o the source code we won't be able to judge about that, cos you claim an alternative, incremental implementation.
--
Srdja
Is Ankan Banerjee still active? His code does not compile with VS 2019 and current CUDA. https://github.com/ankan-ban/perft_gpu/issues
I tested your ZetaDva.
If I enter go depth 6 - it crashes.
If I enter sd 6 & perft - it prints nodecount:119060324, seconds: 9.903000, nps: 12022652
I tested your Zeta.
computing perft depth 5 -
nodecount:4865609, seconds: 20.607000, nps: 236114
RTX 3080 and Ryzen 5950x.
Maybe I will move my movegen to CUDA. It's currently the fastest movegen by a lot. Others need hashing and are still slower. Others need 32 cores and are still slower. This is almost finished, and I literally have 5 *if* statements left in my movegen.
Maybe I can remove them all. Then the core would be 100% branchless.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
-
- Posts: 3318
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Gigantua: 1.5 Giganodes per Second per Core move generator
dangi12012 wrote: ↑Wed Sep 29, 2021 2:37 pm
smatovic wrote: ↑Wed Sep 29, 2021 1:34 pm
AFAIK it is not uncommon to count moves/s for perft, Ankan's perft gpu made >20 GigaMoves/s, I guess people like to discuss how these perft moves/s translate to actual chess engine's nodes/s, and w/o the source code we won't be able to judge about that, cos you claim an alternative, incremental implementation.
--
Srdja
Is Ankan Banerjee still active? His code does not compile with VS 2019 and current CUDA. https://github.com/ankan-ban/perft_gpu/issues
I tested your ZetaDva.
If I enter go depth 6 - it crashes.
If I enter sd 6 & perft - it prints nodecount:119060324, seconds: 9.903000, nps: 12022652
I tested your Zeta.
computing perft depth 5 -
nodecount:4865609, seconds: 20.607000, nps: 236114
RTX 3080 and Ryzen 5950x.
Maybe I will move my movegen to CUDA. It's currently the fastest movegen by a lot. Others need hashing and are still slower. Others need 32 cores and are still slower. This is almost finished, and I literally have 5 *if* statements left in my movegen.
Maybe I can remove them all. Then the core would be 100% branchless.
Ankan really did some magic with perft 15:
https://www.chessprogramming.org/Perft#15
http://www.talkchess.com/forum3/viewtop ... =4#p729152
AFAIK he wrote the cuDNN and DX12 backends for Lc0; I am not sure if he is currently involved in computer chess.
My perfts are not optimized and serve primarily to test the move gen against known perft results. The Zeta perft move gen runs on only one SIMD unit of the GPU, and both engines speak only the xboard protocol.
Maybe my Kogge-Stone vector-based move gen is of interest to you?
https://zeta-chess.app26.de/post/64-bit ... tor-based/
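For readers who have not seen the technique, a Kogge-Stone fill computes sliding attacks for all sliders on a bitboard at once, with a fixed number of shift/and/or steps and no branches. Below is a generic textbook-style sketch for the north direction only; it is not the vector-based code from Zeta, and the names `Bitboard`, `north_occluded_fill` and `north_attacks` as well as the a1 = bit 0 square mapping are assumptions for illustration.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;  // assumed mapping: bit 0 = a1, bit 63 = h8

// Kogge-Stone occluded fill towards north. `gen` holds the sliders,
// `pro` the empty squares they may propagate through. Three doubling
// steps (8, 16, 32 squares) cover the whole file without any loop.
inline Bitboard north_occluded_fill(Bitboard gen, Bitboard pro) {
    gen |= pro & (gen << 8);
    pro &= (pro << 8);
    gen |= pro & (gen << 16);
    pro &= (pro << 16);
    gen |= pro & (gen << 32);
    return gen;
}

// Attack set: shift the filled set one more step north, so the first
// blocker is included and the slider's own square is dropped.
inline Bitboard north_attacks(Bitboard sliders, Bitboard empty) {
    return north_occluded_fill(sliders, empty) << 8;
}
```

With a rook on a1 and a blocker on a5, `north_attacks` yields a2, a3, a4 and a5 (0x0000000101010100). The other directions follow the same pattern with different shifts, plus file masks to stop wraparound for east/west and the diagonals.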
--
Srdja
-
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Gigantua: 1.5 Giganodes per Second per Core move generator
https://zeta-chess.app26.de/post/64-bit ... tor-based/
This looks interesting; I can respect originality. The thing about CUDA or GPGPU is that a GPU thread is not comparable to a CPU thread. All 32 threads of a warp execute the same instruction at the same time, so each thread must either execute the warp's current instruction or stall. So branches are off the table. But it's perfect if you have 1E6 positions and want to populate all possible slider moves, since they are all the same.
Different topic, but very interesting. I will look at the GPU side once this is finished, since I have already implemented a lot of code in CUDA and it's really promising. The thing to keep in mind is that every 64-bit instruction needs to be emulated by at least 2x 32-bit instructions (boolean arithmetic), 3x 32-bit for add, or 6x 32-bit for multiply.
But the movegen speed of GPU vs CPU can easily be approximated by the ratio of the two "64-Bit Integer IOPS" figures.
https://onedrive.live.com/?cid=13641B43 ... 20&o=OneUp
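The point about emulating 64-bit operations with 32-bit ones can be illustrated in plain C++. This is only a sketch of the idea: `U64Pair`, `split`, `join`, `and64` and `add64` are hypothetical names, and the number of C++ operations here should not be read as the number of instructions a particular GPU actually issues.

```cpp
#include <cstdint>

// A 64-bit value held as two 32-bit halves, as on hardware whose
// integer units are 32 bits wide.
struct U64Pair {
    std::uint32_t lo;
    std::uint32_t hi;
};

inline U64Pair split(std::uint64_t v) {
    return {static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}

inline std::uint64_t join(U64Pair p) {
    return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

// Bitwise AND: the halves are independent, so two 32-bit ANDs suffice.
inline U64Pair and64(U64Pair a, U64Pair b) {
    return {a.lo & b.lo, a.hi & b.hi};
}

// Addition: add the low halves, detect the carry, then add the high
// halves plus that carry.
inline U64Pair add64(U64Pair a, U64Pair b) {
    std::uint32_t lo = a.lo + b.lo;           // wraps modulo 2^32
    std::uint32_t carry = lo < a.lo ? 1u : 0u;
    std::uint32_t hi = a.hi + b.hi + carry;
    return {lo, hi};
}
```

For example, `join(add64(split(0xFFFFFFFF), split(1)))` gives 0x100000000: the low halves wrap to zero and the carry moves into the high half. Bitboard code is dominated by and/or/xor and shifts, which is why the boolean case matters most for a move generator.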
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer