Gigantua: 1.5 Giganodes per Second per Core move generator

BrokenKeyboard
Posts: 24
Joined: Tue Mar 16, 2021 11:11 pm
Full name: Het Satasiya

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by BrokenKeyboard »

spirch wrote: Wed Sep 29, 2021 2:33 am Until the source code is released, I take this thread as a funny imaginary idea, a nice piece of storytelling.

And no, running unknown code, even in a VM / sandbox, is not an option for me.
Heh, yeah, I perfectly understand the concern about executing unknown code.
Uri Blass
Posts: 10876
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by Uri Blass »

R. Tomasi wrote: Wed Sep 29, 2021 1:25 am I feel compelled to tell you again, that you're calculating the speed of your movegen in a misleading way: Let's take the start position, for example.

A depth 7 perft of that position traverses 3.20 billion leaves and 124 million nodes. If on your machine it takes 2736 ms, that means that your movegen is running at 45.3 MN/s. Please stop deceiving people by using a flawed calculation (otherwise one has to wonder if you're doing it on purpose).
As I understand it, 124 million is the number of times he needs to call make move, but he does additional work after the last make move, namely counting the number of legal moves. So 124 million nodes is misleading as well, because some of the work he does is not covered by these 124 million nodes.
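To make the two figures in this exchange concrete, here is a small C++ sketch of the arithmetic. It assumes the commonly published perft results for the start position (depths 1 to 7) and the 2736 ms timing quoted above; the function names are illustrative and not taken from any engine.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Commonly published perft results for the standard start position, depths 1..7.
static const std::vector<std::uint64_t> kStartposPerft = {
    20ULL, 400ULL, 8902ULL, 197281ULL, 4865609ULL, 119060324ULL, 3195901860ULL};

// Leaf positions of a perft to the given depth, i.e. the perft result itself.
std::uint64_t leaves(int depth) {
    assert(depth >= 1 && depth <= static_cast<int>(kStartposPerft.size()));
    return kStartposPerft[depth - 1];
}

// Positions reached by a make-move call when the last ply is only counted:
// perft(1) + ... + perft(depth - 1). The root is not included.
std::uint64_t interior_nodes(int depth) {
    assert(depth >= 1 && depth <= static_cast<int>(kStartposPerft.size()));
    std::uint64_t sum = 0;
    for (int d = 1; d < depth; ++d) sum += kStartposPerft[d - 1];
    return sum;
}

// Converts a count and a wall time in milliseconds into a per-second rate.
double per_second(std::uint64_t count, double milliseconds) {
    return static_cast<double>(count) / (milliseconds / 1000.0);
}

// Prints both metrics side by side for one perft run.
void print_report(int depth, double milliseconds) {
    std::printf("leaves: %llu (%.1f M/s)\n",
                static_cast<unsigned long long>(leaves(depth)),
                per_second(leaves(depth), milliseconds) / 1e6);
    std::printf("interior nodes: %llu (%.1f M/s)\n",
                static_cast<unsigned long long>(interior_nodes(depth)),
                per_second(interior_nodes(depth), milliseconds) / 1e6);
}
```

Calling print_report(7, 2736.0) shows both numbers from the thread: the rate counted over leaves is above one billion per second, while the rate counted over made moves is about 45 million per second. Which of the two deserves the label "nodes" is exactly what is being argued here.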
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by dangi12012 »

R. Tomasi wrote: Wed Sep 29, 2021 2:27 am
A node in a tree is a well-defined concept. If you don't get it, stop trolling in this thread. Movegen is about knowing all positions of a chess board up to depth N.
Just measure the milliseconds against other movegens.

You are dense on purpose - Idk why this is even a discussion... it's a well-defined concept and literally all movegens, including mine, know what a node is...
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Uri Blass
Posts: 10876
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by Uri Blass »

dangi12012 wrote: Wed Sep 29, 2021 10:12 am
R. Tomasi wrote: Wed Sep 29, 2021 2:27 am
A node in a tree is a well-defined concept. If you don't get it, stop trolling in this thread. Movegen is about knowing all positions of a chess board up to depth N.
Just measure the milliseconds against other movegens.

You are dense on purpose - Idk why this is even a discussion... it's a well-defined concept and literally all movegens, including mine, know what a node is...
I know from reading source code that chess programs usually increment the node count when they make a move, not when they generate moves, so the poster is right that you do not use the usual definition of a node.

From Stockfish's code
part of

do_move(Move m, StateInfo& newSt, bool givesCheck)
is

thisThread->nodes.fetch_add(1, std::memory_order_relaxed);
In other words, it is clear that Stockfish increments nodes not when it generates the moves and has the move list, but only when it makes the move (it already had the move before calling do_move, otherwise it could not know m inside the function).

Of course your move generator is faster than other move generators by a big factor, but that is not a reason to use a different definition of a node.

A node is not equivalent to a legal move.
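The difference between the two counting conventions can be sketched with a toy tree. This is not Stockfish or Gigantua code: ToyPosition, generate_moves and make_move are hypothetical stand-ins, and the "game" simply gives every position three legal moves so that the example stays self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy game: every position has exactly kBranching legal moves.
constexpr int kBranching = 3;

struct ToyPosition {
    int ply;  // distance from the root; the only state this toy needs
};

struct Counters {
    std::uint64_t make_calls;  // incremented once per make_move call
    std::uint64_t leaves;      // positions at the final depth (the perft result)
};

// Stand-in for a move generator: returns kBranching placeholder move codes.
std::vector<int> generate_moves(const ToyPosition&) {
    std::vector<int> moves;
    for (int i = 0; i < kBranching; ++i) moves.push_back(i);
    return moves;
}

// Stand-in for make move: the counter is bumped here, when a move is made,
// not when the move list is generated.
ToyPosition make_move(const ToyPosition& pos, int /*move*/, Counters& c) {
    ++c.make_calls;
    return ToyPosition{pos.ply + 1};
}

// Perft with bulk counting: at depth 1 the length of the move list is added
// to the leaf count and none of those moves is actually made.
std::uint64_t perft(const ToyPosition& pos, int depth, Counters& c) {
    assert(depth >= 1);
    const std::vector<int> moves = generate_moves(pos);
    if (depth == 1) {
        c.leaves += moves.size();
        return moves.size();
    }
    std::uint64_t total = 0;
    for (int m : moves) total += perft(make_move(pos, m, c), depth - 1, c);
    return total;
}
```

For a depth 4 perft on this toy tree, the leaf count is 3^4 = 81 while make_move is called only 3 + 9 + 27 = 39 times. Dividing the same elapsed time into one or the other gives two very different "per second" figures.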
R. Tomasi
Posts: 307
Joined: Wed Sep 01, 2021 4:08 pm
Location: Germany
Full name: Roland Tomasi

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by R. Tomasi »

dangi12012 wrote: Wed Sep 29, 2021 10:12 am
R. Tomasi wrote: Wed Sep 29, 2021 2:27 am
A node in a tree is a well-defined concept. If you don't get it, stop trolling in this thread. Movegen is about knowing all positions of a chess board up to depth N.
Just measure the milliseconds against other movegens.

You are dense on purpose - Idk why this is even a discussion... it's a well-defined concept and literally all movegens, including mine, know what a node is...
Indeed it is a well-defined concept. And you're using a definition that differs from that well-defined concept. The funny part is that even by your definition (leaves count as well) you have it wrong: you would then need to count the non-leaf nodes as well, because otherwise your NPS will drop with search depth (because the deeper you search, the higher the ratio of non-leaf nodes to leaves becomes). It is you who quite evidently does not get the concept of "node".
smatovic
Posts: 3322
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by smatovic »

AFAIK it is not uncommon to count moves/s for perft, Ankan's perft gpu made >20 GigaMoves/s, I guess people like to discuss how these perft moves/s translate to actual chess engine's nodes/s, and w/o the source code we won't be able to judge about that, cos you claim an alternative, incremental implementation.

--
Srdja
R. Tomasi
Posts: 307
Joined: Wed Sep 01, 2021 4:08 pm
Location: Germany
Full name: Roland Tomasi

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by R. Tomasi »

smatovic wrote: Wed Sep 29, 2021 1:34 pm AFAIK it is not uncommon to count moves/s for perft, Ankan's perft gpu made >20 GigaMoves/s, I guess people like to discuss how these perft moves/s translate to actual chess engine's nodes/s, and w/o the source code we won't be able to judge about that, cos you claim an alternative, incremental implementation.

--
Srdja
Well, counting moves/s is perfectly fine. Just don't label it nodes/s in that case.
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by dangi12012 »

smatovic wrote: Wed Sep 29, 2021 1:34 pm AFAIK it is not uncommon to count moves/s for perft, Ankan's perft gpu made >20 GigaMoves/s, I guess people like to discuss how these perft moves/s translate to actual chess engine's nodes/s, and w/o the source code we won't be able to judge about that, cos you claim an alternative, incremental implementation.

--
Srdja
Is Ankan Banerjee still active? His code does not compile with VS 2019 and current CUDA. https://github.com/ankan-ban/perft_gpu/issues

I tested your ZetaDva.
If I enter go depth 6 - it crashes.
If I enter sd 6 & perft - it prints nodecount:119060324, seconds: 9.903000, nps: 12022652

I tested your Zeta.
computing perft depth 5 -
nodecount:4865609, seconds: 20.607000, nps: 236114

RTX 3080 and Ryzen 5950x.
Maybe I will move my movegen to CUDA. It's currently the fastest movegen by a lot. Others need hashing and are still slower. Others need 32 cores and are still slower. This is almost finished and I literally have 5 *if* statements left in my movegen.
Maybe I can remove them all. Then the core would be 100% branchless.
smatovic
Posts: 3322
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by smatovic »

dangi12012 wrote: Wed Sep 29, 2021 2:37 pm
smatovic wrote: Wed Sep 29, 2021 1:34 pm AFAIK it is not uncommon to count moves/s for perft, Ankan's perft gpu made >20 GigaMoves/s, I guess people like to discuss how these perft moves/s translate to actual chess engine's nodes/s, and w/o the source code we won't be able to judge about that, cos you claim an alternative, incremental implementation.

--
Srdja
Is Ankan Banerjee still active? His code does not compile with VS 2019 and current CUDA. https://github.com/ankan-ban/perft_gpu/issues

I tested your ZetaDva.
If I enter go depth 6 - it crashes.
If I enter sd 6 & perft - it prints nodecount:119060324, seconds: 9.903000, nps: 12022652

I tested your Zeta.
computing perft depth 5 -
nodecount:4865609, seconds: 20.607000, nps: 236114

RTX 3080 and Ryzen 5950x.
Maybe I will move my movegen to CUDA. It's currently the fastest movegen by a lot. Others need hashing and are still slower. Others need 32 cores and are still slower. This is almost finished and I literally have 5 *if* statements left in my movegen.
Maybe I can remove them all. Then the core would be 100% branchless.
Ankan really did some magic with perft 15:

https://www.chessprogramming.org/Perft#15

http://www.talkchess.com/forum3/viewtop ... =4#p729152

AFAIK he wrote the cuDNN and DX12 backends for Lc0; I am not sure if he is currently involved in computer chess.

My perfts are not optimized; they are primarily for testing the move gen against known perft results. The Zeta perft move gen runs on only one SIMD unit of the GPU, and both engines speak only the xboard protocol.

Maybe my Kogge-Stone vector-based move gen is of interest to you?

https://zeta-chess.app26.de/post/64-bit ... tor-based/

--
Srdja
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Gigantua: 1.5 Giganodes per Second per Core move generator

Post by dangi12012 »

https://zeta-chess.app26.de/post/64-bit ... tor-based/

This looks interesting; I can respect originality. The thing about CUDA or GPGPU is that a GPU thread is not comparable to a CPU thread. All 32 threads of a warp execute the same instruction at the same time, so each thread must either execute the same instruction as the rest of its warp or stall. So branches are off the table. But it's perfect if you have 1E6 positions and want to generate all possible slider moves, since the work is the same for each of them.
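As an illustration of slider generation without any branches, here is a minimal C++ sketch of a Kogge-Stone occluded fill towards north, in the form commonly documented for bitboard engines. It is not the Zeta or Gigantua code; the square mapping (bit 0 = a1, bit 63 = h8) and the function names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;  // assumed mapping: bit 0 = a1, bit 63 = h8

// Kogge-Stone occluded fill towards north: spreads every slider bit upwards
// through empty squares in three doubling steps (1, 2 and 4 ranks).
// There is no loop and no branch, so all sliders are handled in parallel.
Bitboard north_occluded_fill(Bitboard sliders, Bitboard empty) {
    sliders |= empty & (sliders << 8);
    empty &= empty << 8;
    sliders |= empty & (sliders << 16);
    empty &= empty << 16;
    sliders |= empty & (sliders << 32);
    return sliders;
}

// North attack set: shift the fill one more rank so that the first blocker
// square (a possible capture) is included and the slider square is dropped.
Bitboard north_attacks(Bitboard sliders, Bitboard occupied) {
    return north_occluded_fill(sliders, ~occupied) << 8;
}

// Small usage example: a rook on a1 with a blocker on a4 attacks a2, a3, a4.
inline void north_attacks_demo() {
    const Bitboard a1 = 1ULL;
    const Bitboard a4 = 1ULL << 24;
    assert(north_attacks(a1, a1 | a4) == 0x0000000001010100ULL);
}
```

Because the instruction sequence is identical for every position, the same code maps naturally onto threads that must execute in lockstep; the other seven directions follow the same pattern with different shifts plus file masks to prevent wrap-around.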

Different topic - very interesting. I will look at the GPU side once this is finished, since I have already implemented a lot of code in CUDA and it's really promising. The thing to keep in mind is that every 64-bit instruction needs to be emulated by at least 2x 32-bit instructions (boolean arithmetic), 6x 32-bit instructions for multiply, or 3x 32-bit instructions for add.
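What this emulation looks like can be sketched in portable C++ by treating a 64-bit value as two 32-bit halves. The U64Pair type and the helper names are hypothetical, and the number of machine instructions a real GPU needs depends on its carry support, so treat this as an illustration of the idea rather than a cost model.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical representation of a 64-bit value as two 32-bit halves.
struct U64Pair {
    std::uint32_t lo;
    std::uint32_t hi;
};

U64Pair split(std::uint64_t v) {
    return U64Pair{static_cast<std::uint32_t>(v),
                   static_cast<std::uint32_t>(v >> 32)};
}

std::uint64_t join(U64Pair p) {
    return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

// Bitwise AND needs one 32-bit operation per half; the halves are independent.
U64Pair and64(U64Pair a, U64Pair b) {
    return U64Pair{a.lo & b.lo, a.hi & b.hi};
}

// Addition adds the low halves, derives the carry from the unsigned
// wrap-around, and then adds the high halves together with that carry.
U64Pair add64(U64Pair a, U64Pair b) {
    const std::uint32_t lo = a.lo + b.lo;
    const std::uint32_t carry = (lo < a.lo) ? 1u : 0u;
    return U64Pair{lo, a.hi + b.hi + carry};
}

// Small usage example: the carry moves from the low half into the high half.
inline void add64_demo() {
    const U64Pair sum = add64(split(0x00000000FFFFFFFFULL), split(1ULL));
    assert(join(sum) == 0x0000000100000000ULL);
}
```

Bitboard code is dominated by AND, OR, XOR and shifts, which is why the cost of splitting each 64-bit operation matters for a GPU move generator.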

But the movegen speed of gpu vs cpu can easily be approximated to be the factor between the two "64-Bit Integer IOPS"
https://onedrive.live.com/?cid=13641B43 ... 20&o=OneUp