Gigantua: 1.5 Giganodes per Second per Core move generator

Karlo Bala · Post by **Karlo Bala** » Sat Sep 25, 2021 10:14 pm

smatovic wrote: ↑Sat Sep 25, 2021 8:42 pm
Mergi wrote: ↑Sat Sep 25, 2021 8:18 pm ...
And if this generator actually turns out to be usable, there would surely be some ELO gains as well.
I think that is the point, the author says his approach might be useful in a Monte Carlo search, hence I understand if he wants to keep the source closed, to develop that approach first and maybe then release....

--
Srdja

It is a totally flawed concept. Pure MC (or UCB1) fails badly in chess, but let's assume there is potential. Would you rather waste 2 years on movegen, or try with whatever movegen you have just to test the idea? If the idea is good, you can optimize further, but if the idea fails, you save a lot of unnecessary work.

R. Tomasi · Post by **R. Tomasi** » Sat Sep 25, 2021 10:16 pm

Karlo Bala wrote: ↑Sat Sep 25, 2021 10:07 pm
Mergi wrote: ↑Sat Sep 25, 2021 8:18 pm
Karlo Bala wrote: ↑Sat Sep 25, 2021 8:01 pm I don't understand this insane obsession with the perft. My perft is just 30% faster than the search since it uses a full eval at every node and a staged move generator. The only important thing is to test for the correctness of other parts of a chess engine, so who cares about the perft speed at all?!?

Perhaps admins should open the sub-forum talkperft.
What's the difference between ELO and fast perft? In the end, both are just numbers. One person likes to watch the ELO number grow and another likes the perft number more.

And if this generator actually turns out to be usable, there would surely be some ELO gains as well.
It seems I was in a big delusion when I thought that chess engines are used for playing and analyzing. Now there is a whole new dimension, watching engines elo rise

Usable or not, when one adds all that is needed to build a complete chess engine, the move generator looks much much different.

What do you think, how much Elo is a twice-as-fast move generator worth if, for example, the move generator eats 10% of the engine time?

It's not only a question of getting some tiny ELO advantage (I would guess in your example it's not even 5 ELO), but also of maintainability. I'm prepared to bet that at the end of the day, a move generator that is easier to integrate and results in more maintainable code will be worth more than those 5 ELO, since it makes it easier to apply changes and concepts that really win some measurable strength.

That's the basic problem I see with OP's movegen (as I tried pointing out earlier): the tiny speed-up that you get by not using movelists will come back to haunt you every step of the way when you want to use TT, killers, and all the other fancy search tricks that we use in modern engines.
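
The "not even 5 ELO" guess can be sanity-checked with a rough sketch. Amdahl's law gives the overall speed-up when only the move generator gets faster. The conversion to Elo is an assumption that is not from this thread: it takes the gain to scale with log2 of the speed-up, and `eloPerDoubling` is a rule-of-thumb input that varies a lot between engines and time controls. Both function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Overall speed-up when a fraction `fraction` of the runtime is made
// `factor` times faster (Amdahl's law).
inline double overallSpeedup(double fraction, double factor) {
    assert(fraction >= 0.0 && fraction <= 1.0 && factor > 0.0);
    return 1.0 / ((1.0 - fraction) + fraction / factor);
}

// ASSUMPTION: Elo gain is proportional to log2(speed-up).
// eloPerDoubling is a rule-of-thumb input, not a measured value.
inline double estimatedEloGain(double speedup, double eloPerDoubling) {
    return eloPerDoubling * std::log2(speedup);
}
```

With `overallSpeedup(0.10, 2.0)` the engine as a whole gets about 5% faster (1 / 0.95). Plugging that into `estimatedEloGain` with an assumed 50 Elo per doubling gives a bit under 4 Elo, which is in line with the guess above.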

Karlo Bala · Post by **Karlo Bala** » Sat Sep 25, 2021 10:43 pm

R. Tomasi wrote: ↑Sat Sep 25, 2021 10:16 pm
Karlo Bala wrote: ↑Sat Sep 25, 2021 10:07 pm
Mergi wrote: ↑Sat Sep 25, 2021 8:18 pm
Karlo Bala wrote: ↑Sat Sep 25, 2021 8:01 pm I don't understand this insane obsession with the perft. My perft is just 30% faster than the search since it uses a full eval at every node and a staged move generator. The only important thing is to test for the correctness of other parts of a chess engine, so who cares about the perft speed at all?!?

Perhaps admins should open the sub-forum talkperft.
What's the difference between ELO and fast perft? In the end, both are just numbers. One person likes to watch the ELO number grow and another likes the perft number more.

And if this generator actually turns out to be usable, there would surely be some ELO gains as well.
It seems I was in a big delusion when I thought that chess engines are used for playing and analyzing. Now there is a whole new dimension, watching engines elo rise

Usable or not, when one adds all that is needed to build a complete chess engine, the move generator looks much much different.

What do you think, how much Elo is a twice-as-fast move generator worth if, for example, the move generator eats 10% of the engine time?
It's not only a question of getting some tiny ELO advantage (I would guess in your example it's not even 5 ELO), but also of maintainability. I'm prepared to bet that at the end of the day, a move generator that is easier to integrate and results in more maintainable code will be worth more than those 5 ELO, since it makes it easier to apply changes and concepts that really win some measurable strength.

That's the basic problem I see with OP's movegen (as I tried pointing out earlier): the tiny speed-up that you get by not using movelists will come back to haunt you every step of the way when you want to use TT, killers, and all the other fancy search tricks that we use in modern engines.

That is exactly why I wrote that if you start with a perft, the final movegen will look much much different. Whenever you need to add some functionality to the engine, you will find that movegen is not suitable and should be changed/updated. At the end of the day, you end up with a slow (compared to perft) and totally different movegen.

Sopel · Post by **Sopel** » Sun Sep 26, 2021 12:39 am

The usefulness of this in itself is nil, because we already have ~300 Gnps perft available on GPUs. It also sounds like the approach is very specific to perft and might be hard to port to an actual chess engine: for example, there is an assumption that moves are used and discarded, whereas in an actual engine they are scored, played, put in the histories, put in the TT, sorted... Whether this will be useful in any way depends on the source code, and so far we've seen none.
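
To illustrate the "scored ... sorted" part, here is a minimal sketch of what an engine typically does with generated moves instead of discarding them. Everything in it is hypothetical: the 16-bit move encoding, the `ScoredMove` struct, the score constants, and the `orderMoves` name are made up for illustration and are not taken from any engine discussed here.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical move record: an opaque 16-bit move plus an ordering score.
struct ScoredMove {
    std::uint16_t move;
    int score;
};

// Engine-style handling: the moves must be stored so they can be scored
// and sorted. The TT move goes first, then the killer, then the rest by
// their existing score. (Perft only counts each move and discards it.)
inline void orderMoves(std::vector<ScoredMove>& moves,
                       std::uint16_t ttMove, std::uint16_t killer) {
    for (auto& m : moves) {
        if (m.move == ttMove)      m.score = 1'000'000;  // arbitrary large bonus
        else if (m.move == killer) m.score = 900'000;    // arbitrary, below TT move
    }
    std::stable_sort(moves.begin(), moves.end(),
                     [](const ScoredMove& a, const ScoredMove& b) {
                         return a.score > b.score;
                     });
}
```

A generator that hands each move to a callback and never stores it has nothing for a step like this to work on, which is the porting problem described above.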

BrokenKeyboard · Post by **BrokenKeyboard** » Sun Sep 26, 2021 12:44 am

Regarding Sopel's point, could you please release the source of your move generator?
Cleaning up your source code has nothing to do with what we can see.
And explanations in English of your code can only go so far in helping us understand exactly what your code is doing. We could even try implementing your movegen in a basic engine like VICE or another one to see just how much improvement this can bring to other chess engines.

Uri Blass · Post by **Uri Blass** » Sun Sep 26, 2021 1:01 am

Sopel wrote: ↑Sun Sep 26, 2021 12:39 am The usefulness of this in itself is nil, because we already have ~300 Gnps perft available on GPUs. It also sounds like the approach is very specific to perft and might be hard to port to an actual chess engine: for example, there is an assumption that moves are used and discarded, whereas in an actual engine they are scored, played, put in the histories, put in the TT, sorted... Whether this will be useful in any way depends on the source code, and so far we've seen none.

I totally disagree.
1) I believe that it can be useful at least to prove that some helpmate problem has no errors and that there are no more solutions.
The order of moves is not important for the proof.

2) What we have for GPUs is not relevant, because not everybody has a GPU, and you can also use CPU and GPU at the same time.

Uri Blass · Post by **Uri Blass** » Sun Sep 26, 2021 1:21 am

Karlo Bala wrote: ↑Sat Sep 25, 2021 10:07 pm
Mergi wrote: ↑Sat Sep 25, 2021 8:18 pm
Karlo Bala wrote: ↑Sat Sep 25, 2021 8:01 pm I don't understand this insane obsession with the perft. My perft is just 30% faster than the search since it uses a full eval at every node and a staged move generator. The only important thing is to test for the correctness of other parts of a chess engine, so who cares about the perft speed at all?!?

Perhaps admins should open the sub-forum talkperft.
What's the difference between ELO and fast perft? In the end, both are just numbers. One person likes to watch the ELO number grow and another likes the perft number more.

And if this generator actually turns out to be usable, there would surely be some ELO gains as well.
It seems I was in a big delusion when I thought that chess engines are used for playing and analyzing. Now there is a whole new dimension, watching engines elo rise

Usable or not, when one adds all that is needed to build a complete chess engine, the move generator looks much much different.

What do you think, how much Elo is a twice-as-fast move generator worth if, for example, the move generator eats 10% of the engine time?

I think that if you have a move generator that is faster by a big factor, you may design the engine in a different way, so assuming that the move generator eats 10% of the engine time may be wrong.

With a fast move generator you may write a static evaluation function that generates a lot of moves; that is not a good idea with a slow move generator, because in that case the evaluation may be too expensive.

Joost Buijs · Post by **Joost Buijs** » Sun Sep 26, 2021 8:55 am

Karlo Bala wrote: ↑Sat Sep 25, 2021 10:43 pm
R. Tomasi wrote: ↑Sat Sep 25, 2021 10:16 pm
Karlo Bala wrote: ↑Sat Sep 25, 2021 10:07 pm
Mergi wrote: ↑Sat Sep 25, 2021 8:18 pm
Karlo Bala wrote: ↑Sat Sep 25, 2021 8:01 pm I don't understand this insane obsession with the perft. My perft is just 30% faster than the search since it uses a full eval at every node and a staged move generator. The only important thing is to test for the correctness of other parts of a chess engine, so who cares about the perft speed at all?!?

Perhaps admins should open the sub-forum talkperft.
What's the difference between ELO and fast perft? In the end, both are just numbers. One person likes to watch the ELO number grow and another likes the perft number more.

And if this generator actually turns out to be usable, there would surely be some ELO gains as well.
It seems I was in a big delusion when I thought that chess engines are used for playing and analyzing. Now there is a whole new dimension, watching engines elo rise

Usable or not, when one adds all that is needed to build a complete chess engine, the move generator looks much much different.

What do you think, how much Elo is a twice-as-fast move generator worth if, for example, the move generator eats 10% of the engine time?
It's not only a question of getting some tiny ELO advantage (I would guess in your example it's not even 5 ELO), but also of maintainability. I'm prepared to bet that at the end of the day, a move generator that is easier to integrate and results in more maintainable code will be worth more than those 5 ELO, since it makes it easier to apply changes and concepts that really win some measurable strength.

That's the basic problem I see with OP's movegen (as I tried pointing out earlier): the tiny speed-up that you get by not using movelists will come back to haunt you every step of the way when you want to use TT, killers, and all the other fancy search tricks that we use in modern engines.
That is exactly why I wrote that if you start with a perft, the final movegen will look much much different. Whenever you need to add some functionality to the engine, you will find that movegen is not suitable and should be changed/updated. At the end of the day, you end up with a slow (compared to perft) and totally different movegen.

I fully agree.

When your goal is to write a fast move generator that will be used for perft only, you will end up with something different from what is needed for a chess engine. Perft is so small that it will run completely from the cache; when you start using the 'hyper-fast' move generator in your engine, you will find that it suddenly is not so fast anymore.

You have to add all kinds of other stuff too: building move lists and sorting them, incrementally updating hashes, and many other things. In the end you will find that the 'hyper-fast' move generator didn't make a difference at all, and that it probably would have been better to spend your time on other things, e.g. the evaluation.

gflohr · Post by **gflohr** » Mon Sep 27, 2021 8:36 pm

What is this thread actually about? Gossip about the amazing performance aside, the only concrete information I could gain is that

a) the author is trying to fund money for luring people into downloading and running(???) an unsigned executable
b) the author's name is probably really Daniel because his sources are stored under C:\Users\Daniel\source\repos\ChessCpp\Gigantua

Upload the sources of your project, and I will maybe be interested. For now I am not. And, by the way, why do you limit the perft depth to 12? Will the future Stockfish contender also limit its search depth to 12?

dangi12012 · Post by **dangi12012** » Mon Sep 27, 2021 9:58 pm

gflohr wrote: ↑Mon Sep 27, 2021 8:36 pm What is this thread actually about? Gossip about the amazing performance aside, the only concrete information I could gain is that

a) the author is trying to fund money for luring people into downloading and running(???) an unsigned executable
b) the author's name is probably really Daniel because his sources are stored under C:\Users\Daniel\source\repos\ChessCpp\Gigantua

Upload the sources of your project, and I will maybe be interested. For now I am not. And, by the way, why do you limit the perft depth to 12? Will the future Stockfish contender also limit its search depth to 12?

You can run the executable in a sandbox. It's the current build.
Anyway, it's really fun to optimize stuff, because at that performance level every single AND instruction that can be removed adds around 10 Meganodes/s. These are just the low-hanging fruit. You really want to see the source code? Why do you want to see it?
My first ever error-free movegen in C#, using arrays and dictionaries, had a total performance of 3 Meganodes/s. Now it's literally 600x faster and hashing is still an option. (I loop over the square of every piece anyway to expand the move, so Zobrist would be a single lookup then and there. But I will release before that.)
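
As a rough sketch of why Zobrist hashing can be "a single lookup" per square: the hash is updated incrementally by XOR-ing a per-piece, per-square key out and in. The names below (`ZobristKeys`, `hashAfterQuietMove`), the 12x64 table layout, and the seed are assumptions for illustration and are not from Gigantua. Keys for side to move, castling rights, and en passant are left out.

```cpp
#include <array>
#include <cstdint>
#include <random>

// Hypothetical key table: 12 piece types (6 per side) x 64 squares.
struct ZobristKeys {
    std::array<std::array<std::uint64_t, 64>, 12> piece{};

    explicit ZobristKeys(std::uint64_t seed = 0x9E3779B97F4A7C15ULL) {
        std::mt19937_64 rng(seed);  // fixed seed keeps the keys reproducible
        for (auto& perPiece : piece)
            for (auto& key : perPiece)
                key = rng();
    }
};

// Quiet (non-capturing) move: XOR the piece out of `from` and into `to`.
// One table lookup per square touched.
inline std::uint64_t hashAfterQuietMove(const ZobristKeys& z, std::uint64_t hash,
                                        int pieceIndex, int from, int to) {
    return hash ^ z.piece[pieceIndex][from] ^ z.piece[pieceIndex][to];
}
```

Because XOR is its own inverse, applying the same move in reverse restores the previous hash, so no separate "undo" table is needed.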

It was limited to 12 because I had a problem with COMDAT folding. But it's fixed now.
Also, IMO bitboards will beat mailbox/array lookup forever.

Something like this needs 4 instructions and will check every enemy pawn for check or castling obstruction. So 8 pawns and 2 directions and still literally only 4 instructions. Zero ifs, zero for loops (2x AND, 2x shift).

Code:

Pawn_AttackLeft<enemy>(Pawns<enemy>(brd) & Pawns_NotLeft());
Pawn_AttackRight<enemy>(Pawns<enemy>(brd) & Pawns_NotRight());
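
For readers without the source, here is a self-contained sketch of what such pawn attack helpers could look like. It is not Gigantua's code: the square mapping (a1 = bit 0, h8 = bit 63), the names, and the `bool White` template parameter are all assumptions. "Left" means toward the a-file and "right" toward the h-file, seen from White's side.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// ASSUMPTION: a1 = bit 0, b1 = bit 1, ..., h8 = bit 63.
constexpr Bitboard FILE_A = 0x0101010101010101ULL;
constexpr Bitboard FILE_H = 0x8080808080808080ULL;

// Attacks toward the a-file for all pawns at once: one AND, one shift.
// Masking off the a-file stops pawns from wrapping around the board edge.
template <bool White>
constexpr Bitboard pawnAttacksLeft(Bitboard pawns) {
    const Bitboard safe = pawns & ~FILE_A;
    return White ? (safe << 7) : (safe >> 9);
}

// Attacks toward the h-file for all pawns at once: one AND, one shift.
template <bool White>
constexpr Bitboard pawnAttacksRight(Bitboard pawns) {
    const Bitboard safe = pawns & ~FILE_H;
    return White ? (safe << 9) : (safe >> 7);
}
```

Since the colour is a template parameter and the file masks are constants, each call should compile down to an AND and a shift, which matches the "2x AND, 2x shift" count for both directions. For example, a white pawn on e4 (bit 28) attacks d5 (bit 35) and f5 (bit 37).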

Also maybe it will help someone - the fastest way I have found to loop over all set bits in a uint64_t in terms of squares is this:

Code:

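#include <cstdint>
#include <immintrin.h> // _blsi_u64, _blsr_u64, _tzcnt_u64 (BMI1)

// Bit, Square and _Inline are not shown in this post; presumably they are defined elsewhere in the project.
_Inline static Bit PopBit(uint64_t& val)
{
    uint64_t lsb = _blsi_u64(val); // isolate the lowest set bit
    //val = _blsr_u64(val); - 3% slower.
    val ^= lsb; //This is faster than blsr_u64
    return lsb;
}

_Inline static Square SquareOf(Bit val) {
    return _tzcnt_u64(val); // index of the lowest set bit
}

//Code would be:
while (Rooks) {
    const Bit pos = PopBit(Rooks);
    const Square sq = SquareOf(pos);
    //Square is 0..63
}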
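
For anyone who wants to try the loop without BMI intrinsics, here is a self-contained sketch of the same idea. It is a stand-in, not the author's code: `popLowestBit` and `squaresOf` are made-up names, and `__builtin_ctzll` is a GCC/Clang builtin (on C++20, `std::countr_zero` from `<bit>` does the same job). No claim is made about its speed relative to the intrinsic version.

```cpp
#include <cstdint>
#include <vector>

// Isolate the lowest set bit and clear it from bb.
// bb & (~bb + 1) computes the same value as the BLSI instruction.
inline std::uint64_t popLowestBit(std::uint64_t& bb) {
    const std::uint64_t lsb = bb & (~bb + 1);
    bb ^= lsb;
    return lsb;
}

// Collect the square index (0..63) of every set bit, lowest first.
inline std::vector<int> squaresOf(std::uint64_t bb) {
    std::vector<int> squares;
    while (bb) {
        const std::uint64_t bit = popLowestBit(bb);
        squares.push_back(__builtin_ctzll(bit));  // GCC/Clang builtin
    }
    return squares;
}
```

The loop runs once per set bit rather than once per square, which is why it suits sparse piece bitboards.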

Anyway, the version is out on GitHub, with the git repo tidied up for release soon!
Latest Release v1.2 now 20% faster. https://github.com/Gigantua/Gigantua
