Chess AI engine in 5 years.

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

towforce
Posts: 12511
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: Chess AI engine in 5 years.

Post by towforce »

Dann Corbit wrote: Sun Oct 13, 2024 1:13 pm...I remember back when CPUs were around 800 MHz, there were predictions in this forum that 1 GHz was impossible due to trace creep on the ICs.

I remember Robert Hyatt saying that the upper limit is approximately 1 GHz, and I believed him because, as a professor in the subject, he was an authority. As best I remember, the argument was that at this frequency too much of the electrical power would be radiated away as electromagnetic waves.

However the knowledge is generated, though, be it game trees, hand-coded evaluations or NN evaluations, it has diminishing returns. Any kind of knowledge (not just chess) has diminishing returns - hence chatbots will cease to get significantly better in 20-30 years (link).

The value of increasing amounts of knowledge follows an "S" curve (see the sketch after this list):

* at first, you don't get much return from adding knowledge to your system

* at some point, you start getting tremendous returns

* at some point after that, the returns diminish to the point at which you could multiply the amount of knowledge by a thousand, and your system would barely be any better
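A minimal sketch of that shape, modelling value as a logistic function of log-knowledge (the midpoint and steepness here are made-up illustration parameters, not measured values):

```python
# Illustrative sketch of the "S" curve: system value as a function of
# knowledge, modelled (purely as an assumption) by a logistic on a log scale.
import math

def system_value(knowledge: float, midpoint: float = 1e6, steepness: float = 5.0) -> float:
    """Value on a 0..1 scale; knowledge is measured on a log10 scale."""
    x = math.log10(max(knowledge, 1.0))
    m = math.log10(midpoint)
    return 1.0 / (1.0 + math.exp(-steepness * (x - m)))

for k in [1e3, 1e5, 1e6, 1e7, 1e9, 1e12]:
    print(f"knowledge {k:>8.0e}: value {system_value(k):.3f}")
# Early on, adding knowledge barely helps; around the midpoint the returns
# are huge; at the top, multiplying knowledge by a thousand barely moves it.
```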
Human chess is partly about tactics and strategy, but mostly about memory
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Chess AI engine in 5 years.

Post by Dann Corbit »

Viz wrote: Sun Oct 13, 2024 1:58 pm
Dann Corbit wrote: Sun Oct 13, 2024 1:13 pm Software improvements by the SF team are focused on improving the net to a large degree
This is just false.
When we had HCE, improvements in strength were 80% search / 20% eval.
With NNUE, and after we stabilized it a bit, it's more like 60-70% search and the rest is eval. Which still means that search improves on average significantly faster than evaluation.
I stand corrected because you are certainly in a position to know better than me.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Jouni
Posts: 3651
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Chess AI engine in 5 years.

Post by Jouni »

It's stunning that there has been so little clock-speed progress since 2000. Even the Pentium 4 reached 3.8 GHz! But running 128 cores simultaneously gives a sure gain. :D
Jouni
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Chess AI engine in 5 years.

Post by Dann Corbit »

We also get a lot of gain from shrinking component size. At some point that will end, because clearly we cannot make a trace smaller than one atom in width.
;-)
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Viz
Posts: 223
Joined: Tue Apr 09, 2024 6:24 am
Full name: Michael Chaly

Re: Chess AI engine in 5 years.

Post by Viz »

Dann Corbit wrote: Sun Oct 13, 2024 3:40 pm
Viz wrote: Sun Oct 13, 2024 1:58 pm
Dann Corbit wrote: Sun Oct 13, 2024 1:13 pm Software improvements by the SF team are focused on improving the net to a large degree
This is just false.
When we had HCE, improvements in strength were 80% search / 20% eval.
With NNUE, and after we stabilized it a bit, it's more like 60-70% search and the rest is eval. Which still means that search improves on average significantly faster than evaluation.
I stand corrected because you are certainly in a position to know better than me.
Well you can see it yourself.
https://tests.stockfishchess.org/tests/ ... 47d953c2c5
https://tests.stockfishchess.org/tests/ ... 47d953c2c3
https://github.com/Disservin/Stockfish/ ... 9766db8139
The net is still the same as it was in SF 17 (there is a pending pull request for a new one, but it's not included in this test) - and look at all these commits, lol.
Rebel
Posts: 7382
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Chess AI engine in 5 years.

Post by Rebel »

Viz wrote: Sun Oct 13, 2024 12:17 pm
Jouni wrote: Sun Oct 13, 2024 10:33 am The progress has almost stopped now (CCRL):

Stockfish 17 64-bit 8CPU 3808
Stockfish 16 64-bit 8CPU 3807
Stockfish 16.1 64-bit 8CPU 3804
Stockfish 15 64-bit 8CPU 3802
It's only because CCRL isn't a suitable list for the top engines, and the top is killed by Elo compression.
If you look at play with unbalanced positions there is reasonable progress.
Well, even with balanced positions there is; it's just that CCRL's error bars are too big and its time controls too long to show it:
https://tests.stockfishchess.org/tests/ ... f9b33d15a0
You may say that 7 Elo is not a lot, but in actual fact it wins 4.5 times more game pairs than it loses, which is a big deal. It just doesn't look as impressive when you draw 95% of the time.
Exactly.

We just need a new Elo system that deals with that.
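The arithmetic behind Viz's point is easy to check with a rough per-game model (my own back-of-envelope sketch; game-pair counting skews the ratio even further):

```python
# The same small Elo edge implies a much larger win/loss ratio when most
# games are drawn. This is a per-game approximation, not fishtest's
# pentanomial model, so the 95%-draws ratio comes out lower than the
# "4.5x game pairs" quoted above.
import math

def expected_score(elo_diff: float) -> float:
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def win_loss_ratio(elo_diff: float, draw_rate: float) -> float:
    surplus = expected_score(elo_diff) - 0.5   # score above 50% per game
    win = (1.0 - draw_rate) / 2.0 + surplus    # win + loss = 1 - draw_rate
    loss = (1.0 - draw_rate) / 2.0 - surplus   # win - loss = 2 * surplus
    return win / loss

for dr in (0.50, 0.80, 0.95):
    print(f"7 Elo at {dr:.0%} draws: win/loss ratio {win_loss_ratio(7, dr):.2f}")
# At 50% draws a 7 Elo edge looks tiny (ratio ~1.08); at 95% draws the
# same edge wins ~2.4 games for every loss, and pair ratios skew higher.
```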
90% of coding is debugging, the other 10% is writing bugs.
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Chess AI engine in 5 years.

Post by Mike Sherwin »

PeSTO demonstrated that a learned piece-square table can produce an engine that is above 3000 Elo. Personally I believe the apparent strength of PeSTO is more due to the weakness of HCE than the superiority of PeSTO. I don't understand NNUE very well but from what I have read it is a piece-square table for every position of the kings. How the NN manipulates the tables to get an eval I have no idea. So I probably should not be commenting. In my experience static piece-square tables don't understand chess. However, combining something better than flawed HCE with a far superior search produces the strongest engines so far.
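For concreteness, a piece-square-table evaluation of the PeSTO kind is just table lookups and a sum. A minimal sketch (the bonus table below is an invented toy, not PeSTO's actual tuned values):

```python
# Minimal piece-square-table evaluation in the PeSTO spirit. The values
# below are invented placeholders, NOT PeSTO's actual tuned numbers.
PIECE_VALUE = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 0}

# A real engine has one 64-entry table per piece type (and per game phase);
# here a single toy table that just likes central squares.
def central_bonus(square: int) -> int:
    file, rank = square % 8, square // 8
    return 5 * (3 - max(abs(file * 2 - 7), abs(rank * 2 - 7)) // 2)

def evaluate(white_pieces, black_pieces):
    """pieces: list of (piece_letter, square 0..63, a1=0). Positive = good for White."""
    score = 0
    for piece, sq in white_pieces:
        score += PIECE_VALUE[piece] + central_bonus(sq)
    for piece, sq in black_pieces:
        score -= PIECE_VALUE[piece] + central_bonus(sq ^ 56)  # mirror the rank
    return score

# White knight on e4 vs Black knight on a8 -> +15 for White
print(evaluate([("N", 28)], [("N", 56)]))
```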

That is not going to prove good enough. Real-time learning is the future. More and more depth in search gives diminishing returns. If an engine searches deeper than it needs to in any line, that is wasted search. If an engine can use that extra time before the search to learn, making a piece-square table 'smarter', then it will search 'smarter'.

There are many ways that can be done. Monte Carlo (MC) games could be used for that. Shallow-searched alpha-beta games with RL can be used for that. An engine like RomiChess (CCRL 2400) needed only 20 games against Glaurung 2 in any given position to go from a 5% to a 95% score. IIRC that is a 1,000 Elo swing.
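That figure checks out against the standard performance formula (my own arithmetic below, nothing from RomiChess itself):

```python
# Sanity check of the quoted swing using the standard Elo performance
# formula; this is my own arithmetic, not anything from RomiChess's code.
import math

def performance_elo(score: float) -> float:
    """Elo offset implied by a score fraction against a fixed opponent."""
    return 400.0 * math.log10(score / (1.0 - score))

swing = performance_elo(0.95) - performance_elo(0.05)
print(f"{swing:.0f} Elo")  # ~1023, i.e. roughly the 1,000 quoted above
```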

I have said all this many times already but no one will believe me. Or there are too many bandwagon riders and not enough innovators!
hgm
Posts: 28387
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Chess AI engine in 5 years.

Post by hgm »

Mike Sherwin wrote: Mon Oct 14, 2024 5:11 am I don't understand NNUE very well but from what I have read it is a piece-square table for every position of the kings. How the NN manipulates the tables to get an eval I have no idea.
Nearly right, but it is actually a large number (256) of tables for each position of the king. The subsequent NN decides how to weight the scores from these tables.

Note that a PST can be used to mimic other things we typically calculate in HCE, such as the game phase (when you fill the table with the same value for every square), or the shade of the square a Bishop is on. This is why you need so many; it improves the chances that, starting from a random initialization, some of these will learn to recognize a meaningful feature of the position, and the later parts of the NN then learn how to combine these features.
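A toy numerical sketch of that layout (shapes and names are my own simplification of the HalfKP idea, not Stockfish's actual code):

```python
# Toy sketch of the NNUE first layer as described above: for each king
# square there is a (features x 256) block of "piece-square tables"; the
# active piece-on-square features select rows, which are summed into a
# 256-wide accumulator that the small downstream net then combines. In a
# real engine the accumulator is updated incrementally as pieces move,
# hence "Efficiently Updatable". Shapes and initialization are illustrative.
import numpy as np

N_KING_SQ, N_FEATURES, WIDTH = 64, 64 * 10, 256  # 10 non-king piece types
rng = np.random.default_rng(0)
first_layer = 0.1 * rng.standard_normal((N_KING_SQ, N_FEATURES, WIDTH), dtype=np.float32)
output_weights = 0.1 * rng.standard_normal(WIDTH, dtype=np.float32)

def accumulate(king_sq: int, active_features: list[int]) -> np.ndarray:
    # Sum of the selected "PST" rows: one 256-value vector for the position.
    return first_layer[king_sq, active_features].sum(axis=0)

def evaluate(king_sq: int, active_features: list[int]) -> float:
    acc = np.maximum(accumulate(king_sq, active_features), 0.0)  # clipped activation
    return float(acc @ output_weights)  # downstream net shown as one layer

print(evaluate(4, [3, 130, 257]))  # king on e1, three arbitrary features
```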
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Chess AI engine in 5 years.

Post by Mike Sherwin »

hgm wrote: Mon Oct 14, 2024 9:51 am
Mike Sherwin wrote: Mon Oct 14, 2024 5:11 am I don't understand NNUE very well but from what I have read it is a piece-square table for every position of the kings. How the NN manipulates the tables to get an eval I have no idea.
Nearly right, but it is actually a large number (256) of tables for each position of the king. The subsequent NN decides how to weight the scores from these tables.

Note that a PST can be used to mimic other things we typically calculate in HCE, such as the game phase (when you fill the table with the same value for every square), or the shade of the square a Bishop is on. This is why you need so many; it improves the chances that, starting from a random initialization, some of these will learn to recognize a meaningful feature of the position, and the later parts of the NN then learn how to combine these features.
Thanks for the explanation. It still sounds static. Once the NN is trained on a billion positions it just creates static tables, in that one huge set of generalized values is used for every specific position. That cannot be optimal. Spending some time at the beginning of a search to learn better values for the tables for the specific position on the board will destroy a static-only NN.
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Chess AI engine in 5 years.

Post by Mike Sherwin »

Mike Sherwin wrote: Mon Oct 14, 2024 5:11 pm
hgm wrote: Mon Oct 14, 2024 9:51 am
Mike Sherwin wrote: Mon Oct 14, 2024 5:11 am I don't understand NNUE very well but from what I have read it is a piece-square table for every position of the kings. How the NN manipulates the tables to get an eval I have no idea.
Nearly right, but it is actually a large number (256) of tables for each position of the king. The subsequent NN decides how to weight the scores from these tables.

Note that a PST can be used to mimic other things we typically calculate in HCE, such as the game phase (when you fill the table with the same value for every square), or the shade of the square a Bishop is on. This is why you need so many; it improves the chances that, starting from a random initialization, some of these will learn to recognize a meaningful feature of the position, and the later parts of the NN then learn how to combine these features.
Thanks for the explanation. It still sounds static. Once the NN is trained on a billion positions it just creates static tables, in that one huge set of generalized values is used for every specific position. That cannot be optimal. Spending some time at the beginning of a search to learn better values for the tables for the specific position on the board will destroy a static-only NN.
A first experiment for pre-search learning might be to create PeSTO tables with a range of values around the static values. Use MC games to modify those values within the range. See if it results in a stronger engine.
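Sketched in code, that experiment could look something like this (every name, range, and update rule here is a hypothetical placeholder for the proposal above, not a tested design):

```python
# Hypothetical sketch of the proposed experiment: each PST entry gets a
# static base value plus a bounded correction that pre-search Monte Carlo
# games nudge up or down. Names, ranges and the update rule are all my
# own assumptions, not a tested design.
import random

RANGE = 20  # each entry may drift at most +/-20 centipawns from its base

base_pst = [0] * 64          # stand-in for one PeSTO table (one piece type)
correction = [0.0] * 64      # learned offsets, clamped to +/-RANGE

def tuned_value(square: int) -> float:
    return base_pst[square] + correction[square]

def play_fast_mc_game(corr):           # placeholder for a real engine call
    return random.choice((1, 0, -1))   # would play a quick game using corr

def pre_search_learn(squares_in_position, games=20, step=2.0):
    """Before the real search, nudge the PST entries the current position
    actually uses, based on quick Monte Carlo game results."""
    for _ in range(games):
        result = play_fast_mc_game(correction)  # +1 win, 0 draw, -1 loss
        for sq in squares_in_position:
            c = correction[sq] + step * result
            correction[sq] = max(-RANGE, min(RANGE, c))

pre_search_learn(squares_in_position=[12, 28, 36])
print([tuned_value(sq) for sq in (12, 28, 36)])
```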