Top programs get today 3-5 elo from core doubling
Moderator: Ras
-
Jouni
- Posts: 3759
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Top programs get today 3-5 elo from core doubling
No more 100 Elo, 70 Elo or 30 Elo as previously! See here https://prodeo.actieforum.com/t1832-arc ... games-3m2s . When the gain is zero, is chess solved?
Jouni
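For scale, a rating difference can be converted into an expected score with the standard logistic Elo model. The sketch below assumes that model (the linked test may compute its figures differently), and the function name `expected_score` is only illustrative.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) for the side rated elo_diff points higher,
    under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # Gains per core doubling mentioned in the post above.
    for gain in (100, 70, 30, 5, 3):
        print(f"{gain:>3} Elo -> expected score {expected_score(gain):.3f}")
```

Under this model a 5 Elo edge corresponds to an expected score of about 50.7%, against roughly 64% for a 100 Elo edge.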
-
Ajedrecista
- Posts: 2164
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Top programmes get today 3-5 Elo from core doubling.
Hello Jouni:
Jouni wrote: Wed Dec 10, 2025 11:46 am
No more 100 Elo, 70 Elo or 30 Elo as previously! See here https://prodeo.actieforum.com/t1832-arc ... games-3m2s . When the gain is zero, is chess solved?
Not really. I always bring up the example of English draughts/American checkers, where 100% of top games were draws for years until a major breakthrough appeared. I recall reading somewhere that a top checkers programmer once joked to another that if he really needed 10 seconds/move to play 'perfectly' instead of 1 second/move, he could take them. This happened before the major breakthrough... I cannot find the quote, sadly. I guess that this 10"/move vs. 1"/move comparison somewhat relates to speed gains and core doublings in chess, but the 10"/move vs. 1"/move comparison referred to different engines, to be fair.
This computer checkers example holds for computer chess too, IMHO, i.e. for the pre-NNUE and NNUE eras. We all know that the latest pre-NNUE SF against itself, with balanced openings and without time handicaps, yielded a very high draw ratio... were we playing 'near perfectly'? Just compare the current latest SF with the latest pre-NNUE SF under the same conditions and you will see that the current latest SF has a measurable edge over the other, of course (well, I do not know about a 1 week/move TC). I guess we are heading towards a wall with the current method (there is still room for improvement) until the next major breakthrough appears and we move on again. When, and how good? Time will tell, as always.
Other insights are welcome.
Regards from Spain.
Ajedrecista.
-
royb
- Posts: 574
- Joined: Thu Mar 09, 2006 12:53 am
Re: Top programs get today 3-5 elo from core doubling
Jouni wrote: Wed Dec 10, 2025 11:46 am
No more 100 Elo, 70 Elo or 30 Elo as previously! See here https://prodeo.actieforum.com/t1832-arc ... games-3m2s . When the gain is zero, is chess solved?
This makes one wonder - is it really worth purchasing an expensive CPU with lots of cores to run Stockfish on? With such a rapidly diminishing return - I think that may not be warranted for many people. And of course many CPUs these days are multi-core, but going past 8 or even 4 cores may not provide much benefit in terms of Stockfish playing strength.
Makes me think that even a fast single core CPU (not that there are single core CPUs anymore) that has very fast single core performance should be more than enough even for analysis of GM games (since Stockfish is SOOOOOO much stronger than any GM even on a smartphone, let alone a laptop or desktop).
-
Uri Blass
- Posts: 11120
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Top programmes get today 3-5 Elo from core doubling.
Ajedrecista wrote: Wed Dec 10, 2025 7:15 pm
[...] I always bring the example of English draughts/American checkers case, where 100% of top games were draws for years until a major breakthrough appeared. [...]
I read the following:
"To better understand what an enormous improvement those machine-learned engines represent, it might be worth noting that Cake 1.8x (and the last handtuned version of KingsRow) were far better than the last computer world champion, Nemesis, which in turn was far better than the famous Chinook was"
It is a surprise for me for 2 reasons:
1) I thought Chinook was clearly the best, because its programmer was responsible for solving checkers in 2007.
https://en.chessbase.com/post/500-billi ... e-checkers
2) I am surprised that people were still interested in making better checkers engines many years after the game was proved to be a draw.
I wonder whether most of the games in the match that ended 3-1 with 620 (!) draws contained no losing mistake, or whether many of them contained a losing mistake that the opponent did not know how to exploit.
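As a rough yardstick for that match, the score can be turned into an Elo difference with the standard logistic Elo model. This is only a sketch under that assumption; `elo_from_match` is an illustrative name, and the calculation says nothing about whether the drawn games contained unpunished mistakes.

```python
import math


def elo_from_match(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match score, logistic Elo model.
    Raises ValueError for an empty match or a 0% / 100% score."""
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # 3 wins, 1 loss, 620 draws: the 624-game match mentioned above.
    print(f"{elo_from_match(3, 1, 620):+.2f} Elo")
```

With 3 wins, 1 loss and 620 draws this comes out at roughly +1 Elo, i.e. very close to equal strength under this model.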
-
jefk
- Posts: 1082
- Joined: Sun Jul 25, 2010 10:07 pm
- Location: the Netherlands
- Full name: Jef Kaan
Re: Top programs get today 3-5 elo from core doubling
Ajedrecista wrote:
latest pre-NNUE SF against itself with balanced openings and without time handicaps retrieved a very high draw ratio...
Depends on time control (and hardware), and on the sort of openings. Even without going
to unbalanced TCEC openings, there are very sharp 'normal' openings (e.g. the Najdorf
with Bg5) where the draw rate wasn't so high, even in correspondence chess.
In those times the effect of faster hardware was much bigger, maybe 15 Elo (very rough
estimate) for every doubling of the number of cores, but decreasing later on.
Some correspondence players may confirm this.
So you can't compare the current situation with the pre-NNUE situation.
What has changed is that current hardware plus NNUE plus the latest opening theory
is approaching a limit; so Jouni is right.
Some software improvements still might be possible of course, but it would
surprise me if they were as big as the jump from HCE to NNUE.
Look at the graph of Elo vs playing strength (in whatever way achieved:
hardware, software, time control); for slow games it is converging to possibly
4000 or so, or slightly lower, thus of course indicating a draw with perfect play
from the normal opening position. Of course in problem chess, especially
complicated problems, there still is plenty of room for further improvement.
Show me an opening line for White which achieves a big advantage
in all (sub)variations, an advantage which increases with depth
(like the advantage for Black increases after 1.g4? the deeper you go).
As long as you (or someone else) can't do this, there's no reason to believe chess is
not a draw (with perfect play), whether you (or others) like it or not.
-
Ajedrecista
- Posts: 2164
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Top programmes get today 3-5 Elo from core doubling.
Hello:
jefk wrote: Fri Dec 12, 2025 8:55 pm
[...]latest pre-NNUE SF against itself with balanced openings and without time handicaps retrieved a very high draw ratio...
So you can't compare the current situation with the pre-nnue situation.
[...]
Some software improvements still might be possible of course, but it would
surprise me if they would be as big as the jump from HCE to Nnue.
[...]
Show me an opening line for White which is achieving a big advantage
in all (sub)variations, and advantage which is increasing with depth.
(like the advantage for Black is increasing after 1.g4? the deeper you go).
As long as you (or someone else) can't do this there's no reason to believe chess is
not a draw (with perfect play); whether you (or others) like it or not.
I am the first one that firmly thinks that chess is a draw with perfect play. However, this is not incompatible with a draw ratio below 100% nowadays, where a vastly stronger engine sometimes can exploit inaccuracies of the weaker engine in the middlegame... whether you (or others) like it or not.
I also said that time will tell when, and how good, the next major breakthrough will be. This does not necessarily mean hundreds or thousands of Elo all at once.
Regards from Spain.
Ajedrecista.
-
Ryan Benitez
- Posts: 725
- Joined: Thu Mar 09, 2006 1:21 am
- Location: Portland Oregon
Re: Top programs get today 3-5 elo from core doubling
royb wrote: Fri Dec 12, 2025 1:36 am
This makes one wonder - is it really worth purchasing an expensive CPU with lots of cores to run Stockfish on? [...]
I think it depends on your goals, but if your goal is to analyze GM-level games I would get a decent GPU as well, so you can analyze with Stockfish, Dragon/Komodo, and Lc0.
-
jefk
- Posts: 1082
- Joined: Sun Jul 25, 2010 10:07 pm
- Location: the Netherlands
- Full name: Jef Kaan
Re: Top programs get today 3-5 elo from core doubling
Ajedrecista wrote:
vastly stronger engine sometimes can exploit inaccuracies of the weaker engine in the middlegame...
Depends how weak the 'weaker' engine is.
At the top level, e.g. the top four (CCRL) engines, with sufficient calculation time,
a vastly stronger engine will make no difference, just like doubling
CPUs (as Jouni pointed out) will make no difference.
I wrote 'draw with perfect play', with which you seem to agree,
but with many draw lines there's no such thing as 'perfect play'.
There are zillions of drawing lines, and the only thing the weaker
engine has to do is to stay within the drawing margin. So there
are many, many 'perfect lines' and they are actually not so difficult to find (SF, 6 cores,
20 mins). Of course keeping the White score below 0.1 is more 'perfect' than
e.g. below 0.2-0.3, and such lines are more 'perfect' for Black. What I usually
do if I end up with some disadvantage (e.g. -0.25) is take more time with SF,
and then White still won't be able to win. Against a 'vastly stronger' engine
it probably would be wise to keep the score within 0.1, but this is not
difficult to do. Look at the Chinese opening database, and take sufficient
time once out of book. Oh and btw it's not 97 but 100 pct draws nowadays already
if the players put in the first-choice engine move, rather than sometimes experimenting
with odd book lines or the third/fourth engine line, which may be sharper or whatever.
So the level (100 pct draws) is already achieved in ICCF correspondence chess
at the top, believe it or not (apparently you're not a correspondence player).
NB talking about such possible future 'vastly stronger' engines, you can only distinguish
them from what we have now if we had different (engine) Elos for different
categories of chess, not only for blitz/rapid/slow, but also for problem chess.
A 'vastly stronger' engine in problem chess, e.g. Elo 4500, would then be possible,
but not in normal chess, because the Elo is converging (e.g. to 4000) as a result of
the drawish nature of the game. NB this also means that such a 'vastly
stronger' engine, Elo 4500 or higher for problem chess, will not be able to
beat the 'weaker' engine (at e.g. 3900 Elo or even the current level) because
the 'weak' engine will be able to stay within the draw margin; whether
you like it or not. Ideas (like e.g. by mr TF) that you might be able to 'steer
the game into such complicated positions' to take advantage of the stronger
engine are not correct, because Black nowadays can always prevent this (something
which maybe only some opening experts are able to understand); many important
opening lines have already been analyzed to the -almost- end, e.g. the Marshall gambit
in the RL, etc. So with such a 'vastly stronger' engine you could deviate, e.g.
starting with 1.b3 (or worse, 1.a4 or so), but then the first four or five
moves of the Chinese db will still be good enough; then, not trusting the
old SF analysis at shallow depth anymore, you simply let SF17.1 run
for a sufficiently long time (e.g. from one hour to possibly even a whole night).
Result: draw (Steinitz already said he could in principle draw against God with one
pawn less with Black, and he was right btw; but that's another story again).
Now if White cannot win against Black with 7 pawns instead of 8, how do
you think the situation is if Black keeps all 8 pawns?
PS from my memory (although not 100 pct sure), decades ago it was
estimated that doubling the number of CPUs could give 50 Elo extra.
Now it's only 5, a clear indication we reached the limit.
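To put those two per-doubling figures side by side, here is a small arithmetic sketch. It assumes, purely for illustration, that the gain per doubling stays constant across doublings (the thread itself says it shrinks), and the 1-to-64 core range is an arbitrary example; `total_gain` is a hypothetical helper name.

```python
import math


def total_gain(cores_from: int, cores_to: int, elo_per_doubling: float) -> float:
    """Total Elo gained going from cores_from to cores_to cores,
    assuming a constant gain per doubling (a simplification)."""
    doublings = math.log2(cores_to / cores_from)
    return doublings * elo_per_doubling


if __name__ == "__main__":
    for per_doubling in (50, 5):  # figures quoted from memory in the post
        gain = total_gain(1, 64, per_doubling)
        print(f"{per_doubling} Elo/doubling, 1 -> 64 cores: {gain:.0f} Elo")
```

Six doublings at 50 Elo each would add up to about 300 Elo, while at 5 Elo each they add up to about 30.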
As for the 'vastly stronger' engines in problem chess (with artificially constructed
complicated positions), well I guess not many programmers will be interested
in this, so progress in this area will probably be rather slow anyway. There exist
such developments of course, not often mentioned here and sometimes even difficult to
find (the latest (dev) version) on GitHub: a certain 'Huntsman' (mating), Crystal
(tactics), SF clones with fortress detection (later also in Don ?) Sting black hole, etc.
etc. Interesting toys but usually useless in correspondence chess btw.
-
Ajedrecista
- Posts: 2164
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Top programmes get today 3-5 Elo from core doubling.
Hello Uri:
Uri Blass wrote: Fri Dec 12, 2025 2:31 am
I read the following:
"To better understand what an enormous improvement those machine-learned engines represent, it might be worth noting that Cake 1.8x (and the last handtuned version of KingsRow) were far better than the last computer world champion, Nemesis, which in turn was far better than the famous Chinook was"
It is a surprise for me for 2 reasons:
1) I thought Chinook was clearly the best, because its programmer was responsible for solving checkers in 2007.
https://en.chessbase.com/post/500-billi ... e-checkers
2) I am surprised that people were still interested in making better checkers engines many years after the game was proved to be a draw.
I wonder whether most of the games in the match that ended 3-1 with 620 (!) draws contained no losing mistake, or whether many of them contained a losing mistake that the opponent did not know how to exploit.
Please take my answers with care:
1) Computer checkers in those days was basically about building good, deep opening books and relying on EGDB (endgame databases); the middlegame was played by the engines, and the aim was to shorten that phase, targeting a short transition between the opening and the endgame, thus minimizing the role of the engines in the game. Although the engines' searches and evaluations were good, they were not as good as nowadays. Opening books and EGDB were built with the help of engines, of course, but these tasks were not timed like a game.
2) I see it as closely related to the former answer: trying to get more autonomous engines that find the correct moves in less time, and making them less dependent on opening books and EGDB.
Regarding the 624-game match, the PDN (Portable Draughts Notation, equivalent to PGN in chess) file is available here. I guess that anyone can analyse all these games with current KingsRow and Cake engines. Computer chess over different eras might not be that far from your quote 'maybe there was a losing mistake in many of them that the opponent did not know to take advantage of them'. Mistakes are becoming more and more subtle over time, for sure.
Regards from Spain.
Ajedrecista.