for different engines with different time control.
The research could be done with the following steps:
1) Take a big database of human chess games.
2) Analyze every position for 10 seconds (assuming 10 seconds per move is the longest time control you want to research).
3) Save the evaluation of every engine after 0.3 seconds, 1 second, 3 seconds and 10 seconds for every position in the database.
4) For every evaluated position, play an engine-engine game between the engine and itself and save the result.
For example, if engine A has an evaluation of 1.2-1.29 in 10000 positions after 1 second, and the result of A against A at 1 second per move is 7900 wins for the better side, 2000 draws and 100 losses, it means that 1.2-1.29 translates to an expected result of (7900 + 0.5·2000)/10000 = 0.89 at 1 second per move.
Of course doing the research may take a lot of computer time, but I think the results may be interesting.
I wonder whether the same evaluation at a longer time control has a bigger expected result, a smaller expected result, or no difference.
It is not clear.
Of course, if there is a win, there is a better probability that the engine is going to find it at a longer time control, but it is not the case that the same positions are evaluated as 1.2-1.29 at different time controls.
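A minimal sketch of how the bucketing and aggregation in the example above could be coded (my own assumptions, not part of the proposal: evaluations in pawns from the better side's point of view, results as 1/0.5/0 for the better side, and 0.1-pawn buckets):

```python
from collections import defaultdict

# Minimal sketch of the bookkeeping for step 4. Assumptions (mine): evaluations
# are in pawns from the better side's point of view, results are 1/0.5/0 for
# the better side, buckets are 0.1 pawns wide.

def bucket(eval_pawns, width=0.1):
    """Map an evaluation such as 1.23 to its bucket, e.g. (1.2, 1.3)."""
    lo = int(eval_pawns / width) * width
    return (round(lo, 2), round(lo + width, 2))

def expected_result_per_bucket(samples):
    """samples: iterable of (eval_pawns, result) pairs, result in {1, 0.5, 0}."""
    score = defaultdict(float)
    games = defaultdict(int)
    for ev, result in samples:
        b = bucket(ev)
        score[b] += result
        games[b] += 1
    return {b: score[b] / games[b] for b in games}

# The example from the post: 7900 wins, 2000 draws, 100 losses in the 1.2-1.29 range
samples = [(1.25, 1.0)] * 7900 + [(1.25, 0.5)] * 2000 + [(1.25, 0.0)] * 100
print(expected_result_per_bucket(samples))   # {(1.2, 1.3): 0.89}
```

Running the same aggregation separately on the 0.3-second, 1-second, 3-second and 10-second evaluations would then give the time-control comparison directly.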
I wonder if there is a research about expected result as function of eval
-
Uri Blass
- Posts: 11161
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
-
towforce
- Posts: 12796
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
- Full name: Graham Laight
Re: I wonder if there is a research about expected result as function of eval
Uri Blass wrote: ↑Thu Feb 05, 2026 4:42 pm
My guesses (Eval: win%, draw%, loss%):
0: 60, 20, 20
0.25: 65, 19, 16
0.5: 71, 18, 11
0.75: 77, 15, 8
1: 90, 9, 1
1.25: 93, 6, 1
1.5: 95, 4, 1
1.75: 97, 3, 0
2: 99, 1, 0
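To put those guesses on the same scale as the expected result in the opening post (expected score = win% + draw%/2), a quick conversion of the table above; the numbers are the guesses, not measured data:

```python
# Expected score implied by the guesses above: win% + draw%/2 (a loss counts 0).
guesses = {0.0: (60, 20, 20), 0.25: (65, 19, 16), 0.5: (71, 18, 11),
           0.75: (77, 15, 8), 1.0: (90, 9, 1), 1.25: (93, 6, 1),
           1.5: (95, 4, 1), 1.75: (97, 3, 0), 2.0: (99, 1, 0)}
for ev, (w, d, l) in guesses.items():
    print(f"{ev:4}: expected score {(w + d / 2) / 100:.3f}")
```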
Human chess is partly about tactics and strategy, but mostly about memory
-
Uri Blass
- Posts: 11161
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: I wonder if there is a research about expected result as function of eval
towforce wrote: ↑Thu Feb 05, 2026 9:11 pm
With a strong engine and a long time control, I expect an eval of 0 to give something like 99.9% draws, 0.05% wins for white and 0.05% wins for black.
-
towforce
- Posts: 12796
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
- Full name: Graham Laight
Re: I wonder if there is a research about expected result as function of eval
Good point: in your 4-point plan, point 1 specified games played by domesticated apes - but point 4 specified that each position would be played out to a result by strong computers.
Human chess is partly about tactics and strategy, but mostly about memory
-
chrisw
- Posts: 4791
- Joined: Tue Apr 03, 2012 4:28 pm
- Location: Midi-Pyrénées
- Full name: Christopher Whittington
Re: I wonder if there is a research about expected result as function of eval
Uri Blass wrote: ↑Thu Feb 05, 2026 4:42 pm
When I last looked, SF and many others have calibrated the evaluation to “expected result”, presumably WDL statistics. This ought to be readily available somewhere.
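For reference, such calibrations generally have a logistic shape from centipawns to expected score. A toy version follows; the scale constant is a placeholder I chose so that +1.00 maps to roughly 0.75, and it is not Stockfish's actual fitted WDL model, which as far as I know also depends on the game phase:

```python
import math

# Toy eval-to-expected-score calibration. The scale constant is a placeholder
# chosen so +100 cp maps to about 0.75; the real Stockfish WDL model is fitted
# from its own game data and is more elaborate than a single logistic.
def expected_score(cp, scale=91.0):
    return 1.0 / (1.0 + math.exp(-cp / scale))

for cp in (0, 100, 200, 300, 440):
    print(cp, round(expected_score(cp), 3))
```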
-
Dann Corbit
- Posts: 12828
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: I wonder if there is a research about expected result as function of eval
There are a number of engines that let you show the engine evaluation as either centipawns or as win probability.
I did a calculation a long time ago using my database that shows the opponent almost never recovers from a score of -4.4 pawns.
That was a long time ago. New data might change that.
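A rough way to redo that kind of count today (not the original calculation; the file name, the -4.4 threshold and the assumption that the PGN carries [%eval ...] annotations are placeholders):

```python
import chess
import chess.pgn

# Count how often the side that reached -4.4 pawns or worse still drew or won,
# scanning a PGN whose moves carry [%eval ...] annotations (e.g. Lichess exports).
THRESHOLD_CP = -440
recovered = lost = 0

with open("games.pgn") as f:                      # placeholder file name
    while (game := chess.pgn.read_game(f)) is not None:
        result = game.headers.get("Result", "*")
        if result not in ("1-0", "0-1", "1/2-1/2"):
            continue
        for node in game.mainline():
            score = node.eval()                   # parses the [%eval ...] comment
            if score is None:
                continue
            cp = score.pov(chess.WHITE).score(mate_score=10000)
            if cp <= THRESHOLD_CP:                # White at -4.4 or worse
                recovered += result in ("1-0", "1/2-1/2")
                lost += result == "0-1"
                break                             # count each game once
            if cp >= -THRESHOLD_CP:               # Black at -4.4 or worse
                recovered += result in ("0-1", "1/2-1/2")
                lost += result == "1-0"
                break

print(f"recovered {recovered} of {recovered + lost} games")
```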
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
Uri Blass
- Posts: 11161
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: I wonder if there is a research about expected result as function of eval
chrisw wrote: ↑Fri Feb 06, 2026 6:24 am
I know that +1 means a 75% expected result (50% for a win and 50% for a draw, when I do not include the probability of a loss, which is probably almost 0), but I am not sure what data it is based on.
Maybe the data is Stockfish-Stockfish games that they play when testing changes, where the programs are not equal but almost equal, so it is a good approximation, but I am not sure.
For analysis of human-human games, I would like data that is based on computer-computer games, but only from positions that humans reached in their games.
I am not sure the expected result is the same, because positions in a specific evaluation range (1.2-1.29, for example) in human-human games and positions in the same range in comp-comp testing are different. I guess it is hard to get data in this way, because one game does not give data about many positions, and when you play many comp-comp games with UHO openings you get not only the positions the engines started from but also more positions during the game, each with an evaluation and a result.
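A sketch of that filtering idea; the file names and the assumption that the engine games carry [%eval ...] annotations are placeholders of mine:

```python
import chess
import chess.pgn

# Build the set of positions reached in human games, then keep only the
# (eval, result) samples from engine-engine games whose positions also occur
# in that set.

def position_key(board):
    # EPD drops the move counters, so the same position at different move
    # numbers still matches.
    return board.epd()

human_positions = set()
with open("human_games.pgn") as f:               # placeholder file name
    while (game := chess.pgn.read_game(f)) is not None:
        board = game.board()
        for move in game.mainline_moves():
            board.push(move)
            human_positions.add(position_key(board))

samples = []                                     # (eval in pawns, result for White)
result_map = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}
with open("engine_games.pgn") as f:              # placeholder file name
    while (game := chess.pgn.read_game(f)) is not None:
        result = result_map.get(game.headers.get("Result", "*"))
        if result is None:
            continue
        for node in game.mainline():
            score = node.eval()                  # parses the [%eval ...] comment
            if score is None:
                continue
            if position_key(node.board()) in human_positions:
                cp = score.pov(chess.WHITE).score(mate_score=10000)
                samples.append((cp / 100.0, result))

print(len(samples), "samples restricted to human-reached positions")
```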