for different engines with different time control.
The research could be done with the following steps:
1) Take a big database of human chess games.
2) Analyze every position for 10 seconds (assuming 10 seconds per move is the longest time control you want to research).
3) Save the evaluation of every engine after 0.3 seconds, 1 second, 3 seconds and 10 seconds for every position in the database.
4) For every evaluated position, play an engine-engine game between the engine and itself and save the result.
For example, if engine A has an evaluation of 1.2-1.29 in 10000 positions after 1 second, and the result of A against A at 1 second per move is 7900 wins for the better side, 2000 draws and 100 losses, it means that 1.2-1.29 translates to an expected result of (7900 + 0.5·2000)/10000 = 0.89 at 1 second per move.
Of course doing the research may take a lot of computer time, but I think the results may be interesting.
I wonder whether the same evaluation at a longer time control has a bigger expected result, a smaller expected result, or no difference.
It is not clear.
Of course, if there is a win, there is a better probability that the engine is going to find it at a longer time control, but it is not the case that the same positions are evaluated as 1.2-1.29 at different time controls.
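A minimal sketch of how the bucketing and aggregation in the example above could be coded (my own assumptions, not part of the proposal: evaluations in pawns from the better side's point of view, results as 1/0.5/0 for the better side, and 0.1-pawn buckets):

```python
from collections import defaultdict

# Minimal sketch of the bookkeeping for step 4. Assumptions (mine): evaluations
# are in pawns from the better side's point of view, results are 1/0.5/0 for
# the better side, buckets are 0.1 pawns wide.

def bucket(eval_pawns, width=0.1):
    """Map an evaluation such as 1.23 to its bucket, e.g. (1.2, 1.3)."""
    lo = int(eval_pawns / width) * width
    return (round(lo, 2), round(lo + width, 2))

def expected_result_per_bucket(samples):
    """samples: iterable of (eval_pawns, result) pairs, result in {1, 0.5, 0}."""
    score = defaultdict(float)
    games = defaultdict(int)
    for ev, result in samples:
        b = bucket(ev)
        score[b] += result
        games[b] += 1
    return {b: score[b] / games[b] for b in games}

# The example from the post: 7900 wins, 2000 draws, 100 losses in the 1.2-1.29 range
samples = [(1.25, 1.0)] * 7900 + [(1.25, 0.5)] * 2000 + [(1.25, 0.0)] * 100
print(expected_result_per_bucket(samples))   # {(1.2, 1.3): 0.89}
```

Running the same aggregation separately on the 0.3-second, 1-second, 3-second and 10-second evaluations would then give the time-control comparison directly.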
I wonder if there is a research about expected result as function of eval
-
Uri Blass
- Posts: 11161
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
-
towforce
- Posts: 12796
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
- Full name: Graham Laight
Re: I wonder if there is a research about expected result as function of eval
Uri Blass wrote: ↑Thu Feb 05, 2026 4:42 pm
My guesses (Eval: win%, draw%, loss%):
0: 60, 20, 20
0.25: 65, 19, 16
0.5: 71, 18, 11
0.75: 77, 15, 8
1: 90, 9, 1
1.25: 93, 6, 1
1.5: 95, 4, 1
1.75: 97, 3, 0
2: 99, 1, 0
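To put those guesses on the same scale as the expected result in the opening post (expected score = win% + draw%/2), a quick conversion of the table above; the numbers are the guesses, not measured data:

```python
# Expected score implied by the guesses above: win% + draw%/2 (a loss counts 0).
guesses = {0.0: (60, 20, 20), 0.25: (65, 19, 16), 0.5: (71, 18, 11),
           0.75: (77, 15, 8), 1.0: (90, 9, 1), 1.25: (93, 6, 1),
           1.5: (95, 4, 1), 1.75: (97, 3, 0), 2.0: (99, 1, 0)}
for ev, (w, d, l) in guesses.items():
    print(f"{ev:4}: expected score {(w + d / 2) / 100:.3f}")
```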
Human chess is partly about tactics and strategy, but mostly about memory
-
Uri Blass
- Posts: 11161
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: I wonder if there is a research about expected result as function of eval
towforce wrote: ↑Thu Feb 05, 2026 9:11 pm
With a strong engine and a long time control, I expect an eval of 0 to give something like 99.9% draws, 0.05% wins for white and 0.05% wins for black.
-
towforce
- Posts: 12796
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
- Full name: Graham Laight
Re: I wonder if there is a research about expected result as function of eval
Good point: in your 4-point plan, point 1 specified games played by domesticated apes - but point 4 specified that each position would be played out to a result by strong computers.
Human chess is partly about tactics and strategy, but mostly about memory
-
chrisw
- Posts: 4791
- Joined: Tue Apr 03, 2012 4:28 pm
- Location: Midi-Pyrénées
- Full name: Christopher Whittington
Re: I wonder if there is a research about expected result as function of eval
Uri Blass wrote: ↑Thu Feb 05, 2026 4:42 pm
When I last looked, SF and many others have calibrated the evaluation to “expected result”, presumably WDL statistics. This ought to be readily available somewhere.
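For reference, such calibrations generally have a logistic shape from centipawns to expected score. A toy version follows; the scale constant is a placeholder I chose so that +1.00 maps to roughly 0.75, and it is not Stockfish's actual fitted WDL model, which as far as I know also depends on the game phase:

```python
import math

# Toy eval-to-expected-score calibration. The scale constant is a placeholder
# chosen so +100 cp maps to about 0.75; the real Stockfish WDL model is fitted
# from its own game data and is more elaborate than a single logistic.
def expected_score(cp, scale=91.0):
    return 1.0 / (1.0 + math.exp(-cp / scale))

for cp in (0, 100, 200, 300, 440):
    print(cp, round(expected_score(cp), 3))
```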
-
Dann Corbit
- Posts: 12828
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: I wonder if there is a research about expected result as function of eval
There are a number of engines that let you show the engine evaluation as either centipawns or as win probability.
I did a calculation a long time ago using my database that shows the opponent almost never recovers from a score of -4.4 pawns.
That was a long time ago. New data might change that.
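A rough way to redo that kind of count today (not the original calculation; the file name, the -4.4 threshold and the assumption that the PGN carries [%eval ...] annotations are placeholders):

```python
import chess
import chess.pgn

# Count how often the side that reached -4.4 pawns or worse still drew or won,
# scanning a PGN whose moves carry [%eval ...] annotations (e.g. Lichess exports).
THRESHOLD_CP = -440
recovered = lost = 0

with open("games.pgn") as f:                      # placeholder file name
    while (game := chess.pgn.read_game(f)) is not None:
        result = game.headers.get("Result", "*")
        if result not in ("1-0", "0-1", "1/2-1/2"):
            continue
        for node in game.mainline():
            score = node.eval()                   # parses the [%eval ...] comment
            if score is None:
                continue
            cp = score.pov(chess.WHITE).score(mate_score=10000)
            if cp <= THRESHOLD_CP:                # White at -4.4 or worse
                recovered += result in ("1-0", "1/2-1/2")
                lost += result == "0-1"
                break                             # count each game once
            if cp >= -THRESHOLD_CP:               # Black at -4.4 or worse
                recovered += result in ("0-1", "1/2-1/2")
                lost += result == "1-0"
                break

print(f"recovered {recovered} of {recovered + lost} games")
```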
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
Uri Blass
- Posts: 11161
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: I wonder if there is a research about expected result as function of eval
chrisw wrote: ↑Fri Feb 06, 2026 6:24 am
I know that +1 means a 75% expected result (50% for a win and 50% for a draw, when I do not include the probability of a loss, which is probably almost 0), but I am not sure what data it is based on.
Maybe the data is Stockfish-Stockfish games that they play when testing changes, where the programs are not equal but almost equal, so it is a good approximation, but I am not sure.
For analysis of human-human games, I would like data that is based on computer-computer games, but only from positions that humans reached in their games.
I am not sure the expected result is the same, because positions in a specific evaluation range (1.2-1.29, for example) in human-human games and positions in the same range in comp-comp testing are different. I guess it is hard to get data in this way, because one game does not give data about many positions, and when you play many comp-comp games with UHO openings you get not only the positions the engines started from but also more positions during the game, each with an evaluation and a result.
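A sketch of that filtering idea; the file names and the assumption that the engine games carry [%eval ...] annotations are placeholders of mine:

```python
import chess
import chess.pgn

# Build the set of positions reached in human games, then keep only the
# (eval, result) samples from engine-engine games whose positions also occur
# in that set.

def position_key(board):
    # EPD drops the move counters, so the same position at different move
    # numbers still matches.
    return board.epd()

human_positions = set()
with open("human_games.pgn") as f:               # placeholder file name
    while (game := chess.pgn.read_game(f)) is not None:
        board = game.board()
        for move in game.mainline_moves():
            board.push(move)
            human_positions.add(position_key(board))

samples = []                                     # (eval in pawns, result for White)
result_map = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}
with open("engine_games.pgn") as f:              # placeholder file name
    while (game := chess.pgn.read_game(f)) is not None:
        result = result_map.get(game.headers.get("Result", "*"))
        if result is None:
            continue
        for node in game.mainline():
            score = node.eval()                  # parses the [%eval ...] comment
            if score is None:
                continue
            if position_key(node.board()) in human_positions:
                cp = score.pov(chess.WHITE).score(mate_score=10000)
                samples.append((cp / 100.0, result))

print(len(samples), "samples restricted to human-reached positions")
```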