New Stockfish 15,1 evaluations

Rowen
Posts: 119
Joined: Tue Nov 15, 2016 1:19 pm
Location: Cheshire, England

New Stockfish 15,1 evaluations

Post by Rowen »

Any thoughts on the new evaluations? Why the change? Is it due to Lc0 data being used in the nets? On a practical level, what do the evaluations represent? A +1 evaluation indicates a 50% chance of winning, but what about other evaluations? Is this a step forward?
CornfedForever
Posts: 650
Joined: Mon Jun 20, 2022 4:08 am
Full name: Brian D. Smith

Re: New Stockfish 15,1 evaluations

Post by CornfedForever »

Rowen wrote: Thu Feb 02, 2023 12:24 pm Any thoughts on the new evaluations? Why the change? is it due to Lc0 data being used in the Nets? On a practical level what do the evaluations represent, a +1 evaluation indicates a 50% chance of winning but what about other evaluations? Is this a step forward?
Better for analysis...in my opinion.
Uri Blass
Posts: 11120
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: New Stockfish 15,1 evaluations

Post by Uri Blass »

Rowen wrote: Thu Feb 02, 2023 12:24 pm Any thoughts on the new evaluations? Why the change? is it due to Lc0 data being used in the Nets? On a practical level what do the evaluations represent, a +1 evaluation indicates a 50% chance of winning but what about other evaluations? Is this a step forward?
The only way to know what the evaluation represents is testing.
The evaluation may mean something different at different time controls.

If somebody plays a lot of Stockfish self-play games at 1 second per move, starting from random positions taken from real chess games, then we can calculate statistically what a +1 evaluation means at 1 second per move (on some hardware) and what +1.3 means at 1 second per move on the same hardware.

I am not sure it means the same at 10 seconds per move. It may be interesting to know the difference, but somebody needs to devote a lot of computer time to it if we want to know.
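
A minimal sketch of the bookkeeping such a test would need, assuming the self-play games have already been reduced to (evaluation in centipawns, game result) pairs. The function name, the bucket width, and the sample records are made up for illustration; nothing here comes from an actual test run.

```python
from collections import defaultdict

def wdl_by_eval(samples, bucket_cp=10):
    """Return win/draw/loss fractions per evaluation bucket.

    samples: iterable of (eval_cp, result) pairs, where eval_cp is the
    engine evaluation in centipawns and result is "W", "D" or "L" from
    the point of view of the side the evaluation refers to.
    """
    counts = defaultdict(lambda: {"W": 0, "D": 0, "L": 0})
    for eval_cp, result in samples:
        # Snap the evaluation to the nearest bucket centre.
        bucket = round(eval_cp / bucket_cp) * bucket_cp
        counts[bucket][result] += 1

    table = {}
    for bucket in sorted(counts):
        c = counts[bucket]
        n = sum(c.values())
        table[bucket] = {k: v / n for k, v in c.items()}
    return table

# Made-up records, only to show the shape of the input.
samples = [(100, "W"), (102, "D"), (98, "W"), (104, "D"),
           (130, "W"), (131, "W"), (129, "D"), (130, "W")]
table = wdl_by_eval(samples)
print(table[100])
print(table[130])
```

Running the same tally separately for each time control and hardware setup would show whether +1 means the same thing at 1 second and at 10 seconds per move.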

I do not know for what time control and what hardware +1 means an expected result of 0.75. (I guess it is not exactly about the probability of winning, because it may be something like 50.1% for a win, 49.8% for a draw, and 0.1% for a loss.)
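
The arithmetic behind that guess can be checked with a short sketch. It assumes the usual scoring of 1 for a win, 0.5 for a draw, and 0 for a loss; the function name is only illustrative.

```python
def expected_score(p_win, p_draw, p_loss):
    """Expected game result with win = 1, draw = 0.5, loss = 0."""
    total = p_win + p_draw + p_loss
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return p_win + 0.5 * p_draw

# The split guessed in the post above.
score = expected_score(0.501, 0.498, 0.001)

# A very different split with the same expected score:
# no draws at all, 75% wins and 25% losses.
other = expected_score(0.75, 0.0, 0.25)

print(round(score, 3), round(other, 3))
```

Both splits give an expected result of 0.75, so an expected result alone does not pin down the probability of winning.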
Plutie
Posts: 20
Joined: Sun Jan 30, 2022 6:14 am
Full name: Evan Engler

Re: New Stockfish 15,1 evaluations

Post by Plutie »

Uri Blass wrote: Thu Feb 02, 2023 3:28 pm ...
I do not know based on what time control and what hardware +1 means expected result of 0.75(I guess it is not exactly about probability to win because it may be something like 50.1% for a win 49.8% for a draw and 0.1% for a loss).
+1.00 is equal to a 50% win chance at move 32, fitted to fishtest LTC data (60s+0.6s @ 1.328m nps)
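
One way to picture a fit with this property is a logistic curve over the engine's internal score, where a parameter `a` marks the 50% point and the displayed eval is the internal score divided by `a`. This is a sketch under that assumption: the values of `A` and `B` below are placeholders, not Stockfish's fitted coefficients, and the "at move 32" qualifier suggests the real parameters vary with move number, which is left out here.

```python
import math

A = 350.0  # hypothetical internal score at which the win rate is 50%
B = 70.0   # hypothetical spread of the curve (placeholder value)

def win_rate(internal_score, a=A, b=B):
    """Logistic win probability as a function of the internal score."""
    return 1.0 / (1.0 + math.exp((a - internal_score) / b))

def displayed_eval(internal_score, a=A):
    """Scale the internal score so that the 50% point is shown as 1.00."""
    return internal_score / a

print(displayed_eval(A), win_rate(A))
```

With this normalization the displayed eval is 1.0 exactly where the modelled win rate is 0.5, whatever values `a` and `b` take.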
CornfedForever
Posts: 650
Joined: Mon Jun 20, 2022 4:08 am
Full name: Brian D. Smith

Re: New Stockfish 15,1 evaluations

Post by CornfedForever »

Plutie wrote: Thu Feb 02, 2023 3:45 pm
Uri Blass wrote: Thu Feb 02, 2023 3:28 pm ...
I do not know based on what time control and what hardware +1 means expected result of 0.75(I guess it is not exactly about probability to win because it may be something like 50.1% for a win 49.8% for a draw and 0.1% for a loss).
+1.00 is equal to a 50% win chance at move 32, fitted to fishtest LTC data (60s+0.6s @ 1.328m nps)
Right, you beat me to that.
I may be wrong (and apologize if so), but I think I remember Larry K saying a similar thing had been done to Dragon, though I can't find the post. He may well have been referring to something else.
CornfedForever
Posts: 650
Joined: Mon Jun 20, 2022 4:08 am
Full name: Brian D. Smith

Re: New Stockfish 15,1 evaluations

Post by CornfedForever »

And it seems they just had an update on that today:

Author: Joost VandeVondele
Date: Thu Feb 2 17:58:05 2023 +0100
Timestamp: 1675357085

Update WLD model

update the WLD model with about 400M positions extracted from recent LTC games after the net updates.
This ensures that the 50% win rate is again at 1.0 eval.

closes https://github.com/official-stockfish/S ... /pull/4373
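
To illustrate why new nets call for a refit: if the internal score at which games are won half the time drifts, the normalization constant has to follow it. The sketch below finds that score by linear interpolation over binned win rates. It is a simplified stand-in, not the fitting procedure used in the commit, and the data points are made up.

```python
def score_at_half_win_rate(points):
    """Internal score where the win rate crosses 50%.

    points: list of (internal_score, win_rate) pairs sorted by score,
    with win rates that do not decrease. Uses linear interpolation
    between the two neighbouring points that bracket 0.5.
    """
    for (x0, w0), (x1, w1) in zip(points, points[1:]):
        if w0 <= 0.5 <= w1 and w1 > w0:
            return x0 + (0.5 - w0) * (x1 - x0) / (w1 - w0)
    raise ValueError("win rate never crosses 50% in the given data")

# Made-up binned win rates from hypothetical games.
points = [(200, 0.10), (300, 0.30), (400, 0.70), (500, 0.90)]
norm = score_at_half_win_rate(points)

def displayed_eval(internal_score):
    """Divide by the refitted constant so the 50% point reads 1.0."""
    return internal_score / norm

print(round(norm, 6), round(displayed_eval(norm), 6))
```

Dividing by the refitted constant puts the 50% win rate back at a displayed eval of 1.0.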
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: New Stockfish 15,1 evaluations

Post by dkappe »

CornfedForever wrote: Thu Feb 02, 2023 6:43 pm I (may be wrong and apologize if so) think I remember (can't find the post) Larry K saying a similar thing had been done to Dragon.
That is correct.
Milton
Posts: 141
Joined: Thu Mar 09, 2006 12:58 am

Re: New Stockfish 15,1 evaluations

Post by Milton »

CornfedForever wrote: Thu Feb 02, 2023 6:43 pm
Plutie wrote: Thu Feb 02, 2023 3:45 pm
Uri Blass wrote: Thu Feb 02, 2023 3:28 pm ...
I do not know based on what time control and what hardware +1 means expected result of 0.75(I guess it is not exactly about probability to win because it may be something like 50.1% for a win 49.8% for a draw and 0.1% for a loss).
+1.00 is equal to a 50% win chance at move 32, fitted to fishtest LTC data (60s+0.6s @ 1.328m nps)
Right, you beat me to that.
I (may be wrong and apologize if so) think I remember (can't find the post) Larry K saying a similar thing had been done to Dragon. He may well have been referring to something else though.
So if an evaluation of "1" means a 50% chance of a win, would this be equivalent to an evaluation of "0" (i.e. no advantage to either side) under the previous scheme?
Plutie
Posts: 20
Joined: Sun Jan 30, 2022 6:14 am
Full name: Evan Engler

Re: New Stockfish 15,1 evaluations

Post by Plutie »

Milton wrote: Fri Feb 03, 2023 5:36 am
CornfedForever wrote: Thu Feb 02, 2023 6:43 pm
Plutie wrote: Thu Feb 02, 2023 3:45 pm
Uri Blass wrote: Thu Feb 02, 2023 3:28 pm ...
I do not know based on what time control and what hardware +1 means expected result of 0.75(I guess it is not exactly about probability to win because it may be something like 50.1% for a win 49.8% for a draw and 0.1% for a loss).
+1.00 is equal to a 50% win chance at move 32, fitted to fishtest LTC data (60s+0.6s @ 1.328m nps)
Right, you beat me to that.
I (may be wrong and apologize if so) think I remember (can't find the post) Larry K saying a similar thing had been done to Dragon. He may well have been referring to something else though.
So if an evaluation of "1" means a 50% chance of a win, would this be equivalent to an evaluation of "0" (i.e. no advantage to either side) under the previous scheme?
No, 0.00 keeps the same meaning of a 100% draw chance. The difference is that an old eval of, let's say, 1.7 is now equal to a current eval of 1.00 (probably not exactly this; I forget the exact scaling right now).
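
As a sketch of that rescaling, assuming it is a plain division by a constant: the 1.7 below is the illustrative figure from this post, not the exact scaling, and the function name is made up.

```python
OLD_EVAL_AT_50_PERCENT = 1.7  # illustrative figure only, not the exact scaling

def old_to_new(old_eval, old_at_50=OLD_EVAL_AT_50_PERCENT):
    """Map an old-style eval to the new scale by dividing by a constant."""
    return old_eval / old_at_50

for old in (0.0, 0.85, 1.7, 3.4):
    print(old, "->", round(old_to_new(old), 2))
```

A division like this leaves 0.00 where it was and keeps the sign and ordering of evaluations; only the size of the numbers changes.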
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: New Stockfish 15,1 evaluations

Post by lkaufman »

CornfedForever wrote: Thu Feb 02, 2023 6:43 pm
Plutie wrote: Thu Feb 02, 2023 3:45 pm
Uri Blass wrote: Thu Feb 02, 2023 3:28 pm ...
I do not know based on what time control and what hardware +1 means expected result of 0.75(I guess it is not exactly about probability to win because it may be something like 50.1% for a win 49.8% for a draw and 0.1% for a loss).
+1.00 is equal to a 50% win chance at move 32, fitted to fishtest LTC data (60s+0.6s @ 1.328m nps)
Right, you beat me to that.
I (may be wrong and apologize if so) think I remember (can't find the post) Larry K saying a similar thing had been done to Dragon. He may well have been referring to something else though.
We scaled Dragon 3.2 by a factor of 0.68, which makes opening evals for positions near the win/draw line about the same as SF 15.1, i.e. about 1. I didn't check middlegame evals (like move 32), but they are probably also close to SF on average for such positions.