Stockfish has included WDL stats in engine output

kinderchocolate
Posts: 428
Joined: Mon Nov 01, 2010 5:55 am
Full name: Ted Wong

Re: Stockfish has included WDL stats in engine output

Post by kinderchocolate » Thu Jul 16, 2020 10:07 pm

> Fishtest has 400cp adjudication, so any game reaching it for a few plies gets marked as a win, though in some instances playing it on would end in a draw.

No. The graph in Fishtest approaches 1 as the x-axis goes to 400. This is not the same as the C++ code.

> You got the (cp, ply) -> wdl function very wrong, because with Stockfish's actual WDL formula the draw probability increases as ply count increases instead of the winning probability increasing as in your graph.

That makes no sense...

Alayan
Posts: 327
Joined: Tue Nov 19, 2019 7:48 pm
Full name: Alayan Feh

Re: Stockfish has included WDL stats in engine output

Post by Alayan » Thu Jul 16, 2020 10:42 pm

You are free to misunderstand.

Pio
Posts: 173
Joined: Sat Feb 25, 2012 9:42 pm
Location: Stockholm

Re: Stockfish has included WDL stats in engine output

Post by Pio » Thu Jul 16, 2020 10:44 pm

Alayan wrote:
Thu Jul 16, 2020 9:59 pm
You got several things wrong.

Fishtest has 400cp adjudication, so any game reaching it for a few plies gets marked as a win, though in some instances playing it on would end in a draw.

Stockfish internal units aren't the same as centipawns. 600 or so would be the value of a knight in internal units, not cp. Of course usually a position down a knight snowballs into much worse quickly.

You got the (cp, ply) -> wdl function very wrong, because with Stockfish's actual WDL formula the draw probability increases as ply count increases instead of the winning probability increasing as in your graph.
I just want to say that (cp, ply) -> wdl does not seem to be the best function you could get, since ply is not something the function should depend on heavily. Very little of a chess position is a function of the moves leading up to it. Only the 50-move draw rule, threefold repetition and maybe en passant and castling (depending on how you look at it) are history-dependent or ply-dependent.

I realise, however, that it is hard to substitute the ply with something better in (cp, ply) -> wdl, because if it were easy we would have great evaluation functions.

I guess it is possible to get a better predictor than (cp, ply). One option is to look at what types of moves occurred in the history prior to the position and what moves are in the PV. My guess is that shuffling moves (which could be identified by many moves made by the same piece and by the number of pawn moves) and material left are better predictors. That is, (cp, shuffling moves(recent_history, PV), material left(recent_history, PV)) -> wdl might be better.

/Pio
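
As a rough illustration of the kind of inputs suggested above, here is a minimal sketch that reads two candidate features from a FEN string. The piece values (1/3/3/5/9) are conventional assumptions, not Stockfish constants, and using the FEN halfmove clock as a stand-in for "shuffling" is my own simplification rather than the proposal itself. The function names are hypothetical.

```python
# Conventional piece values in pawns; an assumption for illustration,
# not values taken from Stockfish.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}


def material_left(fen: str) -> int:
    """Sum the non-king material of both sides from the FEN board field."""
    board = fen.split()[0]
    # Digits, slashes and kings are not in PIECE_VALUES and count as 0.
    return sum(PIECE_VALUES.get(ch.lower(), 0) for ch in board)


def halfmove_clock(fen: str) -> int:
    """Plies since the last capture or pawn move (FEN field 5).

    Used here only as a crude proxy for shuffling.
    """
    return int(fen.split()[4])


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(material_left(START))   # 78
print(halfmove_clock(START))  # 0
```

Neither feature needs the game ply, which is the point being made: both can be read off the position (plus the halfmove clock) alone.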

zullil
Posts: 6345
Joined: Mon Jan 08, 2007 11:31 pm
Location: PA USA
Full name: Louis Zulli

Re: Stockfish has included WDL stats in engine output

Post by zullil » Thu Jul 16, 2020 11:44 pm

kinderchocolate wrote:
Thu Jul 16, 2020 9:44 pm
Thanks. I think probability of winning is a good measure for chess reporting, and is in fact better than "cp":
  • cp is a programming concept, not one for chess analysis
  • cp is heavily implementation dependent
Reporting probability from fitting a sigmoid curve is a nice way to normalize the conflicts. I attach a plot of Stockfish's WDL code.
  • The graph at https://github.com/glinscott/fishtest/w ... n-fishtest saturates around 400, but the SF code saturates around 600. Not sure why the author of the patch reported "The model fits rather accurately the LTC fishtest statistics". The saturation point is critically important in the model, so if I'm not mistaken the patch was very badly programmed.
  • 600 is a little less than a knight in SF
  • At cp==0, the winning chance in the Fishtest link is a little less than 1 (hard to see). The SF code gives 0.076 (vertical line in the plot).
Basically, the code tells us that if we have an advantage somewhere between a pawn and a knight, it's an almost certain win. Being up a pawn gives approximately a 25% winning chance, not including draws.
I believe the attached graph correctly depicts (a continuous approximation to) Stockfish's current model of win rate as a function of game ply, assuming the current evaluation is 0.00.
Attachments
WinRate.png
WinRate.png (16.59 KiB) Viewed 478 times
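
For readers following the argument about sigmoid curves, here is a minimal sketch of a logistic win/draw/loss model of the general shape being discussed. The parameters a and b are hypothetical placeholders, not Stockfish's fitted coefficients; per the discussion in this thread, Stockfish's model additionally makes the curve depend on game ply, which is left out here.

```python
import math


def win_rate(cp: float, a: float, b: float) -> float:
    """Logistic win probability: exactly 0.5 at cp == a, steeper for smaller b."""
    return 1.0 / (1.0 + math.exp((a - cp) / b))


def wdl(cp: float, a: float, b: float) -> tuple:
    """Return (win, draw, loss) probabilities for the side with score cp."""
    win = win_rate(cp, a, b)
    loss = win_rate(-cp, a, b)  # the opponent sees the score negated
    return win, 1.0 - win - loss, loss


# Placeholder parameters, chosen only for illustration.
A, B = 140.0, 64.0
win, draw, loss = wdl(0.0, A, B)
print(round(win, 3), round(draw, 3), round(loss, 3))
```

With a > 0, an evaluation of 0.00 gives equal win and loss probabilities and leaves most of the mass on the draw. In this shape, raising a or lowering b increases the draw share at 0.00.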

syzygy
Posts: 4601
Joined: Tue Feb 28, 2012 10:56 pm

Re: Stockfish has included WDL stats in engine output

Post by syzygy » Fri Jul 17, 2020 1:03 am

kinderchocolate wrote:
Thu Jul 16, 2020 8:04 pm
Probably asked somewhere else, but I can't find it. What's the impact of using ply in the calculation? The problem here is we don't have such information if we start from a non-initial position.
When starting from a complete FEN, we do have that information. A complete FEN includes the move number.
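
A minimal sketch of that derivation: the FEN fullmove number starts at 1 and increments after Black's move, so together with the side to move it fixes the game ply. The function name is hypothetical, and a ply count starting at 0 for the initial position is an assumed convention.

```python
def game_ply_from_fen(fen: str) -> int:
    """Derive the game ply from the side-to-move and fullmove fields of a FEN."""
    fields = fen.split()
    side_to_move = fields[1]   # "w" or "b"
    fullmove = int(fields[5])  # starts at 1, incremented after Black moves
    return max(2 * (fullmove - 1), 0) + (1 if side_to_move == "b" else 0)


print(game_ply_from_fen("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"))     # 0
print(game_ply_from_fen("rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"))  # 1
print(game_ply_from_fen("4k3/8/8/8/8/8/8/4K3 w - - 0 11"))                               # 20
```

A FEN that omits the last two fields would not carry this information, which is presumably the case the question had in mind.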

syzygy
Posts: 4601
Joined: Tue Feb 28, 2012 10:56 pm

Re: Stockfish has included WDL stats in engine output

Post by syzygy » Fri Jul 17, 2020 1:07 am

kinderchocolate wrote:
Thu Jul 16, 2020 9:44 pm
  • 600 is a little less than a knight in SF
If you are talking about cp as reported by SF, then a knight in SF is much less than 600cp.

When SF reports a score, it scales its internal score so that a pawn is about 100cp (as it should).

syzygy
Posts: 4601
Joined: Tue Feb 28, 2012 10:56 pm

Re: Stockfish has included WDL stats in engine output

Post by syzygy » Fri Jul 17, 2020 1:14 am

kinderchocolate wrote:
Thu Jul 16, 2020 9:57 pm
If I were to add it to analysis, I might just drop the ply parameter and hard-code it to 10. It looks like at 10, a knight advantage is about 75% winning. I like it to be 75% winning for a piece up.
Your "I like it to be 75% winning for a piece up" seems to be another good reason to just stick to reporting cp. A cp score is an objective score (for conventional engines like Stockfish) and everybody can subjectively interpret a cp score however they like.

MikeB
Posts: 4195
Joined: Thu Mar 09, 2006 5:34 am
Location: Pen Argyl, Pennsylvania

Re: Stockfish has included WDL stats in engine output

Post by MikeB » Fri Jul 17, 2020 3:12 am

kinderchocolate wrote:
Thu Jul 16, 2020 9:57 pm
If I were to add it to analysis, I might just drop the ply parameter and hard-code it to 10. It looks like at 10, a knight advantage is about 75% winning. I like it to be 75% winning for a piece up.
That might be true in some human games, but with computers it is above 90%.

zullil
Posts: 6345
Joined: Mon Jan 08, 2007 11:31 pm
Location: PA USA
Full name: Louis Zulli

Re: Stockfish has included WDL stats in engine output

Post by zullil » Fri Jul 17, 2020 1:23 pm

kinderchocolate wrote:
Thu Jul 16, 2020 9:44 pm
Thanks. I think probability of winning is a good measure for chess reporting, and is in fact better than "cp":
  • cp is a programming concept, not one for chess analysis
  • cp is heavily implementation dependent
Reporting probability from fitting a sigmoid curve is a nice way to normalize the conflicts. I attach a plot of Stockfish's WDL code.
  • The graph at https://github.com/glinscott/fishtest/w ... n-fishtest saturates around 400, but the SF code saturates around 600. Not sure why the author of the patch reported "The model fits rather accurately the LTC fishtest statistics". The saturation point is critically important in the model, so if I'm not mistaken the patch was very badly programmed.
  • 600 is a little less than a knight in SF
  • At cp==0, the winning chance in the Fishtest link is a little less than 1 (hard to see). The SF code gives 0.076 (vertical line in the plot).
Basically, the code tells us that if we have an advantage somewhere between a pawn and a knight, it's an almost certain win. Being up a pawn gives approximately a 25% winning chance, not including draws.
I think your graphs are using (cp * PawnValueEg) rather than cp as horizontal units. I believe the graph below is a correct rendering of Stockfish's WDL model, assuming a game ply of 10.
Attachments
WinRatePly10.png
WinRatePly10.png (18.04 KiB) Viewed 391 times
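
To make the unit question concrete, here is a minimal sketch of converting between internal score units and reported centipawns, assuming the scaling is a plain ratio against the endgame pawn value so that one pawn comes out as 100cp. The function names are hypothetical, and the value 200.0 passed as pawn_value_eg is a placeholder, not the actual PawnValueEg constant from the Stockfish source.

```python
def internal_to_cp(v: float, pawn_value_eg: float) -> float:
    """Scale an internal score so that one (endgame) pawn equals 100 cp."""
    return 100.0 * v / pawn_value_eg


def cp_to_internal(cp: float, pawn_value_eg: float) -> float:
    """Inverse of internal_to_cp."""
    return cp * pawn_value_eg / 100.0


# Placeholder pawn value, for illustration only.
PAWN_VALUE_EG = 200.0
print(internal_to_cp(600.0, PAWN_VALUE_EG))  # 300.0
```

Under this placeholder, a curve that appears to saturate around 600 on an internal-unit axis would saturate around 300 on a cp axis, so plotting against the wrong unit stretches the horizontal scale by a factor of pawn_value_eg / 100.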
