Usually opening books provide, for each move in a position, simple information such as frequency, average score, draw rate, average Elo and maximum Elo. Comparing moves given only this information is difficult if not impossible. I propose a statistical method to (a) estimate the quality of each move, (b) calculate the precision of the estimate and (c) use the information to compare moves.
I define my move quality score analogously to the Elo formula, as a logit transformation of the winning probability. A score of 0 implies that equal players have equal winning chances in a certain position. Winning probability = (1+exp(-score/400))^(-1). This has the advantage that measures of precision are straightforward to interpret in a second step.
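As a minimal Python sketch of this mapping (the function names are mine, and the score is assumed to be on the 400-point Elo-like scale used in the formula above; note that the formula uses the natural exponent, whereas the standard Elo formula uses base 10):

```python
import math

def win_probability(score: float) -> float:
    """Logistic map from an Elo-scale move score to a winning probability."""
    return 1.0 / (1.0 + math.exp(-score / 400.0))

def score_from_probability(p: float) -> float:
    """Inverse (logit) map; p must lie strictly between 0 and 1."""
    return 400.0 * math.log(p / (1.0 - p))
```

A score of 0 maps to a probability of 0.5, and the two functions are inverses of each other.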
Next I need to postulate a certain model for chess outcomes. I refer to the paper by Shawul and Coulom (2013) and apply the Davidson model. This model predicts – for each score – the expected win, draw and loss ratio.
Next, for a certain move in a position, I consider all instances in my database when it was played. I collect the relative Elo of the players and the game outcome. I estimate my move score through maximum likelihood. It is the parameter that linearly adjusts the relative Elo and thus best explains the realized outcomes.
So I am estimating the $score that maximizes the log-likelihood: SUM[for each game] LN(x)
Where for won games x = 1/(1+EXP(-($delta/400+$score))+ $v * EXP(-($delta/400+$score)/2))
for lost games x = 1/(1+EXP(+($delta/400+$score))+ $v * EXP(+($delta/400+$score)/2))
for drawn games x = 1/(1+EXP(-($delta/400+$score)/2)*(1+EXP($delta/400+$score))/$v)
$delta refers to the elo difference of the players and $v is a constant from the Davidson model related to the draw rate (a possible value for $v could be 1).
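The likelihood above could be coded roughly as follows. This is only a sketch under my own assumptions: $delta is taken as the Elo of the player making the move minus the Elo of the opponent, results are coded from the mover's point of view, and the function names and the "win"/"draw"/"loss" labels are illustrative. The probabilities are written in a symmetric form (numerator and denominator multiplied by EXP(d/2)), which is algebraically the same as the three Davidson expressions.

```python
import math

def davidson_probs(delta, score, v=1.0):
    """Return (win, draw, loss) probabilities for the player making the move.

    delta: Elo of the mover minus Elo of the opponent (assumed sign convention).
    score: move score in natural-log units (times 400 gives an Elo-like value).
    v:     Davidson draw parameter.
    """
    d = delta / 400.0 + score
    up, down = math.exp(d / 2.0), math.exp(-d / 2.0)
    denom = up + down + v
    return up / denom, v / denom, down / denom

def log_likelihood(score, games, v=1.0):
    """Sum of ln(probability of the observed result) over all games.

    games: iterable of (delta, result), result in {"win", "draw", "loss"}.
    """
    total = 0.0
    for delta, result in games:
        win, draw, loss = davidson_probs(delta, score, v)
        total += math.log({"win": win, "draw": draw, "loss": loss}[result])
    return total
```

With $v = 1 and equal players (d = 0), the sketch gives win, draw and loss one third each, which is a quick sanity check on the formulas.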
You might ask yourself how this can be estimated efficiently. Apparently the log-likelihood as a function of $score can be approximated very well by a quadratic function, L = ax^2+bx+c, where x is the $score.
So I propose for a quick approximation the calculation of L for three specific scores (-1, 0 and +1) and then solve the equation:
c=L(0)
b=(L(1)-L(-1))/2
a=L(1)-c-b
we find the maximum by setting the first derivative to 0: score = -b/(2*a)
Applying the Fisher information, we can directly derive the standard deviation around our estimate from the inverse of the second derivative: the variance is -1/(2*a) (a is negative at a maximum), so sd = SQRT(-1/(2*a))
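Here is a self-contained sketch of the three-point quadratic fit, evaluating the log-likelihood at scores of -1, 0 and +1. The game format, function names and sign convention for the Elo difference are my assumptions, and I take the standard deviation as the square root of the negative inverse curvature, which is my reading of the Fisher information step.

```python
import math

def log_likelihood(score, games, v=1.0):
    """Davidson log-likelihood; games is a list of (delta, result) pairs
    with delta = mover's Elo minus opponent's Elo (assumed convention)."""
    total = 0.0
    for delta, result in games:
        d = delta / 400.0 + score
        up, down = math.exp(d / 2.0), math.exp(-d / 2.0)
        numer = {"win": up, "draw": v, "loss": down}[result]
        total += math.log(numer / (up + down + v))
    return total

def quadratic_estimate(games, v=1.0):
    """Fit L(x) = a*x^2 + b*x + c through x = -1, 0, +1 and return (score, sd)."""
    l_minus = log_likelihood(-1.0, games, v)
    l_zero = log_likelihood(0.0, games, v)
    l_plus = log_likelihood(1.0, games, v)
    c = l_zero
    b = (l_plus - l_minus) / 2.0
    a = l_plus - c - b
    if a >= 0.0:
        # Happens for an empty game list: there is no maximum to locate.
        raise ValueError("quadratic fit has no maximum (no games?)")
    score = -b / (2.0 * a)            # vertex of the parabola
    sd = math.sqrt(-1.0 / (2.0 * a))  # from the curvature at the maximum
    return score, sd

# Hypothetical data: six wins, three draws, one loss between equal players.
games = [(0, "win")] * 6 + [(0, "draw")] * 3 + [(0, "loss")]
score, sd = quadratic_estimate(games)
print(400 * score, 400 * sd)  # reported on the Elo-like scale
```

Only three likelihood evaluations are needed per move, which is the point of the approximation; a numerical optimizer could be swapped in where the quadratic fit is too coarse.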
I have explained how to estimate the score of a certain move and the standard deviation of this score. For easier interpretation and given the approximate relationship between centipawns and elo value I suggest reporting score and sd multiplied by 400.
If you then want to compare different moves against each other I propose a one-sided Welch test, which will yield a score similar in principle to LOS (likelihood of superiority).
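A rough sketch of such a comparison is below. One simplification to flag: it uses a normal approximation to the Welch statistic instead of a t distribution with Welch-Satterthwaite degrees of freedom, which assumes each move has enough games for the difference to matter little. The function name is mine.

```python
import math

def superiority(score_a, sd_a, score_b, sd_b):
    """Approximate probability that move A is truly better than move B.

    Uses the Welch statistic z = (score_a - score_b) / sqrt(sd_a^2 + sd_b^2)
    and evaluates the standard normal CDF at z (normal approximation).
    """
    spread = math.sqrt(sd_a ** 2 + sd_b ** 2)
    if spread == 0.0:
        raise ValueError("at least one standard deviation must be positive")
    z = (score_a - score_b) / spread
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Scores and standard deviations can be passed in either raw or multiplied by 400, as long as both moves use the same scale. Two moves with identical scores give 0.5, and values near 1 indicate that the first move is very likely the better one.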
Statistical assessment of chess opening book moves
Moderators: bob, hgm, Harvey Williamson

 Posts: 3900
 Joined: Fri Mar 10, 2006 4:23 am
 Location: http://www.arasanchess.org
Re: Statistical assessment of chess opening book moves
I have seen many examples where the score (at least from a reasonable-length search) does not predict the winning chances for a move. In correspondence play especially it is common to see moves that don't look at all good initially but some moves later show a different eval. These are players that analyze a single position for days. It can take that long to find the optimal line.
And if you take a look at engine matches that use limited-length books (such as CCRL) you will see many, many games where an engine plays into a known inferior opening line, often on the first move they search.
Jon
Re: Statistical assessment of chess opening book moves
Maybe I was imprecise. My "score" has nothing to do with engine evaluations. I am proposing an improvement for presenting move information from a database of played chess games, where currently only very limited summary statistics are generated.
jdart wrote: I have seen many examples where the score (at least from a reasonable-length search) does not predict the winning chances for a move. In correspondence play especially it is common to see moves that don't look at all good initially but some moves later show a different eval. These are players that analyze a single position for days. It can take that long to find the optimal line.
And if you take a look at engine matches that use limited-length books (such as CCRL) you will see many, many games where an engine plays into a known inferior opening line, often on the first move they search.
Jon

Re: Statistical assessment of chess opening book moves
Ok, sorry I didn't pick up that you were using Win/Loss/Draw statistics, although you did say that. But that is just as problematic, if not more so. Typically strong players play a move until it no longer works (has a refutation), then they switch to something else. The losing move might well have a good win/loss ratio, but that is only because it worked for a considerable time. So "goodness" is a function of time.
Jon