Opening book from a statistical point of view

stegemma · Post by **stegemma** » Fri Jul 29, 2016 8:49 pm

I've implemented a simple opening book in Satana. I miss some math here, from a statistical point of view, how can we compute the probability of win, giving a different score for any book entry? In my book, I store any position as an entry with the white/black score. Since some position are played often and some rarely, just computing a % like this one isn't always good:

Code: Select all

perc = white_score / (white_score + black_score)

If I compare two positions, for sample, one played 100000 times with a white % of win of 53% with another played only 10 times with a % of win for white of 70%, just comparing % would let the engine prefers the rare continuation.

I think that I should compute a more complex % but I don't know why (my math is poor, in statistical).

For sample, this is my engine output for the starting position:

Code: Select all

hash white_score black_score %
# 26B24248B0B213A7 Ng1-h3 3 1 75.00%
# A6F5BE14996CE34B Ng1-f3 141781 105371 57.30%
# A9CD43D265BB27C1 Nb1-c3 8670 8386 50.80%
# A2D57A5C5D03A1B4 Nb1-a3 ???
# A65545D9385B5A3B Ph2-h3 2 0 100.00%
# A9D5495D9934AFB0 Pg2-g3 4782 3596 57.00%
# A6CABC12D080A7CD Pf2-f3 2 0 100.00%
# A6CB76195BE75793 Pe2-e3 33 59 35.80%
# B6D55CBAA963D796 Pd2-d3 40 32 55.50%
# B6CD5B84A78F4849 Pc2-c3 32 32 50.00%
# 46D5AEC46E7F602A Pb2-b3 11438 11100 50.70%
# E6B53D23459CEBA5 Pa2-a3 ???
# 9FCA61F1E64C0CF0 Ph2-h4 ???
# 804C915C8920E580 Pg2-g4 ???
# 9DD7B0A819627060 Pf2-f4 13996 15814 46.90%
# 98A8776C1A9B3E65 Pe2-e4 1271383 1108327 53.40%
# 80DCEC6D58853A63 Pd2-d4 686201 550747 55.40%
# 9DE3F5BD01E05A54 Pc2-c4 198730 162554 55.00%
# 9B31EA80413BAD98 Pb2-b4 3965 4551 46.50%
# 9EEDB481C8A96A61 Pa2-a4 ???

The best entry would be:

Code: Select all

# A65545D9385B5A3B Ph2-h3 2 0 100.00%

but, of course, that's not good!

I know that I can filter entry with a low bound (I already do it, in fact) but I would like to know a more correct way.

Dann Corbit · Post by **Dann Corbit** » Fri Jul 29, 2016 9:07 pm

stegemma wrote:I've implemented a simple opening book in Satana. I miss some math here, from a statistical point of view, how can we compute the probability of win, giving a different score for any book entry? In my book, I store any position as an entry with the white/black score. Since some position are played often and some rarely, just computing a % like this one isn't always good:
Code: Select all
perc = white_score / (white_score + black_score)
If I compare two positions, for sample, one played 100000 times with a white % of win of 53% with another played only 10 times with a % of win for white of 70%, just comparing % would let the engine prefers the rare continuation.

I think that I should compute a more complex % but I don't know why (my math is poor, in statistical).

For sample, this is my engine output for the starting position:
Code: Select all
hash white_score black_score %
# 26B24248B0B213A7 Ng1-h3 3 1 75.00%
# A6F5BE14996CE34B Ng1-f3 141781 105371 57.30%
# A9CD43D265BB27C1 Nb1-c3 8670 8386 50.80%
# A2D57A5C5D03A1B4 Nb1-a3 ???
# A65545D9385B5A3B Ph2-h3 2 0 100.00%
# A9D5495D9934AFB0 Pg2-g3 4782 3596 57.00%
# A6CABC12D080A7CD Pf2-f3 2 0 100.00%
# A6CB76195BE75793 Pe2-e3 33 59 35.80%
# B6D55CBAA963D796 Pd2-d3 40 32 55.50%
# B6CD5B84A78F4849 Pc2-c3 32 32 50.00%
# 46D5AEC46E7F602A Pb2-b3 11438 11100 50.70%
# E6B53D23459CEBA5 Pa2-a3 ???
# 9FCA61F1E64C0CF0 Ph2-h4 ???
# 804C915C8920E580 Pg2-g4 ???
# 9DD7B0A819627060 Pf2-f4 13996 15814 46.90%
# 98A8776C1A9B3E65 Pe2-e4 1271383 1108327 53.40%
# 80DCEC6D58853A63 Pd2-d4 686201 550747 55.40%
# 9DE3F5BD01E05A54 Pc2-c4 198730 162554 55.00%
# 9B31EA80413BAD98 Pb2-b4 3965 4551 46.50%
# 9EEDB481C8A96A61 Pa2-a4 ???
The best entry would be:
Code: Select all
# A65545D9385B5A3B Ph2-h3 2 0 100.00%
but, of course, that's not good!

I know that I can filter entry with a low bound (I already do it, in fact) but I would like to know a more correct way.

First, if you have less than 32 games, the statistical significance of wins and losses is so small you should ignore it and rely purely on the computer evaluation. If the computer evaluation is so shallow that at the current time control you can out-think it, then you are officially out of book.

I think the question you are asking is : How to know what the breakpoint is to trust the data?

Clearly, this depends on how good your data is.
For instance from human games, is the data from correspondence chess between world championship candidates? Is it from FICS games between 'beanhead' and 'gizmo' with Elo of 800 and 750?

Is the data from bullet games? Is the data from TCEC games?

If you separate the data you collect into wins/losses and draws by type, then you can gain a lot more from it.

You should also store your own engine's wins/losses/draws from the given position. Even if it is a good position, you may want to avoid it if your engine does not play it well.

So, I think that the best way to form a book is to use as much carefully focused data as possible. Separate the games into their sources and you can statistically calibrate the value of each source. Consider the time control and also the strength of the players/engines.

When you have finely divided data, perform experiments exactly as you would for chess engine evaluation parameters (try different weights and run SPRT tests).

stegemma · Post by **stegemma** » Fri Jul 29, 2016 9:16 pm

I'm starting from games played by M and GM of all the times, extracted from PgnMentor, so the quality of the data should be good. This is just the first step, I would refine the book later.

Supposing to have a "perfect" and well refined book, I simply don't know the correct statistical approach for choosing the best move or just to sort them in a good order. Maybe there are no a predefined formula for this... I'm searching right now the net to learn more about statistic.

Ajedrecista · Post by **Ajedrecista** » Fri Jul 29, 2016 9:48 pm

Hello Stefano:

You may try the lower bound of the Wilson score interval (with or without continuity correction). It takes care of small samples. Wikipedia link:

Binomial proportion confidence interval

Just scroll down until you reach Wilson score interval. 95% confidence interval (z ~ 1.96) should be fine. In your example:

1e+5 games, 53% for white: lower bound (95% confidence) ~ 52.69%.
10 games, 70% for white: lower bound (95% confidence) ~ 39.68%.

(Online calculator).

Regards from Spain.

Ajedrecista.

Ferdy · Post by **Ferdy** » Fri Jul 29, 2016 9:57 pm

stegemma wrote:I've implemented a simple opening book in Satana. I miss some math here, from a statistical point of view, how can we compute the probability of win, giving a different score for any book entry? In my book, I store any position as an entry with the white/black score. Since some position are played often and some rarely, just computing a % like this one isn't always good:
Code: Select all
perc = white_score / (white_score + black_score)
If I compare two positions, for sample, one played 100000 times with a white % of win of 53% with another played only 10 times with a % of win for white of 70%, just comparing % would let the engine prefers the rare continuation.

I think that I should compute a more complex % but I don't know why (my math is poor, in statistical).

For sample, this is my engine output for the starting position:
Code: Select all
hash white_score black_score %
# 26B24248B0B213A7 Ng1-h3 3 1 75.00%
# A6F5BE14996CE34B Ng1-f3 141781 105371 57.30%
# A9CD43D265BB27C1 Nb1-c3 8670 8386 50.80%
# A2D57A5C5D03A1B4 Nb1-a3 ???
# A65545D9385B5A3B Ph2-h3 2 0 100.00%
# A9D5495D9934AFB0 Pg2-g3 4782 3596 57.00%
# A6CABC12D080A7CD Pf2-f3 2 0 100.00%
# A6CB76195BE75793 Pe2-e3 33 59 35.80%
# B6D55CBAA963D796 Pd2-d3 40 32 55.50%
# B6CD5B84A78F4849 Pc2-c3 32 32 50.00%
# 46D5AEC46E7F602A Pb2-b3 11438 11100 50.70%
# E6B53D23459CEBA5 Pa2-a3 ???
# 9FCA61F1E64C0CF0 Ph2-h4 ???
# 804C915C8920E580 Pg2-g4 ???
# 9DD7B0A819627060 Pf2-f4 13996 15814 46.90%
# 98A8776C1A9B3E65 Pe2-e4 1271383 1108327 53.40%
# 80DCEC6D58853A63 Pd2-d4 686201 550747 55.40%
# 9DE3F5BD01E05A54 Pc2-c4 198730 162554 55.00%
# 9B31EA80413BAD98 Pb2-b4 3965 4551 46.50%
# 9EEDB481C8A96A61 Pa2-a4 ???
The best entry would be:
Code: Select all
# A65545D9385B5A3B Ph2-h3 2 0 100.00%
but, of course, that's not good!

I know that I can filter entry with a low bound (I already do it, in fact) but I would like to know a more correct way.

From my mini shogi engine Lima. No worries this is also poor

.

Code: Select all

5   r  b  s  g  k
4   .  .  .  .  o
3   .  .  .  .  .
2   P  .  .  .  .
1   K  G  S  B  R

    a  b  c  d  e

st 10
go
# Probing book file Lima.bok ...
# Searching for position rbsgk/4p/5/P4/KGSBR[-] w
#  No.  Move  Perf  Game   Win  Loss  PlayPct
#   1.  d1c2  0.69    13     9     4  67.54
#   2.  d1e2  0.67     3     2     1  46.00
#   3.  b1b2  0.00     3     0     3  6.00
#   4.  c1b2  0.00     1     0     1  2.00
0 0 0 0 (Book Move d1c2)
move d1c2

Code: Select all

PlayPct = (Y * Win/Game) + (Z * Game/TotalGame)
Y = 60 (win factor)
Z = 40 (game factor)

jdart · Post by **jdart** » Fri Jul 29, 2016 10:39 pm

I use something like:

frequency*winloss/(total frequency)

as the relative weight. But in addition I appy a "selectivity" value. If that value is high, weights of moves besides the best move are decreased in relative score; if the value is low they are boosted. And I also decrease even further the weight of moves with very bad winloss, in an amount based on the selectivity value.

And on top of all this I have a manually tuned book that takes precedence and can override or modify weight decisions made by the code. This is necessary because you can have lines that are both popular and high-scoring but are now refuted, so they should not be highly weighted.

--Jon

stegemma · Post by **stegemma** » Fri Jul 29, 2016 10:57 pm

Thanks, I'm try something based on that hint but simpler. I compute the square root of the sum of the w/bscores and then I exclude the position with a resulting score less than 2%. At end I choose randomly from the remaining entries:

Code: Select all

clsPCollection<clsBookEntryExt> cllBookEntries;
if(int nBookMoves = GetBookEntries(cllBookEntries, true))
{
	double squareSum = 0;

	for(int e = 0; e < cllBookEntries.Count(); e++)
		if(clsBookEntryExt *pEntry = cllBookEntries[e])
			squareSum += sqrt(pEntry->myScore + pEntry->otherScore);

	for(int e = cllBookEntries.Count(); e >= 0; e--)
		if(clsBookEntryExt *pEntry = cllBookEntries[e])
			if(sqrt(pEntry->myScore + pEntry->otherScore) / squareSum < 0.02) cllBookEntries.Remove(e, true);

	if(clsBookEntryExt *pEntry = cllBookEntries[Rand(nBookMoves)])
	{
		FOR_MOVES
		{
			if(GetUserMove(pLoopMove, true) == pEntry->sMove)
			{
				UserSetBestMove(pLoopMove);
				sfout.Push("# book move " + pEntry->sMove);
				return 0;
			}
		}
	}
}

PS: next step would be to choose proportionally to %, of course

Edmund · Post by **Edmund** » Fri Jul 29, 2016 11:09 pm

Half a year ago I wrote some lines on this very topic:
http://talkchess.com/forum/viewtopic.php?t=59374

flok · Post by **flok** » Fri Jul 29, 2016 11:16 pm

What about http://www.evanmiller.org/how-not-to-so ... ating.html

stegemma · Post by **stegemma** » Fri Jul 29, 2016 11:42 pm

As pointed by other, it seems that this is one is the more correct solution. It is a little too complex to apply that solution now; I must study very carefully maybe in the morning, not after a whole day of work.

Opening book from a statistical point of view

Opening book from a statistical point of view

Re: Opening book from a statistical point of view

Re: Opening book from a statistical point of view

Re: Opening book from a statistical point of view.

Re: Opening book from a statistical point of view

Re: Opening book from a statistical point of view

Re: Opening book from a statistical point of view.

Re: Opening book from a statistical point of view

Re: Opening book from a statistical point of view

Re: Opening book from a statistical point of view