Creating Books from .PGN files

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Dave_N
Posts: 153
Joined: Fri Sep 30, 2011 7:48 am

Re: Creating Books from .PGN files

Post by Dave_N »

What about engine scoring for the book moves too? I am not sure how to handle the quirks of alpha-beta searching (with hash tables etc.): when, for example, a program runs for 30 seconds on each move up to ply 12, the score will need to change lower down in a lot of cases, and most users can't tell whether the improving score for a move that originally scored low is the result of a genuinely better position or of a fail low/high.
Dave_N
Posts: 153
Joined: Fri Sep 30, 2011 7:48 am

Re: Creating Books from .PGN files

Post by Dave_N »

It might be worth sacrificing some book-compile speed for flexibility, i.e. Lua hooks and flexible field structures; for example, someone might want to add a score for an element, e.g. dynamism or "hotness" ...
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Creating Books from .PGN files

Post by Don »

I think mini-max with scores or winning percentages is pretty unreliable. A lot of research in Monte Carlo Go indicates that you should play the move that scores best, period (without the mini-max), but that assumes the samples are played with consideration for their winning percentage, which might not be the case with human games.

But you can see that explicit mini-max is wrong just by mini-maxing to the point where only 1 game is played: you will pass to the root a score of -1, 0 or 1 (assuming you rescale the winning percentage from 0-1 to -1, +1).
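A few lines of toy code (a hypothetical illustration, not from the thread) show the effect: minimax over single-game leaves can only ever propagate -1, 0 or +1, so one lucky game decides the whole line.

```python
def minimax_stats(node, maximizing=True):
    """node is ('leaf', r) with r in {-1, 0, +1} from a single game,
    or ('node', [children]). Pure minimax over raw single-game results."""
    kind, payload = node
    if kind == 'leaf':
        return payload
    values = [minimax_stats(child, not maximizing) for child in payload]
    return max(values) if maximizing else min(values)

# Nine drawn games and one lucky win: minimax reports a 'won' line anyway
tree = ('node', [('leaf', 0)] * 9 + [('leaf', 1)])
print(minimax_stats(tree))  # 1
```

Averaging the leaves would have said this node is nearly a draw; minimax throws that information away.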

So a big problem is that when you get to the point where the samples are low, the statistics are just all noise. You don't want to mini-max that.

But then you face the problem where the winning statistics look great, but they are based on the opponent responding with a popular move that is actually not very good. I have seen that on-line, where a move gets fairly high winning statistics but a rarely tried reply gets the best result for the opponent. It's not clear whether that is because the move was odd enough to throw off the opponent or whether it was actually good.

If you want to mini-max you could require a minimum sample for end nodes, but I'm quite sure that will not be as good as just playing the most successful move (unless you can do some sort of statistical analysis to determine when the error margin justifies it.)

I think in chess humans tend very strongly to play the best move, and even though there is no explicit mini-max you get the same effect, just like in Monte Carlo Tree Search with Go.



hgm wrote:
yoshiharu wrote:What I am really after, though, is a sort of minimax on the opening book, because I am sick and tired of seeing my engine entering some continuation whose stats look great before a critical move, but after that the opponent has a move that *inverts* the stats refuting the whole plan...
Only problem is that I don't know what evaluation to use in this restricted minimax, using the usual stats seems too naive...
This is one of the things that is still on my (long) to-do list: put a good book-building algorithm into Polyglot. I think I designed a good algorithm for this (indeed using minimax-like back-propagation of the move stats). Pure minimax would only be valid if you prepare a book for 'best-move-only' play. If you want to switch on variety, you will also have to play non-best moves with a finite probability. E.g. you could reduce the probability of a move being played by a factor 0.9 for every percent it scores less than the best move. But that means that the score of a position that has a move scoring 53% and two moves scoring 50% is not 53% (as it would be in minimax) but (1×53% + 0.729×50% + 0.729×50%)/(1 + 0.729 + 0.729). This is worse than the stats of the best move, because the others drag it down, but better than the raw stats of the position, because the inferior moves have lower weight.
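The weighting scheme in the quote can be sketched in a few lines (a minimal illustration; only the 0.9-per-percentage-point decay comes from the post, everything else is assumed):

```python
def move_weights(scores, decay=0.9):
    """Weight each move by decay ** (percentage points below the best move)."""
    best = max(scores)
    return [decay ** (best - s) for s in scores]

def node_score(scores, decay=0.9):
    """Back-propagated node score: the weight-averaged score of its moves."""
    weights = move_weights(scores, decay)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# The example above: one move scoring 53%, two moves scoring 50%
print(node_score([53.0, 50.0, 50.0]))  # a bit over 51%
```

The same weights can double as play probabilities when the book is probed, which is what ties the variety setting to the back-propagated score.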
User avatar
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Creating Books from .PGN files

Post by hgm »

Dave_N wrote:shouldn't there be a probability 0 for an engine to play a move that scores 75% wins for the opponent?
Not really. The point is to force him to prepare. The 75% stats are probably from players that did some amount of preparation, and if (as the strong side) you did not prepare at all you would score a lot less. So the question is how much the preparation is worth, in terms of percentage. By forcing him to prepare, you divert his efforts from preparing against your better moves, making you score better with those, so that it can still improve your average.

Compare it to the gambling game where two players each lay down a coin (covered to the opponent, but visible to themselves). When the faces are unequal, player A pays $2 to B. When they are both heads, B pays $3 to A, and when both are tails, B pays $1 to A. So heads on average is more profitable for A. But if he would always play heads, B would catch on pretty quickly and start to always play tails himself, so that A now always loses because the faces are unequal.
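This coin game can be checked numerically (a hypothetical sketch, not from the post): a small grid search finds the mixing probability that maximizes A's worst case.

```python
def payoff_A(p, q):
    """Expected gain for A when A plays heads with probability p, B with q.
    Unequal faces: A pays $2; both heads: A wins $3; both tails: A wins $1."""
    return 3 * p * q - 2 * p * (1 - q) - 2 * (1 - p) * q + (1 - p) * (1 - q)

def maximin_p(steps=1000):
    """A's safest mixing probability: maximize the worst case over B's
    replies (B's best reply is always pure, so checking q=0 and q=1 suffices)."""
    best_p, best_v = 0.0, float("-inf")
    for i in range(steps + 1):
        p = i / steps
        worst = min(payoff_A(p, 0.0), payoff_A(p, 1.0))
        if worst > best_v:
            best_p, best_v = p, worst
    return best_p, best_v

print(maximin_p())  # (0.375, -0.125)
```

So A's best strategy is to play heads only 3/8 of the time, and even then he loses $1/8 per round on average; the point stands that a predictable A does far worse still.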
yoshiharu
Posts: 56
Joined: Sat Nov 11, 2006 11:14 pm

Re: Creating Books from .PGN files

Post by yoshiharu »

Don wrote: So a big problem is that when you get to the point where the samples are low, the statistics are just all noise. You don't want to mini-max that.
Indeed. My plan goes further, actually: I want to make the engine gradually kick in as the stats get thinner, so as to avoid the engine being steered by the book into some position it doesn't understand, right up until the book leaves it all alone. So I think a sort of "weighted sum" of stats and search is required, with weights depending on the stats (fewer games in the book means that the moves in the book are less reliable).
The only problem is that I don't have a clue about how to build such a weighted score...
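One simple way to build such a blend (purely a sketch; the shrinkage constant k is invented here, not something from the thread) is to let the book's weight scale with its sample size:

```python
def blended_score(book_score, engine_score, book_games, k=50.0):
    """Blend book statistics with the engine's own evaluation.
    The book's weight n/(n+k) fades smoothly to 0 as the sample shrinks;
    k (hypothetical) is the sample size at which both get an equal say."""
    w = book_games / (book_games + k)
    return w * book_score + (1 - w) * engine_score

print(blended_score(0.60, 0.50, 0))     # 0.5: no games, trust the engine
print(blended_score(0.60, 0.50, 5000))  # big sample: mostly the book stats
```

With this form the engine takes over continuously rather than at a hard cutoff, which is exactly the "gradually kick in" behaviour described above.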

Cheers, Mauro
User avatar
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Creating Books from .PGN files

Post by hgm »

Don wrote:But you can see that explicit mini-max is wrong just by mini-maxing to the point where only 1 game is played. You will pass to the root a score of -1, 0 or 1 (assuming you adjust the winning percent 0-1 to -1, +1.)
You should of course not use minimax in a node with just a handful of games. The rational algorithm is to use the playing frequency as prior info on the relative move quality (possibly weighted with Elo-dependent weights), and use the individual move-result statistics as a Bayesian estimate for the absolute move quality. The weight of the prior compared to the stats is one adjustable parameter of the method; the translation of scoring percentage to ideal playing frequency is another.

Normally, the playing frequency would have a much larger weight than the scores. E.g. with 3 plays and 3 different moves, two winning and one losing, the fact that the moves were each played once is much more significant than that one happened to win and another happened to lose, so the prior tells you they are equal (and thus each scoring 2 out of 3, for a Bayesian score estimate of 60%). This prior is then shifted by the actual move stats to, say, 61% vs 58%. The position percentage is then weighted with the implied playing frequency, which is also used to derive the statistical error (e.g. expressed as a virtual number of plays).

This way you would have a continuous transition from frequency-dominated weighting (where statistics are poor) to score-dominated weighting. So when you get to a position (closer to the root) where move A scored 60 out of 100 and move B 400 out of 1000, you start to trust the now very significant scoring stats, rather than counting it as a node that scores 460 out of 1100. The playing frequency might suggest move B is (say) 300 Elo better, with some weight derived from player info (e.g. expressed as 20 plays), but the actual number of plays is much larger even for move A, and suggests very convincingly that B is the weaker move. So the prior is heavily outvoted. Because this will eventually lead you to play A from the book 95% of the time (say) and B only 5%, the score of the node will be about 59%, but the number of equivalent plays will be only slightly over 110 (and not 1100) because of this.
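One minimal instantiation of this idea (a sketch; only the 20-virtual-plays prior weight and the 60/100 vs 400/1000 figures come from the post, the exact mapping is assumed) treats the frequency prior as virtual games played at the node's average score:

```python
def bayes_move_score(wins, plays, prior_mean, prior_weight=20.0):
    """Bayesian score estimate for one book move: the frequency prior acts
    as prior_weight virtual games at prior_mean, shifted by the real results."""
    return (prior_weight * prior_mean + wins) / (prior_weight + plays)

# The example above: A scored 60/100, B scored 400/1000; node average 460/1100
node_avg = 460 / 1100
a = bayes_move_score(60, 100, node_avg)    # about 0.57, close to its raw 60%
b = bayes_move_score(400, 1000, node_avg)  # about 0.40: the prior is outvoted
print(a, b)
```

With few plays the estimate hugs the prior; with many plays, as here, the raw percentage dominates, which is the continuous transition described above.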
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Creating Books from .PGN files

Post by Don »

hgm wrote:
Don wrote:But you can see that explicit mini-max is wrong just by mini-maxing to the point where only 1 game is played. You will pass to the root a score of -1, 0 or 1 (assuming you adjust the winning percent 0-1 to -1, +1.)
You should of course not use minimax in a node with just a handful of games. The rational algorithm is to use the playing frequency as prior info on the relative move quality (possibly weighted with Elo-dependent weights), and use the individual move-result statistics as a Bayesian estimate for the absolute move quality. The weight of the prior compared to the stats is one adjustable parameter of the method; the translation of scoring percentage to ideal playing frequency is another.
Yes, this makes sense. When you say Bayesian, I assume you mean that priors from previous levels in the "tree" also affect these statistics, if I understand you correctly. So in your next example with 2 wins and a loss, the prior is not 66% but a kind of sum of previous priors too, right?

I did an interesting experiment with computer Go on small boards, I think in this case it was 7x7, where Monte Carlo tree search based programs are very strong. They play 7x7 and 9x9 extremely well, at the level of the stronger players. Taking statistics from tens of thousands of games, I wanted to derive the best way to play the first few moves.

I discovered that very early on, perhaps after just 3 or 4 plays, the move percentage for a particular move made it look best, but if you mini-maxed the responses, it looked like the best reply was not played nearly as often. This was based on computer vs computer play on CGOS. It was a particularly noteworthy deviation.

It could be that the "good" response was particularly profound or difficult to find. I think in the game-theoretic sense both moves were equally "correct", but one must have been more difficult to play than the other. But this threw the value of the original move into question: was IT really the best? It was only seen as best because the players were not playing the most effective response as often. Even though the refutation was not played as often, there were still a lot of samples, and it did not look like simple statistical noise.

I guess the bottom line here is that no method is really going to work that well unless it is strictly controlled. If humans don't like the best move, or it's out of fashion, it won't get represented like it should. Same with computer programs. So I think it should be structured similarly to Monte Carlo Tree Search, where the games and the frequency of play for any given move are controlled by a higher-level function that adjusts this frequency as more statistical evidence comes in.


Normally, the playing frequency would have a much larger weight than the scores. E.g. with 3 plays and 3 different moves, two winning and one losing, the fact that the moves were each played once is much more significant than that one happened to win and another happened to lose, so the prior tells you they are equal (and thus each scoring 2 out of 3, for a Bayesian score estimate of 60%). This prior is then shifted by the actual move stats to, say, 61% vs 58%. The position percentage is then weighted with the implied playing frequency, which is also used to derive the statistical error (e.g. expressed as a virtual number of plays).

This way you would have a continuous transition from frequency-dominated weighting (where statistics are poor) to score-dominated weighting. So when you get to a position (closer to the root) where move A scored 60 out of 100 and move B 400 out of 1000, you start to trust the now very significant scoring stats, rather than counting it as a node that scores 460 out of 1100. The playing frequency might suggest move B is (say) 300 Elo better, with some weight derived from player info (e.g. expressed as 20 plays), but the actual number of plays is much larger even for move A, and suggests very convincingly that B is the weaker move. So the prior is heavily outvoted. Because this will eventually lead you to play A from the book 95% of the time (say) and B only 5%, the score of the node will be about 59%, but the number of equivalent plays will be only slightly over 110 (and not 1100) because of this.
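The "higher-level function that adjusts this frequency" suggested above is essentially what UCB-style selection does in Monte Carlo Tree Search; a minimal sketch (standard UCB1, hypothetical code, not from the thread):

```python
import math

def ucb1_pick(stats, c=1.4):
    """stats: list of (wins, plays) per move. Return the index of the move
    a UCB1 controller would sample next: best mean score plus an
    exploration bonus that grows for under-sampled moves."""
    total = sum(plays for _, plays in stats)

    def ucb(wins, plays):
        if plays == 0:
            return float("inf")  # never-tried moves get sampled first
        return wins / plays + c * math.sqrt(math.log(total) / plays)

    return max(range(len(stats)), key=lambda i: ucb(*stats[i]))

print(ucb1_pick([(5, 10), (0, 0)]))   # 1: the unplayed move is forced in
print(ucb1_pick([(9, 10), (1, 10)]))  # 0: equal samples, better score wins
```

Driving the game-generation (or book-probing) frequency from such a rule would correct exactly the fashion bias described above: an under-played refutation keeps getting sampled until its statistics settle.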
Harald
Posts: 318
Joined: Thu Mar 09, 2006 1:07 am

Re: Creating Books from .PGN files

Post by Harald »

Perhaps this post has some examples for parts of this thread:
http://www.talkchess.com/forum/viewtopi ... ght=python
Dave_N
Posts: 153
Joined: Fri Sep 30, 2011 7:48 am

Re: Creating Books from .PGN files

Post by Dave_N »

Oh right, thanks for the link. I have something similar for ECO codes, however I'm not sure about transpositions :/
Dave_N
Posts: 153
Joined: Fri Sep 30, 2011 7:48 am

Re: Creating Books from .PGN files

Post by Dave_N »

Harald wrote:Perhaps this post has some examples for parts of this thread:
http://www.talkchess.com/forum/viewtopi ... ght=python
In fact I have memory issues for files > 4 MB; I have to rethink my loading strategy, because I created a vector of strings for each game, so either I have a memory leak or 4 MB will inflate to >40 MB once the vectors have been created... I don't parse the games until they have been selected, and after the merge the game that was inserted into the root game is deleted.
I also found a bug with vector<Object*>::clear(), where the memory that was allocated does not disappear (according to Task Manager).