On Opening books in 2015

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

jdart
Posts: 4429
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: On Opening books in 2015

Post by jdart »

You misunderstand me. I am not talking here about the learning phase (which I understand). I am talking about selecting moves at playing time, when you may have many bits of information about moves to select but you have to choose one of them.

--Jon
Peter Berger
Posts: 789
Joined: Thu Mar 09, 2006 2:56 pm

Re: On Opening books in 2015

Post by Peter Berger »

Uri Blass wrote:<snipped>
I disagree and if you build opening book correctly the book can be useful also in 3 minutes per move.

I have no idea how people build opening book but the logical way to do it is simply to give program like stockfish or komodo a lot of time in every position that you want to add to the book and add the move to the opening book.
OK - I was a bit sloppy here.
Of course "no book" is an option that can never be optimal by design. E.g. just by letting the engine search for a first white move at a reasonable time control, and for answers to plausible white moves as black, and adding this to your book, you have already improved on it beyond any reasonable doubt for a tournament game, even if any Elo effect would be barely measurable, if at all.
Once you start to try to be clever with this idea and design and especially ++maintain++ a good book things get a lot less clear though.
There are three main issues I see here:
a.) The engine improves over time and now finds a move by itself that is better than your (previous) book move.
b.) A book move generated the way you described might be too deep to follow up on in actual games (how useful is your 1...c5 reply to 1.e4 at depth 50 if you are stuck with a miserable 32-ply search for the next move?).
c.) A book move might be good for one engine but not for another (hence your suggestion to use both Stockfish and Komodo for the generation process).
Anyway - I agree that, on principle, it should always be possible to generate a somewhat useful opening book under given conditions.
The interesting question is whether existing, i.e. real, opening books really improve actual engine performance at long time controls. I tend to think that this is an open question.
LG
Peter
Peter Berger
Posts: 789
Joined: Thu Mar 09, 2006 2:56 pm

Re: On Opening books in 2015

Post by Peter Berger »

[quote="Laskos"]
I once went from game in 1'+1'' on one core to game in 5'+3'' on 4 cores, and although differences between books get a bit smaller, it's due as usual to increasing draw rate. One has to keep in mind that books are not simply successions of positions analyzed by engines, and it doesn't matter how long the positions are analyzed. People use game databases to build books, for example PlayChess engine room game databases. Engines as of now are unable to discern by analysis of an opening position (even very long analysis) what that line leads to, and some statistic of played outcomes has to be used. Also, engine analysis doesn't care what sort of intermediate position is reached, tactical with a lot of possible traps, closed, fortress having weird evals, and so on. Books can be tuned to take into account all that. Monte Carlo search can offset somewhat the advantage of the books, but even where MCTS is applied (Go), books are very useful in the openings, and some Go authors spend large amounts of time building them, either by hand or via automated process.[/quote]
I will keep this succinct as I think you have spent some thinking time on this issue yourself and will be able to follow easily.
Computer opening books and human opening theory both basically represent some search at some depth, IMHO. What that depth really is remains unknown when it comes to human knowledge/theory, but it shouldn't be sneezed at.
I think that 5'+3'' on 4 cores just ain't deep enough to challenge human opening theory. A three-minute search by Stockfish on reasonably fast hardware might be, though.
And while it might be desirable to get fed some cool book move during a game the much more serious problem is to NOT/NEVER get a real feeble one.
Already in 2004/2005 this was a serious issue under WCCC conditions. Then-current commercial books were simply useless, as they contained way too much GM "knowledge" that could simply be outsearched.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: On Opening books in 2015

Post by Zenmastur »

jdart wrote:You misunderstand me. I am not talking here about the learning phase (which I understand). I am talking about selecting moves at playing time, when you may have many bits of information about moves to select but you have to choose one of them.

--Jon
I'll keep it short,

The move selected maximizes this function:

average score for move(i) + sqrt( 2 * log(t) / n(i) )

where n(i) is the number of times move(i) has been played thus far, and t is the number of times this position has been seen thus far.
This is calculated for all moves, and the move with the highest value is played.
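For concreteness, the selection rule quoted above (it is the standard UCB1 rule) might be sketched like this, assuming the book stores a running total score and play count per move; all names here are illustrative:

```python
import math

def ucb1_select(moves):
    """Pick the book move that maximizes the formula above (UCB1).

    `moves` maps move -> (total_score, times_played), with per-game
    scores of 1 (win), 0.5 (draw), 0 (loss).
    """
    t = sum(n for _, n in moves.values())  # times this position was seen
    best_move, best_value = None, float("-inf")
    for move, (total_score, n) in moves.items():
        if n == 0:
            return move  # try any unplayed move before exploring further
        value = total_score / n + math.sqrt(2 * math.log(t) / n)
        if value > best_value:
            best_move, best_value = move, value
    return best_move
```

Note how a rarely played move gets a large exploration bonus: in the example below, c4 has a worse average score than e4 or d4 but has only been played twice, so it is selected.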

Regards,

Zen
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
jefk
Posts: 1085
Joined: Sun Jul 25, 2010 10:07 pm
Location: the Netherlands
Full name: Jef Kaan

Re: On Opening books in 2015

Post by jefk »

[quote="Zenmastur"]
The move selected maximizes this function:

Average score for move(i) + Sqrt(2*log(t)/number of times move(i) has been played thus far)
t is the number of times this position has been seen thus far.
This is calculated for all moves and the move with the highest score is played.
[/quote]

hard to imagine that this would function optimally.
If I look at Crafty's book learning, it changes slowly:
e.g. it still plays the Scandinavian against 1.e4, even after
repeated losses, and only changes moves further
down in the (Scandinavian) tree, but to no avail
(with the French, or even 1...e5, it could do better, I think).

if your function is A(i) + f(t),
then you might apply a scaling factor c, making it e.g.
A(i) + c * f(t),
whereby you could play around a little with the value
of c to see how it performs
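As a sketch (names illustrative), the scaled version of the quoted formula would be:

```python
import math

def ucb_scaled(avg_score, n_move, t, c=1.0):
    """A(i) + c * f(t): the quoted UCB1 exploration term scaled by a
    tuning constant c. c < 1 explores less, c > 1 explores more;
    c = 1 is plain UCB1.
    """
    return avg_score + c * math.sqrt(2 * math.log(t) / n_move)
```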

just a little suggestion

jef
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: On Opening books in 2015

Post by Zenmastur »

jefk wrote:
Zenmastur wrote: The move selected maximizes this function:

Average score for move(i) + Sqrt(2*log(t)/number of times move(i) has been played thus far)
t is the number of times this position has been seen thus far.
This is calculated for all moves and the move with the highest score is played.
hard to imagine if this would function optimally,
if i look at Crafty book learning, it is changing slowly,
eg it still plays Scandinavian against e4, also after
repeated losses, only is changing moves further
down in the (scandinavian) tree, but to no avail
(with French or even ..e5 it could do better i think).

if your function is A(i) + f(t)
then you might apply a scaling option c thus making it eg
A(i) + c x f(t)
whereby you could play around a little with the value
of c to see how it performs

just a little suggestion

jef
Jef,

Thanks for the suggestion, even though it's a little misplaced: this is not my function!

I suggest that you make this suggestion to the people who PROVED that this method is optimal for a wide variety of problems. Or ... you could just do a little more reading on the subject. I would suggest these titles: "Finite-time Analysis of the Multiarmed Bandit Problem" and "UCB Revisited: Improved Regret Bounds for the Stochastic Multi-Armed Bandit Problem".

As far as modifying this function goes, I have thought about it, but first I thought it would be a good idea to test it against some game databases to see how it would fare score-wise and how much variety it would introduce into the games. This would give me a point of comparison for any modified version of the equation.

I'm in the process of manually testing this algorithm using data from one of my human game databases. So far, I haven't found a single position in which this algorithm would have scored as badly as the humans. I'm going to test a few more positions and then try it on a database of computer games. I suspect that the result will be even better on the computer-games database because the book move selection algorithms used by most programs are ABSOLUTELY ABYSMAL in my opinion.

One thing I have noticed is that when this equation is used to select moves, you don't have to worry about your opening book not providing enough variety. There will be so much variety in the games played that you won't need to spend hours working on your opening books just to add some variety to the play. That problem will be a thing of the past!

Regards,

Zen
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
jefk
Posts: 1085
Joined: Sun Jul 25, 2010 10:07 pm
Location: the Netherlands
Full name: Jef Kaan

Re: On Opening books in 2015

Post by jefk »

[quote="Zenmastur"]
I suggest that you make this suggestion to the people that PROVED that this method is optimal for a wide variety of problems. [/quote]

any specific link other than the ones you mentioned for the multi-armed
'bandit problem'?

> "UCB REVISITED: IMPROVED REGRET BOUNDS FOR THE STOCHASTIC MULTI-ARMED BANDIT PROBLEM".

at first sight i thought you might be joking but indeed
the paper exists.
http://personal.unileoben.ac.at/rortner/Pubs/UCBRev.pdf

>As far as me modifying this function, I have thought about it but first I thought it would be a good idea to test it against some game databases [...] I suspect that the result will be even better on the computer-games database because the book move selection algorithms used by most programs are ABSOLUTELY ABYSMAL in my opinion.

agree with that. Although it is a nice idea to
call chess a multi-(4?)-armed 'bandit' problem, you might
also know that e.g. in the ChessBase GUI ('Fritz') there
are two parameters for book learning, the so-called
'learning strength' and another one. From some comp
games I suspect that the 'optimal' (max) settings are
too high, though, for good book-learning performance.
NB the statistics method is not an unreasonable idea
for human play in chess, I guess, but for (more perfect)
computer play, minimax (or using the minimaxed score instead
of the positional eval) in my experience can perform better (possibly
combined with statistics for robustness, although Jon Dart, i.e. the
Arasan author, apparently doesn't like to combine such apples
and pears, so to say; something which I can understand;
being an engineer, not a pure mathematician, I personally
have no problem fiddling with constants, as long as it works)

>One thing I have noticed is that when this equation is used to select moves you don't have to worry about your opening book not providing enough variety. There will be so much variety in the games played you won't need to spend hours working on your opening books

sure, but then you probably need to play games for
thousands of hours just in order to get a reasonable book (?)

jef
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: On Opening books in 2015

Post by Zenmastur »

jefk wrote: any specific link other than the ones you mentioned for the multi-armed
'bandit problem'?

at first sight i thought you might be joking but indeed
the paper exists.
http://personal.unileoben.ac.at/rortner/Pubs/UCBRev.pdf
Here's the link to the other paper I mentioned.

http://homes.di.unimi.it/~cesabian/Pubb ... /ml-02.pdf

This paper may be a little easier to read.
Zenmastur wrote:One thing I have noticed is that when this equation is used to select moves you don't have to worry about your opening book not providing enough variety. There will be so much variety in the games played you won't need to spend hours working on your opening books

jefk wrote:sure but then you probably need to play games for
thousands of hours just in order to get a reasonable book (?)
I agree that, if left unchanged, the equation will want to play a disproportionately large number of games with moves that, from a human perspective, contribute little if anything to our knowledge of the opening or to the engine's game score.

If you started from scratch on a book this could be a huge problem. The algorithm will be adding about one position per game to the book. The only reason to start from scratch that I can think of would be for academic purposes. The Stockfish team or Bob Hyatt have enough computer resources available that they could easily build a book like this in a relatively short period of time. For the rest of us, this would be too time consuming.

An easy way around this is to start with a full book. This can be done by processing a large number of high-quality games from a database. Insert every position that occurs more than once in these games into the book, along with its "average" score from its win-loss-draw record. An adjustment would have to be made to the actual number of games recorded in each of the win-loss-draw slots in the record. This needs to be done so that these numbers match what the move-selection equation is expecting. Otherwise, the move-selection algorithm will try to balance these numbers by playing a large number of games that most humans would consider useless and a waste of time. This is a minor issue.
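The seeding step described above might be sketched like this, under an assumed input format (a list of (positions, result) pairs; all names are hypothetical):

```python
from collections import defaultdict

def seed_book(games):
    """Seed a book from database games: every position that occurs more
    than once goes in with its averaged score and game count.

    `games` is a list of (positions, result) pairs, with result 1, 0.5,
    or 0 from the book side's viewpoint.
    """
    stats = defaultdict(lambda: [0.0, 0])  # position -> [total score, games]
    for positions, result in games:
        for pos in positions:
            stats[pos][0] += result
            stats[pos][1] += 1
    # keep only positions seen more than once; store (average score, count)
    return {pos: (s / n, n) for pos, (s, n) in stats.items() if n > 1}
```

The (average score, count) pairs stored here are exactly the per-move statistics the selection formula consumes; the count adjustment mentioned above would be applied on top of these raw numbers.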

The other thing that can be done is to "tune" the equation. There are other equations that can be used, but I haven't had time to try them out yet and none "seem" at first glance to have the qualities needed for this task.

There is one obvious constant in the equation: the "2" inside the radical can be made into a free variable and used to tune the equation. While this helped, it wasn't enough to solve the problem. There are two other implied constants in the equation that could be of some help. Taking the square root is the same as exponentiation by the constant 0.5, and when taking the logarithm of some variable x, the base of the logarithm is an implied constant. Both of these constants can be used as bounded free variables for tuning purposes. I tried this and found the number of exploration games played was still too high for reasonable values of the variables.
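A sketch of this tunable version, with the three constants exposed as parameters (k for the 2 inside the radical, p for the implicit 0.5 exponent, base for the logarithm base; the parameter names are illustrative, and the defaults reproduce the quoted formula):

```python
import math

def ucb_tuned(avg_score, n_move, t, k=2.0, p=0.5, base=None):
    """Exploration term with its three constants exposed for tuning:
    k replaces the 2 inside the radical, p the implicit 0.5 exponent
    of the square root, and base the logarithm base (None = natural
    log). Defaults reproduce the original formula.
    """
    log_t = math.log(t) if base is None else math.log(t, base)
    return avg_score + (k * log_t / n_move) ** p
```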

At this point I took a hard look at the equation and at the lists of move counts it would generate from various positions if we allowed it to play 10,000 games. It became obvious why the equation wasn't responding well to tuning. It's composed of two terms. The first term is the average reward for move "i". This term, in effect, controls how often a game will be played to exploit what has been learned about the position so far. These are referred to as exploitation games and are used to increase the average game score. The second term controls how often the equation will play an exploration game. The problem is that all of the free variables are in the second term, and because of its form and the limited precision of the math functions, the second term can't be reduced enough to have the desired effect.

The easiest way to solve this would be to multiply the first term of the equation by a constant. But when I was looking at the equation, I couldn't help noticing that the first term is linear, and multiplying it by a constant will keep it that way. I got to thinking that it would be nice if we could encourage the equation to play the better moves disproportionately more often than the bad moves. This would have the effect of increasing the mean reward for the position. So the problem is how to produce a non-linear increase in the first term that favors the better moves. Exponentiation seems to fit the bill.

When I tried this it became quite easy to adjust the number of exploitation games relative to the number of exploration games.
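One way to sketch the exponentiation modification (alpha is a hypothetical tuning exponent; alpha = 1 recovers the original linear first term):

```python
import math

def ucb_boosted(avg_score, n_move, t, alpha=2.0):
    """Non-linear exploitation term: the average score (in [0, 1]) is
    raised to a power alpha so that the gap between higher- and
    lower-scoring moves widens, favouring the better moves
    disproportionately. alpha = 1 gives the original linear term.
    """
    return avg_score ** alpha + math.sqrt(2 * math.log(t) / n_move)
```

For instance, with alpha = 2 the exploitation gap between a 0.8-scoring move and a 0.4-scoring move grows from 0.4 to 0.48, so the better move wins the comparison more often.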

The only "real" problem that remains is striking the proper balance between the exploitation and exploration games. I think the only effective way to do this is real world testing and measuring.

Regards,

Zen
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: On Opening books in 2015

Post by lucasart »

Peter Berger wrote: Simplified claim: Opening books are pretty useless these days. E.g. Stockfish don't need no book at slow time controls.
I almost agree with you. However, to measure the book effect per se, one needs to separate the measurement from the time bias.

For example, if you play 8 book moves, your time is base+8*inc when you start the game at move 8. Instead without book, you start from move 1, and after 8 moves your time on the clock is significantly reduced.
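The arithmetic behind that claim, as a small sketch (the numbers are illustrative):

```python
def clock_at_move(base, inc, moves_played, think_per_move):
    """Clock time after `moves_played` moves, where each move adds
    `inc` seconds of increment and costs `think_per_move` seconds of
    thinking (0 for instant book moves)."""
    return base + moves_played * (inc - think_per_move)
```

With base = 300 s and inc = 3 s, eight instant book moves leave 324 s on the clock, while thinking 10 s per move from move 1 leaves only 244 s at the same point.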

My claim: most of the book value is simply the time bias.

To verify it, you could test using a fixed time per move.

Generally, the larger the book, the crappier it is. A book is crappy when it contains crappy moves, even if they are < 1% of it. It only takes one crappy move to lose a game... And with most books being built from GM databases, they are mostly crappy...

Some books are built properly, with every single position verified by search. But most of them are just a dump of PGN crap from GM games...

For me the only utility of a book is to ensure non determinism (not play the same games all the time). But even for that it's useless. A simple opening selection (EPD file) is enough.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: On Opening books in 2015

Post by mvk »

Agree, a search is necessary, and the main advantage is time. Secondary is to avoid bad moves just beyond your search horizon. But there is only so much you can search in advance. It should be more than you can search during the game (modulo the time advantage). Given the increase in computer power, this gives an upper bound on the useful size of a book. Too big and it will be shallow near the leaves, and you make the mistakes there. Or too big and the verifications are deep, but by the time your tree is ready the tournament machine is a lot faster thanks to semiconductor manufacturing improvements.
[Account deleted]