The MAB index I've been using is very slow to change. The effect when the statistics for a node changes abruptly by being negamaxed from a node deeper in the tree is quite dramatic none the less. In almost all cases the "apparent" number of games played from the node that received the new stats will drop. The MAB algorithm tries to keep the indexes for all descendant nodes roughly equal. A sudden drop in the number of games played from one node due to a negamax of the statistics will cause its index to rise sharply. An increase in the average game score will also increase this nodes index. The combination of both conditions occurring simultaneously is dramatic. This changes the behavior of the book. It gives it a sense of direction or purpose if you prefer.mvk wrote:...I recall that I did all kinds of statistics negamaxing in the previous version of my program, and I have written something about it at the time. The fundamental concern I have with it is that the value of statistics only increases with the sqrt of the number of underlying games, and games are expensive.
The behavior described above can be enhanced if desired by altering the index equation.
The index equation I have been playing with is I=r(i)+sqrt(2*Ln(N)/n(i)). I is the calculated index, r(i) is the average reward or game score for move i, N is the total number of times the position has been seen, and n(i) is the number of times move i has been played.
If, for example, you wanted to make the behavior non-linear with respect to average game score you could change the equation like this: I=r(i)^3+sqrt(2*Ln(N)/n(i)). I have done a lot of experimenting with derivatives of this equation. A more general form of the equation is: I=a*r(i)^b+(Log base c(N)/n(i))^d+f. Where a,b,c,d, and f are constants. "a' can be used to adjust the relative weight given to exploitation vice exploration. Making "a" larger makes the algorithm greedy. "b" adjusts how non-linear you want the rewards to be. I set "c" equal to the number of legal moves in the position, but other values can be used. e=2.71828 is too small a base to use in my opinion but I still have a lot of testing to do. "d" was 1/2 or the square root in the original equation this seems to high IMO. "f" can be used to bias the results by a constant value which can be useful during tests. None of this is written in stone since I'm still experimenting at this stage.
This is precisely why blindly following book moves should be avoided, why third party statistics should be heavily discounted so they only have influence early in the learning process, and why they should be stored separately from engine results so they don't contaminate them.mvk wrote:...Therefore I consider searches more valuable near the tips. And closer to the root negamax does the job. Looking at statistics of arbitrary games feels like a very slow Monte Carlo experiment with an embedded data quality problem, especially from human games.
I thought that's what I was proposing!mvk wrote:...It might be better to generate many fast games instead of relying on external PGNs for that.
The problem is getting the position into the book FIRST so they can collect the needed data. Its much faster to do this with existing PGN's than it is through game play. Even if some of the positions will never be played by an engine it's better to get all the position you can from PGN's during creation. This solves the problem of data being missed during game play because the positions needed to save the data haven't been added to the book yet. i.e. it save a disproportionate amount of time compared to the small amount of disk space wasted.
It makes little difference if there are position in the book that won't ever be played. If they are never played they will also never be searched, and no game statistics will ever be gathered. Thus they will require no additional resources other than the disk space they occupy. They will just sit in the book unused. I don't see this as a much of a problem and it's considerably better than wasting a bunch of time getting the positions into the book through game play IMO.
There are other ways to get around this problem but they require a lot of extra time be spent on it. Disk space is cheap, and my time is very expensive, so I think this is a good trade-off.
Regards,
Zen