A new algorithm accounting for the uncertainty in eval funcs

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

tholden
Posts: 15
Joined: Thu Oct 23, 2014 1:46 pm

A new algorithm accounting for the uncertainty in eval funcs

Post by tholden »

Dear all,

The document below details a new algorithm for solving games such as chess, which I'd love to persuade someone to implement. Even if you've no intention of implementing it, I would still love to hear your comments. I'm not naive enough to think it will beat cutting-edge algorithms, but it will at least play differently, arguably in a more "human-like" manner, and there is considerable value in that.

Before you dismiss me as another nut-job, let me say in my defence that I am a professional academic economist, so I have a good understanding of decision-making under uncertainty. I also used to be a professional computer games programmer at EA, so I have some experience of practical implementation issues. And I have implemented basic alpha-beta in the past, so I'm aware of the "standard" approach to these problems as well.

http://www.tholden.org/wp-content/uploa ... -Chess.pdf

Any comments would be much appreciated,

Best regards,

Tom
PK
Posts: 893
Joined: Mon Jan 15, 2007 11:23 am
Location: Warsza

Re: A new algorithm accounting for the uncertainty in eval f

Post by PK »

It seems that you are thinking along the same lines as the inventors of Monte Carlo Tree Search. This algorithm is currently used in all the major programs playing the Asian game of go (weiqi, baduk). The game is notoriously hard to evaluate, so the programs basically play a bunch of random games, and the aggregate score replaces the evaluation function's output. Then the similarities start: once a move has been tried sufficiently often it becomes a node, and the node's value is calculated by a function that takes into account the scores of its child nodes and how often each move was tried. You will understand the equations much better than I do. In chess, however, such attempts have not worked.
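
Roughly, the selection step looks like this (just a sketch of the standard UCT rule; the Node class and names are illustrative, not taken from any real go program):

Code: Select all

import math

class Node:
    def __init__(self):
        self.children = []   # expanded moves
        self.visits = 0      # how often this node has been tried
        self.wins = 0.0      # aggregate score of the random games played through it

def uct_select(node, c=1.4):
    # Untried children go first; otherwise pick the child maximising average
    # playout score plus an exploration bonus that shrinks as it is tried more.
    def score(child):
        if child.visits == 0:
            return float('inf')
        return (child.wins / child.visits
                + c * math.sqrt(math.log(max(1, node.visits)) / child.visits))
    return max(node.children, key=score)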
User avatar
hgm
Posts: 27789
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: A new algorithm accounting for the uncertainty in eval f

Post by hgm »

Well, as you already remark, as soon as you make the value of a node depend not only on that of the best move, but start to weigh in the existence of moves that are close in score, you have to actually find an exact score for those moves, rather than just an upper bound (as alpha-beta does). That can be very expensive.

I once tried something like this in micro-Max, giving a 10 cP bonus to a node where there is a second move within 10 cP of the first. This means you always have to lower alpha by 10 cP, so as to recognize moves that fail low by little enough that the bonus might still bring them into the window. And that even holds after you raise alpha to the score of the best move, as long as you haven't yet found a second move in that node that is close.

It wasn't competitive at all with the normal search. I did not try judging whether there was a nearby move by means of a strongly reduced search, though. Modern engines do this to determine whether a move is singular. (They then search it deeper, though, rather than penalizing it in score.) And it does not seem to hurt them.
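
Schematically the idea amounts to something like the following (a sketch, not actual micro-Max code; evaluate(), legal_moves() and make_move() stand in for whatever the engine already provides):

Code: Select all

INF    = float('inf')
BONUS  = 10   # centipawn bonus when a second move is close to the best one
MARGIN = 10   # how close the second-best move must be to count

def search(pos, depth, alpha, beta):
    if depth == 0:
        return evaluate(pos)                     # placeholder static eval
    best = second = -INF
    for move in legal_moves(pos):                # placeholder move generator
        # Lower alpha by MARGIN so moves that fail low by only a little
        # still get exact scores instead of mere upper bounds.
        a = max(alpha, best) - MARGIN
        score = -search(make_move(pos, move), depth - 1, -beta, -a)
        if score > best:
            best, second = score, best
        elif score > second:
            second = score
        if best >= beta:
            return best                          # ordinary beta cutoff
    if second >= best - MARGIN:
        best += BONUS                            # reward having a nearby alternative
    return best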

Another thing that worried me is the dependence between moves. Very often two moves have exactly the same score. But that offers only false security against a score drop, because it turns out they derive that score from the same leaf, reached through a transposition of moves. And such dependencies are very hard to detect.
tholden
Posts: 15
Joined: Thu Oct 23, 2014 1:46 pm

Re: A new algorithm accounting for the uncertainty in eval f

Post by tholden »

I am aware of the Monte Carlo tree search approach. What I'm suggesting is somewhere in between that approach and alpha-beta: still using evaluation functions for guidance, rather than playing out to maximum depth, but exploring the tree with a random (though not uniform) search.
tholden
Posts: 15
Joined: Thu Oct 23, 2014 1:46 pm

Re: A new algorithm accounting for the uncertainty in eval f

Post by tholden »

The concern about it turning into a brute-force search of the entire game tree is addressed in the doc. This may be avoided since there is no need to evaluate all nodes to the same depth. The sparsity of the search can easily be tuned by tweaking the exploration probabilities, so either more or less sparsity than alpha-beta is achievable, though it is unlikely to match alpha-beta's efficiency in performing the "right" evaluations. Where it should perform better, though, is in the mapping from evaluation scores to the move chosen.

I'm not sure I see why having multiple sequences of moves leading to the same state need be problematic. At the cost of some loss of efficiency, you can just treat them as different states. Or you can use a hash table for the tree nodes to detect identical states, and then work with a game lattice rather than a game tree.
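
Concretely, something along these lines, assuming the engine already exposes some position_key() (in practice a Zobrist hash) for each state:

Code: Select all

node_table = {}                    # position key -> shared node

class Node:
    def __init__(self, pos):
        self.pos = pos
        self.parents = []          # several move orders may lead here
        self.children = []
        self.probs = None          # e.g. (P(win), P(draw), P(loss))

def get_node(pos):
    key = position_key(pos)        # placeholder; a Zobrist hash in practice
    node = node_table.get(key)
    if node is None:
        node = Node(pos)
        node_table[key] = node     # first time this position is seen
    return node                    # all transpositions map to the same node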
kinderchocolate
Posts: 454
Joined: Mon Nov 01, 2010 6:55 am
Full name: Ted Wong

Re: A new algorithm accounting for the uncertainty in eval f

Post by kinderchocolate »

Your algorithm sounds like a standard lattice algorithm. The maths is easy, but I'm not very sure from the document how you'd derive those probabilities. Say I give you an arbitrary position: how would you use the algorithm to derive the probabilities and make a move?
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: A new algorithm accounting for the uncertainty in eval f

Post by AlvaroBegue »

Eric Baum and Warren Smith have some old papers about handling uncertainty in scores, but I am not optimistic that the whole enterprise makes sense. For the most part, the only job of the evaluation function is to estimate the expected value of the utility of winning, or any quantity that respects that ordering.

In the context of MCTS, when using heuristics to initialize the statistics in a new node in the tree, you may want an estimate of the expected value and an idea of how certain it is, so you can properly combine the heuristic with the evidence you obtain from running playouts. That's the only context in which I think it makes sense to have an estimate of the uncertainty in the evaluation.
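
One common way to do that blend (just a sketch; the constant and the function name are illustrative) is to treat the heuristic as a prior worth a fixed number of virtual playouts, so its weight fades as real evidence accumulates:

Code: Select all

PRIOR_VISITS = 20   # how many playouts the heuristic estimate is "worth"

def node_value(heuristic_estimate, playout_wins, playout_visits):
    # More confidence in the heuristic would be expressed as a larger
    # PRIOR_VISITS; with no playouts yet this returns the heuristic itself.
    total = PRIOR_VISITS + playout_visits
    return (PRIOR_VISITS * heuristic_estimate + playout_wins) / total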
tholden
Posts: 15
Joined: Thu Oct 23, 2014 1:46 pm

Re: A new algorithm accounting for the uncertainty in eval f

Post by tholden »

Not sure what a "standard lattice algorithm" is. Link? The google results for "lattice algorithm chess" did not seem to be at all related.

The algorithm for deriving the probabilities was described in the doc. You explore the tree in a random fashion, with exploration probabilities proportional to the probabilities of weak dominance. When new leaves are evaluated for the first time, you use the evaluation function to get their probabilities. Probabilities are then propagated up the tree using the formula given there.
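
In rough pseudo-Python, one iteration looks something like the sketch below. The bodies of eval_to_probs(), weight() and backup() are crude stand-ins purely to make it concrete; the actual dominance probabilities and the update formula are the ones given in the doc, and evaluate() is whatever static eval the engine already has:

Code: Select all

import math, random

class Node:
    def __init__(self, pos):
        self.pos = pos
        self.children = []   # filled in by a move generator, omitted here
        self.probs = None    # (P(win), P(draw), P(loss)) for the side to move

def eval_to_probs(cp):
    # Stand-in: turn a centipawn score into a (win, draw, loss) triple.
    p = 1.0 / (1.0 + math.exp(-cp / 200.0))    # chance the mover is better
    draw = 0.4 * (1.0 - abs(2.0 * p - 1.0))    # draws most likely near equality
    return (p * (1.0 - draw), draw, (1.0 - p) * (1.0 - draw))

def weight(child):
    # Stand-in for the weak-dominance probability: a child's triple is stored
    # from its own mover's view, so favour children that are bad for that mover.
    if child.probs is None:
        return 1.0               # unvisited children get a default weight
    w, d, l = child.probs
    return l + 0.5 * d

def backup(children):
    # Stand-in for the doc's update formula: take the child that is best for
    # the parent's mover and flip its triple to the parent's point of view.
    w, d, l = min((ch.probs for ch in children if ch.probs is not None),
                  key=lambda t: t[0] + 0.5 * t[1])
    return (l, d, w)

def explore(node):
    # One iteration: walk down at random, evaluate a new leaf, update upwards.
    if not node.children:
        node.probs = eval_to_probs(evaluate(node.pos))
        return
    child = random.choices(node.children,
                           weights=[weight(c) for c in node.children])[0]
    explore(child)
    node.probs = backup(node.children)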

Could you be more specific in your question?
tholden
Posts: 15
Joined: Thu Oct 23, 2014 1:46 pm

Re: A new algorithm accounting for the uncertainty in eval f

Post by tholden »

Why don't you think "the whole enterprise makes sense"? The expectation of the maximum of a collection of noisy random variables is very different from the maximum of their expectations. This is the crux of the problem with the standard approach. The result of an evaluation function deep in the tree is highly noisy, and if game play ever arrived at that point, we would have a much more precise estimate of its true expected value. Failing to take the uncertainty into account is just mathematically wrong. (Though it is clearly an approximation that has done OK, of course.)
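
A quick numerical illustration (my own toy example): take ten moves whose true values are all exactly zero, each seen through independent evaluation noise with a standard deviation of one pawn. The maximum of the expectations is zero, but the expected value of the noisy maximum is not:

Code: Select all

import random

random.seed(1)
moves, noise_sd, trials = 10, 1.0, 100000

# Average, over many trials, of the best-looking noisy score among the moves.
mean_of_max = sum(max(random.gauss(0.0, noise_sd) for _ in range(moves))
                  for _ in range(trials)) / trials
print(mean_of_max)   # about 1.54 pawns of pure optimism, not 0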

Your comment about MCTS seems to go some way towards what I'm doing. But what I'm proposing is certainly a long way from standard MCTS.
tholden
Posts: 15
Joined: Thu Oct 23, 2014 1:46 pm

Re: A new algorithm accounting for the uncertainty in eval f

Post by tholden »

Thank you for pointing me to those Baum and Smith papers by the way. They are certainly pursuing a similar idea. In particular the approximating assumptions (a) and (b) in the 1997 AI paper are precisely the ones I made.

Where we differ is on the following points:

* They approximate the distribution of a conventional evaluation function, which makes their life much harder. I, on the other hand, use the realisation that a "perfect" evaluation function would only ever return win, draw or lose, which lets me work with just three probabilities rather than entire distributions over the real line. This greatly simplifies the computations (see the sketch after this list).
* I also propose a heuristic for deciding which child to expand. Their heuristic is interesting, and it would be nice to see performance comparisons, but my hunch is that the extra complexity of their leaf-expansion decision will cost more than the efficiency it buys, particularly since it appears to make parallel versions of the algorithm much harder to implement.
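
As an illustration of why the three-outcome representation keeps the arithmetic cheap: under an independence assumption across the two moves (my own simplification for this illustration), the probability that one move's outcome is at least as good as another's is just a couple of multiplications, whereas with full distributions over the real line it would be an integral:

Code: Select all

def prob_weakly_dominates(a, b):
    # a and b are (P(win), P(draw), P(loss)) triples for two moves, assumed
    # independent; outcomes are ordered win > draw > loss.
    wa, da, la = a
    wb, db, lb = b
    return wa + da * (db + lb) + la * lb

print(prob_weakly_dominates((0.5, 0.3, 0.2), (0.3, 0.4, 0.3)))   # 0.77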