Re: eval blending?
Posted: Fri May 05, 2017 9:33 am
I implemented a similar idea in a Java app. For a given root position, each engine analyses it separately. If they disagree, the app runs a "shoot out" in which engine 1 defends against engine 2's proposed move and vice versa, giving two lines of play. It expands both lines one ply at a time while monitoring the eval trends. The lines keep expanding until an engine's own eval causes it to prefer the other line. Of course, this doesn't always happen, and then I have to choose manually which line I prefer. However, I see plenty of cases of Stockfish switching to Komodo's proposal and vice versa. This is often at a time control of 4 days for the whole game (even though a whole game is not actually being played).

kbhearn wrote:
In its simplest form the idea has problems - namely that when you say have your trio of engines and one disagrees, you really don't know if it's seeing more than the other two or if it's blind to something the other two are seeing.
What you'd probably need to do is combine it in an external IDeA-like framework that maintains a tree with each engine's evaluations. In nodes that are of particular interest to any of the three engines, you expand that node and get second opinions on the child nodes, again from all three engines, and minimax the results back to the root for each. Eventually time would force you to just go with one of the moves if the disagreement couldn't be resolved, but at least it'd give the outlier engine a chance to demonstrate it knows better, or to realise it was the one missing something.
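The tree idea above can be sketched in a few lines of Java: one shared tree where every node carries a separate static eval per engine, each engine's evals are minimaxed back to the root independently, and a node counts as worth expanding when the engines disagree by more than some threshold. This is only a toy sketch of the data structure; the engine names, the centipawn convention, and the disagreement threshold are all assumptions, not anything from a real implementation.

```java
import java.util.*;

// Shared analysis tree: every node stores one static eval per engine,
// and each engine's evals are minimaxed back to the root independently.
class MultiEngineTree {
    static class Node {
        final List<Node> children = new ArrayList<>();
        final Map<String, Integer> evals = new HashMap<>(); // engine name -> eval (centipawns)
    }

    // Minimax the subtree under n from the point of view of a single engine.
    // 'maximizing' flips each ply, as in ordinary minimax.
    static int minimax(Node n, String engine, boolean maximizing) {
        if (n.children.isEmpty()) return n.evals.get(engine);
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (Node c : n.children) {
            int v = minimax(c, engine, !maximizing);
            best = maximizing ? Math.max(best, v) : Math.min(best, v);
        }
        return best;
    }

    // A node is "of particular interest" when any two engines disagree on it
    // by more than 'threshold' centipawns; those are the nodes to expand next.
    static boolean disputed(Node n, int threshold) {
        int lo = Collections.min(n.evals.values());
        int hi = Collections.max(n.evals.values());
        return hi - lo > threshold;
    }
}
```

Because each engine's backed-up score is computed over the same tree, the outlier engine gets to "see" the refutations the others proposed, which is exactly the chance to change its mind (or prove its point) that the quoted post asks for.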
The app supports any number of engines. If the "winner" between engine 1 and engine 2 disagrees with engine 3, that winner plays it out against engine 3, and so forth. Since the process can take days to complete, it is debatable whether it would have been better to use a single engine for the whole time and aim for a deeper search. It's an interesting question: when an engine has a blind spot due to, e.g., pruning or a weakness in its evaluation function, should we let it analyse for longer, or get a second opinion from another engine…
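The shoot-out loop I described can be sketched roughly as below. The Engine interface and its evaluate() method are hypothetical stand-ins for a real UCI wrapper; in the real app each iteration would also extend both lines one ply (each engine defending against the other's line), whereas here the expansion is only represented by passing the current ply to evaluate().

```java
// Toy sketch of the "shoot out": two engines disagree at the root, so both
// candidate lines are extended ply by ply while each engine re-evaluates
// both lines. The loop stops when one engine's own eval makes it prefer
// the rival line, or the ply budget runs out and a human must decide.
class ShootOut {
    interface Engine {
        String name();
        // Eval of the position at the end of 'line' after 'ply' expansions,
        // in centipawns from the root side's point of view (an assumption).
        int evaluate(String line, int ply);
    }

    // Returns the name of the engine that switched to the other engine's
    // line, or null if neither switched within maxPlies.
    static String run(Engine e1, Engine e2, String line1, String line2, int maxPlies) {
        for (int ply = 1; ply <= maxPlies; ply++) {
            // A real app would extend line1 and line2 by one ply here.
            if (e1.evaluate(line2, ply) > e1.evaluate(line1, ply)) return e1.name();
            if (e2.evaluate(line1, ply) > e2.evaluate(line2, ply)) return e2.name();
        }
        return null; // no agreement reached: the caller picks a line manually
    }
}
```

The null return is the manual-choice case mentioned above, and the returned winner is what would go on to face engine 3 in the knockout scheme.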