Re: Elo boost and time management
Posted: Sat Jan 26, 2019 6:23 pm
i'd like to see some test at long time control, but it wouyld be unfeasible long to get a good elo estimation
Code: Select all
if (TimeMan::isDynamic && depth > TimeMan::emergencyMinDepth && bestScore < depthScores[depth-1] - TimeMan::emergencyMargin) currentMoveMs = std::min(int(TimeMan::msecUntilNextTC*TimeMan::maxStealFraction), currentMoveMs*TimeMan::emergencyFactor);
if (std::max(1,int(std::chrono::duration_cast<std::chrono::milliseconds>(Clock::now() - TimeMan::startTime).count()*1.8)) > currentMoveMs) break; // not enought time
if (!pv.empty() && depth > TimeMan::easyMoveMinDepth && isCapture(pv[0]) && sameMove(depthMoves[depth - 1], pv[0]) && sameMove(depthMoves[depth - 2], pv[0]) && sameMove(depthMoves[depth - 3], pv[0]) && std::abs(depthScores[depth - 1] - bestScore) < TimeMan::easyMoveMargin && std::abs(depthScores[depth - 2] - bestScore) < TimeMan::easyMoveMargin && std::abs(depthScores[depth - 3] - bestScore) < TimeMan::easyMoveMargin) currentMoveMs/=3; // easy move
Thanks for this well thought out answer.odomobo wrote: ↑Fri Feb 01, 2019 11:43 am I've been thinking about this problem lately (haven't implemented it yet), and here are my observations:
The most important (IMO) case of an easy move is one which has immediate consequences -- when it is the only good move for depth 1 or 2 plus qs (actual depth 1 or 2, without using values from the TT). I am pretty sure there can also be easy moves which only manifest at, e.g. depth 3, but those aren't as clear to me, and by their nature, are more complex.
I think this should be obvious, but in this initial search, all of the "bad moves" searches need to use alpha = bestMoveValue - margin (or alpha = -Infinity).
There are 2 risks with such a move -- A: that the "easy move" is bad, or B: that one of the "bad moves" is actually a sacrifice tactic. For case A, you will eventually see the "easy move" lose a lot of value, down to a similar (or worse) value than the best of the "bad moves". This loss of value will be permanent (you won't regain it at a deeper search, unless another tactic is found, which disqualifies this as an easy move anyhow). For case B, given enough depth, one of the "bad moves" will gain value to become similar or greater than the "easy move".
There's actually a 3rd risk -- that one of the "bad moves" is a good positional sacrifice. However, if your evaluation function can't identify a winning position which is down material, then this is a moot point.
So, you need to make sure that you spend enough time searching such that the risks are less than the reward from saving time on the search. The rewards are very minimal -- by taking almost no time for a few turns in the game, you're giving maybe a 10% (or less) time bonus to the rest of the turns in the game. Risk case B isn't that big of a deal -- if the engine misses a strong tactic, there's still a chance for a good game (now with more time on the clock). However, if you miss risk case A, it can immediately lose you the game.
Due to the exponential nature of search time related to depth, if you haven't run into either risk case within, say, 10% of your allotted time, the chance of finding one of them is very minimal. So, it should be fairly safe to stop early (e.g. at 10% time used) if you detect an "easy move" with the initial easy move search, and that move has always been the best move at all previous depths, and that it still has close to its initial value (it has never lost more than, say, 50cp of value).
If you're paranoid about risk case B, this should be possible to prevent within a fail-soft framework, but it will fail (search long) for actual easy moves sometimes. In the case where there is no good sacrifice tactic, it makes sense that the upper bounds returned for all of the "bad moves" can all be much less than the best move score. Or, to put it another way, even though the values won't be exact, they can still all be very low. This should be the case as long as the moves made within these searches have reasonable ordering.
The reason one of the upper bounds might be much higher than the "actual" value (i.e. it's close to the best move score, instead of being close to its actual low score), is if it makes a suboptimal search which involves needless sacrifice. However, needless sacrifice moves should generally be avoidable through good move ordering (for example, delaying SEE<0 moves). I'm not 100% sure, but I think this case shouldn't happen very often when there are no sacrifice tactics. It should reduce risk further, with the downside of sometimes causing easy moves to be searched longer.
So, the additional logic within a fail-soft framework is: if for the current partial search, or the previous full search, if the best "bad move" is within some margin of the best move, then don't immediately return. If a previous depth search had a bad move within a margin, that shouldn't have an impact on this decision.
I don't think TT hits can negatively affect this algorithm (other than the fact that the TT cannot be used in the initial shallow search). This is because the information from only the deepest full search can be used ensure we are not experiencing case A or B.
Sorry for the big wall of text, but I don't think I can condense my thoughts any more than this.
What is movestogo. How do I get movestogo ? Game may last for 350 moves or more.
There are basic three types of time control: