Elo boost and time management

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

elcabesa
Posts: 855
Joined: Sun May 23, 2010 1:32 pm

Re: Elo boost and time management

Post by elcabesa »

i'd like to see some test at long time control, but it wouyld be unfeasible long to get a good elo estimation
Volker Annuss
Posts: 180
Joined: Mon Sep 03, 2007 9:15 am

Re: Elo boost and time management

Post by Volker Annuss »

Another old thread on time management: viewtopic.php?f=7&t=47242
User avatar
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Elo boost and time management

Post by xr_a_y »

Good read !

I've added this 3 heuristics at the end of ID loop :

Code: Select all

        if (TimeMan::isDynamic && depth > TimeMan::emergencyMinDepth && bestScore < depthScores[depth-1] - TimeMan::emergencyMargin) currentMoveMs = std::min(int(TimeMan::msecUntilNextTC*TimeMan::maxStealFraction), currentMoveMs*TimeMan::emergencyFactor);
        if (std::max(1,int(std::chrono::duration_cast<std::chrono::milliseconds>(Clock::now() - TimeMan::startTime).count()*1.8)) > currentMoveMs) break; // not enought time
        if (!pv.empty() && depth > TimeMan::easyMoveMinDepth && isCapture(pv[0]) && sameMove(depthMoves[depth - 1], pv[0]) && sameMove(depthMoves[depth - 2], pv[0]) && sameMove(depthMoves[depth - 3], pv[0]) && std::abs(depthScores[depth - 1] - bestScore) < TimeMan::easyMoveMargin && std::abs(depthScores[depth - 2] - bestScore) < TimeMan::easyMoveMargin && std::abs(depthScores[depth - 3] - bestScore) < TimeMan::easyMoveMargin) currentMoveMs/=3; // easy move
The first two added 40 elo. The last one is currently under test ...
User avatar
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Elo boost and time management

Post by xr_a_y »

ok so the third line (some king of easy move impl) is loosing elo (I tried many parameter combinaisons).

One thing that seems better (but still loosing elo for now) is looking for singular move at root with very shallow search (if the best move score is > all other + a margin) and in this case reduce available time.

The thing is that when NOT clearing TT between search, it seems hard to gain something with easy move.

Any hints ?
odomobo
Posts: 96
Joined: Fri Jul 06, 2018 1:09 am
Location: Chicago, IL
Full name: Josh Odom

Re: Elo boost and time management

Post by odomobo »

I've been thinking about this problem lately (haven't implemented it yet), and here are my observations:

The most important (IMO) case of an easy move is one which has immediate consequences -- when it is the only good move for depth 1 or 2 plus qs (actual depth 1 or 2, without using values from the TT). I am pretty sure there can also be easy moves which only manifest at, e.g. depth 3, but those aren't as clear to me, and by their nature, are more complex.

I think this should be obvious, but in this initial search, all of the "bad moves" searches need to use alpha = bestMoveValue - margin (or alpha = -Infinity).

There are 2 risks with such a move -- A: that the "easy move" is bad, or B: that one of the "bad moves" is actually a sacrifice tactic. For case A, you will eventually see the "easy move" lose a lot of value, down to a similar (or worse) value than the best of the "bad moves". This loss of value will be permanent (you won't regain it at a deeper search, unless another tactic is found, which disqualifies this as an easy move anyhow). For case B, given enough depth, one of the "bad moves" will gain value to become similar or greater than the "easy move".

There's actually a 3rd risk -- that one of the "bad moves" is a good positional sacrifice. However, if your evaluation function can't identify a winning position which is down material, then this is a moot point.

So, you need to make sure that you spend enough time searching such that the risks are less than the reward from saving time on the search. The rewards are very minimal -- by taking almost no time for a few turns in the game, you're giving maybe a 10% (or less) time bonus to the rest of the turns in the game. Risk case B isn't that big of a deal -- if the engine misses a strong tactic, there's still a chance for a good game (now with more time on the clock). However, if you miss risk case A, it can immediately lose you the game.

Due to the exponential nature of search time related to depth, if you haven't run into either risk case within, say, 10% of your allotted time, the chance of finding one of them is very minimal. So, it should be fairly safe to stop early (e.g. at 10% time used) if you detect an "easy move" with the initial easy move search, and that move has always been the best move at all previous depths, and that it still has close to its initial value (it has never lost more than, say, 50cp of value).

If you're paranoid about risk case B, this should be possible to prevent within a fail-soft framework, but it will fail (search long) for actual easy moves sometimes. In the case where there is no good sacrifice tactic, it makes sense that the upper bounds returned for all of the "bad moves" can all be much less than the best move score. Or, to put it another way, even though the values won't be exact, they can still all be very low. This should be the case as long as the moves made within these searches have reasonable ordering.

The reason one of the upper bounds might be much higher than the "actual" value (i.e. it's close to the best move score, instead of being close to its actual low score), is if it makes a suboptimal search which involves needless sacrifice. However, needless sacrifice moves should generally be avoidable through good move ordering (for example, delaying SEE<0 moves). I'm not 100% sure, but I think this case shouldn't happen very often when there are no sacrifice tactics. It should reduce risk further, with the downside of sometimes causing easy moves to be searched longer.

So, the additional logic within a fail-soft framework is: if for the current partial search, or the previous full search, if the best "bad move" is within some margin of the best move, then don't immediately return. If a previous depth search had a bad move within a margin, that shouldn't have an impact on this decision.

I don't think TT hits can negatively affect this algorithm (other than the fact that the TT cannot be used in the initial shallow search). This is because the information from only the deepest full search can be used ensure we are not experiencing case A or B.

Sorry for the big wall of text, but I don't think I can condense my thoughts any more than this.
Henk
Posts: 7216
Joined: Mon May 27, 2013 10:31 am

Re: Elo boost and time management

Post by Henk »

Skipper using timeLeftOver/ 60. Doesn't understand increment.

Hardly ever looses on time. On the contrary it loses because of using too less time.
User avatar
tsoj
Posts: 35
Joined: Thu Oct 19, 2017 4:59 pm
Location: Germany, Berlin
Full name: Jost Triller

Re: Elo boost and time management

Post by tsoj »

Henk wrote: Fri Feb 01, 2019 12:04 pm Skipper using timeLeftOver/ 60. Doesn't understand increment.

Hardly ever looses on time. On the contrary it loses because of using too less time.
so even if movestogo is given you assume that you have 60 moves to go?
User avatar
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Elo boost and time management

Post by xr_a_y »

odomobo wrote: Fri Feb 01, 2019 11:43 am I've been thinking about this problem lately (haven't implemented it yet), and here are my observations:

The most important (IMO) case of an easy move is one which has immediate consequences -- when it is the only good move for depth 1 or 2 plus qs (actual depth 1 or 2, without using values from the TT). I am pretty sure there can also be easy moves which only manifest at, e.g. depth 3, but those aren't as clear to me, and by their nature, are more complex.

I think this should be obvious, but in this initial search, all of the "bad moves" searches need to use alpha = bestMoveValue - margin (or alpha = -Infinity).

There are 2 risks with such a move -- A: that the "easy move" is bad, or B: that one of the "bad moves" is actually a sacrifice tactic. For case A, you will eventually see the "easy move" lose a lot of value, down to a similar (or worse) value than the best of the "bad moves". This loss of value will be permanent (you won't regain it at a deeper search, unless another tactic is found, which disqualifies this as an easy move anyhow). For case B, given enough depth, one of the "bad moves" will gain value to become similar or greater than the "easy move".

There's actually a 3rd risk -- that one of the "bad moves" is a good positional sacrifice. However, if your evaluation function can't identify a winning position which is down material, then this is a moot point.

So, you need to make sure that you spend enough time searching such that the risks are less than the reward from saving time on the search. The rewards are very minimal -- by taking almost no time for a few turns in the game, you're giving maybe a 10% (or less) time bonus to the rest of the turns in the game. Risk case B isn't that big of a deal -- if the engine misses a strong tactic, there's still a chance for a good game (now with more time on the clock). However, if you miss risk case A, it can immediately lose you the game.

Due to the exponential nature of search time related to depth, if you haven't run into either risk case within, say, 10% of your allotted time, the chance of finding one of them is very minimal. So, it should be fairly safe to stop early (e.g. at 10% time used) if you detect an "easy move" with the initial easy move search, and that move has always been the best move at all previous depths, and that it still has close to its initial value (it has never lost more than, say, 50cp of value).

If you're paranoid about risk case B, this should be possible to prevent within a fail-soft framework, but it will fail (search long) for actual easy moves sometimes. In the case where there is no good sacrifice tactic, it makes sense that the upper bounds returned for all of the "bad moves" can all be much less than the best move score. Or, to put it another way, even though the values won't be exact, they can still all be very low. This should be the case as long as the moves made within these searches have reasonable ordering.

The reason one of the upper bounds might be much higher than the "actual" value (i.e. it's close to the best move score, instead of being close to its actual low score), is if it makes a suboptimal search which involves needless sacrifice. However, needless sacrifice moves should generally be avoidable through good move ordering (for example, delaying SEE<0 moves). I'm not 100% sure, but I think this case shouldn't happen very often when there are no sacrifice tactics. It should reduce risk further, with the downside of sometimes causing easy moves to be searched longer.

So, the additional logic within a fail-soft framework is: if for the current partial search, or the previous full search, if the best "bad move" is within some margin of the best move, then don't immediately return. If a previous depth search had a bad move within a margin, that shouldn't have an impact on this decision.

I don't think TT hits can negatively affect this algorithm (other than the fact that the TT cannot be used in the initial shallow search). This is because the information from only the deepest full search can be used ensure we are not experiencing case A or B.

Sorry for the big wall of text, but I don't think I can condense my thoughts any more than this.
Thanks for this well thought out answer.

What i am trying now is :
- use the classic ID loop (window, fail soft, pvs, TT, ...) until some "safe" depth (let's say depth 8)
- launch an open window depth 2 search (with qsearch of course ...)
- if the returned best move score is > second best move score + margin (I take 200cp for margin), if is it a capture or a check evasion, if that score is not too far from the current score (margin 30cp) then remainingTime/=4.
- go on with ID loop : if time is already too long, the ID will quit with the previous iteration move and score, if not we'll try to iterate until the reduced time.

This has (at least) two drawbacks :
- easy move is not triggered often because of delaying checks, or already found deep delaying tactics. It mostly activates for heavy piece recapture or check evasion (with only one choice ...). So, for a 80 moves game, it is used less than 10 times, and thus doesn't get much time in my short TC (40/20sec) tests. In long TC, it will gain at max 8 move time, and thus add around 10% of time for each move, which is not enough to go one depth deeper each move but can be use as emergency time somewhere. So I take some risk, without many chance of compensation.
- as the TT is not cleared between move, a part of the search that is "wasted" on a supposed easy move is in fact saved in TT and thus gain some speed for the next move searched.

I am not sure to gain something with this at the end.

Does someone has some tested fact (elo gain)
Henk
Posts: 7216
Joined: Mon May 27, 2013 10:31 am

Re: Elo boost and time management

Post by Henk »

tsoj wrote: Fri Feb 01, 2019 1:47 pm
Henk wrote: Fri Feb 01, 2019 12:04 pm Skipper using timeLeftOver/ 60. Doesn't understand increment.

Hardly ever looses on time. On the contrary it loses because of using too less time.
so even if movestogo is given you assume that you have 60 moves to go?
What is movestogo. How do I get movestogo ? Game may last for 350 moves or more.
JVMerlino
Posts: 1357
Joined: Wed Mar 08, 2006 10:15 pm
Location: San Francisco, California

Re: Elo boost and time management

Post by JVMerlino »

Henk wrote: Fri Feb 01, 2019 4:34 pm
tsoj wrote: Fri Feb 01, 2019 1:47 pm
Henk wrote: Fri Feb 01, 2019 12:04 pm Skipper using timeLeftOver/ 60. Doesn't understand increment.

Hardly ever looses on time. On the contrary it loses because of using too less time.
so even if movestogo is given you assume that you have 60 moves to go?
What is movestogo. How do I get movestogo ? Game may last for 350 moves or more.
There are basic three types of time control:
1) Game in N minutes
2) Game in N minutes, plus increment with each move
3) X moves in N minutes, repeating (rarely used on chess servers, but you need to support it properly as it is what CCRL uses).

So "movestogo" would be the number of moves remaining in type 3 before getting an additional N minutes.

(timeleftover / 60) just means "whatever amount of time I have remaining on the clock at the start of my turn, I will think for 1/60 of that and then make a move".

This is incredibly rudimentary and also VERY conservative. I've never seen an implementation of this type that had a number larger than 40, so I highly recommend trying some number between 30-40. I'm sure you will gain at least some ELO.

Adding the handling of an increment (type 2) is incredibly easy. Just do "(timeleftover / 40) + increment". That way, if increment is 0, then it still works just fine.

jm