Are Aspiration Windows Worthless?

Ferdy · Post by **Ferdy** » Sun Dec 27, 2020 8:20 am

Tested Deuterium's aspwin, have not tested it for a long time. The version with aspwin won.

TC 15s+100ms

Score of Deuterium_aw vs Deuterium: 89 - 60 - 155  [0.548] 304
...      Deuterium_aw playing White: 49 - 23 - 80  [0.586] 152
...      Deuterium_aw playing Black: 40 - 37 - 75  [0.510] 152
...      White vs Black: 86 - 63 - 155  [0.538] 304
Elo difference: 33.2 +/- 27.4, LOS: 99.1 %, DrawRatio: 51.0 %

The algo is simple, if there is an early sign of score instability reset the bounds to its original value as early as possible.

It started with alpha = -inf, beta = +inf

* If it fails low, set alpha to score - 100, beta = score, but if the score is already losing or winning reset alpha/beta to -inf/+inf, then research meaning use the previous iteration depth.
* If it fails high, set beta to score + 100, alpha = score, but if the score is already losing or winning reset alpha/beta to -inf/+inf, then research.
* Otherwise, set alpha = score - 30, beta = score + 30, no research just continue with the next iteration depth.

Next:
* If it fails low and previous score was low or high then reset alpha/beta to -inf/+inf, then research. However if the score is already losing or winning reset alpha/beta to -inf/+inf, then research.
* If it fails high and previous score was low or high then reset alpha/beta to -inf/+inf, then research. However if the score is already losing or winning reset alpha/beta to -inf/+inf, then research.
So in summary if there is successive lows low/low or successive highs high/high, or alternate high/low or low/high, then reset alpha/beta to -inf/+inf.

I call 100 as BadWindow and 30 as GoodWindow.

I am trying to tune these two params with optuna optimizer at 100 games per trial for 100 trials at TC 15s+50ms to see if the optimizer can improve it.

Code: Select all

python -u tuner.py --study-name deu_aspwindow_opt --sampler name=tpe --engine ./engines/deuterium/deuterium_17.exe --initial-best-value 0.55 --concurrency 6 --opening-file ./start_opening/ogpt_chess_startpos.epd --opening-format epd --input-param "{'AspWindowGood': {'default':30, 'min':5, 'max':100, 'step':1}, 'AspWindowBad': {'default':100, 'min':5, 'max':200, 'step':1}}" --games-per-trial 100 --trials 100 --base-time-sec 15 --inc-time-sec 0.05 --pgn-output deu_aspwindow_opt.pgn --threshold-pruner result=0.25 --plot

The best param it could find so far after 6 trials with 53% score from a 100-game match against the default or init param is:

Code: Select all

2020-12-27 14:58:05,825 | INFO  | init param: {'AspWindowBad': 100, 'AspWindowGood': 30}
2020-12-27 15:05:29,205 | INFO  | study best param: {'AspWindowBad': 171, 'AspWindowGood': 42}
2020-12-27 15:05:29,206 | INFO  | study best value: 0.53

Ferdy · Post by **Ferdy** » Sun Dec 27, 2020 2:08 pm

Ferdy wrote: ↑Sun Dec 27, 2020 8:20 am Tested Deuterium's aspwin, have not tested it for a long time. The version with aspwin won.

TC 15s+100ms
Code: Select all
Score of Deuterium_aw vs Deuterium: 89 - 60 - 155  [0.548] 304
...      Deuterium_aw playing White: 49 - 23 - 80  [0.586] 152
...      Deuterium_aw playing Black: 40 - 37 - 75  [0.510] 152
...      White vs Black: 86 - 63 - 155  [0.538] 304
Elo difference: 33.2 +/- 27.4, LOS: 99.1 %, DrawRatio: 51.0 %
The algo is simple, if there is an early sign of score instability reset the bounds to its original value as early as possible.

It started with alpha = -inf, beta = +inf

* If it fails low, set alpha to score - 100, beta = score, but if the score is already losing or winning reset alpha/beta to -inf/+inf, then research meaning use the previous iteration depth.
* If it fails high, set beta to score + 100, alpha = score, but if the score is already losing or winning reset alpha/beta to -inf/+inf, then research.
* Otherwise, set alpha = score - 30, beta = score + 30, no research just continue with the next iteration depth.

Next:
* If it fails low and previous score was low or high then reset alpha/beta to -inf/+inf, then research. However if the score is already losing or winning reset alpha/beta to -inf/+inf, then research.
* If it fails high and previous score was low or high then reset alpha/beta to -inf/+inf, then research. However if the score is already losing or winning reset alpha/beta to -inf/+inf, then research.
So in summary if there is successive lows low/low or successive highs high/high, or alternate high/low or low/high, then reset alpha/beta to -inf/+inf.

I call 100 as BadWindow and 30 as GoodWindow.

I am trying to tune these two params with optuna optimizer at 100 games per trial for 100 trials at TC 15s+50ms to see if the optimizer can improve it.
Code: Select all
python -u tuner.py --study-name deu_aspwindow_opt --sampler name=tpe --engine ./engines/deuterium/deuterium_17.exe --initial-best-value 0.55 --concurrency 6 --opening-file ./start_opening/ogpt_chess_startpos.epd --opening-format epd --input-param "{'AspWindowGood': {'default':30, 'min':5, 'max':100, 'step':1}, 'AspWindowBad': {'default':100, 'min':5, 'max':200, 'step':1}}" --games-per-trial 100 --trials 100 --base-time-sec 15 --inc-time-sec 0.05 --pgn-output deu_aspwindow_opt.pgn --threshold-pruner result=0.25 --plot
The best param it could find so far after 6 trials with 53% score from a 100-game match against the default or init param is:
Code: Select all
2020-12-27 14:58:05,825 | INFO  | init param: {'AspWindowBad': 100, 'AspWindowGood': 30}
2020-12-27 15:05:29,205 | INFO  | study best param: {'AspWindowBad': 171, 'AspWindowGood': 42}
2020-12-27 15:05:29,206 | INFO  | study best value: 0.53

I stop the optimization after 40 trials (can be resumed). It came up with the best param below found at 17th trial.

Code: Select all

2020-12-27 19:11:54,617 | INFO  | study best param: {'AspWindowBad': 32, 'AspWindowGood': 29}
2020-12-27 19:11:54,625 | INFO  | study best value: 0.550625
2020-12-27 19:11:54,632 | INFO  | study best trial number: 17

Performance score history.

Run a verification match of 1000 games at TC 15s+100ms. The optimized version using {'AspWindowBad': 32, 'AspWindowGood': 29} won by +7 games vs the default {'AspWindowBad': 100, 'AspWindowGood': 30}

Code: Select all

Score of Deuterium_17_opt vs Deuterium_17: 219 - 212 - 569  [0.503] 1000
...      Deuterium_17_opt playing White: 125 - 96 - 280  [0.529] 501
...      Deuterium_17_opt playing Black: 94 - 116 - 289  [0.478] 499
...      White vs Black: 241 - 190 - 569  [0.525] 1000
Elo difference: 2.4 +/- 14.1, LOS: 63.2 %, DrawRatio: 56.9 %

An ideal games per trial should have been 1000 or more to get more coverage of the starting positions. The optimization done above is only 100 games per trial.

emadsen · Post by **emadsen** » Sun Dec 27, 2020 6:24 pm

Ferdy wrote: ↑Sun Dec 27, 2020 8:20 am The algo is simple, if there is an early sign of score instability reset the bounds to its original value as early as possible... if the score is already losing or winning...

Thanks Ferdy for your explanation and testing. If I understand you correctly, that's the algorithm I've tried (except with a margin of 100, 500, infinite) but failed to find any strength improvement. Though I'm confused what you mean by "if the score is already losing or winning."

Your post gave me an idea. Actually yours and Joost's, who said:

Joost Buijs wrote: ↑Mon Dec 21, 2020 8:48 am If the search fails high I don't try to resolve it but immediately skip to the next iteration/depth with an adjusted window.

I'll try this: Search with a tight aspiration window and if it fails high or low skip to the next ply and search with an infinite window. Just give up on the fail high / low ply altogether and don't research it. Simply move on to the next ply. If the previous ply failed high the move that caused the fail high will be saved in the cache and tried first the next ply (with an infinite window). That would limit the time spent re-searching by guaranteeing no consecutive plies ever fail high or low.

I have an evaluation parameter test running now. When that concludes I'll test the above aspiration window algorithm. I'll post my results here.

Thanks all who have participated in this discussion.

Ferdy · Post by **Ferdy** » Sun Dec 27, 2020 6:41 pm

emadsen wrote: ↑Sun Dec 27, 2020 6:24 pm
Ferdy wrote: ↑Sun Dec 27, 2020 8:20 am The algo is simple, if there is an early sign of score instability reset the bounds to its original value as early as possible... if the score is already losing or winning...
Thanks Ferdy for your explanation and testing. If I understand you correctly, that's the algorithm I've tried (except with a margin of 100, 500, infinite) but failed to find any strength improvement. Though I'm confused what you mean by "if the score is already losing or winning."

I define losing score as less than or equal to -10000 and winning score as greater than or equal to +10000. My inf is 32000.

Your post gave me an idea. Actually yours and Joost's, who said:

Joost Buijs wrote: ↑Mon Dec 21, 2020 8:48 am If the search fails high I don't try to resolve it but immediately skip to the next iteration/depth with an adjusted window.
I'll try this: Search with a tight aspiration window and if it fails high or low skip to the next ply and search with an infinite window. Just give up on the fail high / low ply altogether and don't research it. Simply move on to the next ply. If the previous ply failed high the move that caused the fail high will be saved in the cache and tried first the next ply (with an infinite window). That would limit the time spent re-searching by guaranteeing no consecutive plies ever fail high or low.

I have an evaluation parameter test running now. When that concludes I'll test the above aspiration window algorithm. I'll post my results here.

Thanks all who have participated in this discussion.

Looking forward on your result. I will also try that scheme.

Ferdy · Post by **Ferdy** » Mon Dec 28, 2020 4:14 am

Ferdy wrote: ↑Sun Dec 27, 2020 6:41 pm
emadsen wrote: ↑Sun Dec 27, 2020 6:24 pm
Ferdy wrote: ↑Sun Dec 27, 2020 8:20 am The algo is simple, if there is an early sign of score instability reset the bounds to its original value as early as possible... if the score is already losing or winning...
Thanks Ferdy for your explanation and testing. If I understand you correctly, that's the algorithm I've tried (except with a margin of 100, 500, infinite) but failed to find any strength improvement. Though I'm confused what you mean by "if the score is already losing or winning."
I define losing score as less than or equal to -10000 and winning score as greater than or equal to +10000. My inf is 32000.

Your post gave me an idea. Actually yours and Joost's, who said:

Joost Buijs wrote: ↑Mon Dec 21, 2020 8:48 am If the search fails high I don't try to resolve it but immediately skip to the next iteration/depth with an adjusted window.
I'll try this: Search with a tight aspiration window and if it fails high or low skip to the next ply and search with an infinite window. Just give up on the fail high / low ply altogether and don't research it. Simply move on to the next ply. If the previous ply failed high the move that caused the fail high will be saved in the cache and tried first the next ply (with an infinite window). That would limit the time spent re-searching by guaranteeing no consecutive plies ever fail high or low.

I have an evaluation parameter test running now. When that concludes I'll test the above aspiration window algorithm. I'll post my results here.

Thanks all who have participated in this discussion.
Looking forward on your result. I will also try that scheme.

I tried something similar to what Joost suggested. If it fails high reset the bound to inf and continue with the next iteration. But if it fails low expand at low side by 50, 100, inf, then research. Also start applying aspwin above iteration 3.

Code: Select all

aspWindowBad = 100
aspWindowGood = 30
aspFailLowTries = 2
alpha = -INF
beta = INF

for (iter = 1 to max_iter) {

	score = search(alpha, beta ...)

	if (aspWindow && iter > 3) {
		// Fail low
		if (score <= alpha) {
			fLow++

			if (fLow > aspFailLowTries) {
				alpha = -INF
				beta = INF
				iter--
				continue
			}
			else {
				alpha = score - (aspWindowBad * fLow)/2
				beta = score
				iter--
				continue
			}
		}

		// Fail high
		if (score >= beta) {
			alpha = -INF
			beta = INF
			fLow = 0
		}

		alpha = score - aspWindowGood
		beta = score + aspWindowGood
		fLow = 0
	}

}

Played 1000 games at TC 15s+100ms, Deuterium_18 is using this new method, Deuterium_17 has my current method. The new method won by +13 wins. This needs more tests if I have to use it.

Code: Select all

Finished game 1000 (Deuterium_17 vs Deuterium_18): 1/2-1/2 {Draw by insufficient mating material}
Score of Deuterium_18 vs Deuterium_17: 208 - 195 - 597  [0.506] 1000
...      Deuterium_18 playing White: 122 - 78 - 300  [0.544] 500
...      Deuterium_18 playing Black: 86 - 117 - 297  [0.469] 500
...      White vs Black: 239 - 164 - 597  [0.537] 1000
Elo difference: 4.5 +/- 13.7, LOS: 74.1 %, DrawRatio: 59.7 %

Player: Deuterium_18
   "Draw by 3-fold repetition": 352
   "Draw by adjudication": 100
   "Draw by fifty moves rule": 21
   "Draw by insufficient mating material": 122
   "Draw by stalemate": 2
   "Loss: Black wins by adjudication": 78
   "Loss: White wins by adjudication": 117
   "Win: Black wins by adjudication": 86
   "Win: White mates": 1
   "Win: White wins by adjudication": 121
Player: Deuterium_17
   "Draw by 3-fold repetition": 352
   "Draw by adjudication": 100
   "Draw by fifty moves rule": 21
   "Draw by insufficient mating material": 122
   "Draw by stalemate": 2
   "Loss: Black wins by adjudication": 86
   "Loss: White mates": 1
   "Loss: White wins by adjudication": 121
   "Win: Black wins by adjudication": 78
   "Win: White wins by adjudication": 117
Finished match

emadsen · Post by **emadsen** » Mon Dec 28, 2020 5:29 am

Ferdy wrote: ↑Mon Dec 28, 2020 4:14 am I tried something similar to what Joost suggested. If it fails high reset the bound to inf and continue with the next iteration. But if it fails low expand at low side by 50, 100, inf, then research. Also start applying aspwin above iteration 3.

That's encouraging. My test of skipping to next ply on fail high or low, searching with infinite window is not looking promising. -42 +/- 30 ELO after 600 games of 2m+1s.

Ferdy · Post by **Ferdy** » Mon Dec 28, 2020 7:05 am

emadsen wrote: ↑Mon Dec 28, 2020 5:29 am
Ferdy wrote: ↑Mon Dec 28, 2020 4:14 am I tried something similar to what Joost suggested. If it fails high reset the bound to inf and continue with the next iteration. But if it fails low expand at low side by 50, 100, inf, then research. Also start applying aspwin above iteration 3.
That's encouraging. My test of skipping to next ply on fail high or low, searching with infinite window is not looking promising. -42 +/- 30 ELO after 600 games of 2m+1s.

I actually had a bug in the implementation. If it fails high, I ignore changing bounds and just continue with alpha = score-30, beta=score+30.

Buggy:

Code: Select all

// Fail high
if (score >= beta) {
	alpha = -INF
	beta = INF
	fLow = 0
}

alpha = score - aspWindowGood
beta =  score + aspWindowGood

Correct, now in Deuterium_18_fix.

Code: Select all

// Fail high
if (score >= beta) {
	alpha = -INF
	beta = INF
	fLow = 0
	continue
}

alpha = score - aspWindowGood
beta =  score + aspWindowGood

Currently testing Deuterium_18_fix vs Deuterium_17.
Result so far after 420 games at TC 15s+100ms.

Code: Select all

Score of Deuterium_18_fix vs Deuterium_17: 84 - 84 - 252  [0.500] 420
...      Deuterium_18_fix playing White: 45 - 35 - 129  [0.524] 209
...      Deuterium_18_fix playing Black: 39 - 49 - 123  [0.476] 211
...      White vs Black: 94 - 74 - 252  [0.524] 420
Elo difference: 0.0 +/- 21.0, LOS: 50.0 %, DrawRatio: 60.0 %
Started game 426 of 1000 (Deuterium_17 vs Deuterium_18_fix)

Ferdy · Post by **Ferdy** » Mon Dec 28, 2020 8:08 am

Ferdy wrote: ↑Mon Dec 28, 2020 7:05 am
emadsen wrote: ↑Mon Dec 28, 2020 5:29 am
Ferdy wrote: ↑Mon Dec 28, 2020 4:14 am I tried something similar to what Joost suggested. If it fails high reset the bound to inf and continue with the next iteration. But if it fails low expand at low side by 50, 100, inf, then research. Also start applying aspwin above iteration 3.
That's encouraging. My test of skipping to next ply on fail high or low, searching with infinite window is not looking promising. -42 +/- 30 ELO after 600 games of 2m+1s.
I actually had a bug in the implementation. If it fails high, I ignore changing bounds and just continue with alpha = score-30, beta=score+30.

Buggy:
Code: Select all
// Fail high
if (score >= beta) {
	alpha = -INF
	beta = INF
	fLow = 0
}

alpha = score - aspWindowGood
beta =  score + aspWindowGood
Correct, now in Deuterium_18_fix.
Code: Select all
// Fail high
if (score >= beta) {
	alpha = -INF
	beta = INF
	fLow = 0
	continue
}

alpha = score - aspWindowGood
beta =  score + aspWindowGood
Currently testing Deuterium_18_fix vs Deuterium_17.
Result so far after 420 games at TC 15s+100ms.
Code: Select all
Score of Deuterium_18_fix vs Deuterium_17: 84 - 84 - 252  [0.500] 420
...      Deuterium_18_fix playing White: 45 - 35 - 129  [0.524] 209
...      Deuterium_18_fix playing Black: 39 - 49 - 123  [0.476] 211
...      White vs Black: 94 - 74 - 252  [0.524] 420
Elo difference: 0.0 +/- 21.0, LOS: 50.0 %, DrawRatio: 60.0 %
Started game 426 of 1000 (Deuterium_17 vs Deuterium_18_fix)

Final result after 1000 games. The fix loses by 4 games.

Code: Select all

Score of Deuterium_18_fix vs Deuterium_17: 210 - 214 - 576  [0.498] 1000
...      Deuterium_18_fix playing White: 124 - 85 - 291  [0.539] 500
...      Deuterium_18_fix playing Black: 86 - 129 - 285  [0.457] 500
...      White vs Black: 253 - 171 - 576  [0.541] 1000
Elo difference: -1.4 +/- 14.0, LOS: 42.3 %, DrawRatio: 57.6 %
Finished match

Dann Corbit · Post by **Dann Corbit** » Mon Dec 28, 2020 10:26 pm

Don't you hate it when the bug does better than the correct code

Ras · Post by **Ras** » Tue Dec 29, 2020 1:39 am

Ferdy wrote: ↑Mon Dec 28, 2020 8:08 amFinal result after 1000 games. The fix loses by 4 games.

Your error margin is +/- 31.5 points with 1000 games, so all you can conclude is that it has no major impact either way.

Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?

Re: Are Aspiration Windows Worthless?