Dann Corbit wrote: ↑Tue Dec 22, 2020 12:39 am
One alternative could be to store the next best score from the previous iteration (and possibly the next best move so we can search it first) along with alpha and beta.
In Principal Variation Search, we don't know the next best move. We only know it if MultiPV > 1.
At least in MadChess. I do PVS even at the root. I definitely remember testing this and confirming PVS at the root is stronger than a full alpha / beta window at the root.
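For illustration, here is a minimal sketch of PVS at the root (the names search, Position, Move, and rootMoves are hypothetical, not MadChess code). Note how the zero-width probes on the later moves only prove a move is no better than alpha, which is why no "next best score" falls out of the search:

```cpp
// A root-level PVS sketch. search() is assumed to be the engine's
// existing negamax with alpha-beta; Position, Move, and rootMoves
// are illustrative placeholders.
#include <vector>

constexpr int Infinite = 32000;

Move SearchRoot(Position& pos, std::vector<Move>& rootMoves, int depth) {
    int alpha = -Infinite, beta = Infinite;
    Move best = rootMoves[0];
    for (std::size_t i = 0; i < rootMoves.size(); ++i) {
        pos.makeMove(rootMoves[i]);
        int score;
        if (i == 0) {
            // First (PV) move: full-window search.
            score = -search(pos, depth - 1, -beta, -alpha);
        } else {
            // Later moves: zero-width probe around alpha. A fail low here
            // only proves the move is no better than alpha, so the
            // second-best score is never learned.
            score = -search(pos, depth - 1, -alpha - 1, -alpha);
            if (score > alpha)  // probe failed high: re-search, full window
                score = -search(pos, depth - 1, -beta, -alpha);
        }
        pos.unmakeMove(rootMoves[i]);
        if (score > alpha) {
            alpha = score;
            best = rootMoves[i];
        }
    }
    return best;
}
```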
Good point. That idea does not work with a PVS search.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
I tried to implement aspiration windows in Drofa two times; both attempts were unsuccessful.
On the first try I still had a TT bug in the engine that made aspiration search very bad (it basically polluted the TT with worthless entries).
On the second try the engine searched far fewer nodes, but it was still about -7 Elo.
I suppose the devil is in the details here: you have to get everything right in order for it to work.
I'll try one or two more times to implement this.
With nearly every top engine using it, I'm more or less sure it is a working technique, but a tricky one.
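For reference, here is a minimal sketch of the technique wrapped around iterative deepening (illustrative names, not Drofa's code; the initial half-width of 25 centipawns, the depth threshold, and the doubling factor are all assumptions):

```cpp
// Aspiration windows: start each iteration with a narrow window centred
// on the previous iteration's score, and widen on a fail low or fail
// high until the true score fits inside the window.
#include <algorithm>

constexpr int Infinite = 32000;

void IterativeDeepening(Position& pos, int maxDepth) {
    int prevScore = 0;
    for (int depth = 1; depth <= maxDepth; ++depth) {
        // Shallow iterations use a full window; the score is too volatile.
        int delta = 25;  // assumed initial half-width
        int alpha = depth >= 5 ? prevScore - delta : -Infinite;
        int beta  = depth >= 5 ? prevScore + delta :  Infinite;
        while (true) {
            int score = search(pos, depth, alpha, beta);
            if (score <= alpha)
                // Fail low: drop alpha below the returned score and retry.
                alpha = std::max(score - delta, -Infinite);
            else if (score >= beta)
                // Fail high: raise beta above the returned score and retry.
                beta = std::min(score + delta, Infinite);
            else {
                prevScore = score;  // exact score; go to the next depth
                break;
            }
            delta *= 2;  // widen faster on repeated re-searches
        }
    }
}
```

Each fail low or fail high triggers a re-search that stores bound entries in the TT, which is presumably how a TT bug gets amplified into the pollution described above.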
emadsen wrote: ↑Mon Dec 21, 2020 9:17 pm
Otherwise I feel I'm subverting the correctness of the alpha / beta algorithm.
With pruning and reductions (especially LMR), that has long since gone out of the window anyway.
Is that a bad pun? Window, ha ha. Well, there never was correctness anyway, because that assumes a perfect static eval, which is only true for draw-by-rule, stalemate, and checkmate. I see your point though. I just don't see any advantage in aspiration windows over what I already get from PVS, whereas I see a massive advantage in LMR.
I just ran some experiments that strongly disagree with your findings.
The aspiration windows algorithm is definitely worthwhile in my engines. I got the following results from self-play, using an SPRT stop condition:
- Dumb: +54.2 +/- 7.3 Elo (834 games)
- Amoeba: +131.6 +/- 12.2 Elo (260 games)
Because of self-play and SPRT, the Elo differences are probably exaggerated, but they are obviously significant.
Are you sure your implementation is correct and optimal?
I've never had any luck with aspiration windows for Maverick. I've always assumed it was down to a poorly tuned evaluation function and a generally weak-ish engine.
I've always thought that when searching the first move, the hash table would guide the search down the previous PV and the window would quickly close.
One point: do you have a fixed width for your window after each re-search, or are you gradually opening the window? For example...
First search: alpha = pv_score - 25; beta = pv_score + 25
Second search on fail high: alpha = pv_score - 25; beta = pv_score + 125
Third search on fail high: alpha = pv_score - 25; beta = pv_score + 300
Fourth search on fail high: alpha = pv_score - 25; beta = +inf
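In code, that schedule might look something like the following sketch (names are illustrative; the widths are the ones from the example above, and a fail low would need the mirrored treatment on alpha):

```cpp
// Gradual, fail-high-only widening: alpha stays pinned at pv_score - 25
// while beta opens up step by step. search() is assumed to be the
// engine's existing alpha-beta entry point.
constexpr int Infinite = 32000;
constexpr int BetaMargins[] = {25, 125, 300, Infinite};

int AspirateRoot(Position& pos, int depth, int pvScore) {
    const int alpha = pvScore - 25;
    int score = -Infinite;
    for (int margin : BetaMargins) {
        int beta = margin == Infinite ? Infinite : pvScore + margin;
        score = search(pos, depth, alpha, beta);
        if (score < beta)
            break;  // no fail high; fail-low handling is omitted here
    }
    return score;
}
```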
abulmo2 wrote: ↑Fri Dec 25, 2020 1:43 am
I just ran some experiments that strongly disagree with your findings.
The aspiration windows algorithm is definitely worthwhile in my engines. I got the following results from self-play, using an SPRT stop condition:
- Dumb: +54.2 +/- 7.3 Elo (834 games)
- Amoeba: +131.6 +/- 12.2 Elo (260 games)
Because of self-play and SPRT, the Elo differences are probably exaggerated, but they are obviously significant.
Are you sure your implementation is correct and optimal?
If you want to measure a rating difference, then SPRT is not the right test; you need to use a fixed number of games.
abulmo2 wrote: ↑Fri Dec 25, 2020 1:43 am
I just ran some experiments that strongly disagree with your findings... Are you sure your implementation is correct and optimal?
Interesting. Thank you, Richard, for running these tests. I am always willing to admit the possibility that I screwed up something in my code.
Steve Maughan wrote: ↑Fri Dec 25, 2020 11:15 am
I've never had any luck with aspiration windows for Maverick. I've always assumed it was down to a poorly tuned evaluation function and a generally weak-ish engine... Do you have a fixed width for your window after each re-search, or are you gradually opening the window?
I tried gradually opening the window by +/- 25, 50, 100, 200, 500, etc., but eventually settled on +/- 100, 500, infinite.
abulmo2 wrote: ↑Fri Dec 25, 2020 1:43 am
I just ran some experiments that strongly disagree with your findings.
The aspiration windows algorithm is definitely worthwhile in my engines. I got the following results from self-play, using an SPRT stop condition:
- Dumb: +54.2 +/- 7.3 Elo (834 games)
- Amoeba: +131.6 +/- 12.2 Elo (260 games)
Because of self-play and SPRT, the Elo differences are probably exaggerated, but they are obviously significant.
Are you sure your implementation is correct and optimal?
If you want to measure a rating difference, then SPRT is not the right test; you need to use a fixed number of games.
I ran a gauntlet test with Dumb (aspiration on/off) and found +50.9 Elo (+/- 12, 100 games × 19 opponents) in favour of the engine with aspiration windows, so the result is on par with the SPRT one.