optimal aspiration window for stockfish question

bob · Post by **bob** » Fri Mar 16, 2012 4:41 am

Here is the output from BayesElo first:

Rank  Name                 Elo   +   -  Games  Score  Oppo.  Draws
   2  Crafty-23.5R06-200  2650   4   4  30000    64%   2539    22%
   3  Crafty-23.5R06-24   2650   4   4  30000    64%   2539    22%
   4  Crafty-23.5R06-100  2649   4   4  30000    63%   2539    22%
   5  Crafty-23.5R06-30   2648   4   4  30000    63%   2539    22%
   6  Crafty-23.5R06-50   2648   4   4  30000    63%   2539    22%
   7  Crafty-23.5R06-300  2648   4   4  30000    63%   2539    22%
   8  Crafty-23.5R06-20   2648   4   4  30000    63%   2539    22%
   9  Crafty-23.5R06-10   2645   4   4  30000    63%   2539    22%
  10  Crafty-23.5-2       2645   4   4  30000    63%   2539    22%
  11  Crafty-23.5R06-1    2645   4   4  30000    63%   2539    22%
  12  Crafty-23.5R06-8    2644   4   4  30000    63%   2539    22%
  13  Crafty-23.5R06-5    2643   4   4  30000    63%   2539    22%
  14  Crafty-23.5-1       2641   4   4  30000    63%   2539    22%
  15  Crafty-23.5R06-2    2636   4   4  30000    62%   2539    22%

Versions 23.5-1 and 23.5-2 are simply two consecutive runs of the same version, to provide a baseline result. The rest of the tests are version 23.5R06, where the -n suffix is the aspiration window (the delta value in the code posted yesterday). 23.5R06-1 means the aspiration window was +/- 1 with delta=1. 1 and 2 are a bit low, and by the time it gets to 10, it is pretty much optimal. Bigger doesn't seem to hurt at all, up to +/- 3.0 pawns... I was expecting a better result in the 20-40 range. The reason I ran the big numbers was to produce some worse results, so that there would be a recognizable curve with a clear optimal value and worse results on either side. Didn't get exactly what I expected, as you can see...
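
For readers without the earlier code at hand, here is a minimal sketch of what an aspiration window with an initial delta does at the root. It is not Crafty's code: the `search` callback, the toy search, and the rule of doubling delta after each fail are all assumptions made for illustration. It only shows why a small delta costs extra re-searches while a large one gets the score in a single pass.

```python
INF = 10**9  # stands in for an unbounded score


def aspiration_search(search, prev_score, delta):
    """Search around prev_score with a +/- delta window, widening on a fail.

    `search(alpha, beta)` is a hypothetical callback; a returned score
    <= alpha signals a fail low, >= beta a fail high.
    Returns (score, number_of_searches).
    """
    alpha, beta = prev_score - delta, prev_score + delta
    searches = 0
    while True:
        score = search(alpha, beta)
        searches += 1
        if score <= alpha:        # fail low: relax the lower bound only
            delta *= 2            # assumed widening rule, not from the thread
            alpha = max(score - delta, -INF)
        elif score >= beta:       # fail high: relax the upper bound only
            delta *= 2
            beta = min(score + delta, INF)
        else:
            return score, searches


def make_toy_search(true_score):
    """Stand-in for a real search: clamps a fixed true score to the window."""
    def search(alpha, beta):
        return max(alpha, min(beta, true_score))
    return search


if __name__ == "__main__":
    toy = make_toy_search(35)          # the score jumped by 35 since last iteration
    print(aspiration_search(toy, 0, 1))    # narrow window, several re-searches
    print(aspiration_search(toy, 0, 50))   # wide window, a single search
```

In a real engine each re-search is far cheaper than a fresh search because of the hash table, which may be part of why the measured Elo differences are so small.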

Rémi Coulom · Post by **Rémi Coulom** » Fri Mar 16, 2012 9:38 am

You should use CLOP

Rémi

Karlo Bala · Post by **Karlo Bala** » Fri Mar 16, 2012 12:30 pm

What is the average depth?

bob · Post by **bob** » Fri Mar 16, 2012 6:20 pm

Rémi Coulom wrote:
You should use CLOP

Rémi

I intend to look at it. But I can test this so simply at present: I just say "runtest" and it runs the test with each parameter change as needed. Of course, it is not optimally tuning a parameter, just using the choices I give...

bob · Post by **bob** » Fri Mar 16, 2012 6:26 pm

Karlo Bala wrote:
What is the average depth?

A quick look says that the average is in the 13-15 range. I looked at a few logs and there are plenty of re-searches going on, so the window is having a chance to exert influence.

Rebel · Post by **Rebel** » Fri Mar 16, 2012 7:31 pm

bob wrote:
Karlo Bala wrote:
What is the average depth?
A quick look says that the average is in the 13-15 range. I looked at a few logs and there are plenty of re-searches going on, so the window is having a chance to exert influence.

What should not be forgotten is the saturation of the hash table. With an almost full hash table, re-searches become very expensive, and it would make sense to widen the window.

bob · Post by **bob** » Fri Mar 16, 2012 8:16 pm

Rebel wrote:
bob wrote:
Karlo Bala wrote:
What is the average depth?
A quick look says that the average is in the 13-15 range. I looked at a few logs and there are plenty of re-searches going on, so the window is having a chance to exert influence.
What should not be forgotten is the saturation of the hash table. With an almost full hash table, re-searches become very expensive, and it would make sense to widen the window.

That's one problem I really don't have to deal with. I don't hash the q-search, so I really don't see much in terms of saturation. When I was testing this stuff, I found that hashing the q-search reduced the total tree size by about 10%, but it slowed the program down by almost exactly the same amount. A wash. But by not hashing the q-search, the stress on the ttable is greatly reduced, which tends to make not hashing the q-search pay off in really long games without sufficient memory for the ttable. 8 gigs gives 512 million entries, which is a lot even for a long 40/2hr type game...
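
The entry count quoted above can be checked with a line of arithmetic. The 16-byte entry size below is an inference from the numbers in the post (8 gigs, 512 million entries); the post itself does not state it, and `tt_entries` is a made-up helper name.

```python
GIB = 2**30


def tt_entries(table_bytes, entry_bytes):
    """Number of whole entries that fit in a table of the given size."""
    return table_bytes // entry_bytes


# Assumption: 16 bytes per entry, inferred from "8 gigs gives 512 million entries".
entries = tt_entries(8 * GIB, 16)
print(entries)           # 536870912
print(entries // 2**20)  # 512, counting in units of 2**20 entries
```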

I run fast games with a modest hash on the cluster, just to try to keep things within perspective...

diep · Post by **diep** » Sat Mar 17, 2012 2:44 pm

bob wrote:
mcostalba wrote:
bob wrote: Absolutely no change, either up or down.
Have you tested at longer TC? It would be interesting to know how it scales at longer TC; with your cluster it should be feasible to test at, say, 1' TC.
I thought I had said that I tested up to 1 minute + 1 second. I did not go beyond that, and found absolutely no difference, which was surprising. The only thing I didn't like was the excessive re-searches to reach a mate. But at that point, it doesn't affect the game result at all of course...

When we first started doing this in Cray Blitz, we tried lots of ideas. Our aspiration window was roughly 1/3 of a pawn, so we relaxed alpha/beta to +1.0, then +3.0, then +9.0 and then all the way to infinite. In Crafty, I eliminated that +3.0 and have been using +1, +9 and +infinite for the longest...

I am also running a test (since I was not testing anything else) on various aspiration window widths as well. I'll post those results when they finish...

Cray Blitz still alive?

In Diep I start the search with (-inf, +inf) for a simple reason: the hash table will directly give back a good bound anyway, and you null-window around that.

I assume aspiration search is a great idea only when your PV is a total mess, which shouldn't happen with a proper YBW search.

Maybe that's why Crafty doesn't suffer from the same problem there.
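
The staged relaxation bob describes in the quoted text (+1.0, +3.0, +9.0, then infinite in Cray Blitz; +1, +9, then infinite in Crafty) could look roughly like the sketch below. Two things are assumptions here, since the post does not spell them out: that the margins are measured from the previous iteration's score, and that only the bound on the failing side is moved. Scores are in centipawns and all names are hypothetical.

```python
INF = 10**9

# Margins in centipawns for successive re-searches.
CRAY_BLITZ_STAGES = [100, 300, 900, INF]
CRAFTY_STAGES = [100, 900, INF]


def staged_windows(prev_score, initial_delta, fail_high, stages=CRAFTY_STAGES):
    """List the (alpha, beta) windows tried after the initial window fails.

    Assumption: only the bound on the failing side is relaxed, and each
    margin is applied relative to prev_score.
    """
    alpha, beta = prev_score - initial_delta, prev_score + initial_delta
    windows = []
    for margin in stages:
        if fail_high:
            beta = INF if margin == INF else prev_score + margin
        else:
            alpha = -INF if margin == INF else prev_score - margin
        windows.append((alpha, beta))
    return windows


if __name__ == "__main__":
    # Previous score +0.20, initial window +/- 0.30, root fails high.
    for window in staged_windows(20, 30, fail_high=True):
        print(window)
```

With a schedule like this, a position that suddenly finds a mate walks through every stage before the bound is fully open, which matches the extra re-searches mentioned in the quote.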

bob · Post by **bob** » Sat Mar 17, 2012 5:11 pm

diep wrote:
Cray Blitz still alive?

In Diep I start the search with (-inf, +inf) for a simple reason: the hash table will directly give back a good bound anyway, and you null-window around that.

I assume aspiration search is a great idea only when your PV is a total mess, which shouldn't happen with a proper YBW search.

Maybe that's why crafty doesn't suffer from the same problem there.

Depends on what you mean by "alive". The last 3 years of Cray Blitz development were lost. I have a version from 1991 or so that is on the net and compiles/runs just fine (no parallel search, however, as the Cray "task common" is not a standard Fortran feature).

I'm beginning to think that the +/- infinity idea is just as good as aspiration search. Most likely because we all use PVS (null-window, not null-move) anyway. I did not go all the way to infinity in my test results I posted here, but I will try it to see if it makes any difference. (I suspect there will be no change at all).
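
To make the PVS point concrete, here is a small sketch of principal variation search on a toy tree. It is a textbook-style illustration under stated assumptions (negamax convention, integer leaf scores, nested lists as the tree), not code from Crafty or any other engine. After the first move, every sibling is probed with a null window, so most of the tree never sees the root bounds at all, which is one plausible reason the root window width matters so little.

```python
INF = 10**9


def pvs(node, alpha, beta):
    """Principal variation search on a toy tree (negamax convention).

    A node is either an int (leaf score from the side to move's view)
    or a list of child nodes.
    """
    if isinstance(node, int):
        return node
    best = -INF
    for i, child in enumerate(node):
        if i == 0:
            # First move: search with the full window.
            score = -pvs(child, -beta, -alpha)
        else:
            # Later moves: null-window probe just above alpha.
            score = -pvs(child, -alpha - 1, -alpha)
            if alpha < score < beta:
                # The probe beat alpha: re-search with the full window.
                score = -pvs(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff
    return best


def negamax(node):
    """Plain negamax without pruning, used as a reference."""
    if isinstance(node, int):
        return node
    return max(-negamax(child) for child in node)


if __name__ == "__main__":
    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
    print(pvs(tree, -INF, INF), negamax(tree))  # infinite root window
    print(pvs(tree, 2, 4))                      # narrow window around the score
```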

mcostalba · Post by **mcostalba** » Sat Mar 17, 2012 6:28 pm

bob wrote: I'm beginning to think that the +/- infinity idea is just as good as aspiration search. Most likely because we all use PVS (null-window, not null-move) anyway. I did not go all the way to infinity in my test results I posted here, but I will try it to see if it makes any difference. (I suspect there will be no change at all).

A little detail: aspiration window should not start from iteration 1, in Stockfish it starts from iteration 5 and before we use +-inf.
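
As a sketch of that detail, the choice of root window per iteration might be written like this. The threshold of 5 comes from the post above; the function name, the `delta` parameter, and the value 16 in the example are hypothetical and not taken from the Stockfish source.

```python
INF = 10**9
FIRST_ASPIRATION_ITERATION = 5  # per the post: earlier iterations use +-inf


def root_window(iteration, prev_score, delta):
    """Return the (alpha, beta) window to open the given iteration with."""
    if iteration < FIRST_ASPIRATION_ITERATION or prev_score is None:
        return -INF, INF  # shallow scores are too unstable to aspire around
    return prev_score - delta, prev_score + delta


if __name__ == "__main__":
    print(root_window(1, None, 16))  # full window on the first iteration
    print(root_window(5, 30, 16))    # aspiration window from iteration 5
```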
