optimal aspiration window for stockfish question

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: test results

Post by bob »

Here is the output from BayesElo first:

Rank Name                 Elo    +    -  games  score  oppo.  draws
   2 Crafty-23.5R06-200  2650    4    4  30000    64%   2539    22%
   3 Crafty-23.5R06-24   2650    4    4  30000    64%   2539    22%
   4 Crafty-23.5R06-100  2649    4    4  30000    63%   2539    22%
   5 Crafty-23.5R06-30   2648    4    4  30000    63%   2539    22%
   6 Crafty-23.5R06-50   2648    4    4  30000    63%   2539    22%
   7 Crafty-23.5R06-300  2648    4    4  30000    63%   2539    22%
   8 Crafty-23.5R06-20   2648    4    4  30000    63%   2539    22%
   9 Crafty-23.5R06-10   2645    4    4  30000    63%   2539    22%
  10 Crafty-23.5-2       2645    4    4  30000    63%   2539    22%
  11 Crafty-23.5R06-1    2645    4    4  30000    63%   2539    22%
  12 Crafty-23.5R06-8    2644    4    4  30000    63%   2539    22%
  13 Crafty-23.5R06-5    2643    4    4  30000    63%   2539    22%
  14 Crafty-23.5-1       2641    4    4  30000    63%   2539    22%
  15 Crafty-23.5R06-2    2636    4    4  30000    62%   2539    22%


Versions 23.5-1 and 23.5-2 are simply two consecutive runs of the same version, to provide a reference result. The rest of the tests use version 23.5R06, where the -n suffix is the aspiration window (the delta value in the code posted yesterday); 23.5R06-1 means the aspiration window was +/- 1, with delta=1. 1 and 2 are a bit low, and by the time it gets to 10 it is pretty much optimal. Bigger doesn't seem to hurt at all, up to +/- 3.0 pawns... I was expecting a better result in the 20-40 range. The reason I ran the big numbers was to produce some worse results, so that there would be a recognizable curve with a clear optimal value and worse results on either side. Didn't get exactly what I expected, as you can see...
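As a rough way to read the table, here is a minimal sketch that checks which entries have +/- 4 error intervals that do not overlap the top entry's interval. The rows are copied from the BayesElo output above; the overlap test itself is only a crude heuristic chosen for illustration (it is not a proper significance test, and BayesElo's own likelihood-of-superiority output would be the better tool).

```python
# Rows copied from the BayesElo table above: (name, Elo). Every row has +4 / -4.
RESULTS = [
    ("Crafty-23.5R06-200", 2650), ("Crafty-23.5R06-24", 2650),
    ("Crafty-23.5R06-100", 2649), ("Crafty-23.5R06-30", 2648),
    ("Crafty-23.5R06-50", 2648), ("Crafty-23.5R06-300", 2648),
    ("Crafty-23.5R06-20", 2648), ("Crafty-23.5R06-10", 2645),
    ("Crafty-23.5-2", 2645), ("Crafty-23.5R06-1", 2645),
    ("Crafty-23.5R06-8", 2644), ("Crafty-23.5R06-5", 2643),
    ("Crafty-23.5-1", 2641), ("Crafty-23.5R06-2", 2636),
]
MARGIN = 4  # the "+" and "-" columns of the table


def intervals_overlap(elo_a, elo_b, margin=MARGIN):
    """True if [elo - margin, elo + margin] of the two entries intersect."""
    return abs(elo_a - elo_b) <= 2 * margin


best_elo = RESULTS[0][1]
# Entries whose interval is entirely below the top entry's interval.
separated = [name for name, elo in RESULTS if not intervals_overlap(best_elo, elo)]

# Spread between the two runs of the identical 23.5 version.
elo_by_name = dict(RESULTS)
baseline_gap = elo_by_name["Crafty-23.5-2"] - elo_by_name["Crafty-23.5-1"]

print(separated)
print(baseline_gap)
```

By this measure only 23.5R06-2 and one of the two baseline runs fall clear of the top entry, and the two identical baseline runs are themselves 4 Elo apart, which fits the observation that everything from about 5 up to 300 is hard to tell apart.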
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: test results

Post by Rémi Coulom »

bob wrote: Here is the output from BayesElo first: [...]
You should use CLOP :-)

Rémi
Karlo Bala
Posts: 373
Joined: Wed Mar 22, 2006 10:17 am
Location: Novi Sad, Serbia
Full name: Karlo Balla

Re: test results

Post by Karlo Bala »

bob wrote: Here is the output from BayesElo first: [...]
What is the average depth?
Best Regards,
Karlo Balla Jr.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: test results

Post by bob »

Rémi Coulom wrote:
bob wrote: Here is the output from BayesElo first: [...]
You should use CLOP :-)

Rémi
I intend to look at it. But I can test this so simply at present: I just say "runtest" and it runs the test with each parameter change as needed. Of course it is not optimally tuning a parameter, just using the choices I give it...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: test results

Post by bob »

Karlo Bala wrote:
bob wrote: Here is the output from BayesElo first: [...]
What is the average depth?
A quick look says that the average is in the 13-15 range. I looked at a few logs and there are plenty of re-searches going on, so the window is having a chance to exert influence.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: test results

Post by Rebel »

bob wrote:
Karlo Bala wrote:
bob wrote: Here is the output from BayesElo first: [...]
What is the average depth?
A quick look says that the average is in the 13-15 range. I looked at a few logs and there are plenty of re-searches going on, so the window is having a chance to exert influence.
What should not be forgotten is the saturation of the hash table. With an almost full hash table, re-searches become very expensive, and it would make sense to widen the window.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: test results

Post by bob »

Rebel wrote:
bob wrote:
Karlo Bala wrote:
bob wrote: Here is the output from BayesElo first: [...]
What is the average depth?
A quick look says that the average is in the 13-15 range. I looked at a few logs and there are plenty of re-searches going on, so the window is having a chance to exert influence.
What should not be forgotten is the saturation of the hash table. With an almost full hash table, re-searches become very expensive, and it would make sense to widen the window.
That's one problem I really don't have to deal with. I don't hash the q-search, so I really don't see much saturation. When I was testing this, I found that hashing the q-search reduced the total tree size by about 10%, but it slowed the program down by almost exactly the same amount. A wash. But not hashing the q-search greatly reduces the stress on the ttable, which tends to make it pay off in really long games without sufficient memory for the ttable. 8 gigs gives 512 million entries, which is a lot even for a long 40/2hr type game...

I run fast games with a modest hash on the cluster, just to try to keep things within perspective...
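The "8 gigs gives 512 million entries" figure can be checked with a little arithmetic. This is only a sketch: it assumes binary units (GiB, and "million" read as 2**20), and the per-entry size it derives is inferred from those two numbers, not something stated in the post.

```python
GIB = 1024 ** 3

hash_bytes = 8 * GIB            # "8 gigs", read as 8 GiB (assumption)
entries = 512 * 1024 ** 2       # "512 million", read as 512 * 2**20 (assumption)

# Entry size implied by the two figures above.
bytes_per_entry = hash_bytes // entries

print(bytes_per_entry)
```

Under those assumptions the figures imply 16 bytes per entry.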
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: optimal aspiration window for stockfish question

Post by diep »

bob wrote:
mcostalba wrote:
bob wrote: Absolutely no change, either up or down.
Have you tested at longer TC? It would be interesting to know how it scales at longer TC; with your cluster it should be feasible to test at, say, 1' TC.
I thought I had said that I tested up to 1 minute + 1 second. I did not go beyond that, and found absolutely no difference, which was surprising. The only thing I didn't like was the excessive re-searches to reach a mate. But at that point, it doesn't affect the game result at all of course...

When we first started doing this in Cray Blitz, we tried lots of ideas. Our aspiration window was roughly 1/3 of a pawn, so we relaxed alpha/beta to +1.0, then +3.0, then +9.0 and then all the way to infinite. In Crafty, I eliminated that +3.0 and have been using +1, +9 and +infinite for the longest time...

I am also running a test (since I was not testing anything else) on various aspiration window widths as well. I'll post those results when they finish...
Cray Blitz still alive?

In Diep I start the search with (-inf,+inf) for a simple reason: the hash table will directly give back a great bound anyway, and you null-window around that.

Only when your PV is a total mess do I think aspiration search is a great idea, and that shouldn't happen with a proper YBW search.

Maybe that's why Crafty doesn't suffer from the same problem there.
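The widening schedule bob describes for Crafty in the quote above (+1, +9, then +infinite) can be sketched roughly as follows. This is not Crafty's code: the function names, the centipawn units, the 33-centipawn starting window (borrowed from the "roughly 1/3 of a pawn" Cray Blitz figure), the reading of the offsets as relative to the previous iteration's score, and the fail-hard toy search are all assumptions made for illustration.

```python
INF = 1_000_000  # stand-in for an "infinite" bound, in centipawns (assumed value)


def make_toy_search(true_score):
    """Hypothetical fail-hard search stub: returns true_score clamped to [alpha, beta]."""
    def search(alpha, beta):
        return max(alpha, min(beta, true_score))
    return search


def relaxed(prev_score, step, sign):
    """New bound after a fail: prev_score +/- step, or fully open once step is INF."""
    return sign * INF if step >= INF else prev_score + sign * step


def aspiration_search(search, prev_score, window=33, steps=(100, 900, INF)):
    """Search around prev_score; after a fail, relax only the bound that failed."""
    alpha, beta = prev_score - window, prev_score + window
    low_steps, high_steps = iter(steps), iter(steps)
    researches = 0
    while True:
        score = search(alpha, beta)
        if score <= alpha and alpha > -INF:      # fail low: lower alpha
            alpha = relaxed(prev_score, next(low_steps), -1)
        elif score >= beta and beta < INF:       # fail high: raise beta
            beta = relaxed(prev_score, next(high_steps), +1)
        else:
            return score, researches
        researches += 1


# Previous iteration scored +20; the position is really worth +500.
print(aspiration_search(make_toy_search(500), prev_score=20))
```

With the toy search, a true score of +500 after a previous score of +20 fails high twice (at +53 and at +120) before the +920 bound contains it. The "excessive re-searches to reach a mate" mentioned in the quote come from the same mechanism: a mate score lies outside every finite step.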
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: optimal aspiration window for stockfish question

Post by bob »

diep wrote:
Cray Blitz still alive?

In Diep I start the search with (-inf,+inf) for a simple reason: the hash table will directly give back a great bound anyway, and you null-window around that.

Only when your PV is a total mess do I think aspiration search is a great idea, and that shouldn't happen with a proper YBW search.

Maybe that's why Crafty doesn't suffer from the same problem there.
Depends on what you mean by "alive". The last 3 years of Cray Blitz development were lost. I have a 1991 or so version that is on the net and compiles/runs just fine (no parallel search, however, as the Cray "task common" is not a standard Fortran feature).

I'm beginning to think that the +/- infinity idea is just as good as aspiration search. Most likely because we all use PVS (null-window, not null-move) anyway. I did not go all the way to infinity in my test results I posted here, but I will try it to see if it makes any difference. (I suspect there will be no change at all).
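To illustrate why PVS might make the root window matter less, here is a minimal fail-hard PVS sketch on a toy tree. Only the first move at each node is searched with the open window; the rest get a null window and are re-searched only if they beat alpha. The nested-list tree, the leaf convention (integer scores from the side to move's point of view) and all names are assumptions for illustration, not code from any engine in this thread.

```python
INF = 1_000_000


def negamax(node):
    """Plain negamax reference, no pruning."""
    if isinstance(node, int):
        return node
    return max(-negamax(child) for child in node)


def pvs(node, alpha, beta):
    """Fail-hard principal variation search over a nested-list tree."""
    if isinstance(node, int):
        return node  # leaf score, from the side to move's point of view
    first = True
    for child in node:
        if first:
            score = -pvs(child, -beta, -alpha)           # open window on first move
            first = False
        else:
            score = -pvs(child, -alpha - 1, -alpha)      # null-window probe
            if alpha < score < beta:
                score = -pvs(child, -beta, -alpha)       # probe beat alpha: re-search
        if score > alpha:
            alpha = score
        if alpha >= beta:
            break                                        # beta cutoff
    return alpha


tree = [[3, -2], [5, 1], [-4, 0]]
print(pvs(tree, -INF, INF), negamax(tree))
```

Both calls agree on this tree. Since most of the tree is already visited with null windows, it seems plausible that narrowing the root window adds little on top, which would match the flat test results.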
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: optimal aspiration window for stockfish question

Post by mcostalba »

bob wrote: I'm beginning to think that the +/- infinity idea is just as good as aspiration search. Most likely because we all use PVS (null-window, not null-move) anyway. I did not go all the way to infinity in my test results I posted here, but I will try it to see if it makes any difference. (I suspect there will be no change at all).
A little detail: the aspiration window should not start from iteration 1. In Stockfish it starts from iteration 5, and before that we use +-inf.
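The rule above can be sketched as a small helper that picks the root window for each iteration. The starting iteration of 5 is taken from the post; the function name, the half-width DELTA and the INF value are hypothetical, and this is not Stockfish's actual code.

```python
INF = 1_000_000          # stand-in for an infinite bound (assumed value)
ASPIRATION_START = 5     # first iteration that uses a window, per the post
DELTA = 16               # hypothetical half-width in centipawns


def root_window(iteration, prev_score, delta=DELTA):
    """Return (alpha, beta) for the root search of the given iteration."""
    if iteration < ASPIRATION_START:
        return -INF, INF                      # early iterations: full window
    return prev_score - delta, prev_score + delta


for it in (1, 4, 5):
    print(it, root_window(it, prev_score=30))
```

Iterations 1 to 4 search the full window, and from iteration 5 on the window is centered on the previous iteration's score.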