Milos wrote:
Adam Hair wrote:
Perhaps you should think about whether or not I would state something without anything to back up my statement. Also, I made no claim that the increase in Elo is linear as the number of doublings increases.
Take a moment to read what I wrote and realize that I am stating something that I have measured, and that I have no intention of making you look bad.
When I arrive home in a few hours, I will present my data.
I know you understand statistics and chess-testing methods fairly well, so I meant no disrespect and I don't think you invented the number. It's just highly probable that you made some error in your testing strategy that led to such a ridiculous result.
You see, there are quite a few papers, from 30 years ago up to recent ones, about search trees, diminishing returns, etc., and it's a well-established fact that a speed doubling provides no more than 70 Elo gain. And that's the number from the past (before strong LMR and pruning). With more selective algorithms, the EBF (effective branching factor) increases with depth due to the tree widening closer to the root. This makes the diminishing returns even stronger than before.
Therefore, numbers like 120 Elo per doubling would be about as improbable as measuring particles going faster than light. And even when really respectable institutions and people perform serious measurements, sooner or later an error is discovered.
There have been some hints from different authors that they were getting more than 70 Elo per doubling at shorter search times. I have seen a citation stating that Levy and Newborn gave a figure of 50 to 70 Elo per doubling in How Computers Play Chess, but I have not seen that confirmed in recent times. So I decided to see whether it was true, since I needed to know for another test.
The following list shows the results of my test of the approximate Elo gain for each doubling of thinking time, which should be roughly equal to the gain from doubling speed:
Base time control: 6 sec + 0.1 sec
(2) : 2 x (6 sec + 0.1 sec) = 12 sec + 0.2 sec
(4) : 24 sec + 0.4 sec
(8) : 48 sec + 0.8 sec
(16): 96 sec + 1.6 sec
QX6700 @ 3.05 GHz
100 positions per match, each position twice (reversed colors)
Rank Name Elo + - games score oppo. draws
1 Gull 1.0a(16) 359 13 13 2627 69% 219 31%
2 Komodo_2.03(8) 356 11 11 3319 72% 178 28%
3 Stockfish 2.1.1(8) 354 12 12 3154 74% 161 28%
4 Houdini_1.03a(4) 346 11 11 3807 73% 136 28%
5 Rybka_4.1(4) 310 11 11 3711 70% 134 28%
6 Critter_1.01(4) 304 11 11 3853 68% 140 29%
7 Hannibal 1.1(16) 296 18 18 1400 67% 167 28%
8 Stockfish 2.1.1(4) 271 11 11 3524 68% 120 28%
9 Komodo_2.03(4) 259 10 10 4037 63% 138 28%
10 Houdini_1.03a(2) 250 9 9 5471 67% 96 28%
11 Hiarcs_12(16) 245 17 17 1600 59% 176 27%
12 Gull 1.0a(8) 238 10 10 4166 57% 180 32%
13 Critter_1.01(2) 214 8 8 5967 63% 96 28%
14 Shredder_11(16) 207 18 18 1400 53% 179 24%
15 Crafty_23.4(16) 194 13 13 1935 56% 154 33%
16 Rybka_4.1(2) 190 8 8 5685 60% 93 29%
17 Hannibal 1.1(8) 182 9 9 4303 55% 143 34%
18 Fruit_051103(16) 177 16 16 1600 49% 185 32%
19 Stockfish 2.1.1(2) 160 9 9 4900 63% 48 29%
20 Houdini_1.03a 156 7 7 9522 73% -58 23%
21 Komodo_2.03(2) 149 8 8 5504 58% 71 27%
22 Gull 1.0a(4) 145 8 8 6409 55% 104 31%
23 Tornado 4.40(16) 108 10 10 4126 45% 143 29%
24 Hiarcs_12(8) 108 8 8 5400 52% 93 28%
25 Naum 2.0(16) 91 18 18 1400 36% 196 23%
26 Critter_1.01 89 7 7 10215 68% -78 22%
27 Hannibal 1.1(4) 85 8 8 6178 49% 87 31%
28 Crafty_23.4(8) 67 11 11 2999 49% 74 31%
29 Fruit_051103(8) 66 9 9 4769 43% 114 30%
30 Shredder_11(8) 65 8 8 6199 45% 100 28%
31 Ruffian 2.10(16) 46 12 12 2476 54% 14 27%
32 Gull 1.0a(2) 36 7 7 7389 57% -20 27%
33 Hiarcs_12(4) 26 8 8 5600 46% 57 24%
34 Rybka_4.1 21 7 7 9561 61% -76 23%
35 Komodo_2.03 13 7 7 7515 61% -87 24%
36 Stockfish 2.1.1 9 7 7 8154 64% -111 24%
37 Crafty_23.4(4) 5 10 10 3800 54% -23 29%
38 Hannibal 1.1(2) -16 7 7 7588 51% -28 29%
39 Gaviota_0.83(16) -21 8 8 5700 47% 9 23%
40 Naum 2.0(8) -26 8 8 5866 45% 13 27%
41 Smarthink_1.20(4) -29 10 10 3350 49% -25 28%
42 Tornado 4.40(8) -30 8 8 5797 44% 20 26%
43 Shredder_11(4) -37 8 8 6500 44% 16 23%
44 Ruffian 2.10(8) -41 11 11 3279 52% -58 26%
45 Fruit_051103(4) -49 7 7 7395 43% 8 26%
46 Crafty_23.4(2) -75 10 10 3374 49% -72 27%
47 Hiarcs_12(2) -79 9 9 5400 44% -29 23%
48 Gull 1.0a -88 7 7 8186 54% -128 24%
49 Naum 2.0(4) -121 8 8 5799 38% -14 23%
50 Gaviota_0.83(8) -127 9 9 4899 36% -3 20%
51 Ruffian 2.10(4) -140 12 12 2589 43% -90 24%
52 Smarthink_1.20(2) -148 11 11 2995 51% -164 26%
53 Shredder_11(2) -149 8 8 6209 38% -52 21%
54 Fruit_051103(2) -153 7 7 7591 41% -81 23%
55 Tornado 4.40(4) -168 8 8 6597 31% -3 21%
56 Hannibal 1.1 -168 7 7 7397 47% -153 24%
57 Hiarcs_12 -202 10 10 4500 46% -171 21%
58 Crafty_23.4 -204 9 9 4603 40% -129 22%
59 Gaviota_0.83(4) -229 9 9 5786 28% -24 16%
60 Naum 2.0(2) -250 9 9 5799 31% -83 19%
61 Ruffian 2.10(2) -284 12 12 2882 41% -205 20%
62 Smarthink_1.20 -294 11 11 3962 31% -135 17%
63 Shredder_11 -298 11 11 3812 39% -194 17%
64 Fruit_051103 -301 10 10 4200 38% -199 18%
65 Tornado 4.40(2) -311 10 10 4794 27% -92 16%
66 Gaviota_0.83(2) -342 11 11 4668 21% -55 13%
67 Naum 2.0 -416 14 14 2993 27% -186 13%
68 Gaviota_0.83 -437 15 15 2600 22% -158 12%
69 Tornado 4.40 -479 15 15 2800 20% -171 11%
70 Ruffian 2.10 -486 14 14 3377 18% -184 13%
The following is the Elo difference per doubling for each engine:
Rank Name Elo + - games score oppo. draws difference
1 Gull 1.0a(16) 359 13 13 2627 69% 219 31%
12 Gull 1.0a(8) 238 10 10 4166 57% 180 32% 121
22 Gull 1.0a(4) 145 8 8 6409 55% 104 31% 93
32 Gull 1.0a(2) 36 7 7 7389 57% -20 27% 109
48 Gull 1.0a -88 7 7 8186 54% -128 24% 124
7 Hannibal 1.1(16) 296 18 18 1400 67% 167 28%
17 Hannibal 1.1(8) 182 9 9 4303 55% 143 34% 114
27 Hannibal 1.1(4) 85 8 8 6178 49% 87 31% 97
38 Hannibal 1.1(2) -16 7 7 7588 51% -28 29% 101
56 Hannibal 1.1 -168 7 7 7397 47% -153 24% 152
11 Hiarcs_12(16) 245 17 17 1600 59% 176 27%
24 Hiarcs_12(8) 108 8 8 5400 52% 93 28% 137
33 Hiarcs_12(4) 26 8 8 5600 46% 57 24% 82
47 Hiarcs_12(2) -79 9 9 5400 44% -29 23% 105
57 Hiarcs_12 -202 10 10 4500 46% -171 21% 123
14 Shredder_11(16) 207 18 18 1400 53% 179 24%
30 Shredder_11(8) 65 8 8 6199 45% 100 28% 142
43 Shredder_11(4) -37 8 8 6500 44% 16 23% 102
53 Shredder_11(2) -149 8 8 6209 38% -52 21% 112
63 Shredder_11 -298 11 11 3812 39% -194 17% 149
15 Crafty_23.4(16) 194 13 13 1935 56% 154 33%
28 Crafty_23.4(8) 67 11 11 2999 49% 74 31% 127
37 Crafty_23.4(4) 5 10 10 3800 54% -23 29% 62
46 Crafty_23.4(2) -75 10 10 3374 49% -72 27% 80
58 Crafty_23.4 -204 9 9 4603 40% -129 22% 129
18 Fruit_051103(16) 177 16 16 1600 49% 185 32%
29 Fruit_051103(8) 66 9 9 4769 43% 114 30% 111
45 Fruit_051103(4) -49 7 7 7395 43% 8 26% 115
54 Fruit_051103(2) -153 7 7 7591 41% -81 23% 104
64 Fruit_051103 -301 10 10 4200 38% -199 18% 148
23 Tornado 4.40(16) 108 10 10 4126 45% 143 29%
42 Tornado 4.40(8) -30 8 8 5797 44% 20 26% 138
55 Tornado 4.40(4) -168 8 8 6597 31% -3 21% 138
65 Tornado 4.40(2) -311 10 10 4794 27% -92 16% 143
69 Tornado 4.40 -479 15 15 2800 20% -171 11% 168
31 Ruffian 2.10(16) 46 12 12 2476 54% 14 27%
44 Ruffian 2.10(8) -41 11 11 3279 52% -58 26% 87
51 Ruffian 2.10(4) -140 12 12 2589 43% -90 24% 99
61 Ruffian 2.10(2) -284 12 12 2882 41% -205 20% 144
70 Ruffian 2.10 -486 14 14 3377 18% -184 13% 202
25 Naum 2.0(16) 91 18 18 1400 36% 196 23%
40 Naum 2.0(8) -26 8 8 5866 45% 13 27% 117
49 Naum 2.0(4) -121 8 8 5799 38% -14 23% 95
60 Naum 2.0(2) -250 9 9 5799 31% -83 19% 129
67 Naum 2.0 -416 14 14 2993 27% -186 13% 166
39 Gaviota_0.83(16) -21 8 8 5700 47% 9 23%
50 Gaviota_0.83(8) -127 9 9 4899 36% -3 20% 106
59 Gaviota_0.83(4) -229 9 9 5786 28% -24 16% 102
66 Gaviota_0.83(2) -342 11 11 4668 21% -55 13% 113
68 Gaviota_0.83 -437 15 15 2600 22% -158 12% 95
2 Komodo_2.03(8) 356 11 11 3319 72% 178 28%
9 Komodo_2.03(4) 259 10 10 4037 63% 138 28% 97
21 Komodo_2.03(2) 149 8 8 5504 58% 71 27% 110
35 Komodo_2.03 13 7 7 7515 61% -87 24% 136
3 Stockfish 2.1.1(8) 354 12 12 3154 74% 161 28%
8 Stockfish 2.1.1(4) 271 11 11 3524 68% 120 28% 83
19 Stockfish 2.1.1(2) 160 9 9 4900 63% 48 29% 111
36 Stockfish 2.1.1 9 7 7 8154 64% -111 24% 151
4 Houdini_1.03a(4) 346 11 11 3807 73% 136 28%
10 Houdini_1.03a(2) 250 9 9 5471 67% 96 28% 96
20 Houdini_1.03a 156 7 7 9522 73% -58 23% 94
5 Rybka_4.1(4) 310 11 11 3711 70% 134 28%
16 Rybka_4.1(2) 190 8 8 5685 60% 93 29% 120
34 Rybka_4.1 21 7 7 9561 61% -76 23% 169
6 Critter_1.01(4) 304 11 11 3853 68% 140 29%
13 Critter_1.01(2) 214 8 8 5967 63% 96 28% 90
26 Critter_1.01 89 7 7 10215 68% -78 22% 125
The mean Elo increase per doubling of time is 117.94 +/- 54. If the Elo increases for the first doubling (from the base time control to 12 sec + 0.2 sec) are discarded, the mean increase per doubling is 108.16 +/- 38.52.
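For anyone who wants to check the arithmetic, here is a minimal sketch of how the per-doubling differences can be derived from the rating list. It uses only three engines, with ratings copied from the list above, so its mean will not match the figure for the full set. The function name `gains_per_doubling` and the use of the sample standard deviation as the spread measure are assumptions made for illustration; the post does not say how the +/- values were obtained, and pooling differences this way ignores the error bars on the individual ratings.

```python
from statistics import mean, stdev

# Elo ratings copied from the list above, ordered from the base time
# control up to the longest time control tested for that engine.
ratings = {
    "Stockfish 2.1.1": [9, 160, 271, 354],
    "Houdini_1.03a": [156, 250, 346],
    "Critter_1.01": [89, 214, 304],
}


def gains_per_doubling(ladder):
    """Return the Elo gain for each successive doubling of thinking time."""
    return [longer - shorter for shorter, longer in zip(ladder, ladder[1:])]


all_gains = []
for name, ladder in ratings.items():
    gains = gains_per_doubling(ladder)
    all_gains.extend(gains)
    print(f"{name}: {gains}")

# Pooled summary over this small subset only (not the full list).
print(f"mean gain: {mean(all_gains):.2f}, sample SD: {stdev(all_gains):.2f}")
```

Extending the `ratings` dictionary with the remaining engines gives the pooled figure for the whole list, and dropping the first element of each ladder excludes the first doubling.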
I will not claim that there are no problems with my testing methodology, but you will have to point out to me exactly what the problems are. I understand that these numbers do not correspond with your expectations, but this data does have a little bit of support in that it corresponds with the results found by the authors of Dirty and Spandrel.
If you do point out some plausible flaws, I would be willing to redo the study with corrections made to the methodology.