round robin or if you just run a gauntlet. I took the games of TL20090922
and TL20080620 that I posted recently and tried different scenarios.
Note: TL20090922 and TL20080620 did not play each other.
Here is a subset of the games:
Rank Name                       Elo    +    - games score oppo. draws
   1 Bright-0.4a3(2CPU)          86   55   55   110   65%   -17   27%
   2 Fruit23-EM64T               18   53   53   110   55%   -17   34%
   3 TwistedLogic20090922_x64     3   38   38   216   49%     8   34%
   4 Delfi 5.4 (2CPU)           -22   54   54   108   50%   -17   31%
   5 TwistedLogic20080620       -36   38   38   221   44%     9   33%
   6 Spike1.2 Turin             -48   52   52   109   45%   -17   42%

Rank Name                       Elo    +    - games score oppo. draws
   1 Bright-0.4a3(2CPU)         104   29   29   275   67%   -21   28%
   2 Fruit23-EM64T               11   28   28   275   52%    -2   35%
   3 TwistedLogic20090922_x64     3   32   32   216   49%     9   34%
   4 Delfi 5.4 (2CPU)           -34   28   28   270   44%     7   33%
   5 TwistedLogic20080620       -37   32   32   221   44%     9   33%
   6 Spike1.2 Turin             -47   28   28   273   41%    10   40%

Rank Name                       Elo    +    - games score oppo. draws
   1 Stockfish_151_x64(2CPU)    205   60   60   108   86%   -77   22%
   2 Rybkav2.3.2a.w32           156   56   56   108   81%   -77   25%
   3 Bright-0.4a3(2CPU)          27   51   51   110   65%   -78   27%
   4 Fruit23-EM64T              -43   49   49   110   55%   -78   34%
   5 TwistedLogic20090922_x64   -44   28   28   324   40%    26   34%
   6 Delfi 5.4 (2CPU)           -82   50   50   108   50%   -77   31%
   7 Spike1.2 Turin            -108   48   48   109   45%   -77   42%
   8 TwistedLogic20080620      -110   30   30   329   33%    25   27%

Rank Name                       Elo    +    - games score oppo. draws
   1 Stockfish_151_x64(2CPU)    194   29   29   377   79%   -28   27%
   2 Rybkav2.3.2a.w32           137   27   27   377   71%   -20   31%
   3 Bright-0.4a3(2CPU)          34   35   35   218   48%    43   31%
   4 Fruit23-EM64T              -34   36   36   218   39%    43   33%
   5 TwistedLogic20090922_x64   -43   28   28   324   40%    25   34%
   6 Delfi 5.4 (2CPU)           -80   37   37   216   34%    44   28%
   7 Spike1.2 Turin             -98   37   37   217   31%    44   33%
   8 TwistedLogic20080620      -110   29   29   329   33%    25   27%

Rank Name                       Elo    +    - games score oppo. draws
   1 Stockfish_151_x64(2CPU)    232   65   65   108   86%   -49   22%
   2 Rybkav2.3.2a.w32           183   60   60   108   81%   -49   25%
   3 Rybka v2.2n2x64(2CPU)      179   63   63    92   81%   -44   29%
   4 Thinker54d Inertx64(2CPU)  178   61   61   106   79%   -49   21%
   5 Stockfish_14_x64_ja(2CPU)  169   58   58   108   81%   -49   30%
   6 Glaurung22_x64_ja(2CPU)    167   59   59   108   80%   -49   29%
   7 Toga141se-2cpu              92   55   55   107   70%   -49   30%
   8 Bright-0.4a3(2CPU)          54   54   54   110   65%   -50   27%
   9 Crafty_230_x64_ja(2CPU)     -8   52   52   106   56%   -50   39%
  10 Fruit23-EM64T              -15   52   52   110   55%   -50   34%
  11 TwistedLogic20090922_x64   -20   17   17  1125   47%     5   31%
  12 Naum2.0_x64                -31   53   53   106   53%   -49   36%
  13 Delfi 5.4 (2CPU)           -51   53   53   109   50%   -49   30%
  14 Scorpio_21_x64_ja(2cpu)    -71   54   54   108   47%   -49   25%
  15 Frenzee_Feb08_x64          -72   54   54   107   47%   -49   28%
  16 TwistedLogic20080620       -79   17   17  1119   40%     2   28%
  17 Spike1.2 Turin             -80   51   51   109   45%   -49   42%
  18 Et_Chess_130108           -100   55   55   105   43%   -50   24%
  19 BugChess2_V1_6_3_x64      -118   55   55   108   41%   -49   24%
  20 Booot415                  -132   54   54   108   38%   -49   31%
  21 Colossus2008b             -154   53   53   108   34%   -49   35%
  22 Movei00_8_438(10 10 10)   -157   54   54   105   34%   -49   35%
  23 Alaric707                 -167   55   55   108   34%   -49   25%

opponents have played each other.
Rank Name                       Elo    +    - games score oppo. draws
   1 Stockfish_151_x64(2CPU)    213   21   21   915   76%    12   28%
   2 Rybka v2.2n2x64(2CPU)      208   22   22   779   72%    45   33%
   3 Rybkav2.3.2a.w32           176   20   20   917   72%     8   31%
   4 Thinker54d Inertx64(2CPU)  172   25   25   504   57%   121   39%
   5 Stockfish_14_x64_ja(2CPU)  161   20   20   862   69%    22   33%
   6 Glaurung22_x64_ja(2CPU)    120   31   31   336   53%    97   35%
   7 Toga141se-2cpu             111   25   25   507   47%   129   37%
   8 Bright-0.4a3(2CPU)          81   27   27   475   54%    53   32%
   9 Fruit23-EM64T               -6   27   27   475   41%    63   33%
  10 TwistedLogic20090922_x64   -20   17   17  1125   47%     5   31%
  11 Crafty_230_x64_ja(2CPU)    -25   33   33   306   32%   106   32%
  12 Naum2.0_x64                -29   33   33   306   31%   106   32%
  13 Delfi 5.4 (2CPU)           -58   28   28   470   33%    70   28%
  14 Spike1.2 Turin             -65   27   27   472   32%    70   33%
  15 Frenzee_Feb08_x64          -78   36   36   307   27%   105   22%
  16 TwistedLogic20080620       -79   17   17  1118   40%     2   28%
  17 Scorpio_21_x64_ja(2cpu)    -98   37   37   307   26%   104   18%
  18 Et_Chess_130108           -100   55   55   105   43%   -50   24%
  19 Booot415                  -105   36   36   308   23%   105   25%
  20 BugChess2_V1_6_3_x64      -118   54   54   108   41%   -50   24%
  21 Alaric707                 -146   44   44   215   23%    72   22%
  22 Colossus2008b             -155   53   53   108   34%   -50   35%
  23 Movei00_8_438(10 10 10)   -158   54   54   105   34%   -50   35%

in the last set.
It seems that, in general, a gauntlet gives you the same information as a
round robin tournament. The two would differ somewhat if your engine
performs poorly against one opponent that is itself very weak against the
other engines. But how likely is that?
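
A rough way to check this is to compare the Elo gap between the two TwistedLogic versions in each list. The Python sketch below does that with the Elo values copied from the six lists above. The grouping into "gauntlet only" and "with opponent games" pairs is an assumption based on the game counts (the opponents have far more games in the second list of each pair), and the names `SETS`, `gap`, and `gap_shift` are only illustrative.

```python
# Elo of (TwistedLogic20090922_x64, TwistedLogic20080620), copied from the
# six rating lists in the order they appear. The set and scenario labels
# are assumptions inferred from the game counts, not captions from the post.
SETS = {
    "4 opponents": {
        "gauntlet only": (3, -36),
        "with opponent games": (3, -37),
    },
    "6 opponents": {
        "gauntlet only": (-44, -110),
        "with opponent games": (-43, -110),
    },
    "21 opponents": {
        "gauntlet only": (-20, -79),
        "with opponent games": (-20, -79),
    },
}


def gap(pair):
    """Elo difference: newer version minus older version."""
    new, old = pair
    return new - old


def gap_shift(scenarios):
    """Change in the gap once the opponents' own games are included."""
    return gap(scenarios["with opponent games"]) - gap(scenarios["gauntlet only"])


for name, scenarios in SETS.items():
    g = gap(scenarios["gauntlet only"])
    r = gap(scenarios["with opponent games"])
    print(f"{name}: gauntlet {g:+d}, with opponent games {r:+d}, "
          f"shift {gap_shift(scenarios):+d}")
```

With these numbers the gap between the two versions moves by at most 1 Elo within each pair (39 vs 40, 66 vs 67, 59 vs 59), which is far inside the error bars of 17 to 38 Elo listed for the TwistedLogic entries.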