https://www.sp-cc.de/drawkiller-openings.htm
I added 2 sets: Drawkiller balanced and Drawkiller balanced small500. Both contain only lines whose end-position evals lie within a very small interval of [-0.09;+0.09]. This leads to slightly higher draw-rates, but also to a wider Elo-spread in the engine results.
Here is the test run of the new Drawkiller balanced set (plus test runs of the Drawkiller tournament, Stockfish Framework 8moves and GM-4moves sets for comparison).
3 engines (Stockfish 10, Houdini 6 and Komodo 12) played a round-robin with 500 games in each head-to-head, so each engine played 1000 games. For each game, one opening line was chosen at random by the LittleBlitzer GUI.
Singlecore, 3'+1'', LittleBlitzerGUI, no ponder, no bases, 256 MB Hash, i7-6700HQ 2.6GHz Notebook (Skylake CPU), Windows 10 64bit
In the Drawkiller balanced sets, all end-position evals of the opening lines (analyzed by Komodo) lie in a very small interval of [-0.09;+0.09]. The idea is that this should lead to a wider Elo-spread of the engine ratings, which makes the engine rankings much more statistically reliable (or a much lower number of games is needed to get the results outside of the error bars). On the other hand, this concept of course leads to slightly higher draw-rates...
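The selection idea can be sketched in a few lines of Python. This is only an illustration under assumed names (the `filter_balanced` helper and the `(line, eval)` tuples are hypothetical, not the tool actually used to build the sets):

```python
# Hypothetical sketch of the "balanced" filtering idea: keep only opening
# lines whose end-position eval falls inside the interval [-0.09, +0.09].
EVAL_MIN, EVAL_MAX = -0.09, 0.09

def filter_balanced(lines_with_evals):
    """lines_with_evals: iterable of (opening_line, end_position_eval) tuples."""
    return [line for line, ev in lines_with_evals
            if EVAL_MIN <= ev <= EVAL_MAX]

# Toy example with made-up evals:
candidates = [("1.e4 e5 2.Nf3", 0.25),
              ("1.d4 d5 2.c4", 0.05),
              ("1.e4 c5 2.Nf3", -0.07)]
print(filter_balanced(candidates))  # keeps only the two lines inside the interval
```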
Let's see if it worked:
Drawkiller balanced:
Code:
  Program              Elo    +   -  Games  Score   Av.Op.  Draws
1 Stockfish 10 bmi2 : 3506   11  11   1000  70.9 %   3347   36.2 %
2 Houdini 6 pext    : 3392   11  11   1000  48.5 %   3404   40.8 %
3 Komodo 12 bmi2    : 3302   11  11   1000  30.6 %   3449   36.6 %
Elo-spreading (1st to last): 204 Elo
Draws: 37.9%
Drawkiller tournament:
Code:
  Program              Elo    +   -  Games  Score   Av.Op.  Draws
1 Stockfish 10 bmi2 : 3494   11  11   1000  68.9 %   3353   34.2 %
2 Houdini 6 pext    : 3387   11  11   1000  47.3 %   3407   38.2 %
3 Komodo 12 bmi2    : 3320   11  11   1000  33.8 %   3440   36.0 %
Elo-spreading (1st to last): 174 Elo
Draws: 36.1%
GM_4moves:
Code:
  Program              Elo    +   -  Games  Score   Av.Op.  Draws
1 Stockfish 10 bmi2 : 3475   11  11   1000  65.4 %   3363   53.2 %
2 Houdini 6 pext    : 3381   10  10   1000  46.0 %   3410   59.9 %
3 Komodo 12 bmi2    : 3345   10  10   1000  38.5 %   3428   55.9 %
Elo-spreading (1st to last): 130 Elo
Draws: 56.3%
Stockfish framework 8moves:
Code:
  Program              Elo    +   -  Games  Score   Av.Op.  Draws
1 Stockfish 10 bmi2 : 3463   11  11   1000  63.0 %   3369   59.7 %
2 Houdini 6 pext    : 3388   10  10   1000  47.5 %   3406   64.2 %
3 Komodo 12 bmi2    : 3349   10  10   1000  39.5 %   3425   60.1 %
Elo-spreading (1st to last): 114 Elo
Draws: 61.3%
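Pulling the first-to-last ratings out of the four tables above, a few lines of Python reproduce the Elo-spread figures:

```python
# Ratings of the 1st and 3rd engine per openings set, taken from the tables above.
results = {
    "Drawkiller balanced":        (3506, 3302),
    "Drawkiller tournament":      (3494, 3320),
    "GM_4moves":                  (3475, 3345),
    "Stockfish framework 8moves": (3463, 3349),
}

# Elo-spread = rating of 1st engine minus rating of last engine.
for name, (first, last) in results.items():
    print(f"{name}: {first - last} Elo")
# → 204, 174, 130, 114 Elo respectively
```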
1) The Drawkiller balanced idea was a success. The draw-rate is a little higher than with Drawkiller tournament (that is the price we pay for 2)), but look at point 2) and note that even this slightly higher draw-rate is still much, much lower than the draw-rate of any other non-Drawkiller openings set...
2) The Elo-spread with Drawkiller balanced was measurably higher than with any other openings set. That makes the engine rankings much more statistically reliable. Or a much lower number of games is needed to get the results outside of the error bars:
Example: Compared to the result with Stockfish framework 8moves openings, the Elo-spread of Drawkiller balanced is nearly doubled, which means you can tolerate a doubled error bar for the same statistical reliability of the engine rankings in a tournament / rating list. Note that you have to play 4x more games to halve the size of an error bar! That means, if you use Drawkiller balanced openings, you only have to play about 25%-30% of the games you would have to play with the Stockfish Framework 8moves openings for the same statistical quality of the engine rankings (!!!) - how awesome is that?!?
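The 25%-30% figure follows from the error bar shrinking with the square root of the number of games: to resolve the same ranking, the error bar only needs to scale with the Elo-spread, so the required games scale with the square of the spread ratio. A quick sanity check, using the spreads measured above:

```python
spread_balanced = 204   # Drawkiller balanced, from the run above
spread_8moves   = 114   # Stockfish framework 8moves, from the run above

# Error bar ~ 1/sqrt(games), and the tolerable error bar is proportional
# to the Elo-spread, so:
#   games_balanced / games_8moves = (spread_8moves / spread_balanced)^2
games_ratio = (spread_8moves / spread_balanced) ** 2
print(f"{games_ratio:.0%}")  # → 31%, consistent with the 25%-30% claim
```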