Final chapter of development (from SALC to Drawkiller to Armageddon to Blackmageddon) and my legacy...
Idea, development and testing by Stefan Pohl
On my website:
Why is Blackmageddon the end of the development of my opening sets? Because the main goal was to lower the draw rate in computer chess (which climbs higher and higher as the machines get faster and the engines play stronger), while keeping a good Elo spread of the results in engine tests and tournaments. And with Blackmageddon, I reached that goal perfectly:
The number of draws is exactly 0, so the draw rate is 0% and cannot be lowered any further. The Elo spread of the results is incredibly high (roughly double that of normal openings, on a level with the Drawkiller openings). And the whitescore and blackscore are much more stable and balanced than in Armageddon (that is the huge problem of the Armageddon openings), which is the reason I canceled the Armageddon openings and removed them from my website.
And Blackmageddon is not a subset of chess (unlike my SALC openings, which only use positions with the kings on opposite sides of the chessboard, or the Armageddon concept by Larry Kaufman), and the Blackmageddon openings are not virtual (as the Drawkiller openings are).
So I see no way to get any better results than with Blackmageddon...
What is Blackmageddon? Blackmageddon means that the following 4-move line is placed in front of normal chess openings (5 moves (10 plies) taken from human games in the Megabase, both players rated 2300 Elo or more):
1. a4 Nc6 2. a5 Nxa5 3. Na3 Nc6 4. Nb1 Nb8
which means that black is always one pawn ahead (white has no pawn on a2). And all draws are counted as a win for white.
Here is an example of a full Blackmageddon opening line:
1. a4 Nc6 2. a5 Nxa5 3. Na3 Nc6 4. Nb1 Nb8 5. d4 Nf6 6. c4 e6 7. Nf3 d5 8. Nc3 Be7 9. Bg5 h6
(All end positions of the lines were evaluated by Komodo 13.1 (30''/position on a quad-core).)
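The construction of a full line described above can be sketched in a few lines of Python. This is only an illustration, not the tool used to build the sets: the function names `strip_move_numbers` and `blackmageddon_line` are made up for this example, and it assumes the opening is given as plain numbered movetext with spaces after the move numbers and without comments or variations.

```python
import re

# The fixed 4-move line from the text (8 plies).
PREFIX = "1. a4 Nc6 2. a5 Nxa5 3. Na3 Nc6 4. Nb1 Nb8"


def strip_move_numbers(movetext):
    """Return the moves of a numbered movetext as a flat list of plies."""
    return [tok for tok in movetext.split() if not re.fullmatch(r"\d+\.+", tok)]


def blackmageddon_line(opening):
    """Put the 4-move line in front of a normal opening and renumber the moves."""
    plies = strip_move_numbers(PREFIX) + strip_move_numbers(opening)
    parts = []
    for i in range(0, len(plies), 2):
        # One move number per pair of plies (white move, black move).
        parts.append(f"{i // 2 + 1}. " + " ".join(plies[i:i + 2]))
    return " ".join(parts)


print(blackmageddon_line("1. d4 Nf6 2. c4 e6 3. Nf3 d5 4. Nc3 Be7 5. Bg5 h6"))
```

With the 5-move opening used here, the printed line is the same as the example line above.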
Armageddon is the opposite: white has an advantage (black has no a7-pawn, or black is not allowed to castle, or black is not allowed to castle short). And all draws are counted as a win for black.
Why is Armageddon bad and why is Blackmageddon better?
Because in Armageddon white has two advantages and black none: white has the advantage of the first move and the advantage given by the Armageddon opening. The problem is that these two advantages make the whitescore climb and climb as the thinking time gets longer (or the machine gets faster). And a whitescore that is too high (with a blackscore that is too low) means that the Elo spread of the results gets smaller and smaller. In the worst case, a whitescore of 100%, the Elo spread is 0: white wins every game, so every engine head-to-head ends 50%-50%.
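The worst case can be checked with a tiny calculation. This is a sketch under one assumption that the text does not state explicitly: in a head-to-head, each engine plays white in half of the games (for example, every opening is played twice with reversed colours). The function name `score_of_engine_a` is made up for this example.

```python
def score_of_engine_a(results):
    """Score fraction of engine A.

    results is a list of (a_plays_white, white_points) tuples, where
    white_points is 1.0, 0.5 or 0.0 from white's point of view.
    """
    points = sum(w if a_white else 1.0 - w for a_white, w in results)
    return points / len(results)


# 500 openings, each played with both colours, and white wins every game.
games = [(True, 1.0), (False, 1.0)] * 500
print(score_of_engine_a(games))  # 0.5
```

Whatever the real strength difference is, both engines end on 50%, so the result carries no information about which engine is stronger.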
In Blackmageddon, white has the advantage of the first move, but black has the advantage of being one pawn ahead. In my testing, that makes the whitescore/blackscore balance much more stable than in Armageddon. So Blackmageddon will keep working properly on faster machines in the future. Armageddon will not. And Blackmageddon is not a subset of chess: all castlings are allowed for both sides.
Of course, the engines do not know that they are playing Blackmageddon when using Blackmageddon openings. But that is no problem. You should just set the contempt of all engines to 0 or very close to 0. Then the engine playing white behaves as if it had a huge negative contempt, because black is one pawn ahead and the engine's evaluation is clearly negative. And the engine playing black behaves as if it had a huge positive contempt, because black is one pawn ahead and its evaluation is clearly positive. So white will try to reach a forced draw and black will try to avoid it...
I added some tools to the Blackmageddon download: one converts a result.pgn file of played games to Blackmageddon scoring (blackmageddonize_classical.bat), and one shows a live Blackmageddon scoring from a result PGN file (livescoring_classical.bat); the latter can be used while the GUI is still running a tournament. That means: all 1/2-1/2 results are changed to 1-0, and the livescoring tool starts ORDO and prints a rating list of the blackmageddonized games on the screen. If your engine games are stored in a file with a different name (not results.pgn), just change the name in the .bat files with an editor (use search & replace).
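For readers who do not use the .bat files, the classical conversion can be sketched in Python. This is not the code of blackmageddonize_classical.bat, only an illustration of the rule "all 1/2-1/2 results are changed to 1-0". The function name and the sample game are made up, and the sketch assumes that the string 1/2-1/2 appears only as a result (in the Result tag and at the end of the moves), not inside comments.

```python
def blackmageddonize_classical(pgn_text):
    """Count every draw as a win for white: 1/2-1/2 becomes 1-0."""
    # A plain text replacement changes both the [Result "..."] tag and the
    # result token at the end of the moves. Assumption: "1/2-1/2" does not
    # occur anywhere else in the file (for example inside a comment).
    return pgn_text.replace("1/2-1/2", "1-0")


sample = (
    '[Event "test"]\n'
    '[White "Engine A"]\n'
    '[Black "Engine B"]\n'
    '[Result "1/2-1/2"]\n'
    "\n"
    "1. a4 Nc6 2. a5 Nxa5 3. Na3 Nc6 4. Nb1 Nb8 1/2-1/2\n"
)

converted = blackmageddonize_classical(sample)
print(converted)
```

To convert a whole file, read it into a string, pass it through the function and write the result to a new file, so the original games stay untouched.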
And I added two tools that double all games won by white, so that ORDO or bayeselo count every white win as 2 points, and every 1/2-1/2 as a 1-0 (as in classical): blackmageddon_advanced and livescoring_advanced. The idea was that in classical Blackmageddon, a draw and a white win get the same score (1-0). If every white win scores 2 points, it is "worth more" than a draw.
But in my testing, the difference between classical and advanced counting was very small. So I see no need to use advanced scoring, but feel free to use it. Note that the number of games gets higher, because all games won by white are doubled, which shrinks the error bar a little when ORDO does its calculation (but in Blackmageddon white wins are pretty rare, of course, because black is one pawn ahead when the game starts).
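The advanced counting can be sketched the same way. Again, this is not the code of the blackmageddon_advanced tool, only an illustration of the rule described above: real white wins are stored twice, draws become a single 1-0, and black wins stay as they are. The function names and the sample games are made up, and the sketch assumes that every game in the PGN file starts with an [Event tag.

```python
import re


def split_games(pgn_text):
    """Split PGN text into games (assumes each game starts with an [Event tag)."""
    return [g for g in re.split(r"(?m)^(?=\[Event )", pgn_text) if g.strip()]


def blackmageddon_advanced(pgn_text):
    """Double real white wins, turn draws into one 1-0, keep black wins."""
    out = []
    for game in split_games(pgn_text):
        game = game.rstrip() + "\n\n"
        if '[Result "1-0"]' in game:
            out.extend([game, game])  # real white win: stored twice
        elif '[Result "1/2-1/2"]' in game:
            out.append(game.replace("1/2-1/2", "1-0"))  # draw: one white win
        else:
            out.append(game)  # black win (or unfinished game): unchanged
    return "".join(out)


def make_game(result):
    """Build a tiny sample game with the given result (illustration only)."""
    return f'[Event "test"]\n[Result "{result}"]\n\n1. a4 Nc6 {result}\n\n'


sample = make_game("1-0") + make_game("1/2-1/2") + make_game("0-1")
advanced = blackmageddon_advanced(sample)
print(len(split_games(sample)), "games in,", len(split_games(advanced)), "games out")
```

The three sample games become four, which is the growth in the number of games mentioned above.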
Because of this, all my Blackmageddon test results below were done with classical counting. And, as you can see below, the results are overwhelming.
(asmFish 170426 vs. Komodo 10.4, 5'+3'' time control, single core, no ponder, no endgame bases, LittleBlitzerGUI, 1000 games per test run (!), except Noomen Gambit-Lines (only 246 positions, so 492 games were played) and Noomen TCEC Superfinal (only 100 positions, so 200 games were played)). First score: asmFish, second score: Komodo.
Stockfish Framework standard 8 move openings: Score 60.3% - 39.7%, draws: 63.4%
FEOBOS v20 contempt 5 top 500 openings: Score 58.7% - 41.3%, draws: 64.1%
HERT 500 set: Score 60.6% - 39.4%, draws: 60.4%
Noomen Gambit-Lines: Score 59.1% - 40.9%, draws: 59.3%
4 GM-moves short book: Score 60.5% - 39.5%, draws: 57.1%
Noomen TCEC Superfinal (Season 9+10): Score 62.5% - 37.5%, draws: 50.0%
SALC V5 half-closed: Score 61.6% - 38.4%, draws: 49.2%
SALC V5 full-closed 500 positions: Score 66.5% - 33.5%, draws: 47.7%
Drawkiller (normal set): Score 65.3% - 34.7%, draws: 33.5%
Drawkiller (tournament set): Score 65.3% - 34.7%, draws: 33.5%
Drawkiller (small 500 positions set): Score 66.4% - 33.6%, draws: 30.5%
Drawkiller balanced (small 500 positions set): Score 69.3% - 30.7%, draws: 35.2%
Drawkiller balanced: Score 69.4% - 30.6%, draws: 36.4%
Drawkiller balanced big (15962 positions): Score 67.4% - 32.6%, draws: 38.8%
Drawkiller EloZoom (small 500 positions set): Score 72.0% - 28.0%, draws: 34.6%
Drawkiller EloZoom: Score 73.2% - 26.8%, draws: 36.5%
Drawkiller EloZoom big (20043 positions): Score 69.2% - 30.8%, draws: 40.7%
Blackmageddon (500 positions set): Score 70.1% - 29.9%, draws: 0% (whitescore: 52.9%)
Blackmageddon (5000 positions set): Score 70.1% - 29.9%, draws: 0% (whitescore: 53.3%)
Blackmageddon (10000 positions set): Score 69.5% - 30.5%, draws: 0% (whitescore: 54.5%)
Blackmageddon is the end of my openings development: 0% draws, and a fantastic Elo spread of the results: nearly doubled (around 70%-30%) compared to classical opening sets like FEOBOS, HERT or the Stockfish Framework openings (all around 60%-40%). Note that a doubled Elo spread means you have to play only around 25% of the games to get the results of an engine test or engine tournament out of the error bar, because you have to play around 4x more games to halve the error bar!
To be clear here: Blackmageddon is not just about avoiding draws and ending the draw death of computer chess. It is much more: when using Blackmageddon openings, you get the same statistical stability of the engine rankings in an engine tournament or engine rating list while playing only 25% of the games you would have to play with a classical opening set. So you need only 25% of the time on your PC for the same quality of results/rankings. Or you can play with 4x more thinking time instead, for higher-quality chess.
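The arithmetic behind the 25% figure can be written down as a short sketch. It uses the standard logistic Elo formula to turn a score into an Elo difference, and it rests on two simplifying assumptions, both taken from the reasoning above rather than measured here: the error bar shrinks with the square root of the number of games, and at the same number of games the error bar is about the same for both opening sets. The function names are made up for this example.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction (standard logistic formula)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def relative_games_needed(spread_factor):
    """Games needed relative to the baseline for the same certainty.

    Assumption: the error bar is proportional to 1/sqrt(games), so a
    difference that is spread_factor times larger needs 1/spread_factor**2
    as many games to stand out of the error bar equally clearly.
    """
    return 1.0 / spread_factor ** 2


classical = elo_diff(0.60)      # classical sets: around 60%-40%
blackmageddon = elo_diff(0.70)  # Blackmageddon: around 70%-30%
factor = blackmageddon / classical

print(round(classical), round(blackmageddon), round(factor, 2))
print(relative_games_needed(2.0))  # 0.25
```

A 60% score is roughly 70 Elo and a 70% score is roughly 147 Elo, so the spread is indeed about doubled, and doubling it cuts the number of games needed to a quarter.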
And the number of draws is 0. Always. How awesome is that?
And this is why Blackmageddon is my legacy and the end of my development of opening sets for computer chess. So the journey, which started with the first SALC openings in 2015, ends here.
I want to thank Hauke Lutz, who helped me a lot in these years with building SALC and Drawkiller, and who built the Drawkiller EloZoom openings.
And I want to thank Larry Kaufman, who had the idea of building Armageddon openings for computer chess (opening lines which give a measurable advantage to white and count all draws as a win for black). Even though this idea does not work so well, because of unstable and climbing whitescores, Blackmageddon is a further development of it. And in an evolutionary process, each step on the stairway is built on the step below...
(C) 2019 Stefan Pohl (SPCC)
Additional 3-engine test:
3 engines played a round robin (Stockfish 10, Houdini 6 and Komodo 12), with 500 games in each head-to-head, so each engine played 1000 games. For each game, one opening line was chosen at random by the LittleBlitzerGUI.
Singlecore, 3'+1'', LittleBlitzerGUI, no ponder, no bases, 256 MB Hash, i7-6700HQ 2.6GHz Notebook (Skylake CPU), Windows 10 64bit
Here is the same result for Blackmageddon: 0% draws (of course) and the best Elo spread of all opening sets...
Blackmageddon (5000 positions set):
  Program              Elo    +    -  Games   Score  Av.Op.   Draws
1 Stockfish 10 bmi2 : 3421   12   12   1000  73.7 %    3239   0.0 %
2 Houdini 6 pext    : 3270   10   10   1000  44.1 %    3315   0.0 %
3 Komodo 12 bmi2    : 3209   11   11   1000  32.2 %    3345   0.0 %
Draws: 0% (whitescore 54.7%)
  Program              Elo    +    -  Games   Score  Av.Op.   Draws
1 Stockfish 10 bmi2 : 3506   11   11   1000  70.9 %    3347  36.2 %
2 Houdini 6 pext    : 3392   11   11   1000  48.5 %    3404  40.8 %
3 Komodo 12 bmi2    : 3302   11   11   1000  30.6 %    3449  36.6 %
  Program              Elo    +    -  Games   Score  Av.Op.   Draws
1 Stockfish 10 bmi2 : 3494   11   11   1000  68.9 %    3353  34.2 %
2 Houdini 6 pext    : 3387   11   11   1000  47.3 %    3407  38.2 %
3 Komodo 12 bmi2    : 3320   11   11   1000  33.8 %    3440  36.0 %
  Program              Elo    +    -  Games   Score  Av.Op.   Draws
1 Stockfish 10 bmi2 : 3475   11   11   1000  65.4 %    3363  53.2 %
2 Houdini 6 pext    : 3381   10   10   1000  46.0 %    3410  59.9 %
3 Komodo 12 bmi2    : 3345   10   10   1000  38.5 %    3428  55.9 %
Stockfish framework 8moves:
  Program              Elo    +    -  Games   Score  Av.Op.   Draws
1 Stockfish 10 bmi2 : 3463   11   11   1000  63.0 %    3369  59.7 %
2 Houdini 6 pext    : 3388   10   10   1000  47.5 %    3406  64.2 %
3 Komodo 12 bmi2    : 3349   10   10   1000  39.5 %    3425  60.1 %