Sven wrote: One way to avoid "Draw Death" would be the appearance of one single engine that plays significantly stronger than the current top engines, say by 100 or 200 Elo points. As of today this seems unlikely, but can we really exclude it? In the past there were already times of stagnation, then suddenly a big improvement came up when nobody thought it was possible.
I think today we are far away yet from knowing what the best openings really are, and there are also some endgame types that are not always evaluated correctly by top engines. Think about fortresses for instance, or about complex endgames with rooks, minor pieces and pawns for which we still lack any proven theoretical knowledge due to lack of EGTBs for more than 6 or 7 pieces. For me these are two good reasons for not believing that we are already close to "perfect play" in computer chess. It might be "perfect-looking play" only. All we know, in my opinion, is that we have a couple of very strong engines that are really hard to beat with today's chess programming knowledge and hardware.
Once that new mega engine appears (and as a programmer I hope this will happen one day, although it will probably not be my own engine ...) I expect that parts of these mathematical models may become outdated, or will have to be reviewed at least.
I am more of Greg's opinion, that chess is too easy for computers. You list deficiencies of chess engines, and they are true deficiencies, but the bottom line is that the engines are extremely strong. I took very balanced endgame positions (0cp-5cp imbalance) from real games and ran the following experiments (no TBs):
Score of Stockfish dev vs Zurichess_00: 297 - 1 - 702 [0.648] 1000
ELO difference: 106.01 +/- 10.81
Finished match
Stockfish dev is a recent version of Stockfish. Zurichess_00 is the first version of Zurichess I have, and it plays at about 1800 ELO human level (a pretty strong amateur). About 30% of these ultra-balanced (endgame) openings are still playable against strong amateurs. Then I took the same Stockfish dev and Stockfish 7, separated by 120 ELO points (a gap I deliberately chose not to be small), and got:
Score of Stockfish dev vs Stockfish 7: 8 - 4 - 988 [0.502] 1000
ELO difference: 1.39 +/- 2.35
Finished match
Now only 1.2% of the endgames are playable. So 96% of the endgames playable against strong amateurs are dead draws now. And it is not the strength difference (a respectable 120 ELO points) but strength itself that produces more draws and less sensitivity.
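The arithmetic behind the two match reports can be checked with a short sketch. It assumes the usual logistic Elo formula and treats "playable" as the share of decisive games, which is my reading of the figures above; the function names are made up for illustration. Note that the 96% is a ratio of rates over the whole set, not a position-by-position comparison.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by a match score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def summarize(wins, losses, draws):
    """Return (score, Elo difference, share of decisive games)."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    decisive = (wins + losses) / games
    return score, elo_from_score(score), decisive

amateur = summarize(297, 1, 702)  # Stockfish dev vs Zurichess_00
top = summarize(8, 4, 988)        # Stockfish dev vs Stockfish 7

# Share of the decisive rate seen against the amateur that is gone at top level
now_drawn = 1.0 - top[2] / amateur[2]

print(f"vs amateur: score {amateur[0]:.3f}, Elo {amateur[1]:.2f}, decisive {amateur[2]:.1%}")
print(f"vs SF7:     score {top[0]:.3f}, Elo {top[1]:.2f}, decisive {top[2]:.1%}")
print(f"decisive games lost to draws: {now_drawn:.0%}")
```

The Elo differences agree with the reported 106.01 and 1.39, and the decisive rates come out at 29.8% and 1.2%.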
The paradigm of computer chess has been unchanged since Knuth, 40 years ago: the same alpha-beta, which reduces the effective branching factor (EBF) to 4-5, then ever more pruning and reductions, down to today's EBF of 1.5. Within this paradigm, computer chess is capped at 400-800 more ELO points than today's top engines. The most comprehensive basis for an extrapolation is Andreas Strangmüller's results here (and the discussion):
http://www.talkchess.com/forum/viewtopic.php?t=61784 ,
which points to lower values. That a new paradigm will appear, replacing the one that has stood unchanged for 40 years, and bring a revolution under these conditions is almost as unlikely in the near future as a solution to chess. It will happen, but in the fairly distant future. I believe that in 10-20 years we will still be talking in the same terms and within the same paradigm, but with a 95%+ draw rate among top engines even in fast tests (with balanced positions). Sure, an engine might appear that is significantly superior, but this superiority diminishes _objectively_ (as separation power) with general strength, longer TC and more hardware.
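To put the EBF figures in perspective, here is a rough sketch under a deliberately simplified model: the node count grows as EBF ** depth, so doubling the node budget buys log(2)/log(EBF) extra plies. The function name is hypothetical, and real engines do not follow this model exactly.

```python
import math

def plies_per_doubling(ebf):
    """Extra search depth from doubling the node budget, if nodes ~ ebf ** depth."""
    return math.log(2.0) / math.log(ebf)

# EBF values quoted above: plain alpha-beta (about 4.5) vs. today's engines (1.5)
for ebf in (4.5, 1.5):
    print(f"EBF {ebf}: about {plies_per_doubling(ebf):.2f} extra plies per doubling")
```

Under this model an EBF of 1.5 gives roughly 1.7 extra plies per doubling of time or hardware, against less than half a ply at an EBF of 4.5.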
PS Having performed that test, I was curious what unbalanced positions and time odds would do to that 98.8% draw rate and the lack of significant separation between Stockfish dev and Stockfish 7 with balanced positions:
Balanced:
+8 -4 =988 (1000)
Normalized ELO (trinomial): 0.037 +/- 0.062
Normalized ELO (pentanomial): 0.052 +/- 0.062

Balanced 4x White Time Odds:
+14 -5 =981 (1000)
Normalized ELO (trinomial): 0.065 +/- 0.062
Normalized ELO (pentanomial): 0.079 +/- 0.062

Unbalanced:
+284 -194 =522 (1000)
Normalized ELO (trinomial): 0.131 +/- 0.062
Normalized ELO (pentanomial): 0.254 +/- 0.062
In red (balanced, trinomial: 0.037) I marked what might happen if we continue to use balanced openings in the distant future, and in green (unbalanced, pentanomial: 0.254) what might be the best solution to it. There is a factor of about 7 in separation power between the two, meaning about 50 times fewer games are needed for the same statistical significance.
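The trinomial figures can be reproduced with a minimal sketch, assuming normalized ELO here means (score - 0.5) divided by the per-game standard deviation of the result, with a 95% margin of 1.96/sqrt(games). The function name is made up for illustration. The pentanomial values need the outcomes of game pairs, which are not listed above, so they are taken as given.

```python
import math

def normalized_elo_trinomial(wins, losses, draws):
    """Return (normalized Elo, 95% margin) from win/loss/draw counts."""
    games = wins + losses + draws
    mean = (wins + 0.5 * draws) / games            # average score per game
    second_moment = (wins + 0.25 * draws) / games  # average squared score
    sigma = math.sqrt(second_moment - mean * mean) # per-game standard deviation
    return (mean - 0.5) / sigma, 1.96 / math.sqrt(games)

balanced, margin = normalized_elo_trinomial(8, 4, 988)
time_odds, _ = normalized_elo_trinomial(14, 5, 981)
unbalanced, _ = normalized_elo_trinomial(284, 194, 522)
print(f"{balanced:.3f} {time_odds:.3f} {unbalanced:.3f} +/- {margin:.3f}")

# Games needed for a given significance scale with 1 / (normalized Elo)^2
games_ratio = (0.254 / 0.037) ** 2
print(f"factor {0.254 / 0.037:.1f} in separation, about {games_ratio:.0f}x fewer games")
```

The squared ratio comes out a little under 50, consistent with the "about 50 times" estimate.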