Middlegame positional test-suite
Posted: Sun Apr 07, 2019 3:39 pm
Just as with the opening positional test-suite described here:
http://www.talkchess.com/forum3/viewtop ... =2&t=61858
Based on large databases of human games, I built a positional test-suite for the middlegame phase of the game, roughly moves 15-22. The statistics of human games for each position are weaker than in the openings, so it was harder to be confident in the chosen unique solutions. I used engines mostly to check that the positions have no tactical complications.
I have 250 middlegame positions in this positional test-suite. I also combined 4 copies of it into a suite of 1000 positions, for the statistical significance of the result. Results can vary quite a bit on just one run of the test-suite (especially on many cores), so 4 runs give a more precise result. I uploaded here the positional Middlegames250 test-suite, the combined 4x Middlegames1000 test-suite, the old Openings200 test-suite and the combined 5x Openings1000 test-suite. They are the most faithfully positional test suites I am aware of, and they are my own creation.
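To get a rough feel for how much a single run can wobble, here is a minimal Python sketch of the standard error of a suite score. It assumes every position is an independent trial with the same solve probability, which is only an approximation. The 4 copies in the 1000-position suite are repeat runs of the same 250 positions, so this speaks only to run-to-run noise, not to the choice of positions. The function name `score_std_error` and the 60% solve rate are made up for illustration.

```python
import math

def score_std_error(solved: int, total: int) -> float:
    """Standard error of the solved count, in positions (simple binomial model)."""
    p = solved / total
    return math.sqrt(total * p * (1.0 - p))

# Illustrative numbers only: a 60% solve rate on 250 vs 1000 trials.
for total in (250, 1000):
    solved = round(0.6 * total)
    se = score_std_error(solved, total)
    print(f"{solved}/{total}: +/- {se:.1f} positions ({100 * se / total:.1f}% of the suite)")
```

Under this model the relative error halves when going from 250 to 1000 trials, which is the reason for running the 250 positions 4 times.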
The results using the positional Middlegames1000 suite were obtained as follows:
AB (alpha-beta) engines on 4 strong i7 cores.
Lc0 on an RTX 2070 GPU.
Positions count as found within a time interval of 1 to 2 seconds per position, using Polyglot's EPD testing features.
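For anyone who wants to score the suites without Polyglot, here is a minimal Python sketch of reading one EPD record and checking an engine's move against its `bm` (best move) operation. The helper names `parse_epd` and `is_solved` and the example record are made up for illustration and are not taken from the attached suites. It also assumes the engine's move is already in SAN, which a real harness would have to convert from the UCI output.

```python
def parse_epd(line: str) -> dict:
    """Split one EPD record into its position and its operations."""
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])  # board, side to move, castling, en passant
    ops_text = fields[4] if len(fields) > 4 else ""
    operations = {}
    # Naive split: assumes no ';' appears inside a quoted operand.
    for op in ops_text.split(";"):
        op = op.strip()
        if not op:
            continue
        opcode, _, operand = op.partition(" ")
        operations[opcode] = operand.strip().strip('"')
    return {"position": position, "operations": operations}

def is_solved(engine_move_san: str, record: dict) -> bool:
    """True if the engine's move (in SAN) is among the record's 'bm' moves."""
    best_moves = record["operations"].get("bm", "").split()
    return engine_move_san in best_moves

# Made-up example record: the starting position with two accepted moves.
example = 'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4 d4; id "demo.001";'
record = parse_epd(example)
print(record["operations"])
print(is_solved("e4", record))
```

The suite score is then just the number of records for which `is_solved` returns true.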
We see Leela performing better on this positional test suite than the regular engines. There is also a surprise: SF_dev, unlike in the openings, is not the positionally strongest AB engine in the middlegames. Houdini seems strong here, and to check that my suite is not some artifact of tactics, I included Houdini Tactical too; it performs worse than the regular Houdini. Also, as a sanity check I included Fruit 2.1 with its simple eval, and it performs pretty badly. Compare this with the Openings1000 results for pretty much the same engines.
We can see that Leela, although still by far the best, is not as far ahead positionally in midgames as it is in the openings. That is to be expected, as these are fairly common human openings, on which Leela trained more than on the more varied midgames. It is also interesting that Stockfish seems to excel positionally among AB engines in openings and in endgames, but not so much in midgames. It probably has a much better search all around, though.
My verrry good positional test suites are attached.
Midgames1000:
Lc0 v21.1 ID41844     737/1000
Lc0 v21.1 ID32930     730/1000
Houdini 6.03          655/1000
Houdini 6.03Tactic    635/1000
Komodo 12.3           609/1000
Stockfish_dev         584/1000
Booot 6.3.1           584/1000
Texel 1.07            561/1000
Andscacs 0.95         555/1000
Ethereal 11.25        548/1000
Fire 7.1              545/1000
Fruit 2.1             398/1000
Openings1000:
Lc0 v21.1 ID41844     762/1000
Lc0 v21.1 ID32930     727/1000
Stockfish_dev         574/1000
Houdini 6.03          558/1000
Komodo 12.3           556/1000
Booot 6.3.1           494/1000
Andscacs 0.95         484/1000
Ethereal 11.00        457/1000
Fire 7.1              431/1000
Texel 1.07            419/1000
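To put the two tables side by side, here is a small Python sketch that computes Leela's lead over the best AB engine in each suite and the midgame-minus-opening difference per engine. The numbers are copied from the tables above. The helper name `lead_over_best_ab` is made up, and treating every non-Lc0 entry as an AB engine is an assumption of the sketch. Ethereal appears with different versions in the two tables, so it is not matched.

```python
# Solved counts out of 1000, copied from the Midgames1000 and Openings1000 tables.
midgames = {
    "Lc0 v21.1 ID41844": 737, "Lc0 v21.1 ID32930": 730, "Houdini 6.03": 655,
    "Houdini 6.03Tactic": 635, "Komodo 12.3": 609, "Stockfish_dev": 584,
    "Booot 6.3.1": 584, "Texel 1.07": 561, "Andscacs 0.95": 555,
    "Ethereal 11.25": 548, "Fire 7.1": 545, "Fruit 2.1": 398,
}
openings = {
    "Lc0 v21.1 ID41844": 762, "Lc0 v21.1 ID32930": 727, "Stockfish_dev": 574,
    "Houdini 6.03": 558, "Komodo 12.3": 556, "Booot 6.3.1": 494,
    "Andscacs 0.95": 484, "Ethereal 11.00": 457, "Fire 7.1": 431,
    "Texel 1.07": 419,
}

def lead_over_best_ab(scores: dict) -> int:
    """Gap between the best Lc0 net and the best non-Lc0 engine."""
    lc0 = max(v for k, v in scores.items() if k.startswith("Lc0"))
    ab = max(v for k, v in scores.items() if not k.startswith("Lc0"))
    return lc0 - ab

print("Lc0 lead, midgames:", lead_over_best_ab(midgames))
print("Lc0 lead, openings:", lead_over_best_ab(openings))

# Per-engine difference, only for entries with the same name in both tables.
for name in midgames:
    if name in openings:
        print(f"{name:20s} {midgames[name] - openings[name]:+d}")
```

Leela's lead comes out at 82 positions in the midgames against 188 in the openings.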