I created a new opening suite with very good properties, consisting of 6533 3-move opening positions played by humans rated above ELO 2200. The source database was "KingBase Lite", about 1 million human games (above ELO 2200). As mentioned before, some suites, like the Stockfish Framework 2moves_v1.epd, have a good signal-to-noise ratio but ruin the opening phase of the game (the first 10 moves or so are completely abnormal because of the 2 silly random opening moves). Other suites, which follow normal opening play, have low sensitivity (signal-to-noise ratio, or t-value) or are too short. It seems my new "3moves_Elo2200.epd" suite has:
1/ A high signal-to-noise ratio, even higher than 2moves_v1
2/ Mostly reasonable openings, so that the opening phase of the game can be tested too
3/ A sufficient number of unique 3-move opening positions to be used in many thousands of games, as developers need
The suite is uploaded here:
http://s000.tinyupload.com/?file_id=687 ... 2789470066
I built this suite as follows: from that million human games, I extracted about 9000 unique 3-move positions. Then I analyzed all of them with Stockfish for 1 second each. Finally, I kept the 6533 positions whose Stockfish eval lies in [-0.40, 0.60], to exclude very unbalanced or outright bad openings. If one uses the pentanomial model to calculate the variance, one might prefer the unbalanced ones, but that is another topic.
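As a rough illustration of the selection step just described (not the actual script used), here is a minimal Python sketch. It assumes the positions have already been analyzed and are available as (EPD string, eval in pawns) pairs. The function name `select_balanced`, the placeholder position labels, and the sample evals are all hypothetical.

```python
def select_balanced(evaluated, lo=-0.40, hi=0.60):
    """Keep unique positions whose eval (in pawns) lies in [lo, hi].

    `evaluated` is an iterable of (epd, score) pairs. The default
    window is the one quoted in the post; everything else here is
    an illustrative assumption.
    """
    seen = set()
    kept = []
    for epd, score in evaluated:
        if epd in seen:
            continue  # drop duplicate positions, keep the first occurrence
        seen.add(epd)
        if lo <= score <= hi:
            kept.append(epd)
    return kept


# Placeholder labels stand in for real EPD strings; evals are made up.
sample = [
    ("pos_a", 0.25),
    ("pos_b", 0.75),   # too unbalanced, rejected
    ("pos_a", 0.25),   # duplicate, skipped
    ("pos_c", -0.40),  # on the boundary, kept
]
balanced = select_balanced(sample)
```

With real data, the pairs would come from whatever tool ran the 1-second Stockfish analysis, and the kept EPD strings would be written out one per line.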
So this suite has a clear advantage over 2moves_v1 in that it mostly has reasonable openings. The harder goal to achieve is high sensitivity (t-value), as shown in the previous post. But this goal seems to have been achieved too, as the suite compares well with 2moves_v1 even in this department. It is a bit more drawish, but the win/loss ratio is significantly higher.
Here are the signal-to-noise ratios of the two suites for Stockfish dev and Komodo, in ELO and in Normalized ELO (which seems to be more relevant than ELO). They were computed by measuring the benefit of doubling the time control in self-play games.
Stockfish 210617:
2000 games each run, 6''+ 0.06'' vs 3''+ 0.03''

                       |           t-value   Normalized          t-value
Suite (positions)      |     ELO       ELO          ELO   Normalized ELO
========================================================================
2moves_v1.epd (40456)  |  164.50      25.8        0.714             31.9
3moves_Elo2200 (6533)  |  169.49      27.0        0.760             34.0
========================================================================
Komodo 11.01:
2000 games each run, 6''+ 0.06'' vs 3''+ 0.03''

                       |           t-value   Normalized          t-value
Suite (positions)      |     ELO       ELO          ELO   Normalized ELO
========================================================================
2moves_v1.epd (40456)  |  195.51      27.4        0.813             36.5
3moves_Elo2200 (6533)  |  192.24      29.1        0.854             38.3
========================================================================
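For readers who want to compute similar figures from their own matches, here is a minimal Python sketch. The post does not state the formulas used, so this assumes the common definitions: Normalized ELO as (score - 0.5) divided by the per-game standard deviation of the result under a trinomial win/draw/loss model, and its t-value as that quantity times the square root of the number of games. The function name `match_stats` and the sample counts are hypothetical. The t-value in the plain ELO column is not reproduced here.

```python
import math


def match_stats(wins, draws, losses):
    """Return (elo, normalized_elo, t_value) from W/D/L counts.

    Assumes a trinomial model for the per-game variance and a score
    strictly between 0 and 1 (otherwise the logistic Elo is undefined).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    sigma = math.sqrt(variance)
    elo = -400.0 * math.log10(1.0 / score - 1.0)  # logistic Elo difference
    normalized = (score - 0.5) / sigma            # signal per game, in sigmas
    t_value = normalized * math.sqrt(n)           # signal over the whole match
    return elo, normalized, t_value


# Made-up counts for a 2000-game match, for illustration only.
elo, normalized, t_value = match_stats(wins=1000, draws=800, losses=200)
```

Under this assumption the Stockfish rows are self-consistent: 0.714 × √2000 ≈ 31.9 and 0.760 × √2000 ≈ 34.0.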