chrisw wrote: ↑Wed Jul 22, 2020 4:07 pm
Ok, thanks for those. I put MikeB's two sets of PGNs together with some test games of my engine against two other engines, called here 2900eloengine and 3000eloengine.

MikeB wrote: ↑Wed Jul 22, 2020 12:30 pm
So that was pretty impressive - about an "8" Elo gain in 15 hours. Of course the "true" Elo gain here would be less than that, as the 30,000-line opening book used here was designed to exaggerate the Elo gain - note the relatively low draw rate %. Impressive regardless. Of course all of this was played at a micro bullet tc of 10 sec games with 0.1 sec increment, which is only a rough proxy for how it would do at a longer tc.

MikeB wrote: ↑Wed Jul 22, 2020 7:48 am
I just kicked off a 28,000 game set with these engines - I was curious to see what the Elo gain from one day of training might be. So we have Stockfish NN from yesterday going against the latest Sergio net from today. Added Houdini and Komodo. These micro bullet games are not Komodo's cup of tea.

chrisw wrote: ↑Tue Jul 21, 2020 8:25 pm
Cool. I'll split them up into pairs and take a look at the stats. What would be very useful, if you could find space/time, would be an equivalent tourney, same settings, between SF and four strong competitors, say a Leela and three ABs. Then we have some style comparison data.

MikeB wrote: ↑Tue Jul 21, 2020 7:36 pm
I ran another 10,000 game set. This is with an FEN set of 30,000 from Pohl's files, all FENs selected at random, with the "1907" net from Sergio against current dev Stockfish, time control 10 sec with 0.1 sec increment. Elo differences compared to curr-dev-stockfish may be exaggerated... I think this opening suite tries to do that intentionally.
Code: Select all
Rank Name                              Rating   Diff   +   -     #     Pts  Pts%     W     L     D    W%    =%  OppR
--------------------------------------------------------------------------------------------------------------------
   1 Stockfish-XI-NN 20200721-1907       3135    0.0   8   8  4000  2246.5  56.2  1349   856  1795  33.7  44.9  3091
   2 Honey-XI-NN 20200721-1907           3130    5.3   8   8  4000  2207.5  55.2  1285   870  1845  32.1  46.1  3093
   3 Bluefish-XI-NN 20200721-1907        3128    2.1   8   8  4000  2194.5  54.9  1287   898  1815  32.2  45.4  3093
   4 Black-Diamond-XI-NN 20200721-1907   3057   70.7   8   8  4000  1686.0  42.1   795  1423  1782  19.9  44.5  3111
   5 stockfish                           3051    6.3   8   8  4000  1665.5  41.6   941  1610  1449  23.5  36.2  3112
--------------------------------------------------------------------------------------------------------------------
PGN file (link expires in 7 days): https://www.dropbox.com/t/YHNgHdNQN9WBRNKW
Should be completed in a few hours.
Code: Select all
Rank Name                            Elo  +/-  Games   Score   Draws
   1 Stockfish-XI-NN 0722-0944       102   33    263   64.3%   39.5%
   2 Honey-XI-NN 0722-0944            79   32    263   61.2%   43.3%
   3 Stockfish-XI-NN 0721-1907        75   33    262   60.7%   38.2%
   4 Bluefish-XI-NN 0722-0944.        75   33    263   60.6%   39.9%
   5 Black-Diamond-XI-NN 0722-0944     8   33    264   51.1%   37.9%
   6 stockfish                       -28   34    263   46.0%   34.2%
   7 Houdini-6                      -159   38    262   28.6%   26.7%
   8 komodo-14-64bit                -170   38    262   27.3%   28.6%

1051 of 28000 games finished.
PGN file: https://www.dropbox.com/t/0vRTno6XTiaY6ISr
Code: Select all
Rank Name                          Rating   Diff   +   -     #     Pts  Pts%     W     L     D    W%    =%  OppR
----------------------------------------------------------------------------------------------------------------
   1 Stockfish-XI-NN 0722-0944       3175    0.0   6   6  7000  4308.5  61.6  2888  1271  2841  41.3  40.6  3089
   2 Bluefish-XI-NN 0722-0944.       3171    4.9   6   6  7000  4249.5  60.7  2857  1358  2785  40.8  39.8  3090
   3 Honey-XI-NN 0722-0944           3170    0.3   6   6  7000  4247.0  60.7  2836  1342  2822  40.5  40.3  3090
   4 Stockfish-XI-NN 0721-1907       3167    3.1   6   6  7000  4215.5  60.2  2805  1374  2821  40.1  40.3  3090
   5 Black-Diamond-XI-NN 0722-0944   3096   71.5   6   6  7000  3405.0  48.6  1996  2186  2818  28.5  40.3  3101
   6 stockfish                       3088    8.1   6   6  7000  3364.0  48.1  2132  2404  2464  30.5  35.2  3102
   7 Houdini-6                       2982  105.9   6   6  7000  2256.0  32.2  1185  3673  2142  16.9  30.6  3117
   8 komodo-14-64bit                 2952   29.5   7   7  7000  1954.5  27.9   929  4020  2051  13.3  29.3  3121
----------------------------------------------------------------------------------------------------------------
Link expires 7/29.
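As a side note, the Elo column in the interim 1051-game table looks consistent with the usual logistic conversion from score percentage to Elo difference. Here is a minimal Python sketch of that conversion, assuming the tournament tool uses the standard formula (the function names are mine, not from any tool):

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

def score_from_elo(diff):
    """Inverse: expected score fraction for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# Score percentages taken from the interim table in the thread.
for name, pct in [("Stockfish-XI-NN 0722-0944", 64.3), ("komodo-14-64bit", 27.3)]:
    print(f"{name}: {elo_from_score(pct / 100):+.0f}")
```

Fed the 64.3% and 27.3% scores, this should land on the 102 and -170 shown in the table. The percentages are rounded to one decimal, so other rows can be off by a point.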
Extracted the median ply depth to win for each engine pair (added 10 to the MikeB PGNs because they start counting from move 10 or so).
Interesting results.
https://imgur.com/JNA4EP1
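For anyone who wants to try something similar on their own PGNs, here is a rough Python sketch of the median-ply-to-win extraction. It is not the script used for the numbers in this post. It assumes plain engine-vs-engine PGN with `White`, `Black` and `Result` tags and no variations, and `ply_offset` is a hypothetical parameter standing in for the correction mentioned above (the right value depends on how a given PGN set numbers its moves).

```python
import re
from collections import defaultdict
from statistics import median

HEADER = re.compile(r'\[(\w+)\s+"([^"]*)"\]')
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

def count_plies(movetext):
    """Count half-moves, ignoring {comments}, move numbers, NAGs and the result."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)   # strip engine eval comments
    text = re.sub(r"\d+\.(\.\.)?", " ", text)    # strip "12." and "12..."
    return sum(1 for tok in text.split()
               if tok not in RESULTS and not tok.startswith("$"))

def read_games(pgn_text):
    """Yield (headers, movetext) for each game in a PGN string."""
    headers, moves = {}, []
    for line in pgn_text.splitlines():
        line = line.strip()
        m = HEADER.fullmatch(line)
        if m:
            if moves:  # a tag after movetext means the previous game ended
                yield headers, " ".join(moves)
                headers, moves = {}, []
            headers[m.group(1)] = m.group(2)
        elif line:
            moves.append(line)
    if headers or moves:
        yield headers, " ".join(moves)

def median_win_plies(pgn_text, ply_offset=0):
    """Map (winner, loser) to the median ply count of the winner's wins."""
    wins = defaultdict(list)
    for headers, movetext in read_games(pgn_text):
        white, black = headers.get("White"), headers.get("Black")
        result = headers.get("Result")
        if result == "1-0":
            key = (white, black)
        elif result == "0-1":
            key = (black, white)
        else:
            continue  # draws and unfinished games are skipped
        wins[key].append(count_plies(movetext) + ply_offset)
    return {pair: median(plies) for pair, plies in wins.items()}
```

Calling `median_win_plies(open("games.pgn").read(), ply_offset=...)` returns one median per (winner, loser) pair, which is the same shape as the two median columns in the data further down. The file name here is just a placeholder.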
In theory, engines with a tendency to find and create unbalanced situations should show up with a lower median depth to win. See histogram for examples.
In practice, if the engines are unbalanced in Elo, the stronger engine will tend to show a lower median (it's quicker and easier to win against weaker opponents). Likewise, see histogram.
In general, the NNUE engines, unless they are facing notably weaker opposition (everybody else, I guess), have median win lengths of around 120 ply.
Markedly different are the three included 3000-ish CCRL engines, which come in at about 90 to 100 ply (way faster than the NNUEs, and faster than Stockfish and Komodo).
Here's an NNUE example where one engine won many more games than the other, but the median win length remains high.
BluefishNN Black-DiamondNN 376:476:148 median 120.0 120.0
Also showing up, quite consistently: the median length to win for one engine is relatively constant across its different opponents.
So, tentatively, I stick with my earlier conclusion that NNUE is not developing strength in the direction of exciting chess. NNUEs tend, over many games, to drag games out into the ending. Is dull the right word?
Code: Select all
Data: engine 1, engine 2, WDL, median win ply engine 1, median win ply engine 2.

BluefishNN       StockfishNN      254:476:270     median 118.0 122.0
BluefishNN       stockfish        407:372:221     median 108.0 120.0
BluefishNN       Black-DiamondNN  376:476:148     median 120.0 120.0
BluefishNN       HoneyNN          250:491:259     median 120.0 124.0
CoronaVirus      3000eloengine    3661:2987:3352  median  91.0 108.0
CoronaVirus      2900eloengine    1382:900:1721   median  91.0 104.0
komodo-14        HoneyNN          88:255:657      median 125.5 104.0
komodo-14        stockfish        128:327:545     median 120.0 100.0
komodo-14        Houdini-6        257:398:345     median 118.0 113.0
komodo-14        Black-DiamondNN  184:266:550     median 118.0 107.0
komodo-14        BluefishNN       85:257:658      median 122.0 104.0
komodo-14        StockfishNN      103:264:633     median 121.0 101.0
komodo-14        StockfishNN      84:284:632      median 125.0 103.0
StockfishNN      stockfish        440:356:204     median 111.5 121.0
StockfishNN      Black-DiamondNN  381:463:156     median 119.0 120.0
StockfishNN      HoneyNN          258:500:242     median 122.0 122.0
HoneyNN          stockfish        435:360:205     median 110.0 121.0
HoneyNN          Houdini-6        615:263:122     median 106.0 116.0
HoneyNN          Black-DiamondNN  383:470:147     median 116.0 124.0
HoneyNN          BluefishNN       237:513:250     median 122.0 120.0
HoneyNN          StockfishNN      233:488:279     median 120.0 120.0
HoneyNN          StockfishNN      276:473:251     median 120.0 120.0
stockfish        Houdini-6        472:370:158     median 109.0 117.5
stockfish        Black-DiamondNN  310:355:335     median 115.0 110.0
stockfish        Black-DiamondNN  297:364:339     median 118.0 116.0
stockfish        HoneyNN          206:366:428     median 120.0 110.0
stockfish        BluefishNN       219:324:457     median 118.0 114.0
stockfish        StockfishNN      202:351:447     median 120.0 112.0
stockfish        StockfishNN      192:368:440     median 120.0 110.0
Houdini-6        Black-DiamondNN  221:299:480     median 118.0 111.0
Houdini-6        BluefishNN       116:265:619     median 123.0 107.0
Houdini-6        StockfishNN      103:294:603     median 118.0 105.0
Houdini-6        StockfishNN      120:253:627     median 119.5 106.0
Black-DiamondNN  HoneyNN          156:488:356     median 120.0 117.5
Black-DiamondNN  BluefishNN       164:469:367     median 126.5 115.0
Black-DiamondNN  StockfishNN      150:478:372     median 130.0 114.0
Black-DiamondNN  StockfishNN      166:472:362     median 132.0 113.0
BluefishNN       StockfishNN      244:476:280     median 119.5 122.0
BluefishNN       StockfishNN      262:481:257     median 125.5 122.0
StockfishNN      StockfishNN      274:490:236     median 124.0 120.0
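To check the "relatively constant per engine" observation against the data, one option is to group the median columns by engine and look at the spread. Below is a small Python sketch, assuming each row follows the `engine1 engine2 W:D:L median m1 m2` layout of the data block. The helper names are made up for illustration.

```python
from collections import defaultdict

def parse_row(row):
    """Parse 'engine1 engine2 W:D:L median m1 m2' into its fields."""
    e1, e2, wdl, _label, m1, m2 = row.split()
    w, d, l = (int(x) for x in wdl.split(":"))
    return e1, e2, (w, d, l), float(m1), float(m2)

def win_ply_range_by_engine(rows):
    """For each engine, return (min, max) of its median win ply over all pairings."""
    per_engine = defaultdict(list)
    for row in rows:
        e1, e2, _wdl, m1, m2 = parse_row(row)
        per_engine[e1].append(m1)
        per_engine[e2].append(m2)
    return {engine: (min(v), max(v)) for engine, v in per_engine.items()}

# Three rows copied from the data block above.
rows = [
    "BluefishNN StockfishNN 254:476:270 median 118.0 122.0",
    "BluefishNN stockfish 407:372:221 median 108.0 120.0",
    "CoronaVirus 3000eloengine 3661:2987:3352 median 91.0 108.0",
]
for engine, (lo, hi) in win_ply_range_by_engine(rows).items():
    print(f"{engine}: {lo} to {hi}")
```

A narrow (min, max) range for an engine would support the observation. A wide one would suggest its win length depends more on the opponent.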
Great analysis, Chris, and your assessment is correct. It appears there is no magic bullet for exciting and winning chess; it is just better technique to capitalize on small positional advantages, i.e. boring chess for many, if not most, but I still find it interesting.

