Dragon 3.1 Released at KomodoChess.com
Moderator: Ras
-
- Posts: 150
- Joined: Fri Mar 11, 2022 12:10 pm
- Full name: Branislav Đošić
Re: Dragon 3.1 Released at KomodoChess.com
A question for Larry: have Elo values with regular eval been tested by human players?
-
- Posts: 6259
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Dragon 3.1 Released at KomodoChess.com
Not nearly as much. We had some data on the Skill levels of k14/14.1 and some data comparing those levels to current regular Eval Elo settings so they shouldn’t be way off, but I would trust the NNUE Elos much more.
Komodo rules!
-
- Posts: 72
- Joined: Tue Sep 04, 2018 5:33 am
- Full name: John Kominek
Re: Dragon 3.1 Released at KomodoChess.com
I have a question regarding use of Skill Levels in Dragon.
What accounts for the non-determinism across runs, despite clearing that hash and running single-threaded? Is it due to non-TT memory data? If I'm using the feature incorrectly let me know.
Dragon 3.1 (C) 2022 Don Dailey, Larry Kaufman, Mark Lefler, Dmitry Pervov, and Dietrich Kappe
using hardware POPCNT
info string Licensed to John Kominek
info string embedded NN is loaded
setoption name Threads value 1
setoption name UCI LimitStrength value true
setoption name UCI Elo value 2400
go depth 32
info depth 1 time 3 nodes 0 nps 0
info depth 1 time 3 nodes 1 score cp -8 nps 250 tbhits 0 pv h2h4
info depth 1 time 3 nodes 4 score cp 88 nps 1000 tbhits 0 pv e2e4
info depth 1 time 3 nodes 5 score cp 98 nps 1250 tbhits 0 pv d2d4
info depth 2 time 3 nodes 20 nps 5000
info depth 2 time 3 nodes 30 score cp 56 nps 7500 tbhits 0 pv d2d4 d7d5
info depth 2 time 3 nodes 56 score cp 85 nps 14000 tbhits 0 pv e2e4 e7e5
info depth 3 time 3 nodes 80 nps 20000
info depth 3 time 4 nodes 166 score cp 115 nps 33200 tbhits 0 pv e2e4 c7c5 g1f3
info depth 4 time 4 nodes 192 nps 38400
info depth 4 time 4 nodes 260 score cp 89 nps 52001 tbhits 0 pv e2e4 c7c5 g1f3 e7e6
info depth 5 time 4 nodes 325 nps 65001
info depth 5 time 4 nodes 581 score cp 84 nps 116202 tbhits 0 pv e2e4 c7c5 g1f3 e7e6 d2d4 c5d4 f3d4
info depth 5 time 6 nodes 1037 score cp 92 nps 172832 tbhits 0 pv d2d4 d7d5 c2c4 e7e6 g1f3 d5c4
info time 6 nodes 1115 nps 185832
bestmove d2d4 ponder d7d5
setoption name Clear Hash
info string Clearing Hash
go depth 32
info depth 1 time 2 nodes 0 nps 0
info depth 1 time 3 nodes 1 score cp 80 nps 333 tbhits 0 pv c2c4
info depth 1 time 3 nodes 2 score cp 81 nps 666 tbhits 0 pv d2d4
info depth 1 time 4 nodes 4 score cp 84 nps 999 tbhits 0 pv g1f3
info depth 2 time 6 nodes 20 nps 3333
info depth 2 time 7 nodes 30 score cp 51 nps 4285 tbhits 0 pv g1f3 c7c5
info depth 2 time 9 nodes 50 score cp 68 nps 5555 tbhits 0 pv c2c4 e7e5
info depth 2 time 9 nodes 76 score cp 89 nps 7600 tbhits 0 pv e2e4 c7c5
info depth 3 time 9 nodes 96 nps 9600
info depth 3 time 12 nodes 129 score cp 106 nps 10749 tbhits 0 pv e2e4 c7c5 g1f3
info depth 4 time 12 nodes 155 nps 12916
info depth 4 time 16 nodes 244 score cp 77 nps 15249 tbhits 0 pv e2e4 c7c5 g1f3 g8f6
info depth 5 time 17 nodes 302 nps 17764
info depth 5 time 25 nodes 480 score cp 70 upperbound nps 19199 tbhits 0 pv e2e4 d7d6
info depth 5 time 39 nodes 865 score cp 74 nps 22179 tbhits 0 pv d2d4 d7d5 c2c4 e7e6 g1f3
info time 40 nodes 899 nps 22474
bestmove d2d4 ponder d7d5
uci
...
option name UCI Elo type spin default 2400 min 1 max 3500
option name UCI LimitStrength type check default true
option name Auto Skill type check default false
option name Skill Time type spin default 1 min 0 max 10
option name UCI_Opponent type string default <empty>
uciok
I also see different patterns of nodes/score wrt depth with Dragon running in normal mode, so it's general behavior. Search results are exactly replicated after quitting/restarting Komodo and re-issuing the same command sequence.
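For anyone wanting to compare runs without eyeballing the output, here is a minimal Python sketch that pulls the final node count of each `go` out of a captured log. The function name `final_node_counts` is my own invention and the sample text is abbreviated from the session above; it assumes the engine prints a summary `info ... nodes N` line right before each `bestmove`, as Dragon does here.

```python
def final_node_counts(log: str) -> list[int]:
    """Return the last reported node count before each `bestmove` line."""
    counts = []
    last_nodes = None
    for line in log.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        # Skip free-text `info string ...` lines; they carry no search statistics.
        if tokens[0] == "info" and tokens[1:2] != ["string"] and "nodes" in tokens:
            idx = tokens.index("nodes") + 1
            if idx < len(tokens) and tokens[idx].isdigit():
                last_nodes = int(tokens[idx])
        elif tokens[0] == "bestmove" and last_nodes is not None:
            counts.append(last_nodes)
            last_nodes = None
    return counts


# Abbreviated from the two `go depth 32` runs at UCI Elo 2400 shown above.
SAMPLE_LOG = """\
info depth 5 time 6 nodes 1037 score cp 92 nps 172832 tbhits 0 pv d2d4 d7d5 c2c4 e7e6 g1f3 d5c4
info time 6 nodes 1115 nps 185832
bestmove d2d4 ponder d7d5
info string Clearing Hash
info depth 5 time 39 nodes 865 score cp 74 nps 22179 tbhits 0 pv d2d4 d7d5 c2c4 e7e6 g1f3
info time 40 nodes 899 nps 22474
bestmove d2d4 ponder d7d5
"""

print(final_node_counts(SAMPLE_LOG))  # [1115, 899]
```

Two different totals for the same position and the same commands is the non-determinism being asked about.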
-
- Posts: 6259
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Dragon 3.1 Released at KomodoChess.com
jkominek wrote: ↑Sat Aug 27, 2022 10:52 pm What accounts for the non-determinism across runs, despite clearing that hash and running single-threaded? Is it due to non-TT memory data?
That's an easy question to answer. The Skill or Elo levels in Dragon are based partly on node counts and depth, but also have a randomness factor added to the evals, so naturally they will play differently every time, even on one thread. This is intentional; presumably people playing against Dragon don't want to get the same game over and over. The randomness is a very minor factor in the weakening for the higher Elo levels (such as 2400 and above), but becomes increasingly important at lower levels. Below 1000 (or a bit less) randomness is the only factor that distinguishes one Elo level from another; it grows larger as the Elo drops.
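To make the idea concrete, here is a toy Python sketch of eval randomness that grows as the Elo setting drops. This is not Dragon's code: `noise_amplitude_cp`, its constants, and `pick_move` are all made up for illustration, and the real engine also limits nodes and depth, which the sketch ignores.

```python
import random


def noise_amplitude_cp(elo: int) -> float:
    """Hypothetical noise width in centipawns; the constants are invented."""
    return 5.0 + 0.1 * max(0, 2400 - elo)


def pick_move(candidates: dict[str, int], elo: int, rng: random.Random) -> str:
    """Pick the move whose eval plus uniform noise is highest."""
    amp = noise_amplitude_cp(elo)
    noisy = {move: cp + rng.uniform(-amp, amp) for move, cp in candidates.items()}
    return max(noisy, key=noisy.get)


# Depth-1 scores taken from the first run in the log above.
candidates = {"d2d4": 98, "e2e4": 88, "h2h4": -8}

rng = random.Random()  # unseeded, so repeated calls can disagree
print([pick_move(candidates, 800, rng) for _ in range(5)])
print([pick_move(candidates, 2400, rng) for _ in range(5)])  # noise too small to reorder
```

With these made-up constants the 2400 setting never changes the choice between moves 10 cp apart, while the 800 setting can pick any of the three.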
-
- Posts: 72
- Joined: Tue Sep 04, 2018 5:33 am
- Full name: John Kominek
Re: Dragon 3.1 Released at KomodoChess.com
That makes sense. I was on the verge of positing that very same explanation until I tested in regular mode and observed similar behavior. By way of example I mean:
617 then 466 then 416 nodes reported searched. Normal full-strength Komodo has a randomization element? I wouldn't expect that. I imagine the change in search path is coming from some other factor.
Dragon 3.1 (C) 2022 Don Dailey, Larry Kaufman, Mark Lefler, Dmitry Pervov, and Dietrich Kappe
using hardware POPCNT
info string Licensed to John Kominek
info string embedded NN is loaded
go depth 5
info depth 1 time 101 nodes 0 nps 0
info depth 1 time 103 nodes 1 score cp -6 nps 9 tbhits 0 pv h2h4
info depth 1 time 105 nodes 3 score cp -2 nps 28 tbhits 0 pv f2f4
info depth 1 time 105 nodes 4 score cp 90 nps 38 tbhits 0 pv e2e4
info depth 2 time 106 nodes 20 nps 186
info depth 2 time 107 nodes 32 score cp 79 nps 296 tbhits 0 pv e2e4 c7c5
info depth 2 time 108 nodes 47 score cp 91 nps 431 tbhits 0 pv d2d4 e7e6
info depth 3 time 108 nodes 67 nps 614
info depth 3 time 115 nodes 132 score cp 114 nps 1147 tbhits 0 pv d2d4 g8f6 c2c4
info depth 4 time 115 nodes 161 nps 1387
info depth 4 time 118 nodes 221 score cp 74 nps 1872 tbhits 0 pv d2d4 g8f6 c2c4 g7g6
info depth 4 time 121 nodes 294 score cp 78 nps 2409 tbhits 0 pv e2e4 c7c5 g1f3 g8f6
info depth 5 time 123 nodes 382 nps 3080
info depth 5 time 128 nodes 518 score cp 86 lowerbound nps 4046 tbhits 0 pv e2e4 e7e6 g1f3
info depth 5 time 129 nodes 585 score cp 92 nps 4500 tbhits 0 pv e2e4 c7c5 g1f3 b8c6 b1c3
info time 129 nodes 617 nps 4746
bestmove e2e4 ponder c7c5
setoption name Clear Hash
info string Clearing Hash
go depth 5
info depth 1 time 1 nodes 0 nps 0
info depth 1 time 1 nodes 1 score cp 77 nps 999 tbhits 0 pv g1f3
info depth 1 time 1 nodes 2 score cp 90 nps 1999 tbhits 0 pv e2e4
info depth 2 time 1 nodes 20 nps 19999
info depth 2 time 1 nodes 32 score cp 79 nps 31999 tbhits 0 pv e2e4 c7c5
info depth 3 time 1 nodes 62 nps 61999
info depth 3 time 1 nodes 94 score cp 108 nps 93999 tbhits 0 pv e2e4 c7c5 g1f3
info depth 4 time 1 nodes 118 nps 117999
info depth 4 time 1 nodes 191 score cp 78 nps 190999 tbhits 0 pv e2e4 c7c5 g1f3 g8f6
info depth 5 time 1 nodes 266 nps 265955
info depth 5 time 1 nodes 384 score cp 86 lowerbound nps 383936 tbhits 0 pv e2e4 e7e6 g1f3
info depth 5 time 1 nodes 435 score cp 92 nps 434927 tbhits 0 pv e2e4 c7c5 g1f3 b8c6 b1c3
info time 1 nodes 466 nps 465922
bestmove e2e4 ponder c7c5
setoption name Clear Hash
info string Clearing Hash
go depth 5
info depth 1 time 2 nodes 0 nps 0
info depth 1 time 2 nodes 1 score cp 77 nps 499 tbhits 0 pv g1f3
info depth 1 time 3 nodes 2 score cp 90 nps 666 tbhits 0 pv e2e4
info depth 2 time 3 nodes 20 nps 5000
info depth 2 time 5 nodes 32 score cp 79 nps 6399 tbhits 0 pv e2e4 c7c5
info depth 3 time 6 nodes 62 nps 10333
info depth 3 time 6 nodes 94 score cp 108 nps 13428 tbhits 0 pv e2e4 c7c5 g1f3
info depth 4 time 8 nodes 118 nps 14749
info depth 4 time 9 nodes 171 score cp 78 nps 18999 tbhits 0 pv e2e4 c7c5 g1f3 g8f6
info depth 5 time 12 nodes 247 nps 20583
info depth 5 time 13 nodes 336 score cp 87 lowerbound nps 25846 tbhits 0 pv e2e4 a7a6 g1f3
info depth 5 time 13 nodes 369 score cp 96 lowerbound nps 28384 tbhits 0 pv e2e4 c7c5 g1f3
info depth 5 time 13 nodes 384 score cp 108 lowerbound nps 29538 tbhits 0 pv e2e4 c7c5 g1f3
info depth 5 time 13 nodes 396 score cp 108 nps 28285 tbhits 0 pv e2e4 c7c5 g1f3
info time 13 nodes 416 nps 29714
bestmove e2e4 ponder c7c5
quit
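The log above can also be compared mechanically. Here is a small Python sketch, with helper names of my own choosing, that drops the timing fields (`time` and `nps` depend on the wall clock and differ even between otherwise identical searches) and reports the first point where two runs stop agreeing. The sample lines are abbreviated from runs 2 and 3.

```python
def trace(run_log: str) -> list[tuple[int, int]]:
    """Extract (depth, nodes) from each `info depth ...` line, ignoring time and nps."""
    out = []
    for line in run_log.splitlines():
        tokens = line.split()
        if tokens[:2] == ["info", "depth"] and "nodes" in tokens:
            depth = int(tokens[2])
            nodes = int(tokens[tokens.index("nodes") + 1])
            out.append((depth, nodes))
    return out


def first_divergence(a, b):
    """Return (index, entry_a, entry_b) at the first mismatch, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i, x, y
    if len(a) != len(b):
        i = min(len(a), len(b))
        return i, (a[i] if i < len(a) else None), (b[i] if i < len(b) else None)
    return None


RUN_2 = """\
info depth 3 time 1 nodes 94 score cp 108 nps 93999 tbhits 0 pv e2e4 c7c5 g1f3
info depth 4 time 1 nodes 118 nps 117999
info depth 4 time 1 nodes 191 score cp 78 nps 190999 tbhits 0 pv e2e4 c7c5 g1f3 g8f6
"""
RUN_3 = """\
info depth 3 time 6 nodes 94 score cp 108 nps 13428 tbhits 0 pv e2e4 c7c5 g1f3
info depth 4 time 8 nodes 118 nps 14749
info depth 4 time 9 nodes 171 score cp 78 nps 18999 tbhits 0 pv e2e4 c7c5 g1f3 g8f6
"""

print(first_divergence(trace(RUN_2), trace(RUN_3)))  # (2, (4, 191), (4, 171))
```

So runs 2 and 3 agree through depth 3 and the start of depth 4, and only part ways while finishing depth 4.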
-
- Posts: 72
- Joined: Tue Sep 04, 2018 5:33 am
- Full name: John Kominek
Re: Dragon 3.1 Released at KomodoChess.com
I've run a quick self-test on Komodo skill levels. Each setting plays four levels above and below in steps of 100, plus itself to measure draw rate trend. I anchored the Elo scale to 2000. The initial results show good calibration between 1400 and 1900. That's probably the range where you have the largest pool of chess.com humans to calibrate against. Above 2000 the scale is increasingly expanded. I am accustomed to seeing dilated ratings from engine self-tests. Identical versions of an engine tend not to spot its own holes the way a competing engine of similar strength can. Below 1400 the self-play ratings become compacted. It is hard for me to say if the lower range is well calibrated to a pool of weak human players, and that the self-play scale falls out of sync with that, or if letting randomness dominate is a weak model.
# PLAYER : RATING ERROR POINTS PLAYED (%) LOS
1 KomodoDragon 3.1 Hash 256 Threads 1 Elo 3400 : 4006.4 114.2 160.0 216 74 100
2 KomodoDragon 3.1 Hash 256 Threads 1 Elo 3300 : 3890.8 101.0 177.5 252 70 100
3 KomodoDragon 3.1 Hash 256 Threads 1 Elo 3200 : 3663.2 95.0 169.0 288 59 100
4 KomodoDragon 3.1 Hash 256 Threads 1 Elo 3100 : 3530.7 86.0 173.5 324 54 100
5 KomodoDragon 3.1 Hash 256 Threads 1 Elo 3000 : 3391.2 84.3 173.5 360 48 100
6 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2900 : 3266.2 79.6 174.5 360 48 100
7 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2800 : 3165.9 79.5 182.0 360 51 100
8 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2700 : 3085.3 77.5 194.5 360 54 100
9 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2600 : 2895.3 71.3 181.0 360 50 100
10 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2500 : 2762.3 70.4 183.5 360 51 100
11 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2400 : 2581.1 68.7 176.0 360 49 100
12 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2300 : 2449.7 64.1 179.5 360 50 100
13 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2200 : 2282.5 60.7 173.0 360 48 100
14 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2100 : 2183.9 58.0 182.0 360 51 100
15 KomodoDragon 3.1 Hash 256 Threads 1 Elo 2000 : 2000.0 52.9 167.0 360 46 100
16 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1900 : 1926.4 53.3 179.0 360 50 100
17 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1800 : 1834.0 51.2 185.0 360 51 100
18 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1700 : 1670.4 52.1 167.0 360 46 100
19 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1600 : 1583.3 52.3 170.0 360 47 99
20 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1500 : 1524.9 53.6 179.0 360 50 100
21 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1400 : 1410.2 52.3 166.5 360 46 98
22 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1300 : 1364.1 53.5 177.0 360 49 100
23 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1200 : 1273.8 54.7 167.0 360 46 91
24 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1100 : 1242.6 55.2 178.0 360 49 98
25 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1000 : 1196.3 56.3 183.0 360 51 100
26 KomodoDragon 3.1 Hash 256 Threads 1 Elo 900 : 1110.1 58.0 170.0 360 47 79
27 KomodoDragon 3.1 Hash 256 Threads 1 Elo 800 : 1090.3 61.7 184.5 360 51 98
28 KomodoDragon 3.1 Hash 256 Threads 1 Elo 700 : 1040.3 62.0 186.5 360 52 99
29 KomodoDragon 3.1 Hash 256 Threads 1 Elo 600 : 978.0 65.2 185.5 360 52 99
30 KomodoDragon 3.1 Hash 256 Threads 1 Elo 500 : 922.4 65.1 188.5 360 52 100
31 KomodoDragon 3.1 Hash 256 Threads 1 Elo 400 : 851.3 65.9 186.5 360 52 100
32 KomodoDragon 3.1 Hash 256 Threads 1 Elo 300 : 771.1 69.2 152.5 324 47 100
33 KomodoDragon 3.1 Hash 256 Threads 1 Elo 200 : 693.7 75.1 123.0 288 43 100
34 KomodoDragon 3.1 Hash 256 Threads 1 Elo 100 : 612.0 78.3 96.0 252 38 100
35 KomodoDragon 3.1 Hash 256 Threads 1 Elo 1 : 507.4 85.2 33.0 144 23 ---
White advantage = 0.00 +/- 0.00
Draw rate (equal opponents) = 38.75 % +/- 0.74
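To put numbers on "expanded" and "compacted", the slope of measured rating against Elo setting can be computed for each range. A quick Python sketch using five rows copied from the table above; the range boundaries are simply the ones named in the text, and a slope of 1.0 would mean self-play agrees with the setting.

```python
# (Elo setting -> self-play rating) rows copied from the table above.
MEASURED = {3400: 4006.4, 2000: 2000.0, 1900: 1926.4, 1400: 1410.2, 1: 507.4}


def slope(lo: int, hi: int) -> float:
    """Measured rating points gained per point of Elo setting between lo and hi."""
    return (MEASURED[hi] - MEASURED[lo]) / (hi - lo)


for lo, hi in [(2000, 3400), (1400, 1900), (1, 1400)]:
    print(f"{lo:>4} to {hi}: {slope(lo, hi):.2f}")
```

This gives roughly 1.43 above 2000, 1.03 between 1400 and 1900, and 0.65 below 1400.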

Hmm. I am attempting to insert a 45 KB image but do not seem to be having success. The Attachments tab proclaims a warning next to the file "Sorry, the board attachment quota has been reached." Any hints on how to insert an image into the message? Maybe I must earn more posting credits or something.
-
- Posts: 6259
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Dragon 3.1 Released at KomodoChess.com
jkominek wrote: ↑Sun Aug 28, 2022 5:12 am 617 then 466 then 416 nodes reported searched. Normal full-strength Komodo has a randomization element? I wouldn't expect that. I imagine the change in search path is coming from some other factor.
We often test on fixed depth on our Linux tester, and the results are always replicable on one thread (at full strength), so I can't readily explain what you are reporting. It's possible it is a Windows issue, since we haven't verified this test on Windows, but that would be strange. We may have to investigate this.
-
- Posts: 6259
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Dragon 3.1 Released at KomodoChess.com
jkominek wrote: ↑Sun Aug 28, 2022 6:02 am I've run a quick self-test on Komodo skill levels. [...] Above 2000 the scale is increasingly expanded. [...] Below 1400 the self-play ratings become compacted. [...] As an aside, single-threaded Stockfish is deterministic between Hash clears, though when using Skill Levels it also applies randomness on a sliding scale. The skill levels were calibrated between version 10 and version 11 and have not been modified since then, as far as I can tell. The Stockfish calibration is now out of whack, even when using the classic hand-crafted evaluation.
The Elo spreading at higher levels and the contraction at lower levels are both deliberate; we are trying to emulate humans with those Elos. At higher levels, engine vs. engine ratings, as in the rating list, consistently overstate rating differences compared to the results they get vs. humans, by roughly the 4 to 3 ratio of spread you show above 2000. At very low levels, below 1000 Elo, our feedback is that the Elo settings are (or at least were) too weak, so compacting them as you report is an attempt to offset this. Probably the effect of increased depth or nodes on Elo acts quite differently than the effect of changing randomness; the depth is less relevant vs. humans if they are outsearched anyway in general. But poor eval probably matters more vs. humans than vs. engines. At least that is my hypothesis.
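As a rough worked example of that 4 to 3 ratio, here is a Python sketch. It uses the standard Elo expected-score formula; the function names and the flat 3/4 scaling are a simplification of the hypothesis, not a measured conversion.

```python
def expected_score(diff: float) -> float:
    """Standard Elo expectation for a player rated `diff` points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def human_scale_diff(engine_diff: float, ratio: float = 3.0 / 4.0) -> float:
    """Shrink an engine-vs-engine rating gap by an assumed flat 3/4 factor."""
    return engine_diff * ratio


gap = 400.0  # rating gap as measured in engine-vs-engine play
print(f"{expected_score(gap):.3f}")                    # 0.909
print(f"{expected_score(human_scale_diff(gap)):.3f}")  # 0.849
```

Under that assumption, a 400-point engine-list gap behaves like a 300-point gap against humans, so the expected score drops from about 91% to about 85%.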
-
- Posts: 72
- Joined: Tue Sep 04, 2018 5:33 am
- Full name: John Kominek
Re: Dragon 3.1 Released at KomodoChess.com
lkaufman wrote: ↑Sun Aug 28, 2022 7:38 am We often test on fixed depth on our Linux tester, and the results are always replicable on one thread (at full strength), so I can't readily explain what you are reporting. It's possible it is a Windows issue, since we haven't verified this test on Windows, but that would be strange. We may have to investigate this.
It has been quite a long time since I've used Windows myself. The screen output I shared is from running dragon-3.1-linux-avx2. For another data point, it won't hurt to try Komodo 13.02, the most recent of the free versions.
Ah, so the behavior happens in older versions of your software too. The same move is selected in the end, but the search paths are different: only slightly between runs 2 and 3, with both dramatically different from run 1.
Komodo 13.02 64-bit (C) 2019 Don Dailey, Larry Kaufman and Mark Lefler
using hardware POPCNT
using BMI2
info string Licensed to Komodochess.com
setoption name Threads value 1
setoption name Clear Hash
info string Clearing Hash
go depth 3
info depth 1 time 1 nodes 0 nps 0
info depth 1 time 1 nodes 1 score cp 4 nps 1000 tbhits 0 pv a2a3
info depth 1 time 1 nodes 2 score cp 20 nps 2000 tbhits 0 pv b2b3
info depth 1 time 1 nodes 4 score cp 73 nps 4000 tbhits 0 pv d2d3
info depth 1 time 1 nodes 5 score cp 90 nps 5000 tbhits 0 pv e2e3
info depth 2 time 1 nodes 20 nps 20000
info depth 2 time 1 nodes 43 score cp 31 nps 43000 tbhits 0 pv e2e3 e7e6
info depth 3 time 1 nodes 79 nps 79000
info depth 3 time 1 nodes 120 score cp 59 nps 120000 tbhits 0 pv e2e3 e7e6 b1c3
info time 1 nodes 142 nps 142000
bestmove e2e3 ponder e7e6
setoption name Clear Hash
info string Clearing Hash
go depth 3
info depth 1 time 1 nodes 0 nps 0
info depth 1 time 1 nodes 1 score cp 58 nps 1000 tbhits 0 pv b1c3
info depth 1 time 1 nodes 2 score cp 90 nps 2000 tbhits 0 pv e2e3
info depth 2 time 1 nodes 20 nps 20000
info depth 2 time 1 nodes 28 score cp 31 nps 28000 tbhits 0 pv e2e3 e7e6
info depth 3 time 1 nodes 64 nps 64000
info depth 3 time 1 nodes 84 score cp 59 nps 84000 tbhits 0 pv e2e3 e7e6 b1c3
info time 1 nodes 111 nps 111000
bestmove e2e3 ponder e7e6
setoption name Clear Hash
info string Clearing Hash
go depth 3
info depth 1 time 1 nodes 0 nps 0
info depth 1 time 1 nodes 1 score cp 58 nps 1000 tbhits 0 pv b1c3
info depth 1 time 1 nodes 2 score cp 90 nps 2000 tbhits 0 pv e2e3
info depth 2 time 1 nodes 20 nps 20000
info depth 2 time 1 nodes 28 score cp 31 nps 28000 tbhits 0 pv e2e3 e7e6
info depth 3 time 1 nodes 64 nps 64000
info depth 3 time 1 nodes 83 score cp 59 nps 83000 tbhits 0 pv e2e3 e7e6 b1c3
info time 1 nodes 110 nps 110000
bestmove e2e3 ponder e7e6
quit
If I were to hazard a guess, the data structure storing move ordering is not cleared, with that retained information informing subsequent searches. In the few test cases examined, node count always decreases. This observation supports the notion of subsequent searches following a tree path with earlier or more effective pruning, which can be shown to have no effect on the end result. But I'm not an engine author; my knowledge of the inner workings is superficial.
Addendum. This might not mean a lot, but I tried a comparison between dragon-3.1-linux-avx2 against dragon-3.1-64bit-avx2.exe running under Wine. They give identical outputs. That is, even though the search path differs between Clear Hash commands as above, when given the same sequence of commands the results are identical between the native Linux and emulated Windows versions.
The reasonable conclusion is that single-threaded normal strength Komodo is deterministic, but that issuing Clear Hash does not reset the entirety of the program state, hence what we see. We'll blame Stockfish for skewing my expectations. Stupid fish!
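To illustrate the guess, here is a toy Python sketch that has nothing to do with Komodo's actual source: a plain alpha-beta search over a synthetic tree, with a crude history table for move ordering. `leaf_value`, `search`, and `run` are hypothetical names, and keying the history by (ply, move) is a simplification. With a full window and no transposition table, the returned value cannot depend on move order, so carrying the history table over between searches can only change the node count.

```python
import zlib

BRANCHING = 4
INF = 10**9


def leaf_value(path: tuple) -> int:
    """Deterministic pseudo-random leaf score in [-100, 100] for the side to move."""
    return zlib.crc32(bytes(path)) % 201 - 100


def search(path, depth, alpha, beta, history, stats):
    """Negamax alpha-beta; moves that caused cutoffs earlier are tried first."""
    stats["nodes"] += 1
    if depth == 0:
        return leaf_value(path)
    ply = len(path)
    moves = sorted(range(BRANCHING), key=lambda m: -history.get((ply, m), 0))
    best = -INF
    for move in moves:
        score = -search(path + (move,), depth - 1, -beta, -alpha, history, stats)
        best = max(best, score)
        alpha = max(alpha, best)
        if alpha >= beta:
            # Reward the cutoff move so later searches order it earlier.
            history[(ply, move)] = history.get((ply, move), 0) + depth * depth
            break
    return best


def run(depth, history):
    """Search from the root and return (value, nodes visited)."""
    stats = {"nodes": 0}
    value = search((), depth, -INF, INF, history, stats)
    return value, stats["nodes"]


history = {}
first = run(6, history)   # cold start
second = run(6, history)  # history kept, as assumed to happen after Clear Hash
fresh = run(6, {})        # everything reset, like quitting and restarting
print(first, second, fresh)
```

A fresh history table reproduces the first run exactly, which matches the quit/restart observation; the second run always returns the same value, while its node count depends on what the first run left in the table.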