Some experiments with Komodo's Contempt. White Contempt = true in all cases. Here are the results of Contempt 25, 50, 75, 100 against Komodo Contempt = 0.
lkaufman wrote (Mon Dec 16, 2019 6:37 pm):
I decided to check out your one bad result, Komodo self-play, to see if it got better or worse with more time and more threads. At 2' + 1" on four threads, latest Komodo dev. got 79 wins, 98 draws, and one loss for 44.4% Armageddon score, somewhat better than your result for selfplay. So at least it's moving in the right direction. With your results for non-selfplay averaging around 49.6% (using only the longest TC for Lc0), looks like this version is most likely as close as we can hope to get to being fair without crossing the line of White having a forced win (it seems). It's better to be below this line than above it, so I think this may be the perfect Armageddon variant at last! I'll try to get some more data and look for ways to promote the idea.
Laskos wrote (Mon Dec 16, 2019 4:13 pm):
And again good result at 5000 nodes/move Lc0 testing between best t40 and t30 nets:
Laskos wrote (Mon Dec 16, 2019 6:56 am):
Second, Lc0 testing, between best t40 and t30 nets:
2000 games at 100 nodes/move:
+1064 =386 -550 62.8%
White score: 53.2%
1000 games at 1000 nodes/move
+514 =303 -183 66.5%
White score: 51.4%
Draw rate increases with longer TC, and the Armageddon score decreases from above towards 50%. Again, a promising result.
1000 games at 5000 nodes/move
+509 =366 -125 69.2%
White score: 50.9%
Nice progression from lower node counts. AB engines have a 48.3% White score at similar strength. All in all, very balanced, and it seems to slowly converge towards 50%.
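The percentages quoted above can be rechecked with a few lines of Python. This is only a sketch: the function names are made up for illustration, the +W =D -L triples are read as results from one net's point of view, and the Armageddon figure assumes every draw is counted as a Black win (which is how the 79 wins, 98 draws, one loss result maps to 44.4%).

```python
def score_pct(wins, draws, losses):
    """Conventional score in percent: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games


def draw_rate_pct(wins, draws, losses):
    """Share of drawn games in percent."""
    return 100.0 * draws / (wins + draws + losses)


def armageddon_white_pct(white_wins, draws, black_wins):
    """Armageddon White score in percent: draws are scored as Black wins."""
    games = white_wins + draws + black_wins
    return 100.0 * white_wins / games


# Lc0 results quoted above, keyed by nodes/move (W, D, L for one net).
lc0_results = {
    100: (1064, 386, 550),
    1000: (514, 303, 183),
    5000: (509, 366, 125),
}

for nodes, wdl in lc0_results.items():
    print(f"{nodes:>5} nodes/move: score {score_pct(*wdl):.2f}%, "
          f"draws {draw_rate_pct(*wdl):.1f}%")

# Komodo self-play result quoted above: 79 wins, 98 draws, 1 loss for White.
print(f"Armageddon White score: {armageddon_white_pct(79, 98, 1):.1f}%")
```

The draw-rate column makes the trend in the quoted posts explicit: the share of draws grows with the node count.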
30 + 0.3s
Armageddon scoring (no draws)
# PLAYER    :  RATING  ERROR  POINTS  PLAYED   (%)  CFS(next)
1 K_131_50  :   47.35  30.48   227.0     400  56.8         97
2 K_131_25  :    1.74  30.65   201.0     400  50.3         54
3 K_131_0   :    0.00  14.23   796.0    1600  49.8         73
4 K_131_100 :  -10.47  29.25   194.0     400  48.5         81
5 K_131_75  :  -31.46  29.86   182.0     400  45.5        ---
White advantage = -20.14 +/- 8.70
Draw rate (equal opponents) = 0.00 % +/- 0.00
Overall, White performance is 47.1%.
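The 47.1% figure follows from the White advantage of -20.14 Elo if one assumes the standard logistic Elo curve. A minimal sketch of that conversion (the function name is illustrative, and the logistic model is an assumption about how the rating output was produced):

```python
def expected_score(elo_diff):
    """Expected score for a side that is elo_diff Elo ahead (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


white_advantage = -20.14   # Elo, from the rating output above
error = 8.70               # reported +/- on the White advantage

print(f"White performance: {100 * expected_score(white_advantage):.1f}%")
print(f"Range from the error bar: "
      f"{100 * expected_score(white_advantage - error):.1f}% to "
      f"{100 * expected_score(white_advantage + error):.1f}%")
```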
But Contempt = 75 against Contempt = 0 shows a White win performance of 44.5%, as in earlier tests, while the more adequate Contempt = 50 shows a White win performance of 50.2%. The difference in White performance between these two cases is significant. So it might be that the low White win ratio with Contempt = 75 is partly due to an inadequately chosen Contempt value (aside from the engines being identical Komodos in the earlier tests).
I will now check self-play of Contempt = 50 Komodos.
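For anyone who wants to gauge the gap between the Contempt = 75 and Contempt = 50 White performances against sampling noise, here is a rough two-proportion z-test. It is a sketch under assumptions not stated in the post: each match is treated as 400 independent games (the PLAYED column), and a normal approximation is used. The function name is illustrative.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Two-proportion z-test with pooled variance (normal approximation).

    Returns (z, two-sided p-value) for the difference p2 - p1.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p2 - p1) / se
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return z, 2.0 * (1.0 - cdf)


# White win performance: Contempt = 75 (44.5%) vs Contempt = 50 (50.2%),
# assuming 400 games in each match.
z, p_value = two_proportion_z(0.445, 400, 0.502, 400)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

More games per match shrink the standard error, so the same percentage gap gives a larger z.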