N  Engine             Rtng   Pts  Gm     SB  X   Elo  Perf  Et   Pe   Ar   De   Lc   Ne   Ha   Bo
1  Ethereal 10.81     3176  12.0  15  79.25  0  +127  80.0  ···  ==   1=   1==  1=   11   11   11
2  Pedone 1.8         3104   9.0  15  58.25  0   +80  60.0  ==   ···  ==   10   ==   0=   11   =11
3  Arasan TCEC13      3142   8.5  15  56.75  0   +37  56.7  0=   ==   ···  01   ==   1=1  ==   1=
4  DeusX 1.0          3200   8.0  15  56.75  0   -20  53.3  0==  01   10   ···  ==   =1   ==   1=
5  lc0 16.10520       3219   7.0  15  48.25  0   -65  46.7  0=   ==   ==   ==   ···  =0   ===  1=
6  Nemorino 5.01      3104   6.5  15  44.25  2    +4  43.3  00   1=   0=0  =0   =1   ···  1=   01
7  Hannibal 20180806  3193   6.0  15  36.25  0   -77  40.0  00   00   ==   ==   ===  0=   ···  11
8  Bobcat 8           3072   3.0  15  22.75  0   -86  20.0  00   =00  0=   0=   0=   10   00   ···
I also checked the Lc0 ID10520 time management in TCEC. Is it really so terrible? It might not be optimal, but it is not terrible, and it is not completely wrecking the performance. I guess it costs some 30-40 Elo points at most compared to a better TM, that's all. Are there other bugs, in both ID10520 and Deus?
The remaining thing is TCEC conditions, which are really hard to reproduce (for me at least, impossible). So, I took another approach: match the CPU part against the GPU part in the same proportion as in TCEC, using the displayed NPS and an assumed SMP scaling.
I took the Arasan 21 chess engine, which should be very close in strength to Arasan TCEC, and which I was already using in my gauntlets against AB engines. On my 4 cores, its NPS is about 8 times lower than the TCEC NPS. SMP efficiency on 43 cores is, even with the best SMP implementation, no higher than 60%-70% (which is very high, by the way, for 43 cores). So, all in all, the "effective speed" (inverse of time-to-strength) of the TCEC CPU for Arasan 21 is about 5.0-5.5 times that of my CPU. For the GPU part, NPS seems to be about 6 times higher in TCEC than on my GPU, giving an "effective speed" about 5 times higher (correct me on that one if you know better how to get from the NPS speed-up to the effective speed-up with 2 GPUs). All in all, I can mimic TCEC conditions to some degree by running Lc0 on my GPU and Arasan 21 on 4 cores (maybe 5 cores would be even better, but it's not that relevant).
I chose a time control 10 times faster than in TCEC: 3m + 1s.
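For anyone who wants to redo the estimate with their own numbers, here is a minimal sketch of the arithmetic. It assumes the "effective speed" is simply the NPS ratio multiplied by the SMP efficiency; that formula and all the names in the code are my own shorthand for illustration, not something measured. With the figures above, the plain product gives 4.8 to 5.6 for the CPU, which brackets the rounded 5.0-5.5.

```python
# Back-of-the-envelope estimate of the TCEC vs. home hardware gap.
# Assumption (not measured): effective speed-up = NPS ratio * scaling efficiency.

def effective_speedup(nps_ratio, efficiency):
    """Speed-up in time-to-strength terms under the assumption above."""
    return nps_ratio * efficiency

# CPU side: TCEC NPS is about 8x mine, SMP efficiency assumed 60%-70%.
cpu_low = effective_speedup(8.0, 0.60)
cpu_high = effective_speedup(8.0, 0.70)

# GPU side: NPS about 6x, effective speed-up guessed at about 5x,
# which corresponds to an implied 2-GPU efficiency of 5/6.
gpu_nps_ratio = 6.0
gpu_effective = 5.0
gpu_implied_efficiency = gpu_effective / gpu_nps_ratio

# Time control is 10x faster than TCEC, hardware roughly 5x slower.
time_factor = 10.0
total_factor = time_factor * gpu_effective

print(f"CPU effective speed-up: {cpu_low:.1f} to {cpu_high:.1f}")
print(f"Implied 2-GPU efficiency: {gpu_implied_efficiency:.2f}")
print(f"Combined time * hardware factor: about {total_factor:.0f}x")
```

The point of keeping CPU and GPU both near 5x is that the ratio between the two sides stays roughly the same as in TCEC, even though the absolute speed is far lower.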
Partial result:
Score of lc0_v16 10520 vs Arasan 21: +13 -2 =7 [0.750]
Elo difference: 190.85 +/- 139.14
22 of 40 games finished.
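As a sanity check on those figures, here is a minimal sketch of how an Elo difference and its error margin can be derived from a win/loss/draw count. It uses the usual logistic Elo model and a normal approximation for the 95% interval; this is an assumption on my side and not necessarily the exact routine of the tool that printed the output, so the margin can differ in the last digits.

```python
import math

def elo_from_score(score):
    """Elo difference implied by an expected score under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, z=1.959964):
    """Return (score, Elo difference, approximate 95% error margin).

    z is the two-sided 95% quantile of the standard normal distribution.
    """
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    mu = w + 0.5 * d  # mean score per game
    # Per-game variance of the score around its mean.
    var = w * (1.0 - mu) ** 2 + d * (0.5 - mu) ** 2 + l * (0.0 - mu) ** 2
    se = math.sqrt(var / n)  # standard error of the mean score
    lo, hi = mu - z * se, mu + z * se
    diff = elo_from_score(mu)
    margin = (elo_from_score(hi) - elo_from_score(lo)) / 2.0
    return mu, diff, margin

mu, diff, margin = elo_with_margin(wins=13, losses=2, draws=7)
print(f"score {mu:.3f}, Elo {diff:+.2f} +/- {margin:.1f}")
```

For +13 -2 =7 this reproduces the 0.750 score and the 190.85 Elo difference, with a margin of roughly 139. Under this approximation the lower end of the score interval is still above 50%, even with only 22 games.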
So, what is happening? TM, "too few games" and other explanations seem like lame excuses. Is there a serious bug affecting both Lc0 participants? Is there a hardware misconfiguration that is invisible in NPS?
Or, something I have started to suspect: does Lc0 scale badly in this 50x (time * hardware) configuration?