CEGT - rating lists March 07th 2021

Werner · Post by **Werner** » Sun Mar 07, 2021 2:12 pm

Hi all,
our current rating lists are online and can be found at the links below!

40 / 20:
New games: 2.219; 26 different engines
Total: 1.448.116

NEW Engines
11 LCZero 0.27.0c CUDNN (67741): 3524 - 300 games (no good start for version with multigather search enabled)
513 Bagatur 2.2a x64 1CPU: 2896 - 1005 games (= v. 2.2)
550 Counter 3.7 x64 1CPU: 2873 - 995 games (+23 to v. 3.6)

UPDATES
391 Weiss 1.3 x64 1CPU: 2971 - 1200 games (+4)

40 / 4
last update was February 15th: with 8802 new games; now 2.859.072 games
we are testing:
Weiss 1.3 x64 1CPU
Halogen 10 = ca. ELO 3040 out of 300 games (+73 to v9.0)
Texel 1.08a13 x64 1CPU Perf= ~ 2977 out of 1200 games (+4!! to v1.07...)
Counter 3.7 x64 1CPU = ca. ELO 2871 out of 1100 games
Stockfish 13.0 NNUE x64 1CPU = ca. ELO 3620 out of 1400 games (+17 / +39)

25'+8''
last update was February 17th with 1100 new games; total now 28100 games
we are testing
SlowChess Blitz 2.5NN x64 ELO 3326 out of 500 games + 59
Stockfish 13.0 x64 NNUE ELO 3550 out of 900 games + 33
Igel 2.8.0NN x64 Performance = ca. 3244 out of 1400 games

5'+3'' pb=on
last update was February 22nd with +9000 games

3'+1'' pb=on
Last update was March 3rd - see extra posting.

A big „Thank you“ to all testers as usual!!

Links

40/20: http://www.cegt.net/rating.htm
Blitz: http://www.cegt.net/blitz.htm
40/120: http://www.cegt.net/rating120.htm
25+8: http://www.cegt.net/rating25plus8.htm
3+1 pb=on: http://www.cegt.net/rating3plus1pbon.htm
5+3 pb=on: http://www.cegt.net/rating5plus3pbon.htm
Tester: http://www.cegt.net/testers/testers.htm
Games of the week: http://www.cegt.net/40_40%20Rating%20Li ... on/gow.jpg

Werner Schüle
CEGT-Team

MMarco · Post by **MMarco** » Mon Mar 08, 2021 2:07 am

NEW Engines
11 LCZero 0.27.0c CUDNN (67741): 3524 - 300 games (no good start for version with multigather search enabled)

You could verify this on the Lc0 Discord, but I expect multigather search to be slightly detrimental to Lc0 performance under CEGT testing conditions. Multigather is supposed to be useful when the GPU load is low (you can verify this using GPU-Z: https://www.techpowerup.com/gpuz/ ), helping to fill more batches at a time (and increase NPS).

With a relatively weak graphics card and a large network like T60, the GPU load will be close to 100% under the standard settings. If the card is already working at full load, enabling multigather will only cause overhead and slightly decrease performance (but I'm not sure whether the impact will be significant. Maybe it won't, and multigather will have no impact whatsoever).

If you try with a very small net (like 48x5), it could prove beneficial given the cards CEGT uses (GTX 10XX).

Werner · Post by **Werner** » Mon Mar 08, 2021 8:03 am

Thank you very much for this posting. Since I saw much better results with this net on an RTX card, I also thought this could be the reason.

Werner · Post by **Werner** » Mon Mar 08, 2021 4:52 pm

I ran a test from the start position, 15 seconds per run, with a 20x256 net and a 30x384 net (multigather seems to help only with the smaller net here).
Here is the output from the start position after 15 sec with the 20x256 net (faster with multigather):

with multigather (--multi-gather=true --max-collision-visits=1000 --max-collision-events=1000)
2021-03-08 16:31:51,614<--1:info depth 18 seldepth 42 time 13942 nodes 56366 score cp 12 nps 4048 tbhits 0 pv d2d4 g8f6

default
2021-03-08 16:34:12,502<--1:info depth 17 seldepth 44 time 13952 nodes 34023 score cp 11 nps 2441 tbhits 0 pv e2e4 e7e5

now with the 30x384 net:
with multigather
2021-03-08 16:45:17,702<--1:info depth 7 seldepth 19 time 12896 nodes 21311 score cp 8 nps 1655 tbhits 0 pv d2d4 g8f6

default
2021-03-08 16:41:36,373<--1:info depth 7 seldepth 20 time 13986 nodes 23407 score cp 8 nps 1676 tbhits 0 pv d2d4 g8f6

MMarco · Post by **MMarco** » Tue Mar 09, 2021 12:29 am

HMMM. Interesting.

I didn't expect multi-gather to improve NPS on the 20x256 net (is your CPU very old??). You get a 66% speed-up!
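For anyone who wants to reproduce that figure, here is a minimal Python sketch, assuming the engine output is available as plain UCI info lines like the ones Werner posted. The helper names (`nps_from_info`, `speedup`) are made up for illustration and are not part of Lc0 or any GUI.

```python
import re


def nps_from_info(line: str) -> int:
    """Extract the number following the 'nps' token of a UCI 'info' line."""
    match = re.search(r"\bnps (\d+)\b", line)
    if match is None:
        raise ValueError("no nps field in line")
    return int(match.group(1))


def speedup(baseline_nps: int, test_nps: int) -> float:
    """Relative change of test over baseline (0.66 means +66%)."""
    return test_nps / baseline_nps - 1.0


# The two 20x256 lines from Werner's post (log timestamps dropped).
multigather = ("info depth 18 seldepth 42 time 13942 nodes 56366 "
               "score cp 12 nps 4048 tbhits 0 pv d2d4 g8f6")
default = ("info depth 17 seldepth 44 time 13952 nodes 34023 "
           "score cp 11 nps 2441 tbhits 0 pv e2e4 e7e5")

gain = speedup(nps_from_info(default), nps_from_info(multigather))
print(f"multigather vs default: {gain:+.1%}")
```

Feeding it the two 30x384 lines instead gives a small negative number, which matches Werner's remark that the gain only shows up with the smaller net.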

Just to compare, on my hardware I got +67% (from 167 000 nps to 279 000 nps) using the very small 32x4 distilled net from dkappe. Maybe one could tweak that further playing with the new parameters TaskWorkers, MinimumProcessingWork etc. but I've no idea about how they work.

Here is what I get with Lc0 27.0 + LS-15 on my 2060 mobile + Ryzen 4900H after "go nodes 500000":

--weights=LS-15.pb
--backend=cuda-fp16
--nncache=1000000  
--max-collision-events=917 
--max-collision-visits=1000 
--max-out-of-order-evals-factor=2.4 
--smart-pruning-factor=0

--multi-gather=false
info depth 22 seldepth 63 time 22523 nodes 500147 score cp 15 nps 24451 tbhits 0
pv e2e4 e7e5 g1f3 b8c6 f1b5 g8f6 e1g1 f6e4 d2d4 e4d6 b5c6 d7c6 d4e5 d6f5 d1d8 e8d8 b1c3 f8e7 h2h3 h7h5 c1f4 f5h4 f3h4 e7h4 a1d1 d8e8 c3e2 c6c5 c2c4 c8f5 b2b3 a8d8 f2f3 f5c2 d1d8 e8d8 f1c1 c2f5 g2g4 f5d7 g1g2

--multi-gather=true
info depth 22 seldepth 63 time 22623 nodes 500573 score cp 15 nps 24362 tbhits 0
pv e2e4 e7e5 g1f3 b8c6 f1b5 g8f6 e1g1 f6e4 d2d4 e4d6 b5c6 d7c6 d4e5 d6f5 d1d8 e8d8 b1c3 f8e7 h2h3 h7h5 c1f4 f5h4 f3h4 e7h4 a1d1 d8e8 c3e2 c6c5 c2c4 c8f5 b2b3 a8d8 f2f3 b7b6 g2g3 h4e7 g3g4 f5d3 g1f2 b6b5 c4b5

I also tried with --minibatch-size=30 (recommended by ./lc0 backendbench --clippy):

--multi-gather=false
info depth 22 seldepth 63 time 33390 nodes 500088 score cp 15 nps 15966 tbhits 0
pv e2e4 e7e5 g1f3 b8c6 f1b5 g8f6 e1g1 f6e4 d2d4 e4d6 b5c6 d7c6 d4e5 d6f5 d1d8 e8d8 b1c3 f8e7 h2h3 h7h5 c1f4 f5h4 f3h4 e7h4 a1d1 d8e8 c3e2 c6c5 c2c4 c8f5 b2b3 a8d8 f2f3 f5c2 d1d8 e8d8 f1c1 c2h7 g2g3 h4e7

--multi-gather=true
info depth 22 seldepth 62 time 33899 nodes 500061 score cp 15 nps 15715 tbhits 0
pv e2e4 e7e5 g1f3 b8c6 f1b5 g8f6 e1g1 f6e4 d2d4 e4d6 b5c6 d7c6 d4e5 d6f5 d1d8 e8d8 b1c3 f8e7 h2h3 h7h5 c1f4 f5h4 f3h4 e7h4 a1d1 d8e8 c3e2 c6c5 c2c4 c8f5 b2b3 a8d8 f2f3 b7b6 g2g3 h4e7 g3g4 f5d3 g1f2 e7h4 f2e3

So I guess the best is to verify it on each machine and net (unless using a very small one) before deciding to enable it or not.
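One way to make that per-machine check less of a judgment call is sketched below in Python. It only classifies two nps readings. The function name `compare_multigather` and the 2% noise margin are assumptions for illustration, not measured values or anything from Lc0 itself.

```python
def compare_multigather(nps_off: float, nps_on: float, noise: float = 0.02) -> str:
    """Classify the effect of --multi-gather=true from two nps measurements.

    `noise` is an assumed run-to-run variation (2% here, not a measured value);
    changes smaller than that are treated as inconclusive.
    """
    change = nps_on / nps_off - 1.0
    if change > noise:
        return "enable"
    if change < -noise:
        return "disable"
    return "no clear difference"


# nps pairs (multi-gather=false, multi-gather=true) taken from this thread.
# Note: Werner's multigather runs also changed the max-collision-* settings.
measurements = {
    "Werner, 20x256": (2441, 4048),
    "Werner, 30x384": (1676, 1655),
    "MMarco, LS-15": (24451, 24362),
    "MMarco, LS-15, minibatch-size=30": (15966, 15715),
}

for label, (off, on) in measurements.items():
    print(f"{label}: {compare_multigather(off, on)}")
```

A single run per setting is thin evidence, so repeating each measurement a few times and adjusting the margin to the observed spread would be the safer approach.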

Werner · Post by **Werner** » Tue Mar 09, 2021 12:18 pm

Thanks,
I have tested first with default Lc0
--weights=LS-15.pb
--backend=cuda-fp16
--nncache=1000000
--max-collision-events=32
--max-collision-visits=9999
--max-out-of-order-evals-factor=1.0
--smart-pruning-factor=1.33
--multi-gather=false

and then with these differences only
--multi-gather=true
--max-collision-visits=1000
--max-collision-events=1000

I did not change the other settings, as I do not know what they do.

MMarco · Post by **MMarco** » Tue Mar 09, 2021 2:33 pm

--max-collision-events=32
--max-collision-visits=9999
--max-out-of-order-evals-factor=1.0
--smart-pruning-factor=1.33

--smart-pruning-factor=0 is only for analysis. With SPF=0, Leela doesn't prune nodes, so if you do "go nodes 50000" you will really have 50000 nodes (or about) analyzed. Otherwise (with SPF=1.33, say) Lc0 will stop the search earlier when it thinks that the probability of changing its move is low, in order to save time for later in the game. For tournament play, SPF=1.33 shouldn't be changed, as Leela will play stronger with it.

The three others, taken together, were suggested by a Leela dev for use when enabling multi-gather. Out of curiosity, I tested them with multi-gather=false and found that they might give a slight Elo gain (but the error bars make it a bit unclear):

Match: Parameters test (J94-100 vs Stockfish 210113 16T)
Hardware: RTX 3070, i7-10700 @ 4.6 GHz
Time control: 40s + 0.4s for Lc0, 20s + 0.2s for Stockfish (time forfeit disabled)
Openings: Morozevich selected openings: 20 plies, 848 positions, included in the attachment.
Stockfish 210113 bmi2: threads=16, hash=1024, Move Overhead=0
Lc0 27.0-rc1: cuda-fp16, nncache=1000000, threads=1, minibatch-size=46, move-overhead=0, multi-gather=false
New params: max-collision-events=917, max-collision-visits=1000, max-out-of-order-evals-factor=2.4
Default: max-collision-events=32, max-collision-visits=9999, max-out-of-order-evals-factor=1.0
TBs and adj.: syzygy 5-men, draw 5 moves 5cp move 50, resign (two-sided) 5 moves 500cp
Comment: The new parameters appear better once again (see my test from yesterday).

# PLAYER           : RATING  ERROR  PLAYED  (%)   CFS   W     D    L   D(%)
1 sf-210113-16T    :   0.0   ----    3392  50.65   69  384  2668  340  78.66
2 j94-100-newparam :  -1.9    7.5    1696  49.73   85  171  1345  180  79.30
3 j94-100-default  :  -7.4    7.3    1696  48.97  ---  169  1323  204  78.01

White advantage = 47.00 +/- 2.73, Draw rate (equal opponents) = 82.55 % +/- 0.74
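As a rough cross-check of the table above, the score percentages and rating differences can be recomputed from the W/D/L columns. This is a minimal Python sketch using the standard logistic Elo formula. The function names are illustrative, and the rating tool that produced the table (its layout looks like Ordo output, but that is my assumption) fits all results jointly, so its numbers need not match exactly.

```python
import math


def score_fraction(wins: int, draws: int, losses: int) -> float:
    """Score as a fraction of games played, counting a draw as half a point."""
    return (wins + 0.5 * draws) / (wins + draws + losses)


def elo_diff(score: float) -> float:
    """Logistic Elo difference implied by a score fraction (0 < score < 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# W, D, L for the two Lc0 configurations, copied from the table above.
results = {
    "j94-100-newparam": (171, 1345, 180),
    "j94-100-default": (169, 1323, 204),
}

for name, (w, d, l) in results.items():
    s = score_fraction(w, d, l)
    print(f"{name}: {s:.2%} -> {elo_diff(s):+.1f} Elo")
```

The recomputed values land within a few tenths of an Elo point of the table's -1.9 and -7.4, and the gap between the two configurations is well inside the quoted error margins of about 7 Elo.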

Match: Parameters test (J94-100 vs Stockfish 210113 20T)
Hardware: RTX 3080, i9-10900kf @ 3.7 GHz
Time control: 60s + 1s (time forfeit disabled)
Openings: TCEC SuFi 17-18-19 (150 pos.) + Pohl's Unbalanced Human Openings 4mvs v1 (150 first positions)
Stockfish 210113 bmi2: threads=20, hash=1024, Move Overhead=0
Lc0 27.0-rc1: cuda-fp16, nncache=5000000, threads=2, minibatch-size=64, mlh=on, move-overhead=0, multi-gather=false
New params: max-collision-events=917, max-collision-visits=1000, max-out-of-order-evals-factor=2.4
Default: max-collision-events=32, max-collision-visits=9999, max-out-of-order-evals-factor=1.0
MLH tcec-19: moves-left-max-effect=0.2, moves-left-threshold=0, moves-left-slope=0.004, moves-left-scaled-factor=1, moves-left-quadratic-factor=0, moves-left-constant-factor=0
TBs and adj.: syzygy 5-men, draw 5 moves 5cp move 50, resign (two-sided) 5 moves 500cp
Comments: The new parameters appear better by a slight margin, but CFS is low.

# PLAYER                        : RATING  ERROR  PLAYED  (%)   CFS   W    D    L   D(%)
1 stockfish-210113-20T          :    0.0   ----   1200  52.71   98  288  689  223  57.42
2 lc0-270-rc1-J94-100-newparam  :  -18.6   17.1    600  47.75   73  114  345  141  57.50
3 lc0-270-rc1-J94-100-default   :  -26.3   17.5    600  46.83  ---  109  344  147  57.33

White advantage = 145.25 +/- 6.27, Draw rate (equal opponents) = 77.61 % +/- 2.09

Engine               :  Depth  MIDG    EARLY   ENDG    LATE
lc0-270-rc1-default  :  12.15  13.72 | 14.62 | 11.11 | 8.02
lc0-270-rc1-newparam :  12.34  13.74 | 14.47 | 11.07 | 8.81
stockfish-210113-20T :  35.09  30.03 | 29.54 | 37.15 | 51.69
