Is the 320x24b larger net the strongest around for RTX GPU?


Hai
Posts: 598
Joined: Sun Aug 04, 2013 1:19 pm

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by Hai »

Hai wrote: Tue Jul 23, 2019 5:01 pm
Laskos wrote: Tue Jul 23, 2019 11:05 am Jhorthos is building and training nets both larger and smaller than the default 256x20b T40 nets; one can download them here:

https://github.com/jhorthos/lczero-trai ... a-Training

On several test suites which, combined, have been a pretty faithful indicator of the strength of different Lc0 nets at longer time per position, the latest 320x24b net from that site (320x24.J13-swa-410000) comes out as the best net at longer-than-blitz TC. Maybe someone can devote some time to playing, say, 50 games of that net against some of the latest T40 nets at, say, 30 minutes + 15 seconds TC on an RTX card or several of them. The openings should be a bit unbalanced, to avoid 90%+ draw rates.


Test results of that larger net on the suites at longer time per position (30 seconds) are pretty amazing, and better than anything I have seen with Leela (T30 and T40).

4 i7 fast cores
RTX 2070 GPU
Leela on 2 threads, cache = 1,000,000

My own positional test-suite "Openings200revised":

1s / position
Lc0 42810 : 152/200
Lc0 320x24b-410: 147/200

30s / position
Lc0 42810 : 157/200
Lc0 320x24b-410: 163/200

At longer time per position, the big net surpasses the T40 nets on this positional test suite, getting a record-beating result for this suite at 30s/position.


Tactical "Arasan21beta":

1s / position
Lc0 42810 : 100/199
Lc0 320x24b-410: 93/199

30s / position
Lc0 42810 : 129/199
Lc0 320x24b-410: 133/199

To my surprise, the big net also surpasses the T40 nets on this tactical suite at 30s/position.

Also, the JH 320x24b big nets are progressing steadily; for example, the latest "410" big net performs significantly better on both these test suites than the "370" big net from the same site. Needless to say, the big net seems to scale significantly better to longer TC than the T40 nets. The big net might well also be the best at analyzing positions.

Yes, it's a bit naive to rely on test suites to see how strength behaves at longer TC, but my experience with Lc0 shows that a combination of a positional and a tactical suite has been a pretty faithful indicator of strength at otherwise inaccessible time controls. Again, maybe some would be curious to play actual games at these longer TC.

2x RTX 2080 Ti and 4 CPU cores
Multiplexing
No TB


ERET-TEST-SUITE:

1 second and 1000 mb:
24x320
0 / 111 = 0.0%
42810
0 / 111 = 0.0%

What's wrong?

1 second and 2000 mb:
24x320
62 / 111 = 55.8%
42810
64 / 111 = 57.6%

1 second and 3000 mb:
24x320
62 / 111 = 55.8%
42810
64 / 111 = 57.6%

1 second and 4000 mb:
24x320
62 / 111 = 55.8%
42810
62 / 111 = 55.8%

1 second and 5000 mb:
24x320
63 / 111 = 56.7%
42810
63 / 111 = 56.7%

1 second and 10000 mb:
24x320
65 / 111 = 58.5%
42810
65 / 111 = 58.5%
Both have the same + highest points

1 second and 15000 mb:
24x320
61 / 111 = 54.9%
42810
65 / 111 = 58.5%

1 second and 20000 mb:
24x320
64 / 111 = 57.6%
42810
65 / 111 = 58.5%

1 second and 30000 mb:
24x320
63 / 111 = 56.7%
42810
63 / 111 = 56.7%

1 second and 40000 mb:
24x320
62 / 111 = 55.8%
42810
67 / 111 = 60.3%

1 second and 50000 mb:
24x320
62 / 111 = 55.8%
42810
64 / 111 = 57.6%

2 seconds and 10000 mb:
24x320
66 / 111 = 59.4%
42810
70 / 111 = 63.0%

3 seconds and 10000 mb:
24x320
69 / 111 = 62.1%
42810
69 / 111 = 62.1%

4 seconds and 10000 mb:
24x320
74 / 111 = 66.6%
42810
78 / 111 = 70.2%

5 seconds and 10000 mb:
24x320
77 / 111 = 69.3%
42810
81 / 111 = 72.9%



6 seconds and 10000 mb:
24x320
78 / 111 = 70.2%
42810
81 / 111 = 72.9%

7 seconds and 10000 mb:
24x320
76 / 111 = 68.4%
42810
83 / 111 = 74.7%

8 seconds and 10000 mb:
24x320
80 / 111 = 72.0%
42810
82 / 111 = 73.8%

9 seconds and 10000 mb:
24x320
80 / 111 = 72.0%
42810
83 / 111 = 74.7%



10 seconds and 10000 mb:
24x320
80 / 111 = 72.0%
42810
82 / 111 = 73.8%

20 seconds and 10000 mb:
24x320
85 / 111 = 76.5%
42810
88 / 111 = 79.2%

30 seconds and 10000 mb:
24x320
88 / 111 = 79.2%
42810
86 / 111 = 77.4%
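
A rough way to judge how much of the gap between two suite scores could be noise is sketched below in Python. It treats each of the 111 positions as an independent pass/fail trial, which is an assumption (both nets are run on the same positions, so this is only a guide), and computes the standard error of the solved fraction.

```python
import math

def pass_rate_se(solved: int, total: int) -> float:
    """Standard error of a solved fraction under a simple binomial model."""
    p = solved / total
    return math.sqrt(p * (1.0 - p) / total)

# 30 seconds and 10000 mb, from the results above
big_net, t40, positions = 88, 86, 111

se = pass_rate_se(big_net, positions)
gap = (big_net - t40) / positions

print(f"standard error of one score: {se:.3f}")
print(f"gap between the two nets:    {gap:.3f}")
print("gap is within one standard error" if gap < se else "gap exceeds one standard error")
```

Under that assumption, a difference of two or three positions out of 111 is within the expected run-to-run spread of a single score.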
brianr
Posts: 536
Joined: Thu Mar 09, 2006 3:01 pm

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by brianr »

zullil wrote: Wed Jul 24, 2019 4:48 pm
brianr wrote: Wed Jul 24, 2019 4:26 pm There is a clear difference in match results with and without the restart=on option in cutechess-cli.
Below are two identical Leelas.

Score of A vs B: 31 - 18 - 51 [0.565] 100
Elo difference: 45.42 +/- 47.89

Dead even within one game with restart=on
Interesting, but not entirely clear to me. In the 31-18-51 match, Leela B was forced to restart each game but Leela A was not? And then a dead-even match when both engines restarted each game?
Same openings are played with sides reversed, so should not matter too much, although if some part of the tree is saved it might have a small effect depending on the particular openings. I generally use 2moves_v2.pgn for shorter matches.

The A v B score is without restart on for either engine.
While within the error margins, the result is far from even.
It was exactly even all of the time with restart on.

Accordingly, I now only run Lc0 with restart=on.
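
For reference, the Elo figure in the quoted cutechess-cli output can be reproduced from the 31 - 18 - 51 score with the standard logistic formula. The Python sketch below does that. The error margin uses a normal approximation on the per-game results; that is an assumption about how such a margin is typically computed, and it lands near the quoted +/- 47.89 rather than matching it exactly.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_elo(wins: int, losses: int, draws: int):
    """Return (elo, margin) for a match result; margin is an approximate 95% half-width."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score
    var = (wins * (1.0 - mu) ** 2 + losses * (0.0 - mu) ** 2 + draws * (0.5 - mu) ** 2) / n
    se = math.sqrt(var / n)
    z = 1.959964  # two-sided 95% normal quantile
    low = elo_from_score(mu - z * se)
    high = elo_from_score(mu + z * se)
    return elo_from_score(mu), (high - low) / 2.0

elo, margin = match_elo(31, 18, 51)
print(f"Elo difference: {elo:.2f} +/- {margin:.2f}")
```

The interval includes zero, which fits the remark above that the result is still within the error margins.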
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by Laskos »

brianr wrote: Wed Jul 24, 2019 6:28 pm
zullil wrote: Wed Jul 24, 2019 4:48 pm
brianr wrote: Wed Jul 24, 2019 4:26 pm There is a clear difference in match results with and without the restart=on option in cutechess-cli.
Below are two identical Leelas.

Score of A vs B: 31 - 18 - 51 [0.565] 100
Elo difference: 45.42 +/- 47.89

Dead even within one game with restart=on
Interesting, but not entirely clear to me. In the 31-18-51 match, Leela B was forced to restart each game but Leela A was not? And then a dead-even match when both engines restarted each game?
Same openings are played with sides reversed, so should not matter too much, although if some part of the tree is saved it might have a small effect depending on the particular openings. I generally use 2moves_v2.pgn for shorter matches.

The A v B score is without restart on for either engine.
While within the error margins, the result is far from even.
It was exactly even all of the time with restart on.

Accordingly, I now only run Lc0 with restart=on.
No, what's this? I can draw no conclusions from this result.
pohl4711
Posts: 2439
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by pohl4711 »

zullil wrote: Wed Jul 24, 2019 6:00 pm
pohl4711 wrote: Wed Jul 24, 2019 5:56 pm
zullil wrote: Wed Jul 24, 2019 5:11 pm
It seems strange to me that a GUI would not send 'ucinewgame' to an engine at the start of a new game. So I assume that Fritz does this. And if it does, then Lc0 does the following in response:
The problem is that the Fritz GUI does NOT do it. It has been a known weakness since Fritz 6 that no "ucinewgame" command is sent to the engines when a new game starts in an engine match. And it was never fixed by ChessBase, because they believe it is "not necessary"... In an engine tournament, the engines are reloaded for every game, so there it is not a problem.
Oh, thanks for that information. What does "reloaded" mean? Does it mean that each engine is terminated and then the engine binaries are restarted?
Exactly
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by corres »

brianr wrote: Wed Jul 24, 2019 4:26 pm There is a clear difference in match results with and without the restart=on option in cutechess-cli.
...
Obviously we can run consecutive games on a Fritz GUI with restart(that is "Ucinewgame") = OFF only.
This has two consequences: some benefit against AB engines, and gradually growing memory usage by Leela.
Conversely, with restart = ON Leela loses her benefit, not only from the stored information but also because of her long initialization, though she saves us a lot of (virtual) memory.
Nevertheless, a test run with restart = ON is closer to a single game than a test run with restart = OFF.

Note
Cutechess-cli is a good tool for running matches, but when we use it we also need a full chess GUI to analyze the games it produces.
Moreover, setting its parameters is time-consuming.
OneTrickPony
Posts: 157
Joined: Tue Apr 30, 2013 1:29 am

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by OneTrickPony »

As to the Jhorthos 320x24b net, I think it's the strongest. No formal testing, but I run a lot of analysis and it doesn't have as many blind spots as the official one does.
As to the ever-growing memory consumption: I have no idea about a memory leak, but ever-growing RAM usage in infinite analysis mode is a feature of MCTS. My friend actually implemented a RAM limit in Leela Go and I think it was accepted as an official patch. It would be nice to have something like this in Lc0. As it is, I am always stressed when I am late getting home to stop the analysis before it slows the computer to a crawl (page file) or just crashes it. The way it works is that the program estimates how many nodes the MCTS tree can have, and the tree doesn't grow beyond that point.
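
The node-cap idea can be sketched roughly as follows, in Python. This is not the Leela Go patch or anything from Lc0: the function names, the bytes-per-node figure and the reserved amount are all hypothetical placeholders, and a real engine would have to measure its own per-node cost.

```python
def max_tree_nodes(ram_limit_mb: int, bytes_per_node: int = 250, reserved_mb: int = 1000) -> int:
    """Estimate how many MCTS nodes fit into a RAM budget.

    bytes_per_node and reserved_mb (network weights, caches, etc.) are
    hypothetical placeholder values, not measurements from any engine.
    """
    usable_bytes = max(0, ram_limit_mb - reserved_mb) * 1024 * 1024
    return usable_bytes // bytes_per_node

def can_expand(current_nodes: int, node_cap: int) -> bool:
    """The search may add a new node only while the tree is below the cap."""
    return current_nodes < node_cap

cap = max_tree_nodes(ram_limit_mb=8000)
print(f"node cap for an 8000 MB limit: {cap}")
print(can_expand(current_nodes=cap - 1, node_cap=cap))  # still room for one more node
print(can_expand(current_nodes=cap, node_cap=cap))      # cap reached, stop growing
```

Once the cap is reached, the search can keep visiting and updating existing nodes, so infinite analysis continues without the tree pushing the machine into the page file.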
crem
Posts: 177
Joined: Wed May 23, 2018 9:29 pm

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by crem »

corres wrote: Wed Jul 24, 2019 7:31 pm
Obviously we can run consecutive games on a Fritz GUI with restart(that is "Ucinewgame") = OFF only.
If there's no ucinewgame, the NN cache is not cleared, but a tree is still reset.
It's not a problem and doesn't cause unbounded memory usage growth as cache size is limited (and it fills up pretty quickly anyway).
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by corres »

crem wrote: Wed Jul 24, 2019 9:00 pm
corres wrote: Wed Jul 24, 2019 7:31 pm Obviously we can run consecutive games on a Fritz GUI with restart(that is "Ucinewgame") = OFF only.
If there's no ucinewgame, the NN cache is not cleared, but a tree is still reset.
It's not a problem and doesn't cause unbounded memory usage growth as cache size is limited (and it fills up pretty quickly anyway).
But how does Leela know that she should reset the tree?
Only the GUI knows when a game ends and a new game starts.
So without the Ucinewgame command the tree is not reset.
When there were only AB engines, the lack of the Ucinewgame command did not cause any issue.
Maybe if Leela becomes a ChessBase engine (?) they will solve the problem.

Note
As far as I can see, Leela can receive and use the Ucinewgame command if she gets it from the GUI.
crem
Posts: 177
Joined: Wed May 23, 2018 9:29 pm

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by crem »

corres wrote: Wed Jul 24, 2019 11:17 pm

But how does Leela know that she should reset the tree?
Only the GUI knows when a game ends and the new game starts.
So without the command of Ucinewgame the tree is not reset.
Before every move, a position is sent to an engine, so that it knows what to think about. If it's a startpos (or other position unrelated to previous search), the tree is reset.
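
The rule described above can be sketched in a few lines of Python. This is an illustration only, not Lc0's actual code: the function name and the representation of a position as a start position plus a move list are assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

# (start FEN or the literal "startpos", moves played so far in UCI notation)
Position = Tuple[str, Sequence[str]]

def should_reset_tree(prev: Optional[Position], new: Position) -> bool:
    """Return True if the tree built for the previous position cannot be reused."""
    if prev is None:
        return True  # nothing has been searched yet
    prev_start, prev_moves = prev
    new_start, new_moves = new
    if prev_start != new_start:
        return True  # a different starting position is unrelated to the old search
    if len(new_moves) < len(prev_moves):
        return True  # shorter move list, e.g. a new game from startpos
    # Reusable only if the new move list continues the old one.
    return list(new_moves[:len(prev_moves)]) != list(prev_moves)

last = ("startpos", ["e2e4", "e7e5", "g1f3"])
print(should_reset_tree(last, ("startpos", ["e2e4", "e7e5", "g1f3", "b8c6", "f1b5"])))  # False
print(should_reset_tree(last, ("startpos", [])))  # True
```

In this sketch a new game is detected from the position command alone, without ucinewgame: the bare start position does not continue the move list of the game that just ended.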
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by corres »

crem wrote: Wed Jul 24, 2019 11:26 pm
corres wrote: Wed Jul 24, 2019 11:17 pm But how does Leela know that she should reset the tree?
Only the GUI knows when a game ends and the new game starts.
So without the command of Ucinewgame the tree is not reset.
Before every move, a position is sent to an engine, so that it knows what to think about. If it's a startpos (or other position unrelated to previous search), the tree is reset.
I am sorry, but I did not see any reset effect when I ran consecutive games from the start position between Leela and Stockfish.
Only the Ucinewgame command resets Leela and brings the virtual memory back down to (near) its level after initialization. On my system this is about 11 GB. From this level the virtual memory in use reached about 14.5 GB within 20 minutes without any reset, although in the meantime three games ended and new ones started.
I do not know of any engine that can compare consecutive positions in order to reset itself.