Is the 320x24b larger net the strongest around for RTX GPU?

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Is the 320x24b larger net the strongest around for RTX GPU?

Post by Laskos »

Jhorthos is building and training nets both larger and smaller than the default 256x20b T40 nets; they can be downloaded here:

https://github.com/jhorthos/lczero-trai ... a-Training

Across several test suites, which in combination have been a fairly faithful proxy for the strength of different Lc0 nets at longer time per position, the latest 320x24b net from that page (320x24.J13-swa-410000) comes out as the best net at time controls longer than blitz. Maybe someone can devote some time to playing, say, 50 games with that net against some of the latest T40 nets at, say, 30 minutes + 15 seconds TC on an RTX card or several of them. The openings should preferably be a bit unbalanced, to avoid draw rates of 90%+.

The test results of that larger net on the suites at a longer time per position (30 seconds) are pretty amazing, and better than anything I have seen from Leela (T30 and T40).

4 i7 fast cores
RTX 2070 GPU
Leela on 2 threads, cache = 1,000,000

My own positional test-suite "Openings200revised":

1s / position
Lc0 42810 : 152/200
Lc0 320x24b-410: 147/200

30s / position
Lc0 42810 : 157/200
Lc0 320x24b-410: 163/200

At longer time per position, the big net surpasses the T40 nets on this positional test suite, setting a record result for this suite at 30s/position.


Tactical "Arasan21beta":

1s / position
Lc0 42810 : 100/199
Lc0 320x24b-410: 93/199

30s / position
Lc0 42810 : 129/199
Lc0 320x24b-410: 133/199

To my surprise, the big net also surpasses the T40 nets on this tactical suite at 30s/position.

Also, the JH 320x24b big nets are progressing steadily; for example, the latest "410" big net performs significantly better on both these test suites than the "370" big net from the same site. Needless to say, the big net seems to scale significantly better to longer TC than the T40 nets. The big net might well also be the best for analyzing positions.
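
The scaling claim can be read straight off the suite results above. A minimal Python sketch that only restates the quoted scores and computes the gain from 1 s to 30 s per position (the dictionary layout and the function name are just illustration, nothing Lc0-specific):

```python
# Suite scores quoted in this post: positions solved at 1 s and at 30 s per position.
SCORES = {
    "Openings200revised": {
        "Lc0 42810": (152, 157),
        "Lc0 320x24b-410": (147, 163),
    },
    "Arasan21beta": {
        "Lc0 42810": (100, 129),
        "Lc0 320x24b-410": (93, 133),
    },
}


def gain(suite: str, net: str) -> int:
    """Extra positions solved when going from 1 s to 30 s per position."""
    at_1s, at_30s = SCORES[suite][net]
    return at_30s - at_1s


for suite, nets in SCORES.items():
    for net in nets:
        print(f"{suite:20s} {net:16s} +{gain(suite, net)}")
```

On these numbers the big net gains 16 and 40 positions, against 5 and 29 for 42810. This says nothing about statistical significance, only about the direction of the trend.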

Yes, it's a bit naive to rely on test suites to judge strength at longer TC, but my experience with Lc0 shows that a combination of a positional and a tactical suite has been a pretty faithful indicator of strength at otherwise inaccessible time controls. Again, maybe someone would be curious enough to play actual games at these longer TC.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by corres »

There is a basic problem with running test matches at TCs longer than blitz: you need hardware with a huge amount of RAM.
Leela has a very unpleasant behavior: her memory usage grows with the playing time.
When you use Leela for infinite analysis this is natural, but if you run a consecutive series of games (an engine-engine match) Leela also claims more and more memory, nearly without limit. Because of this, even for a simple RTX 20xx GPU you need at least 256 GB of RAM. Leela uses virtual memory (RAM + pagefile), but it is an open question how much using the pagefile slows Leela down.
Until the developers of Leela can (or want to) solve this issue, competitions at long TC can be run only by the few people who have GPU hardware close to what TCEC uses.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by mwyoung »

corres wrote: Tue Jul 23, 2019 12:03 pm There is a basic problem to make test match with longer TC than blitz: You need a hardware with huge RAM.
Leela has a very unpleasant behavior namely her memory usage enhances together with the playing time.
When you use Leela for infinite analysis it is a natural phenomenon but if you make a consecutive
line of games (engine - engine match) Leela also claims more and more memory nearly without any barrier. Because of this even for a simple RTX 20xx GPU you need at lest 256 GB RAM. Leela uses virtual memory (RAM + pagefile) but it is a question using of pagefile how measure Leela makes slower.
Until the developers of Leela can not (or do not want to) solve this issue competitions with long TC can be made by only some peoples who have near TCEC hardware of GPU.
I have not seen any issues playing LTC or VLTC with Lc0. I have played matches with time controls as long as game in 12 hours. Looking at the system monitor, I did not see Lc0 using more RAM than I gave it to use.

I have also played engine matches with Lc0 running for a week without issues.
I have even played Lc0 against Lc0 to test two Lc0 nets against each other, and I have never seen this problem.

Could this be a GUI issue?
Last edited by mwyoung on Tue Jul 23, 2019 1:02 pm, edited 1 time in total.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by Laskos »

corres wrote: Tue Jul 23, 2019 12:03 pm There is a basic problem to make test match with longer TC than blitz: You need a hardware with huge RAM.
Leela has a very unpleasant behavior namely her memory usage enhances together with the playing time.
When you use Leela for infinite analysis it is a natural phenomenon but if you make a consecutive
line of games (engine - engine match) Leela also claims more and more memory nearly without any barrier. Because of this even for a simple RTX 20xx GPU you need at lest 256 GB RAM. Leela uses virtual memory (RAM + pagefile) but it is a question using of pagefile how measure Leela makes slower.
Until the developers of Leela can not (or do not want to) solve this issue competitions with long TC can be made by only some peoples who have near TCEC hardware of GPU.
The RAM consumption of Lc0 with fast GPUs is large and not easy to control, but I would bet that 32 GB of total RAM would suffice to play games at 30 min + 15 sec between two Lc0 instances on an RTX 2080, provided the engines are restarted after each game and nncache is set to, say, 1-2 million. I did play games at 5 min + 3 sec, and the two Leelas consumed at most about 5 GB of RAM, game after game.
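
A rough sketch of such a setup, assuming cutechess-cli is the match runner (any GUI or runner with a restart-engine setting would do). The Python below only assembles and prints the command line. The `lc0` binary location, the weights file names and the engine names are placeholders, and the option spellings (`restart=on`, `option.NNCacheSize`, `--weights`) should be checked against the installed cutechess-cli and Lc0 versions:

```python
import shlex


def engine_args(name, weights, nncache=2_000_000):
    """Arguments for one Lc0 engine entry (binary and weights paths are placeholders)."""
    return [
        "-engine",
        "cmd=lc0",                        # assumed to be on PATH
        f"name={name}",
        f"arg=--weights={weights}",       # hypothetical weights file
        f"option.NNCacheSize={nncache}",  # 1-2 million, as suggested above
        "restart=on",                     # restart the engine after every game
    ]


def match_command(net_a, net_b, games=50, base_seconds=1800, increment=15):
    """Build a cutechess-cli command for a 30 min + 15 sec match between two nets."""
    cmd = ["cutechess-cli"]
    cmd += engine_args("bignet", net_a)
    cmd += engine_args("t40", net_b)
    cmd += ["-each", "proto=uci", f"tc={base_seconds}+{increment}"]
    cmd += ["-games", str(games)]
    return cmd


# Placeholder file names; substitute the real downloaded weights.
print(shlex.join(match_command("bignet-weights.pb.gz", "t40-weights.pb.gz")))
```

An opening book with slightly unbalanced positions would still have to be added to the command; it is left out here because the file name and format depend on what is available locally.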
Hai
Posts: 598
Joined: Sun Aug 04, 2013 1:19 pm

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by Hai »

I can do the tests, but I'm doing it with CB, so I need .cbh and not epd or pgn.
I also need a link where I can download Openings200revised and Arasan21beta. Other suites are welcome too.
Hai
Posts: 598
Joined: Sun Aug 04, 2013 1:19 pm

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by Hai »

Laskos wrote: Tue Jul 23, 2019 1:00 pm
corres wrote: Tue Jul 23, 2019 12:03 pm There is a basic problem to make test match with longer TC than blitz: You need a hardware with huge RAM.
Leela has a very unpleasant behavior namely her memory usage enhances together with the playing time.
When you use Leela for infinite analysis it is a natural phenomenon but if you make a consecutive
line of games (engine - engine match) Leela also claims more and more memory nearly without any barrier. Because of this even for a simple RTX 20xx GPU you need at lest 256 GB RAM. Leela uses virtual memory (RAM + pagefile) but it is a question using of pagefile how measure Leela makes slower.
Until the developers of Leela can not (or do not want to) solve this issue competitions with long TC can be made by only some peoples who have near TCEC hardware of GPU.
The RAM consumption of Lc0 with fast GPUs is large and not easy to control, but I would bet 32 GB total RAM would suffice to play games in 30 mins + 15 sec between 2 Lc0 on an RTX 2080. Restarting the engines after each game and setting nncache to say 1-2 million. I did play games in 5 min + 3 sec, and the 2 Leelas consumed about max 5 GB of RAM game after game.
I need, when using 2xRTX 2080 Ti GPUs, 64 GB RAM for 1 hour.
=640 GB for 10 hours
=1280 GB for 20 hours.
=1536 GB for 24 hours.
=Threadripper with 2 TB RAM makes sense :mrgreen:.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by corres »

Laskos wrote: Tue Jul 23, 2019 1:00 pm
corres wrote: Tue Jul 23, 2019 12:03 pm There is a basic problem to make test match with longer TC than blitz: You need a hardware with huge RAM.
Leela has a very unpleasant behavior namely her memory usage enhances together with the playing time.
When you use Leela for infinite analysis it is a natural phenomenon but if you make a consecutive
line of games (engine - engine match) Leela also claims more and more memory nearly without any barrier. Because of this even for a simple RTX 20xx GPU you need at lest 256 GB RAM. Leela uses virtual memory (RAM + pagefile) but it is a question using of pagefile how measure Leela makes slower.
Until the developers of Leela can not (or do not want to) solve this issue competitions with long TC can be made by only some peoples who have near TCEC hardware of GPU.
The RAM consumption of Lc0 with fast GPUs is large and not easy to control, but I would bet 32 GB total RAM would suffice to play games in 30 mins + 15 sec between 2 Lc0 on an RTX 2080. Restarting the engines after each game and setting nncache to say 1-2 million. I did play games in 5 min + 3 sec, and the 2 Leelas consumed about max 5 GB of RAM game after game.
Well!
Restarting the engine after each (short) game is a forced workaround for this issue.
But even then, I can run a game between two NN engines on my DUAL system only if it is shorter than half an hour.
And I am talking about "consecutive games" because that is the natural way to run a match.
I think this issue should be solved in software and not by a manual method.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by corres »

Hai wrote: Tue Jul 23, 2019 1:27 pm ...
I need, when using 2xRTX 2080 Ti GPUs, 64 GB RAM for 1 hour.
=640 GB for 10 hours
=1280 GB for 20 hours.
=1536 GB for 24 hours.
=Threadripper with 2 TB RAM makes sense :mrgreen:.
This is the sad reality...
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by corres »

mwyoung wrote: Tue Jul 23, 2019 12:47 pm
corres wrote: Tue Jul 23, 2019 12:03 pm There is a basic problem to make test match with longer TC than blitz: You need a hardware with huge RAM.
Leela has a very unpleasant behavior namely her memory usage enhances together with the playing time.
When you use Leela for infinite analysis it is a natural phenomenon but if you make a consecutive
line of games (engine - engine match) Leela also claims more and more memory nearly without any barrier. Because of this even for a simple RTX 20xx GPU you need at lest 256 GB RAM. Leela uses virtual memory (RAM + pagefile) but it is a question using of pagefile how measure Leela makes slower.
Until the developers of Leela can not (or do not want to) solve this issue competitions with long TC can be made by only some peoples who have near TCEC hardware of GPU.
I have not seen any issues playing LTC or VLTC with Lc0. I have played matches with time controls as long as Game in 12 hours. I did not see Lc0 using more ram. Than I gave it to use when looking at the system monitor.
I have also played engine matches with Lc0 running for a week without issues.
I have even played Lc0 against Lc0 testing 2 NN of Lc0. And I have never seen this problem.
Could this be a GUI issue?
I know nothing about your hardware and tests.
There is no GUI issue here.
crem
Posts: 177
Joined: Wed May 23, 2018 9:29 pm

Re: Is the 320x24b larger net the strongest around for RTX GPU?

Post by crem »

corres wrote: Tue Jul 23, 2019 4:17 pm
Hai wrote: Tue Jul 23, 2019 1:27 pm ...
I need, when using 2xRTX 2080 Ti GPUs, 64 GB RAM for 1 hour.
=640 GB for 10 hours
=1280 GB for 20 hours.
=1536 GB for 24 hours.
=Threadripper with 2 TB RAM makes sense :mrgreen:.
This is the sad reality...
Those numbers are about right, but note that this is the memory consumption per single search session (i.e. per move). When a new move starts, most of the memory is freed up for the next search.
So if the engine spends 1 hour on a single move, it can indeed require 64 GB of RAM.

But that is not true for a 1-hour game. A 1-hour game is usually about 1 minute per move, so memory usage will be about 1 GB even with 2x 2080 Ti GPUs.
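
The arithmetic behind that estimate, as a small Python sketch. It assumes memory grows linearly with search time at Hai's quoted 64 GB per hour and is released between moves; the function and constant names are only for illustration, and real usage depends on the net, GPU speed and settings:

```python
GB_PER_SEARCH_HOUR = 64.0  # Hai's figure for 2x RTX 2080 Ti, taken as given


def search_memory_gb(seconds_per_move, gb_per_hour=GB_PER_SEARCH_HOUR):
    """Estimated peak memory for one search of the given length, assuming linear growth."""
    return gb_per_hour * seconds_per_move / 3600.0


for label, seconds in [
    ("1 minute per move", 60),
    ("1 hour on one move", 3600),
    ("24 hours on one move", 86400),
]:
    print(f"{label:22s} ~{search_memory_gb(seconds):.1f} GB")
```

At 60 seconds per move this gives a little over 1 GB, which matches the figure above; the 64 GB and 1536 GB figures only apply when a single search runs for 1 hour or 24 hours.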