20x256 Leela seemed quite strong

shrapnel
Posts: 1339
Joined: Fri Nov 02, 2012 9:43 am
Location: New Delhi, India

Re: 20x256 Leela seemed quite strong

Post by shrapnel »

Nay Lin Tun wrote: Fri May 25, 2018 1:43 pm This leela seemed strong, 2 games played , 2 games draw vs stockfish is amazing. :D

Please test this leela.
https://drive.google.com/uc?id=11ueg-md ... t=download
This Leela, 20x256, seemed quite strong.
I played two games of 3+2 min blitz with cuDNN Leela (21st May, 3rd version) on my 1060 against the latest full-strength Stockfish on 4 cores of an i5 at 3.0 GHz, and both games were drawn. Opening book: perfect chess, 2017, 8 moves.


I will test more games later.

Nay

P.S. The average speed of Leela is around 2-2.5 kn/s and the average speed of Stockfish is around 5 Mn/s.
https://lichess.org/UQytNWdL

https://lichess.org/IlhlNsul
Is there a webpage/URL where one can keep downloading the latest versions ?
i7 5960X @ 4.1 Ghz, 64 GB G.Skill RipJaws RAM, Twin Asus ROG Strix OC 11 GB Geforce 2080 Tis
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: 20x256 Leela seemed quite strong

Post by Laskos »

Albert Silver wrote: Fri May 25, 2018 6:31 pm
Laskos wrote: Fri May 25, 2018 6:03 pm
Nay Lin Tun wrote: Fri May 25, 2018 1:43 pm This leela seemed strong, 2 games played , 2 games draw vs stockfish is amazing. :D

Please test this leela.
https://drive.google.com/uc?id=11ueg-md ... t=download
This Leela, 20x256, seemed quite strong.
I played two games of 3+2 min blitz with cuDNN Leela (21st May, 3rd version) on my 1060 against the latest full-strength Stockfish on 4 cores of an i5 at 3.0 GHz, and both games were drawn. Opening book: perfect chess, 2017, 8 moves.


I will test more games later.

Nay

P.S. The average speed of Leela is around 2-2.5 kn/s and the average speed of Stockfish is around 5 Mn/s.
https://lichess.org/UQytNWdL

https://lichess.org/IlhlNsul
Well, it doesn't seem significantly stronger than 192x15 NN338 in my gauntlet, maybe weaker. It also performs a bit worse at 10s/position in both tactical and positional test suites. Probably due to 2x slower speed.

Gauntlet (will continue to 60 games each Lc0) with the latest cuDNN Lc0 on GTX 1060 GPU, latest 256x20 net, and NN338 192x15 net, against Houdini 1.5a on one 3.8 GHz core of i7. Time control: 1s/move. Openings: balanced short 3-mover from GM games.

Code:

Games Completed = 50 of 120 (Avg game length = 116.917 sec)
Settings = Gauntlet/64MB/1000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817)
Time = 6207 sec elapsed, 8689 sec remaining
 1.  Houdini 1.5a             	28.5/50	23-16-11  	(L: m=16 t=0 i=0 a=0)	(D: r=9 i=1 f=0 s=0 a=1)	(tpm=1001.3 d=19.47 nps=3639708)
 
 2.  Lc0 NN338                	12.0/25	10-11-4  	(L: m=11 t=0 i=0 a=0)	(D: r=4 i=0 f=0 s=0 a=0)	(tpm=753.0 d=3.27 nps=5003)
 3.  Lc0 NN  256x20            	 9.5/25	6-12-7  	(L: m=12 t=0 i=0 a=0)	(D: r=5 i=1 f=0 s=0 a=1)	(tpm=777.8 d=2.07 nps=2505)
I will report later on the final result. Keep in mind that Lc0 scales better than standard AB engines, so it is stronger at LTC.
The author pointed out this did not include the buggy game removal yet, though it is a test he plans to make.
There is also an alchemy of UCI settings one can fiddle with a lot; I just used the CLOP settings. Also, this was at a short time control of 1s/move. NN338 got 28.5/60 against Houdini 1.5a, while this 256x20 net got 18.0/60, significantly worse. I would be curious how it scales and what better settings for it would be. Anil, at longer TC with my extreme settings, got a win over Komodo 11.2, which is not an easy thing (draws are occurring quite often by now).
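A rough way to check how far apart the two 60-game scores are is a z-score on the score fractions. This is only a sketch under assumptions not stated in the post: the two gauntlets are treated as independent samples, and the per-game variance is taken as p*(1-p), which overestimates the true variance whenever there are draws, so the resulting z is on the cautious side. The function name score_gap_z is made up for illustration.

```python
import math

def score_gap_z(points_a, points_b, games):
    """z-score for the gap between two match scores over `games` games each.

    Assumption: per-game variance is approximated by p * (1 - p).
    With draws the true variance is smaller, so this understates z.
    """
    pa = points_a / games
    pb = points_b / games
    variance = pa * (1 - pa) / games + pb * (1 - pb) / games
    return (pa - pb) / math.sqrt(variance)

# Scores quoted in the post: NN338 28.5/60, 256x20 net 18.0/60.
z = score_gap_z(28.5, 18.0, 60)
print(f"z = {z:.2f}")
```

With the quoted scores this gives a gap of about two standard errors, even under the cautious variance estimate.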
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: 20x256 Leela seemed quite strong

Post by Albert Silver »

Laskos wrote: Fri May 25, 2018 9:20 pm
Albert Silver wrote: Fri May 25, 2018 6:31 pm
Laskos wrote: Fri May 25, 2018 6:03 pm
Well, it doesn't seem significantly stronger than 192x15 NN338 in my gauntlet, maybe weaker. It also performs a bit worse at 10s/position in both tactical and positional test suites. Probably due to 2x slower speed.

Gauntlet (will continue to 60 games each Lc0) with the latest cuDNN Lc0 on GTX 1060 GPU, latest 256x20 net, and NN338 192x15 net, against Houdini 1.5a on one 3.8 GHz core of i7. Time control: 1s/move. Openings: balanced short 3-mover from GM games.

Code:

Games Completed = 50 of 120 (Avg game length = 116.917 sec)
Settings = Gauntlet/64MB/1000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817)
Time = 6207 sec elapsed, 8689 sec remaining
 1.  Houdini 1.5a             	28.5/50	23-16-11  	(L: m=16 t=0 i=0 a=0)	(D: r=9 i=1 f=0 s=0 a=1)	(tpm=1001.3 d=19.47 nps=3639708)
 
 2.  Lc0 NN338                	12.0/25	10-11-4  	(L: m=11 t=0 i=0 a=0)	(D: r=4 i=0 f=0 s=0 a=0)	(tpm=753.0 d=3.27 nps=5003)
 3.  Lc0 NN  256x20            	 9.5/25	6-12-7  	(L: m=12 t=0 i=0 a=0)	(D: r=5 i=1 f=0 s=0 a=1)	(tpm=777.8 d=2.07 nps=2505)
I will report later on the final result. Keep in mind that Lc0 scales better than standard AB engines, so it is stronger at LTC.
The author pointed out this did not include the buggy game removal yet, though it is a test he plans to make.
There is also an alchemy of UCI settings one can fiddle with a lot; I just used the CLOP settings. Also, this was at a short time control of 1s/move. NN338 got 28.5/60 against Houdini 1.5a, while this 256x20 net got 18.0/60, significantly worse. I would be curious how it scales and what better settings for it would be. Anil, at longer TC with my extreme settings, got a win over Komodo 11.2, which is not an easy thing (draws are occurring quite often by now).
I think the CLOP settings were lacking games and had some issues. I have had detailed discussions with the originator, and together we uncovered a few issues, meaning it likely needs to be redone and fine-tuned. The most outstanding one is that it was running the exact same opening position 400 times.

Another guy posted his own results, declaring them final and conclusive, and said there was no arguing with them, aside from the small problem of tons of games being lost on time by Leela in his test. Apparently a mere detail not worth quibbling about. :-)

After ample help by CCC members in the programming forum, I got CLOP running here, debugged, and am doing my own testing, for only those two parameters, FPU and PUCT. Will take a while before I get anything significant though, as I am testing 1m+1s, so patience is advised.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: 20x256 Leela seemed quite strong

Post by jp »

Albert Silver wrote: Fri May 25, 2018 2:58 pm
Jhoravi wrote: Fri May 25, 2018 2:26 pm You mean it got stronger without additional training? How come?
Larger net, no oversampling, removal of a few million buggy games.
Have they specified which games were thrown out for this net?
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: 20x256 Leela seemed quite strong

Post by Albert Silver »

jp wrote: Fri May 25, 2018 10:03 pm
Albert Silver wrote: Fri May 25, 2018 2:58 pm
Jhoravi wrote: Fri May 25, 2018 2:26 pm You mean it got stronger without additional training? How come?
Larger net, no oversampling, removal of a few million buggy games.
Have they specified which games were thrown out for this net?
It is not 'they'. It is a user who built it himself. Anyone can build an NN using the training games as they see fit. There are a number of oddball variations floating around. In any case, the buggy games are still there he said. He plans to do this, but is waiting for more material.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: 20x256 Leela seemed quite strong

Post by jp »

Albert Silver wrote: Fri May 25, 2018 10:46 pm It is not 'they'. It is a user who built it himself. Anyone can build an NN using the training games as they see fit. There are a number of oddball variations floating around. In any case, the buggy games are still there he said. He plans to do this, but is waiting for more material.
OK, thanks. I thought this was an "official" NN, though I couldn't find it on the "official" page.
I think it's a good idea to get rid of buggy games, though I'm not sure how 'they' or 'we' or 'he' would decide which.

We don't even know what bugs are still there.
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: 20x256 Leela seemed quite strong

Post by Albert Silver »

jp wrote: Fri May 25, 2018 11:08 pm
Albert Silver wrote: Fri May 25, 2018 10:46 pm It is not 'they'. It is a user who built it himself. Anyone can build an NN using the training games as they see fit. There are a number of oddball variations floating around. In any case, the buggy games are still there he said. He plans to do this, but is waiting for more material.
OK, thanks. I thought this was an "official" NN, though I couldn't find it on the "official" page.
I think it's a good idea to get rid of buggy games, though I'm not sure how 'they' or 'we' or 'he' would decide which.

We don't even know what bugs are still there.
Well, buggy would be any games that led to the known slump, and the plan would be to remove those games. The easiest way to measure is by looking at the value net and seeing when it began to visibly worsen. This is best seen in the following spreadsheet in column 'v':

https://docs.google.com/spreadsheets/d/ ... li=1#gid=0

You can see a massive drop as of NN241, which would be a logical place to start the cut. It only begins to show clear recovery as of NN316, and the strength of the value net is only on par with NN241 as of NN331. Since this was very recent, it makes sense to trim the games in between and then see whether a net using only games from around NN316 onward, added to all games from NN241 and before, leads to an overall stronger net. However, NN316 was so recent that there are not really enough games, and if NN331 is the cutting point, even fewer. So the idea is to allow a bit of time for more data before trying this.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
shrapnel
Posts: 1339
Joined: Fri Nov 02, 2012 9:43 am
Location: New Delhi, India

Re: 20x256 Leela seemed quite strong

Post by shrapnel »

Albert Silver wrote: Fri May 25, 2018 10:46 pm It is not 'they'. It is a user who built it himself.
Oh ! Does that mean we will see very few 20x256 Networks ?
I really think that's the way to go, if one really wants to strengthen lc0.
15x192 doesn't really seem to be cutting it and seems to have hit a wall.
Is there a Tutorial on how to build such Networks ?
I'll give it the old college try, if it's not too difficult and time-consuming.
i7 5960X @ 4.1 Ghz, 64 GB G.Skill RipJaws RAM, Twin Asus ROG Strix OC 11 GB Geforce 2080 Tis
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: 20x256 Leela seemed quite strong

Post by Laskos »

Albert Silver wrote: Fri May 25, 2018 9:37 pm
Laskos wrote: Fri May 25, 2018 9:20 pm
Albert Silver wrote: Fri May 25, 2018 6:31 pm

The author pointed out this did not include the buggy game removal yet, though it is a test he plans to make.
There is also an alchemy of UCI settings one can fiddle with a lot; I just used the CLOP settings. Also, this was at a short time control of 1s/move. NN338 got 28.5/60 against Houdini 1.5a, while this 256x20 net got 18.0/60, significantly worse. I would be curious how it scales and what better settings for it would be. Anil, at longer TC with my extreme settings, got a win over Komodo 11.2, which is not an easy thing (draws are occurring quite often by now).
I think the CLOP settings were lacking games and had some issues. I have had detailed discussions with the originator, and together we uncovered a few issues, meaning it likely needs to be redone and fine-tuned. The most outstanding one is that it was running the exact same opening position 400 times.

Another guy posted his own results, declaring them final and conclusive, and said there was no arguing with them, aside from the small problem of tons of games being lost on time by Leela in his test. Apparently a mere detail not worth quibbling about. :-)

After ample help by CCC members in the programming forum, I got CLOP running here, debugged, and am doing my own testing, for only those two parameters, FPU and PUCT. Will take a while before I get anything significant though, as I am testing 1m+1s, so patience is advised.
I lost half a day fiddling with UCI settings to optimize them on tactics (your WAC200 and my ECM64 suites), and I managed to improve the performance on these suites dramatically with these "extreme" tactical settings:

Code:

Scale thinking time=2.8
Cpuct MCTS option=15.0
First Play Urgency Reduction=-0.30
Virtual loss bug=60.0
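For anyone wanting to apply these by hand rather than through a GUI, the list above can be turned into standard UCI "setoption name <id> value <x>" lines. This is only a sketch: the option names are copied verbatim from the listing, and whether a given Lc0 build exposes exactly these names is not checked here. The helper setoption_commands and the dictionary name are made up for illustration.

```python
def setoption_commands(options):
    """Build UCI 'setoption' lines from a mapping of option name -> value."""
    return [f"setoption name {name} value {value}"
            for name, value in options.items()]

# Option names and values as posted; kept as strings so they are sent unchanged.
EXTREME_SETTINGS = {
    "Scale thinking time": "2.8",
    "Cpuct MCTS option": "15.0",
    "First Play Urgency Reduction": "-0.30",
    "Virtual loss bug": "60.0",
}

commands = setoption_commands(EXTREME_SETTINGS)
for line in commands:
    print(line)
```

Each printed line would be written to the engine's standard input after the "uci" handshake and before the search starts.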
For example, for NN342 on GTX 1060 with 10s/position, on your WAC 200 I have the following results:

Lc0 default NN342 solved:
100/200

Lc0 "extreme" settings NN342 solved:
153/200

A very large tactical improvement, but a bit worse positionally. I am now running a match of 20 games at 5 minutes + 5 seconds increment between NN342 Lc0 with "extreme" settings and SF dev on 4 i7 cores (about 3600 CCRL 40/4' Elo points), and the result seems very good as of now: 6 draws and 7 wins for Stockfish, or a 210 Elo point difference. That would give Lc0 (NN342) a rating of almost 3400 CCRL 40/4' Elo points, unprecedented on my hardware and set-up. The sample is still very small, and it will remain small with only 20 games, but my "extreme" settings are at least not disastrous for game-play. The openings are balanced 3-mover GM openings, each also played with sides reversed.
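The 210-point figure can be reproduced with the usual logistic Elo formula applied to the score fraction. A minimal sketch follows; the function name elo_diff is made up for illustration, and the result is seen from Lc0's side, so it comes out negative.

```python
import math

def elo_diff(wins, draws, losses):
    """Elo difference implied by a match result, using the logistic model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is undefined for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))

# Lc0's side of the match so far: 0 wins, 6 draws, 7 losses.
diff = elo_diff(0, 6, 7)
print(f"Elo difference: {diff:.0f}")
```

Taking the stated 3600 for Stockfish and subtracting roughly 210 gives the "almost 3400" estimate; with 13 games the error bars on that number are of course very wide.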
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: 20x256 Leela seemed quite strong

Post by Albert Silver »

Laskos wrote: Sat May 26, 2018 9:24 pm
Albert Silver wrote: Fri May 25, 2018 9:37 pm
Laskos wrote: Fri May 25, 2018 9:20 pm

There is also an alchemy of UCI settings one can fiddle with a lot; I just used the CLOP settings. Also, this was at a short time control of 1s/move. NN338 got 28.5/60 against Houdini 1.5a, while this 256x20 net got 18.0/60, significantly worse. I would be curious how it scales and what better settings for it would be. Anil, at longer TC with my extreme settings, got a win over Komodo 11.2, which is not an easy thing (draws are occurring quite often by now).
I think the CLOP settings were lacking games and had some issues. I have had detailed discussions with the originator, and together we uncovered a few issues, meaning it likely needs to be redone and fine-tuned. The most outstanding one is that it was running the exact same opening position 400 times.

Another guy posted his own results, declaring them final and conclusive, and said there was no arguing with them, aside from the small problem of tons of games being lost on time by Leela in his test. Apparently a mere detail not worth quibbling about. :-)

After ample help by CCC members in the programming forum, I got CLOP running here, debugged, and am doing my own testing, for only those two parameters, FPU and PUCT. Will take a while before I get anything significant though, as I am testing 1m+1s, so patience is advised.
I lost half a day fiddling with UCI settings to optimize them on tactics (your WAC200 and my ECM64 suites), and I managed to improve the performance on these suites dramatically with these "extreme" tactical settings:

Code:

Scale thinking time=2.8
Cpuct MCTS option=15.0
First Play Urgency Reduction=-0.30
Virtual loss bug=60.0
For example, for NN342 on GTX 1060 with 10s/position, on your WAC 200 I have the following results:

Lc0 default NN342 solved:
100/200

Lc0 "extreme" settings NN342 solved:
153/200

A very large tactical improvement, but a bit worse positionally. I am now running a match of 20 games at 5 minutes + 5 seconds increment between NN342 Lc0 with "extreme" settings and SF dev on 4 i7 cores (about 3600 CCRL 40/4' Elo points), and the result seems very good as of now: 6 draws and 7 wins for Stockfish, or a 210 Elo point difference. That would give Lc0 (NN342) a rating of almost 3400 CCRL 40/4' Elo points, unprecedented on my hardware and set-up. The sample is still very small, and it will remain small with only 20 games, but my "extreme" settings are at least not disastrous for game-play. The openings are balanced 3-mover GM openings, each also played with sides reversed.
Virtual loss bug should not affect tactical performance, and scale thinking time will have zero effect in tactical suites, since it deals solely with time management.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."