how will Leela fare at the WCCC?

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: how will Leela fare at the WCCC?

Post by yanquis1972 »

NNs create a problem for bookless tournaments, unless the A/B approach changes... or is SF (for example) tuned from move 1? A/B engines are no doubt very capable in the openings now, but if their tuning is done with a book, a highly trained NN will surely be superior there
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: how will Leela fare at the WCCC?

Post by Laskos »

Has anybody tested cuDNN lc0 with a good mainbranch net (I used 482) at longer time controls? We all assume that it scales well compared to regular AB engines, but I only have evidence of that from ultra-fast TC to fast blitz TC. I got a worrying result at a somewhat longer TC, 10min + 10s, but in very few games, only 10. Maybe it was bad luck, but as it stands, the LOS is 96.5% that it doesn't scale very well to longer time controls.


6s + 0.1s
Score of Arasan 20.5 vs Lc0_11jul ID482: 75 - 70 - 55 [0.512] 200
Elo difference: 8.69 +/- 41.14
Finished match
Performance 3050 CCRL 40/4'.

10 times longer TC:
60s + 1s
Score of Komodo 9.2 vs Lc0_11jul ID482: 33 - 23 - 44 [0.550] 100
Elo difference: 34.86 +/- 51.32
Finished match
Performance 3300 CCRL 40/4', an improvement of 250 Elo points at a 10 times slower time control. I expected that at another 10 times longer TC it would improve by at least 100-150 Elo points (less than before, because of diminishing returns with TC). In that case, Leela would be on par with or superior to Komodo 12.1.1 on one core. But the result is very disappointing:

10 times longer TC:
600s + 10s
Score of Komodo 12.1.1 vs Lc0_11jul ID482: 6 - 1 - 3 [0.750] 10
Elo difference: 190.85 +/- 255.97
Finished match
Performance 3225 CCRL 40/4', even a regression compared to the earlier TC. LOS = 96.5% for Komodo, which I expected to be very similar in strength at this longer TC. Maybe I will leave it running overnight for another 10 games; if the result is weak again, it is probably not bad luck.

I am a bit worried that Leela might scale badly in TCEC conditions. Maybe some folks have experience with Leela on longer time controls. Also, do they know whether in TCEC conditions the rig they are running will have enough RAM? Leela MCTS is very RAM hungry with fast GPUs and LTC.
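
For anyone who wants to check the "Elo difference" lines in the results above: they are consistent with the standard logistic Elo formula applied to the match score. Below is a minimal Python sketch of that arithmetic. The function names are made up for illustration and are not taken from any match-running tool; the win/loss/draw counts are the ones posted above, read as wins - losses - draws for the first-named engine.

```python
import math

def match_score(wins, losses, draws):
    """Fraction of points scored: a win counts 1, a draw counts 0.5."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games

def elo_diff(score):
    """Logistic Elo difference implied by a score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

# Results quoted in the post: wins - losses - draws for the first-named engine.
matches = {
    "Arasan 20.5 vs Lc0 ID482, 6s + 0.1s": (75, 70, 55),
    "Komodo 9.2 vs Lc0 ID482, 60s + 1s": (33, 23, 44),
    "Komodo 12.1.1 vs Lc0 ID482, 600s + 10s": (6, 1, 3),
}
for name, (w, l, d) in matches.items():
    s = match_score(w, l, d)
    print(f"{name}: score {s:.4f}, Elo difference {elo_diff(s):+.2f}")
```

The three matches come out at +8.69, +34.86 and +190.85 Elo for the first-named engine, matching the posted figures. The formula says nothing about how reliable a 10-game sample is.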
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: how will Leela fare at the WCCC?

Post by Michel »

Not sure if the result against K12.1 is really disappointing. The error bars indicate that both engines could still be equal strength...
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: how will Leela fare at the WCCC?

Post by Laskos »

Michel wrote: Fri Jul 13, 2018 9:01 pm Not sure if the result against K12.1 is really disappointing. The error bars indicate that both engines could still be equal strength...
I'm not sure about those error bars; in any case, for skewed results they should be asymmetrical. LOS with a uniform prior is easy to get, and it is 96.5%. Not very encouraging. LOS with a somewhat informed prior can be tamed a bit: with a rough prior (4 * score * (1-score))^4, which basically assumes that the difference is hardly above 200 Elo points, I get a LOS of 93.6%, still not very nice. I will maybe run another 10 games overnight; if something similar happens, it will be much clearer.
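
A rough Python sketch of the two points made here, under stated assumptions. For the LOS, it assumes draws are ignored and a uniform prior is placed on the chance that a decisive game is a win, which gives a Beta(wins + 1, losses + 1) posterior; that reproduces the 96.5% for 6 wins and 1 loss, though whether the figure was computed exactly this way is an assumption. For the error bars, it assumes a normal approximation on the score, with the bounds then mapped through the Elo formula; the approximation is shaky at 10 games, so the bounds are illustrative only. The function names are hypothetical. The informed-prior figure of 93.6% is not attempted, since the post does not spell out how the prior is applied.

```python
import math

def los_uniform_prior(wins, losses):
    """P(first engine is stronger), using decisive games only.

    With a uniform prior on p (chance that a decisive game is a win),
    the posterior is Beta(wins + 1, losses + 1). For integer counts,
    P(p > 0.5) equals a binomial tail, so no special functions are needed.
    """
    n = wins + losses + 1
    return sum(math.comb(n, k) for k in range(wins + 1)) / 2 ** n

def elo_diff(score):
    """Logistic Elo difference implied by a score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(wins, losses, draws, z=1.96):
    """Return (lower, estimate, upper) in Elo.

    Assumption: normal approximation on the mean score, with the score
    bounds converted to Elo afterwards, which makes the bounds asymmetric.
    """
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - mu) ** 2 + draws * (0.5 - mu) ** 2 + losses * mu ** 2) / n
    half = z * math.sqrt(var / n)
    lo = max(mu - half, 1e-6)        # clamp so the logarithm stays defined
    hi = min(mu + half, 1.0 - 1e-6)
    return elo_diff(lo), elo_diff(mu), elo_diff(hi)

print(f"LOS: {los_uniform_prior(6, 1):.1%}")
low, mid, high = elo_interval(6, 1, 3)
print(f"Elo {mid:+.0f}, bounds {low:+.0f} .. {high:+.0f}")
```

For the 6 - 1 - 3 result, the upper bound lies much further from the estimate than the lower one, which is the asymmetry a single symmetric "+/-" figure hides for lopsided scores.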
George Tsavdaris
Posts: 1627
Joined: Thu Mar 09, 2006 12:35 pm

Re: how will Leela fare at the WCCC?

Post by George Tsavdaris »

Laskos wrote: Fri Jul 13, 2018 8:38 pm 10 times longer TC:
600s + 10s
Score of Komodo 12.1.1 vs Lc0_11jul ID482: 6 - 1 - 3 [0.750] 10
Elo difference: 190.85 +/- 255.97
Finished match
Performance 3225 CCRL 40/4', even a regression compared to the earlier TC. LOS = 96.5% for Komodo, which I expected to be very similar in strength at this longer TC. Maybe I will leave it running overnight for another 10 games; if the result is weak again, it is probably not bad luck.
2 times shorter TC from above:
Lc0 ID462 with latest CLOPed settings for this 450-460 series after 100 games

  Program     CCRL Elo   Error(cl 95%)          Games         Score 
Lc0! ID462    3280.6       ±42.3          100 (+38,=46,-16),  61.0 %

   vs.                 :  games (  +,  =,  -),   (%) :    Diff,    SD, CFS (%)
   Fire 7.1            :     20 (  4,  7,  9),  37.5 :   -60.4,  21.6,    0.3
   Andscacs 0.93       :     20 (  7, 11,  2),  62.5 :   +73.6,  21.6,  100.0
   Gull 3              :     20 (  5, 13,  2),  57.5 :   +87.6,  21.6,  100.0
   Texel 1.07          :     20 ( 10,  8,  2),  70.0 :  +119.6,  21.6,  100.0
   Laser 1.5           :     20 ( 12,  7,  1),  77.5 :  +183.6,  21.6,  100.0
Lc0 kb3 b20 net with CLOPed settings for it after 100 games

  Program     CCRL Elo   Error(cl 95%)          Games          Score 
Lc0 kb3 b20   3333.6       ±47.2          100 (+46,=43,-11)    67.5 %

   vs.                   :  games (  +,  =,  -),   (%) :    Diff,    SD, CFS (%)
   Fire 7.1              :     20 (  4, 12,  4),  50.0 :    -7.4,  24.1,   38.0
   Andscacs 0.93         :     20 (  8,  8,  4),  60.0 :  +126.6,  24.1,  100.0
   Gull 3                :     20 (  9, 11,  0),  72.5 :  +140.6,  24.1,  100.0
   Texel 1.07            :     20 ( 13,  5,  2),  77.5 :  +172.6,  24.1,  100.0
   Laser 1.5             :     20 ( 12,  7,  1),  77.5 :  +236.6,  24.1,  100.0

Lc0 ID390 with default CLOPed settings for this after 50 games

  Program     CCRL Elo   Error(cl 95%)          Games         Score 
Lc0 ID390     3321.0      ±69.8           50 (+24,=18,-8)     66.0 %
 
   vs.                 :  games (  +,  =, -),    (%) :    Diff,    SD,   CFS (%)
   Fire 7.1            :     10 (  1,  4, 5),   30.0 :   -20.0,   35.6,   28.7
   Andscacs 0.93       :     10 (  7,  3, 0),   85.0 :  +114.0,   35.6,   99.9
   Gull 3              :     10 (  4,  4, 2),   60.0 :  +128.0,   35.6,  100.0
   Texel 1.07          :     10 (  8,  2, 0),   90.0 :  +160.0,   35.6,  100.0
   Laser 1.5           :     10 (  4,  5, 1),   65.0 :  +224.0,   35.6,  100.0
Conditions:
TC: 5'+5"
Ponder off
1 CPU for all AB engines on an i5-6500 3.4 GHz, and an Nvidia GTX 1070 Ti for Leela (Lc0).
My system with 1 CPU is 2.1 times faster than 1 CPU of CCRL, and since 5+5 is close enough to 40/4, I used the CCRL 40/4 Elos of the AB engines for the Elos above.
In any case, since Elo is relative, you can subtract any value you want and get the relative Elo of each gauntlet without connecting it to the world of CCRL Elos.

I am a bit worried that Leela might scale badly in TCEC conditions. Maybe some folks have experience with Leela on longer time controls. Also, do they know whether in TCEC conditions the rig they are running will have enough RAM? Leela MCTS is very RAM hungry with fast GPUs and LTC.
Well Leela had the following 3 results in TCEC (All AB engines use 43 cores and Leela Lc0 with 2 GTX 1080 Ti):
Leela ID483 - Andscacs : 10.0 - 14.0 Elo -58
Leela ID483 - Laser : 11.5 - 12.5 Elo -15
Leela ID483 - Stockfish : 3.0 - 11.0 Elo -226
and Leela scored a win against SF! They said Stockfish has not lost a premier division game to any engine other than Houdini or Komodo since season 7.

Andscacs is a premier division engine that finished 5th last year!
Laser is a 1st division engine that finished 6th last year.
After his son's birth they've asked him:
"Is it a boy or girl?"
YES! He replied.....
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: how will Leela fare at the WCCC?

Post by Laskos »

George Tsavdaris wrote: Fri Jul 13, 2018 9:46 pm
2 times shorter TC from above:
Lc0 ID462 with latest CLOPed settings for this 450-460 series after 100 games

  Program     CCRL Elo   Error(cl 95%)          Games         Score 
Lc0! ID462    3280.6       ±42.3          100 (+38,=46,-16),  61.0 %

   vs.                 :  games (  +,  =,  -),   (%) :    Diff,    SD, CFS (%)
   Fire 7.1            :     20 (  4,  7,  9),  37.5 :   -60.4,  21.6,    0.3
   Andscacs 0.93       :     20 (  7, 11,  2),  62.5 :   +73.6,  21.6,  100.0
   Gull 3              :     20 (  5, 13,  2),  57.5 :   +87.6,  21.6,  100.0
   Texel 1.07          :     20 ( 10,  8,  2),  70.0 :  +119.6,  21.6,  100.0
   Laser 1.5           :     20 ( 12,  7,  1),  77.5 :  +183.6,  21.6,  100.0
Lc0 kb3 b20 net with CLOPed settings for it after 100 games

  Program     CCRL Elo   Error(cl 95%)          Games          Score 
Lc0 kb3 b20   3333.6       ±47.2          100 (+46,=43,-11)    67.5 %

   vs.                   :  games (  +,  =,  -),   (%) :    Diff,    SD, CFS (%)
   Fire 7.1              :     20 (  4, 12,  4),  50.0 :    -7.4,  24.1,   38.0
   Andscacs 0.93         :     20 (  8,  8,  4),  60.0 :  +126.6,  24.1,  100.0
   Gull 3                :     20 (  9, 11,  0),  72.5 :  +140.6,  24.1,  100.0
   Texel 1.07            :     20 ( 13,  5,  2),  77.5 :  +172.6,  24.1,  100.0
   Laser 1.5             :     20 ( 12,  7,  1),  77.5 :  +236.6,  24.1,  100.0

Lc0 ID390 with default CLOPed settings for this after 50 games

  Program     CCRL Elo   Error(cl 95%)          Games         Score 
Lc0 ID390     3321.0      ±69.8           50 (+24,=18,-8)     66.0 %
 
   vs.                 :  games (  +,  =, -),    (%) :    Diff,    SD,   CFS (%)
   Fire 7.1            :     10 (  1,  4, 5),   30.0 :   -20.0,   35.6,   28.7
   Andscacs 0.93       :     10 (  7,  3, 0),   85.0 :  +114.0,   35.6,   99.9
   Gull 3              :     10 (  4,  4, 2),   60.0 :  +128.0,   35.6,  100.0
   Texel 1.07          :     10 (  8,  2, 0),   90.0 :  +160.0,   35.6,  100.0
   Laser 1.5           :     10 (  4,  5, 1),   65.0 :  +224.0,   35.6,  100.0
Conditions:
TC: 5'+5"
Ponder off
1 CPU for all AB engines on an i5-6500 3.4 GHz, and an Nvidia GTX 1070 Ti for Leela (Lc0).
My system with 1 CPU is 2.1 times faster than 1 CPU of CCRL, and since 5+5 is close enough to 40/4, I used the CCRL 40/4 Elos of the AB engines for the Elos above.
In any case, since Elo is relative, you can subtract any value you want and get the relative Elo of each gauntlet without connecting it to the world of CCRL Elos.
All in all, about 3310 CCRL performance at 5' + 5''. Your GPU is maybe 30-40% faster than my GTX 1060, your CPU core is about 10% slower than my i7 3.8 GHz. So, to translate to my ratings, one has to subtract maybe 30 Elo points, resulting in a 3280 CCRL Elo performance at 5' + 5''. But I got 3300 at 1' + 1'' in 100 games, against a single opponent (that may be problematic). Again, the scaling seems to stall at the level of AB engines.

I am a bit worried that Leela might scale badly in TCEC conditions. Maybe some folks have experience with Leela on longer time controls. Also, do they know whether in TCEC conditions the rig they are running will have enough RAM? Leela MCTS is very RAM hungry with fast GPUs and LTC.
Well Leela had the following 3 results in TCEC (All AB engines use 43 cores and Leela Lc0 with 2 GTX 1080 Ti):
Leela ID483 - Andscacs : 10.0 - 14.0 Elo -58
Leela ID483 - Laser : 11.5 - 12.5 Elo -15
Leela ID483 - Stockfish : 3.0 - 11.0 Elo -226
and Leela scored a win against SF! They said Stockfish has not lost a premier division game to any engine other than Houdini or Komodo since season 7.

Andscacs is a premier division engine that finished 5th last year!
Laser is a 1st division engine that finished 6th last year.
These are impressive. 24 minutes + increment time control, right? If this is the main indicator, then Leela on 8 x V100 will be competitive with the best at the WCCC (book issue aside).

Thanks for informing me. This is a bit contradictory as far as scaling with time control goes, but hardware + longer time control seems to help Leela, if TCEC is taken as the best indicator of what will happen at the WCCC against the likes of Komodo or Jonny.
George Tsavdaris
Posts: 1627
Joined: Thu Mar 09, 2006 12:35 pm

Re: how will Leela fare at the WCCC?

Post by George Tsavdaris »

Laskos wrote: Fri Jul 13, 2018 10:53 pm All in all, about 3310 CCRL performance at 5' + 5''. Your GPU is maybe 30-40% faster than my GTX 1060,
Actually it's about 100% faster (i.e. 2x) for running Leela, according to a chart one of the Leela devs has made:
Image

your CPU core is about 10% slower than my i7 3.8 GHz. So, to translate to my ratings, one has to subtract maybe 30 Elo points, resulting in a 3280 CCRL Elo performance at 5' + 5''. But I got 3300 at 1' + 1'' in 100 games, against a single opponent (that may be problematic). Again, the scaling seems to stall at the level of AB engines.
Yes, that is strange, since I also thought Leela scales a little better than AB engines, but there are error bars too, so things may indeed be that way.


These are impressive. 24 minutes + increment time control, right? If this is the main indicator, then Leela on 8 x V100 will be competitive with the best at the WCCC (book issue aside).

Thanks for informing me. This is a bit contradictory as far as scaling with time control goes, but hardware + longer time control seems to help Leela, if TCEC is taken as the best indicator of what will happen at the WCCC against the likes of Komodo or Jonny.
I think it was 24+16 for the match against Stockfish on TCEC and 15+10 for the other 2.
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: how will Leela fare at the WCCC?

Post by yanquis1972 »

Laskos wrote: Fri Jul 13, 2018 8:38 pm Has anybody tested cuDNN lc0 with a good mainbranch net (I used 482) at longer time controls? We all assume that it scales well compared to regular AB engines, but I only have evidence of that from ultra-fast TC to fast blitz TC. I got a worrying result at a somewhat longer TC, 10min + 10s, but in very few games, only 10. Maybe it was bad luck, but as it stands, the LOS is 96.5% that it doesn't scale very well to longer time controls.


very strange, nice compilation. is it possible default settings, which may be (probably were) tuned to bullet, are terrible for LTC? i don't feel like MCTS & FPU are adequately explained; i'm doing some 4min/move games atm & have no idea if i should increase or decrease either. i don't know if anyone really knows. a couple other variables i'm blanking on, as well.