how will Leela fare at the WCCC?

yanquis1972
Posts: 1682
Joined: Tue Jun 02, 2009 10:14 pm

Re: how will Leela fare at the WCCC?

Post by yanquis1972 » Sat Jul 14, 2018 12:34 am

George Tsavdaris wrote:
Fri Jul 13, 2018 9:10 pm
Laskos wrote:
Fri Jul 13, 2018 8:53 pm
All in all, about 3310 CCRL performance at 5' + 5''. Your GPU is maybe 30-40% faster than my GTX 1060,
Actually it's about 100% faster (i.e x2) for running Leela as per a chart one of the devs of Leela has made:
Image
i'm skeptical about how that translates; i run a 1080, typically @ 2050-2075 core & a very unscientific estimate of my nps from start position, in a fast TC game, is 11kn/s (mainserver net). i believe it goes up a few kn/s with time before hitting a wall. i have no idea, but i think 1060s are easily above ~6 kn/s (or whatever...not a math guy) at the same depth. i don't recall anymore what gave me that impression.

George Tsavdaris
Posts: 1531
Joined: Thu Mar 09, 2006 11:35 am

Re: how will Leela fare at the WCCC?

Post by George Tsavdaris » Sat Jul 14, 2018 6:17 am

We can do controlled measurements.

From the starting position, with 3 different net sizes and IDs, here is what I get with a 1070 Ti:

Lc0 ID125 10x128

3/22    00:18     471,185    25,902    +0,14    e2-e4 c7-c5 c2-c3 e7-e6 d2-d4 d7-d5 e4xd5 e6xd5 Ng1-f3 Ng8-f6 Bf1-d3 Bf8-d6 0-0 0-0 d4xc5

Lc0 ID476 15x192

2/27    00:14     168,408    11,594    +0,16    c2-c4 Ng8-f6 Nb1-c3 e7-e5 Ng1-f3 Nb8-c6 g2-g3 d7-d5 c4xd5 Nf6xd5 Bf1-g2 Nd5-b6 0-0 Bf8-e7

Lc0 Test10-10047 20x256

2/17    00:18     107,491    5,828    +0,09    c2-c4 c7-c5 Nb1-c3 Nb8-c6 e2-e3 e7-e6 d2-d4 d7-d5 d4xc5 Ng8-f6 c4xd5 Nf6xd5 Nc3xd5

Perhaps you with your 1080 and Kai with his 1060 6 GB can run the 15x192 ID476 net and compare. Or the other 2 nets as well.

Laskos
Posts: 8453
Joined: Wed Jul 26, 2006 8:21 pm

Re: how will Leela fare at the WCCC?

Post by Laskos » Sat Jul 14, 2018 7:49 am

George Tsavdaris wrote:
Sat Jul 14, 2018 6:17 am
We can do controlled measures.

From starting position with 3 different net sizes and IDs here is what i get with 1070 Ti:

Lc0 ID476 15x192

2/27    00:14     168,408    11,594    +0,16    c2-c4 Ng8-f6 Nb1-c3 e7-e5 Ng1-f3 Nb8-c6 g2-g3 d7-d5 c4xd5 Nf6xd5 Bf1-g2 Nd5-b6 0-0 Bf8-e7 
Same ID, GTX 1060 6GB, to the same number of nodes:

info depth 2 seldepth 27 time 23367 nodes 167131 score cp 16 hashfull 532 nps 7152 pv c2c4 g8f6 b1c3 e7e5 g1f3 b8c6 g2g3 d7d5 c4d5 f6d5 f1g2 d5b6 e1g1 f8e7 d2d3 e8g8 a2a3 c8e6 b2b4 a7a5
So, the 1070 Ti is 62% faster than the 1060, not exactly 100%. A GTX 1080 shouldn't be much faster than a 1070 Ti.
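
For anyone who wants to repeat the comparison on their own card, here is a minimal sketch of the arithmetic behind the 62% figure. The helper name `parse_uci_info` is made up for illustration, the 1060 line is the one posted above with its pv shortened, and the 1070 Ti nps is simply read off the GUI line for ID476.

```python
def parse_uci_info(line: str) -> dict:
    """Pull the integer fields out of a UCI 'info' line.

    Hypothetical helper, not part of lc0 or any GUI. Tokens after 'pv'
    are moves and are ignored.
    """
    wanted = {"depth", "seldepth", "time", "nodes", "nps"}
    tokens = line.split()
    fields = {}
    for i, tok in enumerate(tokens):
        if tok == "pv":
            break
        if tok in wanted and i + 1 < len(tokens):
            fields[tok] = int(tokens[i + 1])
    return fields


# GTX 1060 6GB line from the post (pv shortened here).
GTX_1060_LINE = (
    "info depth 2 seldepth 27 time 23367 nodes 167131 "
    "score cp 16 hashfull 532 nps 7152 pv c2c4 g8f6 b1c3 e7e5"
)
# nps column of the 1070 Ti GUI line for the same net (ID476).
NPS_1070_TI = 11594

info_1060 = parse_uci_info(GTX_1060_LINE)

# Sanity check: nps should be roughly nodes / seconds (time is in ms).
nps_from_nodes = info_1060["nodes"] * 1000 / info_1060["time"]

speedup = NPS_1070_TI / info_1060["nps"]
print(f"1060 nps reported {info_1060['nps']}, recomputed {nps_from_nodes:.0f}")
print(f"1070 Ti / 1060 = {speedup:.2f}x ({(speedup - 1) * 100:.0f}% faster)")
```

Both runs stop at roughly the same node count, so the nps ratio is a like-for-like comparison for this one position only.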

Laskos
Posts: 8453
Joined: Wed Jul 26, 2006 8:21 pm

Re: how will Leela fare at the WCCC?

Post by Laskos » Sat Jul 14, 2018 9:11 am

Laskos wrote:
Fri Jul 13, 2018 7:26 pm
Michel wrote:
Fri Jul 13, 2018 7:01 pm
Not sure if the result against K12.1 is really disappointing. The error bars indicate that both engines could still be equal strength...
Not sure about those error bars, in any case, for skewed results, they should be asymmetrical. LOS with a uniform prior is easy to get, and it is 96.5%. Not very encouraging. LOS with a bit informed prior can be tamed somewhat, with a rough prior (4 * score * (1-score))^4, which basically assumes that the difference is hardly above 200 Elo points, I can get a LOS of 93.6%, still not very nice. I will maybe run another 10 games overnight, if something similar happens, it will be much clearer.
The second overnight match went +3 -0 =7 for Komodo 12.1.1, a bit different, but still bad. In total, over 20 games at 600'' + 10'':

TC: 600'' + 10''
Komodo 12.1.1 vs Lc0_11jul ID482
+9 -1 =10 [0.700] 20
Elo difference: 147.19
errors 95% confidence:
-96.65
+125.54

Performance: 3270 CCRL 40/4' Elo

LOS 99.4%
LOS p1 99.0%
LOS p2 98.6%

I calculated the 95% confidence error bars myself from scratch, assuming a normal distribution (which is a bit awkward here). It is then pretty clear that Leela is at least 50 Elo points weaker than Komodo in these conditions, and has a meager expected performance of 3270 CCRL Elo points. It does not seem to scale significantly better than Komodo from 60'' + 1'' to 600'' + 10''. A bit disappointing. The uninformed LOS of Komodo is 99.4%, pretty decisive. With a prior p1, basically saying that the engines are hardly separated by more than 200 Elo points, the LOS doesn't decrease significantly, to 99.0%. With an even stronger prior p2, that the engines are hardly separated by more than 100 Elo points, the LOS is 98.6%. Such stability is due to the relatively large number of draws.
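
A minimal sketch of how numbers like these can be obtained from a +W -L =D result. It assumes the logistic Elo formula, a normal approximation on the score for the error bars, and a uniform prior on the share of decisive games for the LOS. With those assumptions it comes out at the Elo difference, error bars and uninformed LOS posted above, but the function names are made up here and the p1/p2 priors are not reproduced.

```python
import math


def elo_from_score(score: float) -> float:
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def match_stats(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (score, elo, lower_error, upper_error).

    Normal approximation on the score, then mapped through the Elo curve,
    which is why the error bars come out asymmetrical. Assumes the score
    interval stays inside (0, 1).
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, which is 1, 0.5 or 0.
    variance = (wins * 1.0 + draws * 0.25) / n - score ** 2
    std_err = math.sqrt(variance / n)
    elo = elo_from_score(score)
    lower = elo_from_score(score - z * std_err) - elo
    upper = elo_from_score(score + z * std_err) - elo
    return score, elo, lower, upper


def los_uniform(wins: int, losses: int) -> float:
    """Likelihood of superiority with a uniform prior; draws do not enter.

    Equals P(x > 0.5) for x ~ Beta(wins + 1, losses + 1), written as a
    binomial tail so that only the standard library is needed.
    """
    n = wins + losses + 1
    return sum(math.comb(n, k) for k in range(wins + 1)) / 2 ** n


score, elo, lower, upper = match_stats(wins=9, losses=1, draws=10)
print(f"score {score:.3f}  Elo {elo:+.2f}  errors {lower:+.2f} / {upper:+.2f}")
print(f"LOS (uniform prior): {100 * los_uniform(9, 1):.1f}%")
```

Calling `los_uniform(6, 1)` for the first 10 games alone gives the 96.5% mentioned earlier in the thread.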

There are reasons to worry about the WCCC. George's results show a similarly meager performance at 300'' + 5'' over many games. Although the TCEC results on 2x GTX 1080 Ti are encouraging, I saw that Leela's NPS in TCEC is now only some 2 times lower than what was reported for 8 x V100, which is a surprise to me; I expected a factor of 4 maybe. If Leela doesn't scale well to LTC (as I assumed it would), then there is little chance of it performing at Komodo 12+ level on 48 cores. Let's see.

Hai
Posts: 383
Joined: Sun Aug 04, 2013 11:19 am

Re: how will Leela fare at the WCCC?

Post by Hai » Sat Jul 14, 2018 5:33 pm

Leela doesn't know how to handle increment time.
Often she ends up with 3x-5x more base time at the end of the games, due to bad increment management.
And this looks to be true. To be safe I've checked it myself with 800 games / 15 seconds per move.
It's like: nice, I will get increment time with every move... what should I do with it... I don't know... I will add it to my base time.
This will be a super big problem at TCEC, with different divisions having different increments, maybe something like: +10, +15, +30, +45, +60, +90, +120, +180 seconds per move!
Most increment improvements are made to improve play with only +1 to +5 seconds per move, and that's a big mistake.
The bigger the increment, the bigger the base time at the end. I have seen lots of games which started with +5 minutes base time and ended with +95 minutes base time :roll: .
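
To illustrate the effect described above, here is a toy clock simulation. Both spending policies are invented for the example and neither is lc0's actual time manager; the point is only that any policy spending less than the increment per move makes the clock grow over the game.

```python
def final_clock(base: float, inc: float, moves: int, spend) -> float:
    """Clock (in seconds) left after `moves` moves under a spending policy.

    `spend(clock, inc)` returns the time the engine wants to use on the
    next move; the increment is added after the move is played.
    """
    clock = base
    for _ in range(moves):
        used = min(clock, spend(clock, inc))
        clock = clock - used + inc
    return clock


def banker(clock: float, inc: float) -> float:
    """Hypothetical policy: uses only half of the increment each move."""
    return 0.5 * inc


def spender(clock: float, inc: float) -> float:
    """Hypothetical policy: uses the full increment plus 1/20 of the bank."""
    return inc + clock / 20.0


BASE, INC, MOVES = 300.0, 10.0, 60  # 5 minutes + 10 seconds, 60 moves
print("banker :", final_clock(BASE, INC, MOVES, banker))
print("spender:", final_clock(BASE, INC, MOVES, spender))
```

With these made-up settings the banker policy finishes with twice its starting time, while the spender policy runs the bank down gradually.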

yanquis1972
Posts: 1682
Joined: Tue Jun 02, 2009 10:14 pm

Re: how will Leela fare at the WCCC?

Post by yanquis1972 » Sat Jul 14, 2018 5:36 pm

re lc0 & scaling, check out this chart: https://docs.google.com/spreadsheets/d/ ... =583567367

2 things to be aware of: 1) "nodes/move was from a middlegame position", which isn't given afaik. 2) june 4th lc0.exe used

(and, obviously, not finished yet)

potentially little to no gain after as little as 30s/move on average hardware? which is somehow not dissimilar to A0 (by my interpretation) with a different net size on completely different hardware.

Laskos
Posts: 8453
Joined: Wed Jul 26, 2006 8:21 pm

Re: how will Leela fare at the WCCC?

Post by Laskos » Sun Jul 15, 2018 8:28 pm

A taste of what could happen in the next days at WCCC. 4 LTC games, side and reversed from 2 starting positions. According to my estimates, at this long time control, 60 minutes + 60 seconds increment, lc0 ID491 on GTX 1060 GPU should be a bit stronger than the latest Komodo 12.1.1 on one i7 core from regular, quiet starting positions.

Games: 3600'' + 60''

From standard opening position, 2 draws:





=================================================

But if the opponents prepare some very sharp openings, things can turn out differently for Leela. From the "Fried Liver attack", the first game was a quick win for Komodo:



In the second game, Komodo, now as black, again gained some advantage in the endgame, but it ended in a draw.


Werewolf
Posts: 1084
Joined: Thu Sep 18, 2008 8:24 pm

Re: how will Leela fare at the WCCC?

Post by Werewolf » Sun Jul 15, 2018 8:32 pm

Laskos wrote:
Sun Jul 15, 2018 8:28 pm
A taste of what could happen in the next days at WCCC. 4 LTC games, side and reversed from 2 starting positions. According to my estimates, at this long time control, 60 minutes + 60 seconds increment, lc0 ID491 on GTX 1060 GPU should be a bit stronger than the latest Komodo 12.1.1 on one i7 core from regular, quiet starting positions.
Then the fascinating question will be what happens when we increase the core count of Komodo (do we know what they have in the WCCC?) and the graphics cards of Leela (8x V100 ?)

Laskos
Posts: 8453
Joined: Wed Jul 26, 2006 8:21 pm

Re: how will Leela fare at the WCCC?

Post by Laskos » Sun Jul 15, 2018 8:39 pm

Werewolf wrote:
Sun Jul 15, 2018 8:32 pm
Then the fascinating question will be what happens when we increase the core count of Komodo (do we know what they have in the WCCC?) and the graphics cards of Leela (8x V100 ?)
I estimated that the improvement from one core to 48 for Komodo is a bit larger than the improvement from my GTX 1060 to 8 x V100. Also, the time control in WCCC is longer, but the scaling of Leela at these LTC is hard to know. So, my little experiment is somewhat representative, but it's easy to miss 50-100 Elo points up or down. It seems that if the opening is quiet, expect maybe a draw between Leela and Komodo; if sharp, good chances for Komodo to win. Let's see.

yanquis1972
Posts: 1682
Joined: Tue Jun 02, 2009 10:14 pm

Re: how will Leela fare at the WCCC?

Post by yanquis1972 » Mon Jul 16, 2018 12:20 am

A0 on a larger net was +80 elo over SF8 at ~800K nodes/move & +100 elo at ~4M+ npm (taking the 100 game set as absolute). but recall this was SF8 on a 64(?) core machine with 1GB hash. don't know if tests have been done on how that affects scaling, but i assume it's not good.

in terms of raw elo est. +300 elo gained 8K to 80K, and about the same amount from ~20K to 1M. basically by 500K nodes searched/move there were perhaps 50 elo left on the table, but this is all presumably benchmarked against the same flawed stockfish setup.

is +40 elo (perhaps) reasonable to expect for SF going from 70Mnps to 700M?
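
One way to put the figures above on a common footing is Elo gained per doubling of nodes. This is just a rough sketch that takes the numbers in this post at face value; the function name is made up.

```python
import math


def elo_per_doubling(elo_gain: float, nodes_from: float, nodes_to: float) -> float:
    """Average Elo gained per doubling of nodes searched (or of nps)."""
    return elo_gain / math.log2(nodes_to / nodes_from)


# Figures as quoted in the post, not independently measured.
cases = {
    "A0  8K -> 80K nodes/move, +300": (300, 8e3, 80e3),
    "A0 20K -> 1M nodes/move, +300": (300, 20e3, 1e6),
    "SF 70M -> 700M nps, +40 (guess)": (40, 70e6, 700e6),
}
for label, (gain, n0, n1) in cases.items():
    print(f"{label}: {elo_per_doubling(gain, n0, n1):.1f} Elo per doubling")
```

A 10x increase is about 3.3 doublings, so the +40 guess for SF works out to roughly 12 Elo per doubling.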

looking at the chart now i don't know how we're surprised by results like kai's & this test https://docs.google.com/spreadsheets/d/ ... =583567367
