Something goes wrong with lc0 since yesterday?

yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Something goes wrong with lc0 since yesterday?

Post by yanquis1972 »

crem wrote: Tue Jul 17, 2018 5:31 pm
George Tsavdaris wrote: Tue Jul 17, 2018 4:53 pm
Laskos wrote: Wed Jul 11, 2018 7:53 pm 40/2' against single core is fine, but the most strictly correct simulation of CCRL conditions with lc0 on GPU would probably be to leave the lc0 at the same 40/4', as lc0 speed on GPU is very weakly dependent on CPU (2 threads), and for AB engines use 40/2' on a reasonable i7 core. Anyway, we get a picture, I use the same 40/2' bench for both lc0 and AB engines as you use. My GPU is Nvidia 1060 6GB.
Despite what you said here, I used 40/2 for Leela also in the following gauntlet, since I find it unfair to offer Leela double time. :D
The following 2 gauntlets are for the latest test net, but the way it evolved over the last day is AMAZING! :shock:
The learning rate was changed at network 10077 (as planned), so progress should indeed be much faster now.
oh, i had no idea! blog said shortly before that...wait, are you guys still gonna reset?? damn
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Something goes wrong with lc0 since yesterday?

Post by yanquis1972 »

Laskos wrote: Wed Jul 18, 2018 12:01 am These days I probed lc0 ID495 (main branch) for its behavior in certain openings against the regular engine Gaviota 1.0 (on two cores for randomization). Overall I played several thousand ultra-fast games from the Noomen topical opening suite.

Here I present over- and under- performance (over- with green, under- with red) in Elo compared to general performance for most common openings:

Ruy Lopez: -46
French: -43
Sicilian: -46
Queen's Pawn: +75
King's Indian: -31
Nimzo Indian: +64
Reti: +31


The 2SD error margins are about 25 Elo points.
the negative in the french is really surprising, & i would've guessed =ish in ruy lopez. but the overall theme is pretty clear. what line did you use for the KID?
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Something goes wrong with lc0 since yesterday?

Post by jp »

Laskos wrote: Wed Jul 18, 2018 12:01 am Here I present over- and under- performance (over- with green, under- with red) in Elo compared to general performance for most common openings:

Ruy Lopez: -46
French: -43
Sicilian: -46
Queen's Pawn: +75
King's Indian: -31
Nimzo Indian: +64
Reti: +31

Is it roughly equally bad/good from both sides e.g. -46 as White and -46 as Black in the Ruy Lopez?
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

yanquis1972 wrote: Wed Jul 18, 2018 2:46 am
Laskos wrote: Wed Jul 18, 2018 12:01 am These days I probed lc0 ID495 (main branch) for its behavior in certain openings against the regular engine Gaviota 1.0 (on two cores for randomization). Overall I played several thousand ultra-fast games from the Noomen topical opening suite.

Here I present over- and under- performance (over- with green, under- with red) in Elo compared to general performance for most common openings:

Ruy Lopez: -46
French: -43
Sicilian: -46
Queen's Pawn: +75
King's Indian: -31
Nimzo Indian: +64
Reti: +31


The 2SD error margins are about 25 Elo points.
the negative in the french is really surprising, & i would've guessed =ish in ruy lopez. but the overall theme is pretty clear. what line did you use for the KID?
I used the Noomen topical opening suite. The KID lines are:

[Event "Topical Testsuite 2012"]
[Site "Apeldoorn"]
[Date "2012.04.15"]
[Round "?"]
[White "King's Indian"]
[Black "Fianchetto Variation"]
[Result "*"]
[ECO "E63"]
[PlyCount "17"]
[EventDate "2012.??.??"]

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. Nf3 O-O 5. g3 d6 6. Bg2 Nc6 7. O-O a6 8. b3
Rb8 9. Nd5 *

[Event "Topical Testsuite 2012"]
[Site "Apeldoorn"]
[Date "2012.04.15"]
[Round "?"]
[White "King's Indian"]
[Black "Fianchetto Variation 8.Qd3!?"]
[Result "*"]
[ECO "E63"]
[PlyCount "15"]
[EventDate "2012.??.??"]

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. Nf3 O-O 5. g3 d6 6. Bg2 Nc6 7. O-O a6 8. Qd3 *

[Event "Topical Testsuite 2012"]
[Site "Apeldoorn"]
[Date "2012.04.15"]
[Round "?"]
[White "King's Indian"]
[Black "Samisch Byrne System"]
[Result "*"]
[ECO "E81"]
[PlyCount "14"]
[EventDate "2012.??.??"]

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. e4 d6 5. f3 O-O 6. Be3 c6 7. Qd2 a6 *

[Event "Topical Testsuite 2012"]
[Site "Apeldoorn"]
[Date "2012.04.15"]
[Round "?"]
[White "King's Indian"]
[Black "Makagonov Variation"]
[Result "*"]
[ECO "E90"]
[PlyCount "14"]
[EventDate "2012.??.??"]

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. e4 d6 5. Nf3 O-O 6. h3 e5 7. d5 a5 *

[Event "Topical Testsuite 2012"]
[Site "Apeldoorn"]
[Date "2012.04.15"]
[Round "?"]
[White "King's Indian"]
[Black "Main Line 9.b4"]
[Result "*"]
[ECO "E97"]
[PlyCount "26"]
[EventDate "2012.??.??"]

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. e4 d6 5. Nf3 O-O 6. Be2 e5 7. O-O Nc6 8. d5
Ne7 9. b4 Nh5 10. a4 f5 11. c5 Nf4 12. Bc4 fxe4 13. Nxe4 Bg4 *

[Event "Topical Testsuite 2012"]
[Site "Apeldoorn"]
[Date "2012.04.15"]
[Round "?"]
[White "King's Indian"]
[Black "Main Line 9.b4 Ne8!?"]
[Result "*"]
[ECO "E97"]
[PlyCount "20"]
[EventDate "2012.??.??"]

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. e4 d6 5. Nf3 O-O 6. Be2 e5 7. O-O Nc6 8. d5
Ne7 9. b4 Ne8 10. c5 f5 *
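
For anyone who wants to sanity-check a suite like this before feeding it to a match runner, here is a minimal Python sketch that splits PGN text into games and compares each PlyCount header with the number of half-moves actually present. The function names parse_pgn_games and count_plies are made up for this illustration, the parser only handles simple movetext (no comments or side variations), and the sample uses two of the six lines above with most headers trimmed.

```python
import re

# Two of the KID lines quoted above, with most headers trimmed for brevity.
SAMPLE = """[White "King's Indian"]
[Black "Fianchetto Variation"]
[PlyCount "17"]

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. Nf3 O-O 5. g3 d6 6. Bg2 Nc6 7. O-O a6 8. b3
Rb8 9. Nd5 *

[White "King's Indian"]
[Black "Main Line 9.b4 Ne8!?"]
[PlyCount "20"]

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. e4 d6 5. Nf3 O-O 6. Be2 e5 7. O-O Nc6 8. d5
Ne7 9. b4 Ne8 10. c5 f5 *
"""

HEADER_RE = re.compile(r'\[(\w+)\s+"(.*)"\]$')
RESULT_TOKENS = {"*", "1-0", "0-1", "1/2-1/2"}


def parse_pgn_games(text):
    """Split PGN text into (headers, movetext) pairs.

    Hypothetical helper: assumes no comments or variations in the movetext.
    """
    games = []
    headers, moves = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER_RE.match(line)
        if match:
            if moves:  # a header line after movetext starts a new game
                games.append((headers, " ".join(moves)))
                headers, moves = {}, []
            headers[match.group(1)] = match.group(2)
        elif line:
            moves.append(line)
    if headers or moves:
        games.append((headers, " ".join(moves)))
    return games


def count_plies(movetext):
    """Count half-moves, skipping move numbers ("8.") and result tokens."""
    return sum(
        1
        for token in movetext.split()
        if not re.fullmatch(r"\d+\.+", token) and token not in RESULT_TOKENS
    )


for headers, movetext in parse_pgn_games(SAMPLE):
    print(headers["Black"], headers["PlyCount"], count_plies(movetext))
```

A mismatch between the header and the counted plies would point to a truncated or mangled line in the suite.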
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

jp wrote: Wed Jul 18, 2018 6:05 am
Laskos wrote: Wed Jul 18, 2018 12:01 am Here I present over- and under- performance (over- with green, under- with red) in Elo compared to general performance for most common openings:

Ruy Lopez: -46
French: -43
Sicilian: -46
Queen's Pawn: +75
King's Indian: -31
Nimzo Indian: +64
Reti: +31

Is it roughly equally bad/good from both sides e.g. -46 as White and -46 as Black in the Ruy Lopez?
No, it seems to behave very differently as White and as Black in the Ruy Lopez. I don't like separating into White and Black, as it halves the number of games for each and increases the error margins, but here the difference in behavior seems huge:

In Ruy Lopez, lc0 ID495 behaves as:

White: under-performance of 108 Elo points
Black: over-performance of 15 Elo points

The 2SD error margins here are about 35 Elo points.
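
The step from about 25 to about 35 Elo points is what one would expect if the error margin scales as 1/sqrt(N) with the number of games N. A minimal Python sketch of that arithmetic follows; the function name scaled_margin is made up for this illustration, and treating the White and Black samples as independent is an assumption, not something stated in the post.

```python
import math


def scaled_margin(margin, games_ratio):
    """Error margin after changing the game count by games_ratio (new / old).

    Assumes the margin is proportional to 1 / sqrt(number of games).
    """
    return margin / math.sqrt(games_ratio)


# Halving the games per color turns a 25 Elo margin into roughly 35 Elo.
per_color_margin = scaled_margin(25.0, 0.5)
print(round(per_color_margin, 1))  # 35.4

# White/Black gap in the Ruy Lopez: -108 versus +15 Elo.
gap = 15 - (-108)
# Margin of the difference, ASSUMING the two samples are independent.
gap_margin = math.hypot(35.0, 35.0)
print(gap, round(gap_margin, 1))  # 123 49.5
```

Under these assumptions the 123 Elo gap is well outside the combined 2SD margin of roughly 50 Elo points, which fits the description of the difference as huge.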
crem
Posts: 177
Joined: Wed May 23, 2018 9:29 pm

Re: Something goes wrong with lc0 since yesterday?

Post by crem »

yanquis1972 wrote: Wed Jul 18, 2018 2:41 am
oh, i had no idea! blog said shortly before that...wait, are you guys still gonna reset?? damn
No, most likely we'll just promote test10 to main and continue.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Something goes wrong with lc0 since yesterday?

Post by jp »

Laskos wrote: Wed Jul 18, 2018 7:17 am
jp wrote: Wed Jul 18, 2018 6:05 am
Laskos wrote: Wed Jul 18, 2018 12:01 am Here I present over- and under- performance in Elo compared to general performance for most common openings:
Ruy Lopez: -46
...

Is it roughly equally bad/good from both sides e.g. -46 as White and -46 as Black in the Ruy Lopez?

No, it seems to behave very differently as White and as Black in the Ruy Lopez. I don't like separating into White and Black, as it halves the number of games for each and increases the error margins, but here the difference in behavior seems huge:

In Ruy Lopez, lc0 ID495 behaves as:

White: under-performance of 108 Elo points
Black: over-performance of 15 Elo points

The 2SD error margins here are about 35 Elo points.

Thanks, Kai. Very interesting!!
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Something goes wrong with lc0 since yesterday?

Post by duncan »

Laskos post wrote:
Here I present over- and under- performance (over- with green, under- with red) in Elo compared to general performance for most common openings:

Ruy Lopez: -46
French: -43
Sicilian: -46
Queen's Pawn: +75
King's Indian: -31
Nimzo Indian: +64
Reti: +31


The 2SD error margins are about 25 Elo points.
if compared to Stockfish, do you know if you would get similar figures?
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

duncan wrote: Wed Jul 18, 2018 12:07 pm
Laskos post wrote:
Here I present over- and under- performance (over- with green, under- with red) in Elo compared to general performance for most common openings:

Ruy Lopez: -46
French: -43
Sicilian: -46
Queen's Pawn: +75
King's Indian: -31
Nimzo Indian: +64
Reti: +31


The 2SD error margins are about 25 Elo points.
if compared to Stockfish, do you know if you would get similar figures?
It would have been better to run a gauntlet against several (many) regular engines, but I wanted a score close to 50%, so that the performances are not distorted, and choosing many such opponents is hard. The variation Leela shows across openings is somewhat larger than the variation between 2 regular engines of similar strength, so the answer to your question would be: yes, compared to Stockfish I would probably get similar results with some variation, say give or take 20 Elo points here and there.
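
One way to see why a score near 50% is convenient: under the standard logistic Elo model, the same error in score fraction translates into a larger Elo error the further the score is from 50%. The Python sketch below illustrates that; the function names are made up for this illustration, and this is only one possible reading of "distorted", not necessarily the one meant in the post.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction (standard logistic model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_per_score_point(score):
    """Slope d(Elo)/d(score): how strongly a score error is amplified."""
    return 400.0 / (math.log(10) * score * (1.0 - score))


for s in (0.5, 0.75, 0.9):
    print(s, round(elo_from_score(s), 1), round(elo_per_score_point(s)))
```

The slope is smallest at a 50% score and grows quickly toward either extreme, so a given number of games pins down the Elo difference most tightly when the opponent is of similar strength.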
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

Something not very nice is happening again. ID10086 came out as the best of the test nets, about 40 Elo points below the best later IDs from the main branch. ID10093 came out weaker: about 90 Elo points below the best IDs from the main branch and 50 Elo points below ID10086.

On the other hand, the smallish 6x64 ID9155 came out 20 Elo points above the 20x256 ID10086. Shouldn't the work then have progressed gradually to 10x128 nets from the very good 6x64 ID9155, which is at about 3300 CCRL 40/4' Elo points? By now I guess we would have at least a 3500 CCRL 40/4' 10x128 net, probably more. It seems we are now stuck at around the 3250 CCRL 40/4' level with these big 20x256 1009x nets.