Stockfish 11, Contempt=0 - Lc0 0.23.2 (Kiudee Settings) 20x256 S.Vieri T40-1541

fastgm · Post by **fastgm** » Wed Jan 29, 2020 11:31 pm

GeForce RTX 2060, 10 Core Intel E5-2680v2 @ 2.80 GHz
Openings: Hert_250_lowdraws.pgn
TC: 60 sec + 0.6 sec
250 games

Lc0 0.23.2 256x20-T40-1541   +40  +51/=177/-22 55.80%  139.5/250
Stockfish 11                 -40  +22/=177/-51 44.20%  110.5/250

+40 Elo!
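The Elo figures in these result tables can be reproduced from the win/draw/loss counts. Below is a minimal sketch, assuming the standard logistic Elo model; the function name `elo_diff` is made up for illustration and is not taken from any testing tool used in this thread.

```python
import math

def elo_diff(wins, draws, losses):
    """Elo difference implied by a match result (logistic model, assumed).

    A draw counts as half a point. Raises ValueError on a 0% or 100%
    score, where the implied difference is unbounded.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))

# Lc0's result from the table above: +51 =177 -22 out of 250 games
print(round(elo_diff(51, 177, 22)))
```

The same function applies to any other W/D/L line in the thread; it only converts the score percentage and says nothing about how reliable that percentage is.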

pohl4711 · Post by **pohl4711** » Fri Jan 31, 2020 9:05 am

Great! Thanx.

Nearly the same result as I got in my long-running test run (with SALC Armageddon openings):

1 Lc0 0.23.2k t40-1541 (20x256) : 3604 300 (+171,= 0,-129), 57.0 % (+49 Elo)

https://www.sp-cc.de/nn-longtime-testing.htm

I hope you will do a complete test run with that setup (Lc0 Kiudee, 20x256-t40-1541 net) for your rating list! I would expect a clear new world number one.

fastgm · Post by **fastgm** » Fri Jan 31, 2020 5:56 pm

Here are the results with Stockfish 11 and the default Contempt=24:

1   Lc0 0.23.2 256x20-T40-1541   +52  +62/=163/-25 57.40%  143.5/250
2   Stockfish 11                 -52  +25/=163/-62 42.60%  106.5/250

+52 Elo

I will soon test 256x20-T40-1541 with the Kiudee settings for my rating list.

pohl4711 · Post by **pohl4711** » Sat Feb 01, 2020 12:31 pm

fastgm wrote: ↑Fri Jan 31, 2020 5:56 pm Here are the results with Stockfish 11 and the default Contempt=24:
1   Lc0 0.23.2 256x20-T40-1541   +52  +62/=163/-25 57.40%  143.5/250
2   Stockfish 11                 -52  +25/=163/-62 42.60%  106.5/250
+52 Elo

I will soon test 256x20-T40-1541 with the Kiudee settings for my rating list.

Perhaps you should wait a few more days:
The first 45 games of the KiudeeLaskos setting (kl = Kiudee with CPuct=1.900) have been played, and at this point it looks very promising.
Lc0 0.23.2kl t40-1541 (20x256) is at 62% vs. Stockfish 191210 (the final result of the Kiudee setting without the Laskos CPuct change was 57%), which would mean around +35 Elo more and a real destruction of Stockfish.
But 45 games do not give a reliable result, and everything can still change. We have to wait a few more days, but the result is very good so far, so I will let the test continue...
The test run of 300 games will end in 5 days, if all works correctly.

pohl4711 · Post by **pohl4711** » Mon Feb 03, 2020 10:04 am

pohl4711 wrote: ↑Sat Feb 01, 2020 12:31 pm
fastgm wrote: ↑Fri Jan 31, 2020 5:56 pm Here are the results with Stockfish 11 and the default Contempt=24:
1   Lc0 0.23.2 256x20-T40-1541   +52  +62/=163/-25 57.40%  143.5/250
2   Stockfish 11                 -52  +25/=163/-62 42.60%  106.5/250
+52 Elo

I will soon test 256x20-T40-1541 with the Kiudee settings for my rating list.
Perhaps you should wait a few more days:
The first 45 games of the KiudeeLaskos setting (kl = Kiudee with CPuct=1.900) have been played, and at this point it looks very promising.
Lc0 0.23.2kl t40-1541 (20x256) is at 62% vs. Stockfish 191210 (the final result of the Kiudee setting without the Laskos CPuct change was 57%), which would mean around +35 Elo more and a real destruction of Stockfish.
But 45 games do not give a reliable result, and everything can still change. We have to wait a few more days, but the result is very good so far, so I will let the test continue...
The test run of 300 games will end in 5 days, if all works correctly.

I aborted that test run. After 150 games, the KiudeeLaskos setting was 2% weaker than the Kiudee setting. So you should try Kiudee...

Alayan · Post by **Alayan** » Mon Feb 03, 2020 2:43 pm

With such a small sample, a 2% difference after 150 games isn't anywhere near enough to know which is better.

Aborting tests that don't look promising while finishing those that do also introduces some bias, as the results you end up publishing will be luckier than average.
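The selection effect described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: two engines of exactly equal strength, no draws, and a hypothetical rule that aborts a run if the score at an early checkpoint is below 50%. The function name and all parameters are invented for the example.

```python
import random

def simulate_selective_publishing(runs=2000, games=300, checkpoint=45,
                                  abort_below=0.5, true_score=0.5, seed=1):
    """Compare the mean score of all runs with the mean of 'published' runs.

    A run is published only if its score after `checkpoint` games is at
    least `abort_below`; otherwise it is treated as aborted.
    """
    rng = random.Random(seed)
    all_scores, published = [], []
    for _ in range(runs):
        # 1 = win, 0 = loss (draws are left out to keep the sketch short)
        results = [1 if rng.random() < true_score else 0 for _ in range(games)]
        final = sum(results) / games
        all_scores.append(final)
        early = sum(results[:checkpoint]) / checkpoint
        if early >= abort_below:
            published.append(final)
    mean_all = sum(all_scores) / len(all_scores)
    mean_published = sum(published) / len(published)
    return mean_all, mean_published

mean_all, mean_published = simulate_selective_publishing()
print(f"all runs: {mean_all:.4f}   published runs only: {mean_published:.4f}")
```

Even though the two simulated engines are equal, the published runs average above 50%, because every one of them started with a lucky first stretch that stays in the final score.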

fastgm · Post by **fastgm** » Mon Feb 03, 2020 8:17 pm

pohl4711 wrote: ↑Sat Feb 01, 2020 12:31 pm I aborted that test run. After 150 games, the KiudeeLaskos setting was 2% weaker than the Kiudee setting. So you should try Kiudee...

Thanks for the test!

Albert Silver · Post by **Albert Silver** » Tue Feb 04, 2020 7:46 pm

Alayan wrote: ↑Mon Feb 03, 2020 2:43 pm With such a small sample, a 2% difference after 150 games isn't anywhere near enough to know which is better.

Aborting tests that don't look promising while finishing those that do also introduces some bias, as the results you end up publishing will be luckier than average.

Have to agree. 2% is 14 Elo and the error margins after 150 games would be +/- 40 Elo or so. In my tests, and I do a lot, I have seen runs where one side was -90 Elo after 30 games, -35 Elo after 100 games, and +44 by the end of a mere 300 games. Even 1000 games will see error margins of +/- 15 Elo roughly.
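For a rough idea of where such error margins come from, here is a minimal sketch that puts an approximate 95% interval around a match result. It assumes independent games, a normal approximation for the mean score, and the logistic Elo model; rating tools may use different methods, and the names `score_to_elo` and `elo_interval` are hypothetical.

```python
import math

def score_to_elo(score):
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, point, high) Elo estimates for a W/D/L result.

    Assumes the interval bounds on the score stay strictly inside (0, 1);
    very lopsided or very short matches would need special handling.
    """
    n = wins + draws + losses
    s = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0)
    var = (wins * (1.0 - s) ** 2
           + draws * (0.5 - s) ** 2
           + losses * (0.0 - s) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return score_to_elo(s - z * se), score_to_elo(s), score_to_elo(s + z * se)

low, point, high = elo_interval(51, 177, 22)  # the 250-game match from this thread
print(f"{point:+.0f} Elo, approx. 95% interval {low:+.0f} to {high:+.0f}")
```

For the 250-game Contempt=0 match this gives an interval of roughly +18 to +64 Elo around the +40 estimate. The margin shrinks with the square root of the number of games, and a high draw rate narrows it, which is why margins quoted for decisive-only openings come out wider at the same game count.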
