leela is official(?) better than sf9

Laskos
Posts: 10936
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: leela is official(?) better than sf9

Post by Laskos » Wed Sep 12, 2018 7:36 pm

Javier Ros wrote:
Wed Sep 12, 2018 6:38 pm
Laskos wrote:
Wed Sep 12, 2018 6:02 pm

I don't understand the discrepancy. Are you sure your SF9 uses 6 threads and not 1? This result would indicate that 1 thread is used, but maybe I am doing something very wrong. OTOH, I expected the CCCC result to come close to mine, because there the effective CPU speed-up is about 8 compared to my 4 i7 cores and the GPU speed-up is again about 8 compared to my GTX 1060, and it shows, as in my case, that Lc0 is at the level of Fire 7.1. I adjudicate draws only in very long games, above 100 moves with evals in the [-20cp, 20cp] range (usually very drawish long endgames). I use no win adjudication.
http://talkchess.com/forum3/download/file.php?id=97

I am sure Stockfish uses 6 of the 8 threads (4 cores): you can see 74% in the Task Manager for Stockfish 9 and 0% for lc0 (due to Ponder Off), and the NPS counter shows 8.7 million.

I use adjudication by Arena at -9.00

I have read at the link below that net 11120 is better than the rest of the 11xxx nets, and I am testing it right now. At the moment the score is Stockfish 6.5 - lc017 11120 6.5.

https://groups.google.com/forum/#!topic ... IUjoNZAxVw

Version  NN    ELO Perf  W   D    L    Fire  Komodo  SF    Score  Games  % Fire  % Komodo  % SF   % Total
0.16     600   3428      56  186  118  61    48.5    39.5  149    360    50.83   40.42     32.92  41.39
0.16     643   3428      56  186  118  58.5  52      38.5  149    360    48.75   43.33     32.08  41.39
0.16     809   3428      56  188  116  56    51.5    42.5  150    360    46.67   42.92     35.42  41.67
0.16     695   3435      69  169  122  66    45.5    42    153.5  360    55      37.92     35     42.64
0.16     776   3435      61  187  112  59.5  54      41    154.5  360    49.58   45        34.17  42.92
0.16     928   3442      67  179  114  61.5  54.5    40.5  156.5  360    51.25   45.42     33.75  43.47
0.16     840   3456      64  198  98   67    53.5    42.5  163    360    55.83   44.58     35.42  45.28
0.17     1066  3463      77  184  99   70.5  51      47.5  169    360    58.75   42.5      39.58  46.94
0.17     1120  3470      77  189  94   68.5  54      49    171.5  360    57.08   45        40.83  47.64
0.17     1186  3449      68  183  109  59    57      43.5  159.5  360    49.17   47.5      36.25  44.31
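
The Score and percentage columns in the table follow directly from the W/D/L counts. Here is a minimal Python sketch, assuming one point per win and half a point per draw. It also assumes 120 games against each of the three opponents, which is inferred from the 360-game total and is not stated in the post. The function names are illustrative only.

```python
def score_from_wdl(wins, draws, losses):
    """Return (points, games, percent) with 1 point per win and 0.5 per draw."""
    games = wins + draws + losses
    points = wins + 0.5 * draws
    return points, games, 100.0 * points / games


def percent(points, games):
    """Score percentage for `points` scored out of `games` games."""
    return 100.0 * points / games


# Row for version 0.17, NN 1120: W=77, D=189, L=94.
points, games, pct = score_from_wdl(77, 189, 94)
print(points, games, round(pct, 2))  # 171.5 360 47.64

# Per-opponent column, ASSUMING 120 games per opponent (360 / 3).
print(round(percent(68.5, 120), 2))  # 57.08
```
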
In this Arena output, is the third row the nodes searched and the fourth the speed? If yes, then why, after 1 second, do you have 36,000 nodes searched at 2,600 nodes per second? Second, my NPS, both in Arena (3.51) and from the command line, is about 2,000 at 1 second and increases slowly to 4,000 by 30s in a similar position. My NPS increases pretty slowly; is that the case with your NPS too? Maybe I have some issue with the drivers or the CUDA/cuDNN software? I refreshed them a month or so ago.
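
The mismatch raised in the question can be put in numbers. This is a small sketch, assuming the reported NPS is constant over the elapsed time (which need not hold, since NPS can ramp up during a search). The function name is made up for illustration.

```python
def unexplained_nodes(total_nodes, elapsed_s, nps):
    """Nodes reported beyond what elapsed_s * nps accounts for."""
    return total_nodes - elapsed_s * nps


# Figures from the post: 36,000 nodes shown after 1 second at 2,600 NPS.
print(unexplained_nodes(36_000, 1.0, 2_600))  # 33400.0
```
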

Javier Ros
Posts: 184
Joined: Fri Oct 12, 2012 10:48 am
Location: Seville (SPAIN)
Full name: Javier Ros

Re: leela is official(?) better than sf9

Post by Javier Ros » Wed Sep 12, 2018 9:25 pm

Laskos wrote:
Wed Sep 12, 2018 7:36 pm

In this Arena output, is the third row the nodes searched and the fourth the speed? If yes, then why, after 1 second, do you have 36,000 nodes searched at 2,600 nodes per second? Second, my NPS, both in Arena (3.51) and from the command line, is about 2,000 at 1 second and increases slowly to 4,000 by 30s in a similar position. My NPS increases pretty slowly; is that the case with your NPS too? Maybe I have some issue with the drivers or the CUDA/cuDNN software? I refreshed them a month or so ago.
I think the third row is the total number of nodes searched and the fourth is the speed in NPS.
I don't know why the total doesn't match time x NPS.
The love relationship between a chess engine tester and his computer can be summarized in one sentence:
Until heat do us part.

Laskos
Posts: 10936
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: leela is official(?) better than sf9

Post by Laskos » Wed Sep 12, 2018 9:32 pm

Javier Ros wrote:
Wed Sep 12, 2018 9:25 pm
I think the third row is the total number of nodes searched and the fourth is the speed in NPS.
I don't know why the total doesn't match time x NPS.
Do you have some sort of buffer time? It seems Lc0 spends some 15 or 20 seconds per move more than the shown usage of time. At 5' + 3'' it would mean a triple or quadruple time control (on average say 16s surplus of unknown origin + 8s regular time used, or 3s regular time used in the endgames). That might explain your results.
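
The arithmetic behind "triple or quadruple" can be checked quickly. The sketch below uses only the estimates given in this post (8s of regular time per move, 3s in endgames, and a surplus of roughly 16 to 20s). The function name is hypothetical.

```python
def time_multiplier(regular_s, surplus_s):
    """Ratio of actual thinking time (regular + surplus) to regular time."""
    return (regular_s + surplus_s) / regular_s


print(time_multiplier(8, 16))            # 3.0
print(time_multiplier(8, 20))            # 3.5
print(round(time_multiplier(3, 16), 2))  # 6.33 (endgame estimate)
```
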

Javier Ros
Posts: 184
Joined: Fri Oct 12, 2012 10:48 am
Location: Seville (SPAIN)
Full name: Javier Ros

Re: leela is official(?) better than sf9

Post by Javier Ros » Wed Sep 12, 2018 9:59 pm

Javier Ros wrote:
Wed Sep 12, 2018 9:25 pm

I think the third row is the total number of nodes searched and the fourth is the speed in NPS.
I don't know why the total doesn't match time x NPS.

But at the beginning of the search it matches well, as you can see in this video I have uploaded to YouTube:


https://youtu.be/nJG2X6NcxYM

The GPU temperature is shown in green and the temperature of one core in red.

Javier Ros
Posts: 184
Joined: Fri Oct 12, 2012 10:48 am
Location: Seville (SPAIN)
Full name: Javier Ros

Re: leela is official(?) better than sf9

Post by Javier Ros » Wed Sep 12, 2018 10:29 pm

Laskos wrote:
Wed Sep 12, 2018 9:32 pm
Do you have some sort of buffer time? It seems Lc0 spends some 15 or 20 seconds per move more than the shown usage of time. At 5' + 3'' it would mean a triple or quadruple time control (on average say 16s surplus of unknown origin + 8s regular time used, or 3s regular time used in the endgames). That might explain your results.
I don't have any buffer time; I wouldn't know how to set one up.
I simply installed the NVIDIA CUDA drivers for Windows 7, downloaded from the NVIDIA site, before putting the engines to play.

Here is another game from your test:

https://youtu.be/p4qTNir_VB0

Milos
Posts: 3968
Joined: Wed Nov 25, 2009 12:47 am

Re: leela is official(?) better than sf9

Post by Milos » Thu Sep 13, 2018 12:04 am

Laskos wrote:
Wed Sep 12, 2018 9:32 pm
Do you have some sort of buffer time? It seems Lc0 spends some 15 or 20 seconds per move more than the shown usage of time. At 5' + 3'' it would mean a triple or quadruple time control (on average say 16s surplus of unknown origin + 8s regular time used, or 3s regular time used in the endgames). That might explain your results.
It is actually quite simple, since Lc0 has a high cache value set: 1GB. All you need to do is play the same game a few times without resetting the engine, and you'll have 20-30k nodes in the cache at the start of almost every search in the given game. You can easily see that by also observing the depth at which it starts the moves: it is already at depth (not selective depth) 7, 8, even 10, indicating with high certainty that the move is already in the cache. An easy way of cheating and boosting performance.
What I don't get is why people are doing this; it's such childish behaviour. It's the reason why I don't even bother to read any of those "spectacular" results from unknown testers who are essentially nothing but immature fanboys.

Javier Ros
Posts: 184
Joined: Fri Oct 12, 2012 10:48 am
Location: Seville (SPAIN)
Full name: Javier Ros

Re: leela is official(?) better than sf9

Post by Javier Ros » Thu Sep 13, 2018 7:21 am

Milos wrote:
Thu Sep 13, 2018 12:04 am
It is actually quite simple, since Lc0 has a high cache value set: 1GB. All you need to do is play the same game a few times without resetting the engine, and you'll have 20-30k nodes in the cache at the start of almost every search in the given game. You can easily see that by also observing the depth at which it starts the moves: it is already at depth (not selective depth) 7, 8, even 10, indicating with high certainty that the move is already in the cache. An easy way of cheating and boosting performance.
What I don't get is why people are doing this; it's such childish behaviour. It's the reason why I don't even bother to read any of those "spectacular" results from unknown testers who are essentially nothing but immature fanboys.
You are wrong.
Arena restarts the program every 20 games, so the engines are reset too.
I don't think that 1 GB lets it 'learn' how to win a game against Stockfish and remember it 20 or 24 games later.

I don't think any of the testers have cheated in the way you describe.

"You can easily see that by also observing the depth it starts the moves, it is already depth (not selective depth) 7,8 even 10 indicating with high certainty that the move is already in cache"
The search of the first 6 plies is very fast, so it doesn't appear.

Here is the match played last night with Laskos's positions and the same players. I extracted the positions from Laskos's PGN and didn't realize that the positions were duplicated and that the Arena option "Repeat start position with switched colours" was on, so the games were repeated as you say. This shows that you are not right, because in the repeated games lc0 randomly gets sometimes better and sometimes worse results.

-----------------Lc01711261-----------------
Lc01711261 - Stockfish_8_x64_bmi2 : 5,5/20 0-9-11 (=00==0=0=0==00=0==0=) 28%
-----------------Stockfish_8_x64_bmi2-----------------
Stockfish_8_x64_bmi2 - Lc01711261 : 14,5/20 9-0-11 (=11==1=1=1==11=1==1=) 73%


What is the difference between Laskos's test and mine?
Well, the initial positions of my test were the same ones chosen for the AlphaZero-Stockfish8 match, and they consist of 2 or 3 natural moves that lc0 itself plays in its training games. The positions from Laskos's test are more complicated and lead to tactical positions where lc0 shows its weakness, a really serious tactical weakness. On the other hand, in the 12 positions from the AlphaZero-Stockfish8 match, the play leads to positional maneuvering in which lc0's positional ability slowly gains some advantage over Stockfish; sometimes lc0 won, but due to its limited endgame knowledge the game often ended in a draw.


Laskos
Posts: 10936
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: leela is official(?) better than sf9

Post by Laskos » Thu Sep 13, 2018 8:01 am

Wow, that's VERY interesting! So, you get about equal results against SF8 on 4 cores from the positions used in the AlphaZero paper, and a +0 -9 =11 result, or some -170 Elo points, from my balanced 3-mover positions? My positions are balanced and collected from GM and IM human games, so they are not some nutty positions from the random 2-mover openings of the SF framework. I have about 900 3-mover positions in my suite, and I am quite sure it's a good set to test on. The conclusion would be: the AlphaZero team chose on purpose openings suitable to AlphaZero to boost the performance significantly, by more than 100 Elo points. That's remarkable, as I didn't expect some regular 3-movers to change the result dramatically. One has to play diverse openings because, first, other engines might not reply the way Lc0 plays and, second, other engines might use a book. In fact the CCCC results from the standard initial position, which like mine show Lc0 at the level of Fire 7.1, show that other engines do not play the openings as Lc0 would like them to.
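
The "some -170 Elo points" figure is consistent with the usual logistic Elo model. Here is a minimal sketch, assuming that model (the post does not say how the number was obtained). `elo_diff` is an illustrative name.

```python
import math


def elo_diff(score_fraction):
    """Elo difference implied by a score fraction under the logistic model."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# +0 -9 =11 over 20 games: 5.5 points out of 20.
score = (0 + 0.5 * 11) / 20
print(round(elo_diff(score)))  # -168
```

The formula is undefined for scores of exactly 0% or 100%, and with only 20 games the error margin on such an estimate is wide.
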

But anyway, I don't understand the total nodes issue, the depth issue (this is not only about the initial depth, but also the depths reached after, say, 10s), or the NPS issue. Your NPS seems very stable from the beginning; I even saw peaks in NPS at 1-2s in a YouTube video, higher than after 10s, while my NPS increases slowly, being almost twice as high after a 10s search as after a search of less than 1s. The hash issue observed by Milos seemed quite plausible to me.

Interesting, and my advice would be to not use AlphaZero openings, they seem to be chosen not quite fairly against SF8.

frankp
Posts: 228
Joined: Sun Mar 12, 2006 2:11 pm

Re: leela is official(?) better than sf9

Post by frankp » Thu Sep 13, 2018 8:45 am

"The conclusion would be: AlphaZero team choose on purpose openings suitable to AlphaZero to boost significantly the performance, by more than 100 Elo points"
is a very strong accusation.
I thought there was no opening book at all.
Which meant, as with the other aspects of the match, SF8 was not playing optimally.

jp
Posts: 1411
Joined: Mon Apr 23, 2018 5:54 am

Re: leela is official(?) better than sf9

Post by jp » Thu Sep 13, 2018 10:18 am

Javier Ros wrote:
Mon Sep 10, 2018 5:42 pm
Using the following set of 12 positions of AlphaZero

[ECO "A10"][PlyCount "1"] 1. c4 *
[ECO "D06"][PlyCount "3"] 1. d4 d5 2. c4 *
[ECO "A46"][PlyCount "3"] 1. d4 Nf6 2. Nf3 *
[ECO "A50"][PlyCount "4"] 1. d4 Nf6 2. c4 e6 *
[ECO "E61"][PlyCount "5"] 1. d4 Nf6 2. c4 g6 3. Nc3 *
[ECO "C01"][PlyCount "4"] 1. e4 e6 2. d4 d5 *
[ECO "B50"][PlyCount "4"] 1. e4 c5 2. Nf3 d6 *
[ECO "B30"][PlyCount "4"] 1. e4 c5 2. Nf3 Nc6 *
[ECO "B40"][PlyCount "4"] 1. e4 c5 2. Nf3 e6 *
[ECO "C68"][PlyCount "6"] 1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 *
[ECO "B10"][PlyCount "2"] 1. e4 c6 *
[ECO "A05"][PlyCount "2"] 1. Nf3 Nf6 *
So you mean the 12 positions in the diagrams in their Table 2.
Don't know about others, but in my web browser it didn't show these in your original post, so I couldn't see the positions you meant.
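
The PlyCount tags in the quoted list can be checked against the move text. This is a rough sketch that only handles plain move text like the lines above (no comments, variations or annotations). `ply_count` is a made-up helper, not part of any PGN library.

```python
import re

MOVE_NUMBER = re.compile(r"\d+\.+")  # tokens such as "1." or "3..."


def ply_count(movetext):
    """Count half-moves in simple PGN move text without comments or variations."""
    return sum(
        1
        for token in movetext.split()
        if token != "*" and not MOVE_NUMBER.fullmatch(token)
    )


# Three of the quoted openings with their PlyCount tags.
openings = {
    "1. c4 *": 1,
    "1. d4 Nf6 2. c4 g6 3. Nc3 *": 5,
    "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 *": 6,
}
for movetext, tagged in openings.items():
    print(movetext, ply_count(movetext) == tagged)
```
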

frankp wrote:
Thu Sep 13, 2018 8:45 am
I thought there was no opening book at all.
Which meant, as with the other aspects of the match, SF8 was not playing optimally.
There was never an opening book but in the Table 2 games they started from positions in Table 2, where they claimed win/draw/loss: w 242/353/5, b 48/533/19. What they called their match was 100 games from the initial position.
