chrisw wrote: ↑Fri Nov 30, 2018 10:11 pm
Sorry for being stupid, but these matches are what exactly? The current version versus its predecessor? Test games from lc0.org?
So, the test procedure is to check if x+1 wins against x? If it doesn’t, what happens: does it go forward anyway, or does it get junked, and x stays until there’s an x+1 that beats it?
Those are the matches that are used for the selfplay rating and also for promoting new nets
(a net fails only if it is much weaker than x-1, though; I don't know the exact number now, maybe it was 100...).
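For anyone wondering how such a gate could be computed, here is a minimal Python sketch. It assumes the standard logistic Elo formula for turning a match score into a rating difference, and it assumes the "100" above is an Elo margin; the function names and the `max_drop` parameter are made up for illustration and are not taken from the lc0 server code.

```python
import math


def elo_diff(wins, draws, losses):
    """Elo difference implied by a match result (logistic Elo model)."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("need at least one game")
    score = (wins + 0.5 * draws) / games
    # Clamp so that a 0% or 100% score does not give an infinite rating.
    eps = 1e-6
    score = min(max(score, eps), 1.0 - eps)
    return -400.0 * math.log10(1.0 / score - 1.0)


def passes_gate(wins, draws, losses, max_drop=100.0):
    """Hypothetical gate: the new net fails only if it is much weaker.

    max_drop=100.0 is an assumption based on the half-remembered figure
    in the post, not a confirmed project setting.
    """
    return elo_diff(wins, draws, losses) > -max_drop
```

With this kind of gate a net that scores 40% against its predecessor (roughly 70 Elo weaker) would still be promoted, which is the point being made about the selfplay rating.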
I started those stats only to show that a huge 'French invasion' has crept in for a while now:
all other openings were played much less than with previous nets, and therefore the selfplay rating had become inflated
in comparison to the real rating. I was also surprised to discover such a high number of dupes (measured over 60 plies).
If I understand it correctly, this is a consequence of policy sharpening, which is now being reduced again.
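To make the dupe count concrete, a rough Python sketch follows. It assumes each game is a list of move strings and that "measured for 60 plies" means two games count as dupes when their first 60 plies are identical; `count_duplicates` is a hypothetical helper, not the script actually used for these stats.

```python
from collections import Counter


def count_duplicates(games, plies=60):
    """Count games whose first `plies` moves repeat an earlier game.

    games: iterable of move lists, e.g. [["e4", "e6", "d4", "d5"], ...]
    Games shorter than `plies` are compared on their full move list.
    """
    prefixes = Counter(tuple(moves[:plies]) for moves in games)
    # Every game beyond the first one sharing a prefix counts as a dupe.
    return sum(n - 1 for n in prefixes.values())


games = [
    ["e4", "e6", "d4", "d5"],
    ["e4", "e6", "d4", "d5"],
    ["e4", "c5"],
]
dupes = count_duplicates(games, plies=4)
```

Lowering `plies` to a small number turns the same helper into a crude measure of how concentrated the opening choice is.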
Ok, thanks. I thought that was what they were doing, but didn’t know the figure at which they junked a new version.
I think this test and development process results in running away with good-looking “elo” climbs that are not actually generalising, while leaving behind, unnoticed, the “actual real good versions”. The stars are not identified, and the effort goes into chasing things that don’t go anywhere.
To avoid needlessly chasing its own proverbial tail, I still think they should also test LC0 against known entities with a fixed, established rating - at least periodically. That should automatically be part of the testing protocol.
carldaman wrote: ↑Sat Dec 01, 2018 1:01 am
To avoid needlessly chasing its own proverbial tail, I still think they should also test LC0 against known entities with a fixed, established rating - at least periodically. That should automatically be part of the testing protocol.
Which lc0 would you like to test? This has been the production run since yesterday .....
I think LC0 is progressing rapidly. It is now close to overtaking any version of Stockfish. And this is with LC0 running on a slow graphics card vs. an i7-6700. The progress in recent weeks has been impressive in testing with real games.
"As our previous 'run to completion' has not made much progress recently, main page and default contributions have been reset to our new actually testing run while we wait for lc0 0.20.0 to be prepared with a new network architecture for our next main run."
carldaman wrote: ↑Sat Dec 01, 2018 5:57 am
Then why is the Leela site stating this?
"As our previous 'run to completion' has not made much progress recently, main page and default contributions have been reset to our new actually testing run while we wait for lc0 0.20.0 to be prepared with a new network architecture for our next main run."
They are testing new ideas in the 30xx series while waiting for the version 0.20 engine with a 40-block net. After full training, 30xx may be on par with 11xx or minimally better than 11xx, but it is unlikely to overtake SF 10 in tcec/cccc at this stage.
As 40-block "Go" has undoubtedly been proven better than A/B pruning, we are expecting similar results in chess.
If that becomes reality, A/B engines will be history (like Deep Blue technology).
Viral jokes about "End of an Era in 6 months" have been popular in cccc/tcec since Sept 2018, so March 2019 should be the deadline for whether those rumors come true or not.
mwyoung wrote: ↑Sat Dec 01, 2018 1:55 am
I think LC0 is progressing rapidly. It is now close to overtaking any version of Stockfish. And this is with LC0 running on a slow graphics card vs. an i7-6700. The progress in recent weeks has been impressive in testing with real games.
When you say slow graphics card, what do you mean?
What is the price of the graphics card and what is the price of the i7-6700?
mwyoung wrote: ↑Sat Dec 01, 2018 1:55 am
I think LC0 is progressing rapidly. It is now close to overtaking any version of Stockfish. And this is with LC0 running on a slow graphics card vs. an i7-6700. The progress in recent weeks has been impressive in testing with real games.
You mean recent test30 nets? Even on my very powerful RTX 2070 GPU they are nowhere close to SF_10 on 4 threads. My CPU is an overclocked i7-4790, probably quite close to your i7-6700. Only the best test10 nets can be stronger than SF_10 on my RTX 2070. On a "slow graphics card", even if by that you mean a GTX 1060 (which is not that slow), at best only some test10 nets can be level with SF_8 on 4 threads, and test30 is at about SF_7 or SF_6 level. And there has been almost no progress in real Elo points in test30 for 400 nets already, despite 3000+ self-Elo points of "improvement": at best some 30-40 Elo points.
mwyoung wrote: ↑Sat Dec 01, 2018 1:55 am
I think LC0 is progressing rapidly. It is now close to overtaking any version of Stockfish. And this is with LC0 running on a slow graphics card vs. an i7-6700. The progress in recent weeks has been impressive in testing with real games.
When you say slow graphics card, what do you mean?
What is the price of the graphics card and what is the price of the i7-6700?
That's about the price of an AMD Ryzen Threadripper 1950X @3.8GHz ($313), which can do 48,391,000 NPS in the bench run by Ipman chess.
These GPUs, in turn:
Radeon RX Vega
GeForce GTX 1070
are at about that same price point.