Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 5:31 pm
by jp
mhull wrote: Tue May 15, 2018 4:08 pm
jp wrote: Tue May 15, 2018 11:39 am
mhull wrote: Mon May 14, 2018 3:39 pm The idea of computer chess is the asymptotic approach to best play. You can't measure that approach if much weaker humans are ALWAYS interjecting their moves into the test. Any resulting Elo measures are contaminated with human moves.
If we calculated human Elo using games composed partly of computer moves, we would call that cheating.
No current computer chess program tells us anything about the asymptotic approach to best play, because they are all so far below best play.
With all due respect, you wouldn't know. Human assessment of how close programs (which are hundreds of Elo better players than them) are approaching best play is likely of no value.
You don't know either. No one knows exactly how bad computers are. But obviously they aren't anywhere near perfect. No human is assessing the actual moves with their own chess skill to say that, so elo is irrelevant.

You want to excuse bad play just because it's from a position that isn't the starting position. Do you think that excuse works for human players? Do you think there's no relation between how well Capablanca would play some random reasonable middlegame or endgame and his strength in a whole game? Do you think he would make a worse annotator of some bozo's game than the bozo because it's not his moves?

Your complaints would be more reasonable if there were no games with the conditions you like.

Seems you'll be unhappy unless all tests you don't like are banned. Just be happy that tests you do like exist.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 5:50 pm
by Albert Silver
Laskos wrote: Tue May 15, 2018 5:18 pm
JJJ wrote: Tue May 15, 2018 4:57 pm The good news is Leela is back on track. Only 25 Elo below its max! And progress is coming back very fast.
Don't look at their official self-play rating, it is only of some guidance. Look at matches against varied AB opposition. LC0 is now the strongest ever.
You mean as opposed to the normal builds, or are you referring to the normal builds? The LC0-cudnn builds are indeed the strongest, though they only run on machines equipped with Nvidia GPUs.

The self-attributed ratings for the NN are unreliable IMHO. I ran a 300-game match with v10 between NN223 and NN253, and they were about equal (facing each other). NN223 actually pulled fractionally ahead (+8 Elo) but well within the error margins obviously.
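
For anyone who wants to check what "well within the error margins" means for a match of this size, here is a minimal sketch of the usual calculation: convert the score fraction to an Elo difference with the logistic formula, then put an approximate 95% interval around it using the per-game variance. The win/draw/loss counts below are hypothetical, chosen only to give roughly +8 Elo over 300 games; the actual split of the match was not posted. The function names are made up for illustration.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins, draws, losses, z=1.96):
    """Return (estimate, lower, upper) Elo difference for a match result.

    Uses a normal approximation on the mean score per game; z=1.96
    corresponds to an approximate 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Variance of a single game result, where a game scores 1, 0.5 or 0.
    var = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return elo_diff(score), elo_diff(score - z * se), elo_diff(score + z * se)


# Hypothetical counts (NOT the real match data): 300 games, about +8 Elo.
estimate, lower, upper = match_elo(wins=90, draws=127, losses=83)
print(f"Elo diff {estimate:+.1f}, approx. 95% interval [{lower:+.1f}, {upper:+.1f}]")
```

With counts like these the interval extends tens of Elo to either side of zero, so a +8 result over 300 games cannot separate the two networks. A higher draw rate would narrow the interval somewhat, but not enough to change that conclusion.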

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 6:02 pm
by Laskos
Albert Silver wrote: Tue May 15, 2018 5:50 pm
Laskos wrote: Tue May 15, 2018 5:18 pm
JJJ wrote: Tue May 15, 2018 4:57 pm The good news is Leela is back on track. Only 25 Elo below its max! And progress is coming back very fast.
Don't look at their official self-play rating, it is only of some guidance. Look at matches against varied AB opposition. LC0 is now the strongest ever.
You mean as opposed to the normal builds, or are you referring to the normal builds? The LC0-cudnn builds are indeed the strongest, though they only run on machines equipped with Nvidia GPUs.

The self-attributed ratings for the NN are unreliable IMHO. I ran a 300-game match with v10 between NN223 and NN253, and they were about equal (facing each other). NN223 actually pulled fractionally ahead (+8 Elo) but well within the error margins obviously.
I just posted the result for ID292 in this thread; it is the strongest ever (the standard v0.10 CPU and GPU build on master).

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 6:05 pm
by mhull
jp wrote: Tue May 15, 2018 5:31 pm
mhull wrote: Tue May 15, 2018 4:08 pm
jp wrote: Tue May 15, 2018 11:39 am
No current computer chess program tells us anything about the asymptotic approach to best play, because they are all so far below best play.
With all due respect, you wouldn't know. Human assessment of how close programs (which are hundreds of Elo better players than them) are approaching best play is likely of no value.
You don't know either. No one knows exactly how bad computers are. But obviously they aren't anywhere near perfect. No human is assessing the actual moves with their own chess skill to say that, so elo is irrelevant.
But that is using the unknown to assess the unknown which only supports the point being made.
jp wrote: Tue May 15, 2018 5:31 pm You want to excuse bad play just because it's from a position that isn't the starting position.
I have never excused bad play. You are imposing this view on me to support your view. Since I'm not doing what you assume, your point is empty and void.
jp wrote: Tue May 15, 2018 5:31 pm Do you think that excuse works for human players? Do you think there's no relation between how well Capablanca would play some random reasonable middlegame or endgame and his strength in a whole game? Do you think he would make a worse annotator of some bozo's game than the bozo because it's not his moves?
It's fine if your measure is for ability at analysis for arbitrary positions or playing partial games from arbitrary positions. Sure, give the engines a bunch of mate-in-x puzzles to start from and then measure the Elo. But to give an Elo for those abilities is to use Elo for something it was not designed for. It was designed for playing full games of chess. Just a fact.
jp wrote: Tue May 15, 2018 5:31 pm Your complaints would be more reasonable if there were no games with the conditions you like. Seems you'll be unhappy unless all tests you don't like are banned. Just be happy that tests you do like exist.
Completely wrong, otherwise I would not be making this case. There are no extensive, on-going gauntlets for L0 playing all moves itself across its evolution, tracking its Elo progress. Instead there are only CCRL-style, cripple-bot games in overflowing abundance.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 6:19 pm
by jp
mhull wrote: Tue May 15, 2018 6:05 pm
jp wrote: Tue May 15, 2018 5:31 pm
mhull wrote: Tue May 15, 2018 4:08 pm With all due respect, you wouldn't know. Human assessment of how close programs (which are hundreds of Elo better players than them) are approaching best play is likely of no value.
You don't know either. No one knows exactly how bad computers are. But obviously they aren't anywhere near perfect. No human is assessing the actual moves with their own chess skill to say that, so elo is irrelevant.
But that is using the unknown to assess the unknown which only supports the point being made.
No, you're not. You're not using the unknown parts to assess the unknown. Your argument is like saying I can only measure the speed of a car by running alongside it and knowing my own speed.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 6:46 pm
by yanquis1972
anyone have general guidelines for setting up LCZ for tournament play? finally became curious enough to try it, but i couldn't find any explanation of the parameters.

would like to test an 'optimal' vanilla setting vs same for cudnn and/or ensure they're identical (cudnn has several more parameters than the default)

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 7:14 pm
by Albert Silver
Laskos wrote: Tue May 15, 2018 6:02 pm
Albert Silver wrote: Tue May 15, 2018 5:50 pm
Laskos wrote: Tue May 15, 2018 5:18 pm
Don't look at their official self-play rating, it is only of some guidance. Look at matches against varied AB opposition. LC0 is now the strongest ever.
You mean as opposed to the normal builds, or are you referring to the normal builds? The LC0-cudnn builds are indeed the strongest, though they only run on machines equipped with Nvidia GPUs.

The self-attributed ratings for the NN are unreliable IMHO. I ran a 300-game match with v10 between NN223 and NN253, and they were about equal (facing each other). NN223 actually pulled fractionally ahead (+8 Elo) but well within the error margins obviously.
I just posted the result for ID292 in this thread; it is the strongest ever (the standard v0.10 CPU and GPU build on master).
If confirmed, that will be very promising, as it will clearly indicate not only that the rut the neural network was in was caused by the bug, but also that it is past it and finally making genuine progress.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 8:36 pm
by yanquis1972
i expect slow progress, myself. i've watched about a hundred games or fragments & while leela's play out of the opening is excellent, she loses the plot in the middlegame. her endgame evals are clueless but the play itself nevertheless manages to be interesting.

i think one of the reasons (& i hope, the only reason) for this is obvious; literally every game she's played has been from the opening. my guess is that will be resolved with volume (lots & lots of volume).

my worry is the tactics. she will build up what looks to be an overwhelming attack but then doesn't execute on it. this is where i'm hoping my total ignorance of the process has me completely wrong in worrying that this is an inherent problem that google was able to overcome by testing on their hardware & conditions.

i assume the theory is that there are consistent patterns to attack but she just hasn't played enough of the correct moves to have learned them yet.

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 8:53 pm
by Guenther
Albert Silver wrote: Tue May 15, 2018 7:14 pm
Laskos wrote: Tue May 15, 2018 6:02 pm
Albert Silver wrote: Tue May 15, 2018 5:50 pm

You mean as opposed to the normal builds, or are you referring to the normal builds? The LC0-cudnn builds are indeed the strongest, though they only run on machines equipped with Nvidia GPUs.

The self-attributed ratings for the NN are unreliable IMHO. I ran a 300-game match with v10 between NN223 and NN253, and they were about equal (facing each other). NN223 actually pulled fractionally ahead (+8 Elo) but well within the error margins obviously.
I just posted the result for ID292 in this thread; it is the strongest ever (the standard v0.10 CPU and GPU build on master).
If confirmed, that will be very promising, as it will clearly indicate not only that the rut the neural network was in was caused by the bug, but also that it is past it and finally making genuine progress.
You know that there will be a rollback soon?

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Posted: Tue May 15, 2018 9:03 pm
by yanquis1972
did want to add that just after i posted that, i watched a game against rybka 3 that quickly boiled down to an early endgame. rybka evaluated drawish & stayed there, leela +2 or so (ended drawn). i realized the answer there is pretty obvious too; not every game contains an endgame, but most would've, & the large majority probably ended decisively.

also forgot to mention the other hardware aspect to the tactical problem; while we're waiting for millions of games & hoping she stumbles upon the solution often enough to learn it, i'm guessing google used training h/w that could calculate several orders of magnitude beyond what lc0 does. but i'm hopefully wrong & it was volume-focused.

stumbled on this graph (re strength vs stockfish based on time per move) which is interesting as well
Image