LCZero Accomplishments and Goals Thus Far

Albert Silver · Post by **Albert Silver** » Thu May 03, 2018 2:36 am

MonteCarlo wrote:
Albert Silver wrote:
Dariusz Orzechowski wrote:
Dann Corbit wrote:
Dariusz Orzechowski wrote:
Albert Silver wrote:
Dariusz Orzechowski wrote:For me it's exciting but it's not that incredible as I was following Leela Zero in go and something similar happened there. I like playing against pure policy networks (1 playout, no search), it's a lot of fun in both chess and go. In go pure network is already around 3-4 dan level which would correspond to FM/IM level in chess I guess. I'm pretty confident it's possible to get pure network to around 5-6 dan in go and GM level in chess.
I think it is already at least SuperGM level.
This is just highly inaccurate. Pure network is perhaps around 2000 now, which makes it a very decent player anyway.
I think this is apples and oranges.
Albert was talking about the current state of the LCZero project and you about using a pure network.
I have no idea how it's possible to spin it this way. He quoted what I said about pure network and answered with a statement that is demonstrably not true.
It is possible because you presented a match result, then quoted LCZero's strength at ~2900 CCRL, which I am pretty sure is not 2000 Elo. I also have no idea what the strength of a 'pure network' even means. If you are suggesting plain single node play with no search tree of any kind then I am sure it is MUCH weaker than 2000 Elo.
Yes, he's talking about a 1 visit "search", as he explicitly said (and I highlighted above).

In classical time controls, sure, at 1 visit LC0 is probably not 2000 on any human scale. However, at any time control where a human has to even remotely try to match the pace of a 1 visit "search", it's likely far stronger than 2000.

Ok, so the idea is to replace the search with a single playout? How is that a 'pure network'?

MonteCarlo · Post by **MonteCarlo** » Thu May 03, 2018 2:44 am

Well, the search algorithm plays the move with the most visits.

The initial visit is made to the move with the highest probability from the policy head.

With a 1 visit "search", the search will evaluate the position after the move with the highest probability from the policy head, but no matter what the evaluation is, it will play that move.

It has 1 visit, and everything else has 0.

So you're literally just playing against the policy head, as he indicated he liked to do.
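The point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LCZero's actual code: the names `Child`, `puct_score`, `search`, and `c_puct`, the prior values, and the simplified AlphaZero-style selection formula (with `parent_visits + 1` under the square root) are all hypothetical. It shows that with a budget of 1 visit, the move returned is the policy head's top move regardless of what the evaluation says.

```python
from dataclasses import dataclass


@dataclass
class Child:
    move: str
    prior: float            # policy-head probability (made-up values below)
    visits: int = 0
    value_sum: float = 0.0


def puct_score(child, parent_visits, c_puct=1.5):
    # Simplified PUCT-style score; c_puct and the "+ 1" are assumptions.
    q = child.value_sum / child.visits if child.visits else 0.0
    u = c_puct * child.prior * (parent_visits + 1) ** 0.5 / (1 + child.visits)
    return q + u


def search(priors, evaluate, visits):
    children = [Child(move, p) for move, p in priors.items()]
    for n in range(visits):
        # Pick the child with the best score, evaluate it, record the visit.
        best = max(children, key=lambda c: puct_score(c, n))
        best.visits += 1
        best.value_sum += evaluate(best.move)
    # The move actually played is the one with the most visits.
    return max(children, key=lambda c: c.visits).move


priors = {"e2e4": 0.5, "d2d4": 0.3, "g1f3": 0.2}
# Even if the evaluation of the visited move is terrible, 1 visit plays it.
print(search(priors, lambda move: -1.0, visits=1))  # prints e2e4
```

With a larger visit budget the evaluations start to influence which child accumulates visits, which is where the value head and search begin to matter.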

Albert Silver · Post by **Albert Silver** » Thu May 03, 2018 3:49 am

MonteCarlo wrote:Well, the search algorithm plays the move with the most visits.

The initial visit is made to the move with the highest probability from the policy head.

With a 1 visit "search", the search will evaluate the position after the move with the highest probability from the policy head, but no matter what the evaluation is, it will play that move.

It has 1 visit, and everything else has 0.

So you're literally just playing against the policy head, as he indicated he liked to do

So it is essentially just evaluating the position with the highest value move according to its policy. Ok, well, needless to say, if you reduce the human's time to impossible controls such as one minute or less with no increment, the human will in all likelihood lose at some juncture, but if that is what qualifies the 'pure network' as 2000 or GM, we can say it is already a GM at g/10s. On the other hand, if you set the lower limit to g/5 (minutes) and use this 'pure network', it will never reach GM level, due to the tactics. Unless you handpick the GM I suppose...

MonteCarlo · Post by **MonteCarlo** » Thu May 03, 2018 4:21 am

Albert Silver wrote:
MonteCarlo wrote:Well, the search algorithm plays the move with the most visits.

The initial visit is made to the move with the highest probability from the policy head.

With a 1 visit "search", the search will evaluate the position after the move with the highest probability from the policy head, but no matter what the evaluation is, it will play that move.

It has 1 visit, and everything else has 0.

So you're literally just playing against the policy head, as he indicated he liked to do
So it is essentially just evaluating the position with the highest value move according to its policy. Ok, well, needless to say, if you reduce the time to the human to impossible controls such as one minute or less with no increment, the human will lose in all likelihood at some juncture, but if that is qualifying the 'pure network' as 2000 or GM, we can say it is already a GM at g/10s. On the other hand, if you set the lower limit to g/5 (minutes) and use this 'pure network', it will never reach GM level, due to the tactics. Unless you handpick the GM I suppose...

On the time control, even if it's 1+1, I'd wager 1 visit is around 2000 now.

On the claim that 1 visit will never be GM-level at G/5 or slower, that should age well.

Let's talk again in a year.

Albert Silver · Post by **Albert Silver** » Thu May 03, 2018 6:07 am

MonteCarlo wrote:
Albert Silver wrote:
MonteCarlo wrote:Well, the search algorithm plays the move with the most visits.

The initial visit is made to the move with the highest probability from the policy head.

With a 1 visit "search", the search will evaluate the position after the move with the highest probability from the policy head, but no matter what the evaluation is, it will play that move.

It has 1 visit, and everything else has 0.

So you're literally just playing against the policy head, as he indicated he liked to do
So it is essentially just evaluating the position with the highest value move according to its policy. Ok, well, needless to say, if you reduce the time to the human to impossible controls such as one minute or less with no increment, the human will lose in all likelihood at some juncture, but if that is qualifying the 'pure network' as 2000 or GM, we can say it is already a GM at g/10s. On the other hand, if you set the lower limit to g/5 (minutes) and use this 'pure network', it will never reach GM level, due to the tactics. Unless you handpick the GM I suppose...
On the time control, even if it's 1+1, I'd wager 1 visit is around 2000 now.

On the claim that 1 visit will never be GM-level at G/5 or slower, that should age well. Let's talk again in a year.

At 1+1 I would imagine it could do a lot better than 2000.

mhull · Post by **mhull** » Thu May 03, 2018 7:25 am

Daniel Shawul wrote:Sigh... wake me up when it is 2800 Elo running on a single CPU core, which is what every other engine uses in rating lists. As far as I am concerned, it is still a 2100 Elo engine there.

Your demand for uniform platform comparison is commutative. Why not demand all the other engines run on a GPU?

Then it would be "equal".

jp · Post by **jp** » Thu May 03, 2018 10:50 am

mhull wrote:
Daniel Shawul wrote:Sigh... wake me up when it is 2800 Elo running on a single CPU core, which is what every other engine uses in rating lists. As far as I am concerned, it is still a 2100 Elo engine there.
Your demand for uniform platform comparison is commutative. Why not demand all the other engines run on a GPU?

Then it would be "equal".

So if I teach my pet frog to play chess, I should demand you play it underwater rather than on land, because demands should be "commutative"?

JJJ · Post by **JJJ** » Thu May 03, 2018 11:01 am

I guess someone is frustrated at having a computer with 2 cores and no graphics card, and instead of saying "I'd so much like to play with a strong LCZero" he just says "LCZero is bad with 1 core".

Okay, try meditation. It might help.

hgm · Post by **hgm** » Thu May 03, 2018 12:11 pm

jp wrote:So if I teach my pet frog to play chess, I should demand you play it underwater rather than on land, because demands should be "commutative"?

A better analogy: If I teach my pet shark to play chess, wouldn't it be a bit unfair to insist he will play it sitting in a chair in a room filled with air?

jp · Post by **jp** » Thu May 03, 2018 12:31 pm

hgm wrote:
jp wrote:So if I teach my pet frog to play chess, I should demand you play it underwater rather than on land, because demands should be "commutative"?
A better analogy: If I teach my pet shark to play chess, wouldn't it be a bit unfair to insist he will play it sitting in a chair in a room filled with air?

My frog can survive on land even if he prefers the pond. Your shark would just die.

You jumping into the pond with my frog is also fairer than me being forced into your shark's jaws in the ocean.
