about an idea to learn better for LC0

Uri Blass · Post by **Uri Blass** » Wed Apr 18, 2018 9:40 pm

I read that LC0 could even miss a mate in 1 (not sure if that is still the case today), so the question is: why not have a simple rule that says
"play mate in 1 if there is a mate in 1"?

It is not specific to chess and can be done in any game, and it could probably save a lot of training games in which LC0 missed a mate in 1 and got the wrong result.
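A minimal sketch of such a game-independent "win in 1" rule, using tic-tac-toe as a stand-in so that it stays self-contained. All names here (`winning_move`, `choose_move`, `first_legal_policy`) are hypothetical and are not taken from the LC0 code; in chess the same idea would need a legal move generator and a checkmate test in place of `legal_moves` and `is_win`.

```python
from typing import Callable, List, Optional, Tuple

# A tic-tac-toe board: 9 cells, each "X", "O" or " " (empty).
Board = Tuple[str, ...]

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def legal_moves(board: Board) -> List[int]:
    """Indices of all empty cells."""
    return [i for i, cell in enumerate(board) if cell == " "]


def play(board: Board, move: int, player: str) -> Board:
    """Return a new board with `player` placed on `move`."""
    return board[:move] + (player,) + board[move + 1:]


def is_win(board: Board, player: str) -> bool:
    """True if `player` occupies a complete line."""
    return any(all(board[i] == player for i in line) for line in WIN_LINES)


def winning_move(board: Board, player: str) -> Optional[int]:
    """Return a move that wins immediately, or None if there is none."""
    for move in legal_moves(board):
        if is_win(play(board, move, player), player):
            return move
    return None


def choose_move(board: Board, player: str,
                policy: Callable[[Board, str], int]) -> int:
    """Apply the 'win in 1' rule first; otherwise defer to the policy."""
    forced = winning_move(board, player)
    return forced if forced is not None else policy(board, player)


def first_legal_policy(board: Board, player: str) -> int:
    """Hypothetical stand-in for the network plus search."""
    return legal_moves(board)[0]
```

The rule costs one extra pass over the legal moves per position, and it only overrides the policy when an immediate win actually exists.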

I wonder whether this simple idea could help LC0 play chess better.

Another question is whether using a slower time control in the training games could help LC0 improve faster.

Did people test these ideas, or are people only interested in doing the same as A0?

Robert Pope · Post by **Robert Pope** » Wed Apr 18, 2018 10:36 pm

Uri Blass wrote: I read that LC0 could even miss a mate in 1 (not sure if that is still the case today), so the question is: why not have a simple rule that says
"play mate in 1 if there is a mate in 1"?

It is not specific to chess and can be done in any game, and it could probably save a lot of training games in which LC0 missed a mate in 1 and got the wrong result.

I wonder whether this simple idea could help LC0 play chess better.

Another question is whether using a slower time control in the training games could help LC0 improve faster.

Did people test these ideas, or are people only interested in doing the same as A0?

Using the same framework as A0 is a starting point, and given A0's results, it makes sense to see how far that concept can be taken.

There are dozens and dozens of things like this that might speed learning, but Leela is a single project and can obviously only travel down one path, and the path that the developer chose is A0. It's not really feasible to test a dozen different ways of doing things before continuing. They've spent a month and millions of games just validating a single approach (and they are still working out bugs in the process).

Anybody that wants to try something different is welcome to fork it, but as long as Leela is making steady progress, I personally don't see the point of dividing resources.

Matthias Gemuh · Post by **Matthias Gemuh** » Wed Apr 18, 2018 10:57 pm

Uri Blass wrote: ... why not have a simple rule that says
"play mate in 1 if there is a mate in 1"?

...

That sounds like abandoning LC0 and moving to LC1

Robert Pope · Post by **Robert Pope** » Wed Apr 18, 2018 10:59 pm

Matthias Gemuh wrote:
Uri Blass wrote: ... why not have a simple rule that says
"play mate in 1 if there is a mate in 1"?

...
That sounds like abandoning LC0 and moving to LC1

Not that there's anything wrong with that.
/seinfeld

Henk · Post by **Henk** » Wed Apr 18, 2018 11:04 pm

Maybe, if generating training examples is the most expensive part, one could start with examples that have only one proven best move. Mate in 1 would be the easiest to start with. But generating them in another way and collecting them may be too much work, and storing them might take too much space. And how many do you need?

I don't like extra checks, because this zero method is already tremendously slow, and they would make it take even more time to finish one training game. [So I'm only interested in changes that speed up playing a game.]

hgm · Post by **hgm** » Thu Apr 19, 2018 6:33 am

Robert Pope wrote:It's not really feasible to test a dozen different ways of doing things before continuing.

Actually that is the only sensible thing to do, when you embark on a gigantic task. Just randomly picking a method, and then sticking to it, is unlikely to be within a factor 10 from the optimum. So spending 20% of the projected time to evaluate 5 different methods in a model problem (e.g. Los Alamos Chess) can easily save you a factor 3 in total work.
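One way to read that arithmetic, as a rough sketch: assume the pilot study costs a fixed fraction of the originally projected work, and that the best method it finds is some factor more efficient than the method that would otherwise have been picked. Both function names are hypothetical, and the model assumes that results on the small model problem carry over to full chess, which is not guaranteed.

```python
def total_cost(pilot_fraction: float, speedup: float) -> float:
    """Total work relative to the baseline plan (baseline = 1.0).

    pilot_fraction: share of the projected work spent comparing methods.
    speedup: how much more efficient the selected method is (assumed).
    """
    return pilot_fraction + 1.0 / speedup


def break_even_speedup(pilot_fraction: float) -> float:
    """Smallest speedup at which the pilot study pays for itself."""
    return 1.0 / (1.0 - pilot_fraction)


# A 20% pilot that finds a method 10x more efficient:
relative = total_cost(0.2, 10.0)   # about 0.3 of the baseline work
saving = 1.0 / relative            # a bit more than a factor of 3
```

Under these assumptions a 20% pilot pays off as soon as the selected method is more than 1.25 times as efficient as the default.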

The AlphaZero method was selected because it was good for Go, when Chess was still completely out of the picture. Google did not have to care for efficiency, because they have the resources to be 100 times inefficient, and still get the job done.

Michel · Post by **Michel** » Thu Apr 19, 2018 7:55 am

I view LC0 as a scientific experiment to reproduce the results claimed by Google. By following the A0 approach one always gets an answer: either the A0 approach works or it doesn't. So the resources poured into the experiment are never wasted.

If one tries a different approach and it fails, there is no conclusion. The failure might mean that the approach is the wrong one after all, or that NNs+MCTS generally do not work and Google somehow screwed up.

There is another argument for following the A0 approach: comparing different alternatives in a statistically sound way is very hard and requires enormous resources. People typically do not want to devote the required resources, especially when one approach is already performing very well. Usually people do not even have the skill, or the time, to actually set up such a comparison (comparisons with many alternatives are much trickier than comparisons with two alternatives).

noobpwnftw · Post by **noobpwnftw** » Thu Apr 19, 2018 8:22 am

The definition of "reproducing something that is performing very well" comes with a precondition: it requires a similar amount of hardware capacity (such as rollouts per second) to get past the horizon effect, as clearly manifested in LC0's current tactical oversights.

If one takes LC0's network and runs it on Google's hardware, it may also end up "performing very well". Mission accomplished? I think people are more willing to accept that than to imply that it might not scale that well after all. There is no way to prove or disprove it now, BTW.

The reality is that smaller nets usually have a lower ceiling on the same hardware that can efficiently run a bigger one. People tend to fall back on "just give it more time" rather than "just get better hardware", since the latter is unachievable.

Now it comes down to how to get better performance on regular hardware. Since part of LC0's project goal is to make a strong chess engine, I think sticking to "what they did" should go along with "what they had"; otherwise you are going nowhere, even if Google somehow fed you the net they trained.

Henk · Post by **Henk** » Thu Apr 19, 2018 8:46 am

hgm wrote:
Robert Pope wrote:It's not really feasible to test a dozen different ways of doing things before continuing.
Actually that is the only sensible thing to do, when you embark on a gigantic task. Just randomly picking a method, and then sticking to it, is unlikely to be within a factor 10 from the optimum. So spending 20% of the projected time to evaluate 5 different methods in a model problem (e.g. Los Alamos Chess) can easily save you a factor 3 in total work.

The AlphaZero method was selected because it was good for Go, when Chess was still completely out of the picture. Google did not have to care for efficiency, because they have the resources to be 100 times inefficient, and still get the job done.

And how do you evaluate a method? You can measure progress by testing against an older version. But how do you measure progress in the future? (Progress might get stuck later on.) You don't know the state of a program in the future.

Maybe try out the method on very small problems first, which reach the final state very quickly. But then it is solving different problems, and what holds for these smaller problems might not hold for standard chess.
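Measuring progress against an older version, as mentioned above, is usually expressed as an Elo difference derived from the match score. A minimal sketch using the standard logistic Elo formula; the function name `elo_diff` is hypothetical, and the sketch ignores error bars, which matter a lot for short matches.

```python
import math


def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Estimated Elo difference of the new version over the old one.

    Uses the logistic Elo model: score = 1 / (1 + 10 ** (-diff / 400)).
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        # A 0% or 100% score gives no finite estimate.
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)
```

An even score gives a difference of 0, and a 75% score corresponds to roughly 191 Elo. This only measures progress that has already happened; it says nothing about whether progress will stall later.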

Henk · Post by **Henk** » Thu Apr 19, 2018 9:10 am

hgm wrote: ..
Google did not have to care for efficiency, because they have the resources to be 100 times inefficient, and still get the job done.

Looks like Google is wasting its resources on chess. Might that be the reason they quit with AlphaZero for chess? Chess is a luxury problem, for it is not necessary to play chess. Although when people get bored, worse may happen. (Bread and games.)
