Something goes wrong with lc0 since yesterday?

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

crem wrote: Wed Jun 13, 2018 9:42 am
There is indeed something wrong and we are investigating what exactly.

We've fixed some small bugs after previous runs and just restarted training to confirm everything is good.
But it turned out not to be that good.

Milos wrote: Wed Jun 13, 2018 2:43 pm
By we, you mean you, like Alex, right? :D

crem wrote: Wed Jun 13, 2018 6:18 pm
No. They weren't really bugs, more like infrastructural things to tune (test/training data separation, how data moves from one server to another, training multiple network sizes in parallel, etc.), and I personally did very little of that.
It was mostly Tilps, Error323 and nousian (those are nicks on Discord) who did that.

OK, it seems to have corrected itself now. You seem to have a general offset of maybe 600 Elo points compared to the two earlier nets; maybe you started from some very poor or unlucky "random" net. By now, ID256 seems to be close to the 2900-3000 CCRL 40/4' Elo level, or some 300 Elo points behind the best nets of the main branch (ID395 or so). Very fast progress in a small number of fast games.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Something goes wrong with lc0 since yesterday?

Post by duncan »

https://github.com/LeelaChessZero/lc0/w ... transition


The main server is still using the old lczero.exe client. The test server is using the new client. Most users should still use the main server, because the test server is still buggy, and we cannot support all users on it yet.
We recently updated the main server with a bootstrapped net to try to fix a bug with rule50.
About the usefulness of continuing your client on the main server:
We are getting test results about bootstrapping a net to recover from bugs. This could be useful if we need to do it again to fix more bugs, or change network architecture, etc.
For the short term (e.g. TCEC Season 13), main server may still produce the strongest net.
The current plan is to do a full restart from random net when the new server is ready.
Some users may feel the value of these tests and short term gains on the main server are not great enough. If so, please wait for the transition to be complete and join us again for the next run using the new lc0 client.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

Milos wrote: Wed Jun 13, 2018 2:43 pm
By we, you mean you, like Alex, right? :D

crem wrote: Wed Jun 13, 2018 6:18 pm
No. They weren't really bugs, more like infrastructural things to tune (test/training data separation, how data moves from one server to another, training multiple network sizes in parallel, etc.), and I personally did very little of that.
It was mostly Tilps, Error323 and nousian (those are nicks on Discord) who did that.

Laskos wrote: Thu Jun 14, 2018 12:08 am
OK, it seems to have corrected itself now. You seem to have a general offset of maybe 600 Elo points compared to the two earlier nets; maybe you started from some very poor or unlucky "random" net. By now, ID256 seems to be close to the 2900-3000 CCRL 40/4' Elo level, or some 300 Elo points behind the best nets of the main branch (ID395 or so). Very fast progress in a small number of fast games.

I hope this run is the "real" one. It's the fifth one if I counted correctly; they seem to be overdoing it with these tries. Also, ID48 had a sudden jump in performance compared to earlier nets (say ID44), and it's by far the best 6x64 net of all runs. In fact, its performance is pretty amazing, at least at short TC.

My gauntlet at short TC on GTX 1060 against AB engines:

earlier IDs of the 5th run: 26 to 36 out of 200
ID44: 36/200
ID48: 60/200 !!!!!

A large jump, outside error margins.
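For anyone who wants to check the "outside error margins" claim, here is a minimal sketch that converts the gauntlet scores into Elo differences against the field and puts a rough 95% interval on each score fraction. It assumes the usual logistic Elo model and a binomial variance (an upper bound when draws occur); the function names are illustrative, not taken from any testing tool.

```python
import math

def elo_from_score(points, games):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    p = points / games
    return -400.0 * math.log10(1.0 / p - 1.0)

def score_interval(points, games, z=1.96):
    """Rough 95% interval on the score fraction.

    Uses the binomial variance p*(1-p)/n, which is an upper bound when
    draws occur, so the interval is slightly conservative.
    """
    p = points / games
    half_width = z * math.sqrt(p * (1.0 - p) / games)
    return p - half_width, p + half_width

# Gauntlet scores quoted above, each out of 200 games.
results = {"ID44": 36.0, "ID48": 60.0}

for name, points in results.items():
    low, high = score_interval(points, 200)
    print(f"{name}: {elo_from_score(points, 200):+.0f} Elo vs. the field, "
          f"score interval [{low:.3f}, {high:.3f}]")
```

Under these assumptions the two score intervals do not overlap, and the implied gap between ID44 and ID48 is on the order of 115 Elo.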

Maybe some folks can communicate this outstanding result to the devs. It's by far the best result for a 6x64 net, and it comes out even better than the run 1 10x128 nets. In terms of CCRL 40/4' Elo, this ID48 6x64 net with the latest lc0 engine is at 3000+ points. The best 15x192 nets of the main branch were only 200 Elo points better. I would expect that the best 15x192 nets of this branch can reach some 3600 CCRL 40/4' Elo level, or Stockfish dev level.
Henk
Posts: 7216
Joined: Mon May 27, 2013 10:31 am

Re: Something goes wrong with lc0 since yesterday?

Post by Henk »

They are just demonstrating that using neural networks is inefficient. What can be wrong with that? Maybe less human work, but an enormous waste of electricity. If there is no knowledge, something has to pay. This only holds for the training/tuning/configuring part, of course.

For small problems it is doable. But chess is not that small. Be content with a local optimum.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

crem wrote: Wed Jun 13, 2018 6:18 pm
No. They weren't really bugs, more like infrastructural things to tune (test/training data separation, how data moves from one server to another, training multiple network sizes in parallel, etc.), and I personally did very little of that.
It was mostly Tilps, Error323 and nousian (those are nicks on Discord) who did that.

Laskos wrote: Thu Jun 14, 2018 12:08 am
OK, it seems to have corrected itself now. You seem to have a general offset of maybe 600 Elo points compared to the two earlier nets; maybe you started from some very poor or unlucky "random" net. By now, ID256 seems to be close to the 2900-3000 CCRL 40/4' Elo level, or some 300 Elo points behind the best nets of the main branch (ID395 or so). Very fast progress in a small number of fast games.

Laskos wrote: Sun Jun 17, 2018 8:49 am
I hope this run is the "real" one. It's the fifth one if I counted correctly; they seem to be overdoing it with these tries. Also, ID48 had a sudden jump in performance compared to earlier nets (say ID44), and it's by far the best 6x64 net of all runs. In fact, its performance is pretty amazing, at least at short TC.

My gauntlet at short TC on GTX 1060 against AB engines:

earlier IDs of the 5th run: 26 to 36 out of 200
ID44: 36/200
ID48: 60/200 !!!!!

A large jump, outside error margins.

Maybe some folks can communicate this outstanding result to the devs. It's by far the best result for a 6x64 net, and it comes out even better than the run 1 10x128 nets. In terms of CCRL 40/4' Elo, this ID48 6x64 net with the latest lc0 engine is at 3000+ points. The best 15x192 nets of the main branch were only 200 Elo points better. I would expect that the best 15x192 nets of this branch can reach some 3600 CCRL 40/4' Elo level, or Stockfish dev level.

ID54, although 30 Elo points better than ID48 in their self-play games, comes out a bit lower in my gauntlet: 55.5/200 compared to 60.0/200 for ID48, which is within error margins. I hope it doesn't start regressing in games against AB engines.
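
A quick way to see why 55.5/200 versus 60.0/200 is within error margins while 36/200 versus 60/200 is not: a minimal sketch of a two-sample z-statistic on the score fractions. It assumes the two gauntlets are independent and uses a binomial variance (conservative when there are draws); the function name is hypothetical.

```python
import math

def score_diff_z(points_a, points_b, games):
    """z-statistic for the difference of two score fractions.

    Assumes both results come from `games` independent games each and
    uses the binomial variance p*(1-p)/n as an upper bound.
    """
    pa = points_a / games
    pb = points_b / games
    se = math.sqrt(pa * (1.0 - pa) / games + pb * (1.0 - pb) / games)
    return (pa - pb) / se

# Scores quoted in this thread, each out of 200 games.
z_54_vs_48 = score_diff_z(55.5, 60.0, 200)
z_48_vs_44 = score_diff_z(60.0, 36.0, 200)

print(f"ID54 vs ID48: z = {z_54_vs_48:+.2f}")
print(f"ID48 vs ID44: z = {z_48_vs_44:+.2f}")
```

With the usual |z| > 1.96 threshold, the ID54/ID48 difference is not significant under these assumptions, whereas the ID44 to ID48 jump is.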
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

Henk wrote: Sun Jun 17, 2018 10:47 am
They are just demonstrating that using neural networks is inefficient. What can be wrong with that? Maybe less human work, but an enormous waste of electricity. If there is no knowledge, something has to pay. This only holds for the training/tuning/configuring part, of course.

For small problems it is doable. But chess is not that small. Be content with a local optimum.

I don't know what you are babbling about. They (the new test server) showed by yesterday morning that with 500,000 fast self-play games in half a day, one can train a network to beat 99% of humans, and by yesterday evening, after 1,500,000 fast self-play games in a day and a half, the network can be trained to beat ALL humans. Now one has to see the limits of this small 6x64 network, and to take care of the training parameters like CPUCT, temperature, expansion parameter, etc. The result seems remarkable to me, and I hope they won't start from scratch again. What's the point? Even with the hardware resources available now, they can quickly overtake the main branch.
Henk
Posts: 7216
Joined: Mon May 27, 2013 10:31 am

Re: Something goes wrong with lc0 since yesterday?

Post by Henk »

Laskos wrote: Sun Jun 17, 2018 12:54 pm
Henk wrote: Sun Jun 17, 2018 10:47 am They are just demonstrating that using neural networks is inefficient. What can be wrong with that. Maybe less human work but enormous waste of electricity. If there is no knowledge something has to pay. This only holds for training/tuning/configuring part of course.

For small problems it is do-able. But chess is not that small. Be glad with a local optimum.
I don't know what you are babbling. They (the new testserver) showed by yesterday morning that in 500,000 fast self-games in half-a-day, one can train a network to beat 99% of humans, and by yesterday evening, after 1,500,000 fast self-games, in a-day-and-a-half, the network can be trained to beat ALL humans. Now one has to see the limits of this small 6x64 network, and to take care of the training parameters like CPUCT, temperature, expansion parameter, etc. The result seems remarkable to me, and I hope they won't start from scratch again, what's the point? Even with now available hardware resources, they can quickly overcome the main branch.
Beating all humans is not enough for a machine. Also, 18 million training games had already been played.
Last edited by Henk on Tue Jun 19, 2018 5:46 pm, edited 2 times in total.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Something goes wrong with lc0 since yesterday?

Post by duncan »

Laskos wrote: I don't know what you are babbling. They (the new testserver) showed by yesterday morning that in 500,000 fast self-games in half-a-day, one can train a network to beat 99% of humans, and by yesterday evening, after 1,500,000 fast self-games, in a-day-and-a-half, the network can be trained to beat ALL humans. Now one has to see the limits of this small 6x64 network, and to take care of the training parameters like CPUCT, temperature, expansion parameter, etc. The result seems remarkable to me, and I hope they won't start from scratch again, what's the point? Even with now available hardware resources, they can quickly overcome the main branch.
Today, beating all humans is trivial. Beating Stockfish 20-0 in a tournament, with chess of a kind not seen so far, is a hard goal to achieve. A0 is going to have to play at its best to have even a chance of achieving this.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Something goes wrong with lc0 since yesterday?

Post by Laskos »

duncan wrote: Sun Jun 17, 2018 1:56 pm
Laskos wrote: I don't know what you are babbling. They (the new testserver) showed by yesterday morning that in 500,000 fast self-games in half-a-day, one can train a network to beat 99% of humans, and by yesterday evening, after 1,500,000 fast self-games, in a-day-and-a-half, the network can be trained to beat ALL humans. Now one has to see the limits of this small 6x64 network, and to take care of the training parameters like CPUCT, temperature, expansion parameter, etc. The result seems remarkable to me, and I hope they won't start from scratch again, what's the point? Even with now available hardware resources, they can quickly overcome the main branch.
Today beating All humans is trivial. Beating stockfish 20 0 in a tournament with unseen so far chess is a hard to achieve goal. A0 is going to have to play at its best to have even a chance to achieve this.
Beating all humans by giving it the rules and the goal of Chess and letting it self-play to "understand" what Chess is about and how to play it well is easy? Then give it the rules and the goal of "human life", let it live "human lives" 1 million times, and you get a super-human, no? The (hard to surmount) problems would be shifted to describing the "rules and goal of life" and the environment in which to live those 1 million lives, right? But you get what I am talking about, aside from the fact that, as I know from our past discussions, you have had a hard time understanding anything.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Something goes wrong with lc0 since yesterday?

Post by duncan »

Laskos wrote:
Beating all humans by giving the rules and the goal of Chess and letting it self-play to "understand" what is Chess about and how to play it well, is easy?
The above is of course not easy and has only been achieved in the last few months. Beating humans with computers running conventional A/B software is easy in the sense that it was achieved a long time ago.