Wouldn’t it be nice if there was a ChessNet50

chrisw · Post by **chrisw** » Sat Jul 13, 2019 1:07 pm

... something along the lines of, say, ResNet50: a massively trained chess tower whose head programmers could remove, add some layers in its place, and so take advantage of its knowledge patterns, hopefully generalized.

Public domain of course.

Daniel Shawul · Post by **Daniel Shawul** » Sun Jul 14, 2019 3:33 pm

Maybe Leela nets will be public domain someday... after all, they are generated by the GPU power of the people.
I think for Go, Facebook and others have provided strong networks for free, but no one has done the same for chess other than lc0.

If I start a training project (after I get a home GPU machine), maybe I will try to sell this idea. If you contribute to the project,
you will get the nets for free, I mean really for free, like public domain.

chrisw · Post by **chrisw** » Sun Jul 14, 2019 7:32 pm

Daniel Shawul wrote: ↑Sun Jul 14, 2019 3:33 pm Maybe Leela nets will be public domain someday... after all, they are generated by the GPU power of the people.
I think for Go, Facebook and others have provided strong networks for free, but no one has done the same for chess other than lc0.

If I start a training project (after I get a home GPU machine), maybe I will try to sell this idea. If you contribute to the project,
you will get the nets for free, I mean really for free, like public domain.

We had a long discussion on the main page, quite a while ago (it was after the DeusX affair, to date it), between me, Ronald de Man, and I think hgm, on copyright and IP of nets. The conclusion was that nets are not copyrightable. Neither LC0 nets nor anybody else's nets. Obviously if people keep their 'numbers' secret then they are not available in PD, but non-secret nets are no person's property. The other guys might correct me if I interpret it wrong. I was arguing that the net belonged to the creator of the training program, hgm argued the net was no different from a program and therefore copyrightable, and RdeM argued that something created by an algorithm was not copyrightable, and he 'won' the argument.

Theoretically, we could cut the top four or five layers off a well-trained LC0 deep net, maybe at the point where it splits into policy head and value head, assume that the newly exposed layer contains chess truths (possibly a fantasy), and then train for some other (or similar) aspects with a few new layers on top of that. Conventionally, with something like ResNet50, you warm up the added layers with the ResNet50 frozen, then unfreeze the ResNet and train on.
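The warm-up-then-unfreeze schedule can be sketched in plain Python on a deliberately tiny model. Everything here is an illustrative assumption: the two-weight linear "tower", the one-weight "head", the toy target and the learning rates have nothing to do with LC0 or ResNet50, and the names `train`, `mse` and `pretrained_tower` are made up for the example.

```python
# Toy sketch only: a two-weight linear "tower" feeding a one-weight "head".

def forward(tower, head, x):
    """Return (tower feature, head prediction) for one input pair x."""
    feat = tower[0] * x[0] + tower[1] * x[1]
    return feat, head * feat

def mse(tower, head, data):
    """Mean squared error of the whole model over the data set."""
    return sum((forward(tower, head, x)[1] - y) ** 2 for x, y in data) / len(data)

def train(tower, head, data, epochs, lr, freeze_tower):
    """Plain SGD; when freeze_tower is True only the head is updated."""
    tower = list(tower)
    for _ in range(epochs):
        for x, y in data:
            feat, pred = forward(tower, head, x)
            err = pred - y
            grad_head = err * feat
            grad_tower = [err * head * xi for xi in x]
            head -= lr * grad_head
            if not freeze_tower:
                tower = [t - lr * g for t, g in zip(tower, grad_tower)]
    return tower, head

# Hypothetical new task: y = 2*x1 + 1*x2 on a small grid of inputs.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
data = [((a, b), 2.0 * a + 1.0 * b) for a in grid for b in grid]

# Stand-in for weights learned on some earlier task.
pretrained_tower = [1.0, 1.0]

# Phase 1: warm up the new head while the tower stays frozen.
tower1, head1 = train(pretrained_tower, 0.0, data, epochs=100, lr=0.05, freeze_tower=True)
loss_warmup = mse(tower1, head1, data)

# Phase 2: unfreeze the tower and keep training both parts.
tower2, head2 = train(tower1, head1, data, epochs=300, lr=0.05, freeze_tower=False)
loss_finetuned = mse(tower2, head2, data)
```

In this toy the frozen tower can only offer the feature x1 + x2, so the warmed-up head is left with a residual error; unfreezing lets the tower adapt to the new target. Whether a cut LC0 tower behaves similarly is exactly the open question raised above.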

Naturally, it would be a good idea to ask Crem first, but if correct, it is free of IP restrictions anyway.

I rather like the idea of trying to train something deep, say on policy, Python only, for ease of general use, maybe integrated with Python Chess, and then releasing it as a ChessNet tower. Chess play without any search at all appeals, but I guess some cooperative input on training positions, technique would be better than one person. I got a bit of GPU power here btw.

Daniel Shawul · Post by **Daniel Shawul** » Mon Jul 15, 2019 1:34 pm

I think a chess NN is easier than general image classification, which has the goal of identifying an unlimited number of image classes.
Maybe some of them can classify into 1000 classes, but then it won't be enough. For us, though, we always have
a fixed number of classes in the value/policy heads. I do not know enough about retraining to comment on that, but note that the policy/value head
is almost the last layer. The value/policy heads share about 19 resnet blocks (38 convolutions) before diverging.

chrisw wrote: ↑Sun Jul 14, 2019 7:32 pm I guess some cooperative input on training positions, technique would be better than one person. I got a bit of GPU power here btw.

RL at 800 playouts per move was so resource intensive for me, even when I had access to a couple of Voltas. But I did learn that
RL could be a reasonable choice at a much lower playout count. Even 1 playout is not as bad as one may think. I also like the idea from lc0
of using a different number of playouts per move (KLD gain) but never got around to implementing it. I suspect it may be possible to
bring down the number of GPUs needed to train a full 20b network with RL using tricks like that.
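A minimal sketch of what a KLD-gain stopping rule could look like, under stated assumptions: the search is checked at intervals, each check yields the root's per-move visit counts, and the search stops once the KL divergence between the previous and current visit distributions, divided by the number of visits added, falls below a threshold. The function names, the threshold value and the snapshot data are hypothetical, and this is not claimed to match lc0's actual implementation.

```python
import math

def kld_gain_per_node(old_visits, new_visits):
    """KL(old || new) between normalized visit distributions, per added visit."""
    old_total = sum(old_visits)
    new_total = sum(new_visits)
    added = new_total - old_total
    if old_total == 0 or added <= 0:
        return float("inf")  # nothing to compare yet, so keep searching
    kld = 0.0
    for o, n in zip(old_visits, new_visits):
        if o > 0:  # visit counts only grow, so n > 0 whenever o > 0
            kld += (o / old_total) * math.log((o / old_total) / (n / new_total))
    return kld / added

def playouts_until_stop(snapshots, min_gain):
    """Return the visit total at the first check where the gain drops below min_gain."""
    for old, new in zip(snapshots, snapshots[1:]):
        if kld_gain_per_node(old, new) < min_gain:
            return sum(new)
    return sum(snapshots[-1])  # never stopped early: full budget used

# Hypothetical root visit counts for two candidate moves at successive checks.
snapshots = [[10, 10], [40, 10], [80, 20], [160, 40]]
used = playouts_until_stop(snapshots, min_gain=1e-3)
```

Here the distribution shifts between the first two checks, so the search continues; between the second and third it only scales up, so the search stops at 100 visits instead of running to 200. Positions where the preferred move settles early get fewer playouts, which is where the saving would come from.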

I am about to own a modern CPU+GPU desktop this week and plan to set up a continuous training framework with it, and will let you know if successful.

Daniel

chrisw · Post by **chrisw** » Mon Jul 15, 2019 2:05 pm

Daniel Shawul wrote: ↑Mon Jul 15, 2019 1:34 pm I think a chess NN is easier than general image classification, which has the goal of identifying an unlimited number of image classes.
Maybe some of them can classify into 1000 classes, but then it won't be enough. For us, though, we always have
a fixed number of classes in the value/policy heads. I do not know enough about retraining to comment on that, but note that the policy/value head
is almost the last layer. The value/policy heads share about 19 resnet blocks (38 convolutions) before diverging.

chrisw wrote: ↑Sun Jul 14, 2019 7:32 pm I guess some cooperative input on training positions, technique would be better than one person. I got a bit of GPU power here btw.

RL at 800 playouts per move was so resource intensive for me, even when I had access to a couple of Voltas. But I did learn that
RL could be a reasonable choice at a much lower playout count. Even 1 playout is not as bad as one may think. I also like the idea from lc0
of using a different number of playouts per move (KLD gain) but never got around to implementing it. I suspect it may be possible to
bring down the number of GPUs needed to train a full 20b network with RL using tricks like that.

I am about to own a modern CPU+GPU desktop this week and plan to set up a continuous training framework with it, and will let you know if successful.

Daniel

I found one playout (well, policy head only, no search) really tough to train with. I was getting about a thousand moves per second using Python. Blunder after blunder makes so much noise in the result. That was playing hundreds of games in parallel. What might work is policy versus depth-limited SF, with the parallel game limit equal to the number of CPU cores. I haven’t tried that, but with, say, 12 threads at 50 ms each, I could probably get 500 moves a second for training on, with at least half the moves making some kind of sense, which is helpful for the initial kickstart.
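One reading of the "500 moves a second" estimate, worked through as arithmetic. The 12 threads and 50 ms per move come from the post; the assumption that each SF move is paired with a policy-head reply costing negligible time is an interpretation added here, not something stated.

```python
threads = 12          # parallel games, one per CPU thread (from the post)
sf_move_ms = 50       # time SF spends on each of its moves (from the post)

# Each thread produces 1000 / 50 = 20 SF moves per second.
sf_moves_per_thread = 1000 / sf_move_ms
sf_moves_per_second = threads * sf_moves_per_thread

# Assumption: the policy head answers each SF move almost instantly,
# so every SF move comes with one policy move for the training set.
total_moves_per_second = 2 * sf_moves_per_second
sensible_fraction = sf_moves_per_second / total_moves_per_second
```

Under that assumption SF contributes 240 moves a second and the total is 480, close to the quoted 500, with SF's half of the moves being the ones that can be expected to make sense.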

syzygy · Post by **syzygy** » Mon Jul 22, 2019 8:57 am

chrisw wrote: ↑Sun Jul 14, 2019 7:32 pm
Daniel Shawul wrote: ↑Sun Jul 14, 2019 3:33 pm Maybe Leela nets will be public domain someday... after all, they are generated by the GPU power of the people.
I think for Go, Facebook and others have provided strong networks for free, but no one has done the same for chess other than lc0.

If I start a training project (after I get a home GPU machine), maybe I will try to sell this idea. If you contribute to the project,
you will get the nets for free, I mean really for free, like public domain.

We had a long discussion on the main page, quite a while ago (it was after the DeusX affair, to date it), between me, Ronald de Man, and I think hgm, on copyright and IP of nets. The conclusion was that nets are not copyrightable. Neither LC0 nets nor anybody else's nets. Obviously if people keep their 'numbers' secret then they are not available in PD, but non-secret nets are no person's property. The other guys might correct me if I interpret it wrong. I was arguing that the net belonged to the creator of the training program, hgm argued the net was no different from a program and therefore copyrightable, and RdeM argued that something created by an algorithm was not copyrightable, and he 'won' the argument.

I don't remember if I won it, but I still agree with myself

The full argument would be that a copyright on an algorithm's output would cover only those creative aspects that were already present in the human-created input data. With LC0 nets the input data would basically be the rules of chess, but those are not copyrightable (or if they were, the copyright has long expired).

At least this should be the situation in the US and the EU.

chrisw · Post by **chrisw** » Mon Jul 22, 2019 10:00 am

syzygy wrote: ↑Mon Jul 22, 2019 8:57 am
chrisw wrote: ↑Sun Jul 14, 2019 7:32 pm
Daniel Shawul wrote: ↑Sun Jul 14, 2019 3:33 pm Maybe Leela nets will be public domain someday... after all, they are generated by the GPU power of the people.
I think for Go, Facebook and others have provided strong networks for free, but no one has done the same for chess other than lc0.

If I start a training project (after I get a home GPU machine), maybe I will try to sell this idea. If you contribute to the project,
you will get the nets for free, I mean really for free, like public domain.

We had a long discussion on the main page, quite a while ago (it was after the DeusX affair, to date it), between me, Ronald de Man, and I think hgm, on copyright and IP of nets. The conclusion was that nets are not copyrightable. Neither LC0 nets nor anybody else's nets. Obviously if people keep their 'numbers' secret then they are not available in PD, but non-secret nets are no person's property. The other guys might correct me if I interpret it wrong. I was arguing that the net belonged to the creator of the training program, hgm argued the net was no different from a program and therefore copyrightable, and RdeM argued that something created by an algorithm was not copyrightable, and he 'won' the argument.

I don't remember if I won it, but I still agree with myself

The full argument would be that a copyright on an algorithm's output would cover only those creative aspects that were already present in the human-created input data. With LC0 nets the input data would basically be the rules of chess, but those are not copyrightable (or if they were, the copyright has long expired).

At least this should be the situation in the US and the EU.

You did “win” it. I forget exactly what hgm was arguing; I think he was focusing on programs and data being interconvertible, so that since programs could be copyrighted, so could their data equivalents. I argued that the human creativity of the algorithm designers gave them copyright on the algorithm, and hence on the data output.
Unusually for me, my logic got gradually and slowly ground down, and I eventually conceded. Possibly a first in computer chess, well done!
Open forums can be good places to determine the truth of things, especially if nobody gets personal about it, and our discussion remained good-natured over quite a long time. Possibly another first in computer chess.
