jkiliani wrote:
Milos wrote:
jkiliani wrote:
Milos wrote:
CMCanavessi wrote:
So from Gen 6 to Gen 8, LCZero got +190 Elo in 2 days. Imagine if the project catches up and more people help train it. It would blow our minds. I believe that Gen 10-12 will already be around 1000 Elo by the weekend.
Like most people, you don't quite understand how DCNN training works.
Sooner rather than later you'll hit a plateau and the training will saturate. The higher you go, the harder it becomes to improve a DCNN.
While you are correct in principle, you're overlooking that there are ways to deal with that, as Leela Zero has already demonstrated. Once you reach a plateau, you can simply use the self-play games you already have to bootstrap a larger neural net, which will usually achieve a significant initial jump and be able to train to a higher level. Once you stall again, rinse and repeat.
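Concretely, the bootstrap step is just supervised training of a fresh, larger net on the stored games before resuming self-play. A rough sketch, assuming PyTorch and (position, policy, value) training records; the architecture, plane count and policy size here are placeholders, not LCZero's actual training code:

[code]
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_net(blocks: int, filters: int) -> nn.Module:
    # Stand-in for the real architecture: a conv tower with a combined
    # output of policy logits + 1 value (sizes are placeholders).
    layers = [nn.Conv2d(12, filters, 3, padding=1), nn.ReLU()]
    for _ in range(blocks):
        layers += [nn.Conv2d(filters, filters, 3, padding=1), nn.ReLU()]
    return nn.Sequential(*layers, nn.Flatten(),
                         nn.Linear(filters * 64, 1858 + 1))

def bootstrap(selfplay_batches, new_size=(10, 128)):
    # Train a fresh, larger net on the games the *old* net produced.
    net = make_net(*new_size)
    opt = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
    # pos: (B,12,8,8) float, pi: (B,) move indices, z: (B,) game results
    for pos, pi, z in selfplay_batches:
        out = net(pos)
        policy, value = out[:, :-1], torch.tanh(out[:, -1])
        loss = F.cross_entropy(policy, pi) + F.mse_loss(value, z)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net  # resume self-play with this net after the initial jump
[/code]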
In some cases a larger net would help, in others not, which is most probably the case with A0.
Training a larger net requires far more resources for self-play games. In the case of LCZero, they are already struggling to generate a decent number of games even with the relatively small net they have now.
So the enthusiasm regarding LCZero is pretty much in vain.
A larger net always helps, since more weights can represent the search output better, and a ResNet no longer suffers from the vanishing gradients that plagued earlier network architectures. Leela Zero got a big boost with every network expansion, and you can't tell me that chess is somehow so fundamentally different that the same wouldn't apply here.
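To illustrate the point about gradients: in a residual block, the identity skip connection gives the gradient a direct path through every block, which is why these nets keep training well as they grow deeper. A minimal PyTorch-style sketch (illustrative, not LCZero's actual block):

[code]
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, filters: int):
        super().__init__()
        self.conv1 = nn.Conv2d(filters, filters, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(filters)
        self.conv2 = nn.Conv2d(filters, filters, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(filters)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        # The skip: gradients flow through `x` unchanged, no matter
        # how many blocks are stacked.
        return F.relu(x + y)
[/code]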
Sure, self-play speed will go down with every network expansion, but it will also go up a lot as the project gains traction and more people contribute. The stronger it gets, the more publicity there will be.
Enthusiasm for LCZero is very well founded, and shared by everyone who has read the DeepMind papers and understood them at least at some level. No one is forcing you to contribute; feel free to watch the downfall of the alpha-beta engines from the sidelines.
I do not believe in the downfall of alpha-beta.
I believe that it is possible to reach a higher level by improving alpha-beta instead of trying to replace it.
I believe that if you have Stockfish play against itself as part of an engine's evaluation function, you may get an engine stronger than Stockfish at long time control, given the right search rules and with no additional knowledge in the evaluation itself.
The idea is that at search iteration n, the new engine can simply have Stockfish play against itself, using fixed depth d1(n,i) for ply i of that self-play game (1 <= i <= d2(n)), and use the resulting sequence of d2 evaluations to compute the evaluation function.
Both d1 and d2 grow as n grows.
I believe that with the right tuning of d1 and d2 you should get an engine stronger than Stockfish. (At small depths, of course, d2(n) = 0, so the engine is identical to Stockfish: it does not play against itself and simply uses the static evaluation function.)
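A rough sketch of what this could look like, using python-chess to drive a UCI Stockfish binary. The d1/d2 schedules and the way the probe scores are blended into one value are placeholders, since those are exactly the parts that would need tuning:

[code]
import chess
import chess.engine

def d1(n: int, i: int) -> int:
    # Placeholder schedule: probe depth grows with the iteration number.
    return max(1, n // 4)

def d2(n: int) -> int:
    # Placeholder schedule: 0 at small iterations, so the engine
    # degenerates to plain Stockfish with its static evaluation.
    return 0 if n < 8 else n // 2

def selfplay_eval(engine: chess.engine.SimpleEngine,
                  board: chess.Board, n: int) -> int:
    """Evaluate `board` at iteration n by letting Stockfish play itself."""
    plies = d2(n)
    if plies == 0:
        info = engine.analyse(board, chess.engine.Limit(depth=1))
        return info["score"].pov(board.turn).score(mate_score=100000)
    b = board.copy()
    scores = []
    for i in range(1, plies + 1):
        if b.is_game_over():
            break
        result = engine.play(b, chess.engine.Limit(depth=d1(n, i)),
                             info=chess.engine.INFO_SCORE)
        score = result.info.get("score")
        if score is not None:
            # Always score from the root side-to-move's point of view.
            scores.append(score.pov(board.turn).score(mate_score=100000))
        b.push(result.move)
    # Simplest possible blend of the probe sequence: the average.
    # How to combine the d2 evaluations is exactly what needs tuning.
    return sum(scores) // len(scores) if scores else 0

# Usage: engine = chess.engine.SimpleEngine.popen_uci("stockfish")
[/code]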
I believe that this idea can help to detect fortresses at large depths, because when the depth is big enough, playing against yourself leads to a draw.
Note that the first step should be to try replacing the static evaluation with a depth-1 search (at iterations bigger than some n) to see if it helps.
I believe it should help when n is big enough. You lose at most a constant speed factor between calling the static evaluation and searching to depth 1, while gaining a smarter evaluation. For example, there may be cases where the program does not see a stalemate and thinks one side has a big advantage because the stalemate is evaluated wrongly, and the search finds ways to delay it; with a one-ply search that simply does not happen.
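A minimal sketch of that first experiment, again with python-chess. The material-count static_eval here is only a placeholder for Stockfish's real evaluation; the point is the shape of the depth-1 wrapper and the stalemate check it buys you:

[code]
import chess

def static_eval(board: chess.Board) -> int:
    # Placeholder material count in centipawns, from the side to move.
    vals = {chess.PAWN: 100, chess.KNIGHT: 320, chess.BISHOP: 330,
            chess.ROOK: 500, chess.QUEEN: 900}
    s = sum(v * (len(board.pieces(p, chess.WHITE))
                 - len(board.pieces(p, chess.BLACK)))
            for p, v in vals.items())
    return s if board.turn == chess.WHITE else -s

def depth1_eval(board: chess.Board) -> int:
    """One-ply search instead of a raw static eval at the leaf."""
    if board.is_stalemate():
        return 0            # the case a blind static eval misjudges
    if board.is_checkmate():
        return -100000
    best = -10**9
    for move in board.legal_moves:
        board.push(move)
        best = max(best, -static_eval(board))  # negamax: flip the sign
        board.pop()
    return best
[/code]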