My post was meant to explain how a deep NN could in principle see deep tactics. I didn't make any claims regarding the A0 network; I was only citing the AG0 network.

Milos wrote:
> This is one of the best summaries from the AGZ paper, assuming the same DCNN is used for chess. However, there are no indications that the DCNN for chess is organized in the same way as for Go, since there is no mention of this in the paper. I guess they left it for the next Nature publication.
>
> Rein Halbersma wrote:
> > The deep neural network connects the pieces on different squares to each other. They use 3x3 convolutions. This means that each cell in the next 8x8 layer is connected to a 3x3 region (called its "receptive field") in the previous layer, to a 5x5 region in the layer before that, etc. After only 4 layers, each cell is connected to every other cell in the original input layer. For AlphaGoZero they used no less than 80 layers. They also have many "feature maps" in parallel, so that they can learn different concepts related to piece-square combinations. Finally, they use the last 8 positions as input as well, so the network also has a sense of ongoing maneuvers. All of this is then trained on the game result and the best move from the MC tree search.
> >
> > Although the amount of resources required to train the millions of weights in these networks is enormous, conceptually it is not surprising that pawn structure, king safety, mobility and even deep tactics can be detected from the last 8 positions.
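The receptive-field arithmetic in the quoted summary is easy to verify: with stride-1 3x3 convolutions, each extra layer widens the field by 2 squares in each direction, so after 4 layers one output cell already sees a 9x9 region, which covers the whole 8x8 board. A minimal sketch (the function name is my own, not from the paper):

```python
def receptive_field(num_layers, kernel=3):
    """Side length of the input region that one output cell can see
    after stacking `num_layers` stride-1 convolutions of size `kernel`."""
    rf = 1
    for _ in range(num_layers):
        rf += kernel - 1  # each layer adds (kernel-1)//2 squares per side
    return rf

for n in range(1, 5):
    print(n, receptive_field(n))
# After 4 layers: 9x9, larger than the 8x8 board, so every cell
# is connected to every input square.
```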
We know how the input features are organized and we know the policy outputs, but that really doesn't tell us much about the actual network implementation, especially since both the inputs and the policies are totally different and much more complex for chess than for Go.
The only thing we can guess from the paper is the total number and size of the NN's weights.
However, while awaiting the full paper with all the details, my guess would be that A0-chess = AG0 - rotations - evaluation games for selecting the best network + a game-dependent NN with more input planes, smaller spatial dimensions, and a larger policy vector output. Since they didn't mention any other big changes for A0-Go compared to AG0, I think it's safe to assume they still use 3x3 convolutions and ResNet blocks as their main architecture.
Anyway, hopefully more details soon!