1. Using the final result of the game is fine. This is what we do in Go. You'll need a great many games, but they are easy to generate.
2. I took a quick look at your NN architecture. My advice is:
- Use a separate channel for black and white pieces
- Instead of black/white, encode input as "my color"/"opponent's color" from the point of view of the player to move.
- You'll also have to encode castling availability. This can be done with a channel that indicates which rooks can still castle.
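To make the advice above concrete, here is a minimal sketch of that encoding in Python: piece planes from the side-to-move's point of view plus one castling plane. The plane layout, the function name, and the use of raw FEN strings are all illustrative choices, not a fixed standard.

```python
# Encode the piece-placement part of a FEN string into 13 binary 8x8
# planes: 0-5 = my pawn..king, 6-11 = opponent pawn..king, 12 = home
# squares of rooks that may still castle. "My"/"opponent" is decided by
# the side to move, as suggested in the post.

PIECES = "pnbrqk"  # plane order within one colour

def encode_planes(fen):
    board, stm, castling = fen.split()[:3]
    planes = [[[0] * 8 for _ in range(8)] for _ in range(13)]
    for r, row in enumerate(board.split("/")):  # r = 0 is rank 8
        f = 0
        for ch in row:
            if ch.isdigit():
                f += int(ch)
                continue
            colour_is_white = ch.isupper()
            mine = colour_is_white == (stm == "w")
            plane = PIECES.index(ch.lower()) + (0 if mine else 6)
            planes[plane][r][f] = 1
            f += 1
    # Plane 12: mark the home square of each rook that can still castle.
    rook_sq = {"K": (7, 7), "Q": (7, 0), "k": (0, 7), "q": (0, 0)}
    for right in castling:
        if right in rook_sq:
            r, f = rook_sq[right]
            planes[12][r][f] = 1
    return planes

planes = encode_planes("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -")
```

With white to move from the start position, white pieces land in planes 0-5, black pieces in planes 6-11, and all four castling rooks are marked in plane 12.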
I'd like to try to implement AlphaGo's algorithm for chess. I wonder if anyone is trying. That sounds like a very interesting project, and I would not be surprised if it could produce a very strong chess program.
Re: the number of input channels, why not have a channel for every bitboard in your position? So 8x8 2D inputs for black pawns through king, the same for white, plus one for empty squares (13 input channels in all). If you then do 3x3x13 convolutions on the input layer and stack 4+ layers (so that the effective receptive field of the final layer is larger than 8x8), the model should be able to learn piece interactions across the entire board.
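The "4+ layers" claim is easy to verify with a little receptive-field arithmetic (this assumes plain stride-1 convolutions, no dilation):

```python
# Receptive field of n stacked 3x3 convolutions (stride 1):
# each layer widens the field by (kernel - 1) squares per side pair,
# so rf = 1 + n * (kernel - 1).
def receptive_field(n_layers, kernel=3):
    return 1 + n_layers * (kernel - 1)

# Four 3x3 layers already see a 9x9 neighbourhood, wider than the board.
print(receptive_field(4))
```

So after four layers every output unit can, in principle, depend on any square of the 8x8 board.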
Another question that I have: can the evaluation quality of heavy multi-layer neural nets overcome the much faster computation of a linear eval in pure alpha-beta? Maybe if enough GPU power is crammed into a PC, but the Giraffe program still had plenty of room to improve before catching up to the state of the art.
I am convinced that a convolutional neural network should be able to learn an evaluation function that is considerably better than any hand-made evaluation. But it will be much slower.
Whether it can be competitive with traditional engines is difficult to predict.
It is fun and exciting to try, anyway.
I believe the best approach is to learn value + policy, like in AlphaGo, and use the policy to grow a very selective MCTS tree. It may be better than alpha-beta.
Also, good efficiency might require chess-specific layers. I am thinking about knight-jump convolutions, diagonal convolutions, rook-move convolutions, etc.
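As an illustration of what a "knight-jump convolution" could mean, here is a sketch of the connectivity pattern: each output square aggregates the input values on the eight knight-jump squares instead of a 3x3 neighbourhood. Per-offset weights and multiple channels are omitted; only the wiring is shown, and the function name is made up.

```python
# Knight-jump "convolution" over one 8x8 plane: output at (r, f) sums
# the inputs reachable by a knight move from (r, f). A real layer would
# carry a learned weight per offset and per channel.

KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knight_conv(plane):
    out = [[0.0] * 8 for _ in range(8)]
    for r in range(8):
        for f in range(8):
            for dr, df in KNIGHT_JUMPS:
                rr, ff = r + dr, f + df
                if 0 <= rr < 8 and 0 <= ff < 8:
                    out[r][f] += plane[rr][ff]
    return out

# A lone value in the centre contributes to exactly its 8 knight targets.
plane = [[0.0] * 8 for _ in range(8)]
plane[4][4] = 1.0
out = knight_conv(plane)
```

The diagonal and rook-move variants would be the same idea with sliding-ray offsets instead of the eight jumps.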
Modern deep-learning hardware will help to close the gap, too. A 8xV100 machine is really tremendous in terms of computing power.
coulom wrote:I believe the best approach is to learn value + policy, like in AlphaGo, and use the policy to grow a very selective MCTS tree. It may be better than alpha-beta.
One can argue that current state of the art chess engines are already using a form of Monte Carlo search. Trees are very narrow and the "policy" is generated dynamically using an assortment of history mechanisms.
Where alpha-beta search really shines is in its ability to resolve tactics quickly. In the past, I think tactics were the Achilles heel of traditional MCTS attempts in chess. It would be interesting to see whether things have changed.
Giraffe uses a NN for evaluation but alpha-beta for search (albeit node-based instead of depth-based).
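For reference, the combination Giraffe uses boils down to something like this: a plain negamax alpha-beta skeleton with the leaf evaluation plugged in, whether linear or a NN. Here `moves`, `apply`, and `evaluate` are hypothetical stand-ins for a real move generator and evaluator, and the example "game" is a two-leaf toy tree.

```python
# Minimal fail-hard negamax alpha-beta with a pluggable leaf evaluation.
# evaluate() must score the position from the side to move's point of
# view, as negamax requires.

def alphabeta(pos, depth, alpha, beta, moves, apply, evaluate):
    legal = moves(pos)
    if depth == 0 or not legal:
        return evaluate(pos)
    for m in legal:
        score = -alphabeta(apply(pos, m), depth - 1,
                           -beta, -alpha, moves, apply, evaluate)
        if score >= beta:
            return beta          # cutoff: the opponent avoids this line
        alpha = max(alpha, score)
    return alpha

# Toy check: two moves lead to leaves worth +1 and +3 for the opponent,
# so the side to move gets the lesser damage, -1.
tree = {"root": ["a", "b"], "a": [], "b": []}
leaf = {"a": 1, "b": 3, "root": 0}
best = alphabeta("root", 1, -99, 99,
                 lambda p: tree[p], lambda p, m: m, lambda p: leaf[p])
```

The speed argument in this thread is exactly about how expensive `evaluate` is allowed to be inside this loop.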
Last edited by Michel on Tue Nov 14, 2017 2:38 pm, edited 2 times in total.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
Michel wrote:
One can argue that current state of the art chess engines are already using a form of Monte Carlo search. Trees are very narrow and the "policy" is generated dynamically using an assortment of history mechanisms.
Where alpha-beta search really shines is in its ability to resolve tactics quickly. In the past I think tactics were the Achilles heel of traditional MCTS attempts in chess. It would be interesting to see if things have changed.
Giraffe uses a NN for evaluation but alpha-beta for search (albeit node-based instead of depth-based).
It would be quite something if a NN were able to spot mate-in-N (for small N) with high accuracy and no look-ahead.
By MCTS, I mean the AlphaGo Zero approach, where there are no random playouts. Playouts are completely replaced by neural-network evaluation.
I believe a large network can read a lot of tactics. The raw neural network of AlphaGo plays at (weak) pro strength by itself, with no search at all.
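The selection rule in that playout-free approach is easy to sketch. Below is a minimal version of the AlphaGo Zero-style PUCT formula: children are ranked by mean value Q plus an exploration bonus weighted by the network's policy prior P. The field names and the `c_puct` constant are illustrative.

```python
# PUCT selection over a node's children. Each child carries a visit
# count N, a total value W, and a policy prior P from the network.
# Leaf values come straight from the network -- no random playouts.
import math

def puct_select(children, c_puct=1.5):
    total_n = sum(ch["N"] for ch in children)
    def score(ch):
        q = ch["W"] / ch["N"] if ch["N"] else 0.0
        u = c_puct * ch["P"] * math.sqrt(total_n + 1) / (1 + ch["N"])
        return q + u
    return max(children, key=score)

# An unvisited move with a strong prior outranks a visited, mediocre one,
# which is what makes the tree so selective.
children = [{"N": 10, "W": 2.0, "P": 0.1},
            {"N": 0, "W": 0.0, "P": 0.6}]
pick = puct_select(children)
```

As visits accumulate, the prior term shrinks and the empirical value Q takes over.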
Of course, convolutional neural networks fit the game of Go much better than the game of Chess. That's why I am thinking about chess-specific NN architectures.
Maybe working in the 8x8x8x8 space of piece movements would be better than the 8x8 board. I guess the policy would have to be of size 8x8x8x8 anyway. 8x8 and 8x8x8x8 units may have to be combined somehow.
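The 8x8x8x8 policy idea amounts to one output logit per (from-square, to-square) pair, i.e. a flat head of 64 * 64 = 4096 entries. A sketch of that indexing, with a1 = 0 square numbering assumed and underpromotions ignored for simplicity:

```python
# Map a move in from-square x to-square form to one fixed slot of a
# 4096-entry policy head, and back. Promotion piece choices would need
# extra entries on top of this.

def policy_index(from_sq, to_sq):
    return from_sq * 64 + to_sq        # from_sq, to_sq in 0..63

def policy_move(index):
    return divmod(index, 64)           # inverse mapping

# e2-e4 (squares 12 and 28 with a1 = 0) always hits the same logit.
idx = policy_index(12, 28)
```

Illegal (from, to) pairs simply get their logits masked out before the softmax.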
It's interesting to think about it. I'll certainly give it a try some day.
By MCTS, I mean the AlphaGo Zero approach, where there are no random playouts. Playouts are completely replaced by neural-network evaluation.
I had understood that. So, roughly speaking, the point is to replace alpha-beta search with UCT search (both can be combined with NN leaf-node evaluation).
I think in the past UCT combined with a heuristic evaluation has not worked well in chess (tactics...). But perhaps it would be different with a high quality NN evaluation. It is certainly an exciting thought.
I'd like to try to implement AlphaGo's algorithm for chess. I wonder if anyone is trying. That sounds like a very interesting project, and I would not be surprised if it could produce a very strong chess program.
That would be quite exciting!
I wish you the best of luck, but I don't think you will get a good chess engine. GPU training is just quite slow compared to Google's TPUs.