Google's AlphaGo team has been working on chess

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Google's AlphaGo team has been working on chess

Post by hgm »

CheckersGuy wrote:Yeah. Seems like a way to improve the search. However, DeepMind wasn't going for "the strongest possible chess engine" but for a system that can learn any game. Therefore they didn't use any domain knowledge.
The ban on domain knowledge did not extend to the rules, and what a checkmate is belongs to the rules. This is about the completely general issue of how the MCTS reacts to stumbling on a node that it judges to be a certain win (and cannot expand to change that conclusion).
I would really like to know how one could speed up the training process using domain knowledge.
I would really love to give that a try: write an MCTS program, and equip it with a comparatively simple NN that you offer lots of domain-specific information you know to be useful in chess. Like all attackers and protectors of every square, slider attacks that are blocked by a single piece, which moves would attack a given square, etc. And then try to train that NN.
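A toy version of the kind of per-square input described above could look like the sketch below, which counts each side's attackers of every square. It only handles knights and kings for brevity (sliders would need the blocker logic mentioned above), and the board representation and function name are made up for illustration:

```python
# Hypothetical feature extractor: attacker counts per square, per side.
# Board: dict mapping square 0..63 -> (piece letter, colour), file = sq % 8.
KNIGHT_DELTAS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
KING_DELTAS = [(f, r) for f in (-1, 0, 1) for r in (-1, 0, 1) if (f, r) != (0, 0)]

def attack_counts(board):
    """Return two 64-entry lists with the number of white/black attackers."""
    counts = {'w': [0] * 64, 'b': [0] * 64}
    for sq, (piece, colour) in board.items():
        deltas = {'N': KNIGHT_DELTAS, 'K': KING_DELTAS}.get(piece, [])
        f, r = sq % 8, sq // 8
        for df, dr in deltas:
            tf, tr = f + df, r + dr
            if 0 <= tf < 8 and 0 <= tr < 8:
                counts[colour][tr * 8 + tf] += 1
    return counts

# A knight on g1 (square 6) attacks e2, f3 and h3:
print(sorted(i for i, c in enumerate(attack_counts({6: ('N', 'w')})['w']) if c))  # [12, 21, 23]
```

These 128 numbers per position, plus similar planes for protectors, blocked slider attacks, and so on, would then be the NN's input instead of the raw piece placement.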
Pio
Posts: 334
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: Google's AlphaGo team has been working on chess

Post by Pio »

I would love to see that engine. I think you would make a great one :)
Henk
Posts: 7220
Joined: Mon May 27, 2013 10:31 am

Re: Google's AlphaGo team has been working on chess

Post by Henk »

Article says that they use a neural network that takes the board position s as an input and outputs a vector of move probabilities p.

Does that mean the output vector contains 4,672 elements? And how does it know which move probability belongs to which move?
Robert Pope
Posts: 558
Joined: Sat Mar 25, 2006 8:27 pm

Re: Google's AlphaGo team has been working on chess

Post by Robert Pope »

Henk wrote:Article says that they use a neural network that takes the board position s as an input and outputs a vector of move probabilities p.

Does that mean the output vector contains 4,672 elements? And how does it know which move probability belongs to which move?
Right - the vector of probabilities has every combination of possible moves in it, even ones that aren't pseudo-legal in the position.
trulses
Posts: 39
Joined: Wed Dec 06, 2017 5:34 pm

Re: Google's AlphaGo team has been working on chess

Post by trulses »

Henk wrote:Does that mean the output vector contains 4,672 elements?
Yeah, but they filter out the illegal moves and re-normalize so at the end you have a probability distribution over only the legal moves.
Henk wrote:And how does it know which move probability belongs to which move?
It's just an encoding you choose when you set up the net. For instance in a dog/cat classifier you could have two outputs where you decide that element 0 means probability of cat and element 1 means probability of dog.

The only thing that really matters is consistency in how you label things and how you interpret the output.
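The filter-and-renormalize step described above is mechanical. A minimal sketch, assuming the net's raw output is already a vector indexed by some fixed move encoding (the names are illustrative):

```python
def renormalize_legal(probs, legal_indices):
    """Keep only the entries for legal moves and re-scale them to sum to 1."""
    total = sum(probs[i] for i in legal_indices)
    return {i: probs[i] / total for i in legal_indices}

# Four-slot encoding where only moves 1 and 3 are legal in this position;
# the surviving probabilities keep their 1:2 ratio but now sum to 1.
print(renormalize_legal([0.1, 0.2, 0.3, 0.4], [1, 3]))
```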
Henk
Posts: 7220
Joined: Mon May 27, 2013 10:31 am

Re: Google's AlphaGo team has been working on chess

Post by Henk »

4,672 output elements. Looks terribly expensive.

I remember implementing a perceptron with only one bit as output.

In my simple network a forward pass is already a computational bottleneck.
Imagine how costly a pass in their network must be.

Also, the input contains the chess positions of the last eight plies. So the input is enormous too.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Google's AlphaGo team has been working on chess

Post by AlvaroBegue »

Henk wrote:4,672 output elements. Looks terribly expensive.

[...]

Also, the input contains the chess positions of the last eight plies. So the input is enormous too.
The outputs are 73 planes of size 8x8. The 8 previous chess positions are something like 96 planes of size 8x8. If the architecture they used is similar to the one in AlphaGo Zero, the hidden layers consist of 256 planes of size 8x8, and there are lots of them. So you wouldn't save much computation by trimming down the inputs or the outputs.
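That point can be sanity-checked with a little arithmetic. The multiply-add counts below are rough estimates assuming 3x3 convolutions over an 8x8 board, not figures from the paper:

```python
board = 8 * 8

out_planes = 73
assert out_planes * board == 4672        # the 4,672 outputs discussed above

in_planes = 6 * 2 * 8                    # 6 piece types x 2 colours x 8 history steps = 96
hidden = 256

# Rough multiply-adds for one 3x3 convolution layer:
first_layer = in_planes * hidden * 9 * board    # input planes -> hidden planes
middle_layer = hidden * hidden * 9 * board      # hidden -> hidden, repeated many times
print(round(middle_layer / first_layer, 1))     # 2.7: one middle layer already outweighs the input layer
```

With dozens of 256-plane hidden layers, the stack of middle layers dominates the total cost, which is why trimming the input or output planes saves so little.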
Henk
Posts: 7220
Joined: Mon May 27, 2013 10:31 am

Re: Google's AlphaGo team has been working on chess

Post by Henk »

I still don't understand. How do you get from 73 planes of size 8x8 to a probability for each possible move?
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Google's AlphaGo team has been working on chess

Post by AlvaroBegue »

Henk wrote:I still don't understand. How do you get from 73 planes of size 8x8 to a probability for each possible move?
First let's explain where the 73 comes from, in case anyone in this conversation doesn't know. Queens can move in 56 different ways (8 directions and up to 7 steps in each direction). This also covers the ways kings, bishops, rooks and pawns move. Then there are 8 ways a knight can move. Finally, we can under-promote in 9 ways ({capture left, capture right, move straight} x {promote to R, promote to B, promote to N}). So now we encode a move as the `from' square (64 possibilities) and a number from 1 to 73, indicating the way the piece is moving.
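This encoding can be made concrete. In the sketch below the planes are numbered 0..72 (queen-like moves first, then knight moves, then under-promotions); the exact ordering is my own choice for illustration, since only the counts 56 + 8 + 9 are fixed:

```python
# Map a move to one of 73 planes. Squares are 0..63 with file = sq % 8, rank = sq // 8.
QUEEN_DIRS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]
KNIGHT_DIRS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def move_plane(frm, to, underpromotion=None):
    """Return the plane index 0..72, or None if the move is not encodable."""
    df, dr = to % 8 - frm % 8, to // 8 - frm // 8
    if underpromotion is not None:                 # {N, B, R} x {left, straight, right}
        piece = {'N': 0, 'B': 1, 'R': 2}[underpromotion]
        return 64 + piece * 3 + (df + 1)           # planes 64..72
    if (df, dr) in KNIGHT_DIRS:
        return 56 + KNIGHT_DIRS.index((df, dr))    # planes 56..63
    for d, (fx, fr) in enumerate(QUEEN_DIRS):
        for steps in range(1, 8):
            if (df, dr) == (fx * steps, fr * steps):
                return d * 7 + (steps - 1)         # planes 0..55
    return None

print(move_plane(12, 28))          # e2-e4: queen-like, direction 0, two steps -> plane 1
print(move_plane(6, 21))           # g1-f3: knight move -> plane 63
print(move_plane(52, 60, 'N'))     # e7-e8=N, straight -> plane 65
```

The full policy index is then something like frm * 73 + plane, which gives the 64 x 73 = 4,672 entries discussed earlier.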

The neural network comes up with 73 planes of size 8x8, where each entry is a single real number. Now we generate the legal moves (by conventional means); for each legal move we look up the value assigned to it by the neural network, and we say the probability of the move is proportional to exp(value). These exponentials don't add up to 1, so you need to re-scale:

probability(m) = exp(value[m]) / sum_i(exp(value[i]))

This operation is called SoftMax in the machine-learning world.
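In code the softmax step is tiny. A minimal sketch, where `values` stands for the numbers read off the 73x8x8 planes for the legal moves; subtracting the maximum first is the standard trick to avoid overflow in exp(), and it doesn't change the result because the common factor cancels in numerator and denominator:

```python
import math

def softmax(values):
    """Return probabilities proportional to exp(value), summing to 1."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([0.0, 1.0, 1.0]))    # two equally-valued top moves share most of the probability
```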
Henk
Posts: 7220
Joined: Mon May 27, 2013 10:31 am

Re: Google's AlphaGo team has been working on chess

Post by Henk »

Ok. Clear.