Google's AlphaGo team has been working on chess
Moderators: bob, hgm, Harvey Williamson
Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
Re: Google's AlphaGo team has been working on chess
I still don't understand. How do you get from 73 planes of size 8x8
a probability for each possible move ?
a probability for each possible move ?

 Posts: 922
 Joined: Tue Mar 09, 2010 2:46 pm
 Location: New York
 Full name: Álvaro Begué (RuyDos)
Re: Google's AlphaGo team has been working on chess
First let's explain where the 73 comes from, in case anyone in this conversation doesn't know. Queens can move in 56 different ways (8 directions and up to 7 steps in each direction). This also contains the ways kings, bishops, rooks and pawns move. Then there are 8 ways knight move. Finally, we can underpromote in 9 ways ({capture left, capture right, move straight} x {promote to R, promote to B, promote to N}). So now we encode a move as the `from' square (64 possibilities) and a number from 1 to 73, indicating the way the piece is moving.Henk wrote:I still don't understand. How do you get from 73 planes of size 8x8
a probability for each possible move ?
The neural network comes up with 73 planes of size 8x8, where each entry is a single real number. Now we generate the legal moves (by conventional means) for each move we look up the value assigned to it by the neural network, and we say the probability of the move is proportional to exp(value). Now these probabilities don't add up to 1, so you need to rescale.
probability(m) = exp(value[m]) / sum_i(exp(value))
This operation is called SoftMax in the machinelearning world.

 Posts: 402
 Joined: Sat Jul 02, 2011 8:49 pm
Re: Google's AlphaGo team has been working on chess
AlvaroBegue wrote:First let's explain where the 73 comes from, in case anyone in this conversation doesn't know. Queens can move in 56 different ways (8 directions and up to 7 steps in each direction). This also contains the ways kings, bishops, rooks and pawns move. Then there are 8 ways knight move. Finally, we can underpromote in 9 ways ({capture left, capture right, move straight} x {promote to R, promote to B, promote to N}). So now we encode a move as the `from' square (64 possibilities) and a number from 1 to 73, indicating the way the piece is moving.Henk wrote:I still don't understand. How do you get from 73 planes of size 8x8
a probability for each possible move ?
The neural network comes up with 73 planes of size 8x8, where each entry is a single real number. Now we generate the legal moves (by conventional means) for each move we look up the value assigned to it by the neural network, and we say the probability of the move is proportional to exp(value). Now these probabilities don't add up to 1, so you need to rescale.
probability(m) = exp(value[m]) / sum_i(exp(value))
This operation is called SoftMax in the machinelearning world.
it sounds like to me ... that this takes a lot of memory to do...
how much memory is needed to do all of this stuff...
Re: Google's AlphaGo team has been working on chess
PUCT:syzygy wrote:But I am looking at the (socalled MC) tree it builds, starting with a 1node tree and growing it by successively expanding leaf nodes (whether by NN or some other means). That is how automatic book building works.CheckersGuy wrote:Additionally, I am against the term "bookbuildingapproach". It's not like A0 zero stores the best move in a given position.
See e.g. Thomas Lincke's dissertation. Section 3.4 considers expansion strategies. The rule he came up with gives a high expansion priority to a move that is good or has a shallow subbook. It seems he was not aware of the multiarmed bandit theory.
http://www.ecmlpkdd2013.org/wpcontent/ ... 07/456.pdf
Very easy.

 Posts: 922
 Joined: Tue Mar 09, 2010 2:46 pm
 Location: New York
 Full name: Álvaro Begué (RuyDos)
Re: Google's AlphaGo team has been working on chess
pilgrimdan wrote:AlvaroBegue wrote:First let's explain where the 73 comes from, in case anyone in this conversation doesn't know. Queens can move in 56 different ways (8 directions and up to 7 steps in each direction). This also contains the ways kings, bishops, rooks and pawns move. Then there are 8 ways knight move. Finally, we can underpromote in 9 ways ({capture left, capture right, move straight} x {promote to R, promote to B, promote to N}). So now we encode a move as the `from' square (64 possibilities) and a number from 1 to 73, indicating the way the piece is moving.Henk wrote:I still don't understand. How do you get from 73 planes of size 8x8
a probability for each possible move ?
The neural network comes up with 73 planes of size 8x8, where each entry is a single real number. Now we generate the legal moves (by conventional means) for each move we look up the value assigned to it by the neural network, and we say the probability of the move is proportional to exp(value). Now these probabilities don't add up to 1, so you need to rescale.
probability(m) = exp(value[m]) / sum_i(exp(value))
This operation is called SoftMax in the machinelearning world.
it sounds like to me ... that this takes a lot of memory to do...
how much memory is needed to do all of this stuff...
Do you mean memory or storage for the training database? I can make guesses about both, but I am not sure what your concern is.

 Posts: 402
 Joined: Sat Jul 02, 2011 8:49 pm
Re: Google's AlphaGo team has been working on chess
AlvaroBegue wrote:pilgrimdan wrote:AlvaroBegue wrote:First let's explain where the 73 comes from, in case anyone in this conversation doesn't know. Queens can move in 56 different ways (8 directions and up to 7 steps in each direction). This also contains the ways kings, bishops, rooks and pawns move. Then there are 8 ways knight move. Finally, we can underpromote in 9 ways ({capture left, capture right, move straight} x {promote to R, promote to B, promote to N}). So now we encode a move as the `from' square (64 possibilities) and a number from 1 to 73, indicating the way the piece is moving.Henk wrote:I still don't understand. How do you get from 73 planes of size 8x8
a probability for each possible move ?
The neural network comes up with 73 planes of size 8x8, where each entry is a single real number. Now we generate the legal moves (by conventional means) for each move we look up the value assigned to it by the neural network, and we say the probability of the move is proportional to exp(value). Now these probabilities don't add up to 1, so you need to rescale.
probability(m) = exp(value[m]) / sum_i(exp(value))
This operation is called SoftMax in the machinelearning world.
it sounds like to me ... that this takes a lot of memory to do...
how much memory is needed to do all of this stuff...
Do you mean memory or storage for the training database? I can make guesses about both, but I am not sure what your concern is.
okay ... yes ... storage is what I had in mine ...

 Posts: 922
 Joined: Tue Mar 09, 2010 2:46 pm
 Location: New York
 Full name: Álvaro Begué (RuyDos)
Re: Google's AlphaGo team has been working on chess
pilgrimdan wrote:AlvaroBegue wrote:pilgrimdan wrote:AlvaroBegue wrote:First let's explain where the 73 comes from, in case anyone in this conversation doesn't know. Queens can move in 56 different ways (8 directions and up to 7 steps in each direction). This also contains the ways kings, bishops, rooks and pawns move. Then there are 8 ways knight move. Finally, we can underpromote in 9 ways ({capture left, capture right, move straight} x {promote to R, promote to B, promote to N}). So now we encode a move as the `from' square (64 possibilities) and a number from 1 to 73, indicating the way the piece is moving.Henk wrote:I still don't understand. How do you get from 73 planes of size 8x8
a probability for each possible move ?
The neural network comes up with 73 planes of size 8x8, where each entry is a single real number. Now we generate the legal moves (by conventional means) for each move we look up the value assigned to it by the neural network, and we say the probability of the move is proportional to exp(value). Now these probabilities don't add up to 1, so you need to rescale.
probability(m) = exp(value[m]) / sum_i(exp(value))
This operation is called SoftMax in the machinelearning world.
it sounds like to me ... that this takes a lot of memory to do...
how much memory is needed to do all of this stuff...
Do you mean memory or storage for the training database? I can make guesses about both, but I am not sure what your concern is.
okay ... yes ... storage is what I had in mine ...
You don't need to store all that. If you have the result of the game and the list of moves, you can build the input and the desired outputs on the fly.
In the paper they indicate that they also store the distribution of playouts amongst the moves, since they use that as labels for the policy output. That would require an additional ~100 bytes per move.

 Posts: 402
 Joined: Sat Jul 02, 2011 8:49 pm
Re: Google's AlphaGo team has been working on chess
AlvaroBegue wrote:pilgrimdan wrote:AlvaroBegue wrote:pilgrimdan wrote:AlvaroBegue wrote:First let's explain where the 73 comes from, in case anyone in this conversation doesn't know. Queens can move in 56 different ways (8 directions and up to 7 steps in each direction). This also contains the ways kings, bishops, rooks and pawns move. Then there are 8 ways knight move. Finally, we can underpromote in 9 ways ({capture left, capture right, move straight} x {promote to R, promote to B, promote to N}). So now we encode a move as the `from' square (64 possibilities) and a number from 1 to 73, indicating the way the piece is moving.Henk wrote:I still don't understand. How do you get from 73 planes of size 8x8
a probability for each possible move ?
The neural network comes up with 73 planes of size 8x8, where each entry is a single real number. Now we generate the legal moves (by conventional means) for each move we look up the value assigned to it by the neural network, and we say the probability of the move is proportional to exp(value). Now these probabilities don't add up to 1, so you need to rescale.
probability(m) = exp(value[m]) / sum_i(exp(value))
This operation is called SoftMax in the machinelearning world.
it sounds like to me ... that this takes a lot of memory to do...
how much memory is needed to do all of this stuff...
Do you mean memory or storage for the training database? I can make guesses about both, but I am not sure what your concern is.
okay ... yes ... storage is what I had in mine ...
You don't need to store all that. If you have the result of the game and the list of moves, you can build the input and the desired outputs on the fly.
In the paper they indicate that they also store the distribution of playouts amongst the moves, since they use that as labels for the policy output. That would require an additional ~100 bytes per move.
okay ... thanks ...

 Posts: 3046
 Joined: Fri May 26, 2006 1:00 am
 Location: WY, USA
 Full name: Michael Sherwin
Re: Google's AlphaGo team has been working on chess
Marco, A0 did not win a match against SF. A0 with RL won a match against SF. Or said another way, A0 won a match against SF because SF does not have RL. Or thought of a different way, a group of programmers identified a deficiency that exist in a competitive field and took advantage of that deficiency by eliminating that deficiency in their entity. Or one can change that thought around and say RL does not belong in competitive chess because it covers up the underlying strength and correctness of the algorithm. In that case the A0 vs SF match is non sequitur and meaningless. Then there is the thought of the fan that wants RL but are ignored because they are not important and what the fan thinks or wants is not meaningful.mcostalba wrote:I have read the paper: result is impressive!
Honestly I didn't think it was possible because my understanding was that chess is more "computer friendly" than Go....I was wrong.
It is true, SF is not meant to play at its best without a book and especially 1 fixed minute per move cuts out the whole time management, it would be more natural to play with tournament conditions, but nevertheless I think these are secondary aspects, what has been accomplished is huge.
But, what you can't say is, "what has been accomplished is huge" in terms of a chess playing algorithm. You might say that what A0 has demonstrated in go, chess and shogi has accomplished a huge demonstration that a NN with RL may conquer hamanity some day. I won't argue against that. Concerning chess though the AB algorithm is not inferior to NN+MC. It is inferior to NN+MC+RL. AB+RL is far superior to NN+MC+RL.
And I said all that without mentioning RomiChess not even one time!
I hate if statements. Pawns demand if statements. Therefore I hate pawns.