AlvaroBegue wrote: Dude, take your pills. I am not interested in participating in a conversation with such poor tone.
Then don't. You are free to choose any excuse you prefer.
As for mentioning pills, that speaks volumes about your own tone.
CheckersGuy wrote: I do get how it works but apparently you don't. The move predictions of the neural network initially provide a strong bias. This bias lessens as the search traverses the tree more often, approaching the even distribution you are talking about. This is similar to what the UCT policy does. Just look at the equations and not at the text; this is not really hard to understand.
Milos wrote: So many wrong things.
I was obviously talking about the trained network, and in the first few iterations the move probabilities provide a stronger bias. 0.99 may be a little high. As for a few moves having probability 0.5-0.7, I don't think so, because the sum of all the probabilities should be 1. If you had more than two moves with a probability of 0.5-0.7, that would already be wrong.
The even distribution only happens after many, many traversals through the tree. If you only do a few searches, the bias of the NN is very high. This concept is very similar to RAVE and/or UCT, which is commonly used in MCTS.
In the AlphaZero paper they lower the probabilities the following way: P(s,a)/(1+N(s,a)), where N is the number of traversals through the node and P the move probability.
Now give the candidate move 99% probability and draw the picture yourself. It takes some iterations to lower the effective probability enough to encourage searching other moves.
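The effect of that P(s,a)/(1+N(s,a)) term can be sketched in a few lines of Python. This isolates the visit-count discount only; the full PUCT rule in the paper also multiplies by the square root of the parent visit count and adds the backed-up action value Q, so this is an illustration of the single term, not the real selection rule:

```python
def discounted_prior(p, n):
    """The P(s,a)/(1+N(s,a)) exploration term: prior probability p
    divided down as the move accumulates n visits."""
    return p / (1 + n)

# How many visits does a 0.99-prior favourite need before this term
# drops below that of an unvisited 0.02-prior sibling?
n = 0
while discounted_prior(0.99, n) > discounted_prior(0.02, 0):
    n += 1
print(n)  # prints 49: the favourite is picked dozens of times first
```

In general the favourite's term falls below a sibling prior q only after more than p/q - 1 visits, which is why a near-1.0 prior dominates the early iterations.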
In the beginning of training the NN weights are random, so the output probabilities are roughly uniform, which gives exactly the shallowest possible tree.
Even for a highly selective NN (later in training), in each node there are at least a few best moves that have similar probability, 0.5-0.7 (and A0 uses a UCT-style policy that selects other candidates much more often than the original UCB1, which is what you wrote up, except you omitted the sqrt of the total visit count). 0.99 never happens unless the NN actually sees a mate in a few moves, so your example is totally irrelevant.
CheckersGuy wrote: I was obviously talking about the trained network, and in the first few iterations the move probabilities provide a stronger bias. 0.99 may be a little high. As for a few moves having probability 0.5-0.7, I don't think so, because the sum of all the probabilities should be 1.
First few iterations of what?
Milos wrote: First few iterations of what?
First few iterations of the algorithm, obviously.
Previously you were talking about training games (because only there do you have 800 MCTS simulations per move; in the games against SF you have 80k).
If the network is already fully trained, the NN is selective, but not nearly as much as you are suggesting.
When I said a few moves at 0.5-0.7, I meant the best move has 0.5-0.7 and the others (say 1-2 moves) would have 0.3-0.5 combined.
So a realistic example for a selective NN would be something like P = [0.6, 0.25, 0.1, 0.02, 0.01, 0.01, 0.01].
So already after selecting the first move 3 times, the fourth selection would be the second move, the 8th the third one, etc.
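A quick way to check a claim like this is to simulate the selection rule. The sketch below uses a PUCT-style score with all action values Q assumed equal (zero), c_puct = 1.0, and a +1 smoothing inside the square root so the first pick is well defined; the exact visit order depends on these assumptions and on the real backed-up Q values, so this is an illustration, not A0's implementation:

```python
import math

def select(priors, visits, c_puct=1.0):
    """Pick the move maximising the PUCT exploration term
    c * P(a) * sqrt(total visits) / (1 + N(a)), with Q taken as equal."""
    total = sum(visits)
    scores = [c_puct * p * math.sqrt(total + 1) / (1 + n)
              for p, n in zip(priors, visits)]
    return max(range(len(priors)), key=scores.__getitem__)

priors = [0.6, 0.25, 0.1, 0.02, 0.01, 0.01, 0.01]
visits = [0] * len(priors)
order = []
for _ in range(10):
    a = select(priors, visits)
    order.append(a)
    visits[a] += 1
print(order, visits[:3])
```

With these assumptions the 0.25-prior move already gets picked within the first few iterations and the 0.1-prior move within ten, while the 0.6-prior move still collects the large majority of visits.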
Milos wrote: in each node there are at least few best moves that have similar probability 0.5-0.7
This is what you said. If you call the probabilities you gave above "similar" to 0.5-0.7, then that's just bogus.
CheckersGuy wrote: First few iterations of the algorithm, obviously.
Which algorithm? An iteration of the training algorithm (since "iteration" is what the paper uses for training) or an MCTS iteration?
CheckersGuy wrote: As for your example, sure you will visit other moves, but those moves are more likely to be bad and therefore have a bad action value and won't be visited that often in subsequent iterations. Don't forget that the search isn't based entirely on the move probabilities; the value that gets backed up from the leaf nodes goes into the calculation as well.
See the edit in my previous post. The point is that extending depth is not as easy as you originally suggested, and even in the best branch the tree was not very deep, even in the games against SF.
trulses wrote: You're not playing random moves, you're playing the moves given by a random search tree. At 800 nodes per search you will typically find mate-in-ones and play them. The quality of this play is much higher than just picking a random move.
Milos wrote: It seems you got it wrong.
You seem a bit confused: when people say there are 800 nodes per search, they are referring to the number of simulations per move. This should be fairly obvious if you read the paper.
This is under the assumption that your prior probabilities aren't extremely biased, which they won't be with proper initialization.
There are no 800 nodes per search; there is 1 evaluated node and a few tens of traversed nodes per search (because the number of different paths explored is large). The reached depth of a single MCT search would then typically be smaller than what SF achieves, and only in the very late endgame would you reach mates.
hgm wrote: I am not very familiar with MCTS or simulations of it. But surely a search that expands 800 nodes should be aware if one of the root moves leads to a mate, and then only consider that move? If not, it seems to me that much could be improved. It is hard to believe that a mate close to the root would not affect the move choice at the root at all.
Yeah, that seems like a way to improve the search. However, DeepMind wasn't going for "the strongest possible chess engine" but for a system that can learn any game. Therefore they didn't use any domain knowledge.
If totally random play already results in 15% checkmates, any preference for mate-in-1 moves could only drive up that number, as the purely random games that ended in a draw would surely contain positions where a mate-in-1 was possible but not played, and these would now all turn into wins.
trulses wrote: The point still stands, with the MCTS hyper-parameters from A0
What are "MCTS hyper-parameters"? Never heard of that; can you point me to a reference where it is mentioned?
trulses wrote: To find a mate in one from the root node, you would have to be in a mate-in-one position. It's fairly obvious to me that most of those positions would not be in the early game, so your comment that these positions would be in the end game seems to be an indication that you're missing the point or you lack some understanding.
The only thing that I get from your writing is that you have quite some difficulty expressing yourself clearly in English. That really doesn't help the discussion.
Milos wrote: What are "MCTS hyper-parameters"? Never heard of that; can you point me to a reference where it is mentioned?
MCTS hyper-parameters refer to the hyper-parameters of the search, such as the value of the virtual loss, the Dirichlet noise alpha value, the c_puct exploration constant, and even the number of search threads. These things help shape the resulting search tree.
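The hyper-parameters listed above can be collected in a small config sketch. The field names and default values here are illustrative assumptions, not A0's published settings (the paper does report a Dirichlet alpha of 0.3 for chess and a 0.25 noise weight; the others were not all disclosed):

```python
from dataclasses import dataclass

@dataclass
class MCTSHyperParams:
    """Search-shaping knobs for an A0-style MCTS (illustrative values)."""
    c_puct: float = 1.5          # exploration constant in the PUCT formula (assumed)
    dirichlet_alpha: float = 0.3 # root-prior noise; 0.3 is the reported chess value
    dirichlet_eps: float = 0.25  # weight of the noise mixed into the NN prior
    virtual_loss: float = 1.0    # temporary loss on in-flight paths in parallel search (assumed)
    num_threads: int = 4         # parallel search threads (assumed)

params = MCTSHyperParams()
print(params)
```

Raising c_puct or the noise weight flattens the search toward more exploration; a larger virtual loss spreads parallel threads over more distinct paths.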
Milos wrote: The only thing that I get from your writing is that you have quite some difficulty to express yourself clearly in English.
That's cute.