pkappler wrote: Today is a big day in computer chess:
https://arxiv.org/abs/1712.01815
https://arxiv.org/pdf/1712.01815.pdf

"Instead of a handcrafted evaluation function and move ordering heuristics, AlphaZero utilises a deep neural network (p, v) = f_θ(s) with parameters θ. This neural network takes the board position s as an input and outputs a vector of move probabilities p with components p_a = Pr(a|s) for each action a, and a scalar value v estimating the expected outcome z from position s"
This seems normal to me.
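To make the (p, v) output concrete, here is a toy sketch in Python of what the last step of such a network might look like. It is not the real network: the function name `policy_and_value` and the raw scores passed in are my own illustration, and I am assuming a softmax over the legal moves for p and a tanh squashing for v.

```python
import math

def policy_and_value(move_logits, value_logit):
    """Turn raw scores into (p, v).

    move_logits: dict mapping each legal move to an unnormalised score
                 (hypothetical numbers, standing in for the network's output).
    value_logit: a single raw score for the position.
    """
    # Softmax over the legal moves; subtracting the max keeps exp() stable.
    top = max(move_logits.values())
    exps = {a: math.exp(x - top) for a, x in move_logits.items()}
    total = sum(exps.values())
    p = {a: e / total for a, e in exps.items()}

    # Squash the value into (-1, 1): roughly "loss" to "win" (assumed convention).
    v = math.tanh(value_logit)
    return p, v

p, v = policy_and_value({"e4": 2.0, "d4": 1.0, "Nf3": 0.0}, 0.3)
```

So p is just a probability for each legal move that sums to 1, and v is one number for the whole position.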
"Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root to leaf. Each simulation proceeds by selecting in each state a move with low visit count, high move probability and high value" [emphasis mine]
This is interesting. If I understand it correctly, it basically goes deeper only after reaching a high level of hash table hits.
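For reference, the "low visit count, high move probability and high value" rule is usually written as a PUCT-style score: the mean value of the move plus a bonus that grows with the prior probability and shrinks with the visit count. Below is a minimal sketch of that idea, assuming the form Q + c * P * sqrt(N_parent) / (1 + N_child). The function names and the constant `c_puct=1.5` are my own placeholders, not values taken from the paper.

```python
import math

def puct_score(q, prior, child_visits, parent_visits, c_puct=1.5):
    """Mean value plus an exploration bonus that shrinks as the move is visited."""
    bonus = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + bonus

def select_move(children, c_puct=1.5):
    """children: dict move -> (mean value q, prior probability, visit count).

    Returns the move with the highest PUCT score.
    """
    parent_visits = sum(visits for _, _, visits in children.values())
    return max(
        children,
        key=lambda m: puct_score(*children[m], parent_visits, c_puct),
    )

# Same value and prior, but d4 has never been visited, so it gets picked.
children = {"e4": (0.1, 0.5, 10), "d4": (0.1, 0.5, 0)}
choice = select_move(children)
```

Under these assumptions the search is not driven by hash table hits: a move is explored further when its prior or value is high, and the bonus fades as its visit count grows.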
"AlphaZero vs Stockfish: 25 win for AlphaZero, 25 draw, 0 loss (each program was given 1 minute of thinking time per move, strongest skill level using 64 threads and a hash size of 1GB)"
This is sci-fi. I do not have a 64-core machine, but on my PC Stockfish does not sacrifice a knight for two pawns:
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.d3 Bc5 5.Bxc6 dxc6 6.O-O Nd7 7.Nbd2 O-O 8.Qe1 f6 9.Nc4 Rf7 10.a4 Bf8 11.Kh1 Nc5 12.a5 Ne6 13.Ncxe5?