Google's AlphaGo team has been working on chess

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Google's AlphaGo team has been working on chess

Post by jdart »

Stockfish does such heavy pruning that it is throwing away most of the nodes in its search trees. But the ones it does search, it searches very deeply. I see a lot of high-level computer games won by tactics or by endgame play that requires deep search. Shannon Type II (selective search) has never worked well in any of the past 5-6 decades. But maybe this effort is showing that eval is more important than has been thought, and search less important.

--Jon
Henk
Posts: 7216
Joined: Mon May 27, 2013 10:31 am

Re: Google's AlphaGo team has been working on chess

Post by Henk »

What I understand is that the neural network predicts the winning probabilities for each valid move in a position.

I don't understand how these predictions can be good if it doesn't do a search but only simulation.

So how is it possible that Monte Carlo simulation is better than an alpha-beta search?
clumma
Posts: 186
Joined: Fri Oct 10, 2014 10:05 pm
Location: Berkeley, CA

Re: Google's AlphaGo team has been working on chess

Post by clumma »

Henk wrote:What I understand is that the neural network predicts the winning probabilities for each valid move in a position.

I don't understand how these predictions can be good if it doesn't do a search but only simulation.

So how is it possible that Monte Carlo simulation is better than an alpha-beta search?
The trick is to stop thinking in terms of tactics and search, and start thinking in terms of learning a really complex evaluation function. As the paper explains, alpha-beta can amplify any error in the evaluation function, whereas MCTS (plus a little noise) averages it out. So tuning alpha-beta is, in a sense, harder.
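A toy illustration of that point, not the paper's own argument: assume every leaf has the same true value (0.0) and the evaluation adds independent Gaussian noise. Backing up the maximum, as a single minimax ply does, picks out the luckiest error, while averaging the leaves, loosely as MCTS does, lets the errors cancel. The function names, the noise model, and the parameters are all assumptions made for this sketch.

```python
import random
import statistics


def noisy_eval(true_value, sigma, rng):
    """Hypothetical evaluation: the true value plus Gaussian noise."""
    return true_value + rng.gauss(0.0, sigma)


def compare_backups(num_leaves=20, sigma=0.5, trials=5000, seed=0):
    """Return (average max-backup, average mean-backup) over many trials.

    Every leaf has true value 0.0, so any nonzero result is pure
    evaluation error surviving the backup.
    """
    rng = random.Random(seed)
    max_backups = []
    mean_backups = []
    for _ in range(trials):
        leaves = [noisy_eval(0.0, sigma, rng) for _ in range(num_leaves)]
        max_backups.append(max(leaves))                # minimax-style backup
        mean_backups.append(statistics.fmean(leaves))  # averaging backup
    return statistics.fmean(max_backups), statistics.fmean(mean_backups)


max_bias, mean_bias = compare_backups()
print(f"max backup error:  {max_bias:+.3f}")
print(f"mean backup error: {mean_bias:+.3f}")
```

In this model the max backup is biased well above the true value and the mean backup stays close to it. A real alpha-beta search alternates max and min and real evaluation errors are correlated, so this only shows the direction of the effect.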

-Carl
clumma
Posts: 186
Joined: Fri Oct 10, 2014 10:05 pm
Location: Berkeley, CA

Re: Google's AlphaGo team has been working on chess

Post by clumma »

Xann wrote:Time is misleading in DeepMind's papers, as they use thousands of "computers" (not even commercially available). Money would be a better measure.
Neural nets scale well on cheap (per FLOP) SIMD hardware, so you'll definitely pay more for the traditional engine at the same level of performance.

-Carl
Uri Blass
Posts: 10268
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Google's AlphaGo team has been working on chess

Post by Uri Blass »

Daniel Shawul wrote:Most of us here suspected that this could happen once Giraffe showed it can beat Stockfish's eval.

Just the fact that the new approach to chess programming worked incredibly well is fantastic even if it didn't beat the best.

Daniel

How do you decide that Giraffe's evaluation is better than Stockfish's?
If the definition is based on using a fixed number of nodes with the same search function, then having a better evaluation than Stockfish is easy if you do not care about time.

Evaluation is basically a function that takes a position and returns a number.

In this case I can define the evaluation of a position to be the result of Stockfish's search when it searches 20 plies forward.

I am sure this evaluation is better than Stockfish's evaluation and can beat Stockfish's evaluation when you search the same number of nodes (of course, you do not count the nodes that you search to calculate the evaluation, because they are defined to be part of the evaluation).
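The argument can be sketched on a toy game instead of chess. Assumptions for this sketch: a subtraction game (take 1 to 3 stones, whoever takes the last stone wins), a deliberately blind static evaluation, and hypothetical helper names. Both "engines" search the same number of visible nodes, but one hides a deeper search inside its evaluation.

```python
def static_eval(stones):
    """Blind static eval from the side to move's view: only sees a finished game."""
    return -1.0 if stones == 0 else 0.0  # no stones left: side to move has lost


def negamax(stones, depth, evaluate, counter):
    """Plain negamax; counter[0] counts the nodes this search visits."""
    counter[0] += 1
    if stones == 0 or depth == 0:
        return evaluate(stones)
    return max(
        -negamax(stones - take, depth - 1, evaluate, counter)
        for take in (1, 2, 3)
        if take <= stones
    )


def make_search_eval(inner_depth, hidden_counter):
    """An 'evaluation' that is secretly a search of inner_depth plies."""
    def search_eval(stones):
        return negamax(stones, inner_depth, static_eval, hidden_counter)
    return search_eval


visible_a, visible_b, hidden = [0], [0], [0]
score_static = negamax(8, 2, static_eval, visible_a)
score_search = negamax(8, 2, make_search_eval(6, hidden), visible_b)

print("static eval:", score_static, "visible nodes:", visible_a[0])
print("search eval:", score_search, "visible nodes:", visible_b[0],
      "hidden nodes:", hidden[0])
```

With 8 stones the side to move is lost. The search-backed evaluation reports that, while the static one sees nothing, even though both count the same visible nodes. The hidden counter shows where the real work went.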

Uri
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Google's AlphaGo team has been working on chess

Post by Dann Corbit »

EvgeniyZh wrote:
mar wrote:While this is indeed incredible, show me how it beats SF dev with good book and syzygy on equal hardware in a 1000 game match.

Alternatively winning next TCEC should do :wink:
Do you propose running Stockfish on a GPU?)
mar wrote:They are scientists so it would be nice to compare apples to apples.
AlphaZero used neither a book nor Syzygy, and neither did Stockfish. That sounds like apples to apples.
They used TPUs, not GPUs (I suppose; I have not read the paper yet, but TPUs were used for Go).

They are special vector processors that, like GPUs, perform massively parallel operations.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
TommyTC
Posts: 38
Joined: Thu Mar 30, 2017 8:52 am

Re: Google's AlphaGo team has been working on chess

Post by TommyTC »

Fulvio wrote:This is sci-fi. I do not have a 64 core machine but on my pc Stockfish does not sacrifice a Knight for 2 pawns:
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.d3 Bc5 5.Bxc6 dxc6 6.O-O Nd7 7.Nbd2 O-O 8.Qe1 f6 9.Nc4 Rf7 10.a4 Bf8 11.Kh1 Nc5 12.a5 Ne6 13.Ncxe5?
Using Stockfish 181117, at depth 30 it finds 13. Ncxe5 as best move, and changes at depth 34 to 13. Qc3 with eval = -0.37.
I'd say it is normal.
Leo
Posts: 1080
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: Google's AlphaGo team has been working on chess

Post by Leo »

Fulvio wrote:
pkappler wrote:Today is a big day in computer chess:

https://arxiv.org/abs/1712.01815
https://arxiv.org/pdf/1712.01815.pdf
"Instead of a handcrafted evaluation function and move ordering heuristics, AlphaZero utilises a deep neural network (p,v) = fθ(s) with parameters θ.
This neural network takes the board position s as an input and outputs a vector of move probabilities p with components pa = Pr(a|s) for each action a, and a scalar value v estimating the expected outcome z from position s"

This seems normal to me.

"Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root to leaf. Each simulation proceeds by selecting in each state a move with low visit count, high move probability and high value" [emphasis mine]

This is interesting. If I understand it correctly, it basically goes deeper only after reaching a high level of hash table hits.
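One common way to combine "low visit count, high move probability and high value" into a single score is the PUCT rule described for AlphaGo Zero. The sketch below assumes that rule applies here. The constant `c_puct`, the `+ 1` under the square root (so that priors break ties before any visit), the `Edge` class, and the example numbers are illustrative assumptions, not values from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Edge:
    """Statistics for one move from a position (hypothetical container)."""
    prior: float             # move probability p_a from the network
    visits: int = 0          # how often this move was tried in simulations
    value_sum: float = 0.0   # sum of values backed up through this move

    @property
    def mean_value(self):
        return self.value_sum / self.visits if self.visits else 0.0


def select_move(edges, c_puct=1.5):
    """Pick the move maximising mean value plus an exploration bonus.

    The bonus grows with the prior and shrinks as the move's own
    visit count grows, so rarely tried but plausible moves get a turn.
    """
    total_visits = sum(e.visits for e in edges.values())

    def score(edge):
        bonus = c_puct * edge.prior * math.sqrt(total_visits + 1) / (1 + edge.visits)
        return edge.mean_value + bonus

    return max(edges, key=lambda move: score(edges[move]))


edges = {"e4": Edge(prior=0.6), "d4": Edge(prior=0.3), "a3": Edge(prior=0.1)}
first_choice = select_move(edges)   # no visits yet: the highest prior wins

# Pretend e4 was simulated 10 times and did badly.
edges["e4"].visits = 10
edges["e4"].value_sum = -5.0
second_choice = select_move(edges)  # the bonus now favours the unvisited d4
print(first_choice, second_choice)
```

Note that nothing here depends on hash table hits: the depth of a line grows because each simulation follows the best-scoring edge one step further before expanding a new leaf.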


"AlphaZero vs Stockfish: 25 win for AlphaZero, 25 draw, 0 loss (each program was given 1 minute of thinking time per move, strongest skill level using 64 threads and a hash size of 1GB)"

This is sci-fi. I do not have a 64 core machine but on my pc Stockfish does not sacrifice a Knight for 2 pawns:
1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.d3 Bc5 5.Bxc6 dxc6 6.O-O Nd7 7.Nbd2 O-O 8.Qe1 f6 9.Nc4 Rf7 10.a4 Bf8 11.Kh1 Nc5 12.a5 Ne6 13.Ncxe5?
I also wondered why SF took that pawn with its Knight.
Advanced Micro Devices fan.
Leo
Posts: 1080
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: Google's AlphaGo team has been working on chess

Post by Leo »

mar wrote:While this is indeed incredible, show me how it beats SF dev with good book and syzygy on equal hardware in a 1000 game match.

Alternatively winning next TCEC should do :wink:
Great point.
Advanced Micro Devices fan.
Uri Blass
Posts: 10268
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Google's AlphaGo team has been working on chess

Post by Uri Blass »

jdart wrote:Stockfish does such heavy pruning that it is throwing away most of the nodes in its search trees. But the ones it does search, it searches very deeply. I see a lot of high-level computer games won by tactics or by endgame play that requires deep search. Shannon Type II (selective search) has never worked well in any of the past 5-6 decades. But maybe this effort is showing that eval is more important than has been thought, and search less important.

--Jon
I am not sure how you define eval.

It is easy to do a search inside the evaluation, and if the evaluation is slow then I suspect that it does a lot of search inside what you call evaluation.

I do not know what AlphaZero does in its evaluation. A book that describes AlphaZero's evaluation and explains how to calculate it in different positions would be interesting, so that humans could follow the steps and get the result without a special computer program, even if it takes some days to calculate the evaluation of a single position.