AlphaZero Chess is not that strong ...

Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: AlphaZero Chess is not that strong ...

Post by Werewolf »

abulmo2 wrote:
My guess is more something like:
1) +5 elo
2) -10 elo (with AZ tuned for the same time management)
3) 0 elo (with AZ using an opening book too)
4) +20 elo
So a total of +15 elo, invisible on a 100 game tournament.
This looks very wrong to me.

1) Are you saying hash basically does nothing, that's the implication here? Of course SF would be better with 32 - 64 GB. And it doesn't address the question: "why did they limit it?". Even if you were correct, they should have put it at 32 GB just to be squeaky clean and above reproach. Vas always reckoned a doubling of hash was worth 5 elo, so I'd estimate +25 elo on this.

2) Given the complexity of the opening / early middlegame I'm sure SF would have gained here. Team SF specifically say they've done work on time management and picking critical positions, so why take that away from them? Unless....of course....Alpha Zero doesn't do it very well. +20 IMO.

3) Well it's called Alpha ZERO. SF makes no such claims. The book and tablebases are part of the package if it's used for competitive use. The tablebase issue is especially clear: programmers don't bother coding for B+N v K endings anymore because we assume the TB will handle them. I'd say +30 elo

4) Yeah seems right. +20 elo

5) They should use a machine running without HT rather than a 32 core machine on 64 threads. A fast 36 core machine Dual Xeon (no HT) is probably faster than the one they used. Say +10 elo.

So I make +105 elo. Much closer match.
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: AlphaZero Chess is not that strong ...

Post by Ovyron »

shrapnel wrote: As the Graph in the DeepMind Paper shows, strength of AlphaZero increases rapidly the more time it is given, much more so than Stockfish.
So, if it had been given 2 min/move, Stockfish would probably be looking at a 72-0 Score instead of 28-0 !
That's very unlikely, as Stockfish is also doubling its time and this approach gains about 50 ELO per doubling.

A more realistic approximation could be:

A0 - Stockfish (1min/move)
+28 =72 -0

A0 - Stockfish (2min/move)
+32 =68 -0

A0 - Stockfish (4min/move)
+36 =64 -0

A0 - Stockfish (8min/move)
+40 =60 -0

A0 - Stockfish (16min/move)
+43 =57 -0

A0 - Stockfish (32min/move)
+46 =54 -0

A0 - Stockfish (64min/move)
+50 =50 -0

A0 - Stockfish (2 hour, 8 min/move)
+53 =47 -0

A0 - Stockfish (4 hour, 16 min/move)
+56 =44 -0

A0 - Stockfish (8 hour, 32 min/move)
+59 =41 -0

A0 - Stockfish (17 hour, 4 min/move)
+62 =38 -0

A0 - Stockfish (1 Day, 10 hour, 8 min/move)
+64 =36 -0

A0 - Stockfish (2 Day, 20 hour, 16 min/move)
+67 =33 -0

A0 - Stockfish (5 Day, 16 hour, 32 min/move)
+69 =31 -0

A0 - Stockfish (11 Day, 9 hour, 4 min/move)
+72 =28 -0

Note this 11 Day/move Stockfish would have by 1 min/move standards the equivalent of 4150 ELO (that is, it'd defeat the 3400 elo Stockfish by a 98.7% performance) and A0 would have a 315 ELO advantage over it.

This is already overly optimistic: it assumes A0 can't lose a game against Stockfish (if it ever loses, this analysis goes out the window), and move choices may stop changing after some point. So it's very unlikely A0 can perform better than this at an earlier time control.
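
For anyone who wants to check the arithmetic, here is a minimal sketch using the standard logistic Elo formula. The function names are my own, and the 750 Elo gap is simply 4150 minus the 3400 baseline mentioned above; nothing here comes from the DeepMind paper.

```python
import math

def expected_score(elo_diff):
    """Expected score (0..1) of the stronger side for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_from_result(wins, draws, losses):
    """Elo difference implied by a match result, draws counted as half points."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is undefined for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))

# 4150 vs 3400 is a 750 Elo gap
print(round(100 * expected_score(750), 1))   # 98.7
# +72 =28 -0 at the longest time control in the list
print(round(elo_from_result(72, 28, 0)))     # 315
```

The same function puts the actual +28 =72 -0 result at roughly 100 Elo.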
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: AlphaZero Chess is not that strong ...

Post by Werewolf »

Laskos wrote:

1/ Elo terminology is a bit misleading here.

Take these 3 results:

+2 -0 =8
+20 -0 =80
+200 -0 =800

...

And all in all, all your objections of handicapping SF8 are almost irrelevant.
I think your argument is excellent, I'd not thought of that, but I'm not persuaded by your conclusion.

If SF was raised by 100 elo (and this seems possible taking a variety of steps: hash, opening book, TB, HT OFF on a faster machine, TC and so on)
then the result would be much closer.

If it was (say) +10, =90

and this followed through to

+100
=900

then I would agree with you, but we don't know that.

Secondly, a really good book can sometimes catch the opponent out with a deep line. It could be that without its book SF just didn't get positions it could pressure A0 with.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: AlphaZero Chess is not that strong ...

Post by Milos »

Laskos wrote:1/ Elo terminology is a bit misleading here.

Take these 3 results:

+2 -0 =8
+20 -0 =80
+200 -0 =800

Elo-wise the difference is the same. What can be said about the last result is that the advantage is HUGE, and only the drawishness of the Chess is distorting Elo perceptions. Elo becomes a bit irrelevant, and something like Normalized Elo or WILO is more useful to describe the results more straightforwardly.

I don't know why people are forgetful about Checkers paradigm, where +2 -0 =98 result was considered decisive. Elo-wise it's +7 Elo points, more precisely 7 +/- 10 Elo (2SD) points, not even decisive.

And all in all, all your objections of handicapping SF8 are almost irrelevant.
What you write is true but irrelevant. The 100 games from the chess starting position were clearly cherry-picked. In the best case (for Google's ethics) it was just pure luck. That is clear because in 11 of the 12 openings, which were just 2 moves deep, SF scored at least a couple of wins. So that result with 0 wins for SF is statistically improbable, i.e. the hypothesis that the test was not fair is more probable than that it was fair.
It's like tossing a coin 10 times and getting all 10 heads. Of course it can happen with a fair coin too, but if you only had that single experiment, the hypothesis that the coin is not fair is far more probable.
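
As a rough check on the quoted figures (same Elo for the three results, and about 7 +/- 10 Elo for +2 -0 =98), here is a small sketch. It assumes independent games, takes the per-game variance from the observed win/draw/loss counts, and converts the score interval to Elo with the usual logistic formula; the function names are hypothetical, and other tools may compute the margin differently.

```python
import math

def elo(score):
    """Elo difference for a score fraction, clamped away from 0 and 1."""
    score = min(max(score, 1e-9), 1.0 - 1e-9)
    return 400.0 * math.log10(score / (1.0 - score))

def elo_with_margin(wins, draws, losses, n_sd=2.0):
    """Return (elo, low, high) using n_sd standard errors on the mean score."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance: E[x^2] - mean^2, with x in {1, 0.5, 0}
    var = (wins * 1.0 + draws * 0.25) / n - mean ** 2
    se = math.sqrt(var / n)
    return elo(mean), elo(mean - n_sd * se), elo(mean + n_sd * se)

for result in [(2, 8, 0), (20, 80, 0), (200, 800, 0), (2, 98, 0)]:
    center, low, high = elo_with_margin(*result)
    print(result, round(center, 1), round(low, 1), round(high, 1))
```

The first three results share the same central Elo (about 70), but only the larger samples keep the lower bound above zero, which is the point about sample size and drawishness.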
mclane
Posts: 18748
Joined: Thu Mar 09, 2006 6:40 pm
Location: US of Europe, germany
Full name: Thorsten Czub

Re: AlphaZero Chess is not that strong ...

Post by mclane »

Why do you believe they are hiding something?
Looking at the 10 games, did you ever see Stockfish lose this way?!
What seems like a fairy tale today may be reality tomorrow.
Here we have a fairy tale of the day after tomorrow....
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: AlphaZero Chess is not that strong ...

Post by hgm »

abulmo2 wrote:1) Are you saying hash basically does nothing, that's the implication here? Of course SF would be better with 32 - 64 GB. And it doesn't address the question: "why did they limit it?". Even if you were correct, they should have put it at 32 GB just to be squeaky clean and above reproach. Vas always reckoned a doubling of hash was worth 5 elo, so I'd estimate +25 elo on this.
It only gives 5 Elo per doubling as long as the hash is still not large enough (which is typically 10% of the size of the tree). After that, there is no further advantage in enlarging the hash table (or it is even detrimental).
Michael Neish
Posts: 70
Joined: Wed Apr 05, 2006 9:22 am

Re: AlphaZero Chess is not that strong ...

Post by Michael Neish »

For traditional engines, how many extra Elo points does an extra ply of search add to its strength? I thought 50 points, but that's from memory. It could have been 30 or something entirely different.

If AlphaGo had a x16 hardware advantage (as someone mentioned sometime ago), then assuming a branching factor of 2, a x16 speed improvement for Stockfish would amount to an extra 4 plies or 200 Elo points (if the above figure is correct).

Any comments?
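
The back-of-envelope estimate above can be written out as a short sketch. The effective branching factor of 2 and the flat 50 Elo per ply are the assumptions stated in this post, not measured values, and the function names are made up for illustration.

```python
import math

def extra_plies(speedup, branching_factor=2.0):
    """Extra search depth bought by a speedup, if each ply costs branching_factor times more nodes."""
    return math.log(speedup) / math.log(branching_factor)

def elo_gain(speedup, branching_factor=2.0, elo_per_ply=50.0):
    """Elo gain under the (assumed) constant Elo-per-ply model."""
    return extra_plies(speedup, branching_factor) * elo_per_ply

print(round(extra_plies(16), 2))   # 4.0
print(round(elo_gain(16)))         # 200
```

Changing `elo_per_ply` to 30 gives 120 Elo instead, so the estimate is only as good as that one number.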
CheckersGuy
Posts: 273
Joined: Wed Aug 24, 2016 9:49 pm

Re: AlphaZero Chess is not that strong ...

Post by CheckersGuy »

Isn't the "doubling the time control gives +50 elo" statement from ages ago? Does it still hold today? I don't believe it :D Are there any statistics from Komodo, Houdini, SF on how well they do with double the time?
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: AlphaZero Chess is not that strong ...

Post by Lyudmil Tsvetkov »

Leo wrote:
JJJ wrote:I'm not sure AlphaZero hit a wall in strength at all. I just think it now needs a lot of time to improve, because it is now in the area of high draw rates and it's going to need a lot of training to find some hidden winning line.

But in the end, I think it could still improve and gain at least 500 elo with much more training time than it was given at first.
Is there any evidence that Deep Mind will continue the project? If I had to guess, they will not use it for chess again. That's where the brainy people on our side come in.
If they don't use it for chess, certainly there is no better application.
Chess is much more complex than solving some imaginary problems regarding the human condition.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: AlphaZero Chess is not that strong ...

Post by Adam Hair »

Michael Neish wrote:For traditional engines, how many extra Elo points does an extra ply of search add to its strength? I thought 50 points, but that's from memory. It could have been 30 or something entirely different.

If AlphaGo had a x16 hardware advantage (as someone mentioned sometime ago), then assuming a branching factor of 2, a x16 speed improvement for Stockfish would amount to an extra 4 plies or 200 Elo points (if the above figure is correct).

Any comments?
The Elo gain per extra ply decreases as depth increases.