Alphazero news

Moderators: bob, hgm, Harvey Williamson

Uri Blass
Posts: 8611
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: Alphazero news

Post by Uri Blass » Sat Dec 15, 2018 4:30 pm

jp wrote:
Sat Dec 15, 2018 2:22 pm
Laskos wrote:
Sat Dec 15, 2018 12:28 pm
Uri Blass wrote:
Sat Dec 15, 2018 11:06 am

We have no data for A0 and the data that I read in the paper suggest that A0 scales better than Stockfish8
No, you missed some posts here, or a whole thread. To all appearances, A0 scales worse at LTC than SF8 in their conditions. It's apparent from the data in the paper.
Uri, look at
http://talkchess.com/forum3/viewtopic.php?f=2&t=69206
I missed that thread, and I thought that Alpha0 scales better than Stockfish8 based on the previous paper.

clumma
Posts: 177
Joined: Fri Oct 10, 2014 8:05 pm
Location: Berkeley, CA

Re: Alphazero news

Post by clumma » Sat Dec 15, 2018 5:57 pm

matthewlai wrote:
Fri Dec 14, 2018 5:32 pm
For example, if SF plays 1. e4 openings very poorly as white, but never plays it, should any 1. e4 openings be used to estimate SF's strength as white?
Of course not. However, "strength as white" is a strange concept. Forcing this hypothetical SF to play both sides of 1.e4 would be completely fair. A player should only be credited for avoiding 1.e4 if he can win against it as black -- i.e. if 1.e4 is objectively bad.

-Carl

jp
Posts: 841
Joined: Mon Apr 23, 2018 5:54 am

Re: Alphazero news

Post by jp » Sat Dec 15, 2018 6:20 pm

clumma wrote:
Sat Dec 15, 2018 5:57 pm
matthewlai wrote:
Fri Dec 14, 2018 5:32 pm
For example, if SF plays 1. e4 openings very poorly as white, but never plays it, should any 1. e4 openings be used to estimate SF's strength as white?
Of course not. However, "strength as white" is a strange concept. Forcing this hypothetical SF to play both sides of 1.e4 would be completely fair. A player should only be credited for avoiding 1.e4 if he can win against it as black -- i.e. if 1.e4 is objectively bad.
It's also a bad example because White can avoid 1.e4 if he wants, but Black cannot avoid facing 1.e4 or any other first move unless he refuses to play any White opponent who plays an opening he doesn't like.

Laskos
Posts: 9535
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Alphazero news

Post by Laskos » Sat Dec 15, 2018 6:45 pm

jp wrote:
Sat Dec 15, 2018 6:20 pm
clumma wrote:
Sat Dec 15, 2018 5:57 pm
matthewlai wrote:
Fri Dec 14, 2018 5:32 pm
For example, if SF plays 1. e4 openings very poorly as white, but never plays it, should any 1. e4 openings be used to estimate SF's strength as white?
Of course not. However, "strength as white" is a strange concept. Forcing this hypothetical SF to play both sides of 1.e4 would be completely fair. A player should only be credited for avoiding 1.e4 if he can win against it as black -- i.e. if 1.e4 is objectively bad.
It's also a bad example because White can avoid 1.e4 if he wants, but Black cannot avoid facing 1.e4 or any other first move unless he refuses to play any White opponent who plays an opening he doesn't like.
Anyway, I let Lc0 play whatever it wants against SF + diversified book, this time at a 4 times longer time control, and the result is again conclusive. I will post it in half an hour. A0(Lc0) simply cannot avoid the various openings it doesn't like, and very plausibly that holds at any time control.

Laskos
Posts: 9535
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Alphazero news

Post by Laskos » Sat Dec 15, 2018 7:16 pm

matthewlai wrote:
Sat Dec 15, 2018 1:15 pm
Laskos wrote:
Sat Dec 15, 2018 12:47 pm
And 3) Book saves time.
That is true. This can be compensated for by giving Lc0 more time so that when SF comes out of book, they have roughly equal time left.
the main effect against Lc0 is not the quality of the book moves, but their diversity, which allows putting Lc0 in positions less familiar to it
How do you know this?
Many games exited the book with evals close to 0.00, as shown by both engines, but Lc0 often mishandled them and lost, much more often than vice versa.

Old results at 1min + 1s TC (Lc0 on RTX 2070, SF10 on 4 i7 threads):

Lc0 No Book vs SF10 BookX.bin:
Score of lc0_v191_11261 vs SF10: 5 - 16 - 19 [0.362] 40
Elo difference: -98.07 +/- 79.65


Lc0 No Book vs SF10 No Book (Initial Board position):
Score of lc0_v19_11261 vs SF10: 12 - 6 - 22 [0.575] 40
Elo difference: 52.51 +/- 73.05

==================================


New result at 4min + 4s TC:

Lc0 No Book vs SF10 BookX.bin:
Score of lc0_v19_11261 vs SF10: 2 - 14 - 24 [0.350] 40
Elo difference: -107.54 +/- 66.82

Even worse than before, but the number of games is small (40).

Lc0 No Book vs SF10 No Book (Initial Board position):
Score of lc0_v19_11261 vs SF10: 8 - 5 - 27 [0.537] 40
Elo difference: 26.11 +/- 61.76

So, at 4x time control, the difference is pretty stable.
I think the burden is on you to show that in your conditions, the diversified BookX.bin book, or some other good diversified book (even a small one), doesn't make much difference. In fact, this is the last argument left: that at long TC and on big hardware this difference vanishes, which I highly doubt. Anyway, playing 1000 games from a single initial board position at very long time control seems a bit hilarious to me. Can't you try some 200 games with a diversified book for SF?
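For readers who want to check the arithmetic behind the "Score of ... / Elo difference" lines above, here is a minimal Python sketch. It assumes the usual logistic Elo model and a normal-approximation confidence interval at 95%; that is an assumption about how the tool that printed these lines works, not something stated in the thread. The function names `elo_from_score` and `elo_with_margin` are made up for illustration. Under these assumptions the Elo differences should agree with the quoted values to within rounding, and the margins should come out close to the quoted "+/-" figures, though not necessarily identical.

```python
import math
from statistics import NormalDist


def elo_from_score(score: float) -> float:
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, losses: int, draws: int, confidence: float = 0.95):
    """Return (score, elo, margin) for a match result given as wins, losses, draws.

    Assumption: the margin uses a normal approximation of the mean score,
    which is only a rough guide for small samples such as 40 games.
    It fails for extreme results where the interval leaves (0, 1).
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games

    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    std_error = math.sqrt(variance / games)

    # Two-sided interval on the score, mapped to Elo at both ends.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    low = elo_from_score(score - z * std_error)
    high = elo_from_score(score + z * std_error)
    return score, elo_from_score(score), (high - low) / 2.0


if __name__ == "__main__":
    # W, L, D from Lc0's point of view, copied from the results in the post.
    results = {
        "1min+1s, SF10 with BookX.bin": (5, 16, 19),
        "1min+1s, no book": (12, 6, 22),
        "4min+4s, SF10 with BookX.bin": (2, 14, 24),
        "4min+4s, no book": (8, 5, 27),
    }
    for label, (w, l, d) in results.items():
        score, elo, margin = elo_with_margin(w, l, d)
        print(f"{label}: score {score:.4f}, Elo {elo:+.2f} +/- {margin:.2f}")
```

With only 40 games per match the margins are large, which is why the single-match numbers should be read with caution; the standard error of the score shrinks roughly with the square root of the number of games.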
Also, in test suites at LTC, Lc0's rate of improvement with TC is lower than that of strong regular engines, on both positional and tactical test suites. This is probably inherent to MCTS.
We have observed the same. Test-suite results just don't correlate very well with playing strength for this kind of engine. My guess (pure speculation) is that most conventional engines have been tuned against this kind of test suite at some point, so there's a degree of overfitting. Also, these test suites (STS especially) test common human chess knowledge, and conventional chess engines implement the same knowledge.
I cannot play even 40 LTC games; the whole comparison would take more than a week. But I increased the TC to 4 times longer, and in 5-6 hours I will have some results. Even that takes a total of 12-16 hours of testing, for only 40 games per match (high error margins). I am not a real tester and I have a single desktop PC; maybe you could use your vast resources to check my results at LTC on powerful hardware.
We do still need to pay for the resources and I'm afraid this one will be a bit hard to justify.

One thing you can do is start games at LTC, and when SF gets out of book, reset the clocks to whatever you want.

yanquis1972
Posts: 1762
Joined: Tue Jun 02, 2009 10:14 pm

Re: Alphazero news

Post by yanquis1972 » Sat Dec 15, 2018 7:53 pm

Laskos wrote:
Sat Dec 15, 2018 6:45 pm
jp wrote:
Sat Dec 15, 2018 6:20 pm
clumma wrote:
Sat Dec 15, 2018 5:57 pm
matthewlai wrote:
Fri Dec 14, 2018 5:32 pm
For example, if SF plays 1. e4 openings very poorly as white, but never plays it, should any 1. e4 openings be used to estimate SF's strength as white?
Of course not. However, "strength as white" is a strange concept. Forcing this hypothetical SF to play both sides of 1.e4 would be completely fair. A player should only be credited for avoiding 1.e4 if he can win against it as black -- i.e. if 1.e4 is objectively bad.
It's also a bad example because White can avoid 1.e4 if he wants, but Black cannot avoid facing 1.e4 or any other first move unless he refuses to play any White opponent who plays an opening he doesn't like.
Anyway, I let Lc0 play whatever it wants against SF + diversified book, this time at a 4 times longer time control, and the result is again conclusive. I will post it in half an hour. A0(Lc0) simply cannot avoid the various openings it doesn't like, and very plausibly that holds at any time control.
i missed the details about the book you're using, but if it's extremely strong, why isn't the conclusion that SF emerges from the opening with a superior position in the majority of games? again, i haven't seen anyone argue NNs surpassed known opening theory.

jp
Posts: 841
Joined: Mon Apr 23, 2018 5:54 am

Re: Alphazero news

Post by jp » Sat Dec 15, 2018 8:41 pm

yanquis1972 wrote:
Sat Dec 15, 2018 7:53 pm
Laskos wrote:
Sat Dec 15, 2018 6:45 pm
Anyway, I let Lc0 play whatever it wants against SF + diversified book, this time at a 4 times longer time control, and the result is again conclusive. I will post it in half an hour. A0(Lc0) simply cannot avoid the various openings it doesn't like, and very plausibly that holds at any time control.
i missed the details about the book you're using, but if it's extremely strong, why isn't the conclusion that SF emerges from the opening with a superior position in the majority of games? again, i haven't seen anyone argue NNs surpassed known opening theory.
According to an earlier post, BookX.bin is small, but good. It's not meant to give huge opening advantages. It's meant to give diversified openings.

If we want best-theory opening books (we don't), we have to steal them from the top-10 players' laptops.

Javier Ros
Posts: 183
Joined: Fri Oct 12, 2012 10:48 am
Location: Seville (SPAIN)
Full name: Javier Ros

Re: Alphazero news

Post by Javier Ros » Sat Dec 15, 2018 9:24 pm

Laskos wrote:
Sat Dec 15, 2018 7:16 pm
matthewlai wrote:
Sat Dec 15, 2018 1:15 pm
Laskos wrote:
Sat Dec 15, 2018 12:47 pm
And 3) Book saves time.
That is true. This can be compensated for by giving Lc0 more time so that when SF comes out of book, they have roughly equal time left.
the main effect against Lc0 is not the quality of the book moves, but their diversity, which allows putting Lc0 in positions less familiar to it
How do you know this?
Many games exited the book with evals close to 0.00, as shown by both engines, but Lc0 often mishandled them and lost, much more often than vice versa.

Old results at 1min + 1s TC (Lc0 on RTX 2070, SF10 on 4 i7 threads):

Lc0 No Book vs SF10 BookX.bin:
Score of lc0_v191_11261 vs SF10: 5 - 16 - 19 [0.362] 40
Elo difference: -98.07 +/- 79.65


Lc0 No Book vs SF10 No Book (Initial Board position):
Score of lc0_v19_11261 vs SF10: 12 - 6 - 22 [0.575] 40
Elo difference: 52.51 +/- 73.05

==================================


New result at 4min + 4s TC:

Lc0 No Book vs SF10 BookX.bin:
Score of lc0_v19_11261 vs SF10: 2 - 14 - 24 [0.350] 40
Elo difference: -107.54 +/- 66.82

Even worse than before, but the number of games is small (40).

Lc0 No Book vs SF10 No Book (Initial Board position):
Score of lc0_v19_11261 vs SF10: 8 - 5 - 27 [0.537] 40
Elo difference: 26.11 +/- 61.76

So, at 4x time control, the difference is pretty stable.
I think the burden is on you to show that in your conditions, the diversified BookX.bin book, or some other good diversified book (even a small one), doesn't make much difference. In fact, this is the last argument left: that at long TC and on big hardware this difference vanishes, which I highly doubt. Anyway, playing 1000 games from a single initial board position at very long time control seems a bit hilarious to me. Can't you try some 200 games with a diversified book for SF?
I think one match is missing from this interesting experiment: SF10 No Book vs Lc0 BookX.bin. It would be very interesting to have that additional data in the comparison.
The love relationship between a chess engine tester and his computer can be summarized in one sentence:
Until heat do us part.

Chessqueen
Posts: 634
Joined: Wed Sep 05, 2018 12:16 am
Full name: Nancy M Pichardo

Re: Alphazero news

Post by Chessqueen » Sat Dec 15, 2018 9:30 pm

I do not believe that SF fell for this, or Alpha Zero looked so deep that SF missed it altogether ==>
https://www.youtube.com/watch?v=7wYMJ4dc-I0

Laskos
Posts: 9535
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Alphazero news

Post by Laskos » Sat Dec 15, 2018 9:59 pm

jp wrote:
Sat Dec 15, 2018 8:41 pm
yanquis1972 wrote:
Sat Dec 15, 2018 7:53 pm
Laskos wrote:
Sat Dec 15, 2018 6:45 pm
Anyway, I let Lc0 play whatever it wants against SF + diversified book, this time at a 4 times longer time control, and the result is again conclusive. I will post it in half an hour. A0(Lc0) simply cannot avoid the various openings it doesn't like, and very plausibly that holds at any time control.
i missed the details about the book you're using, but if it's extremely strong, why isn't the conclusion that SF emerges from the opening with a superior position in the majority of games? again, i haven't seen anyone argue NNs surpassed known opening theory.
According to an earlier post, BookX.bin is small, but good. It's not meant to give huge opening advantages. It's meant to give diversified openings.

If we want best-theory opening books (we don't), we have to steal them from the top-10 players' laptops.
That's right, it's a small, good Polyglot book, built by Adam Hair, IIRC.
It's not a professional CTG book from the Playchess Engine Room; those are 10-30 times larger and of much higher quality, and the Fritz interface offers many more options to specialize them for strength or diversity. BookX.bin mostly provides variety, without having many wrong lines.
Aside from that, isn't the main argument that we shouldn't feed A0(Lc0) openings it will never play? Now Lc0 IS playing openings all by itself and loses heavily.

It seems that for some people the only way to test A0(Lc0) is to feed it cherry-picked openings that heavily favor it, 1+12 of them in total.
