Alphazero news

Discussion of anything and everything relating to chess playing software and machines.

Moderators: Harvey Williamson, bob, hgm

jp
Posts: 1406
Joined: Mon Apr 23, 2018 5:54 am

Re: Alphazero news

Post by jp » Sat Dec 15, 2018 12:51 am

Laskos wrote:
Fri Dec 14, 2018 11:30 pm
But the main role of a diversified book against Lc0 is destabilizing it with unfamiliar or tactical positions, where Lc0 (A0) buckles. And this destabilizing of Lc0 (A0) will not vanish even at infinite time control.
Yes. You cannot expect a player's opponent to give the player the opening positions it likes the most. That's not what opponents do.

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK

Re: Alphazero news

Post by matthewlai » Sat Dec 15, 2018 2:05 am

Laskos wrote:
Fri Dec 14, 2018 11:30 pm
Uri Blass wrote:
Fri Dec 14, 2018 9:55 pm
Laskos wrote:
Fri Dec 14, 2018 3:15 pm
Thomas A. Anderson wrote:
Fri Dec 14, 2018 2:59 pm
Laskos wrote:
Fri Dec 14, 2018 1:31 pm
Michel wrote:
Fri Dec 14, 2018 12:20 pm
If AZ can always play into closed openings from start position no matter what the opponent does, why should its performance on open openings be reflected in its Elo rating?
It is a question of philosophy. As 100% of the practical use of chess engines consists of analysis one can argue that a chess engine should be able to play good chess in any (reasonable) position...
That statement of Matthew is not true. A0 cannot always play the openings it likes (closed, less tactical). With a "Leela Ratio" of about 2-3, Lc0 is comparable to A0 playing against SF (10 or 8).
from the beginning by itself). If these things happen (I will post the next results in, say, 2 hours), A0 CANNOT steer every game into a convenient opening. In fact, I knew that from my experience with Lc0, so for me the only reliable result was the one from the TCEC "outrageous" openings. But let's see the next results.
I think you misunderstood the sentence: it starts with an "If" and was certainly meant as a conditional construction. The common consensus seems to be that the "fairest" match condition is to start from the initial position and let SF use whatever book, with whatever diversification, it wants. Same for AZ. I am starting to consider different starting positions as different disciplines. Because we have no clue whether AZ would be able to avoid running into some or many of those positions, it would be unfair to test AZ on a possibly very untrained battlefield.
See the results:

Lc0 No Book vs SF10 BookX.bin:
Score of lc0_v191_11261 vs SF10: 5 - 16 - 19 [0.362] 40
Elo difference: -98.07 +/- 79.65


Lc0 No Book vs SF10 No Book (Initial Board position):
Score of lc0_v19_11261 vs SF10: 12 - 6 - 22 [0.575] 40
Elo difference: 52.51 +/- 73.05


Lc0 vs SF10 from "12 human openings":
Score of lc0_v19_11261 vs SF10: 8 - 5 - 27 [0.537] 40
Elo difference: 26.11 +/- 61.76

In the paper only the latter 2 are presented, but the first one, which is roughly 120-150 Elo points lower for Lc0 without any hindrance to its play (it plays all by itself from the start position), is absent. And the almost deterministic "best Cerebellum book moves" with A0 diversification is again a practice that plays into A0's strength. In fact the first result is the most relevant, as Lc0 "shouldn't be hindered by openings", and if the diversification comes from SF10, SF will be punished by Lc0 all by itself, right?
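The Elo differences quoted in the three results above follow from the win-loss-draw counts through the usual logistic Elo formula. A minimal sketch in Python is given below. The error margin is an assumption: the thread does not say how its "+/-" values were computed, so the sketch uses a normal approximation on the per-game score, which lands close to the posted margins but need not match them exactly. The function names are illustrative only.

```python
import math
from statistics import NormalDist


def elo_from_score(score: float) -> float:
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins: int, losses: int, draws: int, confidence: float = 0.95):
    """Return (score, elo_difference, error_margin) for a W-L-D result.

    Assumption: the margin comes from a normal approximation on the
    per-game score; the thread does not state the method actually used.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Variance of a single game outcome (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    std_error = math.sqrt(variance / games)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Convert the score interval to Elo and take half its width.
    low = elo_from_score(score - z * std_error)
    high = elo_from_score(score + z * std_error)
    return score, elo_from_score(score), (high - low) / 2.0


if __name__ == "__main__":
    results = [
        ("SF10 with BookX.bin", (5, 16, 19)),
        ("No book", (12, 6, 22)),
        ("12 human openings", (8, 5, 27)),
    ]
    for label, wld in results:
        score, elo, margin = match_elo(*wld)
        print(f"{label}: score {score:.3f}, Elo {elo:+.2f} +/- {margin:.2f}")
```

The sketch fails for a 0% or 100% score (or an interval that reaches those bounds), where the logistic formula is undefined.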

Aside from the lack of diversification, or a bit of diversification coming mostly from A0, all this also relates to the fact that A0 (Lc0) is a more specialized task solver than a regular engine, the subject of a thread I posted two or three months ago. So, take A0 away from what it specializes in, say with an opening book, and its strength will buckle.

I think that to compare with the paper you need to use the same time control as the paper if you have the same hardware (or an equivalent time control, adjusted for your hardware).
It is logical that BookX.bin can give Stockfish an advantage at short time control that disappears at long time control.

The simple reason is that I expect Lc0's opening moves to improve at long time control, whereas I do not expect Stockfish's opening moves to improve, because Stockfish reads the same moves directly from the book.
I explained this earlier. One role of the book is indeed to reply instantly to Lc0 with reasonable (even if possibly not the best) moves, on which Lc0 has to spend some time. But the main role of a diversified book against Lc0 is destabilizing it with unfamiliar or tactical positions, where Lc0 (A0) buckles. And this destabilizing of Lc0 (A0) will not vanish even at infinite time control. Again, A0 (Lc0) is more specialized in the specific task it is trained to do, that is, given the initial chess board, to win the game. Regular engines are more general solvers of chess positions; I am stating it for the third or fifth time on this forum.
And generally, most of your comments are "we cannot know this, we cannot know that". No, you, and only you cannot know many, many things, and I feel sorry for you.
There are two possible reasons for why Lc0 would lose more against SF if SF has a book.
1) Like you said, Lc0 is more specialized and doesn't play a diverse set of openings as well.
2) The quality of book moves is just higher in general than what Lc0 can come up with at short time control.

So why not play Lc0 at long time control and see which one is true?
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

Milos
Posts: 3923
Joined: Wed Nov 25, 2009 12:47 am

Re: Alphazero news

Post by Milos » Sat Dec 15, 2018 2:33 am

matthewlai wrote:
Sat Dec 15, 2018 2:05 am
There are two possible reasons for why Lc0 would lose more against SF if SF has a book.
1) Like you said, Lc0 is more specialized and doesn't play a diverse set of openings as well.
2) The quality of book moves is just higher in general than what Lc0 can come up with at short time control.

So why not play Lc0 at long time control and see which one is true?
Regarding 2), if the TC is long enough (like 2-5 million nodes per move), Lc0 will no longer change its mind about the root move, i.e. the probability of a root move change decreases drastically if we are looking at early opening moves (moves 1-5 from the start position, for example).
This is really easy to test. You run Lc0 from 100 common early opening positions for 2 million nodes and for 20 million nodes and check how many times the best move differs. I'll give you a hint: much less than 1%.
That is one more reason why Lc0 and A0 scale much worse than top A/B engines once we have LTC and ultra-long TC. With more thinking time in the opening, strong A/B engines will improve their best moves more than Lc0/A0. That is inherent to MCTS. Ofc you can tweak it with search parameters, but you will not make the engine stronger by forcing it to change its mind more often.
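The test proposed above (same positions, two node budgets, count how often the best move differs) can be sketched as follows. This is only an outline under stated assumptions: `best_move` is a hypothetical callback that takes a position string and a node budget and returns the engine's chosen move, and nothing here talks to a real engine. The default budgets of 2 and 20 million nodes are the ones named in the post.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical callback type: (position as a FEN string, node budget) -> best move.
BestMoveFn = Callable[[str, int], str]


def root_move_change_rate(
    positions: Iterable[str],
    best_move: BestMoveFn,
    low_nodes: int = 2_000_000,
    high_nodes: int = 20_000_000,
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Return the fraction of positions whose best move differs between
    the two node budgets, plus the list of (position, low, high) changes."""
    changed: List[Tuple[str, str, str]] = []
    total = 0
    for fen in positions:
        total += 1
        move_low = best_move(fen, low_nodes)
        move_high = best_move(fen, high_nodes)
        if move_low != move_high:
            changed.append((fen, move_low, move_high))
    if total == 0:
        raise ValueError("no positions supplied")
    return len(changed) / total, changed
```

In practice `best_move` would wrap an engine process, for instance by sending the UCI `go nodes` command and reading back the `bestmove` line. With only 100 positions the measured rate moves in steps of 1%, so telling "much less than 1%" apart from 1% would need a larger position set.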

chrisw
Posts: 3841
Joined: Tue Apr 03, 2012 2:28 pm

Re: Alphazero news

Post by chrisw » Sat Dec 15, 2018 10:08 am

Milos wrote:
Sat Dec 15, 2018 2:33 am
matthewlai wrote:
Sat Dec 15, 2018 2:05 am
There are two possible reasons for why Lc0 would lose more against SF if SF has a book.
1) Like you said, Lc0 is more specialized and doesn't play a diverse set of openings as well.
2) The quality of book moves is just higher in general than what Lc0 can come up with at short time control.

So why not play Lc0 at long time control and see which one is true?
Regarding 2), if the TC is long enough (like 2-5 million nodes per move), Lc0 will no longer change its mind about the root move, i.e. the probability of a root move change decreases drastically if we are looking at early opening moves (moves 1-5 from the start position, for example).
This is really easy to test. You run Lc0 from 100 common early opening positions for 2 million nodes and for 20 million nodes and check how many times the best move differs. I'll give you a hint: much less than 1%.
That is one more reason why Lc0 and A0 scale much worse than top A/B engines once we have LTC and ultra-long TC. With more thinking time in the opening, strong A/B engines will improve their best moves more than Lc0/A0. That is inherent to MCTS. Ofc you can tweak it with search parameters, but you will not make the engine stronger by forcing it to change its mind more often.
You are thinking too tactically about the problem. A/B finds lines, MCTS finds regions. The former is inherently tactical and the latter positional. Scaling is not an alternative word for stronger or better, btw. And games at a high level are decided by the gradual accumulation of positional advantage, with the tactics just dropping out at the end because of it. It is rare for any search to come up with situations where everything is okay for gazillions of nodes and iterations, and then suddenly there are unexpected tactics (spare me the deep test positions, which are only reached in play because one side is positionally stronger). For state-of-the-art A/B, it's not even useful, because the thirty or forty ply below the tactics are so shot through with pruning that the tactics are unlikely to be inevitable anyway. MCTS doesn't really care about individual single lines; it wants to go to regions where there are lots of good lines (for it). So both algorithms 'change their minds', but for different reasons. Meanwhile MCTS's inherent positional play is game-winning, whilst not necessarily being problem-solving.
Chess is not tactics, and if your positional region-finding is already good, then you don't need to change your mind much.
Your contention will become correct when A/B can see so far ahead, without the dangerous pruning, that it finds a resolution to everything (e.g. a good enough exhaustive searcher), but that's quite a way off yet.

Uri Blass
Posts: 8798
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: Alphazero news

Post by Uri Blass » Sat Dec 15, 2018 11:06 am

Milos wrote:
Sat Dec 15, 2018 2:33 am
matthewlai wrote:
Sat Dec 15, 2018 2:05 am
There are two possible reasons for why Lc0 would lose more against SF if SF has a book.
1) Like you said, Lc0 is more specialized and doesn't play a diverse set of openings as well.
2) The quality of book moves is just higher in general than what Lc0 can come up with at short time control.

So why not play Lc0 at long time control and see which one is true?
Regarding 2), if the TC is long enough (like 2-5 million nodes per move), Lc0 will no longer change its mind about the root move, i.e. the probability of a root move change decreases drastically if we are looking at early opening moves (moves 1-5 from the start position, for example).
This is really easy to test. You run Lc0 from 100 common early opening positions for 2 million nodes and for 20 million nodes and check how many times the best move differs. I'll give you a hint: much less than 1%.
That is one more reason why Lc0 and A0 scale much worse than top A/B engines once we have LTC and ultra-long TC. With more thinking time in the opening, strong A/B engines will improve their best moves more than Lc0/A0. That is inherent to MCTS. Ofc you can tweak it with search parameters, but you will not make the engine stronger by forcing it to change its mind more often.
I have never used Lc0 (and I am sure that I have no hardware that is good for Lc0, because my computer is relatively old), but I know that with alpha-beta chess engines it is clearly more than 1%.
I agree that testing Lc0 with 2 million nodes and 20 million nodes from positions that it reached in its games, to see how often it changes its mind, may be an interesting experiment.

I think that you can say only about lc0 that it scales worse than A/B engines.

We have no data for A0 and the data that I read in the paper suggest that A0 scales better than Stockfish8

Laskos
Posts: 10843
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Alphazero news

Post by Laskos » Sat Dec 15, 2018 12:28 pm

Uri Blass wrote:
Sat Dec 15, 2018 11:06 am


I think that you can say only about lc0 that it scales worse than A/B engines.

We have no data for A0 and the data that I read in the paper suggest that A0 scales better than Stockfish8
No, you missed some posts here or a whole thread. By all appearances, A0 scales worse at LTC than SF8 in their conditions. It's apparent from the data in the paper.

Laskos
Posts: 10843
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Alphazero news

Post by Laskos » Sat Dec 15, 2018 12:47 pm

matthewlai wrote:
Sat Dec 15, 2018 2:05 am

There are two possible reasons for why Lc0 would lose more against SF if SF has a book.
1) Like you said, Lc0 is more specialized and doesn't play a diverse set of openings as well.
2) The quality of book moves is just higher in general than what Lc0 can come up with at short time control.

So why not play Lc0 at long time control and see which one is true?
And 3) Book saves time.

2) is only partly relevant. First, the main effect against Lc0 is not the quality of the book moves but their diversity, which allows putting Lc0 in positions less familiar to it, often opening up the game or involving some tactics. With longer time control, yes, I guess that huge difference will diminish, but not vanish, as per 1). Also, in test suites at LTC, Lc0's rate of improvement with TC is lower than that of strong regular engines. Probably inherent to MCTS. Both on positional and tactical test suites.

I cannot play even 40 LTC games; the whole comparison would take more than a week. But I made the TC 4 times longer, and in 5-6 hours I will have some results. Even that takes a total of 12-16 hours of testing, for only 40 games each match (high error margins). I am not a real tester and I have a single desktop PC; maybe you could use your vast resources to check my results at LTC on powerful hardware.
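To gauge what "high error margins" means for 40-game matches, one rough check is a two-sample z-test on the match scores. The sketch below is a normal approximation only, and it assumes that games within a match are independent and that the two matches are independent of each other. Neither assumption is guaranteed by the setup described in this thread. The function names are illustrative.

```python
import math
from statistics import NormalDist


def score_and_std_error(wins: int, losses: int, draws: int):
    """Mean score per game and its standard error for a W-L-D result."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Variance of a single game outcome (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * score ** 2
    ) / games
    return score, math.sqrt(variance / games)


def compare_matches(first, second):
    """Return (z, two_sided_p) for the score difference second - first.

    Assumes independent games and independent matches (normal approximation).
    """
    score1, err1 = score_and_std_error(*first)
    score2, err2 = score_and_std_error(*second)
    z = (score2 - score1) / math.hypot(err1, err2)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


if __name__ == "__main__":
    # W-L-D counts from Lc0's point of view, as posted earlier in the thread.
    with_book = (5, 16, 19)   # SF10 using BookX.bin
    no_book = (12, 6, 22)     # both engines from the initial position
    z, p = compare_matches(with_book, no_book)
    print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

The same function can be pointed at the longer-TC results once they are in, to see whether the book effect shrinks by more than the noise of a 40-game sample.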

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK

Re: Alphazero news

Post by matthewlai » Sat Dec 15, 2018 1:15 pm

Laskos wrote:
Sat Dec 15, 2018 12:47 pm
And 3) Book saves time.
That is true. This can be compensated for by giving Lc0 more time so that when SF comes out of book, they have roughly equal time left.
the main effect against Lc0 is not the quality of the book moves but their diversity, which allows putting Lc0 in positions less familiar to it
How do you know this?
Also, in test suites at LTC, Lc0's rate of improvement with TC is lower than that of strong regular engines. Probably inherent to MCTS. Both on positional and tactical test suites.
We have observed the same. Test suite results just don't correlate very well with playing strength for this kind of engine. My guess (pure speculation) is that most conventional engines have been tuned against this kind of test suite at some point, so there's a degree of overfitting. Also, these test suites (STS especially) test common human chess knowledge, and conventional chess engines implement the same.
I cannot play even 40 LTC games; the whole comparison would take more than a week. But I made the TC 4 times longer, and in 5-6 hours I will have some results. Even that takes a total of 12-16 hours of testing, for only 40 games each match (high error margins). I am not a real tester and I have a single desktop PC; maybe you could use your vast resources to check my results at LTC on powerful hardware.
We do still need to pay for the resources and I'm afraid this one will be a bit hard to justify.

One thing you can do is start games at LTC, and when SF gets out of book, reset the clocks to whatever you want.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

jp
Posts: 1406
Joined: Mon Apr 23, 2018 5:54 am

Re: Alphazero news

Post by jp » Sat Dec 15, 2018 2:22 pm

Laskos wrote:
Sat Dec 15, 2018 12:28 pm
Uri Blass wrote:
Sat Dec 15, 2018 11:06 am

We have no data for A0 and the data that I read in the paper suggest that A0 scales better than Stockfish8
No, you missed some posts here or a whole thread. By all appearances, A0 scales worse at LTC than SF8 in their conditions. It's apparent from the data in the paper.
Uri, look at
http://talkchess.com/forum3/viewtopic.php?f=2&t=69206

noobpwnftw
Posts: 435
Joined: Sun Nov 08, 2015 10:10 pm

Re: Alphazero news

Post by noobpwnftw » Sat Dec 15, 2018 2:25 pm

matthewlai wrote:
Sat Dec 15, 2018 1:15 pm
We do still need to pay for the resources and I'm afraid this one will be a bit hard to justify.
Despite my prejudice and all that, I sincerely hope that the next time you guys are going to pay for a test like a thousand games at 3 hours TC, you think twice or, better, ask on the forums.
