Alphazero news

Discussion of anything and everything relating to chess playing software and machines.

Moderators: bob, hgm, Harvey Williamson

duncan
Posts: 10202
Joined: Mon Jul 07, 2008 8:50 pm

Re: Alphazero news

Post by duncan » Sun Dec 09, 2018 4:40 pm

Albert Silver wrote:
Sun Dec 09, 2018 4:34 pm


Yet, that is what AlphaZero has done, and we are even able to bring this to the home user's PC thanks to DeepMind's generosity with their knowledge, as well as the fantastic Leela Chess community efforts. In other words, it is not limited to some absurdly exotic hardware no one could ever hope to obtain. I am not even commenting on the whole self-learning process, which is what has been the focus.

What is more, to achieve this, you are looking at an incredibly evolved eval function (not precisely, but it helps illustrate the point) that has roughly 28 million values, compared to a few thousand at most for even the most sophisticated predecessors. In all the years I have seen discussions on the fight between smart searchers and fast searchers, I have never seen anyone come close to imagining how enormous a difference it would take, much less realize and prove it.

It is pure genius.
Playing devil's advocate: yes, it is genius, but the chess community is not looking for genius; we are looking to revolutionise chess playing, and AlphaZero seems to have maxed out at a level not significantly higher than an old version of Stockfish without a book.

A dazzling 'toy', but a 'toy' nevertheless.

The future may be different, but that remains to be seen.

Albert Silver
Posts: 2839
Joined: Wed Mar 08, 2006 8:57 pm
Location: Rio de Janeiro, Brazil

Re: Alphazero news

Post by Albert Silver » Sun Dec 09, 2018 4:51 pm

duncan wrote:
Sun Dec 09, 2018 4:40 pm
Albert Silver wrote:
Sun Dec 09, 2018 4:34 pm


Yet, that is what AlphaZero has done, and we are even able to bring this to the home user's PC thanks to DeepMind's generosity with their knowledge, as well as the fantastic Leela Chess community efforts. In other words, it is not limited to some absurdly exotic hardware no one could ever hope to obtain. I am not even commenting on the whole self-learning process, which is what has been the focus.

What is more, to achieve this, you are looking at an incredibly evolved eval function (not precisely, but it helps illustrate the point) that has roughly 28 million values, compared to a few thousand at most for even the most sophisticated predecessors. In all the years I have seen discussions on the fight between smart searchers and fast searchers, I have never seen anyone come close to imagining how enormous a difference it would take, much less realize and prove it.

It is pure genius.
Playing devil's advocate: yes, it is genius, but the chess community is not looking for genius; we are looking to revolutionise chess playing, and AlphaZero seems to have maxed out at a level not significantly higher than an old version of Stockfish without a book.

A dazzling 'toy', but a 'toy' nevertheless.

The future may be different, but that remains to be seen.
Needless to say, I disagree with pretty much every point you made. We will have to agree to disagree.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."

noobpwnftw
Posts: 350
Joined: Sun Nov 08, 2015 10:10 pm

Re: Alphazero news

Post by noobpwnftw » Sun Dec 09, 2018 5:31 pm

Brute-forcing those 28 million parameters of an evaluation function in black-box style is neither efficient nor intelligent.

As a generic algorithm, yes, it is a lazy man's solution to everything, but in the chess domain I do not see how it differs in principle from a giant SPSA run, and how one would call it "the future" when comparable performance can be achieved in a way that is not a black box.
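[As an aside for readers unfamiliar with the comparison: SPSA tunes a whole parameter vector using only two noisy evaluations of the objective per step, regardless of dimension, which is why it scales to large black-box tuning runs of the kind described here. A minimal sketch follows; the function name is mine and the gain schedules are Spall's standard recommendations, not anything from A0 or Leela code.]

```python
import numpy as np

def spsa_minimize(loss, theta0, a=0.1, c=0.1, iterations=1000, seed=0):
    """Simultaneous Perturbation Stochastic Approximation (SPSA).

    Estimates the gradient of `loss` for ALL parameters at once from
    only two function evaluations per step, then descends along it.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    for k in range(1, iterations + 1):
        ak = a / k ** 0.602                 # step-size gain schedule
        ck = c / k ** 0.101                 # perturbation-size schedule
        # Random Rademacher (+/-1) perturbation of every parameter at once
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        # Two-sided difference gives one gradient estimate per step,
        # no matter how many parameters theta has
        g_hat = (loss(theta + ck * delta) - loss(theta - ck * delta)) \
                / (2.0 * ck) * (1.0 / delta)
        theta -= ak * g_hat
    return theta
```

The point of the analogy: like NN training, this treats the evaluation purely as a black box scored by its output, with no per-parameter reasoning.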

It is important that when you are engineering something, you know the reasons behind its outcome, whereas in the A0 papers I see none: parameters and formulas are chosen arbitrarily because "other attempts failed" or something. Very scientific.

Also, it is not "zero": why would one introduce temperature for up to 30 plies instead of 40 or 50? That is just another piece of domain knowledge, and when you look at the possibility of improvements upon such approaches, I see more domain knowledge. So this "zero" thing is itself fake science. It is like putting a jet engine on a bicycle and saying: hey, it runs so fast, nice work on building the jet engine. And how to improve? Get rid of the jet engine! :D
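[For context on the "temperature up to 30 plies" remark: in AlphaZero's self-play, a move is sampled in proportion to the MCTS visit counts for the first 30 plies, and the most-visited move is played deterministically afterwards. A minimal sketch of that selection rule is below; the function name and defaults are my own, not from any published code.]

```python
import numpy as np

def select_move(visit_counts, ply, temperature_plies=30, tau=1.0, rng=None):
    """Pick a move index from MCTS visit counts.

    For the first `temperature_plies` plies, sample proportionally to
    counts**(1/tau) to diversify self-play games; after the cutoff,
    always play the most-visited move.
    """
    counts = np.asarray(visit_counts, dtype=float)
    if ply >= temperature_plies:
        return int(np.argmax(counts))       # deterministic after the cutoff
    probs = counts ** (1.0 / tau)
    probs /= probs.sum()
    rng = rng or np.random.default_rng()
    return int(rng.choice(len(counts), p=probs))
```

The "30" is exactly the kind of hand-picked constant being objected to: nothing in the self-play framework derives it, it is a tuned choice.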

Milos
Posts: 3387
Joined: Wed Nov 25, 2009 12:47 am

Re: Alphazero news

Post by Milos » Sun Dec 09, 2018 5:37 pm

Albert Silver wrote:
Sun Dec 09, 2018 4:34 pm
I think it goes deeper than that. If someone, even someone whose knowledge and technical savvy you deeply respected, had told you a few years ago that they could get a program that was nearly one thousand times slower in NPS to compete with and even beat the best of the day on a PC, I am guessing you would have rolled your eyes at them. I know I would have.

Yet, that is what AlphaZero has done, and we are even able to bring this to the home user's PC thanks to DeepMind's generosity with their knowledge, as well as the fantastic Leela Chess community efforts. In other words, it is not limited to some absurdly exotic hardware no one could ever hope to obtain. I am not even commenting on the whole self-learning process, which is what has been the focus.

What is more, to achieve this, you are looking at an incredibly evolved eval function (not precisely, but it helps illustrate the point) that has roughly 28 million values, compared to a few thousand at most for even the most sophisticated predecessors. In all the years I have seen discussions on the fight between smart searchers and fast searchers, I have never seen anyone come close to imagining how enormous a difference it would take, much less realize and prove it.

It is pure genius.
A0's and LC0's success is 90% due to hardware and 10% due to "smart" software. Everything Google used to create A0 has existed for decades, or at least 5-10 years, and was not invented by Google by any means. Even A0 itself, and the way it was presented, was a marketing effort to advertise the Cloud TPU service.
Without the incredibly fast hardware developed mainly in the last 3-4 years, A0 wouldn't exist. So if there is one thing people should be thankful to Google for, it is that by creating their TPU hardware they created competition for NVIDIA, which pushed NVIDIA to create tremendous ML-oriented cards. Those also rendered the Cloud TPU service obsolete ;).

Albert Silver
Posts: 2839
Joined: Wed Mar 08, 2006 8:57 pm
Location: Rio de Janeiro, Brazil

Re: Alphazero news

Post by Albert Silver » Sun Dec 09, 2018 6:07 pm

noobpwnftw wrote:
Sun Dec 09, 2018 5:31 pm
Brute-forcing those 28 million parameters of an evaluation function in black-box style is neither efficient nor intelligent.
Brute forcing? I think you are confusing it with your Stockfish box running on 384 threads.
As a generic algorithm, yes, it is a lazy man's solution to everything, but in the chess domain I do not see how it differs in principle from a giant SPSA run, and how one would call it "the future" when comparable performance can be achieved in a way that is not a black box.

It is important that when you are engineering something, you know the reasons behind its outcome, whereas in the A0 papers I see none: parameters and formulas are chosen arbitrarily because "other attempts failed" or something. Very scientific.
I have a friend who works with NN development who would agree, and openly states that in his opinion NNs are not science but engineering projects. However, I completely fail to see why this is important here. I don't care at all if you call NNs the product of science, engineering, or fairy dust sprinkled over the CPU. The fact is that this is a radical departure from previous paradigms, and a working one. It is not important whether it is significantly stronger or not. The fact that it is at least as good is revolutionary.
Also, it is not "zero": why would one introduce temperature for up to 30 plies instead of 40 or 50? That is just another piece of domain knowledge, and when you look at the possibility of improvements upon such approaches, I see more domain knowledge. So this "zero" thing is itself fake science. It is like putting a jet engine on a bicycle and saying: hey, it runs so fast, nice work on building the jet engine. And how to improve? Get rid of the jet engine! :D
Again, I could not care less about this either. As many know, using the core Leela Chess training code, I have been working on training NNs that are completely non-zero, using human games, engine games, tablebases, and other changes, so if I am supposed to be defending zero as the end-all of end-alls, you are talking to the wrong person.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."

Albert Silver
Posts: 2839
Joined: Wed Mar 08, 2006 8:57 pm
Location: Rio de Janeiro, Brazil

Re: Alphazero news

Post by Albert Silver » Sun Dec 09, 2018 6:12 pm

Milos wrote:
Sun Dec 09, 2018 5:37 pm
Albert Silver wrote:
Sun Dec 09, 2018 4:34 pm
I think it goes deeper than that. If someone, even someone whose knowledge and technical savvy you deeply respected, had told you a few years ago that they could get a program that was nearly one thousand times slower in NPS to compete with and even beat the best of the day on a PC, I am guessing you would have rolled your eyes at them. I know I would have.

Yet, that is what AlphaZero has done, and we are even able to bring this to the home user's PC thanks to DeepMind's generosity with their knowledge, as well as the fantastic Leela Chess community efforts. In other words, it is not limited to some absurdly exotic hardware no one could ever hope to obtain. I am not even commenting on the whole self-learning process, which is what has been the focus.

What is more, to achieve this, you are looking at an incredibly evolved eval function (not precisely, but it helps illustrate the point) that has roughly 28 million values, compared to a few thousand at most for even the most sophisticated predecessors. In all the years I have seen discussions on the fight between smart searchers and fast searchers, I have never seen anyone come close to imagining how enormous a difference it would take, much less realize and prove it.

It is pure genius.
A0's and LC0's success is 90% due to hardware and 10% due to "smart" software. Everything Google used to create A0 has existed for decades, or at least 5-10 years, and was not invented by Google by any means.
That is not much of an argument. Many, many important advances used parts and elements that existed in one form or another before them. This in no way diminishes their accomplishments.
Even A0 itself, and the way it was presented, was a marketing effort to advertise the Cloud TPU service.
Without the incredibly fast hardware developed mainly in the last 3-4 years, A0 wouldn't exist. So if there is one thing people should be thankful to Google for, it is that by creating their TPU hardware they created competition for NVIDIA, which pushed NVIDIA to create tremendous ML-oriented cards. Those also rendered the Cloud TPU service obsolete ;).
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."

jhellis3
Posts: 390
Joined: Fri Aug 16, 2013 10:36 pm

Re: Alphazero news

Post by jhellis3 » Sun Dec 09, 2018 6:25 pm

Brute-forcing those 28 million parameters of an evaluation function in black-box style is neither efficient nor intelligent.
Actually, I would have to disagree with this. Yes, it is "brute forcing" an eval. But TBs are brute-forced evals, and people find plenty of value in them. The ever-increasing thread count in conventional engines is brute forcing via search, and people find value in that. As to the efficiency question: by what measure is it inefficient? Using fairly simple, straightforward methods, one can train a NN significantly stronger than the best efforts of humanity, in considerably less dev time than has been invested in SF or K or H, for example. So if you look at objective strength vs. effort and time, NNs are quite efficient. And I think realizing this and making use of them is a fairly intelligent thing to do, but that is just me...

*This is not to say I agree with or like all of DM's methods. And I am pretty sure there was some really shady stuff that went on regarding LC0, which makes me incredibly sad. But the simple truth is that NN evals (properly trained) are simply much better than what humans have been able to accomplish to this point. Period.

Milos
Posts: 3387
Joined: Wed Nov 25, 2009 12:47 am

Re: Alphazero news

Post by Milos » Sun Dec 09, 2018 6:27 pm

Albert Silver wrote:
Sun Dec 09, 2018 6:12 pm
Milos wrote:
Sun Dec 09, 2018 5:37 pm
Albert Silver wrote:
Sun Dec 09, 2018 4:34 pm
I think it goes deeper than that. If someone, even someone whose knowledge and technical savvy you deeply respected, had told you a few years ago that they could get a program that was nearly one thousand times slower in NPS to compete with and even beat the best of the day on a PC, I am guessing you would have rolled your eyes at them. I know I would have.

Yet, that is what AlphaZero has done, and we are even able to bring this to the home user's PC thanks to DeepMind's generosity with their knowledge, as well as the fantastic Leela Chess community efforts. In other words, it is not limited to some absurdly exotic hardware no one could ever hope to obtain. I am not even commenting on the whole self-learning process, which is what has been the focus.

What is more, to achieve this, you are looking at an incredibly evolved eval function (not precisely, but it helps illustrate the point) that has roughly 28 million values, compared to a few thousand at most for even the most sophisticated predecessors. In all the years I have seen discussions on the fight between smart searchers and fast searchers, I have never seen anyone come close to imagining how enormous a difference it would take, much less realize and prove it.

It is pure genius.
A0's and LC0's success is 90% due to hardware and 10% due to "smart" software. Everything Google used to create A0 has existed for decades, or at least 5-10 years, and was not invented by Google by any means.
That is not much of an argument. Many, many important advances used parts and elements that existed in one form or another before them. This in no way diminishes their accomplishments.
Even A0 itself, and the way it was presented, was a marketing effort to advertise the Cloud TPU service.
Without the incredibly fast hardware developed mainly in the last 3-4 years, A0 wouldn't exist. So if there is one thing people should be thankful to Google for, it is that by creating their TPU hardware they created competition for NVIDIA, which pushed NVIDIA to create tremendous ML-oriented cards. Those also rendered the Cloud TPU service obsolete ;).
You didn't get my point. What I am saying is that Google was the one to do it because they had the hardware resources, and 3-4 years ago no one else did. If Google hadn't done it, someone else would most probably have done it by today.

jp
Posts: 756
Joined: Mon Apr 23, 2018 5:54 am

Re: Alphazero news

Post by jp » Sun Dec 09, 2018 6:31 pm

jhellis3 wrote:
Sun Dec 09, 2018 6:25 pm
Brute-forcing those 28 million parameters of an evaluation function in black-box style is neither efficient nor intelligent.
Actually, I would have to disagree with this. Yes, it is "brute forcing" an eval. But TBs are brute-forced evals, and people find plenty of value in them. The ever-increasing thread count in conventional engines is brute forcing via search, and people find value in that. As to the efficiency question: by what measure is it inefficient?
Yes, it's brute force. And that is what it means to be inefficient: it requires brute-force computational power to work well. The efficiency question is how much computation it needs.

jhellis3
Posts: 390
Joined: Fri Aug 16, 2013 10:36 pm

Re: Alphazero news

Post by jhellis3 » Sun Dec 09, 2018 6:36 pm

Yes, it's brute force. And that's what it means to be inefficient.
No.

You either did not read what I wrote, misunderstood it, or are being purposefully obtuse. TBs are brute force, but they are hardly inefficient compared to a human trying to write a 100% perfect endgame eval function for 7 pieces or fewer.


EDIT: Or here is a challenge for you if you still don't get it: code a better static eval in less time. I won't hold my breath.
