Alphazero news

Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: Alphazero news

Post by Sesse »

Milos wrote: Sun Dec 09, 2018 3:47 am
Sesse wrote: Sun Dec 09, 2018 1:41 am I didn't say Google was devoted to open source. I said they open sourced a lot of stuff.
They open source only for PR purposes. They are far less sincere about open source than Microsoft which is in itself really ironic.
I'm not sure how you arrive at that conclusion. Google in general open-sources software for a lot of reasons, probably fairly similar to how other companies that deal with OSS work. Look at the amount of stuff flowing into the Linux kernel: is that “only for PR purposes”? If so, that's really expensive PR.
Why would something so basic as neural network architecture model not be compatible between open source and internal TF????
Have you ever tried to maintain a closed and open version of anything in parallel? I can tell you it's a pain in the neck.
If that was really the case, that would be one more hell of an argument that anything that Google open sourced was to gain PR or increase their revenue and that it has nothing to do with true spirit of open source.
…what? This makes no sense.
No other private engine runs a PR campaign to sell its online services (like cloud TPUs).
You mean like the Rybka cluster?
they finally decided to really publish something and give us a little bit more insight and ppl instantly feel like everyone should be enormously grateful to them.
You're aware that this paper was submitted a year ago, right, long before the RTX series was announced? Google had no say in the lead time here; that was up to the journal.
You are not working in Google PR department, why so much need to defend them?
If I'm defending them even if I don't work in the Google PR department, perhaps it's on what I perceive as a factual basis as opposed to just fanboyism?

To flip it around, you're not working in FairSearch.org, why so much need to attack Google?
And no I don't hold the same stance towards other private engines, coz none of them is created by a giant mean corporation
Ah, so you're having different standards for evidence for AlphaZero because you don't like the company behind it (a “giant mean corporation”, in your own words). Well, that's good to know, but it's hardly objective.
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Alphazero news

Post by Albert Silver »

noobpwnftw wrote: Sun Dec 09, 2018 10:17 pm
Albert Silver wrote: Sun Dec 09, 2018 10:03 pm
noobpwnftw wrote: Sun Dec 09, 2018 9:33 pm
Albert Silver wrote: Sun Dec 09, 2018 9:29 pm You are missing the point. The development on a basis of winning chances has led to a very different basis for making decisions with far-reaching consequences.
If I understand your concept of "very different basis for making decisions with far-reaching consequences" correctly, are you referring to the most widely used method by opening books and chess databases?
I am referring to the engine, not the opening book or chess databases.
But the fact is that using a winning chance representation is not new, and it has been found to be functionally equivalent to a centipawn representation. It is a compromise because the evaluation is not perfect. Switching from centipawn to "centiking" does not remove that compromise in any way.

People happen to use it because usually NNs have an output range of [-1, 1], and converting it linearly gets you the "winning chance".
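The linear conversion mentioned above can be sketched as follows. This is only an illustration: the function names are made up here, and the logistic mapping back to a centipawn-like score (including its scale of 400) is an assumption chosen for the example, not the formula of any particular engine.

```python
import math


def value_to_win_chance(v: float) -> float:
    """Map a network value in [-1, 1] linearly onto a winning chance in [0, 1]."""
    return (v + 1.0) / 2.0


def win_chance_to_centipawns(p: float, scale: float = 400.0) -> float:
    """Illustrative monotonic mapping from winning chance to a centipawn-like score.

    The logistic form and the scale constant are assumptions for this sketch.
    """
    # Clamp so that p = 0 or p = 1 does not produce an infinite score.
    p = min(max(p, 1e-6), 1.0 - 1e-6)
    return scale * math.log10(p / (1.0 - p))


if __name__ == "__main__":
    for v in (-0.5, 0.0, 0.5):
        p = value_to_win_chance(v)
        print(v, p, round(win_chance_to_centipawns(p), 1))
```

Because both mappings are monotonic, move ordering by one score is the same as move ordering by the other, which is the sense in which the two representations are interchangeable.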
You are still missing the point. The winning chance representation is not what is important.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Alphazero news

Post by Milos »

Sesse wrote: Sun Dec 09, 2018 11:07 pm I'm not sure how you arrive at that conclusion. Google in general open-sources software for a lot of reasons, probably fairly similar to how other companies that deal with OSS work. Look at the amount of stuff flowing into the Linux kernel: is that “only for PR purposes”? If so, that's really expensive PR.
No, that's for revenue, like all these new ways they are trying to generate revenue from Android.
Why would something so basic as neural network architecture model not be compatible between open source and internal TF????
Have you ever tried to maintain a closed and open version of anything in parallel? I can tell you it's a pain in the neck.
One of the most advanced software companies in the world can't maintain compatibility between the core model of their core open-source framework (TF) and its in-house version because that is extremely complicated???
And not because they don't care, or because they most probably want to differentiate intentionally and offer an "advanced" version of TF as a paid service in the future?
You really think ppl are sheep, don't you?
No other private engine runs a PR campaign to sell its online services (like cloud TPUs).
You mean like the Rybka cluster?
Rybka running a PR campaign by publishing papers and holding private matches against what, Robbolito? Are you out of your mind?
Even if that was true, that was almost 10 years ago. What is next, are you gonna bring up Deep Blue?
Straw-man is an understatement for this.
You're aware that this paper was submitted a year ago, right, long before the RTX series was announced? Google had no say in the lead time here; that was up to the journal.
Not true, the paper was submitted in March and RTX was announced in August, but unofficially it was very well known, well before March this year, that the 20xx generation would have tensor core support and be similar in performance to the Titan V (which already existed).
Plus, for a paper presented as so ground-breaking, taking 9 months (or more than 12, if one counts their claim that the preprint had already been submitted, and probably rejected) to publish in Science is a hell of a lot of time, indicating that it might have passed through on "political" intervention of the chief editor, not because reviewers were satisfied. One more indication is the discrepancy between that PR stunt preprint on arXiv, which was allegedly already submitted to Science, and the actual submission 4 months later.
If I'm defending them even if I don't work in the Google PR department, perhaps it's on what I perceive as a factual basis as opposed to just fanboyism?

To flip it around, you're not working in FairSearch.org, why so much need to attack Google?
C'mon man, you claim you worked for them, you clearly wish to work for them again, you have vested interest as large as a house and you are talking about objectivity, seriously?
And no I don't hold the same stance towards other private engines, coz none of them is created by a giant mean corporation
Ah, so you're having different standards for evidence for AlphaZero because you don't like the company behind it (a “giant mean corporation”, in your own words). Well, that's good to know, but it's hardly objective.
I don't work for any competitor of Google, nor do I have any financial or professional interest to "create negativity" towards them as you put it. I have a different standard because Google as a company is known to lie, cheat, break laws, bribe, be a total monopolist, etc, etc.
I can't really help you coz you see them exclusively through your rose-colored glasses...
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Alphazero news

Post by corres »

Sesse wrote: Sun Dec 09, 2018 10:52 pm
corres wrote: Sun Dec 09, 2018 8:07 am For reverse engineering, one needs to get the studied object working.
No, this is incorrect.
What kind of reverse engineering do you mean?
From an obfuscated program one can get its source (or at least something that works similarly) if the program is running.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Alphazero news

Post by Daniel Shawul »

@Matthew Lai

The following pseudo code for Dirichlet noise seems to be wrong

Code:

# At the start of each search, we add dirichlet noise to the prior of the root
# to encourage the search to explore new actions.
def add_exploration_noise(config: AlphaZeroConfig, node: Node):
  actions = node.children.keys()
  noise = numpy.random.gamma(config.root_dirichlet_alpha, 1, len(actions))
  frac = config.root_exploration_fraction
  for a, n in zip(actions, noise):
    node.children[a].prior = node.children[a].prior * (1 - frac) + n * frac
Python has np.random.dirichlet(), so one can use that directly. But in C++11 only the gamma distribution is available, so one can use that to generate Dirichlet noise, but it needs normalization as shown here:

https://en.wikipedia.org/wiki/Dirichlet ... tributions

The pseudo-code lacks the normalization. Btw is this applied even after 30 moves are made ?
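A minimal sketch of the normalized version, for illustration only. The `Node` class below is a hypothetical stand-in for the one in the published pseudo-code, and the function takes `alpha` and `frac` directly instead of a config object; it assumes at least one gamma sample is non-zero so the sum can be divided by.

```python
import numpy as np


class Node:
    """Hypothetical minimal stand-in for the pseudo-code's Node."""

    def __init__(self, prior: float):
        self.prior = prior
        self.children = {}


def add_exploration_noise(node: Node, alpha: float, frac: float, rng=None) -> None:
    """Mix Dirichlet(alpha) noise into the priors of the root's children."""
    rng = rng or np.random.default_rng()
    actions = list(node.children.keys())
    # Independent Gamma(alpha, 1) samples ...
    gamma = rng.gamma(alpha, 1.0, len(actions))
    # ... divided by their sum give one Dirichlet(alpha, ..., alpha) sample.
    noise = gamma / gamma.sum()
    for a, n in zip(actions, noise):
        child = node.children[a]
        child.prior = child.prior * (1 - frac) + n * frac


if __name__ == "__main__":
    root = Node(1.0)
    for move in ("e4", "d4", "c4", "Nf3"):
        root.children[move] = Node(0.25)
    # alpha and frac here are illustrative values only.
    add_exploration_noise(root, alpha=0.3, frac=0.25)
    print({m: round(c.prior, 3) for m, c in root.children.items()})
```

With the normalization in place the noise sums to 1, so priors that summed to 1 before mixing still sum to 1 afterwards. In Python the two gamma lines could be replaced by a single call to `rng.dirichlet([alpha] * len(actions))`.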
M ANSARI
Posts: 3707
Joined: Thu Mar 16, 2006 7:10 pm

Re: Alphazero news

Post by M ANSARI »

I don't get some of the negative talk about A0 and the constant bickering about test conditions ... have you guys played through some of those games ??? Obviously there is something "game changing" happening. If you go through the games of some of the human chess geniuses of our time, on a really good day they will come up with some amazing plans that seem incredible and, without any tangible short-term tactical advantage, are based purely on intuition. But in many of these games the initial intuition is correct and then some slight tactical inaccuracy totally screws up the game. I feel A0 gives you that but with a much more developed tactical awareness. I can't help thinking that a perfect engine would be something like A0, but with an SF-type engine running passively and covering some deep tactical blind spots that A0 might miss. This engine would only flag a move if there is a large fail due to a tactic. Actually, now that I think of it, the standard SF is probably not the best for this and a tactical version would probably do better as a helper. I always dreamed of an engine that could use a strong CPU-based engine and have a daughter card to do a Monte Carlo search passively. Maybe a better idea is to use something like A0 as the main engine on the daughter card and then have the CPU-based engine do the sanity check.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Alphazero news

Post by Daniel Shawul »

M ANSARI wrote: Mon Dec 10, 2018 6:09 am I don't get some of the negative talk about A0 and the constant bickering about test conditions ... have you guys played through some of those games ??? Obviously there is something "game changing" happening. If you go through the games of some of the human chess geniuses of our time, on a really good day they will come up with some amazing plans that seem incredible and, without any tangible short-term tactical advantage, are based purely on intuition. But in many of these games the initial intuition is correct and then some slight tactical inaccuracy totally screws up the game. I feel A0 gives you that but with a much more developed tactical awareness. I can't help thinking that a perfect engine would be something like A0, but with an SF-type engine running passively and covering some deep tactical blind spots that A0 might miss. This engine would only flag a move if there is a large fail due to a tactic. Actually, now that I think of it, the standard SF is probably not the best for this and a tactical version would probably do better as a helper. I always dreamed of an engine that could use a strong CPU-based engine and have a daughter card to do a Monte Carlo search passively. Maybe a better idea is to use something like A0 as the main engine on the daughter card and then have the CPU-based engine do the sanity check.
This is very easy to do and will avoid the tactical blunders of MCTS engines.
Say for the first 30% of the time, you do a multi-PV search using Stockfish and get actual scores for all root moves.
Then for the rest of the time, you do an MCTS-NN search and get scores for all root moves.
Then you pick the move with the highest (ab + mcts)/2 score. If the move is a tactical blunder it won't be picked; the move with the highest tactical+strategic score will be picked. You can do this root move picking during the MCTS simulations to guide the search towards something that optimizes both tactics and strategy.
I already do these things in scorpio-mcts-nn and it doesn't blunder like lc0 does.

I think with this approach it is better to use standard Stockfish than something adapted to perform well on tactics only.
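The root-move averaging described above can be sketched roughly as below. This is not Scorpio's code: the function name and the move/score dictionaries are hypothetical, and the sketch assumes both searches already report scores on the same scale (here a winning chance in [0, 1]), which means alpha-beta centipawn scores would first have to be converted.

```python
def pick_root_move(ab_scores: dict, mcts_scores: dict) -> str:
    """Return the root move with the highest (ab + mcts) / 2 score.

    Both arguments map a move to a score on the same scale (assumed here
    to be a winning chance in [0, 1]).
    """
    # Only moves scored by both searches can be averaged.
    common = sorted(ab_scores.keys() & mcts_scores.keys())
    if not common:
        raise ValueError("no root move was scored by both searches")
    return max(common, key=lambda m: (ab_scores[m] + mcts_scores[m]) / 2.0)


if __name__ == "__main__":
    # Made-up numbers: the MCTS search likes Qxb7, but alpha-beta refutes it.
    ab = {"e4": 0.55, "d4": 0.52, "Qxb7": 0.10}
    mcts = {"e4": 0.58, "d4": 0.60, "Qxb7": 0.70}
    print(pick_root_move(ab, mcts))
```

In the made-up example the move the MCTS side prefers gets a low alpha-beta score, so its average drops below the quieter alternatives and it is not picked.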
Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: Alphazero news

Post by Sesse »

corres wrote: Mon Dec 10, 2018 12:52 am What kind of reverse engineering do you mean?
I've done reverse engineering through static analysis, both professionally and as a hobby. A lot of that was done without ever running a line of the program (e.g. decompiling Mac software without ever having owned a Mac).
From an obfuscated program one can get its source (or at least something that works similarly) if the program is running.
You don't need the program to be running for that, just as much as you don't need a C compiler to understand what a C program listing does.
Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: Alphazero news

Post by Sesse »

Milos wrote: Mon Dec 10, 2018 12:24 am
Sesse wrote: Sun Dec 09, 2018 11:07 pm I'm not sure how you arrive at that conclusion. Google in general open-sources software for a lot of reasons, probably fairly similar to how other companies that deal with OSS work. Look at the amount of stuff flowing into the Linux kernel: is that “only for PR purposes”? If so, that's really expensive PR.
No, that's for revenue, like all these new ways they are trying to generate revenue from Android.
OK, so you're backtracking, and it's not all for PR after all.

Of course Google does open-sourcing because it's good for business. You're setting up a false dichotomy—basically that they're either flawless altruists or they're evil and always lying through their teeth. I'd claim they are neither.
C'mon man, you claim you worked for them, you clearly wish to work for them again, you have vested interest as large as a house and you are talking about objectivity, seriously?
Uhm, do you think I'm posting on a chess forum to suck up, so that I can get back the job I left? If I for whatever reason would wish to work for Google again, I need to make one phone call, thank you very much. Maybe you should back down on the ad hominems.
I don't work for any competitor of Google, nor do I have any financial or professional interest to "create negativity" towards them as you put it.
OK, so now we're at using quotes around words I didn't actually say.
I have a different standard because Google as a company is known to lie, cheat, break laws, bribe, be a total monopolist, etc, etc.
I can't really help you coz you see them exclusively through your rose-colored glasses...
I'll just let this stand here.
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: Alphazero news

Post by matthewlai »

Daniel Shawul wrote: Mon Dec 10, 2018 5:41 am @Matthew Lai

The following pseudo code for Dirichlet noise seems to be wrong

Code:

# At the start of each search, we add dirichlet noise to the prior of the root
# to encourage the search to explore new actions.
def add_exploration_noise(config: AlphaZeroConfig, node: Node):
  actions = node.children.keys()
  noise = numpy.random.gamma(config.root_dirichlet_alpha, 1, len(actions))
  frac = config.root_exploration_fraction
  for a, n in zip(actions, noise):
    node.children[a].prior = node.children[a].prior * (1 - frac) + n * frac
Python has np.random.dirichlet(), so one can use that directly. But in C++11 only the gamma distribution is available, so one can use that to generate Dirichlet noise, but it needs normalization as shown here:

https://en.wikipedia.org/wiki/Dirichlet ... tributions

The pseudo-code lacks the normalization. Btw is this applied even after 30 moves are made ?
Thanks Daniel. That's not a part of the code I am very familiar with, but having checked our C++ code, I think you are right and this is an error transcribing to pseudo-code. We do have normalization in our C++ code.

It is used for the entire game (during training). Unlike visit count sampling, Dirichlet noise wouldn't cause us to blunder (since there's still a search on top of it), so it's safe to use for the whole game.
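For contrast, visit count sampling could look roughly like the sketch below. The function name and signature are hypothetical, sampling in proportion to raw visit counts is an assumption of this example, and the 30-move cutoff is simply the figure from Daniel's question. It also assumes at least one root move has been visited.

```python
import numpy as np


def select_move(visit_counts: dict, move_number: int,
                num_sampling_moves: int = 30, rng=None):
    """Pick a move from root visit counts.

    Early in the game (move_number < num_sampling_moves) a move is sampled
    in proportion to its visit count; afterwards the most visited move is
    played.
    """
    moves = list(visit_counts)
    counts = np.array([visit_counts[m] for m in moves], dtype=float)
    if move_number < num_sampling_moves:
        rng = rng or np.random.default_rng()
        probs = counts / counts.sum()
        return moves[rng.choice(len(moves), p=probs)]
    return moves[int(counts.argmax())]


if __name__ == "__main__":
    counts = {"e4": 500, "d4": 300, "h4": 5}
    print(select_move(counts, move_number=3))   # may occasionally return "h4"
    print(select_move(counts, move_number=45))  # always the most visited move
```

Sampling can play a rarely visited move outright, whereas Dirichlet noise only perturbs the root priors and still leaves the final choice to the search, which is the difference pointed out above.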
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.