I can't believe that so many people don't get it!

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: I can't believe that so many people don't get it!

Post by Lyudmil Tsvetkov »

Rebel wrote:
pilgrimdan wrote:
Rebel wrote:
hgm wrote:
Rebel wrote:All of the document can be true, except that a paragraph of how AZ learned SF8 first was left out.
That would make them die-hard liars. Lying by omission is still lying. It would be considered gross scientific fraud.
Yep.

If I remember correctly you have been doing this stuff even longer than I have, and I would say this AZ thing (provided the conditions of the match meet the scientific standard) is by far the biggest breakthrough in computer chess ever. Would you not agree with me? And the paper doesn't meet the scientific standard. Hence I prefer (as announced in CTF) to stick to my DA role for the moment and discuss every detail until everything is said. People might see that as strange, but I feel it as an obligation.

The paper then. Reading it, I would say the author(s) have a good understanding of computer chess in general and an excellent understanding of the inner workings of a chess program. Some members of the DeepMind team are (maybe even long-time) members and lurk here, because they likely know this is the place where the programmers meet and where their document will be scrutinized. And yet I am to believe they don't know how to properly play a fair match? Is that stupidity? If not stupidity, then what is it?

There are indeed reasons to believe (we discussed it) all 100 games were played from the start position. How stupid is that? And if not stupidity, then what is it? Did they not know you play either from predefined opening lines or from an opening book, if only to avoid duplicate games? They did not know?

Did they not know by doing so they favored AZ?

From the paper we read AZ learned the most common openings and left SF in the dark, not allowing an opening book. They did not know that is unfair?

Of course they knew.

And yet they decided as they decided.

Why?

I consider the "why" question one of the most important questions in life. Everything happens for a reason.

~~~~~

I proposed a working model, learning an opponent from the start position, we even have a proven case (Mchess 5) from the past during the RGCC 96/97 period.

Not showing us all 100 games and the fixed 1-minute TC both fit well into this picture.

Adding everything up, I am a sceptic for good reasons.

I was told that at the Free University (or was it UvA) only two thesis defenses in all of the history of the university had not resulted in granting the Ph.D. degree. In one of them the student appeared stone drunk. The other was for a thesis that discussed an experimental treatment of a certain kind of cancer, which by the 10 case studies treated in the thesis looked very good. And then during questioning, it turned out that the fact that 90 other patients submitted to this same treatment had died had been omitted...
Terrible indeed.
hi Ed...

it may well be like you said they perfectly knew...

one thought...

this may have been the 'optimal conditions' so that alphazero would not lose one single game...

and still make it look plausible...

and as far as the chess programming community...

their 'not having one single loss' no matter how 'odd' the conditions...

is what their only concern was...

if it isolated the computer chess community...

then so be it...

pretty ruthless and cut-throat...

but, obviously they didn't care...

only that AlphaZero did not have any losses...

and the 'setup' they used did the trick...
Well, maybe the paper is meant as a trial balloon to measure the criticism before they make it official on the DeepMind page. AlphaGo is up. The wait is for AlphaZero.

Something else I forgot, the hardware, from the paper:

Code: Select all

AlphaZero NPS 80,000 
Stockfish NPS 70,000,000
For the casual reader they make it appear SF had a giant advantage, but they don't mention the hardware advantage of the TPUs over the SF hardware. It's hard to know the real advantage factor; I have read numbers varying from 50-100 to 1000. It's another unscientific way to present facts.
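To put the raw numbers in perspective, a back-of-the-envelope calculation; the hardware factors are the rumored 50x-1000x estimates mentioned above, not figures from the paper:

```python
# Nodes-per-second figures from the paper; hardware factors are the
# rumored TPU-vs-CPU estimates (50x-1000x), not anything DeepMind published.
az_nps = 80_000
sf_nps = 70_000_000
raw_ratio = sf_nps / az_nps  # SF searches ~875x more nodes per second
for hw_factor in (50, 100, 1000):  # assumed TPU hardware advantage
    effective = raw_ratio / hw_factor
    print(f"hardware factor {hw_factor:>4}: SF node advantage shrinks to ~{effective:.2f}x")
```

Depending on which advantage factor one believes, SF's apparent node advantage shrinks dramatically or even reverses, which is exactly the point being made.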
I bet their eyes are right on this very forum. :D
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: I can't believe that so many people don't get it!

Post by Lyudmil Tsvetkov »

CheckersGuy wrote:They don't make SF appear stronger at all. It's just you and some other people thinking that they did. It's just a fact that AlphaZero looked at far fewer nodes than SF. If one actually reads the paper, it should be clear that AlphaZero had better hardware (at least for the training part).

Additionally, this paper is just a preprint. It does not need to contain all the information as long as the actual paper contains the information that is missing.
Seeing that almost every second post is about DeepMind not providing enough information is really, really stupid, and the only reason we have such posts is that people are too lazy to read...
I guess most people here are just re-printing and re-counting.
And actually, this is the very case.

Looking at 1000 times less nodes, what does that mean?
Where are they spending their time and hardware?
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: I can't believe that so many people don't get it!

Post by Lyudmil Tsvetkov »

hgm wrote:
Rebel wrote:If I remember correctly you are doing this stuff even longer than me and I would say this AZ thing (provided the conditions of the match meet the scientific standard) by far is the biggest breakthrough in computer chess ever. Would you not agree with me?
I agree it is pretty big. But the fact that they did the same thing for Go before somewhat eclipses it. It is well known that Chess is just a trivial game compared to Go. So the large majority of mankind who do not happen to be Chess players will just shrug about it, as much as you would when someone came to you with a Tic Tac Toe program, very excited that it learned all by itself to become unbeatable. "The computer that could master a difficult game can now also master a simple game"... So what?
And the paper doesn't meet the scientific standard.
Well, I used to be a scientist in my professional career, and I don't think there is much wrong with it. If I had been asked to referee this paper, I would just have required one change: that they cannot say "TCEC world champion", and should drop the "world".
Hence I prefer (as announced in CTF) to stick to my DA role for the moment, discuss every detail, until everything is said, people might see that as strange but I feel it as an obligation.
Nothing wrong with being DA. (Reminds me of a movie with a brilliant Al Pacino, BTW.)
There are indeed reasons to believe (we discussed it) all 100 games were played from the start position, how stupid is that?
Aren't Chess games by definition always started from that position? If Carlsen plays Anand for the world title, don't all the games start from that position? I don't think it would officially qualify as 'Chess' when you start from another position. That would be one strong reason to do it.

Starting from known opening lines would weaken their claim that the computer played only from knowledge it learned itself. People could say: "but you started from selected good positions, which the program might not have been able to find by itself. You made it develop its pieces, and left alone it might just have been shuttling its Rook between a1 and a2". That would also be a good reason to start all games from the official opening position.

BTW, they also started games from 12 other positions, but attach less significance to the fact that they also beat Stockfish there.
And if not stupidity then what is it? Did they not know you either play from predefined opening lines or from an opening book? If only it were to avoid doubles. They did not know?
Well, for one, I do often play matches starting from the same position. If the engines randomize enough, there is nothing against that. Problems with duplicates and games that only diverge after being decided do not occur in that case.

So it seems indeed a legitimate question whether there was enough game diversity in 100 games. This is a reason to request publication of all 100 games (which most scientific journals would not like at all; they are not Chess magazines...), or at least adding some text to address the point, like revealing the longest line the games had in common. Something like: "All games had diverged after 15 moves, and on average the number of initial moves shared between games was 6". That would satisfy me. If they would say 80 and 45 instead, there obviously is a problem, close to fraud.
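The diversity statistic asked for here is easy to compute once the games are published. A minimal sketch, with illustrative function names and made-up sample games:

```python
# Measure opening diversity across a set of games, each a list of moves:
# the longest shared opening prefix between any pair, and the average.
def shared_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def diversity_report(games):
    pairs = [(g1, g2) for i, g1 in enumerate(games) for g2 in games[i + 1:]]
    prefixes = [shared_prefix_len(a, b) for a, b in pairs]
    return max(prefixes), sum(prefixes) / len(prefixes)

# Three toy games standing in for the unpublished 100
games = [["e4", "e5", "Nf3", "Nc6"],
         ["e4", "e5", "Nf3", "Nf6"],
         ["d4", "d5"]]
longest, avg = diversity_report(games)
print(longest, avg)  # 3 1.0
```

With the real 100 games, low numbers here would settle the diversity question; high ones would confirm the suspicion.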
Did they not know by doing so they favored AZ?
Of course they did not know that, because it is not true, and they are not stupid... Both were playing without book, so there was no bias.
From the paper we read AZ learned the most common openings and left SF in the dark, not allowing an opening book. They did not know that is unfair?
AZ played the openings from its general Chess knowledge. Common openings are common because they consist of good moves. If Stockfish was any good, it would also prefer common openings. If not... well, perhaps Stockfish isn't as strong as people want to make it out to be. Perhaps winning TCEC doesn't mean as much as people think, so that beating the TCEC winner fair and square doesn't mean a thing, because TCEC is typically won by programs that utterly suck at Chess...
The chess evaluation is around 1000 times more complex than Go's, so how could you say chess is trivial next to Go?

There are a couple of reasons for their success in Go:
- a much simpler evaluation
- Go is much less researched than chess as far as software is concerned
- the much larger board, which makes calculation much more significant than knowledge

So the Go story is just that: a story of an easy success.
They will not solve chess even in 1000 years from now.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: I can't believe that so many people don't get it!

Post by Lyudmil Tsvetkov »

Michael Sherwin wrote:
corres wrote:
Michael Sherwin wrote: AlphaZ beat SF by the use of a 'simple trick' called a learn file with reinforcement learning. RomiChess demonstrated the same 'simple trick' 11 years ago against the world's strongest chess engine at the time beating Rybka.
It has been established that A0 has a learn file that it saves all its trained games in and stores wins, losses, draws and a percentage chance to win. RomiChess does the exact same thing. Here is a record from Romi's learn file.
Record 1 sib 487 chd 2 fs 12 ts 28 t 0 f 0 d 15 s 0 score 17 w 283 L 264 d 191
Record Number
First Sibling Record
First Child Record
From Square
To Square
Type of Move
Flags
Depth
Status
Score, reinforcement learning rewards/penalties
White Wins
Black Wins
Draws
Store a million complete games that have been guided by the stats in the learn file and tactics unlimited ply deep can be found and stored and played back or the search can be guided to find them. It is just a 'simple trick'.
I put 'simple trick' in single quotes because it is a valid trick and not some swindle. If an engine is programmed to do this then more power to it! The wins are legit and if an engine like SF, K or H etc. lose because they don't have this type of learning then tough cookies!
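For readers unfamiliar with the format, the record shown above can be decoded positionally, with the field meanings taken from the list that follows it. This is only an illustrative parser, not RomiChess's actual code:

```python
# Decode one learn-file record of the form posted above. Field names
# follow the posted list; the parser itself is a hypothetical sketch.
from dataclasses import dataclass

@dataclass
class LearnRecord:
    record: int      # record number
    sibling: int     # first sibling record
    child: int       # first child record
    from_sq: int     # from square
    to_sq: int       # to square
    move_type: int   # type of move
    flags: int
    depth: int
    status: int
    score: int       # reinforcement learning rewards/penalties
    white_wins: int
    black_wins: int
    draws: int

def parse_record(line):
    # Keep every second token: "Record 1 sib 487 ..." -> just the numbers
    nums = [int(tok) for tok in line.split()[1::2]]
    return LearnRecord(*nums)

rec = parse_record(
    "Record 1 sib 487 chd 2 fs 12 ts 28 t 0 f 0 d 15 s 0 "
    "score 17 w 283 L 264 d 191")
print(rec.score, rec.white_wins, rec.black_wins, rec.draws)  # 17 283 264 191
```

The sibling/child links make the records a tree of positions, which is what lets a subtree be loaded for the current position, as described below.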
You are basically right.
But can you estimate the size of the learn file that would make Romi a 3400 Elo engine?
It is a pity, but the DeepMind team did not give me any information about the amount of (programmable) memory AlphaZero uses for the neural network. I am afraid a Romi-type engine at 3400 Elo would need much more memory for its learn file than AlphaZero has.
Moreover, a system based on a neural network is more flexible and effective than using a learning file only.
I'm not sure what you are asking but I will give as much information as I can.

Romi's learn file is stored on the hard drive. It is modified on the hard drive. The only part of it that is brought into memory is the subtree of the current position if there is one. And that is stored in the hash table so no extra memory footprint is created.

Romi, being only a 2425 CCRL Elo engine, needs to learn a lot of good moves to win games against far stronger engines. A top engine can take advantage of much less learning, simply because one move may be all it needs. A top engine will show a positive learning curve much sooner.

"Moreover a system based on neural network is more flexible and effective than using a learning file only."

Romi does not use a learn file only. Technically there is no learning in a learn file. It is just data recording results. The real learning happens when the nodes are moved from the data tree to the hash file. The data moved into the hash is what allows the search to learn and hopefully play better moves. Those nodes moved into the hash are each a little nugget of accumulated knowledge that goes beyond the understanding of the eval and results in super human looking play. If an engine that achieves a 3800 elo can play near perfect chess then RL may not help much. If instead the elo ceiling is at 5000 or higher then RL can produce giant gains in elo with enough games. Romi's elo gain is linear in the range of 1 to 1000 elo in only 400 games using only 10 starting positions against one opponent. That is 2.5 elo per game. Against a humongous book and iirc 6 top engines Romi's elo gain was 50 elo per 5000 games.
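The mechanism described here (moving learned win/loss statistics into the hash table so they bias the search toward lines that scored well) might be sketched like this; all names and the scaling constant are assumptions, not Romi's real code:

```python
# Before the search runs, seed the hash table with score biases derived
# from the learned win/loss/draw statistics of the current subtree.
LEARN_BONUS = 10  # centipawns per 10% win-rate edge; an assumed scaling

def prime_hash(hash_table, learn_subtree):
    """learn_subtree: position key -> (wins, losses, draws) from the learn file."""
    for pos_key, (wins, losses, draws) in learn_subtree.items():
        games = wins + losses + draws
        if games == 0:
            continue
        # Positive if this line scored well in past games, negative if not
        edge = (wins - losses) / games
        hash_table[pos_key] = {"learn_bias": int(edge * 10 * LEARN_BONUS)}

table = {}
prime_hash(table, {0xABC: (283, 264, 191)})  # stats from the sample record
print(table[0xABC])
```

The search then sees these entries exactly as it sees ordinary hash hits, which is why the learned knowledge can influence play without any change to the evaluation function.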
If you are able to draw SF at 2425, why would not Alpha be able to beat SF at a bit higher level?
Actually, are not they just 2500?

What is your score against SF after 100 games?
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: I can't believe that so many people don't get it!

Post by Lyudmil Tsvetkov »

Michael Sherwin wrote:
mjlef wrote:Although AlphaZero's neural network could indeed learn specific opening moves, it does a lot more. The neural network is used to do these two things:

a. predict the probability that a move will be played (actually, the position after the move is made)
b. predict the probability that a specific position is a win/loss/draw

These outputs are used to guide the search.

What you describe is a scheme to learn better openings. But you have not mentioned whether your scheme helps for the entire game. Based on your description, it does not seem to affect either the evaluation or the search once it reaches positions not in the previous games. AlphaZero is a generalized neural network which outputs the above information even for positions it has never seen.
1. Experiential data with a reinforcement scheme is brought into the hash file.
2. The search produces statistical information that modifies the pst's of the evaluator
3. The modified evaluator then affects the search which affects the evaluator.
4. This lasts long after the experiential data has run out.
5. There is a lasting generalized effect that is not static but can morph as the needs of the position change.

It is very crude but it works.
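The two NN outputs mjlef lists are combined in AlphaZero's search roughly as a PUCT selection rule: the value output supplies Q, the policy output supplies the prior P. A simplified sketch with made-up numbers:

```python
# PUCT-style move selection at one MCTS node: balance the averaged value
# estimate (Q) against the NN's prior (P), discounted by visit count.
import math

def select_move(edges, c_puct=1.5):
    """edges: move -> dict(prior=P, visits=N, value_sum=W)."""
    total_visits = sum(e["visits"] for e in edges.values())
    def puct(e):
        q = e["value_sum"] / e["visits"] if e["visits"] else 0.0  # mean value
        u = c_puct * e["prior"] * math.sqrt(total_visits) / (1 + e["visits"])
        return q + u
    return max(edges, key=lambda m: puct(edges[m]))

# Illustrative node: d4 has fewer visits but a high prior, so the
# exploration term favors it over the heavily visited e4.
edges = {"e4": {"prior": 0.50, "visits": 10, "value_sum": 6.0},
         "d4": {"prior": 0.40, "visits": 2,  "value_sum": 0.8},
         "h4": {"prior": 0.01, "visits": 0,  "value_sum": 0.0}}
print(select_move(edges))  # d4
```

This is why AlphaZero can afford to look at vastly fewer nodes: the prior prunes implausible moves like h4 almost for free, while the visit-count discount still forces some exploration.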
I guess the first thing they should say is whether they use an evaluation or not.
In case they do, how many terms do they have, and what are those?

Is not that all about the NN? Why are they not disclosing the most important part?

Any guess why an approach that has produced engines of at most around 2400 Elo on normal hardware would suddenly start working on their machines?

No sense to this, simply no sense, sorry.
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: I can't believe that so many people don't get it!

Post by Evert »

Lyudmil Tsvetkov wrote: The chess evaluation is around 1000 times more complex than Go's, so how could you say chess is trivial next to Go?

There are a couple of reasons for their success in Go:
- a much simpler evaluation
- Go is much less researched than chess as far as software is concerned
- the much larger board, which makes calculation much more significant than knowledge
Oh yeah, Go is much easier to handle than Chess. That's why computers have completely dominated at Go for decades while they still struggle against humans in Chess.

Sorry, what parallel universe are you from, exactly?
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: I can't believe that so many people don't get it!

Post by hgm »

Lyudmil Tsvetkov wrote:Looking at 1000 times less nodes, what does that mean?
Where are they spending their time and hardware?
They spend the hardware on running the NN, to let it think about which move to search in every tree node.
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: I can't believe that so many people don't get it!

Post by Michael Sherwin »

Lyudmil Tsvetkov wrote:
Michael Sherwin wrote:
corres wrote:
Michael Sherwin wrote: AlphaZ beat SF by the use of a 'simple trick' called a learn file with reinforcement learning. RomiChess demonstrated the same 'simple trick' 11 years ago against the world's strongest chess engine at the time beating Rybka.
It has been established that A0 has a learn file that it saves all its trained games in and stores wins, losses, draws and a percentage chance to win. RomiChess does the exact same thing. Here is a record from Romi's learn file.
Record 1 sib 487 chd 2 fs 12 ts 28 t 0 f 0 d 15 s 0 score 17 w 283 L 264 d 191
Record Number
First Sibling Record
First Child Record
From Square
To Square
Type of Move
Flags
Depth
Status
Score, reinforcement learning rewards/penalties
White Wins
Black Wins
Draws
Store a million complete games that have been guided by the stats in the learn file and tactics unlimited ply deep can be found and stored and played back or the search can be guided to find them. It is just a 'simple trick'.
I put 'simple trick' in single quotes because it is a valid trick and not some swindle. If an engine is programmed to do this then more power to it! The wins are legit and if an engine like SF, K or H etc. lose because they don't have this type of learning then tough cookies!
You are basically right.
But can you estimate the size of the learn file that would make Romi a 3400 Elo engine?
It is a pity, but the DeepMind team did not give me any information about the amount of (programmable) memory AlphaZero uses for the neural network. I am afraid a Romi-type engine at 3400 Elo would need much more memory for its learn file than AlphaZero has.
Moreover, a system based on a neural network is more flexible and effective than using a learning file only.
I'm not sure what you are asking but I will give as much information as I can.

Romi's learn file is stored on the hard drive. It is modified on the hard drive. The only part of it that is brought into memory is the subtree of the current position if there is one. And that is stored in the hash table so no extra memory footprint is created.

Romi, being only a 2425 CCRL Elo engine, needs to learn a lot of good moves to win games against far stronger engines. A top engine can take advantage of much less learning, simply because one move may be all it needs. A top engine will show a positive learning curve much sooner.

"Moreover a system based on neural network is more flexible and effective than using a learning file only."

Romi does not use a learn file only. Technically there is no learning in a learn file. It is just data recording results. The real learning happens when the nodes are moved from the data tree to the hash file. The data moved into the hash is what allows the search to learn and hopefully play better moves. Those nodes moved into the hash are each a little nugget of accumulated knowledge that goes beyond the understanding of the eval and results in super human looking play. If an engine that achieves a 3800 elo can play near perfect chess then RL may not help much. If instead the elo ceiling is at 5000 or higher then RL can produce giant gains in elo with enough games. Romi's elo gain is linear in the range of 1 to 1000 elo in only 400 games using only 10 starting positions against one opponent. That is 2.5 elo per game. Against a humongous book and iirc 6 top engines Romi's elo gain was 50 elo per 5000 games.
If you are able to draw SF at 2425, why would not Alpha be able to beat SF at a bit higher level?
Actually, are not they just 2500?

What is your score against SF after 100 games?
Without learning Romi is a 2400 level engine. With learning Romi's elo would rise. The problem is that the only person who cared, died. And since then no organized tournament or rating agency allows Romi to learn if it plays. Romi would climb in the rating list if it was allowed to use its learning.

Against multithreaded SF, because it constantly changes its play, Romi cannot show any gain in 100 games, except if the starting position is already highly favorable to Romi. Against single-threaded SF, if that means SF will always play the same, then in 100 games I suspect Romi would win. I have not tested that since Glaurung days. I'll test it now.
CheckersGuy
Posts: 273
Joined: Wed Aug 24, 2016 9:49 pm

Re: I can't believe that so many people don't get it!

Post by CheckersGuy »

Lyudmil Tsvetkov wrote:
hgm wrote:
Rebel wrote:If I remember correctly you are doing this stuff even longer than me and I would say this AZ thing (provided the conditions of the match meet the scientific standard) by far is the biggest breakthrough in computer chess ever. Would you not agree with me?
I agree it is pretty big. But the fact that they did the same thing for Go before somewhat eclipses it. It is well known that Chess is just a trivial game compared to Go. So the large majority of mankind who do not happen to be Chess players will just shrug about it, as much as you would when someone came to you with a Tic Tac Toe program, very excited that it learned all by itself to become unbeatable. "The computer that could master a difficult game can now also master a simple game"... So what?
And the paper doesn't meet the scientific standard.
Well, I used to be a scientist in my professional career, and I don't think there is much wrong with it. If I had been asked to referee this paper, I would just have required one change: that they cannot say "TCEC world champion", and should drop the "world".
Hence I prefer (as announced in CTF) to stick to my DA role for the moment, discuss every detail, until everything is said, people might see that as strange but I feel it as an obligation.
Nothing wrong with being DA. (Reminds me of a movie with a brilliant Al Pacino, BTW.)
There are indeed reasons to believe (we discussed it) all 100 games were played from the start position, how stupid is that?
Aren't Chess games by definition always started from that position? If Carlsen plays Anand for the world title, don't all the games start from that position? I don't think it would officially qualify as 'Chess' when you start from another position. That would be one strong reason to do it.

Starting from known opening lines would weaken their claim that the computer played only from knowledge it learned itself. People could say: "but you started from selected good positions, which the program might not have been able to find by itself. You made it develop its pieces, and left alone it might just have been shuttling its Rook between a1 and a2". That would also be a good reason to start all games from the official opening position.

BTW, they also started games from 12 other positions, but attach less significance to the fact that they also beat Stockfish there.
And if not stupidity then what is it? Did they not know you either play from predefined opening lines or from an opening book? If only it were to avoid doubles. They did not know?
Well, for one, I do often play matches starting from the same position. If the engines randomize enough, there is nothing against that. Problems with duplicates and games that only diverge after being decided do not occur in that case.

So it seems indeed a legitimate question whether there was enough game diversity in 100 games. This is a reason to request publication of all 100 games (which most scientific journals would not like at all; they are not Chess magazines...), or at least adding some text to address the point, like revealing the longest line the games had in common. Something like: "All games had diverged after 15 moves, and on average the number of initial moves shared between games was 6". That would satisfy me. If they would say 80 and 45 instead, there obviously is a problem, close to fraud.
Did they not know by doing so they favored AZ?
Of course they did not know that, because it is not true, and they are not stupid... Both were playing without book, so there was no bias.
From the paper we read AZ learned the most common openings and left SF in the dark, not allowing an opening book. They did not know that is unfair?
AZ played the openings from its general Chess knowledge. Common openings are common because they consist of good moves. If Stockfish was any good, it would also prefer common openings. If not... well, perhaps Stockfish isn't as strong as people want to make it out to be. Perhaps winning TCEC doesn't mean as much as people think, so that beating the TCEC winner fair and square doesn't mean a thing, because TCEC is typically won by programs that utterly suck at Chess...
The chess evaluation is around 1000 times more complex than Go's, so how could you say chess is trivial next to Go?

There are a couple of reasons for their success in Go:
- a much simpler evaluation
- Go is much less researched than chess as far as software is concerned
- the much larger board, which makes calculation much more significant than knowledge

So the Go story is just that: a story of an easy success.
They will not solve chess even in 1000 years from now.
No one said that they will solve chess; that's a completely different topic. No matter how strong AlphaZero is (even if it were 6000 Elo), that wouldn't solve chess at all.
User avatar
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: I can't believe that so many people don't get it!

Post by Rebel »

CheckersGuy wrote:Additionally, this paper is just a preprint
And a very biased one.

Google knows exactly the hardware advantage they had during the match, and it isn't mentioned in the paper.