LC0 on 43 cores had a ~2700 CCRL ELO performance.

George Tsavdaris · Post by **George Tsavdaris** » Wed Apr 18, 2018 9:40 am

Milos wrote: A0 was not better than SF8 even with 4 TPUs. Rigged tests and cherry-picked results mean nothing.

The DeepMind paper states that out of the 1300 games A0 played against Stockfish, the result was +318 =958 -24 in favor of A0, so how is A0 not much, much better than SF?

Do you accuse DeepMind, a research company (that brought a revolution to the Go world and the computer Go world), of fabricating the data in its papers?
If they hadn't given us the 3 papers and hadn't changed the Go world with their astonishing results, we would all believe their claims to be nonsensical, but now there is little doubt that they are 100% genuine.
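
For reference, the +318 =958 -24 score quoted above can be turned into an Elo gap with the standard logistic rating formula. This is only a rough sketch: the function name `elo_diff_from_results` is made up for illustration, and the number says nothing about whether the match conditions were fair.

```python
import math


def elo_diff_from_results(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match score, using the logistic model.

    Hypothetical helper for illustration; draws count as half a point.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    if not 0.0 < score < 1.0:
        raise ValueError("Elo difference is undefined for a 0% or 100% score")
    # Invert expected_score = 1 / (1 + 10 ** (-diff / 400)).
    return 400.0 * math.log10(score / (1.0 - score))


# Figures as quoted in the post: 1300 games, +318 =958 -24 for A0.
a0_margin = elo_diff_from_results(318, 958, 24)
print(f"A0 score: {(318 + 0.5 * 958) / 1300:.1%}, implied gap: {a0_margin:.0f} Elo")
```

Under this model the quoted result is a 61.3% score, which works out to roughly 80 Elo.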

Werewolf · Post by **Werewolf** » Wed Apr 18, 2018 9:49 am

George Tsavdaris wrote:
Milos wrote: A0 was not better than SF8 even with 4 TPUs. Rigged tests and cherry-picked results mean nothing.
The DeepMind paper states that out of the 1300 games A0 played against Stockfish, the result was +318 =958 -24 in favor of A0, so how is A0 not much, much better than SF?

Do you accuse DeepMind, a research company (that brought a revolution to the Go world and the computer Go world), of fabricating the data in its papers?
If they hadn't given us the 3 papers and hadn't changed the Go world with their astonishing results, we would all believe their claims to be nonsensical, but now there is little doubt that they are 100% genuine.

It was a crippled Stockfish though

hgm · Post by **hgm** » Wed Apr 18, 2018 9:53 am

Crippled by having to play under the same conditions as AlphaZero, no doubt.

Werewolf · Post by **Werewolf** » Wed Apr 18, 2018 10:02 am

hgm wrote: Crippled by having to play under the same conditions as AlphaZero, no doubt.

Not so.

There are loads of settings they could have improved if they wanted to for Stockfish.

And there's no defence for using such a tiny hash and having HT on.

duncan · Post by **duncan** » Wed Apr 18, 2018 10:40 am

Werewolf wrote:
George Tsavdaris wrote:
Milos wrote: A0 was not better than SF8 even with 4 TPUs. Rigged tests and cherry-picked results mean nothing.
The DeepMind paper states that out of the 1300 games A0 played against Stockfish, the result was +318 =958 -24 in favor of A0, so how is A0 not much, much better than SF?

Do you accuse DeepMind, a research company (that brought a revolution to the Go world and the computer Go world), of fabricating the data in its papers?
If they hadn't given us the 3 papers and hadn't changed the Go world with their astonishing results, we would all believe their claims to be nonsensical, but now there is little doubt that they are 100% genuine.
It was a crippled Stockfish though

Crippled by how many Elo? 50-70?

George Tsavdaris · Post by **George Tsavdaris** » Wed Apr 18, 2018 10:45 am

Werewolf wrote:
George Tsavdaris wrote:
Milos wrote: A0 was not better than SF8 even with 4 TPUs. Rigged tests and cherry-picked results mean nothing.
The DeepMind paper states that out of the 1300 games A0 played against Stockfish, the result was +318 =958 -24 in favor of A0, so how is A0 not much, much better than SF?

Do you accuse DeepMind, a research company (that brought a revolution to the Go world and the computer Go world), of fabricating the data in its papers?
If they hadn't given us the 3 papers and hadn't changed the Go world with their astonishing results, we would all believe their claims to be nonsensical, but now there is little doubt that they are 100% genuine.
It was a crippled Stockfish though

Crippled in what sense?
It had 64 cores (I can't call that crippled) and the same time control as its opponent.
Only the hash size was stupidly low, but how much could that affect it?

duncan · Post by **duncan** » Wed Apr 18, 2018 10:46 am

Milos wrote: What Daniel is constantly pointing out is that alpha-beta as a general search might not have a future, but there is absolutely zero evidence that NN evaluation is the key. A0 got its performance from a massive number of rollouts thanks to massive hardware. If you used a better MCTS and AB for leaf evaluation instead of an NN, you could get better performance on comparably massive hardware (1k CPU cores).

What about Elo per dollar? Which gives you the better deal: NN, or MCTS with AB for leaf evaluation?

If the latter is not affordable, then it is not the way ahead.

George Tsavdaris · Post by **George Tsavdaris** » Wed Apr 18, 2018 10:48 am

Werewolf wrote:
hgm wrote: Crippled by having to play under the same conditions as AlphaZero, no doubt.
Not so.

There are loads of settings they could have improved if they wanted to for Stockfish.

These are just excuses.
I only know that Stockfish was running on 64 CPUs, which is HUGE.
People complain that the time control was dubious, but both engines had the same time, so I don't see anything wrong with using it. And find me an engine that will beat, actually humiliate, Stockfish on 64 CPUs at that time control, like A0 did.

The small hash was the only stupid decision, but how much could it affect its strength? 5 Elo?

mirek · Post by **mirek** » Wed Apr 18, 2018 11:00 am

Milos wrote: What Daniel is constantly pointing out is that alpha-beta as a general search might not have a future, but there is absolutely zero evidence that NN evaluation is the key. A0 got its performance from a massive number of rollouts thanks to massive hardware. If you used a better MCTS and AB for leaf evaluation instead of an NN, you could get better performance on comparably massive hardware (1k CPU cores).

Once again: for AB engines to compete, you would need some breakthrough in how searched variations get pruned. Over the last 20 years AB engines have become much better at this, yet they were bested by A0 after just a few days of training (on powerful hardware).

How many additional years of hand-written development would you expect it to take for non-NN engines to catch up with A0 in this department? Because they are clearly lagging at the moment. It's not at all obvious to me how you could make the needed huge jump in pruning effectiveness without some serious knowledge of tactical and strategic chess patterns, or how you could then represent them effectively without an NN. If that were possible in the near future, then yes, non-NN engines could become much better than NN ones. But in reality that's not likely to happen, not in a time span in which NN-based engines will also become much better (better NN architecture, better HW, etc.).

Uri Blass · Post by **Uri Blass** » Wed Apr 18, 2018 12:39 pm

mirek wrote:
Milos wrote: What Daniel is constantly pointing out is that alpha-beta as a general search might not have a future, but there is absolutely zero evidence that NN evaluation is the key. A0 got its performance from a massive number of rollouts thanks to massive hardware. If you used a better MCTS and AB for leaf evaluation instead of an NN, you could get better performance on comparably massive hardware (1k CPU cores).
Once again: for AB engines to compete, you would need some breakthrough in how searched variations get pruned. Over the last 20 years AB engines have become much better at this, yet they were bested by A0 after just a few days of training (on powerful hardware).

How many additional years of hand-written development would you expect it to take for non-NN engines to catch up with A0 in this department? Because they are clearly lagging at the moment. It's not at all obvious to me how you could make the needed huge jump in pruning effectiveness without some serious knowledge of tactical and strategic chess patterns, or how you could then represent them effectively without an NN. If that were possible in the near future, then yes, non-NN engines could become much better than NN ones. But in reality that's not likely to happen, not in a time span in which NN-based engines will also become much better (better NN architecture, better HW, etc.).

I think it should be simple to make big progress, but only at long time controls.

All you need is for the eval stage to play against itself, which makes the engine stronger at long time controls.

For example, it could easily detect fortress positions at the leaves, because the result of the self-play is going to be a draw.

It is not only about fortress positions, but also about positions where one side has an advantage that takes a long time to convert into something today's engines understand, and where normal alpha-beta will not find it in a reasonable time.

Note that Stockfish is optimized only for bullet, and of course this idea does not work at bullet.
