LCzero sacs a knight for nothing

jdart · Post by **jdart** » Fri Apr 20, 2018 4:01 am

Therefore my conclusions are that at 1 min / move A0@1080Ti is playing at level comparable to SF8@64 cores.

That may be so. But the TCEC version of LC0 is doing badly in Division 4. It is not even going to get to play Stockfish.

--Jon

Daniel Shawul · Post by **Daniel Shawul** » Fri Apr 20, 2018 4:57 am

Why this argument again? How many times have I already mentioned that A0 on 1080TI + 1min / move should be of comparable strength to SF8@64 cores 1min / move?

A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops, so that is an 11X hardware advantage. Why not use the same 64 CPU cores for it and see if it will beat Stockfish?
You have elevated the minimum hardware requirement for A0 to be competitive with Stockfish to 11 x 64 ≈ 700 CPU cores, each thinking for 1 min per move. So if you had used one core, which is the standard in chess rating lists, you would have to use a time control of about 11 hours per move.

What I know is that their result against Stockfish was obtained with 4 TPUs (180 TFlops), which is a 180x advantage. Do you really think the result would be the same if A0 used the 64 cores?

In any case you are backing off from your "bet" that the net is going to improve the tactics. Now you insist on some form of hardware advantage to cover for tactical weakness. Which is it?
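
The core-equivalence arithmetic in this post can be checked with a few lines of Python. This is only a sketch of the argument as stated: the TFlops figures are the ones quoted above, not measurements, and the function names are made up for illustration. It also assumes that engine strength scales linearly with raw FLOPS and with core-minutes, which is exactly the assumption the replies below dispute.

```python
# Figures quoted in the post; nothing here is measured.
GPU_TFLOPS = 11.0           # GTX 1080 Ti, as quoted
CPU_TFLOPS_64_CORES = 1.0   # 64 CPU cores, as quoted
CPU_CORES = 64
MINUTES_PER_MOVE = 1.0


def equivalent_cores(gpu_tflops, cpu_tflops, cpu_cores):
    """CPU cores needed to match the GPU on raw FLOPS (assumes linear scaling)."""
    return gpu_tflops / cpu_tflops * cpu_cores


def single_core_hours(cores, minutes_per_move):
    """Hours one core needs to spend the same core-minutes (assumes perfect scaling)."""
    return cores * minutes_per_move / 60.0


cores = equivalent_cores(GPU_TFLOPS, CPU_TFLOPS_64_CORES, CPU_CORES)
hours = single_core_hours(cores, MINUTES_PER_MOVE)
print(f"{cores:.0f} cores, {hours:.1f} hours per move on one core")
```

So the "700 cores" and "11 hours" in the post are rounded from 704 cores and roughly 11.7 hours.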

Michel · Post by **Michel** » Fri Apr 20, 2018 8:19 am

GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops so that is an 11X hardware advantage.

I don't think you can see it like that. GPUs are optimized for floating point operations but they are poor at other types of operations that CPUs excel in.

I am not an expert on this but I am sure that by focusing on some other type of operation you would be equally able to argue that SF had a hardware advantage.

smatovic · Post by **smatovic** » Fri Apr 20, 2018 9:14 am

"A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops so that is an 11X hardware advantage."

Maybe, but Stockfish uses about 10 esoteric tree pruning techniques over the simple MCTS
of LC0...what kind of advantage is this?

LC0 has existed since January 2018; do you think there will be no progress in improving
the Monte-Carlo methods?

I am sure the LC0 team has plenty of improvements in the pipeline,
but they have to reproduce the A0 results first...

--
Srdja

mirek · Post by **mirek** » Fri Apr 20, 2018 10:27 am

Daniel Shawul wrote: A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops, so that is an 11X hardware advantage. Why not use the same 64 CPU cores for it and see if it will beat Stockfish?
You have elevated the minimum hardware requirement for A0 to be competitive with Stockfish to 11 x 64 ≈ 700 CPU cores, each thinking for 1 min per move. So if you had used one core, which is the standard in chess rating lists, you would have to use a time control of about 11 hours per move.

Ah, this again? So comparing performance per $ or performance per Watt obviously doesn't concern you, right? I can just see a team of scientists deciding whether to run their simulation on 10x1080Ti or 7000 CPU cores, and in the end going for the CPUs because, while the GPUs would be cheaper, they would also provide an unfair advantage over the CPUs.

We will see what HW and engines e.g. correspondence players will use once LC0 gets to A0 level. If you want to insist that 1 CPU core is the only correct metric on which to measure engine strength, then have it your way. But don't be surprised if in a correspondence game you get completely smashed by an engine which, according to your "correct 1 core rating", is some 500 Elo weaker than your alpha-beta searcher.

Daniel Shawul wrote: In any case you are backing off from your "bet" that the net is going to improve the tactics. Now you insist on some form of hardware advantage to cover for tactical weakness. Which is it?

I am not backing off. The test is independent of hardware: we can just let it calculate roughly as many nodes as were used in the game before the blunder was played and see how it goes with new weights, e.g. at weekly intervals, until the net gets resized again.

jkiliani · Post by **jkiliani** » Fri Apr 20, 2018 10:34 am

mirek wrote:
Daniel Shawul wrote: A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops, so that is an 11X hardware advantage. Why not use the same 64 CPU cores for it and see if it will beat Stockfish?
You have elevated the minimum hardware requirement for A0 to be competitive with Stockfish to 11 x 64 ≈ 700 CPU cores, each thinking for 1 min per move. So if you had used one core, which is the standard in chess rating lists, you would have to use a time control of about 11 hours per move.
Ah, this again? So comparing performance per $ or performance per Watt obviously doesn't concern you, right? I can just see a team of scientists deciding whether to run their simulation on 10x1080Ti or 7000 CPU cores, and in the end going for the CPUs because, while the GPUs would be cheaper, they would also provide an unfair advantage over the CPUs.

We will see what HW and engines e.g. correspondence players will use once LC0 gets to A0 level. If you want to insist that 1 CPU core is the only correct metric on which to measure engine strength, then have it your way. But don't be surprised if in a correspondence game you get completely smashed by an engine which, according to your "correct 1 core rating", is some 500 Elo weaker than your alpha-beta searcher.

Daniel Shawul wrote: In any case you are backing off from your "bet" that the net is going to improve the tactics. Now you insist on some form of hardware advantage to cover for tactical weakness. Which is it?
I am not backing off. The test is independent of hardware: we can just let it calculate roughly as many nodes as were used in the game before the blunder was played and see how it goes with new weights, e.g. at weekly intervals, until the net gets resized again.

To accommodate Daniel's valid concerns, we should run both Alpha-Beta engines and neural net engines on the abacus.

Seriously, equivalent power use is a fair metric. Otherwise what would be the point of improving hardware at all? If Alpha-Beta crunchers find a good way to use modern GPUs they should implement these by all means...

noobpwnftw · Post by **noobpwnftw** » Fri Apr 20, 2018 10:41 am

I just couldn't resist launching a test of SF vs SF at many cores.

It is pretty obvious that a 384-core SF with some SMP tuning will crush the 32-core SF by a wide margin, and that tuning is not "designed" for fewer cores, so what now?
I think every claim made about A0 vs SF still holds true between my two setups. Try me.

OneTrickPony · Post by **OneTrickPony** » Fri Apr 20, 2018 11:14 am

Not at all - I think you've raised some very interesting points! MCTS averaging does seem fundamentally mismatched to Chess. That's why I was so amazed A0 actually worked.

It seems it's not that good in Go either: as recent Leela games show, it blunders tactics in Go on a regular basis. Go is a bit different, as it's possible to kill humans there without much tactical awareness, but winning against humans in a board game isn't exactly a high bar these days. In engine vs engine matches it's clear that MCTS in pure form isn't working.
There is some kind of component missing. For example, right now you can get to 100k playouts, discover that the line is a total disaster (losing by force), and it will still take a long while for the search to prefer a different move.

I personally believe policy guided search, with the policy trained on many games, will work, but the move selection itself will be more in line with alpha/beta in the end. Overall I am excited (and wish I had time away from programming engines for card games to participate). If anything, Leela plays like a naive, optimistic human rated 2200-2300 Elo, and that's really cool to have.
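
The "100k playouts" effect described in this post can be illustrated with a toy model of average-value backup. This is a minimal sketch under strong assumptions, not LC0's actual search: it ignores the PUCT exploration term and visit-count move selection, uses values in [0, 1], and assumes every playout after the refutation is found returns the refuted value. The function name and the numbers (a move averaging 0.6 after 100k playouts, a rival move averaging 0.5) are hypothetical.

```python
def extra_playouts_until_below(value_sum, visits, refuted_value, rival_mean):
    """Count extra playouts, each scoring `refuted_value`, needed before the
    running average of a move drops below `rival_mean`."""
    extra = 0
    while (value_sum + extra * refuted_value) / (visits + extra) >= rival_mean:
        extra += 1
    return extra


# Move A: 100k playouts averaging 0.6 (value sum 60,000); rival move B averages 0.5.
# After the refutation is found, every further playout through A scores 0.0.
extra = extra_playouts_until_below(60_000, 100_000, 0.0, 0.5)
print(extra)  # 20001
```

In this toy setting the average needs about 20% more playouts before move A looks worse than move B, whereas a minimax-style backup would take over the refuted value as soon as the losing reply is found.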

Milos · Post by **Milos** » Fri Apr 20, 2018 11:25 am

Daniel Shawul wrote:
Why this argument again? How many times have I already mentioned that A0 on 1080TI + 1min / move should be of comparable strength to SF8@64 cores 1min / move?
A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops, so that is an 11X hardware advantage. Why not use the same 64 CPU cores for it and see if it will beat Stockfish?

Daniel, usually I agree with most of the things you write, but in this particular case you are quite wrong.
LC0 on a GTX 1080 Ti gets around 2.5 knps on average.
LC0 on my (old Sandy Bridge) 16-core machine running 32 threads gets 2 knps on average.
LC0 on 64 cores would get at least 8 knps. With the Intel MKL library instead of crappy OpenBLAS that could easily go to 20 knps. That is already much better than what LC0 would have on a V100 and almost comparable to LC0 performance running on a single first-gen TPU.
CPU is still the king no matter what a lot of ppl here would try to convince you.
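
The throughput estimate in this post amounts to the following back-of-the-envelope calculation. It is a sketch only: the knps figures are the ones quoted above, linear scaling from 16 to 64 cores is an assumption, and the MKL speedup is simply what the quoted 20 knps would imply, not a benchmark result.

```python
GPU_KNPS = 2.5        # LC0 on GTX 1080 Ti, as quoted
CPU16_KNPS = 2.0      # LC0 on 16 cores / 32 threads, as quoted
MKL_KNPS_CLAIM = 20.0 # claimed 64-core figure with Intel MKL, as quoted


def scale_linearly(knps, from_cores, to_cores):
    """Project node rate to a different core count (assumes perfect scaling)."""
    return knps * to_cores / from_cores


cpu64_knps = scale_linearly(CPU16_KNPS, 16, 64)
implied_mkl_speedup = MKL_KNPS_CLAIM / cpu64_knps
print(f"64 cores: {cpu64_knps:.1f} knps ({cpu64_knps / GPU_KNPS:.1f}x the GPU)")
print(f"implied MKL speedup over OpenBLAS: {implied_mkl_speedup:.1f}x")
```

On these numbers 64 cores come out at 8 knps, about 3.2x the quoted GPU rate, and the 20 knps figure presumes a further 2.5x from the BLAS library alone.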

jkiliani · Post by **jkiliani** » Fri Apr 20, 2018 11:27 am

OneTrickPony wrote:
Not at all - I think you've raised some very interesting points! MCTS averaging does seem fundamentally mismatched to Chess. That's why I was so amazed A0 actually worked.
It seems it's not that good in Go either: as recent Leela games show, it blunders tactics in Go on a regular basis. Go is a bit different, as it's possible to kill humans there without much tactical awareness, but winning against humans in a board game isn't exactly a high bar these days. In engine vs engine matches it's clear that MCTS in pure form isn't working.
There is some kind of component missing. For example, right now you can get to 100k playouts, discover that the line is a total disaster (losing by force), and it will still take a long while for the search to prefer a different move.

I personally believe policy guided search, with the policy trained on many games, will work, but the move selection itself will be more in line with alpha/beta in the end. Overall I am excited (and wish I had time away from programming engines for card games to participate). If anything, Leela plays like a naive, optimistic human rated 2200-2300 Elo, and that's really cool to have.

Not sure what you're talking about here. Leela (as in Leela 0.11) certainly has tactical weaknesses, but that's an MCTS engine with a neural net, not a pure NN engine like Leela Zero.

And while Leela Zero may still have some tactical vulnerabilities, they're getting really hard to exploit, certainly for humans.

Agreed that policy guided search has some similarity to Alpha-Beta on a mature, larger neural net.
