That may be so. But the TCEC version of LC0 is doing badly in Division 4. It is not even going to get to play Stockfish.
"Therefore my conclusion is that at 1 min / move A0@1080Ti is playing at a level comparable to SF8@64 cores."
--Jon
A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops, so that is an 11X hardware advantage. Why not use the same 64 CPU cores for it and see if it will beat Stockfish?
"Why this argument again? How many times have I already mentioned that A0 on 1080TI + 1min / move should be of comparable strength to SF8@64 cores 1min / move?"
I don't think you can see it like that. GPUs are optimized for floating point operations, but they are poor at the other types of operations that CPUs excel in.
"A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops so that is an 11X hardware advantage."
Maybe, but Stockfish uses about 10 esoteric tree pruning techniques over the simple MCTS.
"A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops so that is an 11X hardware advantage."
Ah, this again? So comparing performance per $ or performance per Watt obviously doesn't concern you, right? I can see a team of scientists deciding whether to run their simulation on 10x1080Ti or 7000 CPU cores, and in the end going for the CPUs because, while the GPUs would be cheaper, they would also provide an unfair advantage over CPUs.
Daniel Shawul wrote: A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops so that is an 11X hardware advantage. Why not use the same 64 CPU cores for it and see if it will beat Stockfish?
You have elevated the minimum hardware requirement for A0 to be competitive with Stockfish to 11 x 64 = 700 CPU cores each thinking for 1 min per move. So if you had used one core, which is the standard in chess rating lists, you have to use a time control of 11 hours per move
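The arithmetic behind that claim can be checked with a short sketch. It uses only the figures quoted in the thread (11 TFlops, 1 TFlops, 64 cores, 1 min per move); the function name is hypothetical, and the model assumes engine strength scales perfectly linearly with FLOPs and with core count, which is an assumption of the argument rather than something either engine achieves in practice.

```python
def single_core_equivalent(gpu_tflops, cpu_tflops, cpu_cores, minutes_per_move):
    """Convert a GPU time budget into single-core hours per move.

    Assumes perfectly linear scaling with FLOPs and with core count
    (an idealisation taken from the argument, not a measured result).
    """
    ratio = gpu_tflops / cpu_tflops            # hardware advantage factor
    cores = ratio * cpu_cores                  # CPU cores matching the GPU
    hours = cores * minutes_per_move / 60.0    # same work done on one core
    return ratio, cores, hours


# Figures as quoted in the thread.
ratio, cores, hours = single_core_equivalent(11.0, 1.0, 64, 1.0)
print(f"{ratio:.0f}x advantage, {cores:.0f} cores, {hours:.1f} hours per move on one core")
```

Under these assumptions the exact product is 11 x 64 = 704 cores, which the post rounds to 700, and the single-core figure comes out at a little under 12 hours per move.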
I am not backing off. The test is independent of hardware: we can just let it calculate roughly as many nodes as were used in the game before the blunder was played and see how it goes with new weights, e.g. at weekly intervals, until the net gets resized again.
Daniel Shawul wrote: In any case you are backing off from your "bet" that the net is going to improve the tactics. Now you insist on some form of hardware advantage to cover for tactical weakness. Which is it?
To accommodate Daniel's valid concerns, we should run both Alpha-Beta engines and neural net engines on the abacus.
mirek wrote: Ah, this again? So comparing performance per $ or performance per Watt obviously doesn't concern you, right? I can see a team of scientists deciding whether to run their simulation on 10x1080Ti or 7000 CPU cores, and in the end going for the CPUs because, while the GPUs would be cheaper, they would also provide an unfair advantage over CPUs.
Daniel Shawul wrote: A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops so that is an 11X hardware advantage. Why not use the same 64 CPU cores for it and see if it will beat Stockfish?
You have elevated the minimum hardware requirement for A0 to be competitive with Stockfish to 11 x 64 = 700 CPU cores each thinking for 1 min per move. So if you had used one core, which is the standard in chess rating lists, you have to use a time control of 11 hours per move
We will see what HW and engines e.g. correspondence players will use once LC0 gets to A0 level. If you want to insist that 1 CPU core is the only correct metric on which to measure engine strength, then have it your way. But don't be surprised if in a correspondence game you get completely smashed by an engine which, according to your "correct 1 core rating", is some 500 Elo weaker than your alpha-beta searcher.
I am not backing off. The test is independent of hardware: we can just let it calculate roughly as many nodes as were used in the game before the blunder was played and see how it goes with new weights, e.g. at weekly intervals, until the net gets resized again.
Daniel Shawul wrote: In any case you are backing off from your "bet" that the net is going to improve the tactics. Now you insist on some form of hardware advantage to cover for tactical weakness. Which is it?
It seems it's not that good in Go either: as recent Leela games show, it blunders tactics in Go on a regular basis. Go is a bit different, as it's possible to kill humans there without much tactical awareness, but winning against humans in a board game isn't exactly a high bar these days. In engine vs engine matches it's clear that MCTS in pure form isn't working.
"Not at all - I think you've raised some very interesting points! MCTS averaging does seem fundamentally mismatched to Chess. That's why I was so amazed A0 actually worked."
Daniel, usually I agree with most of the things you write, but in this particular case you are quite wrong.
Daniel Shawul wrote: A GTX 1080 Ti is 11 TFlops, and 64 cores is 1 TFlops so that is an 11X hardware advantage. Why not use the same 64 CPU cores for it and see if it will beat Stockfish?
"Why this argument again? How many times have I already mentioned that A0 on 1080TI + 1min / move should be of comparable strength to SF8@64 cores 1min / move?"
Not sure what you're talking about here. Leela (as in Leela 0.11) certainly has tactical weaknesses, but that's an MCTS engine with a neural net, not a pure NN engine like Leela Zero.
OneTrickPony wrote: It seems it's not that good in Go either: as recent Leela games show, it blunders tactics in Go on a regular basis. Go is a bit different, as it's possible to kill humans there without much tactical awareness, but winning against humans in a board game isn't exactly a high bar these days. In engine vs engine matches it's clear that MCTS in pure form isn't working.
"Not at all - I think you've raised some very interesting points! MCTS averaging does seem fundamentally mismatched to Chess. That's why I was so amazed A0 actually worked."
There is some kind of component missing. For example, right now you can get to 100k playouts, discover that the line is a total disaster (losing by force), and it will still take a long while for it to prefer a different move.
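A toy sketch of why averaging is slow to react, assuming a plain running-mean backup and made-up numbers. The function names and values here are hypothetical and are not LC0's actual search: real engines also redirect visits through the selection formula and usually pick the final move by visit count, which this simplification ignores.

```python
def playouts_until_switch(visits, mean_value, alt_value, refuted_value=-1.0):
    """Count the extra playouts, each scoring `refuted_value`, needed before
    the running average of the refuted move drops below `alt_value`."""
    total = visits * mean_value   # sum of all playout results so far
    extra = 0
    while total / (visits + extra) >= alt_value:
        total += refuted_value    # every new playout now sees the forced loss
        extra += 1
    return extra


def minimax_prefers_alternative(alt_value, refuted_value=-1.0):
    """With a minimax-style backup the refutation replaces the old score
    at once, so the comparison flips after a single backup."""
    return refuted_value < alt_value


# Hypothetical numbers: 100k playouts averaging +0.3, alternative move at +0.1.
needed = playouts_until_switch(100_000, 0.3, 0.1)
print("averaging backup: extra playouts before switching =", needed)
print("minimax backup switches immediately:", minimax_prefers_alternative(0.1))
```

The point of the sketch is only the asymmetry: the averaged score has to be dragged down one playout at a time, while an alpha-beta style backup discards the stale optimistic value as soon as the forced loss is found.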
I personally believe policy-guided search, with the policy trained on many games, will work, but the move selection itself will be more in line with alpha/beta in the end. Overall I am excited (and wish I had time away from programming engines for card games to participate). If anything, Leela plays like a naive, optimistic 2200-2300 Elo human, and that's really cool to have.