Btw. you don't even know how to test only NN eval of LC0, do you?
jkiliani wrote: Not sure what you're talking about here. Leela (as in Leela 0.11) certainly has tactical weaknesses, but that's an MCTS engine with a neural net, not a pure NN engine like Leela Zero.
And while Leela Zero may still have some tactical vulnerabilities, they're getting really hard to exploit, certainly for humans.
Agreed that policy guided search has some similarity to Alpha-Beta on a mature, larger neural net.
Milos wrote: Cut the crap. Return here when LC0 network alone (single playout) is able to beat SF depth 1 search.
I'm pretty confident that will not happen any time soon, especially if you don't increase the size of NN.
jkiliani wrote: Just tested exactly that, with Id 150, against Stockfish with fixed depth 1:
./cutechess-cli -rounds 400 -tournament round-robin -concurrency 2 -pgnout results_tuning.pgn \
-engine name=Id_152 cmd=lczero_tunenew2 arg="--threads=1" arg="--weights=$WDR/weights_152.txt" arg="--noponder" nodes=1 tc=inf \
-engine name=sf_d1 cmd=stockfish_x86-64 option.Threads=1 depth=1 tc=inf \
Result: 1-1-0. Obviously I ran more games than two, but it turns out that both Stockfish and Lc0 are deterministic at these settings, so the end result was 200-200-0.
Unless you can come up with a way to make Stockfish non-deterministic at fixed depth 1, I consider this point now proven.
Edit: Id 150 actually wins with 1-0-1, Id 129 also scores equal with 1-1-0, only Id 125 loses with 0-1-1.
So in total, we have Lc0 performing comparably to Stockfish Depth 1. Any further questions?
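For anyone who wants to turn the W-L-D tallies quoted above into a score fraction and a rough Elo gap, here is a minimal Python sketch. It assumes the results are written as wins-losses-draws from the Lc0 side and uses the standard logistic Elo formula; the function names are made up for illustration and are not part of cutechess-cli or Lc0. Keep in mind that with both engines deterministic there are really only two distinct games per pairing, so these numbers carry no meaningful error bars.

```python
import math

def score_fraction(wins, losses, draws):
    """Fraction of available points scored (win = 1, draw = 0.5)."""
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games

def elo_diff(wins, losses, draws):
    """Elo difference implied by a W-L-D result under the logistic model."""
    s = score_fraction(wins, losses, draws)
    if s == 0.0:
        return -math.inf  # lost every game: no finite estimate
    if s == 1.0:
        return math.inf   # won every game: no finite estimate
    return 400.0 * math.log10(s / (1.0 - s))

# W-L-D tallies as quoted in the post (assumed to be from the Lc0 side).
results = {"Id 150": (1, 0, 1), "Id 129": (1, 1, 0), "Id 125": (0, 1, 1)}
for name, wld in results.items():
    print(f"{name}: score {score_fraction(*wld):.2f}, Elo diff {elo_diff(*wld):+.0f}")
```

Repeating the same two games 200 times (200-200-0) gives exactly the same score fraction as 1-1-0, which is why the extra rounds add no information here.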
It is not enough to try to force nodes=1 through cutechess-cli, because LC0 would ignore it. You specifically have to use the "-p 1" argument, and even then, on a single thread, it would still give you 2 playouts, not one. But that's the best you can get without changing the actual LC0 code.