SPCC: Longtime-testrun of Stockfish nnue vs Lc0 finished


pohl4711
Posts: 2444
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl


Post by pohl4711 »

NN testrun: the long time-control testrun of 300 games, Lc0 0.26.1 t60-4619 (30x384) vs Stockfish 200823 82215d0fd0df, has finished. Another clear win for Stockfish NNUE. See the result and download the games in the "NN vs SF testing" section (scroll down to the bottom of the page!).

https://www.sp-cc.de

(You may have to clear your browser cache or reload the website.)
mehmet123
Posts: 671
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman


Post by mehmet123 »

Thanks, Stefan, for the test. But for me, using the Leela Ratio is meaningless. Why do we use the DeepMind team's match conditions? AlphaZero and Lc0 aren't the same engine. Lc0 nets all run at different speeds: 384x30 nets are slower, 128x10 nets are faster.
Stockfish NNUE is stronger than Lc0 under equal conditions, and it uses 50% more time in your test. Do we see any sports match like this?
jdart
Posts: 4367
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org


Post by jdart »

Lc0 really performed badly here, with a lot of losses due to tactical blind spots. For example:

[pgn][Event "SF vs Lc0 longtime"]
[Site "?"]
[Date "2020.08.14"]
[Round "116"]
[White "SF 200810 112bb1c8cdb5"]
[Black "Lc0 0.26.1 LS 15 (20x256)"]
[Result "1-0"]
[ECO "B99"]
[PlyCount "103"]
[EventDate "2020.??.??"]

1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 a6 6. Bg5 e6 7. f4 Be7 8. Qf3
Qc7 9. O-O-O Nbd7 10. g4 b5 11. Bxf6 Nxf6 12. g5 Nd7 13. f5 Nc5 (13... O-O $142
) 14. f6 gxf6 15. gxf6 Bf8 16. Rg1 h5 17. a3 Rb8 $2 (17... Qb6 $5) 18. Re1 $1
Nd7 19. Nxe6 fxe6 20. f7+ Kd8 21. Rg8 Rh6 22. Qg3 Ke7 23. Bh3 Bb7 24. Nd5+ Bxd5
25. exd5 Qc4 26. Bxe6 Ne5 27. Bh3 Qxd5 28. Rg5 Kxf7 29. Rf1+ Rf6 30. Rxf6+ Kxf6
31. Rf5+ Ke7 32. Qg5+ Ke8 33. Qxh5+ Kd8 34. Rxf8+ Kc7 35. Qh7+ Kc6 36. Rf1 Qa2
37. Bg2+ Kb6 38. Rf8 Qa1+ 39. Kd2 Nc4+ 40. Ke2 Ka5 41. Rxb8 Qg1 42. Qh3 Ne3 43.
Qxe3 Qxg2+ 44. Kd1 Qd5+ 45. Qd2+ Qxd2+ 46. Kxd2 Ka4 47. Kd3 d5 48. Kc3 d4+ 49.
Kxd4 a5 50. Kc5 b4 51. axb4 axb4 52. Ra8# 1-0[/pgn]
pohl4711
Posts: 2444
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl


Post by pohl4711 »

Yes, tactics are the weak point of Lc0. No surprise, when the opponent calculates 500x (?) more nodes than Lc0...
But 5'+3'' is not such a short thinking time, and Lc0 ran on a (mobile) RTX 2060, which is not bad hardware. So the testing conditions offer no excuse for tactical blindness.
OliverBr
Posts: 725
Joined: Tue Dec 18, 2007 9:38 pm
Location: Munich, Germany
Full name: Dr. Oliver Brausch


Post by OliverBr »

On my system (MacBook Pro, GTX 1080 Ti with CUDA), Lc0 is clearly stronger than Stockfish.

The argument that her tactics are bad because Stockfish calculates more nodes is not valid, because it's a completely different algorithm.

Hardware is vital for Leela: on an AMD 32-core CPU with OpenBLAS she loses against OliThink in fast games. She needs a GPU, and a good one.
Chess Engine OliThink: http://brausch.org/home/chess
OliThink GitHub: https://github.com/olithink
Dann Corbit
Posts: 12542
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA


Post by Dann Corbit »

Every chess engine is a victim and a victor with the right hardware.
You can put 20 GPUs in some machines, with 32 GB of HBM in each, which is pretty much a supercomputer, and an awesome one.

You can also put two AMD 7H12 CPUs in a machine and have absurd CPU power at your command, along with multiple terabytes of really fast RAM.

Hardware and software are two independent strength entities that make a chess system what it is.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm


Post by mwyoung »

pohl4711 wrote: Sat Aug 29, 2020 12:01 pm
Yes, tactics are the weak point of Lc0. No surprise, when the opponent calculates 500x (?) more nodes than Lc0...
But 5'+3'' is not such a short thinking time, and Lc0 ran on a (mobile) RTX 2060, which is not bad hardware. So the testing conditions offer no excuse for tactical blindness.

I think your testing is 100% accurate, meaning the results are consistent with other testing, including my own.
But details matter, hardware matters, time controls matter, and your definitions matter.

Let's be clear: 5'+3" is a fast time control. It is called blitz, and it is pretty close to my testing at 3'+2" on an RTX 2080 Ti and a TR 2950X.
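
To put the two time controls side by side: with a per-move increment, the thinking time available to one side over a game is roughly the base time plus the increment times the number of moves. The sketch below works that out for 5'+3" and 3'+2". The 60-move game length and the function name are illustrative assumptions, not figures from either testrun.

```python
def total_time_seconds(base_minutes, increment_seconds, moves):
    """Approximate thinking time for one side: base time plus one increment per move."""
    return base_minutes * 60 + increment_seconds * moves


# Hypothetical game length, chosen only for illustration (not from the testrun).
ASSUMED_MOVES = 60

tc_5_3 = total_time_seconds(5, 3, ASSUMED_MOVES)  # 5'+3"
tc_3_2 = total_time_seconds(3, 2, ASSUMED_MOVES)  # 3'+2"

print(tc_5_3, tc_3_2, tc_5_3 / tc_3_2)
```

Under that assumption the two budgets are 480 and 300 seconds per side, a factor of 1.6 apart, so both sit in the same blitz range.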

So I guess the real question in today's world of NNUE, Lc0, and A/B engines is: does testing only at blitz and faster time controls, on good hardware, give an accurate representation of the three classes of engines at all time controls?

We no longer live in the age of A/B-only engines.

The evidence shows that the answer to the above question is no.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.