Stockfish NNUE SV Tests

mehmet123
Posts: 670
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: Stockfish NNUE SV Tests

Post by mehmet123 »

Fire vs Stockfish:

Program Elo + - Games Score Av.Op. Draws

1 Fire 8.NN.MC.2 x64 bmi2 : 2438 18 18 700 60.8 % 2362 52.4 %
2 Stockfish 310720 x64 bmi2 : 2362 18 18 700 39.2 % 2438 52.4 %

Individual statistics:

1 Fire 8.NN.MC.2 x64 bmi2 : 2438 700 (+242,=367,- 91), 60.8 %

Stockfish 310720 x64 bmi2 : 700 (+242,=367,- 91), 60.8 %

2 Stockfish 310720 x64 bmi2 : 2362 700 (+ 91,=367,-242), 39.2 %

Fire 8.NN.MC.2 x64 bmi2 : 700 (+ 91,=367,-242), 39.2 %


Game Conditions: Cutechess Gui, 1 Core (Core-i7 9750h), 10 sec + 0.2 sec TC, Balsa 5 Moves Opening Book, 64 Mb Hash, Ponder Off
https://www.mediafire.com/file/smljoc5e ... 8.pgn/file

Stockfish 310720 is the last Stockfish Dev version without NNUE evaluation.


Fire vs Stockfish:

Program Elo + - Games Score Av.Op. Draws

1 Fire 8.NN.MC.2 x64 bmi2 : 2413 15 14 600 53.7 % 2387 72.3 %
2 Stockfish 310720 x64 bmi2 : 2387 14 15 600 46.3 % 2413 72.3 %

Individual statistics:

1 Fire 8.NN.MC.2 x64 bmi2 : 2413 600 (+105,=434,- 61), 53.7 %

Stockfish 310720 x64 bmi2 : 600 (+105,=434,- 61), 53.7 %

2 Stockfish 310720 x64 bmi2 : 2387 600 (+ 61,=434,-105), 46.3 %

Fire 8.NN.MC.2 x64 bmi2 : 600 (+ 61,=434,-105), 46.3 %


Game Conditions: Cutechess Gui, 1 Core (Core-i7 9750h), 2 min + 0.5 sec TC, Balsa 5 Moves Opening Book, 64 Mb Hash, Ponder Off
https://www.mediafire.com/file/ypedq3by ... 9.pgn/file

With the longer time control, the Elo difference dropped sharply (from +76 Elo to +26 Elo).
I wrote this on Fri Jul 31, 2020: "Stockfish NNUE has 2 main problems. It doesn't scale with increasing time control and it doesn't have an aggressive play style against weak engines."
Although more than a year has passed, the problem of NNUE nets scaling poorly with increasing time unfortunately persists.
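
The Elo differences quoted above can be reproduced from the win/draw/loss counts in the tables. Below is a minimal Python sketch, assuming the usual logistic Elo model; the function names are illustrative and do not come from any rating tool.

```python
import math


def score_fraction(wins, draws, losses):
    """Fraction of points scored: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Elo difference implied by a score fraction under the logistic model.

    Only defined for 0 < score < 1.
    """
    return -400.0 * math.log10(1.0 / score - 1.0)


# Fire 8.NN.MC.2 results against Stockfish 310720, taken from the tables above.
results = {
    "10 sec + 0.2 sec": (242, 367, 91),
    "2 min + 0.5 sec": (105, 434, 61),
}

for time_control, (w, d, l) in results.items():
    s = score_fraction(w, d, l)
    print(f"{time_control}: score {100 * s:.1f} %, Elo difference {elo_difference(s):+.0f}")
```

Under this model the two matches give about +76 and +26 Elo, which matches the gaps between the ratings in the two tables.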
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Stockfish NNUE SV Tests

Post by Vinvin »

Fire 8.NN.MC.2 seems very strong. Even stronger than Stockfish !
Sopel
Posts: 389
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: Stockfish NNUE SV Tests

Post by Sopel »

mehmet123 wrote: Tue Aug 03, 2021 6:19 pm ...
you mention "Stockfish NNUE" in your conclusions but your results refer to something completely different
connor_mcmonigle
Posts: 530
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: Stockfish NNUE SV Tests

Post by connor_mcmonigle »

Sopel wrote: Tue Aug 03, 2021 8:19 pm
mehmet123 wrote: Tue Aug 03, 2021 6:19 pm ...
you mention "Stockfish NNUE" in your conclusions but your results refer to something completely different
Well, his results refer to Fire so I wouldn't say "completely different", haha. In any case, Fire is relying on old Stockfish networks so his conclusion is nonsense anyways...
mehmet123
Posts: 670
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: Stockfish NNUE SV Tests

Post by mehmet123 »

Sopel wrote: Tue Aug 03, 2021 8:19 pm you mention "Stockfish NNUE" in your conclusions but your results refer to something completely different
Fire 8.NN doesn't use Stockfish NNUE nets but I don't think that Stockfish and Fire nets are very different from each other in structure.

From the source code README.md file: "The NNUE implementation utilizes a modified version of Daniel Shaw's/Cfish excellent nnue probe code:
- [nnue-probe](https://github.com/dshawul/nnue-probe/)
Fire includes 'Raptor', a top reinforcement learning network trained by Sergio Vieri
- https://www.comp.nus.edu.sg/~sergio-v/nnue/"

Sergio Vieri has made very powerful nets trained on Stockfish games. But the problem of scaling with time still has not been resolved in SV nets or in NNUE nets produced from Stockfish games, Lc0 games, or mixed Stockfish/Lc0 games.
According to my tests, NNUE nets made by Dietrich Kappe scale much better with time.
connor_mcmonigle
Posts: 530
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: Stockfish NNUE SV Tests

Post by connor_mcmonigle »

mehmet123 wrote: Tue Aug 03, 2021 9:26 pm
Sopel wrote: Tue Aug 03, 2021 8:19 pm you mention "Stockfish NNUE" in your conclusions but your results refer to something completely different
Fire 8.NN doesn't use Stockfish NNUE nets but I don't think that Stockfish and Fire nets are very different from each other in structure.
...
I guess Norman's description has confused you. This "Raptor" network Fire uses is just an old Stockfish network from last year. It most definitely does use a Stockfish NNUE network. The network architecture is identical to that used in Stockfish 12/13, and the probing code is taken from Cfish.

Why would you expect different behavior from a year old network which you've already tested?
mehmet123
Posts: 670
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: Stockfish NNUE SV Tests

Post by mehmet123 »

Stockfish Dev vs Stockfish 14 (Fischer Random Chess):

Program Elo + - Games Score Av.Op. Draws

1 Stockfish 150821 x64 bmi2 : 2414 12 12 800 54.1 % 2386 74.0 %
2 Stockfish 14 x64 bmi2 : 2386 12 12 800 45.9 % 2414 74.0 %


Individual statistics:

1 Stockfish 150821 x64 bmi2 : 2414 800 (+137,=592,- 71), 54.1 %

Stockfish 14 x64 bmi2 : 800 (+137,=592,- 71), 54.1 %

2 Stockfish 14 x64 bmi2 : 2386 800 (+ 71,=592,-137), 45.9 %

Stockfish 150821 x64 bmi2 : 800 (+ 71,=592,-137), 45.9 %


Game Conditions: Cutechess Gui, 1 Core (Core-i7 9750h), 1 min + 0.5 sec TC, Chess960 3 Moves Opening Book, 128 Mb Hash, Ponder Off
https://www.mediafire.com/file/e0kazuq9 ... 9.pgn/file

Stockfish's FRC performance has improved considerably in a short time.
Last month, Stockfish Dev suffered a serious defeat against Stockfish miniNNUE in my FRC test: 700 (+ 54,=555,- 91), 47.4 %.
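
One way to judge how decisive such results are is the likelihood of superiority (LOS), a figure commonly quoted in engine testing. The sketch below uses the standard normal approximation based only on wins and losses (draws carry no information about which side is stronger). The function name is illustrative.

```python
import math


def likelihood_of_superiority(wins, losses):
    """Approximate probability that the first engine is the stronger one.

    Normal approximation on decisive games only; draws are ignored.
    """
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games, no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


# Win/loss counts from the two FRC results quoted above.
print(f"Stockfish 150821 vs Stockfish 14:   {likelihood_of_superiority(137, 71):.4f}")
print(f"Stockfish Dev vs Stockfish miniNNUE: {likelihood_of_superiority(54, 91):.4f}")
```

A value close to 1 means the first engine is very likely the stronger one under the test conditions, and a value close to 0 means the opposite. It says nothing about how large the difference is.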
mehmet123
Posts: 670
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: Stockfish NNUE SV Tests

Post by mehmet123 »

Carlsen.bin vs DONBFupdate1.bin:

Program Elo + - Games Score Av.Op. Draws

1 Cfish 130721 D : 2419 18 14 200 55.5 % 2381 88.0 %
2 Cfish 130721 C : 2381 14 18 200 44.5 % 2419 88.0 %


Individual statistics:

1 Cfish 130721 D : 2419 200 (+ 23,=176,- 1), 55.5 %

Cfish 130721 C : 200 (+ 23,=176,- 1), 55.5 %

2 Cfish 130721 C : 2381 200 (+ 1,=176,- 23), 44.5 %

Cfish 130721 D : 200 (+ 1,=176,- 23), 44.5 %


Game Conditions: Cutechess Gui, 1 Core (Core-i7 9750h), 3 min TC, 512 Mb Hash, Ponder Off, BestMove=Off / BookDepth=Max
Cfish 130721 C (Carlsen: opening book prepared from Magnus Carlsen's games)
Cfish 130721 D (DONBFupdate1: Vasid Chouhan)
https://www.mediafire.com/file/8kyghrei ... 0.pgn/file

We can say that even the top chess players' opening knowledge is not entirely perfect.
mehmet123
Posts: 670
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: Stockfish NNUE SV Tests

Post by mehmet123 »

Carlsen.bin vs Hubble 1.3 .bin:

Program Elo + - Games Score Av.Op. Draws

1 Cfish 130721 H : 2420 18 15 200 55.8 % 2380 87.5 %
2 Cfish 130721 C : 2380 15 18 200 44.2 % 2420 87.5 %

Individual statistics:

1 Cfish 130721 H : 2420 200 (+ 24,=175,- 1), 55.8 %

Cfish 130721 C : 200 (+ 24,=175,- 1), 55.8 %

2 Cfish 130721 C : 2380 200 (+ 1,=175,- 24), 44.2 %

Cfish 130721 H : 200 (+ 1,=175,- 24), 44.2 %

Game Conditions: Cutechess Gui, 1 Core (Core-i7 9750h), 3 min TC, 512 Mb Hash, Ponder Off, BestMove=Off / BookDepth=Max
Cfish 130721 C (Carlsen: opening book prepared from Magnus Carlsen's games)
Cfish 130721 H (Hubble 1.3: Mehmet Karaman)
https://www.mediafire.com/file/2ppxiopn ... 1.pgn/file

The DONBFupdate1 book always opens with c4, so I ran a new test to see whether the Carlsen opening book would do well against a wider selection of openings. The result is very close to the previous test.
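
With only 200 games and a draw rate near 88 %, it helps to see how wide the uncertainty around each score is. The following is a rough sketch, assuming independent games and a normal approximation for the mean score; it is not necessarily the method used by the rating program that produced the tables, and the function names are illustrative.

```python
import math


def score_interval(wins, draws, losses, z=1.96):
    """Return (low, mean, high) for the score fraction.

    Uses the per-game variance of the win/draw/loss outcomes and a normal
    approximation; z=1.96 corresponds to roughly 95 % confidence.
    """
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    variance = (
        wins * (1.0 - mean) ** 2
        + draws * (0.5 - mean) ** 2
        + losses * (0.0 - mean) ** 2
    ) / games
    half_width = z * math.sqrt(variance / games)
    return mean - half_width, mean, mean + half_width


def elo_difference(score):
    """Elo difference implied by a score fraction (logistic model, 0 < score < 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# Results of the two book tests above, from the point of view of the book
# that played against Carlsen.bin.
book_tests = {
    "DONBFupdate1 vs Carlsen": (23, 176, 1),
    "Hubble 1.3 vs Carlsen": (24, 175, 1),
}

for name, wdl in book_tests.items():
    low, mean, high = score_interval(*wdl)
    print(
        f"{name}: {100 * mean:.1f} % "
        f"(Elo {elo_difference(low):+.0f} to {elo_difference(high):+.0f})"
    )
```

For both tests the lower end of the interval stays above 50 %, and the two intervals overlap almost completely, which fits the observation that the second result is very close to the first.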
mehmet123
Posts: 670
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: Stockfish NNUE SV Tests

Post by mehmet123 »

Koivisto 6.16 vs Stockfish 9:

Program Elo + - Games Score Av.Op. Draws

1 Koivisto 6.16 x64 avx2 : 2420 28 28 300 55.8 % 2380 50.3 %
2 Stockfish 9 x64 : 2380 28 28 300 44.2 % 2420 50.3 %

Individual statistics:

1 Koivisto 6.16 x64 avx2 : 2420 300 (+ 92,=151,- 57), 55.8 %

Stockfish 9 x64 : 300 (+ 92,=151,- 57), 55.8 %

2 Stockfish 9 x64 : 2380 300 (+ 57,=151,- 92), 44.2 %

Koivisto 6.16 x64 avx2 : 300 (+ 57,=151,- 92), 44.2 %


Game Conditions: Cutechess Gui, 1 Core (Core-i7 9750h), 30 sec + 0.5 sec TC, Balsa 5 Moves Opening Book, 64 Mb Hash, Ponder Off
Koivisto 6.16 x64 avx2 ( "Ipman" compile)
https://www.mediafire.com/file/j3v939ps ... 5.pgn/file

Koivisto continues to improve rapidly.

Previous result:
1 Koivisto 6.0 x64 avx2 : 2415 21 21 600 54.4 % 2385 45.2 %
2 Stockfish 8 x64 bmi2 : 2385 21 21 600 45.6 % 2415 45.2 %
http://talkchess.com/forum3/viewtopic.p ... &start=220


Koivisto 6.16 vs Stockfish 10:

Program Elo + - Games Score Av.Op. Draws

1 Stockfish 10 x64 bmi2 : 2413 26 26 300 53.8 % 2387 55.7 %
2 Koivisto 6.16 x64 avx2 : 2387 26 26 300 46.2 % 2413 55.7 %

Individual statistics:

1 Stockfish 10 x64 bmi2 : 2413 300 (+ 78,=167,- 55), 53.8 %

Koivisto 6.16 x64 avx2 : 300 (+ 78,=167,- 55), 53.8 %

2 Koivisto 6.16 x64 avx2 : 2387 300 (+ 55,=167,- 78), 46.2 %

Stockfish 10 x64 bmi2 : 300 (+ 55,=167,- 78), 46.2 %


Game Conditions: Cutechess Gui, 1 Core (Core-i7 9750h), 30 sec + 0.5 sec TC, Balsa 5 Moves Opening Book, 64 Mb Hash, Ponder Off
Koivisto 6.16 x64 avx2 ( "Ipman" compile)
https://www.mediafire.com/file/4wbhexu0 ... 6.pgn/file