So what should be better with the network nn-3c0054ea9860.nnue?
nn-3c0054ea9860
-
Eduard
- Posts: 1305
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: Eduard Nemeth
nn-3c0054ea9860
I can't explain it, but I'm not delighted with the newest green Stockfish NNUE net "nn-3c0054ea9860.nnue". What is supposed to be better than before? In my private tests it is significantly worse than the previous green network; tactically it is clearly inferior. It also doesn't surprise me that the author of the engine Blue Marlin (a tactically tuned Stockfish) hasn't adopted the latest NNUE. The latest pure Stockfish engine (090722) itself is great, but I combine it with the net "nn-3c0aa92af1da.nnue". That is the best combination in my tests, and I ended up at the top of the table in yesterday's tournament on PlayChess (with a slightly modified version)!
So what should be better with the network nn-3c0054ea9860.nnue?
-
Ovyron
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: nn-3c0054ea9860
Eduard wrote: ↑Sun Jul 10, 2022 10:35 am
I can't explain it, but I'm not delighted with the newest green Stockfish NNUE net "nn-3c0054ea9860.nnue". What is supposed to be better than before? In my private tests it is significantly worse than the previous green network; tactically it is clearly inferior. It also doesn't surprise me that the author of the engine Blue Marlin (a tactically tuned Stockfish) hasn't adopted the latest NNUE. The latest pure Stockfish engine (090722) itself is great, but I combine it with the net "nn-3c0aa92af1da.nnue". That is the best combination in my tests, and I ended up at the top of the table in yesterday's tournament on PlayChess (with a slightly modified version)!
So what should be better with the network nn-3c0054ea9860.nnue?

Did you remember to change the scale in the code? With the old scale nn-3c0aa92af1da is better, but with the new scale nn-3c0054ea9860 is better.
We've reached the point at which the best net depends on the search. You can't apply a new net to old code and expect it to work; the newest code with the newest net is the strongest combination.
Your beliefs create your reality, so be careful what you wish for.
-
Eduard
- Posts: 1305
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: Eduard Nemeth
Re: nn-3c0054ea9860
How should I have known that the newest net doesn't work well with older code? I only found that out through tests. There are still some Stockfish derivatives that haven't picked up the new Stockfish code yet, and that is where I noticed that the new network does not work well.
-
Ovyron
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: nn-3c0054ea9860
I figure lazy programmers are now delivering engines whose new version is weaker than the old one, either by keeping old code and only updating the net, or by updating the code without updating the net. That does a disservice to the people who use their derivatives.
But it's also people's own fault for just looking at a bigger version number and grabbing the latest thing without checking whether it's worse.
And no, you weren't supposed to know; I hadn't seen anything like this before. I think nn-3c0aa92af1da or even an older net could become the default again if people find better code that works for them and the scale needs to be adjusted back! Wouldn't that be a thing?
-
Eduard
- Posts: 1305
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: Eduard Nemeth
Re: nn-3c0054ea9860
I tested the latest green net (nn-3c0054ea9860) with the engine Blue Marlin. It didn't work well. But the engine is great because it's tactically good (better than Stockfish standard). I also need engines like Blue Marlin for special analyses. Unfortunately it doesn't work with the latest network. Fortunately, the old network is doing very well. 
-
Eelco de Groot
- Posts: 4556
- Joined: Sun Mar 12, 2006 2:40 am
- Full name:
Re: nn-3c0054ea9860
Rubinus made a nice post in the Tournaments and Matches forum about the ideas behind the new Stockfish net:

Rubinus wrote: ↑Wed Jul 06, 2022 2:43 pm
AndrewGrant wrote: ↑Wed Jul 06, 2022 11:28 am
Taking a look at some of these games, and I'm fairly confused. Arena suggests to me that the engines are reporting scores well over +40.00 at the very first position in some cases. Is this supposed to be an extremely unbalanced book, or am I viewing these scores wrong?

I don't want an outright book. It's just (almost) all draws. I'm making a private book for a king gambit. And I'm using these games to look for weaknesses.
It's supposed to be for people with somehow impaired motors to play, so it's intentionally pretty broad. And it seems that most old books, apart from some historical value, are just a collection of paper anymore. You can find routines with a modern engine that just aren't in those older books.
Dragon - Ethereal match is running now, the score looks similar, 6:0=32. I think there are gaps in the knowledge network.
Off topic. Apparently the last Stockfish was trained in anti-computer positions. The next position resolves in a freakish 1s. Version 13 couldn't even make it after two days, LC0 in like 30 mins. The thing is that after 1.Qxe5 fe5 2.Rf1 Black is knocked out, but there are a lot of those tempos, so a classically working program probably doesn't stand a chance. Dragon sees nothing, neither does Ethereal. We've also tried mirroring, or maybe with a tower on b1 and form, Stockfish sees it, has it in the net, not just some trick like it used to be with Rebel and the LCTII test. There are some solutions wrong by the way when checked today.
[fen]4q1kr/p6p/1prQPppB/4n3/4P3/2P5/PP2B2P/R5K1 w - - 0 0[/fen]

Fascinating post I thought!
Also see the notes in the commit here https://github.com/official-stockfish/S ... 85813e637a
That is all I know. I'm not sure what the scaling that Ovyron alludes to does with the old net in the new code. The change to using complexity precedes the new net, so I think the new net has nothing to do with that? The real strength of the net is in FRC and DFRC chess, where it gains significantly.
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
-
Eduard
- Posts: 1305
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: Eduard Nemeth
Re: nn-3c0054ea9860
I don't care about FRC!
But I am interested in important positions that occur in normal practice.
Here are two analyses of the diagram position above. PegasusSF is a slightly modified SF version from 090722 (Vondele) with the network nn-3c0aa92af1da.nnue. Below it is the analysis by Stockfish 090722 (Vondele) with the current green net. Before I started the second analysis, I rebooted the PC (to clear the hash).
[fen]4q1kr/p6p/1prQPppB/4n3/4P3/2P5/PP2B2P/R5K1 w - - 0 1[/fen]
Analysis by PegasusSF 090722:
1.Qxe5 fxe5 2.Rf1 Qe7 3.Bd1 Rc4 4.Bb3 b5 5.Kg2 a6 6.a4 g5 7.Rf5 Qg7 8.Bxg7 Kxg7 9.Bxc4 bxc4 10.Kf3 Re8 11.Rxg5+ Kf6 12.Rh5 a5 13.Rxh7 Kxe6 14.Rh6+ Kd7 15.h4 Rf8+ 16.Ke3 Rb8 17.Ra6 Rxb2 18.Rxa5 Ke6 19.Rc5 Rh2 20.Rxc4 Rxh4 21.Kd3 Kd6 22.a5 Rh1 23.Ra4 Rh3+ 24.Kc2
White is clearly winning: +- (7.46 ++) Depth: 39/59 00:00:10 226MN, tb=33651
(Nemeth, MyTown 10.07.2022)
Analysis by Stockfish 090722:
1.Qxe5 fxe5 2.Rf1 Rc8 3.Bd1 Qxe6 4.Bb3 Qxb3 5.axb3 a5 6.h4 Rd8 7.Rf6 Rb8 8.Kh2 Rd8 9.Kg3 Rb8 10.Kg4 Rd8 11.Rf1 Rb8 12.c4 Ra8 13.Kg5 Rb8 14.Kf6 Rf8+ 15.Bxf8 Kxf8 16.Rd1 Ke8 17.Rd6 Rf8+ 18.Kxe5 Rf3 19.Rxb6 Rh3 20.Kd6 Rxh4 21.c5 Rh2 22.Rb8+ Kf7 23.c6 Rd2+ 24.Ke5 Rc2 25.Rb7+ Ke8
White is clearly winning: +- (8.08) Depth: 37/57 00:00:09 196MN, tb=53301
(Nemeth, MyTown 10.07.2022)
Even the old network finds the move very quickly!
But that's not important to me. I like to play in engine tournaments. A few weeks ago, the tournament "WhiteBlackChallengeEngineTournament" was held on InfinityChess. Only 12 variations were allowed (you could add more moves), in each of which White had an advantage of about +0.80. One of the variations was 1. e4 d5, and from there you could add more book moves.
I played the following position in one of the games in one of the test tournaments. I had Black, and my engine played on its own from the second move, without further book moves. I wanted to see what the engine can do. The time control was 12m+2s.
[fen]2kr4/pp1n1pBp/3bp1p1/3p2qP/P7/3B1K2/1PP2P2/R2Q1R2 b - - 0 1[/fen]
My engine thought about this for about two minutes and then played 24...f6! Because I knew the line, and knew that Black had lost pretty much all the games so far after 24...e5?, I was happy about the new move. And indeed, my engine drew the game.
Here is the analysis of Original-Stockfish 090722:
Analysis by Stockfish 090722:
24...e5 25.Ke2 e4 26.Bb5 Ne5 27.Bxe5 Qxe5 28.Qd2 a6 29.Qc3+ Qxc3 30.bxc3 axb5 31.axb5 Kc7 32.Rad1 Bc5 33.Rg1 Rd6 34.Rg5 f5 35.hxg6 hxg6 36.Rdg1 Kb6 37.Rxg6 Rxg6 38.Rxg6+ Kxb5 39.Rf6 Kc4 40.Rxf5 b6 41.Kd2 Bd6 42.Kc1 Bc5 43.Kb2 Bd6 44.Rf6 Bc5 45.Rf4 Bd6 46.Rf5 Be7 47.f4 exf3 48.Rxf3 b5 49.Rf5 Bd6 50.Rf6 Kc5 51.Kb3 Kc6 52.Rg6 Kc5 53.Rg8 Kc6
White is better: +/- (1.17 ++) Depth: 46/64 00:02:25 2388MN, tb=868436
24...f6
White is better: +/- (1.08 ++) Depth: 46/75 00:02:39 2610MN, tb=935503
(Nemeth, MyTown 10.07.2022)
2 min 39s on 18 threads of my Ryzen 3900x to find the right move.
Here I rebooted the PC and analyzed the modified version (4 GB hash, 3456men EGTB):
Analysis by PegasusSF 090722:
24...e5 25.Ke2 e4 26.Bb5 Ne5 27.Bxe5 Qxe5 28.Qd2 a6 29.Qc3+ Qxc3 30.bxc3 axb5 31.axb5 Kd7 32.Rfd1 Ke6 33.c4 dxc4 34.h6 f5 35.Ra7 Rd7 36.Ra8 Re7 37.b6 Bf4 38.Rc8 Bxh6 39.Rxc4 Bf4 40.Rc8 h5 41.c4 h4 42.Rh8 g5 43.c5 Kf6 44.Rb1 e3 45.c6 exf2+ 46.Kxf2 bxc6 47.b7 Bg3+ 48.Kg2 Re2+ 49.Kg1 Rc2 50.b8Q Bxb8 51.Rbxb8 Rc1+ 52.Kf2 Ke5 53.Rh5 Rc2+ 54.Kf3
White is better: +/- (1.25 --) Depth: 42/54 00:00:56 933MN, tb=275118
24...f6
White is better: +/- (1.13 ++) Depth: 42/66 00:01:20 1305MN, tb=437432
24...f6 25.Rh1 gxh5 26.Qg1 Qd2 27.Rxh5 Ne5+ 28.Kg2 Qf4 29.Rh3 Nxd3 30.Rxd3 Qg4+ 31.Kf1 Qxg1+ 32.Kxg1 Rg8 33.Kh1 Rxg7 34.c4 Be5 35.cxd5 exd5 36.Rf1 Bxb2 37.Rd2 Ba3 38.Rxd5 Kc7 39.Rf5 Rg6 40.Kh2 b6 41.Kh3 Kd7 42.Rh5 Rg7 43.Re1 Bd6 44.Re4 Bc5 45.Rd5+ Kc6 46.Rf5 Rg6 47.Re1 Kd7
White is better: +/- (1.08) Depth: 42/66 00:01:21 1325MN, tb=443211
24...f6 25.Rh1 gxh5 26.Qg1 Qd2 27.Rxh5 Ne5+ 28.Kg2 Qf4 29.Rh3 Nxd3 30.Rxd3 Qg4+ 31.Kf1 Qxg1+ 32.Kxg1 Rg8 33.Kh1 Rxg7 34.c4 Be5 35.cxd5 exd5 36.Rf1 Bxb2 37.Rd2 Ba3 38.Rxd5 Kc7 39.Rf5 Rg6 40.Kh2 b6 41.Kh3 Kd7 42.Rh5 Rg7 43.Re1 Bd6 44.Re4 Bc5 45.Rd5+ Kc6 46.Rf5 Rg6 47.Re1 Kd7
White is better: +/- (1.19 --) Depth: 43/66 00:01:30 1471MN, tb=465794
24...f6 25.Rh1 gxh5 26.Qg1 Qd2 27.Rxh5 Ne5+ 28.Kg2 Qf4 29.Rh3 Nxd3 30.Rxd3 Qg4+ 31.Kf1 Qxg1+ 32.Kxg1 Rg8 33.Kh1 Rxg7 34.c4 Be5 35.cxd5 exd5 36.Rf1 Bxb2 37.Rd2 Ba3 38.Rxd5 Kc7 39.Rf5 Rg6 40.Kh2 b6 41.Kh3 Kd7 42.Rh5 Rg7 43.Re1 Bd6 44.Re4 Bc5 45.Rd5+ Kc6 46.Rf5 Rg6 47.Re1 Kd7
White is better: +/- (1.11 ++) Depth: 43/66 00:01:31 1488MN, tb=468715
(Nemeth, MyTown 10.07.2022)
1min 20s! The old network is significantly faster here. There are also other similar positions that I would be happy to show if desired. The new net + engine is still not enough for my current tournament practice. I will prefer to use the older network.
-
sarona
- Posts: 107
- Joined: Tue Oct 29, 2019 4:14 pm
- Location: Canada
- Full name: Ron Doughie
Re: nn-3c0054ea9860
I always scroll down to the bottom of the commits page to check and see what files were revised. Often with updated nets, it is only evaluate.h that is changed. But with nn-3c0054ea9860, evaluate.cpp had the scaling value changed from 1092 to 1064.
That should be a warning sign of possible issues occurring when using older SF binaries with this net.
-
Ovyron
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: nn-3c0054ea9860
All you need to do is, as sarona mentions, search for this line in evaluate.cpp in Blue Marlin's source*:
Code:
int scale = 1092 + 106 * pos.non_pawn_material() / 5120;
and change it to:
Code:
int scale = 1064 + 106 * pos.non_pawn_material() / 5120;
This makes the old net obsolete. This doesn't work for people using illegal closed-source derivatives, who are weakening their engines by only switching to the new net without changing the scale, which is funny.
* EDIT - and if you can't find this piece of code that means Blue Marlin is very outdated!
-
Eduard
- Posts: 1305
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: Eduard Nemeth