Nolot position 9, the one that modern programs "get wrong":
9 Ng5; id "Position 9"
Searching move: Nf3-g5
Best move (Lc0-v0.31.2:MPV:10): Nxh6
---------------------------------------------------------------------------
20/60 59:58 9,666k 3k -0.38 Ra1 Qb3 Ne1 Rac8 Bxc6 Rxc6 Rxa5 Rfc8 Ra7 R6c7 Rxc7 Rxc7 Nf6 Ra7 Rc1 Bf8 Nxh7 Kxh7 Rc4 Ra1 Rxd4 Rb1 Rd8
20/60 59:58 9,666k 3k -0.24 Bg5 hxg5 Nxg5 Bg6 Bxc6 Rac8 Ra1 Qb3 Bb5 Qd5 Bc4 Qa8 Nh2 a4 Qg4 Bf5 Qf4 a3
20/60 59:58 9,666k 3k -0.64 Nd2 Bxd3 Qf3 Qd5 Qxd3 Qxg2+ Kxg2 Nxd3 Rxc6 Nxb2 Rb1 Na4 Nb3 Nc3 Ra1 a4 Nxd4 Rfd8 Rc4 Rac8 Rxc8 Rxc8
20/60 59:58 9,666k 3k +0.22 Nfh2 Rac8 Nf6 Bg6 Rxc5 Bxc5 Bxh6 gxh6 Nhg4 Kg7 h5 Ne7 hxg6 fxg6 Bb7 Rc7 Be4
20/60 59:58 9,666k 3k +0.19 Ne1 Rac8 Nf6 Bg6 h5 Bf5 g4 Nxe5 Bxe5 gxf6 gxf5 fxe5 Qxe5+ f6 Qxd4 Nb3 Qe3 Nxc1 Qxh6+ Kg8 Qg6+ Kh8 Rxc1 Rxc1 Qh6+ Kg8 Qg6+ Kh8
20/60 59:58 9,666k 3k +1.06 Rxc5 Bxc5 Nxh6 Be7 Ng5 Bxg5 Bxg5 Nxe5 Bxa8 f6 Bg2 a4 Bf4 a3 Bxe5 fxe5 Qxe5 axb2 Qd6 Re8 Qd7 Bg6 Nf7+ Bxf7 Qxf7 Rc8 Be4 b1Q Rxb1 Qxb1+ Kg2 Qc1 Qxe6 Qc5 h5 Qf8 Qg6 Qg8 h6 Rf8
20/60 59:58 9,666k 3k +1.57 Ng5 hxg5 hxg5 Qb3 Nf6 gxf6 exf6 Nxd3 Qh5 Nxc1 Rd2 Rg8 Bxc6 Ne2+ Rxe2 Qd1+ Kg2 d3 Be4 Rg6 Bxg6 fxg6 Be5 Qxe2 Qxe2 Bxf6 Bxf6+ Kg8 Qxe6+
20/60 59:58 9,666k 3k +4.11 Bxh6 gxh6 Qd2 Rg8 Nxh6 Rg7 Rxc5 Bxc5 Ng5 Bg6 Bxc6 Rc8 Be4 Be7 Bxg6 Bxg5 Qxg5 Rxg6 Nxf7+ Kg7 Qe7 Kg8 Ng5 Rg7 Qf6 Rf8 Qxe6+ Qxe6 Nxe6 Re8 Nxg7 Rxe5 f4 Re2 Nf5 Rxb2 Ra1 b3 Nxd4 a4 Rxa4
20/60 59:58 9,666k 3k +4.74 Nf6 Rac8 Bxh6 Bf5 g4 Bg6 h5 gxh6 hxg6 fxg6 g5 Kg7 Nh4 hxg5 Nxg6 Nxe5 Nxe7 Kxf6 Nxc8 Rxc8 Bb7 Nxb7 Rxc8 Nd6 Rc5 Ng6 Re1 Nf7 Qe4 Nf4 Qxd4+ Kg6 Kh2 Qb3
20/60 59:58 9,666k 3k +5.79 Nxh6 gxh6 Bxh6 Qd5 Bg5 Ra7 Bxe7 Rxe7 Ng5 Qxe5 Qh5 f5 Bxc6 Qd6 Be8 Nd7 Bxd7 Rxd7 Ra1 e5 Rxa5 e4 Rda1 Qg6 Qe2 exd3 Qe5+ Qf6 Qb5 Qe7 Qxd3 f4 Nxh7 Qxh7 Qf3 Qf7 Rh5+ Kg8 Raa5 Qe6 Rae5
3/15/2025 5:40:01 PM, Time for this analysis: 01:00:00, Rated time: 9:00:00
Here, I must say, is the one clear error in the test suite. But with hardware/software combinations that are a trillion times stronger than when this test was written, it is no wonder the error was not found back then, and no wonder Pierre could not physically validate the perfection of the entire test.
The suggested best move is Ng5. Ng5 does come out a pawn and a half ahead (+1.57).
But Bxh6 is far better, being four pawns ahead (+4.11).
But Nf6 is even better, being almost a rook ahead (+4.74).
And Nxh6 is the best (+5.79), a pawn better than the next best move.
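The ranking above can be checked mechanically. Here is a minimal Python sketch that reads the final analysis lines, cut down to the first move of each line, and sorts them by score. The field order (depth/seldepth, time, nodes, speed, eval, then the line of play) is my assumption from reading the output, and `parse_line` and `rank_moves` are hypothetical helper names, not part of any engine or GUI.

```python
# Final MultiPV lines from the one-hour run, each PV cut to its first move.
# Assumed field order: depth/seldepth, time, nodes, speed, eval, PV.
FINAL_LINES = """\
20/60 59:58 9,666k 3k -0.38 Ra1
20/60 59:58 9,666k 3k -0.24 Bg5
20/60 59:58 9,666k 3k -0.64 Nd2
20/60 59:58 9,666k 3k +0.22 Nfh2
20/60 59:58 9,666k 3k +0.19 Ne1
20/60 59:58 9,666k 3k +1.06 Rxc5
20/60 59:58 9,666k 3k +1.57 Ng5
20/60 59:58 9,666k 3k +4.11 Bxh6
20/60 59:58 9,666k 3k +4.74 Nf6
20/60 59:58 9,666k 3k +5.79 Nxh6
"""


def parse_line(line):
    """Return (first_move, eval_in_pawns) for one analysis line."""
    fields = line.split()
    return fields[5], float(fields[4])


def rank_moves(text):
    """Sort the candidate moves from best to worst by evaluation."""
    pairs = [parse_line(line) for line in text.splitlines() if line.strip()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


ranked = rank_moves(FINAL_LINES)
best_move, best_eval = ranked[0]
for move, score in ranked:
    # Show each move's score and its gap to the top choice.
    print(f"{move:5s} {score:+.2f}  ({score - best_eval:+.2f} vs {best_move})")
```

Sorting this way puts Nxh6 first, then Nf6, then Bxh6, with the suite's answer Ng5 only fourth.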
The discovery of these things relied on hard work developing better and better computer systems, today including incredible parallelism and even GPU compute power, which was used to make this particular analysis.
As you know, advanced computer programs are said to have GM level knowledge in their evaluation function, so that just with eval they can play at GM level. And look at what LC0 did when it started out:
9 Ng5; id "Position 9"
Searching move: Nf3-g5
Best move (Lc0-v0.31.2:MPV:10): Rc1-a1
Not found in: 1:00:00
1/2 00:00 4 114 +3.58 Ra1
1/2 00:00 4 114 +3.58 h5
1/2 00:00 4 114 +3.58 Nd2
1/2 00:00 4 114 +3.58 Nfh2
1/2 00:00 4 114 +3.58 Rxc5
1/2 00:00 4 114 +3.58 Ne1
1/2 00:00 4 114 +3.58 Ng5
1/2 00:00 4 114 +3.32 Nf6 Rac8
1/2 00:00 4 114 +3.47 Nxh6 gxh6
1/2 00:00 4 114 +4.33 Bxh6 gxh6
------------------------------------------------------------------------
It scored Ng5 level with most of the other candidate moves (edit: oops, I just noticed that even here Bxh6 was a little ahead of the other choices).
So, in my opinion, this one is busted, but it has a new best move: Nxh6.
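To see how far the search moved each candidate away from the starting evaluation, here is a small Python sketch comparing the two listings. The numbers are copied by hand from the output above; `drift` is just a hypothetical helper name, and moves that appear in only one listing (h5 and Bg5) are skipped.

```python
# Evaluations in pawns, copied from the two listings:
# the first reported iteration versus the end of the one-hour run.
FIRST = {"Ra1": 3.58, "h5": 3.58, "Nd2": 3.58, "Nfh2": 3.58, "Rxc5": 3.58,
         "Ne1": 3.58, "Ng5": 3.58, "Nf6": 3.32, "Nxh6": 3.47, "Bxh6": 4.33}
FINAL = {"Ra1": -0.38, "Bg5": -0.24, "Nd2": -0.64, "Nfh2": 0.22, "Ne1": 0.19,
         "Rxc5": 1.06, "Ng5": 1.57, "Bxh6": 4.11, "Nf6": 4.74, "Nxh6": 5.79}


def drift(first, final):
    """Map each move present in both listings to (final eval - first eval)."""
    return {move: round(final[move] - first[move], 2)
            for move in first if move in final}


changes = drift(FIRST, FINAL)
for move, delta in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{move:5s} {FIRST[move]:+.2f} -> {FINAL[move]:+.2f}  ({delta:+.2f})")
```

Only Nxh6 and Nf6 gain from the search; Bxh6 stays roughly where it started, and everything else, Ng5 included, drops by two pawns or more.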
So the defect rate for the test is 1/11 ≈ 9%, which is better than STS. I helped to verify STS using three different computer programs (the three strongest) at one hour per position for every position in the test (along with at least as many that were rejected because the computer(s) disagreed with the proposed solution).
This is the great peril of positional tests: they are called positional for a reason. The goodness or badness of a position depends mostly upon strategic factors of positional strength and not merely on the collection of wood. At the time they are written, it is very hard for a computer to find the best move via tactics. For that reason, I think it is far harder to write a positional test than a tactical test. But even tactical tests can have cooks, like the famous WAC.230, which Alex Szabo showed was a draw even with the rook sacrifice, by creating the opponent's own passed pawn in response.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.