Nolot position 9, the one that modern programs "get wrong":
  9 Ng5; id "Position 9" 
    Searching move: Nf3-g5
    Best move (Lc0-v0.31.2:MPV:10): Nxh6
   ---------------------------------------------------------------------------
    20/60	59:58	      9,666k	3k	-0.38	Ra1  Qb3  Ne1  Rac8  Bxc6  Rxc6  Rxa5  Rfc8  Ra7  R6c7  Rxc7  Rxc7  Nf6  Ra7  Rc1  Bf8  Nxh7  Kxh7  Rc4  Ra1  Rxd4  Rb1  Rd8
    20/60	59:58	      9,666k	3k	-0.24	Bg5  hxg5  Nxg5  Bg6  Bxc6  Rac8  Ra1  Qb3  Bb5  Qd5  Bc4  Qa8  Nh2  a4  Qg4  Bf5  Qf4  a3
    20/60	59:58	      9,666k	3k	-0.64	Nd2  Bxd3  Qf3  Qd5  Qxd3  Qxg2+  Kxg2  Nxd3  Rxc6  Nxb2  Rb1  Na4  Nb3  Nc3  Ra1  a4  Nxd4  Rfd8  Rc4  Rac8  Rxc8  Rxc8
    20/60	59:58	      9,666k	3k	+0.22	Nfh2  Rac8  Nf6  Bg6  Rxc5  Bxc5  Bxh6  gxh6  Nhg4  Kg7  h5  Ne7  hxg6  fxg6  Bb7  Rc7  Be4
    20/60	59:58	      9,666k	3k	+0.19	Ne1  Rac8  Nf6  Bg6  h5  Bf5  g4  Nxe5  Bxe5  gxf6  gxf5  fxe5  Qxe5+  f6  Qxd4  Nb3  Qe3  Nxc1  Qxh6+  Kg8  Qg6+  Kh8  Rxc1  Rxc1  Qh6+  Kg8  Qg6+  Kh8
    20/60	59:58	      9,666k	3k	+1.06	Rxc5  Bxc5  Nxh6  Be7  Ng5  Bxg5  Bxg5  Nxe5  Bxa8  f6  Bg2  a4  Bf4  a3  Bxe5  fxe5  Qxe5  axb2  Qd6  Re8  Qd7  Bg6  Nf7+  Bxf7  Qxf7  Rc8  Be4  b1Q  Rxb1  Qxb1+  Kg2  Qc1  Qxe6  Qc5  h5  Qf8  Qg6  Qg8  h6  Rf8
    20/60	59:58	      9,666k	3k	+1.57	Ng5  hxg5  hxg5  Qb3  Nf6  gxf6  exf6  Nxd3  Qh5  Nxc1  Rd2  Rg8  Bxc6  Ne2+  Rxe2  Qd1+  Kg2  d3  Be4  Rg6  Bxg6  fxg6  Be5  Qxe2  Qxe2  Bxf6  Bxf6+  Kg8  Qxe6+
    20/60	59:58	      9,666k	3k	+4.11	Bxh6  gxh6  Qd2  Rg8  Nxh6  Rg7  Rxc5  Bxc5  Ng5  Bg6  Bxc6  Rc8  Be4  Be7  Bxg6  Bxg5  Qxg5  Rxg6  Nxf7+  Kg7  Qe7  Kg8  Ng5  Rg7  Qf6  Rf8  Qxe6+  Qxe6  Nxe6  Re8  Nxg7  Rxe5  f4  Re2  Nf5  Rxb2  Ra1  b3  Nxd4  a4  Rxa4
    20/60	59:58	      9,666k	3k	+4.74	Nf6  Rac8  Bxh6  Bf5  g4  Bg6  h5  gxh6  hxg6  fxg6  g5  Kg7  Nh4  hxg5  Nxg6  Nxe5  Nxe7  Kxf6  Nxc8  Rxc8  Bb7  Nxb7  Rxc8  Nd6  Rc5  Ng6  Re1  Nf7  Qe4  Nf4  Qxd4+  Kg6  Kh2  Qb3
    20/60	59:58	      9,666k	3k	+5.79	Nxh6  gxh6  Bxh6  Qd5  Bg5  Ra7  Bxe7  Rxe7  Ng5  Qxe5  Qh5  f5  Bxc6  Qd6  Be8  Nd7  Bxd7  Rxd7  Ra1  e5  Rxa5  e4  Rda1  Qg6  Qe2  exd3  Qe5+  Qf6  Qb5  Qe7  Qxd3  f4  Nxh7  Qxh7  Qf3  Qf7  Rh5+  Kg8  Raa5  Qe6  Rae5
   3/15/2025 5:40:01 PM, Time for this analysis: 01:00:00, Rated time: 9:00:00
Here, I must say, is the one clear error in the test suite.  But given that today's hardware/software combinations are a trillion times stronger than those available when this test was written, it is no wonder the error was not found earlier, and no wonder Pierre could not physically validate the perfection of the entire test.
The suggested best move is Ng5.  Ng5 does come out about a pawn and a half ahead (+1.57).
But Bxh6 is far better, being four pawns ahead (+4.11).
And Nf6 is even better, being almost a rook ahead (+4.74).
And Nxh6 is the best (+5.79), a pawn better than the next best move.
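For anyone who wants to check the arithmetic, here is a minimal sketch that ranks the ten candidates by their one-hour score. The score/move pairs are copied by hand from the listing above; the `rank_moves` helper and the trimmed two-column input are my own assumptions for illustration, not something the analysis GUI produces.

```python
import re

# Score and first PV move for each MultiPV line, copied by hand from the
# one-hour analysis above (the two-column layout is an assumption made here).
ANALYSIS = """\
-0.38 Ra1
-0.24 Bg5
-0.64 Nd2
+0.22 Nfh2
+0.19 Ne1
+1.06 Rxc5
+1.57 Ng5
+4.11 Bxh6
+4.74 Nf6
+5.79 Nxh6
"""

def rank_moves(text):
    """Return (move, score) pairs sorted from best to worst score."""
    pairs = []
    for line in text.splitlines():
        m = re.match(r"\s*([+-]\d+\.\d+)\s+(\S+)", line)
        if m:
            pairs.append((m.group(2), float(m.group(1))))
    return sorted(pairs, key=lambda p: p[1], reverse=True)

ranked = rank_moves(ANALYSIS)
scores = dict(ranked)
best_move, best_score = ranked[0]

# Margin of the top move over the runner-up and over the suite's solution.
gap_to_second = round(best_score - ranked[1][1], 2)
gap_to_suite_move = round(best_score - scores["Ng5"], 2)

for move, score in ranked:
    print(f"{move:5s} {score:+.2f}")
print("best:", best_move, "margin over next:", gap_to_second)
print("margin over Ng5:", gap_to_suite_move)
```

Sorting by score puts Nxh6, Nf6, and Bxh6 ahead of the suite's Ng5, which is all the "busted" claim rests on.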
The discovery of these things relied on hard work developing better and better computer systems, today including incredible parallelism and even GPU compute power, which was used to make this particular analysis.
As you know, advanced computer programs are said to have GM level knowledge in their evaluation function, so that just with eval they can play at GM level.  And look at what LC0 did when it started out:
 9 Ng5; id "Position 9" 
    Searching move: Nf3-g5
    Best move (Lc0-v0.31.2:MPV:10): Rc1-a1
    Not found in: 1:00:00
     1/2	00:00	           4	114	+3.58	Ra1
     1/2	00:00	           4	114	+3.58	h5
     1/2	00:00	           4	114	+3.58	Nd2
     1/2	00:00	           4	114	+3.58	Nfh2
     1/2	00:00	           4	114	+3.58	Rxc5
     1/2	00:00	           4	114	+3.58	Ne1
     1/2	00:00	           4	114	+3.58	Ng5
     1/2	00:00	           4	114	+3.32	Nf6  Rac8
     1/2	00:00	           4	114	+3.47	Nxh6  gxh6
     1/2	00:00	           4	114	+4.33	Bxh6  gxh6
   ------------------------------------------------------------------------
It scored Ng5 equally among the best possible move choices (edit: oops, I just noticed that even here Bxh6 was a little ahead of the other choices).
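To see how far the first-iteration scores drifted after an hour of search, here is a rough sketch that subtracts one from the other. Both tables are typed in by hand from the two listings above, and the dictionary names are hypothetical. Moves that appear in only one listing (h5 and Bg5) are skipped.

```python
# First-iteration (1/2) scores from the early output above.
shallow = {
    "Ra1": 3.58, "h5": 3.58, "Nd2": 3.58, "Nfh2": 3.58, "Rxc5": 3.58,
    "Ne1": 3.58, "Ng5": 3.58, "Nf6": 3.32, "Nxh6": 3.47, "Bxh6": 4.33,
}

# Scores after one hour (20/60) from the first listing.
deep = {
    "Ra1": -0.38, "Bg5": -0.24, "Nd2": -0.64, "Nfh2": 0.22, "Ne1": 0.19,
    "Rxc5": 1.06, "Ng5": 1.57, "Bxh6": 4.11, "Nf6": 4.74, "Nxh6": 5.79,
}

# Change in score for every move present in both listings.
delta = {m: round(deep[m] - shallow[m], 2) for m in shallow if m in deep}

# Print from the biggest gain to the biggest loss.
for move, change in sorted(delta.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{move:5s} {shallow[move]:+.2f} -> {deep[move]:+.2f}  ({change:+.2f})")
```

Only Nxh6 and Nf6 gain with search. Everything else falls, Ng5 included.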
So, in my opinion, this one is busted, but it has a new best move Nxh6.
So the defect rate for the test is 1/11 = 9%, which is better than STS.  I helped to verify STS using three different computer programs (the three strongest) at one hour per position for every position in the test (along with at least as many that were rejected because the computer(s) disagreed with the proposed solution).
This is the great peril of a positional test: they are called positional for a reason.  The goodness or badness of a position depends mostly upon strategic factors of positional strength and not merely on the collection of wood.  At the time such tests are written, it is very hard for a computer to find the best move via tactics.  For that reason, I think it is far harder to write a positional test than a tactical test.  But even tactical tests can have cooks, like the famous WAC.230, which Alex Szabo showed was a draw even with the rook sacrifice by creating the opponent's own passed pawn in response.

Taking ideas is not a vice, it is a virtue.  We have another word for this.  It is called learning.
But sharing ideas is an even greater virtue.   We have another word for this.  It is called teaching.