Perfect chess engine Elo (32-man TB) can be within 200 of Stockfish in TCEC LTC conditions

mwyoung
Posts: 2725
Joined: Wed May 12, 2010 8:00 pm

Re: Perfect chess engine Elo (32-man TB) can be within 200 of Stockfish in TCEC LTC conditions

Post by mwyoung » Tue Nov 10, 2020 7:01 am

Here we will show how, even in a very common 5-man endgame, Stockfish is clueless about the real evaluation of the position.
This again shows that a type B search is no match for perfect play, and Stockfish is unable to convert the position.



New game Line
3Q4/q7/8/8/3K4/3p4/3P3k/8 w - - 0 1

Analysis by Stockfish 081120:

1.Kxd3 Qa6+ 2.Ke3 Qh6+ 3.Ke4 Qg6+ 4.Ke5 Qg3+ 5.Kd4 Qe1 6.d3 Qd2 7.Kc4 Qc2+ 8.Kb5 Qb2+ 9.Kc6 Qc2+ 10.Kb7 Qg2+ 11.Kb8 Qg3+ 12.Ka7 Qg7+ 13.Ka6 Qg6+ 14.Kb5 Kg3 15.Kc5 Qf5+ 16.Kc6 Kf2 17.Qd4+ Kf1 18.Kc7 Qf7+ 19.Qd7 Qf4+ 20.Qd6 Qf7+ 21.Kc6 Qf3+ 22.Qd5 Qf8 23.Kd7 Qg7+ 24.Ke8 Qh8+ 25.Ke7 Qh4+ 26.Kd7 Qa4+ 27.Kd6 Qf4+ 28.Qe5 Qf8+ 29.Kd7 Qf7+ 30.Kc6 Qf3+ 31.Qe4 Qf6+ 32.Kb5 Qb2+ 33.Kc5 Qa3+ 34.Kd5 Qa8+ 35.Ke5 Qh8+ 36.Kf4 Kf2 37.Qe3+ Kf1
White is slightly better: +/= (0.44) Depth: 54/75 00:02:52 18111MN
(, 10.11.2020)
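
As a quick check on the piece count behind the "5-man" label, the men on the board can be counted directly from the FEN. This is only a minimal sketch: the helper name `count_men` is hypothetical and not taken from any chess library, and the second FEN was written by hand for the position after 1.Kxd3. It reports 6 men for the position as posted and 5 once the d3 pawn has been captured, which is the KQP vs KQ ending the analysis line reaches.

```python
def count_men(fen: str) -> int:
    """Count all pieces (kings included) in the board field of a FEN string."""
    board = fen.split()[0]          # first FEN field is the piece placement
    return sum(ch.isalpha() for ch in board)  # digits and '/' are not pieces


start = "3Q4/q7/8/8/3K4/3p4/3P3k/8 w - - 0 1"
after_kxd3 = "3Q4/q7/8/8/8/3K4/3P3k/8 b - - 0 1"  # hand-written: position after 1.Kxd3

print(count_men(start))       # 6
print(count_men(after_kxd3))  # 5
```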


Results with perfect play!

"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.

Uri Blass
Posts: 8921
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Post by Uri Blass » Tue Nov 10, 2020 11:37 am

mwyoung wrote:
Tue Nov 10, 2020 5:54 am
Uri Blass wrote:
Tue Nov 10, 2020 2:07 am
mwyoung wrote:
Mon Nov 09, 2020 10:28 pm
mwyoung wrote:
Mon Nov 09, 2020 9:46 pm
Alayan wrote:
Mon Nov 09, 2020 8:35 pm
"Very simple endgame positions" where it fails. :lol:

Meanwhile, SF from 1.5 years ago + 6-men got 99.999% of 7-men positions right with a few Gnodes.

Some samples of the very natural and common positions where it failed in Aloril's testing:


Fails to see that 2 same color bishops can't force a win, very common issue in normal games...



Q+P vs 3M, very common sort of position... The suggested move is still a draw under the 50-move rule.



More nonsense with both sides having bishops of the same color.



Very common imbalance, 3N vs Q...

And so on.

I agree! Stockfish is an idiot chess player. Here is another big fail by Stockfish that a 32-man engine would find instantly. Stockfish's play has huge holes, and it would be crushed by perfect play.



Stockfish 081120 - C Line, Blitz 5min+1sec
n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Analysis by Stockfish 081120:

1.Qd6+ Kg8 2.gxh7+ Kh8 3.Qd5 Qf8 4.Ke5 Kxh7 5.Qe4+ Kg8 6.Qxa8 Qc5+ 7.Kf4 Qd6+ 8.Kg4 Kg7 9.Bf6+ Qxf6
The position is equal: = (0.00) Depth: 100/19 00:06:25 11232MN, tb=404542636
(, 09.11.2020)

Even with a 100-ply type B search, Stockfish is totally clueless about this position.

I will give you a hint: 1.Qc8 wins.
Stockfish will never find the win here, so I will give the solution now.

1.Qc8 Kg8 2.Bc7 Qxc8 3.gxf7+ Kh8 4.Be5 Qc5 5.Bb2 Nc7 6.Ba1 a4 7.Bb2 a3 8.Ba1 a2 9.Bb2 a1R 10.Bxa1 Qe5+ 11.Bxe5 Nd5+ 12.Ke6+ Nf6 13.Bxf6#
Maybe you are right about the latest Stockfish, but people have shown that some older versions found the win.
Stockfish and all top programs can easily find the loss after 2.Bc7.

The main problem is finding 2.Bc7, because Stockfish insists on using null move pruning.
I wonder how long we will need to wait for Stockfish to replace it with better pruning rules.

There are certainly positions where Stockfish is blind, but that does not prove that you can force Stockfish into one of them from the opening position. And if the 32-piece tablebase only tries to play for the longest draw in drawn positions and does not try to take advantage of Stockfish's weaknesses, then it is not clear that it can win.

It would be interesting to know whether somebody has already built 5-7 piece tablebases that store, for drawn positions, the longest path to a draw (the distance in moves to the draw), to find out whether such a tablebase really scores better than Stockfish without tablebases against weak engines in random drawn tablebase endgames.
The point being shown is that you will always have blind spots, or an error rate, with a type B search. This will happen in every game. The type B search approximation is fine, but it will result in crushing losses against perfect play.

These positions are no trick, but the result of a type B search. And this error rate does not go away; it becomes worse with more game complexity, as the type B search will need to prune more lines!
It is clear that there are many positions where Stockfish does not find the right move, but the question is not whether such positions exist; it is whether it is possible to reach them from the initial position.

The fact that you show a lot of positions where Stockfish does not find the right move does not prove that it is possible to reach one of them from the opening position.

I guess that it is possible to reach one of them, but I doubt that the strategy of playing the move with the longest path to a draw in drawn positions is enough to do it.

Post by Uri Blass » Tue Nov 10, 2020 11:57 am

Alayan wrote:
Tue Nov 10, 2020 2:23 am
Uri Blass wrote:
Tue Nov 10, 2020 2:07 am
Maybe you are right about the latest Stockfish, but people have shown that some older versions found the win.
Stockfish and all top programs can easily find the loss after 2.Bc7.

The main problem is finding 2.Bc7, because Stockfish insists on using null move pruning.
I wonder how long we will need to wait for Stockfish to replace it with better pruning rules.

There are certainly positions where Stockfish is blind, but that does not prove that you can force Stockfish into one of them from the opening position. And if the 32-piece tablebase only tries to play for the longest draw in drawn positions and does not try to take advantage of Stockfish's weaknesses, then it is not clear that it can win.
Null move pruning is good in almost all relevant chess positions. It's done with a low-depth verification search. There are some general patterns where NMP is more likely to fail, and there it's not used.

Positions where NMP fails are drastically overrepresented among "problem positions" because these positions focus on anti-patterns and on what engines struggle with. Sure, these positions highlight the gap between current top engines and what a true TB-32 would be able to achieve, but how much do they matter when it comes to being able to play well from "reasonable positions"? Rather little, I think.

Of course, we don't have a top engine without massive pruning, because massive pruning is a key contributor to strength, so it's not easy to know how much could be exploited. But you'd then need trillions of nodes to keep depth up.
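
To make the null move discussion concrete, here is a minimal sketch of plain null move pruning on a toy game. It is not Stockfish's implementation: the game (a pile of stones, take 1 or 2, whoever cannot move loses), the reduction `R = 2`, and all names are assumptions chosen for illustration, and the safeguards mentioned above (verification search, skipping NMP in risky patterns) are deliberately left out. A pile of 3 is a zugzwang-like position: every move loses, but passing would win, so the unguarded null move search reports a win where the exact search reports a loss.

```python
R = 2  # null-move depth reduction (assumed value for this toy example)


def negamax(pile, depth, alpha, beta, use_nmp, allow_null=True):
    """Toy subtraction game: take 1 or 2 stones; the side that cannot move loses.

    Scores are from the side to move: +1 win, -1 loss, 0 unknown at the horizon.
    """
    if pile == 0:
        return -1  # no stones left: side to move has lost
    if depth == 0:
        return 0   # search horizon: toy static evaluation knows nothing
    # Null move: pass the turn, search shallower with a null window around beta.
    if use_nmp and allow_null and depth > R:
        score = -negamax(pile, depth - 1 - R, -beta, -beta + 1, use_nmp, allow_null=False)
        if score >= beta:
            return beta  # "even after passing I am fine", so prune this node
    best = -2
    for take in (1, 2):
        if take > pile:
            break
        score = -negamax(pile - take, depth - 1, -beta, -alpha, use_nmp)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # ordinary alpha-beta cutoff
    return best


exact = negamax(3, 6, -1, 1, use_nmp=False)   # full search: pile of 3 is lost
pruned = negamax(3, 6, -1, 1, use_nmp=True)   # unguarded NMP: wrongly reports a win
print(exact, pruned)  # -1 1
```

The toy game is all zugzwang, which is the opposite of a typical chess middlegame, so it says nothing about how often the pruning goes wrong in practice; it only shows the mechanism by which it can.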
Uri Blass wrote:
Tue Nov 10, 2020 2:07 am
It would be interesting to know whether somebody has already built 5-7 piece tablebases that store, for drawn positions, the longest path to a draw (the distance in moves to the draw), to find out whether such a tablebase really scores better than Stockfish without tablebases against weak engines in random drawn tablebase endgames.
That would be an interesting experiment. 6 pieces is the highest feasible, I think; generating distance to the shortest forced draw for 7 pieces would require too much hardware and storage space.
Thinking about it again, I doubt that even 5 pieces is feasible.
The problem is that the longest path to a draw depends on the history, so you cannot have a simple table that gives you a move for every position.

You may find drawn positions where the opponent can force a draw in 1 ply (only by stalemate), and later in 2 plies and 3 plies... but at some point the history becomes relevant.

Here is a simple example:



It is a draw.
The longest line to a draw that Black can force is 8 plies:
1...Kg8 2.Qe8+ Kh7 3.Qh5+ Kg8 4.Qe8+ Kh7 5.Qh5+ draw

The problem is that, if you ignore the history, the longest line before 1...Kg8 is 8 plies, but after 1...Kg8 2.Qe8+ Kh7 3.Qh5+ the longest line is 4 plies, even though the position is the same except for the history. So you cannot give a position a single value. The only way I can see to search for distance to draw is brute force with no pruning, and you cannot build tablebases for all the possible histories.
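
The history dependence can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a tablebase algorithm: positions are reduced to the labels A-D for the four positions of the perpetual in the line above, the draw is taken to occur at the third occurrence of a position, and the function name is made up. The same position A gets two different distances depending on what has already been played.

```python
from collections import Counter


def plies_to_threefold(history, cycle):
    """Plies until some position occurs a third time if play keeps following `cycle`.

    history: position keys seen so far; the last one is the current position.
    cycle:   the positions of the perpetual, in the order they repeat.
    """
    counts = Counter(history)
    idx = cycle.index(history[-1])
    plies = 0
    while max(counts.values()) < 3:
        idx = (idx + 1) % len(cycle)   # play the next move of the perpetual
        counts[cycle[idx]] += 1
        plies += 1
    return plies


# A = position before 1...Kg8, B = after ...Kg8, C = after Qe8+, D = after ...Kh7;
# Qh5+ then leads back to A.
cycle = ["A", "B", "C", "D"]
fresh = plies_to_threefold(["A"], cycle)
after_one_lap = plies_to_threefold(["A", "B", "C", "D", "A"], cycle)
print(fresh, after_one_lap)  # 8 4
```

A table keyed only on the position would have to store a single number for A, which is why the 8 versus 4 above cannot be represented without also keying on the history.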

Post by mwyoung » Tue Nov 10, 2020 3:27 pm

Uri Blass wrote:
Tue Nov 10, 2020 11:37 am
It is clear that there are many positions where Stockfish does not find the right move, but the question is not whether such positions exist; it is whether it is possible to reach them from the initial position.

The fact that you show a lot of positions where Stockfish does not find the right move does not prove that it is possible to reach one of them from the opening position.

I guess that it is possible to reach one of them, but I doubt that the strategy of playing the move with the longest path to a draw in drawn positions is enough to do it.
I guess I will not be seeing you upgrade your engine, since Stockfish 12 plays near-perfect chess. :lol:

I guess you are under the delusion that these type B search errors only happen in the endgame. That is not correct.

We only use these endgame examples because we can prove the errors with perfect play.

JohnWoe
Posts: 280
Joined: Sat Mar 02, 2013 10:31 pm

Post by JohnWoe » Tue Nov 10, 2020 9:09 pm


Mayhem can't see the best move. Null move, probably?

Code: Select all

exclude: none best +tail                                          
dep	score	nodes	time	(not shown:  tbhits	knps	seldep)
 15	+0,25 	149,1M	1:07.00	Qd6+ 
 14	+0,30 	60,6M  	0:26.09	Qd6+ 
 13	+0,26 	21,3M  	0:09.36	Qd6+ 
 12	+0,32 	9,79M  	0:04.48	Qd6+ 
 11	+0,31 	4,54M  	0:02.22	Qd6+ 
 10	+0,50 	1,89M  	0:01.00	Qd6+ 
  9	+0,52 	670535	0:00.42	Qd6+ 
  8	+0,67 	357122	0:00.26	Qd6+ 
  7	+0,26 	155040	0:00.14	Qc8 
  6	+0,21 	66985  	0:00.09	Qc8 
  5	+0,27 	29203  	0:00.06	Qc8 
  4	+0,60 	14625  	0:00.05	Qc8 
  3	+1,18 	2057    	0:00.01	g7+ 
  2	+0,51 	571      	0:00.00	Qxa8 
  1	+0,92 	90        	0:00.00	Qd6+ 
  0	# 
SF12 too...

Code: Select all

42	  0.00 	116,1M	1:45.27	Qd6+ Kg8 gxh7+ Kh8 Bxa5 Qc8 Bd2 Nb6 Qxb6 Qf8 Qb3 Qg7+ Ke7 Qe5+ Kxf7 Qf5+ Ke7 Qxh7+ Qf7 Qe4+ Qe6 Qxe6+ Kxe6 Kg8 Ke5 Kf8 Bb4+ Ke8 Ke6 Kd8 Kd6 Kc8 Kc6 Kb8 Kb6 Kc8 
 41	  0.00 	105,3M	1:36.63	Qd6+ Kg8 gxh7+ Kh8 Bxa5 Qc8 Bd2 Nb6 Qxb6 Qf8 Qb3 Qg7+ Ke7 Qe5+ Kxf7 Qf5+ Ke7 Qxh7+ Qf7 Qe4+ Qe6 Qxe6+ Kxe6 Kg8 Ke5 Kf8 Bb4+ Ke8 Ke6 Kd8 Kd6 Kc8 Kc6 Kb8 Kb6 Kc8 Kc6 
 40	  0.00 	68,0M  	1:05.72	Qd6+ Kg8 gxh7+ Kh8 Bxa5 Qc8 Bd2 Nb6 Qxb6 Qf8 Bf4 Qg7+ Ke7 f6+ Ke6 Qg4+ Kf7 Qd7+ Kxf6 Qg7+ Kf5 Qg6+ Qxg6 
 39	  0.00 	56,4M  	0:55.76	Qd6+ Kg8 gxh7+ Kh8 Bxa5 Qc8 Bd2 Nb6 Qxb6 Qf8 Bf4 Qg7+ Ke7 f6+ Ke6 Qg4+ Kf7 Qd7+ Kxf6 Qg7+ Kf5 Qg6+ Qxg6 
 38	  0.00 	47,1M  	0:47.09	Qd6+ Kg8 gxh7+ Kh8 Bxa5 Qc8 Bd2 Nb6 Qxb6 Qf8 Bf4 Qg7+ Ke7 f6+ Ke6 Qg4+ Kf7 Qd7+ Kxf6 Qg7+ Kf5 Qg6+ Qxg6 
 37	  0.00 	40,7M  	0:39.78	Qd6+ Kg8 gxh7+ Kh8 Bxa5 Qc8 Bd2 Nb6 Qxb6 Qf8 Bf4 Qg7+ Ke7 f6+ Ke6 Qg4+ Kf7 Qd7+ Kxf6 Qg7+ Kf5 Qg6+ Qxg6 
 36	  0.00 	32,4M  	0:31.73	Qd6+ Kg8 gxh7+ Kh8 Bxa5 Qc8 Bd2 Nb6 Qxb6 Qf8 Bf4 Qg7+ Ke7 f6+ Ke6 Qg4+ Kf7 Qd7+ Kxf6 Qg7+ Kf5 Qg6+ Qxg6 
 35	  0.00 	28,9M  	0:28.65	Qd6+ Kg8 gxh7+ Kh8 Bxa5 Qc8 Bd2 Nb6 Qxb6 Qf8 Bf4 Qg7+ Ke7 f6+ Ke6 Qg4+ Kf7 Qd7+ Kxf6 Qg7+ Kf5 Qg6+ Qxg6 
 34	  0.00 	25,9M  	0:26.14	Qd6+ Kg8 gx

Code: Select all

exclude: none best +tail                                          
dep	score	nodes	time	(not shown:  tbhits	knps	seldep)
 12	+0,22 	177,7M	1:18.68	Rh4+ 
 11	+0,30 	40,9M  	0:15.82	Rh4+ 
 10	+0,23 	24,5M  	0:10.15	Rh4+ 
  9	+0,30 	2,32M  	0:01.20	Rh4+ 
  8	+0,33 	802763	0:00.47	Rh4+ 
  7	+0,27 	435471	0:00.29	Rh4+ 
  6	+0,17 	291404	0:00.23	Rh4+ 
  5	+0,27 	133790	0:00.13	Rh4+ 
  4	+0,14 	39837  	0:00.07	Kd1 
  3	+0,18 	5574    	0:00.02	Rg5 
  2	+0,26 	1180    	0:00.00	Rg5 
  1	+0,33 	89        	0:00.00	Qxg7+ 
  0	# 
Mayhem NNUE sees Rh4.

The point is that these engines are optimized for game play. Stockfish would simply avoid these kinds of endgames.

Post by mwyoung » Wed Nov 11, 2020 12:39 am

JohnWoe wrote:
Tue Nov 10, 2020 9:09 pm

The point is that these engines are optimized for game play. Stockfish would simply avoid these kinds of endgames.
Just avoid the endgames. Just like that! :lol:

Here is the problem with your genius plan. The engine cannot avoid the issue. This is not an endgame issue. This type B search issue happens in all positions, and it is worse with even more game complexity.

Endgame examples were just a convenience, as tablebases let us show these type B search errors against perfect play.


Post by Uri Blass » Wed Nov 11, 2020 1:39 am

mwyoung wrote:
Tue Nov 10, 2020 3:27 pm
I guess I will not be seeing you upgrade your engine, since Stockfish 12 plays near-perfect chess. :lol:

I guess you are under the delusion that these type B search errors only happen in the endgame. That is not correct.

We only use these endgame examples because we can prove the errors with perfect play.

No.
I do not claim that the errors only happen in the endgame.
Errors can happen in many positions, but the question is whether the perfect player is always going to force one of them, so that Stockfish loses the match 100-0.

BrendanJNorman
Posts: 2357
Joined: Sun Feb 07, 2016 11:43 pm
Full name: Brendan J Norman

Post by BrendanJNorman » Wed Nov 11, 2020 1:42 am

mwyoung wrote:
Wed Nov 11, 2020 12:39 am
Just avoid the endgames. Just like that! :lol:

Here is the problem with your genius plan. The engine cannot avoid the issue. This is not an endgame issue. This type B search issue happens in all positions, and it is worse with even more game complexity.

Endgame examples were just a convenience, as tablebases let us show these type B search errors against perfect play.
I feel like some people are desperately trying to keep hold of their biases here.

This "avoid the endgames" is nonsensical for 3 reasons:

1. Endgames are the natural result of surviving the middlegame (can we assume Stockfish would survive the middlegame against a 32-man tablebase? :lol: )

2. I suppose this "avoid the endgames" logic is based also on the fact that tablebases are sometimes called "endgame tablebases" - but the thing is, this is *only* because we have so few pieces. With 32-man, it'd just be called "the oracle" or something - the all-seeing God - in ALL positions.

3. As you mentioned, we only argue from the basis of endgames because it is the only field where we can prove with verifiable data (7-man TBs) that Stockfish has no idea. Any person with a decent grasp of a) chess and b) logic can understand that as complexity is introduced and Stockfish makes the inevitable errors, the TB will remain PERFECT, which means, as we have seen, INSTANT announcement of mate in x against Stockfish.

Surprised to see people I consider pretty smart arguing against this obvious point.

Post by BrendanJNorman » Wed Nov 11, 2020 1:49 am

Uri Blass wrote:
Wed Nov 11, 2020 1:39 am

No.
I do not claim that the errors only happen in the endgame.
Errors can happen in many positions, but the question is whether the perfect player is always going to force one of them, so that Stockfish loses the match 100-0.
We are talking about a 32-man tablebase, not necessarily a hypothetical "perfect player" (even though, in essence, this is the same thing).

A tablebase doesn't need to "force" anything, it just reads from the database.

If Stockfish makes ONE inaccuracy, he is theoretically getting mated by force.

Stockfish, even now, isn't strong enough to avoid this outcome because Stockfish himself isn't close to perfect chess.

Put it this way, if we threw Magnus Carlsen into a time machine and sent him back to 1857, guys like Paul Morphy would spout about him playing "perfect chess" - but we KNOW Carlsen would be hammered 100-0 even against old versions of Stockfish.

Now let's go 150 years INTO the future where we have 32-man tablebases (at this point even 9-10 man TBs seem inconceivable)...

...do you think people will be talking about Stockfish 12?

Post by BrendanJNorman » Wed Nov 11, 2020 1:57 am

The premise of this thread is silly, guys, come on.

Stockfish is within 200 Elo of a 32 man tablebase (or "perfect chess")?

So the premise really is that given SF is around 3700 CCRL...

...a 32 man TB (or flawless, solved chess) is just 3900 CCRL.

The future will prove this to us (very soon I imagine since the NNUE revolution has just begun) and strongly imply that "perfect chess" may well be 5000 Elo or beyond.

The fact that they are still adding Elo to Stockfish via patches proves this point.

If Stockfish were ANYWHERE NEAR perfect chess, adding Elo would be like squeezing blood from a stone.
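
For scale, the rating gaps mentioned in this post can be turned into expected scores with the standard logistic Elo formula, E = 1 / (1 + 10^(-d/400)). The sketch below is only illustrative (the function name is made up, and the ratings are the ones quoted above): a 200-point gap such as 3700 vs 3900 corresponds to roughly a 76% expected score for the stronger side, while 3700 vs 5000 corresponds to about 99.9%. A literal 100-0 result has no finite rating gap under this model.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score of the higher-rated side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


print(round(expected_score(3900 - 3700), 3))  # 0.76
print(round(expected_score(5000 - 3700), 4))  # 0.9994
```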
