Perfect chess engine Elo (32-man TB) can be within 200 of Stockfish in TCEC LTC conditions

Alayan · Post by **Alayan** » Mon Nov 09, 2020 9:35 pm

"Very simple endgame positions" where it fails.

Meanwhile, SF from 1.5 years ago + 6-men got 99.999% of 7-men positions right with a few Gnodes.

A sample of the very natural and common positions where it failed in Aloril's testing:
[d]1K6/5b2/8/3k4/b4B2/2B5/3r4/8 b - - 0 1

Fails to see that 2 same color bishops can't force a win, very common issue in normal games...

[d]4N1q1/6B1/p1k5/8/8/4K2B/8/8 w - - 0 1

Q+P vs 3M, very common sort of position... The suggested move is still a draw under the 50 moves rule.

[d]3B4/B5R1/b7/2k5/7K/8/4b3/8 b - - 0 1

More nonsense with both sides having bishops of the same color.

[d]3N4/6k1/8/8/K3N2p/8/8/3N2q1 w - - 0 1

Very common imbalance, 3N vs Q...

And so on.

duncan · Post by **duncan** » Mon Nov 09, 2020 10:17 pm

BrendanJNorman wrote: ↑Mon Nov 09, 2020 5:42 pm
duncan wrote: ↑Mon Nov 09, 2020 5:03 pm
mwyoung wrote: ↑Mon Nov 09, 2020 4:33 pm
duncan wrote: ↑Mon Nov 09, 2020 4:26 pm
mwyoung wrote: ↑Mon Nov 09, 2020 4:12 pm
duncan wrote: ↑Mon Nov 09, 2020 3:56 pm
mwyoung wrote: ↑Mon Nov 09, 2020 2:32 pm
duncan wrote: ↑Mon Nov 09, 2020 12:33 pm
BrendanJNorman wrote: ↑Mon Nov 09, 2020 12:03 pm

Right. I don't know why others aren't seeing this (obvious) logic.

A real 32 man tablebase already knows the outcome of the game in the beginning, so what the hell is a "complication trick" (which someone mentioned) going to do to save Stockfish?

What will happen, in *EVERY* game, is this:

1. Players shuffle out opening moves.
2. Stockfish gets an edge (if white, maybe).
3. Stockfish makes ONE...MINOR inaccuracy.
4. 32 man TB announces mate in 56.
5. Stockfish is mated in 31 because without tablebases, he chooses an imprecise path to mate (or getting mated in his case).

100-0 match result.

As you said, SF is clueless in 6 man positions, he'd be clueless and defenseless against a 32 man TB engine.
What happens if Stockfish's evaluation in regular chess is so good that it does not make significant inaccuracies that lead to a loss against a 32-man TB?

In that case, you would have to target Stockfish's weak point: it cannot see beyond 15 moves. The job of the 32-man TB would be to steer the game into positions where it is essential to calculate beyond 15 moves.
Why would you think this is possible? And this is regular chess, when Stockfish cannot show this kind of accuracy even in endgames.
I thought that in 99% of positions with 7 or fewer men, Stockfish would draw all drawable and win all winnable positions when playing against a tablebase.
Where did you find this B.S. stat? And how does this keep Stockfish from losing all games against a 32-man TB? Stockfish would need to show 100% in 7-man positions to even have a chance.

But by your stat, Stockfish would be totally wrong in 4,238,368,356,673 positions of the 7-man TB. This is not a small hole, but a rip you could drive a truck through.
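
For scale, the figure above can be compared with the 99.999% accuracy Alayan reports for 7-man positions. The sketch below is only arithmetic: the 7-man total is not stated in the thread and is back-derived here from the quoted number (taken as 1% of the total), and the helper name `misjudged_positions` is made up for illustration.

```python
from fractions import Fraction

# Assumption: the quoted 4,238,368,356,673 is 1% of the 7-man position count,
# so the total is reconstructed from it (the thread never states the total).
ONE_PERCENT_OF_7MAN = 4_238_368_356_673
total_7man = ONE_PERCENT_OF_7MAN * 100


def misjudged_positions(total: int, accuracy: Fraction) -> int:
    """Number of positions judged wrongly at the given accuracy, rounded down."""
    return int(total * (1 - accuracy))


# Exact fractions avoid floating-point noise on numbers this large.
at_99 = misjudged_positions(total_7man, Fraction(99, 100))
at_99999 = misjudged_positions(total_7man, Fraction(99_999, 100_000))

print(f"wrong at 99%:     {at_99:,}")
print(f"wrong at 99.999%: {at_99999:,}")
```

Under these assumptions the 99.999% figure still leaves a few billion misjudged positions, about a thousand times fewer than the 99% guess.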

I did not find this stat anywhere. It is just an impression, which could easily be wrong. Can anybody give accurate stats?

But if, in 99% of positions with 32 or fewer men, Stockfish draws all drawable and wins all winnable positions when playing against a tablebase, would that not mean it would score a lot of draws in matches against a 32-man TB?
Why on earth would you think that your fictional 99% stat will hold up as the game gets more complicated? If Stockfish can only be right 99% of the time in simplified endgames, that will clearly not be true in 8-man, 16-man, or 32-man positions.

But do the math. If Stockfish has a 1% chance of making a losing move on every move of a game, this does not help your argument, but mine!
Your main point is right, but if Stockfish has a 1% chance of making a losing move on every move of a game and an average game is 100 moves, would there not be no losing moves in every fourth or fifth game or so? If correct, it would mean 20-25% draws against a 32-man tablebase.
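
The estimate above can be checked with a short calculation. This is a rough sketch, assuming each move is an independent trial with the same 1% chance of a losing move (a simplification nobody in the thread commits to); the function name `p_no_losing_move` is invented for the example.

```python
def p_no_losing_move(per_move_error: float, moves: int) -> float:
    """Chance of getting through `moves` moves without a single losing move,
    assuming independent moves with a constant error rate."""
    return (1.0 - per_move_error) ** moves


# 100 moves by Stockfish at a 1% error rate per move.
p100 = p_no_losing_move(0.01, 100)
# If only about 50 of the 100 moves are Stockfish's own.
p50 = p_no_losing_move(0.01, 50)

print(f"no losing move in 100 moves: {p100:.1%}")
print(f"no losing move in 50 moves:  {p50:.1%}")
```

Under this simple model the error-free share comes out at roughly 37% for 100 moves and about 60% for 50 moves, so the 20-25% guess is on the pessimistic side. The model says nothing about whether a perfect opponent could steer the game toward positions where the per-move error rate is much higher.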
Let me make this idea simple for you.

Kai is a statistician, he is very good at compiling data and making predictions about the future.

Probably one of the brightest minds here when it comes to such things.

In his field, and with the right data, it may appear like he can PREDICT THE FUTURE.

So here's my question:

Say we placed Kai head-to-head with God himself in a contest to predict the future...

Who do you think would come out on top when both were given very complicated (chess-level) data to make predictions from?

Here's the answer: God wins EVERY time, because God knows the answers without even thinking. His knowledge is omnipotent.

This is Stockfish vs 32 man tablebase as well. Stockfish is very good, but TBs are perfect and omnipotent.

Who would win a fight? 1985 Mike Tyson or an Abrams tank?

Who swims faster? Mike Phelps or a sailfish?

There is no contest and comparisons seem ridiculous.

I believe Tinsley would have scored a number of draws against checkers database, even if you describe it as God.
https://www.theatlantic.com/technology/ ... rs/534111/

How Checkers Was Solved

The story of a duel between two men, one who dies, and the nature of the quest to build artificial intelligence

mmt · Post by **mmt** » Mon Nov 09, 2020 10:37 pm

Alayan wrote: ↑Mon Nov 09, 2020 8:57 pm There are many many positions where Stockfish with a certain minimum amount of nodes will NEVER play a losing move, and a much smaller subset from which it would draw or win against arbitrary play (no matter the move played by the other side, no matter the randomness from many threads, it would never lose).

To maximize winning chances, the perfect TB player must do its best to avoid such positions, and find positions where the SF error rate is >0%.

I mostly agree with Alayan. Maybe there is an algorithm that will allow a 32-piece TB to win playing black against SF but if you just choose random non-losing moves from the TB, you won't win 100% of games. I don't think the longest distance to a draw would do it, especially if SF can learn from previous games and generate an opening book based on them.

mwyoung · Post by **mwyoung** » Mon Nov 09, 2020 10:46 pm

Alayan wrote: ↑Mon Nov 09, 2020 9:35 pm "Very simple endgame positions" where it fails.

Meanwhile, SF from 1.5 years ago + 6-men got 99.999% of 7-men positions right with a few Gnodes.

Some sample of the very natural and common positions where it failed in Aloril's testing :
[d]1K6/5b2/8/3k4/b4B2/2B5/3r4/8 b - - 0 1

Fails to see that 2 same color bishops can't force a win, very common issue in normal games...

[d]4N1q1/6B1/p1k5/8/8/4K2B/8/8 w - - 0 1

Q+P vs 3M, very common sort of position... The suggested move is still a draw under the 50 moves rule.

[d]3B4/B5R1/b7/2k5/7K/8/4b3/8 b - - 0 1

More nonsense with both sides having bishops of the same color.

[d]3N4/6k1/8/8/K3N2p/8/8/3N2q1 w - - 0 1

Very common imbalance, 3N vs Q...

And so on.

I agree! Stockfish is an idiot chess player. Here is another big fail by Stockfish that a 32-man engine would find instantly. Stockfish's play has huge holes, and it would be crushed by perfect play.

[d]n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Stockfish 081120 - C Line, Blitz 5min+1sec
n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Analysis by Stockfish 081120:

1.Qd6+ Kg8 2.gxh7+ Kh8 3.Qd5 Qf8 4.Ke5 Kxh7 5.Qe4+ Kg8 6.Qxa8 Qc5+ 7.Kf4 Qd6+ 8.Kg4 Kg7 9.Bf6+ Qxf6
The position is equal: = (0.00) Depth: 100/19 00:06:25 11232MN, tb=404542636
(, 09.11.2020)

Even with a 100-ply type B search, Stockfish is totally clueless about this position.

I will give you a hint: 1.Qc8 wins.

Ajedrecista · Post by **Ajedrecista** » Mon Nov 09, 2020 11:06 pm

Hello:

duncan wrote: ↑Mon Nov 09, 2020 10:17 pmI believe Tinsley would have scored a number of draws against checkers database, even if you describe it as God.
https://www.theatlantic.com/technology/ ... rs/534111/

How Checkers Was Solved

The story of a duel between two men, one who dies, and the nature of the quest to build artificial intelligence

Tinsley's moment of 'You are going to regret that!' was massive. The game was played on Thursday, December 13, 1990 as an exhibition game and, to be fair, Chinook at that time was far from the perfect status it attained in 2007. Today top programs KingsRow and Cake can see the losing move 10.- ..., 32-28? in very little time with the 8-piece EGDB on my old hardware, not even using the latest versions.

A report of the game can be read in the great book 'One Jump Ahead. Computer Perfection at Checkers' by Jonathan Schaeffer (pages 188 to 192 of the Revised edition). This book uses a kind of algebraic notation (b6-a5 a3-b4) which is common in Russian draughts but not in English draughts, where the squares of the board are numbered from 1 to 32, which is why I wrote 10.- ..., 28-32? instead of 10.- ..., g1-h2?

Regards from Spain.

Ajedrecista.

Ajedrecista · Post by **Ajedrecista** » Mon Nov 09, 2020 11:25 pm

Hello:

mwyoung wrote: ↑Mon Nov 09, 2020 10:46 pmI agree! Stockfish is a idiot chess player. Here is another big fail by Stockfish that a 32 man engine would find instantly. Stockfish play has huge holes, and would be crushed with perfect play.

[d]n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Stockfish 081120 - C Line, Blitz 5min+1sec
n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Analysis by Stockfish 081120:

1.Qd6+ Kg8 2.gxh7+ Kh8 3.Qd5 Qf8 4.Ke5 Kxh7 5.Qe4+ Kg8 6.Qxa8 Qc5+ 7.Kf4 Qd6+ 8.Kg4 Kg7 9.Bf6+ Qxf6
The position is equal: = (0.00) Depth: 100/19 00:06:25 11232MN, tb=404542636
(, 09.11.2020)

Even with a 100 ply type B search. Stockfish is totally clueless about this position.

I will give you a hint 1.Qc8 wins.

I remember seeing this problem for the first time at the Rybka Forum some years ago. It is a true gem! I still remembered 2.- Bc7 but I had to look up the complete solution: YACPDB #276295.

There is a thread at TalkChess from two years ago where this study is claimed to be solved by SF, although 6-man Syzygy tablebases were used, which should make a difference:

Mario Matous study from 1975 solved by Stockfish

Regards from Spain.

Ajedrecista.

mwyoung · Post by **mwyoung** » Mon Nov 09, 2020 11:28 pm

mwyoung wrote: ↑Mon Nov 09, 2020 10:46 pm
Alayan wrote: ↑Mon Nov 09, 2020 9:35 pm "Very simple endgame positions" where it fails.

Meanwhile, SF from 1.5 years ago + 6-men got 99.999% of 7-men positions right with a few Gnodes.

Some sample of the very natural and common positions where it failed in Aloril's testing :
[d]1K6/5b2/8/3k4/b4B2/2B5/3r4/8 b - - 0 1

Fails to see that 2 same color bishops can't force a win, very common issue in normal games...

[d]4N1q1/6B1/p1k5/8/8/4K2B/8/8 w - - 0 1

Q+P vs 3M, very common sort of position... The suggested move is still a draw under the 50 moves rule.

[d]3B4/B5R1/b7/2k5/7K/8/4b3/8 b - - 0 1

More nonsense with both sides having bishops of the same color.

[d]3N4/6k1/8/8/K3N2p/8/8/3N2q1 w - - 0 1

Very common imbalance, 3N vs Q...

And so on.

I agree! Stockfish is a idiot chess player. Here is another big fail by Stockfish that a 32 man engine would find instantly. Stockfish play has huge holes, and would be crushed with perfect play.

[d]n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Stockfish 081120 - C Line, Blitz 5min+1sec
n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Analysis by Stockfish 081120:

1.Qd6+ Kg8 2.gxh7+ Kh8 3.Qd5 Qf8 4.Ke5 Kxh7 5.Qe4+ Kg8 6.Qxa8 Qc5+ 7.Kf4 Qd6+ 8.Kg4 Kg7 9.Bf6+ Qxf6
The position is equal: = (0.00) Depth: 100/19 00:06:25 11232MN, tb=404542636
(, 09.11.2020)

Even with a 100 ply type B search. Stockfish is totally clueless about this position.

I will give you a hint 1.Qc8 wins.

Stockfish will never find the win here. So I will give the solution now.
[d]n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1
1.Qc8 Kg8 2.Bc7 Qxc8 3.gxf7+ Kh8 4.Be5 Qc5 5.Bb2 Nc7 6.Ba1 a4 7.Bb2 a3 8.Ba1 a2 9.Bb2 a1R 10.Bxa1 Qe5+ 11.Bxe5 Nd5+ 12.Ke6+ Nf6 13.Bxf6#

Uri Blass · Post by **Uri Blass** » Tue Nov 10, 2020 3:07 am

mwyoung wrote: ↑Mon Nov 09, 2020 11:28 pm
mwyoung wrote: ↑Mon Nov 09, 2020 10:46 pm
Alayan wrote: ↑Mon Nov 09, 2020 9:35 pm "Very simple endgame positions" where it fails.

Meanwhile, SF from 1.5 years ago + 6-men got 99.999% of 7-men positions right with a few Gnodes.

Some sample of the very natural and common positions where it failed in Aloril's testing :
[d]1K6/5b2/8/3k4/b4B2/2B5/3r4/8 b - - 0 1

Fails to see that 2 same color bishops can't force a win, very common issue in normal games...

[d]4N1q1/6B1/p1k5/8/8/4K2B/8/8 w - - 0 1

Q+P vs 3M, very common sort of position... The suggested move is still a draw under the 50 moves rule.

[d]3B4/B5R1/b7/2k5/7K/8/4b3/8 b - - 0 1

More nonsense with both sides having bishops of the same color.

[d]3N4/6k1/8/8/K3N2p/8/8/3N2q1 w - - 0 1

Very common imbalance, 3N vs Q...

And so on.

I agree! Stockfish is a idiot chess player. Here is another big fail by Stockfish that a 32 man engine would find instantly. Stockfish play has huge holes, and would be crushed with perfect play.

[d]n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Stockfish 081120 - C Line, Blitz 5min+1sec
n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Analysis by Stockfish 081120:

1.Qd6+ Kg8 2.gxh7+ Kh8 3.Qd5 Qf8 4.Ke5 Kxh7 5.Qe4+ Kg8 6.Qxa8 Qc5+ 7.Kf4 Qd6+ 8.Kg4 Kg7 9.Bf6+ Qxf6
The position is equal: = (0.00) Depth: 100/19 00:06:25 11232MN, tb=404542636
(, 09.11.2020)

Even with a 100 ply type B search. Stockfish is totally clueless about this position.

I will give you a hint 1.Qc8 wins.
Stockfish will never find the win here. So I will give the solution now.
[d]n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1
1.Qc8 Kg8 2.Bc7 Qxc8 3.gxf7+ Kh8 4.Be5 Qc5 5.Bb2 Nc7 6.Ba1 a4 7.Bb2 a3 8.Ba1 a2 9.Bb2 a1R 10.Bxa1 Qe5+ 11.Bxe5 Nd5+ 12.Ke6+ Nf6 13.Bxf6#

Maybe you are right about the latest Stockfish, but people showed that some older version found the win.
Stockfish and all top programs can easily find the loss after 2.Bc7.

The main problem is finding 2.Bc7, because Stockfish insists on using null move pruning.
I wonder how long we will need to wait for Stockfish to replace it with better pruning rules.

There are certainly positions where Stockfish is blind, but that does not prove that you can force Stockfish into one of them from the opening position. If the 32-piece tablebase only plays for the longest draw in drawn positions and does not try to take advantage of Stockfish's weaknesses, then it is not clear that it can win.

It would be interesting to know whether somebody has already built 5-7 piece tablebases that store, for drawn positions, the distance in moves to the draw, so we could test whether such a tablebase really scores better than Stockfish without tablebases against weak engines in random drawn tablebase endgames.

Alayan · Post by **Alayan** » Tue Nov 10, 2020 3:23 am

Uri Blass wrote: ↑Tue Nov 10, 2020 3:07 am Maybe you are right for latest stockfish but people showed that some older version found the win.
Stockfish and all top programs can find easily the loss after 2.Bc7

The main problem is finding 2.Bc7 because stockfish insist to use the null move pruning.
I wonder how much time we need to wait for stockfish to replace it by a better pruning rules.

There are certainly positions when stockfish is blind but it does not prove that you can force stockfish to go to one of them from the opening position and if the 32 piece tablebases only try to play for the longest draw in order to draw position and does not try to take advantage of stockfish's weaknesses then it is not clear that it can win.

Null move pruning is good in almost all relevant chess positions. It's done with a low-depth verification search. There are some general patterns where NMP is more likely to fail, and there it's not used.

Positions where NMP fails are drastically overrepresented among "problem positions" because these positions focus on anti-patterns and on what engines struggle with. Sure, these positions highlight the gap there is between current top engines and what a true TB-32 would be able to achieve, but how much do they matter when it comes to being able to play well from "reasonable positions" ? Rather little, I think.

Of course, we don't have a top engine without massive pruning because massive pruning is a key contributor of strength, so it's not easy to know how much could be exploited. But you'd then need trillions of nodes to keep depth up.

Uri Blass wrote: ↑Tue Nov 10, 2020 3:07 am It may be interesting if somebody already built tablebases with the longest draw for 5-7 piece tablebases in drawn position with the distance in moves for a draw to find if this tablebase really score better than stockfish without tablebases against weak engines in random drawn tablebases endgames.

That would be an interesting experiment. 6 pieces is the highest feasible, I think; generating distance-to-shortest-forced-draw tables for 7 pieces would require too much hardware and storage space.

mwyoung · Post by **mwyoung** » Tue Nov 10, 2020 6:54 am

Uri Blass wrote: ↑Tue Nov 10, 2020 3:07 am
mwyoung wrote: ↑Mon Nov 09, 2020 11:28 pm
mwyoung wrote: ↑Mon Nov 09, 2020 10:46 pm
Alayan wrote: ↑Mon Nov 09, 2020 9:35 pm "Very simple endgame positions" where it fails.

Meanwhile, SF from 1.5 years ago + 6-men got 99.999% of 7-men positions right with a few Gnodes.

Some sample of the very natural and common positions where it failed in Aloril's testing :
[d]1K6/5b2/8/3k4/b4B2/2B5/3r4/8 b - - 0 1

Fails to see that 2 same color bishops can't force a win, very common issue in normal games...

[d]4N1q1/6B1/p1k5/8/8/4K2B/8/8 w - - 0 1

Q+P vs 3M, very common sort of position... The suggested move is still a draw under the 50 moves rule.

[d]3B4/B5R1/b7/2k5/7K/8/4b3/8 b - - 0 1

More nonsense with both sides having bishops of the same color.

[d]3N4/6k1/8/8/K3N2p/8/8/3N2q1 w - - 0 1

Very common imbalance, 3N vs Q...

And so on.

I agree! Stockfish is a idiot chess player. Here is another big fail by Stockfish that a 32 man engine would find instantly. Stockfish play has huge holes, and would be crushed with perfect play.

[d]n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Stockfish 081120 - C Line, Blitz 5min+1sec
n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1

Analysis by Stockfish 081120:

1.Qd6+ Kg8 2.gxh7+ Kh8 3.Qd5 Qf8 4.Ke5 Kxh7 5.Qe4+ Kg8 6.Qxa8 Qc5+ 7.Kf4 Qd6+ 8.Kg4 Kg7 9.Bf6+ Qxf6
The position is equal: = (0.00) Depth: 100/19 00:06:25 11232MN, tb=404542636
(, 09.11.2020)

Even with a 100 ply type B search. Stockfish is totally clueless about this position.

I will give you a hint 1.Qc8 wins.
Stockfish will never find the win here. So I will give the solution now.
[d]n2Bqk2/5p1p/Q4KP1/p7/8/8/8/8 w - - 0 1
1.Qc8 Kg8 2.Bc7 Qxc8 3.gxf7+ Kh8 4.Be5 Qc5 5.Bb2 Nc7 6.Ba1 a4 7.Bb2 a3 8.Ba1 a2 9.Bb2 a1R 10.Bxa1 Qe5+ 11.Bxe5 Nd5+ 12.Ke6+ Nf6 13.Bxf6#
Maybe you are right for latest stockfish but people showed that some older version found the win.
Stockfish and all top programs can find easily the loss after 2.Bc7

The main problem is finding 2.Bc7 because stockfish insist to use the null move pruning.
I wonder how much time we need to wait for stockfish to replace it by a better pruning rules.

There are certainly positions when stockfish is blind but it does not prove that you can force stockfish to go to one of them from the opening position and if the 32 piece tablebases only try to play for the longest draw in order to draw position and does not try to take advantage of stockfish's weaknesses then it is not clear that it can win.

It may be interesting if somebody already built tablebases with the longest draw for 5-7 piece tablebases in drawn position with the distance in moves for a draw to find if this tablebase really score better than stockfish without tablebases against weak engines in random drawn tablebases endgames.

The point being shown is that you will always have blind spots, or an error rate, with a type B search. And this will happen in every game. The type B search approximation is fine, but it will result in crushing losses against perfect play.

These positions are no trick, but the result of a type B search. And this error rate does not go away, but becomes worse with more game complexity, as the type B search will need to prune more lines!
