Perfect chess engine Elo (32-man TB) can be within 200 of Stockfish in TCEC LTC conditions

BrendanJNorman
Posts: 2580
Joined: Mon Feb 08, 2016 12:43 am
Full name: Brendan J Norman

Post by BrendanJNorman »

Alayan wrote: Mon Nov 09, 2020 5:41 pm TB32 using adversarial search..."
Why does TB32 need search? It is just a massive database.
Alayan wrote: Mon Nov 09, 2020 5:41 pm Instead of arguing further, I suggest we play a game. I'll run the latest SF as of now (no update during the game), 3 threads, 2GB hash. I'll play the move from the first depth completing after 1B nodes. Let's see how easily you can beat it.
Huh? Let's say you win 100-0...what does this prove about Stockfish's hypothetical results in trying to resist against a 32 man TB?

It would just be an engine vs engine match. Proves nothing. What is the logic in your idea?
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Post by mwyoung »

BrendanJNorman wrote: Mon Nov 09, 2020 5:50 pm
Alayan wrote: Mon Nov 09, 2020 5:41 pm TB32 using adversarial search..."
Why does TB32 need search? It is just a massive database.
Alayan wrote: Mon Nov 09, 2020 5:41 pm Instead of arguing further, I suggest we play a game. I'll run the latest SF as of now (no update during the game), 3 threads, 2GB hash. I'll play the move from the first depth completing after 1B nodes. Let's see how easily you can beat it.
Huh? Let's say you win 100-0...what does this prove about Stockfish's hypothetical results in trying to resist against a 32 man TB?

It would just be an engine vs engine match. Proves nothing. What is the logic in your idea?
They have nothing else. No logic or reason behind anything they are saying.

And remember, the result would be much worse than the 1% error rate given earlier. This is under TCEC conditions with time controls. The 32-man TB will move instantly, while Stockfish loses clock time and search depth with each move. That clearly means more mistakes and losses for Stockfish under TCEC LTC match conditions in this thought experiment. :lol:

Again, as I pointed out earlier, the only thing you would need to do to assure the highest win rate is to tell the 32-man TB to play the move that extends the game the longest while keeping a draw score. This assures Stockfish will lose in a timed match under TCEC LTC conditions.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Post by Alayan »

BrendanJNorman wrote: Mon Nov 09, 2020 5:50 pm
Alayan wrote: Mon Nov 09, 2020 5:41 pm TB32 using adversarial search..."
Why does TB32 need search? It is just a massive database.
You NEED a way to evaluate which positions are more difficult to play than others. This is fundamental.

The base TB32 with just WDL stats and random selection of "perfect move" (if you have no search, you must use some stupid algorithm like random perfect move, first alphabetical move... to pick among the perfect moves) would be incredibly weak. It would never lose, but it would almost never win.

And not just against Stockfish. You and I would have good chances of drawing against such a TB32 from the start position. Why? It would play horribly poor moves, because it wouldn't care at all about having the advantage or not; everything is drawn all the same. Once in desperate, almost lost positions, it would then start to play the unfathomable moves that are required to avoid losing.
BrendanJNorman wrote: Mon Nov 09, 2020 5:50 pm
Alayan wrote: Mon Nov 09, 2020 5:41 pm Instead of arguing further, I suggest we play a game. I'll run the latest SF as of now (no update during the game), 3 threads, 2GB hash. I'll play the move from the first depth completing after 1B nodes. Let's see how easily you can beat it.
Huh? Let's say you win 100-0...what does this prove about Stockfish's hypothetical results in trying to resist against a 32 man TB?

It would just be an engine vs engine match. Proves nothing. What is the logic in your idea?
The proposed challenge offers the possibility to outsearch my Stockfish 1000 to 1, to use the version I'm using to target it directly for blind spots with other engines/versions/deeper searches, to use correspondence search techniques to improve on unassisted SF, to use opening books for help... The odds are massively stacked against my Stockfish. It should lose; I would expect it to.

The logic is obvious. If current Stockfish is very far from perfect play, then it makes plenty of exploitable mistakes, right? So being allowed to use any available engine and search method, to outsearch it 1000 to 1, and so on, should make it possible to find plenty of improvements and ways to punish those mistakes.

This doesn't mean that the "everything allowed" method would play perfect chess. In your hypothesis, it could still be very far off. Carlsen would crush us OTB, and Stockfish would crush Carlsen. But if "everything allowed" failed to win the challenge, then that would mean that even with all the advantages it couldn't find and punish enough mistakes. This would hint that my Stockfish isn't making that many/that big mistakes.

This empirical data would have more value than talking about how unfathomably big TB32 is.
mwyoung wrote: Mon Nov 09, 2020 6:08 pm They have nothing else. No logic or reason behind anything they are saying.

And remember, the result would be much worse than the 1% error rate given earlier. This is under TCEC conditions with time controls. The 32-man TB will move instantly, while Stockfish loses clock time and search depth with each move. That clearly means more mistakes and losses for Stockfish under TCEC LTC match conditions in this thought experiment. :lol:
You have been consistently and gratuitously obnoxious. Your failure to see logic or reason stems from your refusal to try and understand what is being said.

Shame on you for making this a worse place.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Post by mwyoung »

Alayan wrote: Mon Nov 09, 2020 6:41 pm
BrendanJNorman wrote: Mon Nov 09, 2020 5:50 pm
Alayan wrote: Mon Nov 09, 2020 5:41 pm TB32 using adversarial search..."
Why does TB32 need search? It is just a massive database.
You NEED a way to evaluate which positions are more difficult to play than others. This is fundamental.

The base TB32 with just WDL stats and random selection of "perfect move" (if you have no search, you must use some stupid algorithm like random perfect move, first alphabetical move... to pick among the perfect moves) would be incredibly weak. It would never lose, but it would almost never win.

And not just against Stockfish. You and me would have great chances at drawing against such a TB32 from the start position. Why ? It would play horribly poor moves, because it wouldn't care at all about having the advantage or not, everything is drawn all the same. Once into desperate almost lost positions, it would then start to play unfathomable moves that are required to avoid losing.
BrendanJNorman wrote: Mon Nov 09, 2020 5:50 pm
Alayan wrote: Mon Nov 09, 2020 5:41 pm Instead of arguing further, I suggest we play a game. I'll run the latest SF as of now (no update during the game), 3 threads, 2GB hash. I'll play the move from the first depth completing after 1B nodes. Let's see how easily you can beat it.
Huh? Let's say you win 100-0...what does this prove about Stockfish's hypothetical results in trying to resist against a 32 man TB?

It would just be an engine vs engine match. Proves nothing. What is the logic in your idea?
The proposed challenge offers the possibility to outsearch my Stockfish 1000 to 1, to use the version I'm using to try and target it directly for blindspots with other engines/versions/deeper searches, to use correspondence search techniques to improve on unassisted SF, to use opening books for help... The odds are massively stacked against my Stockfish. It should lose, I would expect it to.

The logic is obvious. If current Stockfish is very far from perfect play, then it makes plenty of exploitable mistakes, right ? So being allowed to use any available engine and search method, to outsearch it 1000 to 1, and so on, should allow to find plenty of improvements and find ways to punish those mistakes.

This doesn't mean that the "everything allowed" method would play perfect chess. In your hypothesis, it could still be very far off. Carlsen would crush us OTB, and Stockfish would crush Carlsen. But if "everything allowed" failed to win the challenge, then that would mean that even with all the advantages it couldn't find and punish enough mistakes. This would hint that my Stockfish isn't making that many/that big mistakes.

This empirical data would have more value than talking about how unfathomably big TB32 is.
mwyoung wrote: Mon Nov 09, 2020 6:08 pm They have nothing else. No logic or reason behind anything they are saying.

And remember, the result would be much worse than the 1% error rate given earlier. This is under TCEC conditions with time controls. The 32-man TB will move instantly, while Stockfish loses clock time and search depth with each move. That clearly means more mistakes and losses for Stockfish under TCEC LTC match conditions in this thought experiment. :lol:
You have been consistently and gratuitously obnoxious. Your failure to see logic or reason stems from your refusal to try and understand what is being said.

Shame on you for making this a worse place.
My logic still stands, as you have brought nothing to this discussion. I understand what is being said, and I am correct.
And all your fantasies have fallen away one by one.

Clearly Stockfish is not even close in Elo to a 32-man TB. Stockfish cannot even play 6- and 7-man positions correctly, let alone fight against a 32-man TB with time controls. :lol:
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Post by Alayan »

It's not because you're loud that you're right.

Stockfish plays almost all TB positions perfectly.

This is with a 1.5-year-old Stockfish over 10M+ 7-man positions:

Stockfish 190501 errors with 6-piece Syzygy
options = {"Hash": min(NODES*10//10**6*4, 16384), "SyzygyPath": "/usr/games/syzygy", "Syzygy50MoveRule": False, "Clear Hash": None}
Threads=1 for Nodes<=1G
Threads=16 for Nodes>1G
 
All positions: 10099554
Nodes  Wrong  Total%   Previous%
random        ~22%     ~22%
depth1        ~1.4%    ~6%
100k    6828  0.0676%  ~5%
1M      1284  0.0127%  19%
10M      403  0.0040%  31%
100M     214  0.0021%  53%
1G       127  0.0013%  59%
10G       77  0.0008%  61%
Source :

1% is the error rate of Stockfish at depth 2 or so...
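
For anyone who wants to check the arithmetic in that table, here is a minimal Python sketch. It assumes (my reading of the columns, not something the source states) that "Total%" is wrong moves divided by all 10,099,554 positions and "Previous%" is wrong moves as a share of the row above. The function and variable names are made up for illustration.

```python
# Figures copied from the table above: (node budget, number of wrong moves).
TOTAL_POSITIONS = 10_099_554
WRONG_BY_NODES = [
    ("100k", 6828),
    ("1M", 1284),
    ("10M", 403),
    ("100M", 214),
    ("1G", 127),
    ("10G", 77),
]


def error_table(rows, total):
    """Return (label, wrong, % of all positions, % of the previous row's errors)."""
    out = []
    previous = None
    for label, wrong in rows:
        total_pct = 100.0 * wrong / total
        # The first row has no measured predecessor in this list.
        prev_pct = None if previous is None else 100.0 * wrong / previous
        out.append((label, wrong, total_pct, prev_pct))
        previous = wrong
    return out


if __name__ == "__main__":
    for label, wrong, total_pct, prev_pct in error_table(WRONG_BY_NODES, TOTAL_POSITIONS):
        prev = "    -" if prev_pct is None else f"{prev_pct:4.0f}%"
        print(f"{label:>5} {wrong:>6} {total_pct:.4f}% {prev}")
```

Under that reading the recomputed values match the rows from 100k to 10G nodes. The "random" and "depth1" rows are approximate in the source and are not recomputed here.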
BrendanJNorman
Posts: 2580
Joined: Mon Feb 08, 2016 12:43 am
Full name: Brendan J Norman

Post by BrendanJNorman »

Alayan wrote: Mon Nov 09, 2020 6:41 pm

You NEED a way to evaluate which positions are more difficult to play than others. This is fundamental.
What could possibly be 'difficult' for an endgame tablebase?

Consider the following position and hypothetical scenario.

Stockfish as black has just allowed white to enter this line where he can promote a pawn.

White has just played: a8=Q

[d]QN4n1/6r1/3k4/8/b2K4/8/8/8 b - -

Stockfish thinks he is completely okay.

But Stockfish is deluded. He simply hasn't seen the forced mate in 545 - which the TB sees instantly.

And this is only a SEVEN man position. Stockfish's failures will increase exponentially as more pieces are added, while the TB remains perfect.

Isn't the result obvious?
Alayan wrote: Mon Nov 09, 2020 5:41 pm
The proposed challenge offers the possibility to outsearch my Stockfish 1000 to 1, to use the version I'm using to try and target it directly for blindspots with other engines/versions/deeper searches, to use correspondence search techniques to improve on unassisted SF, to use opening books for help... The odds are massively stacked against my Stockfish. It should lose, I would expect it to.

The logic is obvious. If current Stockfish is very far from perfect play, then it makes plenty of exploitable mistakes, right ? So being allowed to use any available engine and search method, to outsearch it 1000 to 1, and so on, should allow to find plenty of improvements and find ways to punish those mistakes.

This doesn't mean that the "everything allowed" method would play perfect chess. In your hypothesis, it could still be very far off. Carlsen would crush us OTB, and Stockfish would crush Carlsen. But if "everything allowed" failed to win the challenge, then that would mean that even with all the advantages it couldn't find and punish enough mistakes. This would hint that my Stockfish isn't making that many/that big mistakes.
Your argument contains false logic. There is an incalculably enormous difference between a 32 man TB and whatever "everything allowed" method you are proposing someone challenge you with.

Firstly, the "everything allowed" side has only engines weaker than Stockfish and opening books (made from the games of engines weaker than or equal to Stockfish), plus 7-man tablebases (max). How on earth is that a good test for how SF will handle a 32-man TB as an "opponent"?

Your logic says: "Let's test how good a seabass is at swimming, by placing prime Phelps and Thorpe in a swimming race in perfect conditions. Thorpe can have a 10-yard headstart. If Phelps still wins, it means he is faster than a seabass".

It seems like you are arguing from a false premise.

That is: "If Stockfish has a low error rate against current bleeding edge opponents - this is proof that Stockfish could draw with a 32 man tablebase"

The thing is, computer chess has a revolution every 8-10 years (estimate) where we see Rybka arrive, then the open-source revolution that birthed Stockfish, then the AlphaZero/Leela craze and now NNUE.

At almost every revolutionary moment, we had people kind of inferring that these engines were now approaching "perfect" chess - despite the fact that we are, as yet, only at 7 man tablebases.

Truth is, we are still FAAAAAR from perfect chess, and perfect chess would be unrecognizable.

We won't see perfect chess, perhaps for another 50 or 100 years.

Try understanding the moves (even with SF's help) of the solution of that mate in 545 above.

And that's only with 7 men on the board: 7 of 32, slightly less than 22% of the full set, hardly even a glimpse at what perfect chess will look like.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Post by mwyoung »

Alayan wrote: Mon Nov 09, 2020 7:51 pm It's not because you're loud that you're right.

Stockfish plays almost all TB positions perfectly.

This is with 1.5 years old Stockfish over 10M+ 7-men positions :

Stockfish 190501 errors with 6-piece Syzygy
options = {"Hash": min(NODES*10//10**6*4, 16384), "SyzygyPath": "/usr/games/syzygy", "Syzygy50MoveRule": False, "Clear Hash": None}
Threads=1 for Nodes<=1G
Threads=16 for Nodes>1G
 
All positions: 10099554
Nodes  Wrong  Total%   Previous%
random        ~22%     ~22%
depth1        ~1.4%    ~6%
100k    6828  0.0676%  ~5%
1M      1284  0.0127%  19%
10M      403  0.0040%  31%
100M     214  0.0021%  53%
1G       127  0.0013%  59%
10G       77  0.0008%  61%
Source :

1% is the error rate of Stockfish at depth 2 or so...
This has been discussed. If Stockfish cannot play 6-man positions correctly in the simple endgames, this only compounds exponentially as game complexity increases with more men on the board, resulting in crushing losses for Stockfish against perfect play.

"You NEED a way to evaluate which positions are more difficult to play than others. This is fundamental."

Not correct again. The solution is simple, and you do not need to evaluate anything, as everything is already evaluated. You only need to instruct the TB to play the longest conversions in a 0.00 position. Stockfish will be helpless against this simple tactic, resulting in games hundreds of moves long while waiting for the built-in Stockfish error rate to decide the game.
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Post by Alayan »

BrendanJNorman wrote: Mon Nov 09, 2020 8:03 pm
Alayan wrote: Mon Nov 09, 2020 6:41 pm

You NEED a way to evaluate which positions are more difficult to play than others. This is fundamental.
What could possibly be 'difficult' for an endgame tablebase?
But that's the point!

The tablebase has NO CLUE what is the difference between a difficult to hold draw and an easy to hold draw, because for the TB it's all the same. It's all easy.

But for the imperfect opponent you're trying to win against, there IS a huge difference.

So to beat the imperfect opponent, you need a way to model which positions are difficult for the imperfect opponent and try to create as much difficulty as possible for it.
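
To make the distinction concrete, here is a toy Python sketch of the two selection policies. Everything in it is hypothetical: the move names are placeholders, and the per-move "opponent error chance" numbers are invented for illustration, not measured from any engine. It only shows that a WDL-only player cannot tell drawing moves apart, while a player with some opponent model can rank them.

```python
import random

# Hypothetical data: every move below keeps the tablebase result (a draw).
# The value is an ASSUMED probability that the imperfect opponent goes wrong
# in the position that follows. These numbers are invented.
DRAWING_MOVES = {
    "move_a": 0.0,
    "move_b": 0.002,
    "move_c": 0.0005,
    "move_d": 0.0,
}


def pick_random_perfect(moves, rng):
    """WDL-only player: all drawing moves look identical, so pick any of them."""
    return rng.choice(sorted(moves))


def pick_with_opponent_model(moves):
    """Pick the drawing move whose resulting position is modeled as hardest."""
    return max(sorted(moves), key=lambda m: moves[m])


if __name__ == "__main__":
    rng = random.Random(0)
    print("random perfect move:", pick_random_perfect(DRAWING_MOVES, rng))
    print("opponent-model move:", pick_with_opponent_model(DRAWING_MOVES))
```

The hard part, of course, is where the numbers in `DRAWING_MOVES` would come from. That is exactly the evaluation or search the plain tablebase does not contain.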
BrendanJNorman wrote: Mon Nov 09, 2020 8:03 pm Consider the following position and hypothetical scenario.

Stockfish as black has just allowed white to enter this line where he can promote a pawn.

White has just played: a8=Q

[d]QN4n1/6r1/3k4/8/b2K4/8/8/8 b - -

Stockfish thinks he is completely okay.

But Stockfish is deluded. He simply hasn't seen the forced mate in 545 - which the TB sees instantly.

And this is only a SEVEN man position. Stockfish's failures will increase exponentially as more pieces are added, while the TB remains perfect.
Chess is played with the 50-move rule. Weird pawnless positions with ultra-long shuffling are of very little practical relevance, because they're virtually nonexistent in practical play and you can't force them to happen at no cost.

In chess, if your opponent doesn't make mistakes, to create imbalances you need to make your position worse (it would score worse between imperfect opponents of similar strength). This doesn't help overcome the drawing margin.
BrendanJNorman wrote: Mon Nov 09, 2020 8:03 pm Isn't the result obvious?
No it's not. Stockfish + 6 men from 1.5 years ago could play perfectly 99.999% of 7-men positions in TCEC endgame conditions. Forcing on the board a position where it would go wrong is not a trivial task.
BrendanJNorman wrote: Mon Nov 09, 2020 8:03 pm Your argument contains false logic. There is an incalculably enormous difference between a 32 man TB and whatever "everything allowed" method you are proposing someone challenge you with.
The 32-man TB will get far fewer positions wrong than what I'm suggesting to be challenged with, sure.

But your argument about SF being very far from the TB relies on the assumption that SF is making a lot of inaccuracies/mistakes. If it makes few of them, then it's close to the TB.

If Stockfish is so inaccurate, then massively skewing the odds should create plenty of opportunities to take advantage of those.

A few years ago, a doubling of TC on big hardware at long TC was still estimated to bring 30-50 Elo.

Mwyoung with his hardware could easily have 10 doublings advantage on my SF, plus using whatever search method enhancement and auxiliary engines like Leela. With the old rules of thumb, this should mean crushing my SF easily.
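
As a rough illustration of what that rule of thumb implies, here is a small Python sketch. It assumes the gain per doubling simply adds up linearly (which is the old rule of thumb, not a claim that it holds today) and converts the Elo gap to an expected score with the standard logistic Elo formula. The function names are mine.

```python
def elo_gain(doublings, elo_per_doubling):
    """Total Elo advantage if every doubling is worth the same amount (assumption)."""
    return doublings * elo_per_doubling


def expected_score(elo_diff):
    """Standard Elo expected score for the side that is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    for per_doubling in (30, 50):
        gap = elo_gain(10, per_doubling)
        print(f"10 doublings at {per_doubling} Elo each: "
              f"+{gap} Elo, expected score {expected_score(gap):.3f}")
```

With 10 doublings this gives a 300 to 500 Elo edge, i.e. an expected score of roughly 85% to 95% per game, which is why the old rules of thumb would predict an easy win.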
BrendanJNorman wrote: Mon Nov 09, 2020 8:03 pm Firstly, the "everything allowed" has only engines weaker than stockfish and opening books (made from the games of engines weaker than stockfish or equal), plus 7-man tablebases (max) - how on earth is that a good test for how SF will handle a 32 man TB as an "opponent"?
You don't get it.

There is no "only engines weaker" clause.

My opponent can use Stockfish itself. It can run any version of it he wants, he can run the very same version I'm using to check if it blunders in different positions to try and steer the game towards positions where it blunders.

Using Leela or others, even if they are weaker than SF overall, can help to find blind spots and different move suggestions.

Opening books are much more useful than you make them out to be; there is a reason ICCF players and playchess engine room users maintain their own.
BrendanJNorman wrote: Mon Nov 09, 2020 8:03 pm Your logic says: "Let's test how good a seabass is at swimming, by placing prime Phelps and Thorpe in a swimming race in perfect conditions. Thorpe can have a 10-yard headstart. If Phelps still wins, it means he is faster than a seabass".

It seems like you are arguing from a false premise.

That is: "If Stockfish has a low error rate against current bleeding edge opponents - this is proof that Stockfish could draw with a 32 man tablebase"
No. Stockfish drawing in this experiment would not be proof it could draw against any possible opponent. I have never claimed that; do not put words in my mouth.

I said it would be an indication of how far we are from perfect play, and I stand by this claim. If 1000x the resources fails to achieve a win, then thinking current SF is thousands of Elo away from perfect play is dubious, because if it makes so many mistakes, an opponent that is better all around should manage to take advantage. If it does win, it doesn't tell us how close we are to the ceiling, but it demonstrates convincingly that current SF in conditions similar to mine would be crushed by TB32 + opponent modeling, and that the original poster of this thread is just wrong in thinking SF might be 200 Elo off the TB32.

Of course, a single game is a tiny sample, but that's a start.
BrendanJNorman wrote: Mon Nov 09, 2020 8:03 pm The thing is, computer chess has a revolution every 8-10 years (estimate) where we see Rybka arrive, then the open-source revolution that birthed Stockfish, then the AlphaZero/Leela craze and now NNUE.

At almost every revolutionary moment, we had people kind of inferring that these engines were now approaching "perfect" chess - despite the fact that we are, as yet, only at 7 man tablebases.

Truth is, we are still FAAAAAR from perfect chess, and perfect chess would be unrecognizable.

We won't see perfect chess, perhaps for another 50 or 100 years.

Try understanding the moves (even with SF's help) of the solution of that mate in 545 above.

And that's only with 7 men on the board. Slightly less than 22% of the entire set - hardly even a glimpse at what perfect chess will look like.
I agree with you that since the Rybka days, there have always been people who thought the newest engine was right next to perfect play, only to be proven wrong when new engines came out beating the old version by dozens of Elo. You're right that this thread's OP could look silly in a few years if we get an engine beating SF12 by 200 Elo at TCEC conditions.

However, running these old "perfect" engines at 1:100 odds vs themselves (with the odds giver at a high TC) would give you plenty of wins, and correspondence players could reliably outmaneuver and beat engine slaves.

These engines made mistakes, and it was possible to show it even without access to engines of the future.
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Post by Alayan »

mwyoung wrote: Mon Nov 09, 2020 8:15 pm This has been discussed. If Stockfish can not play 6 man correctly in the simple endgames. This only compounds exponentially as the game complexity increases with more men on the board. Resulting in crushing losses for Stockfish against perfect play.

"You NEED a way to evaluate which positions are more difficult to play than others. This is fundamental."

Not correct again. The solution is simple. And you do not need to evaluate anything. As everything is already evaluated. You only need to instruct the TB to play the longest conversions in a 0.00 position. Stockfish will be helpless to this simple tactic. Resulting in games 100's of moves long, and waiting for the built in Stockfish error rate to decide the game.
The assumption that there is a fixed error rate in Stockfish that doesn't depend on the position is nonsensical.

The Stockfish error rate on the starting position itself is 0%. Not 0.1%. Not 0.0001%. 0%. It's never going to play moves like 1. g4, you could run it with as many threads as you want as many times as you want, it won't.

There are many many positions where Stockfish with a certain minimum amount of nodes will NEVER play a losing move, and a much smaller subset from which it would draw or win against arbitrary play (no matter the move played by the other side, no matter the randomness from many threads, it would never lose).

To maximize winning chances, the perfect TB player must do its best to avoid such positions, and find positions where the SF error rate is >0%.
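
A small Python sketch of why the position-dependence matters. All the error rates below are invented for illustration, not measured. With a fixed per-move error rate, a longer game always means a higher chance of at least one error; with position-dependent rates, game length alone tells you nothing if the positions reached have a zero error rate.

```python
import math


def p_at_least_one_error(per_move_error_rates):
    """Chance of one or more errors, assuming independent moves (a simplification)."""
    p_clean = math.prod(1.0 - p for p in per_move_error_rates)
    return 1.0 - p_clean


if __name__ == "__main__":
    # "Fixed error rate" model: 0.1% on each of 300 moves (invented numbers).
    fixed = [0.001] * 300
    # Position-dependent model: 300 moves, all in positions SF never misplays.
    safe = [0.0] * 300
    # Position-dependent model where only 10 of the 300 positions are risky.
    mixed = [0.0] * 290 + [0.001] * 10

    print(f"fixed rate : {p_at_least_one_error(fixed):.4f}")
    print(f"all safe   : {p_at_least_one_error(safe):.4f}")
    print(f"10 risky   : {p_at_least_one_error(mixed):.4f}")
```

So under these assumptions the thing to maximize is the number of risky positions the opponent has to pass through, not the number of moves.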

The assumption that delaying the forced draw by as many moves as possible will precisely guide the game into the positions where Stockfish finds it hardest not to lose is entirely unproven and most likely wrong. (Existing TBs have DTZ/DTM data about converting a win; they do NOT have data on the distance to the shortest forced draw, which would make them much bigger, but let's assume your TB does have this data.) It would play stronger than playing a random perfect move, however.

Also, if Stockfish is so bad, why are you not accepting the challenge? If it blunders left and right, surely it would be easy to defeat it?
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Post by mwyoung »

Alayan wrote: Mon Nov 09, 2020 8:57 pm
mwyoung wrote: Mon Nov 09, 2020 8:15 pm This has been discussed. If Stockfish can not play 6 man correctly in the simple endgames. This only compounds exponentially as the game complexity increases with more men on the board. Resulting in crushing losses for Stockfish against perfect play.

"You NEED a way to evaluate which positions are more difficult to play than others. This is fundamental."

Not correct again. The solution is simple. And you do not need to evaluate anything. As everything is already evaluated. You only need to instruct the TB to play the longest conversions in a 0.00 position. Stockfish will be helpless to this simple tactic. Resulting in games 100's of moves long, and waiting for the built in Stockfish error rate to decide the game.
The assumption that there is a fixed error rate in Stockfish that doesn't depend on the position is nonsensical.

The Stockfish error rate on the starting position itself is 0%. Not 0.1%. Not 0.0001%. 0%. It's never going to play moves like 1. g4, you could run it with as many threads as you want as many times as you want, it won't.

There are many many positions where Stockfish with a certain minimum amount of nodes will NEVER play a losing move, and a much smaller subset from which it would draw or win against arbitrary play (no matter the move played by the other side, no matter the randomness from many threads, it would never lose).

To maximize winning chances, the perfect TB player must do its best to avoid such positions, and find positions where the SF error rate is >0%.

The assumption that delaying the forced draw by as much moves as possible (existing TBs have DTZ/DTM data about converting a mate, they do NOT have data on the distance to shortest forced draw which would make them much bigger, but let's assume your TB does have this data) will precisely guide the game into the positions that are most difficult to not lose for Stockfish is entirely unproven and most likely wrong. It would play stronger than playing a random perfect move, however.

Also, if Stockfish is so bad, why are you not accepting the challenge ? If it blunders left and right, surely it would be easy to defeat it ?
We know Stockfish has an error rate, because it loses games against non-perfect play. We also see this in simple endgames, where we know what perfect play looks like.

And you do not need me to accept your challenge. You can do that testing on your own system. And BTW it would prove nothing. You can waste all the time you wish.

Here is what we know. Stockfish has an error rate, and Stockfish can think a million years on a position and still not understand it, for the simple reason that Stockfish uses a type B search. Stockfish is not capable of playing perfect chess. Stockfish will always have an error rate, and this is clearly exposed in very simple endgame positions.

"The Stockfish error rate on the starting position itself is 0%. Not 0.1%. Not 0.0001%. 0%. It's never going to play moves like 1. g4, you could run it with as many threads as you want as many times as you want, it won't"

This is your typical B.S. :lol:

How do you know anything above is true? How do you know 1. g4 is not the only winning first move in chess? You clearly don't know! :shock:
Last edited by mwyoung on Mon Nov 09, 2020 9:44 pm, edited 2 times in total.