Perfect chess engine Elo (32-men TB) can be within 200 of Stockfish in TCEC LTC conditions

Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: Perfect chess engine Elo (32-men TB) can be within 200 of Stockfish in TCEC LTC conditions

Post by Raphexon »

Uri Blass wrote: Fri Nov 13, 2020 1:53 pm
mwyoung wrote: Fri Nov 13, 2020 8:07 am
mmt wrote: Fri Nov 13, 2020 7:16 am
mwyoung wrote: Fri Nov 13, 2020 5:23 am No! You cannot prove this, ever. Why do you think chess can be proven? What laws of physics do you live under? Your ignorance is profound. If you don't like this response then stop posting such nonsense.
It's like I'm talking past you. I just said that I know I can't prove it and everyone knows we can't prove it. "Evidence" doesn't mean a full proof. We have some indications that chess might be a draw. This has nothing to do with the laws of physics, lol.
mwyoung wrote: Fri Nov 13, 2020 5:23 am And if chess is a draw, as you claimed, then chess is a coin-flip problem. So after millions of games played, by humans and by computers, why do we see a biased winning percentage for White?

That is stronger evidence that chess is a win for White than you have for chess being a draw! :roll:
First, I haven't claimed it's a draw. I said it's more likely. Second, no. There are more white draws than wins with the best play we can examine.
Let us not talk past each other.

1. How is any statistic from games played by imperfect players and computers any kind of evidence as to whether chess is a win or a draw with perfect play?

You can not marry these two contradictions.

When we have no perfect players the best conjecture for the result is the result of the best imperfect players.

I believe this logic works for other games that are solved today but were not many years ago.
It has been proved that 1.e3 is a winning move in losing chess, and I wonder what the statistics of top players' games in it looked like many years ago.

I would be surprised to find out that at some point in time top players' games were mostly draws.

Checkers (8x8) has been proved to be a draw, and again I believe that a big majority of top-level games were draws even before the game was solved.

I would like to see whether there are any statistics that show a different result for games that have been solved and are not trivial for humans.
From what I've heard after asking around in the past, Connect Four on board sizes where the second player has a proven win still shows a first-player advantage in human games.

But I don't have proof of that myself.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Post by mwyoung »

Uri Blass wrote: Fri Nov 13, 2020 1:53 pm
When we have no perfect players the best conjecture for the result is the result of the best imperfect players.

Exactly! :lol: It is evidence that has nothing to do with perfect players. So you have no evidence. Conjecture is not evidence. :roll: Check and mate!

con·​jec·​ture | \ kən-ˈjek-chər \
Definition of conjecture (Entry 1 of 2)
1a: inference formed without proof or sufficient evidence
b: a conclusion deduced by surmise or guesswork
The criminal's motive remains a matter of conjecture.
c: a proposition (as in mathematics) before it has been proved or disproved
2 obsolete
a: interpretation of omens
b: SUPPOSITION
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Post by Alayan »

Posting definitions of words when you are too thick or stubborn to acknowledge the definition of evidence. Pathetic.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Post by mwyoung »

Alayan wrote: Fri Nov 13, 2020 4:48 pm Posting definitions of words when you are too thick or stubborn to acknowledge the definition of evidence. Pathetic.
You can boohoo all you want. But facts are facts. And conjectures and fantasies are not evidence.

fan·ta·sy
/ˈfan(t)əsē/
noun
the faculty or activity of imagining things, especially things that are impossible or improbable.

ev·i·dence
noun
the available body of facts or information indicating whether a belief or proposition is true or valid
mmt
Posts: 343
Joined: Sun Aug 25, 2019 8:33 am
Full name: .

Post by mmt »

mwyoung wrote: Fri Nov 13, 2020 8:07 am Let us not talk past each other.

1. How is any statistic from games played by imperfect players and computers any kind of evidence as to whether chess is a win or a draw with perfect play?

You can not marry these two contradictions.
I'm quite sure that almost all smaller solvable chess-like games that exhibit these statistics are drawn. We can't be sure it's true with chess but unless we find a reason chess is special, it becomes likely. If you had an even odds bet on whether chess is a draw or a win for either white or black, what would you bet?

Edit: Ha, I see Uri made pretty much the same point about solved games.
towforce
Posts: 11588
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Post by towforce »

For this mini-thought experiment, please assume that chess is drawn (I know it's not proven yet):

* losses strongly correlate with blunders

* the deeper the search, the fewer the number of blunders

Unfortunately, not all engines measure depth in the same way. However maybe we can come up with a "reasonable guess" based on experience.

Another complicating factor: some positions would require a prohibitively deep search to uncover the blunder. In these cases, knowledge would be needed: the eval would need to be able to avoid blunders that search cannot reach. The good news regarding this is that, thanks to NNs, engines are also getting cleverer now, as well as just faster. Again, exactly how "smart" a NN is is difficult to say - but again, we can have a go.

So if chess is drawn (which I believe it is), then the time to perfect chess engines depends on the shape of the 3-dimensional chart that plots blunders against depth and knowledge.

Edit: here's a simplistic view of what the 3d graph might look like (X = depth, Y = knowledge, Z = blunders. Simple expression produces a plane. Drag with the mouse to rotate up/down/left/right to see clearly) - link.
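
A minimal sketch of the kind of chart described in this post, assuming a purely illustrative linear model. The function name `blunder_rate` and all coefficients are hypothetical and are not measured from any engine; they only show how a "simple expression" in depth (X) and knowledge (Y) gives a plane for blunders (Z).

```python
def blunder_rate(depth, knowledge, base=1.0, per_depth=0.02, per_knowledge=0.03):
    """Hypothetical linear model: Z lies on a plane that falls as X and Y grow."""
    z = base - per_depth * depth - per_knowledge * knowledge
    return max(0.0, z)  # a blunder rate cannot drop below zero


def grid(depths, knowledges):
    """Tabulate Z over a small X/Y grid (one row per knowledge level)."""
    return [[blunder_rate(d, k) for d in depths] for k in knowledges]


if __name__ == "__main__":
    depths = [0, 10, 20, 30]
    knowledges = [0, 10, 20]
    for k, row in zip(knowledges, grid(depths, knowledges)):
        print("knowledge", k, [round(z, 2) for z in row])
```

A model with a long tail towards Z = 0 would need a different expression; the plane is only the simplest possible starting shape.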
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Post by mwyoung »

This is where some do not understand the problem.

"The deeper the search, the fewer number of blunders."

The problem is the type B search. A type B search is fine playing scrub humans, and other scrub engines. It gives us a great approximation.

The issue is you are making billions of guesses as to what lines to cut to achieve the great search depths we see today. And you only need to be wrong once against perfect play.

And no amount of search in a type B search can ever achieve perfect play.

This is why we see the errors as shown here in this thread. And why Stockfish fails in the examples against perfect play.
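
The claim about selective search can be illustrated on a toy tree. This is a sketch under stated assumptions, not how Stockfish actually prunes: the tree, the static "guess" attached to each move, and the names `full_width` and `selective` are all hypothetical. Leaves hold the true result for the side at the root (+1 win, 0 draw, -1 loss).

```python
def full_width(node, maximizing=True):
    """Type A: examine every move down to the leaves (plain minimax)."""
    if isinstance(node, int):
        return node
    _, children = node
    values = [full_width(child, not maximizing) for child in children]
    return max(values) if maximizing else min(values)


def guess_of(node):
    """Static guess used for pruning; a leaf's guess is its true value."""
    return node if isinstance(node, int) else node[0]


def selective(node, width, maximizing=True):
    """Type B: keep only the `width` moves that look best to the static guess."""
    if isinstance(node, int):
        return node
    _, children = node
    ranked = sorted(children, key=guess_of, reverse=maximizing)[:width]
    values = [selective(child, width, not maximizing) for child in ranked]
    return max(values) if maximizing else min(values)


# Each inner node is (static guess, replies). The third root move looks
# terrible to the guess (-5) but is in fact the only winning move.
TREE = (0, [(3, [0, 1]), (1, [0, 0]), (-5, [1, 1])])

if __name__ == "__main__":
    print("full width:", full_width(TREE))
    print("selective, width 2:", selective(TREE, 2))
```

With width 2 the winning move is cut before it is ever searched, so the selective search reports a draw where the full-width search finds the win. One wrong guess is enough.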
towforce
Posts: 11588
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Post by towforce »

You haven't addressed the issue of knowledge which I raised in my earlier post. You appear to be saying that the 3-dimensional chart should have a long tail on the way to Z=0 (if you're willing to assume that chess is a draw without a blunder). Maybe you could come up with your own mathematical expression and redraw my chart? "A picture is worth a thousand words". :)

In this post Albert Silver told us that in top level correspondence chess (TLCC), wins are rare in completed games. Let's consider some candidate reasons why this might be so (my preferred choice is option 1 - that TLCC is the cutting edge, and is almost there in terms of error-free chess).

1. Chess is a draw, a win requires a blunder, and TLCC has almost eliminated blunders

2. Chess is a draw, a win requires a blunder, blunders occur in TLCC, but TLCC suffers from groupthink, and hence the players fail to find each other's blunders

3. Chess is a win, but TLCC players are not good enough to find the available wins

Which of the above 3 choices do you prefer?
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Post by mwyoung »

"You haven't addressed the issue of knowledge which I raised"

Yes, I have, many times. And the knowledge standard you are asking for exists in only one form. As I said before, chess is a 100% tactical game...

And I will take option 4. Chess is either a win or a draw, but it does not matter, as humans are type B searchers, and the computers they are using are type B searchers, even in correspondence chess, and hence the players fail to find each other's blunders.

"Another complicating factor: some positions would require a prohibitively deep search to uncover the blunder. In these cases, knowledge would be needed: the eval would need to be able to avoid blunders that search cannot reach." :lol:

And it is the above that tells me you have no idea what you are talking about. You are just putting words together that you think make sense but are logically flawed. Not only do you not know the rules of chess, but you are clueless as to how a type B search works.

If you had an eval that could "avoid blunders that search cannot reach", do you know what would not be needed? A search of any kind. :lol: :lol: :lol:

Here is a simple test to see if you have an evaluation that meets your standard. If your STATIC EVALUATION outputs anything other than the 3 true evaluations of chess (win, draw, loss), or it is not correct 100% of the time, your evaluation is flawed.

And yes, this type of knowledge does exist in only one form, and it is called a tablebase. :lol:
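
A sketch of what "a perfect static evaluation needs no search" looks like, using tic-tac-toe as a stand-in because a full table for chess does not exist. All names here (`true_value`, `build_table`, `pick_move`) are hypothetical. The exhaustive solve happens once, offline, much like generating a tablebase; at play time each candidate move costs one lookup.

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if a line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def successors(board):
    """All positions reachable in one move by the side to move."""
    side = "X" if board.count("X") == board.count("O") else "O"
    return [board[:i] + side + board[i + 1:]
            for i, sq in enumerate(board) if sq == "."]


@lru_cache(maxsize=None)
def true_value(board):
    """Exact result for the side to move: +1 win, 0 draw, -1 loss."""
    if winner(board) is not None:
        return -1  # the previous player has just completed a line
    replies = successors(board)
    if not replies:
        return 0  # board full, nobody won
    return max(-true_value(b) for b in replies)


def build_table(start="........."):
    """Offline step: store the exact value of every reachable position."""
    table, stack = {}, [start]
    while stack:
        board = stack.pop()
        if board in table:
            continue
        table[board] = true_value(board)
        if winner(board) is None:
            stack.extend(successors(board))
    return table


def pick_move(board, table):
    """Play time: no search, just one table lookup per legal move."""
    return min(successors(board), key=lambda b: table[b])
```

The table only ever contains the three true values, and it is correct for every position by construction, which is exactly the standard set out in the test above.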
towforce
Posts: 11588
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Post by towforce »

You're basically right - but if a position was won, you'd want one more thing from the eval - distance to mate. If you had a choice of winning moves, your preference would be for the one that reaches mate first.
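
A small sketch of the distance-to-mate idea on a toy game, assuming a hypothetical rule set: a pile of stones, each move removes 1, 3 or 4 stones, and taking the last stone wins. The game has no draws, so it only shows the win/loss part. The names `build_dtm` and `best_move` are made up for this example.

```python
MOVES = (1, 3, 4)  # hypothetical toy rule: remove 1, 3 or 4 stones


def build_dtm(max_stones):
    """table[n] = (outcome, plies) for the side to move with n stones left."""
    table = {0: (-1, 0)}  # no stones left: the side to move has already lost
    for n in range(1, max_stones + 1):
        children = [table[n - m] for m in MOVES if m <= n]
        lost_for_opponent = [plies for outcome, plies in children if outcome == -1]
        if lost_for_opponent:
            # winning position: count the fastest win
            table[n] = (1, 1 + min(lost_for_opponent))
        else:
            # losing position: count the slowest loss
            table[n] = (-1, 1 + max(plies for _, plies in children))
    return table


def best_move(n, table):
    """Prefer a win over a loss, then the shortest win or the longest loss."""
    def score(m):
        outcome, plies = table[n - m]  # value from the opponent's point of view
        return (outcome, plies if outcome == -1 else -plies)
    return min((m for m in MOVES if m <= n), key=score)


if __name__ == "__main__":
    table = build_dtm(10)
    print(table[10], best_move(10, table))
```

With 10 stones, removing 1 and removing 3 both win, but removing 3 wins in 5 plies instead of 7. A table that stored only win/loss could not tell the two moves apart.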

To summarise your answer as to why the draw ratio is so high in completed TLCC games: the players all use similar computers for analysis, and this is causing groupthink.

I cannot prove that you're wrong, but here's a bit of evidence against that assertion:

* TLCC having such a high draw ratio in completed games is relatively recent

* if it's caused by groupthink, the players must therefore be relying more on the computers (or the high draw ratio in completed games would have been there previously)

* therefore, one would expect the computers playing each other to also have high draw ratios

* we're not (yet) seeing such a high draw ratio in computers playing each other

If the high draw ratio in completed games in TLCC is actually a reflection of the fact that a blunder is required for a win in chess, and there aren't many blunders in TLCC these days, then the above problem doesn't arise.