Why Chess Might Be Almost "Solved" IMO

Dann Corbit · Post by **Dann Corbit** » Thu Jan 03, 2008 8:35 pm

Uri Blass wrote:
Dann Corbit wrote:I think that when the board is cluttered, the best moves are often missed.
So if you had a 64-CPU version of Rybka, it would find moves that single-CPU Rybka missed and beat it handily. So the choices Rybka makes are not perfect or even nearly perfect. I also think that chess programs can still improve by hundreds of Elo.

I think that perfect play might be as difficult to achieve as proving the game. Note that I think that these are two completely different problems.
Chess is a game where one mistake may be enough to lose, so you do not need many mistakes from the opponent to be stronger than the opponent by hundreds of Elo.

I do not believe that engines make mistakes that change the theoretical result of the game in 30% of their moves.
If you talk about games that are not drawn, then in some of those games the loser made only one mistake and it was enough to lose.
If you talk about drawn games, I even suspect that some engine-engine games are perfect games, in the sense that no move changes the theoretical result of the game.

It does not mean that the games include no practical mistakes that make it easier for the opponent to draw, but even if we talk about practical mistakes (a term that is not defined), 30% seems to me too high an estimate.

Uri

When I say a mistake, I do not necessarily mean going from a winning move to a losing move, but perhaps getting a 0.03 pawn advantage when they could have had a 0.07 pawn advantage. Tiny advantages can add up over time because of better development or board control or whatever.

If you take every position in a game of Rybka versus Rybka at 40/2hrs, and then analyze each position for 24 hours, I think at least 30% of them will be different moves. That is what I mean by making a mistake. Now, it may be that only 2 moves by the losing side were the real cause of the loss, but which two of them is very hard to know a priori.
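The comparison described above can be sketched in a few lines of Python. This is only an illustration: the two move lists are made-up placeholders rather than real Rybka output, and `change_rate` is a hypothetical helper name.

```python
def change_rate(quick_moves, deep_moves):
    """Fraction of positions where the long search picked a different move."""
    if len(quick_moves) != len(deep_moves):
        raise ValueError("need one move per position from each search")
    if not quick_moves:
        return 0.0
    changed = sum(1 for q, d in zip(quick_moves, deep_moves) if q != d)
    return changed / len(quick_moves)


# Placeholder data (assumption): moves chosen at game speed vs. after long analysis.
quick = ["e4", "Nf3", "Bb5", "O-O", "Re1", "Bb3", "c3", "h3", "d4", "Nbd2"]
deep = ["e4", "Nf3", "Bc4", "O-O", "d3", "Bb3", "c3", "a4", "d4", "Nbd2"]

print(f"{change_rate(quick, deep):.0%} of the moves changed")
```

Note that this only counts changed moves; it says nothing about whether the long-search move is actually better.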

Dann Corbit · Post by **Dann Corbit** » Thu Jan 03, 2008 8:40 pm

smirobth wrote:
Dann Corbit wrote:Even the best chess engines make mistakes all the time. I guess that if two strong chess engines play a game of 100 moves, there are at least 30 mistakes in it.
I am very skeptical of this claim. Of course in part it will depend on what you mean by "mistakes". If by mistake you mean a move that (with perfect play) changes a won position to drawn or lost, or a drawn position to lost, then the estimate of 30 mistakes in 100 moves is clearly way too high. On the other hand other definitions of mistake will be unavoidably subjective.

See:
http://64.68.157.89/forum/viewtopic.php ... f2d75386e0

smirobth · Post by **smirobth** » Fri Jan 04, 2008 7:58 am

I agree with you that if you let a program such as Rybka search for 24 hours, it might select a different move than the move selected after 2 minutes of search about 30% of the time (it will of course depend on the game; in very highly tactical games the percentage would probably be much lower).

However, defining moves that are abandoned after a 24-hour search as mistakes is, IMO, not a very satisfactory definition of mistake. If you let the same program search for 2 years (about the same ratio to 1 day as 1 day is to two minutes), you would likely see almost the same 30% frequency of change in selected moves ... and many times you could see the 2-year move end up being the same as the 2-minute move, even in cases where the 24-hour move was different from the 2-minute one. For example, there are many trivially drawn positions where virtually no program would ever lose, but where the selected move will jump around and the evaluations will also jump around by a few centipawns.
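As a quick check of the time ratios mentioned above, here is the arithmetic spelled out in Python (a 365-day year is assumed):

```python
MINUTES_PER_DAY = 24 * 60

# How many times longer is 1 day than 2 minutes?
day_vs_two_minutes = MINUTES_PER_DAY / 2

# How many times longer is 2 years than 1 day? (assumes 365-day years)
two_years_vs_day = 2 * 365

print(day_vs_two_minutes, two_years_vs_day)
```

The two factors come out at 720 and 730, so the comparison is fair: each step multiplies the search time by roughly the same amount.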

Also what if program A selects move alpha after 2 minutes, but move beta after 24 hours; but equally strong program B selects move beta after 2 minutes and move alpha after 24 hours ... which move is the mistake?

And your definition of mistake also fails to consider factors such as who is the opponent. For example some programs (let's say program A) will always try to avoid locked positions since they want to maximize how well they do against humans. If a different program that is usually stronger against other programs (let's say program B) does not avoid the locked position and usually beats program A in head to head matches, but does less well than program A against humans, is it program A or program B that makes a mistake when there is the opportunity to close the position or keep it open?

Also, many of the "tiny advantages" that "can add up over time because of better development or board control or whatever" are not relevant to the positions at hand. Programmers will program in factors like center control, rooks on the 7th rank, degree of advancement of passed pawns, mobility, king safety, etc., because these are very frequently relevant. But there are also frequently positions where one or two of these "tiny advantages" are actually totally irrelevant to the position, and a program changing its mind about these factors is really just producing a certain amount of what might be considered noise in its evaluations (and thus also in its move selections).

Also there are some programs that seem to change their mind more often than other equally highly rated programs. Your definition of mistake would indicate that a program that changes its mind frequently makes more mistakes, but program ratings don't seem to bear that out.

For all these reasons and more, I think the idea of defining what constitutes a mistake (beyond moves that change the game theoretic result) is both very difficult and also inherently somewhat subjective. And I also think that for most reasonable definitions of mistake the 30% figure is too high.

Dann Corbit · Post by **Dann Corbit** » Fri Jan 04, 2008 8:14 am

Let me define clearly what I mean:
If a program plays perfect chess, then it will never change its mind about what move to make, no matter how short or how long the thinking.
If a program plays perfect chess, then any move it chooses from any position will always be the best move.
If a program plays perfect chess, then no matter what move it chooses, the opponent cannot outsmart the program that plays perfect chess.
If a program plays perfect chess and there is a mate in 5 then it must play a move that is a mate in 5 and not a mate in 6 or more.

I admit that thinking for a longer time does not always result in a better move (though it usually does). However, if a program never made a mistake it would never change its mind. The reason that programs change their minds is that they *think* they see something better. It may or may not be better in the real case, but they think it is.
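To make the idea of perfect play concrete, here is a minimal sketch on a toy game that, unlike chess, can be solved outright. The game is an assumption chosen only for illustration: players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins. The function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def wins(stones):
    """True if the side to move wins with perfect play."""
    # With no stones left, the side to move has already lost.
    return any(not wins(stones - k) for k in (1, 2, 3) if k <= stones)


def perfect_moves(stones):
    """Moves that keep the game-theoretic result; any other move is a mistake."""
    legal = [k for k in (1, 2, 3) if k <= stones]
    if not wins(stones):
        return legal  # already lost: no move changes the result
    return [k for k in legal if not wins(stones - k)]


print(perfect_moves(10))  # a won position
print(perfect_moves(8))   # a lost position
```

Because the result of every position is computed exactly, this player gives the same answer no matter how long it "thinks". In the lost position every move is equally perfect in the game-theoretic sense, which is the case the rest of the thread argues about.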

smirobth · Post by **smirobth** » Fri Jan 04, 2008 8:46 am

Dann Corbit wrote:
Let me define clearly what I mean:
If a program plays perfect chess, then it will never change its mind about what move to make, no matter how short or how long the thinking.
If a program plays perfect chess, then any move it chooses from any position will always be the best move.

Sorry for being such a nit-picker, but there are programs that, when in a tablebase position, play game-theoretic perfect chess, but if the game-theoretic result is a draw or loss for their side they will also run the chess engine looking for a way to induce a mistake by their opponent. Such programs both play perfect chess and also often change their minds. I tend to think these programs play better chess in these tablebase positions than programs that only grab the first move they see in the tablebase and never change their mind. Extending this to the hypothetical case of programs that had access to 32-man tablebases, you could end up in the situation where perfect-play program A starts the game with 1.f3 and follows it up with 2.Kf2 (if in this hypothetical the 32-man tablebase said these moves don't lose). But I would rather have perfect-play program B, which thought a little bit longer and waffled back and forth between 1.e4 and 1.d4 from among the 20 drawing first-move choices.
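The difference between program A and program B can be sketched as a two-stage choice: first keep only the moves with the best tablebase result, then break ties with a practical score from the engine. Everything below is hypothetical, including the `pick_move` helper, the result labels, and the numbers; nobody knows the true game-theoretic value of these first moves.

```python
RESULT_RANK = {"loss": 0, "draw": 1, "win": 2}


def pick_move(tablebase_result, practical_score):
    """Keep the moves with the best game-theoretic result, then take the one
    the engine considers most promising in practice."""
    best = max(RESULT_RANK[r] for r in tablebase_result.values())
    candidates = [m for m, r in tablebase_result.items() if RESULT_RANK[r] == best]
    return max(candidates, key=lambda m: practical_score.get(m, 0.0))


# Invented values: suppose the tablebase says all three moves draw.
results = {"f3": "draw", "e4": "draw", "d4": "draw"}
scores = {"f3": -0.40, "e4": 0.25, "d4": 0.20}

print(pick_move(results, scores))
```

A program that skips the second stage is still perfect in the game-theoretic sense, but the second stage is where a longer search can change the chosen move without any of the candidates being a mistake.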

Dann Corbit wrote:If a program plays perfect chess, then no matter what move it chooses, the opponent cannot outsmart the program that plays perfect chess.
If a program plays perfect chess and there is a mate in 5 then it must play a move that is a mate in 5 and not a mate in 6 or more.

I admit that thinking for a longer time does not always result in a better move (though it usually does). However, if a program never made a mistake it would never change its mind. The reason that programs change their minds is that they *think* they see something better. It may or may not be better in the real case, but they think it is.

Yes, I definitely agree with your last two sentences. I also agree with your original claim that Rybka's play is still far from perfect and that programs can still improve by hundreds of Elo. It was just the "30% of the moves are mistakes" claim I take exception to.

Dann Corbit · Post by **Dann Corbit** » Fri Jan 04, 2008 9:53 am

smirobth wrote:
Sorry for being such a nit-picker, but there are programs that, when in a tablebase position, play game-theoretic perfect chess, but if the game-theoretic result is a draw or loss for their side they will also run the chess engine looking for a way to induce a mistake by their opponent. Such programs both play perfect chess and also often change their minds. I tend to think these programs play better chess in these tablebase positions than programs that only grab the first move they see in the tablebase and never change their mind. Extending this to the hypothetical case of programs that had access to 32-man tablebases, you could end up in the situation where perfect-play program A starts the game with 1.f3 and follows it up with 2.Kf2 (if in this hypothetical the 32-man tablebase said these moves don't lose). But I would rather have perfect-play program B, which thought a little bit longer and waffled back and forth between 1.e4 and 1.d4 from among the 20 drawing first-move choices.

If a program plays perfect chess and the opening move is a mate in 300 and every one of the 20 moves is a mate in 300, then all moves are equal. But if any move is better, then the perfect program must choose that move, and if any move is worse, then the perfect program must not choose that move.

Dann Corbit wrote:If a program plays perfect chess, then no matter what move it chooses, the opponent cannot outsmart the program that plays perfect chess.
If a program plays perfect chess and there is a mate in 5 then it must play a move that is a mate in 5 and not a mate in 6 or more.

I admit that thinking for a longer time does not always result in a better move (though it usually does). However, if a program never made a mistake it would never change its mind. The reason that programs change their minds is that they *think* they see something better. It may or may not be better in the real case, but they think it is.
Yes, I definitely agree with your last two sentences. I also agree with your original claim that Rybka's play is still far from perfect and that programs can still improve by hundreds of Elo. It was just the "30% of the moves are mistakes" claim I take exception to.

At least 30% of moves played are not the best possible move (though expanding the search to 100x normal search time may not tell us what they are). If a program plays perfect chess, then all of these less than perfect moves are mistakes.

I think that a program that plays perfect chess is probably just as hard to achieve as proving the game, but that is only speculation on my part.

Finally, if White has at least a drawn position (if the game were to be solved), then a perfect chess program would literally never lose to any opponent or set of opponents at any time control in any single game.

Marek Soszynski · Post by **Marek Soszynski** » Fri Jan 04, 2008 10:16 am

I'm disappointed that Paul Gift's original arguments and questions have been largely left behind, but I suppose that's how these threads go.

smirobth wrote:Sorry for being such a nit-picker, but there are programs that, when in a tablebase position, play game-theoretic perfect chess, but if the game-theoretic result is a draw or loss for their side they will also run the chess engine looking for a way to induce a mistake by their opponent. Such programs both play perfect chess and also often change their minds. I tend to think these programs play better chess in these tablebase positions than programs that only grab the first move they see in the tablebase and never change their mind. Extending this to the hypothetical case of programs that had access to 32-man tablebases, you could end up in the situation where perfect-play program A starts the game with 1.f3 and follows it up with 2.Kf2 (if in this hypothetical the 32-man tablebase said these moves don't lose). But I would rather have perfect-play program B, which thought a little bit longer and waffled back and forth between 1.e4 and 1.d4 from among the 20 drawing first-move choices.

What is perfect play in a drawn or lost position? What has "waffled back and forth" to do with perfection?

If there were 32-man tablebases then ratings lists would not become obsolete. Engine Delta could rate more highly than engine Epsilon for two reasons. 1) Delta accesses tablebases more efficiently than Epsilon (which could also be because they are running on different hardware), so much so as to make a difference at relatively short time controls. 2) Delta performs better than Epsilon when facing opponents that don't have access to 32-man tablebases. That is, it "waffles" more successfully in drawn or lost positions.

YL84 · Post by **YL84** » Fri Jan 04, 2008 10:28 am

towforce wrote: Surely it is obvious that the deeper a computer can search, the less knowledge it will need? Many pieces of knowledge that are very important to evaluate at a shallow search depth simply won't be needed at a deeper search depth - and hence they can be removed, and the evaluation can run faster - itself increasing the search depth!

Your assumption is not exact.
We could say that "any knowledge has an equivalent search depth". Some knowledge will save you from searching 10 plies deeper (passed pawns, for example); some will save you only 2 plies. The difficulty is to predict what your gain will be, because the gains are not constant. Some terms in the evaluation slow the search and give nothing most of the time; some terms are very important and give you more than 10 plies. Maybe the art of the programmer is to add long-term knowledge to the evaluation function, so that these terms are valuable in the endgame or in the middlegame-endgame transition, as has been said below.
My two cents,
Yves
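The trade-off described above can be put into a rough back-of-the-envelope model. This is only a sketch under a stated assumption: that the number of nodes searched grows like `branching_factor ** depth`, so an evaluation that makes the search `slowdown` times slower costs `log(slowdown) / log(branching_factor)` plies. The function names and the default branching factor of 2.0 are hypothetical, not taken from any engine.

```python
import math


def plies_lost(slowdown, branching_factor=2.0):
    """Depth lost when the search runs `slowdown` times slower.

    Assumes nodes searched ~ branching_factor ** depth (a simplification).
    """
    return math.log2(slowdown) / math.log2(branching_factor)


def net_plies(equivalent_depth, slowdown, branching_factor=2.0):
    """Net gain of an evaluation term: plies it is 'worth' minus plies it costs."""
    return equivalent_depth - plies_lost(slowdown, branching_factor)
```

Under this model a term worth 10 plies is a bargain even if it makes the search four times slower, while a term worth 2 plies at the same cost gains nothing. The hard part, as the post says, is that the "equivalent depth" of a term is not a constant and has to be estimated position by position.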

towforce · Post by **towforce** » Fri Jan 04, 2008 10:37 am

smirobth wrote:For all these reasons and more, I think the idea of defining what constitutes a mistake (beyond moves that change the game theoretic result) is both very difficult and also inherently somewhat subjective. And I also think that for most reasonable definitions of mistake the 30% figure is too high.

If the start position is a theoretical draw, and if one's opponent plays almost perfectly (a heroic assumption at this time, I know), then it would only be possible to make one mistake (as defined by me in the opening post) in a game - so really the definition should be probability of making a mistake in positions in which it is possible to make a mistake.
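A minimal sketch of that revised definition, assuming a hypothetical oracle that gives the game-theoretic value of every legal move (no such oracle exists for full chess today). Positions where all legal moves have the same value are skipped, because no mistake is possible there.

```python
# Game-theoretic values from the mover's point of view (hypothetical encoding).
WIN, DRAW, LOSS = 1, 0, -1


def mistake_rate(records):
    """Mistakes divided by the number of positions where a mistake was possible.

    records: iterable of (played_value, legal_values) pairs, where legal_values
    lists the oracle value of every legal move in the position.
    Returns None if no position offered the chance to go wrong.
    """
    possible = 0
    mistakes = 0
    for played, legal in records:
        best = max(legal)
        if min(legal) == best:
            continue  # every move leads to the same result: no mistake possible
        possible += 1
        if played < best:
            mistakes += 1
    return mistakes / possible if possible else None
```

Once a side has reached a lost position, the remaining moves drop out of the denominator, which is why at most one mistake per game can be counted against a perfect opponent.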

smirobth · Post by **smirobth** » Fri Jan 04, 2008 4:33 pm

Dann Corbit wrote:
smirobth wrote:
Dann Corbit wrote:
smirobth wrote:
Dann Corbit wrote:
Uri Blass wrote:
Dann Corbit wrote:I think that when the board is cluttered, the best moves are often missed.
So that if you had a 64 CPU version of Rybka, it will find moves that single CPU Rybka missed and beat it handily. So the choices Rybka makes are not perfect or even nearly perfect. I also think that chess programs can improve by hundreds of Elo still.

I think that perfect play might be as difficult to achieve as proving the game. Note that I think that these are two completely different problems.
Chess is a game when one mistake may be enough to lose so you do not need many mistakes of the opponent to be significantly stronger than the opponent by hundreds of elo.

I do not believe that engines make mistakes that change the theoretical result of the game in 30% of their moves.
If you talk about games that are not draw than in part of the games
the loser did only one mistake and it was enough to lose.
If you talk about drawn games
I even suspect that part of the engine-engine games are perfect games in the meaning that no move changes the theoretical result of the game.

It does not mean that the games include no practical mistakes that make the life of the opponent easier to draw but even if we talk about practical mistake(term that is not defined) then 30% seems to me an estimate that is too high.

Uri
When I say a mistake, I do not necessarily mean going from a winning move to a losing move, but perhaps getting a 0.03 pawn advantage when they could have had a .07 pawn advantage. Tiny advantages can add up over time because of better development or board control or whatever.

If you take every position in a game of Rybka verses Rybka at 40/2hrs, and then analyze each position for 24 hours, I think at least 30% of them will be different moves. That is what I mean by making a mistake. Now, it may be that only 2 moves by the losing side were the real cause of the loss, but which two of them is very hard to know a-priori.
I agree with you that if you let a program such as Rybka think for 24 hours of search it might select a different move than the move selected after 2 minutes of search about 30% of the time (it will of course depend on the game, in very highly tactical games the percent would probably be much lower).

However defining moves that are abandoned after a 24 hour search as mistakes is, IMO, not a very satisfactory definition of mistake. If you let the same program search for 2 years (about the same ratio to 1 day as 1 day is to two minutes) you would likely see almost the same 30% frequency of change in selected moves ... and many times you could see the 2 year move end up being the same as the 2 minute move, even in cases where the 24 hour move was different from the two minute one. For example there are many trivially drawn positions, where virtually no program would ever lose, but where the selected move will jump around and also the evaluations will jump around a few centipawns.

Also what if program A selects move alpha after 2 minutes, but move beta after 24 hours; but equally strong program B selects move beta after 2 minutes and move alpha after 24 hours ... which move is the mistake?

And your definition of mistake also fails to consider factors such as who is the opponent. For example some programs (let's say program A) will always try to avoid locked positions since they want to maximize how well they do against humans. If a different program that is usually stronger against other programs (let's say program B) does not avoid the locked position and usually beats program A in head to head matches, but does less well than program A against humans, is it program A or program B that makes a mistake when there is the opportunity to close the position or keep it open?

Also many of the "tiny advantages" that "can add up over time because of better development or board control or whatever" are not relevant to the positions at hand. Programmers will program in factors like center control, rooks on the 7th rank, degree of advancement of passed pawns, mobility, king safety etc. etc. etc. etc. because these very frequently are relevant. But there are also frequently positions where one or two of these "tiny advantages" are actually totally irrelevant to the position and a program changing its mind about these factors is really just producing a certain amount of what might be considered noise in its evaluations (and thus also move selections).

Also there are some programs that seem to change their mind more often than other equally highly rated programs. Your definition of mistake would indicate that a program that changes its mind frequently makes more mistakes, but program ratings don't seem to bear that out.

For all these reasons and more, I think the idea of defining what constitutes a mistake (beyond moves that change the game theoretic result) is both very difficult and also inherently somewhat subjective. And I also think that for most reasonable definitions of mistake the 30% figure is too high.
Let me define clearly what I mean:
If a program plays perfect chess, then it will never change its mind about what move to make, no matter how short or how long the thinking.
If a program plays perfect chess, then any move it chooses from any position will always be the best move.
Sorry for being such a nit-picker, but there are programs that, when in a tablebase position, play game-theoretically perfect chess, but if the game-theoretic result is a draw or loss for their side they will also run the chess engine looking for a way to induce a mistake by their opponent. Such programs both play perfect chess and also often change their minds. I tend to think these programs play better chess when in these tablebase positions than programs that only grab the first move they see in the tablebase and never change their mind. Extending this to the hypothetical case of programs that had access to 32-man tablebases, you could end up in the situation where perfect play program A starts the game with 1.f3 and follows it up with 2.Kf2 (if in this hypothetical the 32-man tablebase said these moves don't lose). But I would rather have perfect play program B, that thought a little bit longer and waffled back and forth between 1.e4 and 1.d4 from among the 20 drawing first move choices.
If a program plays perfect chess and the opening move is a mate in 300 and every one of the 20 moves is a mate in 300, then all moves are equal. But if any move is better, then the perfect program must choose that move, and if any move is worse, then the perfect program must not choose that move.

I think that the odds that White has a forced win on move one are somewhat less than 1% and what I wrote above was based on assuming White doesn't have a forced win.

Dann Corbit wrote:If a program plays perfect chess, then no matter what move it chooses, the opponent cannot outsmart the program that plays perfect chess.
If a program plays perfect chess and there is a mate in 5 then it must play a move that is a mate in 5 and not a mate in 6 or more.

I admit that thinking for a longer time does not result in a better move (though it usually does). However, if a program never made a mistake it would never change its mind. The reason that programs change their minds is that they *think* they see something better. It may or may not be better in the real case, but they think it is.
Yes, I definitely agree with your last two sentences. I also agree with your original claim that Rybka's play is still far from perfect and that programs can still improve by hundreds of elo. It was just the "30% of the moves are mistakes" claim I take exception to.
At least 30% of moves played are not the best possible move (though expanding the search to 100x normal search time may not tell us what they are). If a program plays perfect chess, then all of these less than perfect moves are mistakes.

I think that a program that plays perfect chess is probably just as hard to achieve as proving the game, but that is only speculation on my part.

Finally, if white has at least a drawn position {if the game were to be solved} then a perfect chess program would literally never lose to any opponent or set of opponents at any time control in any single game.

Dann, you keep talking about mistakes, best possible move, and perfect play as though the meaning of these is obvious. But in a position where the game-theoretic outcome is a draw (as is probably the case in the starting position), the "best possible move" is, IMO, not easy to define.
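For reference, the "changes its mind" measurement argued over in the quoted exchange can be sketched as below. It is only an illustration: the function name is hypothetical, and the move lists would have to come from actually running an engine at two different time limits on the same positions.

```python
def change_rate(short_moves, long_moves):
    """Fraction of positions where the long search picked a different move.

    short_moves, long_moves: moves chosen for the same positions, in the same
    order, at the short and the long time limit.
    """
    if len(short_moves) != len(long_moves):
        raise ValueError("both lists must cover the same positions")
    if not short_moves:
        return 0.0
    changed = sum(s != l for s, l in zip(short_moves, long_moves))
    return changed / len(short_moves)
```

A high change rate by itself does not say which of the two moves, if either, was the mistake; that still needs a definition of "best" for positions where several moves keep the same game-theoretic result.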
