Dann Corbit wrote:
Uri Blass wrote:
Dann Corbit wrote:
I think that when the board is cluttered, the best moves are often missed.
So if you had a 64-CPU version of Rybka, it would find moves that single-CPU Rybka missed and beat it handily. So the choices Rybka makes are not perfect or even nearly perfect. I also think that chess programs can still improve by hundreds of Elo.
I think that perfect play might be as difficult to achieve as proving the game. Note that I think that these are two completely different problems.
Chess is a game where one mistake may be enough to lose, so a program does not need many mistakes from its opponent to be stronger than that opponent by hundreds of Elo.
I do not believe that engines make mistakes that change the theoretical result of the game in 30% of their moves.
If you talk about games that are not drawn, then in some of those games the loser made only one mistake and it was enough to lose.
If you talk about drawn games, I even suspect that some engine-engine games are perfect games, in the sense that no move changes the theoretical result of the game.
This does not mean that the games include no practical mistakes that make it easier for the opponent to draw, but even if we talk about practical mistakes (a term that is not defined), 30% seems to me too high an estimate.
Uri
When I say a mistake, I do not necessarily mean going from a winning move to a losing move, but perhaps getting a 0.03 pawn advantage when a 0.07 pawn advantage was available. Tiny advantages can add up over time because of better development or board control or whatever.
If you take every position in a game of Rybka versus Rybka at 40/2hrs, and then analyze each position for 24 hours, I think the chosen move will be different in at least 30% of them. That is what I mean by making a mistake. Now, it may be that only 2 moves by the losing side were the real cause of the loss, but which two of them is very hard to know a priori.
I agree with you that if you let a program such as Rybka search for 24 hours, it might select a different move than the one selected after 2 minutes of search about 30% of the time (this will of course depend on the game; in very highly tactical games the percentage would probably be much lower).
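To be concrete about what that 30% figure measures, here is a minimal sketch in Python. It assumes you already have, for each position in a game, the move that was played and the move preferred after the long search. The function name and the move lists are made up for illustration; no engine is actually run here.

```python
def move_change_rate(played_moves, deep_moves):
    """Fraction of positions where the long-search move differs from the move played."""
    if len(played_moves) != len(deep_moves):
        raise ValueError("need one long-search move per played move")
    if not played_moves:
        return 0.0
    changed = sum(1 for played, deep in zip(played_moves, deep_moves) if played != deep)
    return changed / len(played_moves)


# Hypothetical data: ten positions from one side of a game.
played_moves = ["e4", "Nf3", "Bb5", "O-O", "Re1", "Bb3", "c3", "h3", "d4", "Nbd2"]
deep_moves = ["e4", "Nf3", "Bc4", "O-O", "d3", "Bb3", "c3", "a4", "d4", "Nbd2"]

rate = move_change_rate(played_moves, deep_moves)
print(f"{rate:.0%} of moves changed")  # 3 of the 10 moves differ
```

Note that this only counts how often the selected move changes. It says nothing about whether the changed move is actually better, which is the point at issue.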
However, defining moves that are abandoned after a 24 hour search as mistakes is, IMO, not a very satisfactory definition of mistake. If you let the same program search for 2 years (about the same ratio to 1 day as 1 day is to 2 minutes) you would likely see almost the same 30% frequency of change in selected moves ... and many times you could see the 2 year move end up being the same as the 2 minute move, even in cases where the 24 hour move was different from the 2 minute one. For example, there are many trivially drawn positions where virtually no program would ever lose, but where the selected move will jump around and the evaluations will also jump around by a few centipawns.
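The "about the same ratio" claim is easy to check with a few lines of Python. The only assumption is a 365-day year.

```python
MINUTES_PER_DAY = 24 * 60  # 1440
DAYS_PER_YEAR = 365        # assumption: ignoring leap years

# How many times longer is 1 day than 2 minutes?
two_minutes_to_one_day = MINUTES_PER_DAY / 2

# How many times longer is 2 years than 1 day?
one_day_to_two_years = 2 * DAYS_PER_YEAR

print(two_minutes_to_one_day)  # 720.0
print(one_day_to_two_years)    # 730
```

So each step multiplies the search time by a factor of a little over 700.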
Also what if program A selects move alpha after 2 minutes, but move beta after 24 hours; but equally strong program B selects move beta after 2 minutes and move alpha after 24 hours ... which move is the mistake?
And your definition of mistake also fails to consider factors such as who is the opponent. For example some programs (let's say program A) will always try to avoid locked positions since they want to maximize how well they do against humans. If a different program that is usually stronger against other programs (let's say program B) does not avoid the locked position and usually beats program A in head to head matches, but does less well than program A against humans, is it program A or program B that makes a mistake when there is the opportunity to close the position or keep it open?
Also many of the "tiny advantages" that "can add up over time because of better development or board control or whatever" are not relevant to the positions at hand. Programmers will program in factors like center control, rooks on the 7th rank, degree of advancement of passed pawns, mobility, king safety etc. because these are very frequently relevant. But there are also frequently positions where one or two of these "tiny advantages" are actually totally irrelevant, and a program changing its mind about these factors is really just producing a certain amount of what might be considered noise in its evaluations (and thus also in its move selections).
Also there are some programs that seem to change their mind more often than other equally highly rated programs. Your definition of mistake would indicate that a program that changes its mind frequently makes more mistakes, but program ratings don't seem to bear that out.
For all these reasons and more, I think defining what constitutes a mistake (beyond moves that change the game-theoretic result) is both very difficult and inherently somewhat subjective. And I also think that for most reasonable definitions of mistake the 30% figure is too high.