lkaufman wrote: ↑Wed Aug 30, 2023 9:20 pm
My article on comparing great players of the past and present by their "Accuracy" measured against chess engines was published on chess.com on Monday, and has received a lot of publicity and commentary. https://www.chess.com/article/view/ches ... tings-goat. The key innovation in my method was to exclude draws from the study, primarily because draws get much higher accuracy scores than wins, so including them unduly favors players with high draw percentages, especially short draws (which may even be "Perfect"). I also found a way to adjust for opponents' strength.
Although the fit with actual ratings of modern players was very good, much better than expected, a few of the historical players seem to have unreasonably low "absolute" ratings. I don't know the reason for this in general. Anyway, if anyone reading the article has ideas of how it might be improved in the future, let me know here. I warn you though, it's not easy to suggest improvements that do more good than harm in general, I have tried!
I do not like ignoring part of the data, and I think that accuracy is also a function of the positions that the player gets.
I think it may be better to test the accuracy of relatively weak engines (or strong engines at a very small depth) on the positions from the game, to determine the expected accuracy of a 2400 or 2600 player in that game, and then rate the specific game by how the humans performed relative to the engines.
Note that I am not sure whether strong engines at small depth emulate humans better than weak engines do.
I can add that you could give little weight to theory moves that are played very often (I do not say no weight, because humans' choice to play some inaccurate lines, like the King's Gambit, is also part of their strength).
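A rough sketch of how this proposal might be computed, assuming the per-move accuracy numbers are already available. Everything here is hypothetical and not part of the article's method: the `MoveRecord` type, the `THEORY_WEIGHT` value of 0.2, and the linear interpolation between the 2400 and 2600 reference engines are illustrative choices only.

```python
from dataclasses import dataclass

# Hypothetical weight for well-known theory moves: small, but not zero,
# because the choice of opening is also part of a player's strength.
THEORY_WEIGHT = 0.2


@dataclass
class MoveRecord:
    human_acc: float          # accuracy (0-100) of the move the human played
    ref_acc: dict             # reference rating -> accuracy of that engine's move here
    is_theory: bool = False   # True if the move is frequently played theory


def weighted_mean(values, weights):
    """Weighted average of values; weights must not sum to zero."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def game_rating(moves, low=2400, high=2600):
    """Rate one game by comparing the human's accuracy with the accuracy
    that reference engines of strength `low` and `high` reach on the
    same positions (linear interpolation, an assumption of this sketch)."""
    weights = [THEORY_WEIGHT if m.is_theory else 1.0 for m in moves]
    human = weighted_mean([m.human_acc for m in moves], weights)
    acc_low = weighted_mean([m.ref_acc[low] for m in moves], weights)
    acc_high = weighted_mean([m.ref_acc[high] for m in moves], weights)
    if acc_high == acc_low:
        raise ValueError("reference engines are indistinguishable in this game")
    return low + (high - low) * (human - acc_low) / (acc_high - acc_low)


if __name__ == "__main__":
    # Made-up numbers, only to show the call.
    game = [
        MoveRecord(100.0, {2400: 100.0, 2600: 100.0}, is_theory=True),
        MoveRecord(90.0, {2400: 80.0, 2600: 90.0}),
        MoveRecord(80.0, {2400: 70.0, 2600: 90.0}),
    ]
    print(round(game_rating(game)))
```

Because the reference accuracies are measured on the same positions the human faced, a sharp game where even the 2600 engine scores poorly does not drag the human's rating down the way a fixed accuracy-to-rating table would.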
Fritz 0 wrote: ↑Thu Aug 31, 2023 2:34 pm
It's hard to believe that la Bourdonnais was weaker than me(!) and Staunton just a bit stronger(!!). Maybe it would be somewhat different if draws were included, but I suppose there were not many draws back then anyways.
As for other things, it's amazing how little Carlsen's play drops from classical to rapid, and not even the official rapid (15'+10'') but a much faster one.
Similarly for me it seems strange that in my early 60s I played better (according to this metric) than Morphy or Rubinstein or Euwe at their peaks! I cannot imagine winning a match from any of those great players at any time in my life. But the method predicts current ratings quite well, so I don't have an explanation. Regarding the dropoff from classical to Rapid, remember that Kasparov crushed the Israeli team (2600+) in a simul, which is sort of like classical to rapid odds, so that is easy for me to believe.
Just on the basis of looking at some of Staunton's games, I'm sure that he would demolish me.
Besides, all those top players from the 19th century regularly and routinely played blindfold, didn't they? I can't imagine one of today's 1900 Elo players playing a decent blindfold game (and I am certainly not capable of that either).
lkaufman wrote: ↑Wed Aug 30, 2023 9:20 pm
My article on comparing great players of the past and present by their "Accuracy" measured against chess engines was published on chess.com on Monday, and has received a lot of publicity and commentary. https://www.chess.com/article/view/ches ... tings-goat. The key innovation in my method was to exclude draws from the study, primarily because draws get much higher accuracy scores than wins, so including them unduly favors players with high draw percentages, especially short draws (which may even be "Perfect"). I also found a way to adjust for opponents' strength.
Although the fit with actual ratings of modern players was very good, much better than expected, a few of the historical players seem to have unreasonably low "absolute" ratings. I don't know the reason for this in general. Anyway, if anyone reading the article has ideas of how it might be improved in the future, let me know here. I warn you though, it's not easy to suggest improvements that do more good than harm in general, I have tried!
I do not like ignoring part of the data, and I think that accuracy is also a function of the positions that the player gets.
I think it may be better to test the accuracy of relatively weak engines (or strong engines at a very small depth) on the positions from the game, to determine the expected accuracy of a 2400 or 2600 player in that game, and then rate the specific game by how the humans performed relative to the engines.
Note that I am not sure whether strong engines at small depth emulate humans better than weak engines do.
I can add that you could give little weight to theory moves that are played very often (I do not say no weight, because humans' choice to play some inaccurate lines, like the King's Gambit, is also part of their strength).
I had the very same idea, probably using Komodo Dragon with Elo settings (which are short searches). But I wanted to see what could be done with just currently available data. Perhaps I'll try this in the future.
Fritz 0 wrote: ↑Thu Aug 31, 2023 2:34 pm
It's hard to believe that la Bourdonnais was weaker than me(!) and Staunton just a bit stronger(!!). Maybe it would be somewhat different if draws were included, but I suppose there were not many draws back then anyways.
As for other things, it's amazing how little Carlsen's play drops from classical to rapid, and not even the official rapid (15'+10'') but a much faster one.
Similarly for me it seems strange that in my early 60s I played better (according to this metric) than Morphy or Rubinstein or Euwe at their peaks! I cannot imagine winning a match from any of those great players at any time in my life. But the method predicts current ratings quite well, so I don't have an explanation. Regarding the dropoff from classical to Rapid, remember that Kasparov crushed the Israeli team (2600+) in a simul, which is sort of like classical to rapid odds, so that is easy for me to believe.
Just on the basis of looking at some of Staunton's games, I'm sure that he would demolish me.
Besides, all those top players from the 19th century regularly and routinely played blindfold, didn't they? I can't imagine one of today's 1900 Elo players playing a decent blindfold game (and I am certainly not capable of that either).
Well, not all, but Paulsen in particular could play several games simul blindfold, and he was only about 2100 or so at the time by my metric, which does seem unreasonably low. I did have one chess friend in my youth who was about 1900 back then who could play decent blindfold chess, better than I could. He did reach 2200+ FIDE, 2300+ USCF, but my lifetime record in tournament play vs. him was 20 to 0 without a single draw! So blindfold chess isn't that highly correlated with actual chess skill. But your point is valid, those ratings do seem too low. I'm trying to investigate the reason.
Fritz 0 wrote: ↑Thu Aug 31, 2023 2:34 pm
It's hard to believe that la Bourdonnais was weaker than me(!) and Staunton just a bit stronger(!!). Maybe it would be somewhat different if draws were included, but I suppose there were not many draws back then anyways.
As for other things, it's amazing how little Carlsen's play drops from classical to rapid, and not even the official rapid (15'+10'') but a much faster one.
Similarly for me it seems strange that in my early 60s I played better (according to this metric) than Morphy or Rubinstein or Euwe at their peaks! I cannot imagine winning a match from any of those great players at any time in my life. But the method predicts current ratings quite well, so I don't have an explanation. Regarding the dropoff from classical to Rapid, remember that Kasparov crushed the Israeli team (2600+) in a simul, which is sort of like classical to rapid odds, so that is easy for me to believe.
Just on the basis of looking at some of Staunton's games, I'm sure that he would demolish me.
Besides, all those top players from the 19th century regularly and routinely played blindfold, didn't they? I can't imagine one of today's 1900 Elo players playing a decent blindfold game (and I am certainly not capable of that either).
I am not capable of it, but in the past I knew weaker players who could play a decent blindfold game.
You cannot draw conclusions about playing strength from the ability to play blindfold.
I was never a good player, but I was able to play two blindfold games simultaneously when I was thirty, and now, at almost sixty, I play blindfold games against weak engines in Lucas chess at least once a month. I believe that any player over 2200 is able to play blindfold, and perhaps most players over 2000 Elo, especially if they play long games regularly.
matejst wrote: ↑Sat Sep 02, 2023 1:35 am
I was never a good player, but I was able to play two blindfold games simultaneously when I was thirty, and now, at almost sixty, I play blindfold games against weak engines in Lucas chess at least once a month. I believe that any player over 2200 is able to play blindfold, and perhaps most players over 2000 Elo, especially if they play long games regularly.
I believe that you are wrong and there are 2200 players who cannot play blindfold.
I was never 2200, but I had a FIDE rating above 2000, and my weaknesses in chess are not related to being unable to play blindfold.
I believe that I could become 2200 without being able to play a single game blindfold.
I play long games regularly and I do not think playing long games regularly is relevant.
carldaman wrote: ↑Fri Sep 01, 2023 2:43 am
Re: Pillsbury and Tarrasch, they really had comparably strong results, despite Pillsbury's accuracy being 'clearly' better, as you state. Perhaps some of that can be accounted for by certain stylistic differences, but it could also be that the absence of drawn game data is giving an incomplete picture.
But Pillsbury might not have reached his full potential in the 1890s, since illness eventually derailed his career, so the greater accuracy may be an indication of that.
Many of the chess players of the past had health problems (both physical and mental). Maybe that could explain why they couldn't reach their full potential.
Pillsbury, Tal, Keres (and probably many others) struggled with serious health issues. Also, healthcare and medicine were not that good back in those days.