Massive Elo calculation

Peteshnick · Post by **Peteshnick** » Mon Jun 14, 2010 3:02 am

Say you took the whole Chessbase database and ran ELOstat on it. (The program only takes 2000 or so different players, but say for argument's sake it could take more...) Assuming player strengths don't change significantly over time, wouldn't this give you a set of ratings by which you could compare, say, Morphy and Kasparov?

The reason I am using the whole Chessbase database as an example is because the larger the database the higher the chance you get a connected graph (nodes are players, edges are games). You need connectedness to compare arbitrary players I would think.
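For what it's worth, the connectedness check itself is easy to sketch. The snippet below is only an illustration: it assumes the games have already been reduced to (white, black) name pairs, and the function name and sample pairs are placeholders I made up, not anything read out of a real database.

```python
from collections import defaultdict, deque

def connected_components(games):
    """Split players into groups that are linked by at least one chain of games.

    `games` is an iterable of (white, black) name pairs (an assumed input format).
    """
    graph = defaultdict(set)
    for white, black in games:
        graph[white].add(black)
        graph[black].add(white)

    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search from each player not yet visited.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            player = queue.popleft()
            component.add(player)
            for opponent in graph[player]:
                if opponent not in seen:
                    seen.add(opponent)
                    queue.append(opponent)
        components.append(component)
    return components

# Placeholder pairings, only to show the shape of the input.
games = [("Morphy", "Anderssen"), ("Anderssen", "Steinitz"), ("Kasparov", "Karpov")]
components = connected_components(games)
print(len(components))  # 2: the two groups cannot be compared with each other
```

If this returns more than one component, no rating program can place players from different components on a common scale.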

bob · Post by **bob** » Mon Jun 14, 2010 4:06 am

Peteshnick wrote: Say you took the whole Chessbase database and ran ELOstat on it. (The program only takes 2000 or so different players, but say for argument's sake it could take more...) Assuming player strengths don't change significantly over time, wouldn't this give you a set of ratings by which you could compare, say, Morphy and Kasparov?

The reason I am using the whole Chessbase database as an example is because the larger the database the higher the chance you get a connected graph (nodes are players, edges are games). You need connectedness to compare arbitrary players I would think.

If you are not careful, you end up with several "pools" of players, say the players of Morphy's era, or Fischer's, etc. If you don't have a lot of cross-pool games, this won't be accurate at all. The real problem is that the body of games from 100 years ago or more is not very large compared to what we have today, which is practically every game played on the planet, chess servers included if you want.

In any case, if you have players A, B and C that play each other a lot, and then players X, Y and Z that play each other a lot, and you have one or two games where B played Z, but no other cross-pool games, then the rating difference between any player in the set {A, B, C} and any in the set {X, Y, Z} won't be very meaningful.
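To put a number on how little one or two cross-pool games can say, here is a rough sketch. It assumes the standard logistic Elo expected-score curve, E = 1 / (1 + 10^(-d/400)), and inverts it to get the rating gap implied by a score. EloStat and BayesElo use their own estimation models, so this is an illustration of the coarseness, not a reproduction of either program; the function name is made up.

```python
import math

def implied_elo_difference(points, games):
    """Rating gap implied by scoring `points` out of `games`, under the logistic Elo curve."""
    p = points / games
    if p <= 0:
        return -math.inf  # a zero score implies an unbounded deficit
    if p >= 1:
        return math.inf   # a perfect score implies an unbounded advantage
    return 400 * math.log10(p / (1 - p))

# Every possible outcome of a two-game "bridge" between B and Z.
for points in (0, 0.5, 1, 1.5, 2):
    print(points, implied_elo_difference(points, 2))
```

With only two games, the estimated offset between the pools can take just five values: minus infinity, about -191, 0, about +191, or plus infinity. A single result moves both pools relative to each other by nearly 200 points.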

Would be an interesting question to see answered, "if you take a group of less-than-super-GM players of today and count the number of PGN games available, and compare to a similar group of players from 100 years ago, how does the total number of games compare?" Almost certainly not very equally. The challenge would be to find the "connecting players" that played against (say) Fischer or his opponents and (say) Kasparov or his. And Fischer was playing 40 years ago so he is pretty recent. I'd bet even those "connector games" would be difficult to find. Going back to Morphy, Tal, Capablanca, Petrosian, Reti, etc. would likely prove a greater challenge...

Just dumping them into BayesElo or EloStat won't do much unless there are lots of games connecting most of the players solidly.

Christopher Conkie · Post by **Christopher Conkie** » Mon Jun 14, 2010 4:27 am

Peteshnick wrote: Say you took the whole Chessbase database and ran ELOstat on it. (The program only takes 2000 or so different players, but say for argument's sake it could take more...) Assuming player strengths don't change significantly over time, wouldn't this give you a set of ratings by which you could compare, say, Morphy and Kasparov?

The reason I am using the whole Chessbase database as an example is because the larger the database the higher the chance you get a connected graph (nodes are players, edges are games). You need connectedness to compare arbitrary players I would think.

I am not sure if your idea would work unless the database were very large indeed. Most attempts at this that I have seen involve using a benchmark engine and comparing similarities of move.

Matej Guid and Ivan Bratko developed a method that used Crafty as such a benchmark.

http://www.chessbase.com/newsdetail.asp?newsid=3455

Another more recent attempt that you may find interesting is contained in the paper below.

http://web.zone.ee/chessanalysis/summary450.pdf

They found that the relative Elo of top players has increased by some 300 Elo over a 120-year period. I would imagine it is not that easy to compare Morphy with Kasparov by using ELOstat because of this anomaly.

Chris

Peteshnick · Post by **Peteshnick** » Mon Jun 14, 2010 4:36 am

bob wrote:
If you are not careful, you end up with several "pools" of players, say the players of Morphy's era, or Fischer's, etc. If you don't have a lot of cross-pool games, this won't be accurate at all. The real problem is that the body of games from 100 years ago or more is not very large compared to what we have today, which is practically every game played on the planet, chess servers included if you want.

In any case, if you have players A, B and C that play each other a lot, and then players X, Y and Z that play each other a lot, and you have one or two games where B played Z, but no other cross-pool games, then the rating difference between any player in the set {A, B, C} and any in the set {X, Y, Z} won't be very meaningful.

Would be an interesting question to see answered, "if you take a group of less-than-super-GM players of today and count the number of PGN games available, and compare to a similar group of players from 100 years ago, how does the total number of games compare?" Almost certainly not very equally. The challenge would be to find the "connecting players" that played against (say) Fischer or his opponents and (say) Kasparov or his. And Fischer was playing 40 years ago so he is pretty recent. I'd bet even those "connector games" would be difficult to find. Going back to Morphy, Tal, Capablanca, Petrosian, Reti, etc. would likely prove a greater challenge...

Just dumping them into BayesElo or EloStat won't do much unless there are lots of games connecting most of the players solidly.

Hi Professor Hyatt,

I was thinking something similar, but wasn't sure. I guess the formal way to analyze that would be a max-flow/min-cut type exercise (arcs are games between players, each arc gets capacity = # of games played between them). The max flow obtainable from a source (say Morphy) to a sink (say Kasparov) would indicate the strength of the weakest link. I agree that the weakest point would be the earliest in the chain, but it may not be so bad. I mean, we don't need to rely on the Andre Lilienthals or Victor Korchnois here. As long as we have enough paths (e.g. Morphy->Anderssen->Steinitz->Lasker->Capablanca->...->Kasparov), we may be OK. This seems like a good CS project for a curious undergrad.
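A minimal sketch of that exercise, assuming the game counts are already tallied per pair of players. It uses a plain Edmonds-Karp (BFS augmenting paths) and treats each pairing as an undirected edge, which is my own modelling choice. The function name, the single-letter players and the counts are all placeholders, not real data.

```python
from collections import defaultdict, deque

def max_flow(game_counts, source, sink):
    """Max flow between two players; capacities are games played between each pair."""
    if source == sink:
        raise ValueError("source and sink must be different players")
    # Undirected edge: the same capacity is available in both directions.
    residual = defaultdict(lambda: defaultdict(int))
    for (a, b), count in game_counts.items():
        residual[a][b] += count
        residual[b][a] += count

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, capacity in residual[u].items():
                if capacity > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Hypothetical counts: S is the old player, T the modern one.
game_counts = {("S", "A"): 5, ("A", "T"): 2, ("S", "B"): 3, ("B", "T"): 4, ("A", "B"): 1}
print(max_flow(game_counts, "S", "T"))  # 6
```

By the max-flow/min-cut theorem, the result is also the smallest number of games whose removal would disconnect the two players, so a small value flags a thin bridge between eras.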

I think Keene et al did something similar in Warriors of the Mind, but only for a small subset of carefully selected players. In any case, as you say you'd have to do it carefully, but I think it might be interesting.

Thanks,
Pete

Peteshnick · Post by **Peteshnick** » Mon Jun 14, 2010 4:41 am

Christopher Conkie wrote:
I am not sure if your idea would work unless the database were very large indeed. Most attempts at this that I have seen involve using a benchmark engine and comparing similarities of move.

Matej Guid and Ivan Bratko developed a method that used Crafty as such a benchmark.

http://www.chessbase.com/newsdetail.asp?newsid=3455

Another more recent attempt that you may find interesting is contained in the paper below.

http://web.zone.ee/chessanalysis/summary450.pdf

They found that the relative Elo of top players has increased by some 300 Elo over a 120-year period. I would imagine it is not that easy to compare Morphy with Kasparov by using ELOstat because of this anomaly.

Chris

Hi Chris,

I'm actually a pretty big fan of these analyses, although I don't quite trust their elo predictions. Lasker finished ahead of Capablanca in almost every tournament they played in together (until Lasker was very old), but the prediction of the latter analysis is that Capa is 2750 and Lasker 2450. That'd be like some IM beating Super-GMs consistently.

The findings in Warriors of the Mind are the ones I find most convincing.

Best,
Pete

Norm Pollock · Post by **Norm Pollock** » Mon Jun 14, 2010 7:37 am

http://db.chessmetrics.com/

Peteshnick · Post by **Peteshnick** » Mon Jun 14, 2010 8:16 am

Norm Pollock wrote: http://db.chessmetrics.com/

From the chessmetrics website:

Jeff Sonas wrote: Each month's rating list is totally independent of the prior month's list, other than the fact that they both share a very similar set of games, since they each reach back 48 months. In order to align them, I take everyone who is ranked between #3 and #20 on both lists, average them together, and adjust the second month's list up or down by a constant amount (I call this "calibrating") so that the average rating of the two groups (i.e., the players who were ranked between #3 and #20 on both lists) is the same.
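As I read that paragraph, the calibration step could be sketched roughly like this. It is my interpretation of the quoted description only, not Chessmetrics code; the function name, the dict-of-ratings input and the toy numbers are assumptions.

```python
def calibrate(previous, current, low=3, high=20):
    """Shift `current` by a constant so the shared #low..#high players average the same.

    `previous` and `current` map player name -> rating (assumed input format).
    """
    def ranked_between(ratings):
        ordered = sorted(ratings, key=ratings.get, reverse=True)
        return set(ordered[low - 1:high])

    # Players ranked inside the band on BOTH lists.
    common = ranked_between(previous) & ranked_between(current)
    if not common:
        raise ValueError("no player is ranked in the band on both lists")
    # Constant offset that makes the two group averages equal.
    offset = sum(previous[p] - current[p] for p in common) / len(common)
    return {player: rating + offset for player, rating in current.items()}

# Toy lists with a #2..#3 band so the example stays small.
previous = {"a": 2800, "b": 2700, "c": 2600, "d": 2500}
current = {"a": 2790, "b": 2680, "c": 2590, "d": 2400}
adjusted = calibrate(previous, current, low=2, high=3)
print(adjusted)  # every rating in `current` moved up by 15
```

Note that the shift pins the average of that band from one month to the next, which is the assumption being discussed here.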

Chessmetrics doesn't try to compare objectively the playing strength of old and new players, just how dominant they were. Jeff Sonas doesn't do my idea because it introduces too much rating inflation, but I'm not convinced rating inflation actually exists, if you normalize by population... See figure 6 here: http://www.chessbase.com/newsdetail.asp?newsid=6401

Dann Corbit · Post by **Dann Corbit** » Mon Jun 14, 2010 3:41 pm

Peteshnick wrote: Say you took the whole Chessbase database and ran ELOstat on it.

Elostat won't work, but Chessbase itself can do it (and then use John Nunn to normalize the result -- there is even an article somewhere that explains how to do it).
Also, BayesElo will do a good job of it.

(The program only takes 2000 or so different players, but say for argument's sake it could take more...) Assuming player strengths don't change significantly over time,

If your result hinges on this assumption, then I think we can dismiss it.

wouldn't this give you a set of ratings by which you could compare say, Morphy and Kasparov?

The reason I am using the whole Chessbase database as an example is because the larger the database the higher the chance you get a connected graph (nodes are players, edges are games). You need connectedness to compare arbitrary players I would think.

Look here:
http://db.chessmetrics.com/

They did your exact calculation sequence.

Peteshnick · Post by **Peteshnick** » Mon Jun 14, 2010 5:38 pm

Dann Corbit wrote: Look here:
http://db.chessmetrics.com/

They did your exact calculation sequence.

Dann,

See my above response to Norm. Chessmetrics does not do my calculation sequence - it doesn't try to compare everyone at once. It makes the assumption that the #3-#20 ranked players in the world have a constant Elo. (Unless I am totally misunderstanding that sentence, but I don't think I am.)

As for your point about ratings staying constant over time, restricting the set of games for each player to maybe a 15-year window at the peak of their career would probably fix this, yet still allow the calculation to be done. The window could probably be wider than that for players who stayed active for a long time, like Lasker.
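A rough sketch of that filter, with several assumptions of mine: each game is a (year, white, black, result) tuple, each player's peak year is supplied from outside (how to pick it is left open), and a game is kept only if it falls inside the window for both players. All names and numbers are placeholders.

```python
def within_peak_window(games, peak_year, half_width=7):
    """Keep games played within `half_width` years of BOTH players' peak years.

    half_width=7 gives a 15-year window (peak - 7 through peak + 7).
    """
    kept = []
    for year, white, black, result in games:
        if all(abs(year - peak_year[player]) <= half_width for player in (white, black)):
            kept.append((year, white, black, result))
    return kept

# Hypothetical players P and Q with made-up peak years.
games = [(1900, "P", "Q", "1-0"), (1912, "P", "Q", "1/2-1/2")]
peak_year = {"P": 1902, "Q": 1905}
kept = within_peak_window(games, peak_year)
print(kept)  # only the 1900 game survives
```

Requiring both players to be in their window throws away a lot of cross-era games, so it would be worth re-checking connectivity after filtering.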

Finally, about Chessbase being able to do it, I didn't know that. It's a little too expensive for my grad student self right now, but maybe someday... BayesElo wasn't quite working for me either with such a big database, but I'll give it another go.

Best,
Pete

Albert Silver · Post by **Albert Silver** » Mon Jun 14, 2010 5:43 pm

Peteshnick wrote:
Dann Corbit wrote: Look here:
http://db.chessmetrics.com/

They did your exact calculation sequence.
Dann,

See my above response to Norm. Chessmetrics does not do my calculation sequence - it doesn't try to compare everyone at once. It makes the assumption that the #3-#20 ranked players in the world have a constant Elo. (Unless I am totally misunderstanding that sentence, but I don't think I am.)

As for your point about ratings staying constant over time, restricting the set of games for each player to maybe a 15-year window at the peak of their career would probably fix this, yet still allow the calculation to be done. The window could probably be wider than that for players who stayed active for a long time, like Lasker.

Finally, about Chessbase being able to do it, I didn't know that. It's a little too expensive for my grad student self right now, but maybe someday... BayesElo wasn't quite working for me either with such a big database, but I'll give it another go.

Best,
Pete

There is also a Chessbase Light, that costs considerably less. I don't know if it has the functionality you seek, but it can't hurt to look.
