There are some EPD test sets that are supposed to accomplish this.

Dr. Wael Deeb wrote:
I am quite surprised that an experienced programmer like Uri asks such a question... thinking for less than a second, I knew that there is no such program...

bob wrote:
Here's all it can do: run a typical "annotate" type operation on the PGN, and then count the number of times the player agrees with the program, or the player agrees with the program's second choice, etc. Then use that to derive some sort of rating based on previous curve-fitting data obtained by feeding the PGN from a _bunch_ of players of each rating range through the program to see how they match.

Dann Corbit wrote:
But you also said:

Uri Blass wrote:
I mean the last question, and it is not correct that no program can do it.

Dann Corbit wrote:
Chessbase has a function to read a PGN file and calculate Elo for the players based on the games in the file.
BayesElo does the same thing.
I guess what you are really asking is "Is there a program that can look at the *moves* of a small sample of chess games and estimate Elo based on the moves made?"
If that is the question then I guess that no program can do it.
No program can do it well, but I remember that Fritz3 could do it.
"it calculated only 2100 based on pgn of kasparov when it calculated something like 3000 based on pgn of Fritz3 when Fritz3 on p90 of that time was clearly not better than 2450."
Which tells me that Fritz3 was not able to do it.
Or it produced utter foolishness that claimed to be of value.
Take your pick.
It fails for many reasons, the most obvious being that the hardware is critical: use slower hardware and the ratings will be over-estimated; use faster hardware and they will be under-estimated.
In short, it's a SWAG (scientific wild-ass guess) at best and a random number at worst, neither of which is particularly useful or interesting.
For instance:
BS2830.EPD
BT2450.EPD
BT2630.EPD
GS2930.EPD
all purport to estimate Elo from the score achieved against the test positions.
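The real BT/BS test suites use their own published, time-based scoring formulas, so the mapping below is only an illustrative assumption: a minimal sketch of turning an EPD solve count into a rating via a hypothetical linear calibration between two assumed anchor ratings.

```python
# Hypothetical sketch: mapping an EPD test score onto an Elo estimate.
# The linear interpolation and the `floor`/`ceiling` anchors are made-up
# assumptions for illustration, not the actual calibration used by
# BS2830/BT2450/BT2630/GS2930.

def elo_from_epd_score(solved: int, total: int,
                       floor: float = 1600.0, ceiling: float = 2630.0) -> float:
    """Interpolate a rating from the fraction of positions solved.

    A program solving nothing scores `floor`; one solving everything
    scores `ceiling`. Real tests derive their scale from timed solutions.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    fraction = solved / total
    return floor + fraction * (ceiling - floor)

print(elo_from_epd_score(15, 30))  # -> 2115.0, midpoint of the assumed interval
```

The key design point is that the score-to-Elo curve is pure calibration data; the interpolation itself carries no chess knowledge.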
Theoretically, the same thing could be done using a chess program.
Here is how I would attack the problem:
Do a search for the top 6 moves of every position in the games, ranked best to worst, and see how the player did (e.g. did he find the best move 60% of the time, the second-best move 25% of the time, etc.), then match that distribution against a database of known players used to calibrate the system.
I think it would also be revealing how often the chosen move was not even in the top 6.
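The tally described above can be sketched without an engine by assuming the ranked candidate lists have already been computed (in practice they would come from a multi-PV search). All names and the toy data below are illustrative assumptions.

```python
# Sketch of the agreement tally: for each position, record at which rank
# the player's actual move appears in the engine's top-6 list, with a
# "miss" bucket for moves outside the top 6.

from collections import Counter

TOP_N = 6

def agreement_profile(engine_rankings, played_moves):
    """Return the fraction of moves matched at each rank 1..TOP_N,
    plus a 'miss' bucket for moves outside the engine's top choices."""
    counts = Counter()
    for ranked, move in zip(engine_rankings, played_moves):
        if move in ranked[:TOP_N]:
            counts[ranked.index(move) + 1] += 1
        else:
            counts["miss"] += 1
    total = len(played_moves)
    return {bucket: n / total for bucket, n in counts.items()}

# Toy data: three positions, each with the engine's ranked candidates
# and the move the player actually chose.
rankings = [["e4", "d4", "c4", "Nf3", "g3", "b3"],
            ["Nf3", "e4", "d4", "c4", "g3", "b3"],
            ["d4", "e4", "Nf3", "c4", "g3", "b3"]]
played = ["e4", "e4", "h4"]
print(agreement_profile(rankings, played))
# -> {1: 0.333..., 2: 0.333..., 'miss': 0.333...}
```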
I guess that this method would give decent answers given 30 games of input, provided you have also analyzed at least 30 games from each of at least 30 players of very well known strength (at least 900 games) to create your baseline.
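The calibration step might then look like the following: given agreement profiles for players of known rating (the baseline described above), estimate an unknown player's Elo by finding the closest baseline profile. The nearest-neighbor matching and the baseline numbers are my own illustrative assumptions, not an established method.

```python
# Sketch of matching an agreement profile against a rating-labeled
# baseline. Baseline profiles here are invented for illustration.

import math

BUCKETS = [1, 2, 3, 4, 5, 6, "miss"]

def distance(p, q):
    """Euclidean distance between two agreement profiles."""
    return math.sqrt(sum((p.get(b, 0.0) - q.get(b, 0.0)) ** 2 for b in BUCKETS))

def estimate_elo(profile, baseline):
    """baseline: {elo: profile} built from players of known strength."""
    return min(baseline, key=lambda elo: distance(profile, baseline[elo]))

baseline = {
    2600: {1: 0.60, 2: 0.25, 3: 0.08, "miss": 0.02},
    2200: {1: 0.45, 2: 0.25, 3: 0.12, "miss": 0.08},
    1800: {1: 0.30, 2: 0.20, 3: 0.15, "miss": 0.20},
}
print(estimate_elo({1: 0.58, 2: 0.26, 3: 0.09, "miss": 0.03}, baseline))
# -> 2600
```

A more serious version would interpolate between neighboring profiles rather than snap to the nearest one, but the data requirement is the same: lots of games from players whose ratings are already trusted.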
This method could probably also point out vulnerabilities (e.g. you leave too many pieces hanging, you are susceptible to poisoned pawns, you fall prey to forks, etc.) if we add smarts to look for them. I think it would have a market.