Evaluation of material imbalance (a Rybka secret?)

Alessandro Scotti

Re: Evaluation of material imbalance (a Rybka secret?)

Post by Alessandro Scotti »

This is the latest table I computed:

Code:

Stage:  0, P(succ)=1.60, P(actual)=0.84, R=0.95, V=129.12 
Stage:  2, P(succ)=1.39, P(actual)=0.72, R=0.97, V=110.28 
Stage:  4, P(succ)=1.41, P(actual)=0.73, R=0.97, V=112.23 
Stage:  6, P(succ)=1.34, P(actual)=0.68, R=0.98, V=105.05 
Stage:  8, P(succ)=1.32, P(actual)=0.67, R=0.99, V=103.18 
Stage: 10, P(succ)=1.32, P(actual)=0.67, R=0.98, V=103.04 
Stage: 12, P(succ)=1.28, P(actual)=0.64, R=1.00, V=99.04 
Stage: 14, P(succ)=1.29, P(actual)=0.65, R=0.99, V=99.42 
Stage: 16, P(succ)=1.27, P(actual)=0.63, R=1.00, V=97.30 
Stage: 18, P(succ)=1.23, P(actual)=0.61, R=1.01, V=93.79 
Stage: 20, P(succ)=1.16, P(actual)=0.57, R=1.02, V=87.39 
Stage: 22, P(succ)=1.09, P(actual)=0.53, R=1.03, V=81.80 
Stage: 24, P(succ)=1.05, P(actual)=0.51, R=1.02, V=78.37 
Allard Siemelink
Posts: 297
Joined: Fri Jun 30, 2006 9:30 pm
Location: Netherlands

Re: Evaluation of material imbalance (a Rybka secret?)

Post by Allard Siemelink »

Here's one of mine:

Code:

What               #   eval   e-e   elo  score   draw
-------------------------------------------------------
TOTAL          25477     77    36   113  65.68  36.13
8 pawns         1424     43    -4    40  55.69  27.39
7 pawns         8204     59     2    61  58.62  32.41
6 pawns         8128     75    45   121  66.70  31.90
5 pawns         5212     92    88   179  73.74  35.59
4 pawns         3123    100    85   186  74.43  45.44
3 pawns         1898    103    37   140  69.07  59.11
2 pawns          759    103   -18    85  62.06  74.04
1 pawns   
0 pawns   
7 pieces        2977     52   -23    29  54.13  30.00
6 pieces        4164     56   -21    35  55.01  31.00
5 pieces        4060     60    12    71  60.11  29.58
4 pieces        3790     76    45   121  66.74  31.16
3 pieces        4010     87    79   166  72.23  35.19
2 pieces        5289     96   100   196  75.59  40.69
1 pieces        4751     96    72   168  72.46  50.92 
0 pieces         191     84   111   195  75.39  38.74
majors qrr     11608     58    -2    56  58.02  30.18
majors qr       2944     92    49   141  69.23  34.17
majors q        1448    109    71   180  73.79  41.78
majors rr       3717     71    65   137  68.72  37.05
majors r        6531     95    84   179  73.75  45.00
majors 0        1810     98    94   192  75.11  44.25
4 minors        3404     54   -20    34  54.82  31.08 
3 minors        4703     58   -15    43  56.16  31.04
2 minors        6181     73    37   110  65.35  30.38
1 minors        7879     89    73   162  71.70  38.44
0 minors        5236     92    79   171  72.76  47.27
<no pieces>      191     84   111   195  75.39  38.74
2 bishops       7345     56   -19    37  55.30  31.64
1 bishops      11219     78    48   126  67.36  34.78
0 bishops       8219     92    78   170  72.74  42.04
2 knights       4681     58    -9    49  57.04  30.25
1 knights      11505     72    25    97  63.64  31.85
0 knights      10410     89    71   160  71.58  43.68
The Elo values are calculated by analysing 25000 grandmaster games in which a pawn-up situation occurred.

I guess my opening values are so much lower than the ones HGM indicated because they include the opponent's compensation.

The e-e column (elo minus eval) compares the calculated Elo with Bright's average evaluation (which also includes the compensation). It looks like Bright is under-evaluating extra pawns, and more so towards the endgame.
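
For anyone who wants to reproduce the columns: the elo figures in the table are consistent with the usual logistic Elo formula applied to the score percentage, and e-e with elo minus eval. The formula is not stated in the post, so treat it as an assumption; the function names below are made up for illustration.

```python
import math

def score_to_elo(score_pct):
    """Convert an average score in percent to an Elo difference.

    Assumes the standard logistic Elo model; the post does not say
    which formula was actually used.
    """
    s = score_pct / 100.0
    return 400.0 * math.log10(s / (1.0 - s))

def e_minus_e(score_pct, avg_eval):
    """Elo implied by the results minus the engine's average evaluation."""
    return score_to_elo(score_pct) - avg_eval

# Rows copied from the table above: (label, eval, score %)
rows = [("TOTAL", 77, 65.68), ("8 pawns", 43, 55.69), ("2 pieces", 96, 75.59)]
for label, avg_eval, score_pct in rows:
    elo = score_to_elo(score_pct)
    print(f"{label:10s} elo={elo:6.1f}  e-e={elo - avg_eval:6.1f}")
```

Some rows differ from the table by a point, presumably because the eval column shown there is already rounded.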
Tony

Re: Evaluation of material imbalance (a Rybka secret?)

Post by Tony »

Allard Siemelink wrote:
The Elo values are calculated by analysing 25000 grandmaster games in which a pawn-up situation occurred.

I guess my opening values are so much lower than the ones HGM indicated because they include the opponent's compensation.

The e-e column compares the calculated Elo with Bright's average evaluation (which also includes the compensation). It looks like Bright is under-evaluating extra pawns, and more so towards the endgame.
Yes, that's a serious problem. A grandmaster will only be a pawn behind in the opening if there is compensation. So by selecting grandmaster games, you lower the pawn value. Extend this to a piece, and it will mess things up even more.

My engine started playing h4 etc. for a king attack. Bad, but grandmasters only play it when it's good, so the score for this kind of move was way too optimistic. (IIRC, Dutch Open 2006.)

This all could actually be a reason to include games from lower level players.

Tony
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Evaluation of material imbalance (a Rybka secret?)

Post by hgm »

This is exactly why I believe this method is fundamentally flawed, and can never produce meaningful piece values. You really would have to do the statistics on random positions, not on positions selected by skilled players.
Allard Siemelink

Re: Evaluation of material imbalance (a Rybka secret?)

Post by Allard Siemelink »

hgm wrote:This is exactly why I believe this method is fundamentally flawed, and can never produce meaningful piece values. You really would have to do the statistics on random positions, not on positions selected by skilled players.
The trouble with random positions is that I do not know their outcome, and using them would throw away all the hours of human computing that went into those games.
Surely, there must be some way to get something out of that?

Perhaps the pawn-up situation is a bad example.
Actually, this method can calculate the value of any evaluation component.
Surely compensation is less of an issue for the piece-square tables, rook on open file (which turns out to be a familiar 20), penalties for doubled and weak pawns, etc.?
Allard Siemelink

Re: Evaluation of material imbalance (a Rybka secret?)

Post by Allard Siemelink »

I am not sure I am interested in the piece values per se.
Rather than calculating 'the value of a knight', I calculate the values of material imbalances involving a knight. E.g. R-NP, B-N, N-PPP.

I would think it is rather a good idea to try to include games of lower-rated players.
Perhaps the ratings should be as low as the level of an engine that plays with a 1-ply search?
Tony wrote: Yes, that's a serious problem. A grandmaster will only be a pawn behind in the opening if there is compensation. So by selecting grandmaster games, you lower the pawn value. Extend this to a piece, and it will mess things up even more.

My engine started playing h4 etc. for a king attack. Bad, but grandmasters only play it when it's good, so the score for this kind of move was way too optimistic. (IIRC, Dutch Open 2006.)

This all could actually be a reason to include games from lower level players.

Tony
Tony

Re: Evaluation of material imbalance (a Rybka secret?)

Post by Tony »

hgm wrote:This is exactly why I believe this method is fundamentally flawed, and can never produce meaningful piece values. You really would have to do the statistics on random positions, not on positions selected by skilled players.
Well, you're a bit faster than me, it only took me 1 year to come to that conclusion.

There is a way however.

If you make the model more complex (add more features) you should be able to get a lot of values. Now, the trick: rerun the data gathering, but correct for the score already found.

I.e., if my pawn ahead has a 65% winning chance (from the first run), and you have features worth a 55% winning chance, I should adjust my pawn-ahead winning chance to something like 69%.

Somehow....

And then rerun and rerun, etc., until there are no more changes.

Tony
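
One way to land near the 69% figure is to convert both winning chances to Elo differences, add them, and convert back. That the two advantages simply add on the Elo scale is an assumption, not something stated in the post, and `corrected_score` is a hypothetical helper name.

```python
import math

def score_to_elo(score):
    """Elo difference for an expected score in (0, 1), logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_to_score(elo):
    """Expected score for an Elo difference, logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def corrected_score(observed, compensation):
    """Score the feature would earn without the opponent's compensation.

    Assumes the observed result already has the compensation subtracted,
    and that advantages add on the Elo scale.
    """
    return elo_to_score(score_to_elo(observed) + score_to_elo(compensation))

# 65% observed for the pawn, opponent holds features worth 55%.
print(round(corrected_score(0.65, 0.55), 3))  # prints 0.694
```

In Elo terms this is roughly 108 + 35 = 142 points, which maps back to a little over 69%.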
Allard Siemelink

Re: Evaluation of material imbalance (a Rybka secret?)

Post by Allard Siemelink »

Tony wrote:
hgm wrote:This is exactly why I believe this method is fundamentally flawed, and can never produce meaningful piece values. You really would have to do the statistics on random positions, not on positions selected by skilled players.
Well, you're a bit faster than me, it only took me 1 year to come to that conclusion.

There is a way however.

If you make the model more complex (add more features) you should be able to get a lot of values. Now, the trick: rerun the data gathering, but correct for the score already found.

I.e., if my pawn ahead has a 65% winning chance (from the first run), and you have features worth a 55% winning chance, I should adjust my pawn-ahead winning chance to something like 69%.

Somehow....

And then rerun and rerun, etc., until there are no more changes.

Tony
Yeah, that's what I do too. My complex model is the evaluation function itself.
(The unit value is 1 Elo point, not the usual centipawn. But with my presumed pawn value of 100 Elo points, there really is no practical difference.)
The e-e column shows by how much I should adjust the eval.
Usually it takes only one or two runs to make the values match.
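
The rerun-until-stable idea can be sketched as a small loop. This is not Bright's actual tuner: it is a plain coordinate-descent least-squares fit swapped in for illustration, the feature names and bucket data are made up, and it assumes the evaluation is a simple weighted sum of feature counts in Elo units.

```python
import math

def score_to_elo(score):
    """Elo difference for an average score in (0, 1), logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_to_score(elo):
    """Expected score for an Elo difference, logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def rerun_until_stable(buckets, weights, tol=0.01, max_runs=50):
    """Adjust weights by the e-e residual until they stop changing.

    buckets: list of (feature_counts, observed_score) pairs, where
             feature_counts maps a feature name to how often it occurs.
    weights: starting value per feature, in Elo points.
    Returns the adjusted weights and the number of runs used.
    """
    weights = dict(weights)
    for run in range(1, max_runs + 1):
        biggest_change = 0.0
        for name in weights:
            num = 0.0
            den = 0.0
            for counts, score in buckets:
                x = counts.get(name, 0)
                if x == 0:
                    continue
                # Current evaluation of this bucket, in Elo points.
                eval_elo = sum(weights.get(f, 0.0) * n for f, n in counts.items())
                e_minus_e = score_to_elo(score) - eval_elo
                num += e_minus_e * x
                den += x * x
            if den > 0:
                step = num / den  # least-squares step for this one weight
                weights[name] += step
                biggest_change = max(biggest_change, abs(step))
        if biggest_change < tol:
            return weights, run
    return weights, max_runs

# Toy data generated from known values (pawn 100, rook on open file 20).
buckets = [
    ({"pawn_up": 1}, elo_to_score(100)),
    ({"pawn_up": 1, "rook_open_file": 1}, elo_to_score(120)),
]
fitted, runs = rerun_until_stable(buckets, {"pawn_up": 80.0, "rook_open_file": 0.0})
print(fitted, runs)
```

With features that overlap, as in the toy data, a single pass does not fully separate them, which is why the loop repeats until the largest adjustment falls below `tol`.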