Trading penalty with imbalances

Rebel · Post by **Rebel** » Mon Dec 23, 2013 9:04 pm

cetormenter wrote:
hgm wrote: What helped a lot in Joker was making pawn evaluation dependent on the number of (non-Pawn) pieces
I have done something similar in Nirvana only in a slightly different way. I count up the difference in non pawn pieces and then subtract five times this value from each pawn the opponent has. That is,
Code:
value += wcount > bcount ? 5 * (wcount - bcount) * bpawns : 5 * (wcount - bcount) * wpawns;

The "non pawn pieces", are these the real amount of pieces or the sum following the 1 | 3 | 3 | 5 | 9 rule? I think the latter (with a lower multiplier) is interesting too.

cetormenter · Post by **cetormenter** » Mon Dec 23, 2013 11:55 pm

This is very interesting. I've had on my todo list for a while to do an imbalance bonus/penalty based on the difference in piece count. But I never thought of making it depend on the number of pawns of the side with the least pieces.

But 20 Elo seems a bit much. Are you sure it is really worth 20 Elo by itself? Material imbalances are quite rare.

As for piece values, I use the following:
Code:
Pawn = 80 (opening) 100 (endgame)
Knight = Bishop = 330
Rook = 545
Queen = 1000

But yes, it's important to be sure the first order (piece values) is well tuned before adding a second order (imbalance). Otherwise, you end up adding code you think is useful but is just bloat and is only useful because of its orthogonal projection on the first order.

I ran two tests while I was out today. It seems that this kind of test is very sensitive to the time control. The first test ran at 2"+0.02", and the version without the pawn imbalance code was clearly better after just a few hundred games. However, now at 6"+0.05" the results are a little less clear.

Code:
Rank Name                        Elo    +    - games score oppo. draws
   1 Nirvanachess 1.4 Pawn Imb     3   10   10  2713   51%    -3   46%
   2 Nirvanachess 1.4 No Imb      -3   10   10  2713   49%     3   46%

I am not sure exactly where I got the +20 Elo figure. Perhaps this evaluation term scales even better beyond 6"? That is highly doubtful, as it runs counter to pretty much every other test, although I normally run my tests at 10"+0.05" or 15"+0.05". I do recall trying to remove this piece of code a couple of months ago and noticing a significant regression, and I am not sure what has changed since then. Perhaps I just got a really lucky run? Or perhaps I am simply not remembering correctly, but I do remember being surprised at the result.

Rebel wrote: The "non pawn pieces", are these the real amount of pieces or the sum following the 1 | 3 | 3 | 5 | 9 rule? I think the latter (with a lower multiplier) is interesting too.

I am simply using the count of pieces: bishop = knight = rook = queen = 1. I did also experiment with not only making the opponent's pawns worth less but also increasing the value of friendly pawns. This failed fairly quickly, but that may have been due to poorly tuned initial values. Using the 1:3:3:5:9 rule may prove useful; I will see if it performs any better in my tests.
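For reference, a minimal sketch of the term as described, with every non-pawn piece counting as 1. The function name and the convention that the score is from White's point of view are assumptions made for illustration, not Nirvana's actual code.

```c
#include <assert.h>

/* Hypothetical helper. wcount/bcount are the non-pawn piece counts
 * (B = N = R = Q = 1), wpawns/bpawns the pawn counts.
 * Assumption: the result is added to a score seen from White's side. */
int pawn_imbalance_term(int wcount, int bcount, int wpawns, int bpawns)
{
    assert(wcount >= 0 && bcount >= 0 && wpawns >= 0 && bpawns >= 0);

    int diff = wcount - bcount;      /* > 0 means White has more pieces */

    if (diff > 0)
        return 5 * diff * bpawns;    /* each Black pawn is worth 5*diff less */

    return 5 * diff * wpawns;        /* zero or negative: White's pawns devalued */
}
```

With 3 pieces against 2 and six black pawns this gives +30; with the piece counts reversed and five white pawns it gives -25.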

Edit:
Also another thing to note is that Nirvana has no other imbalance terms in its evaluations besides redundant rooks and knights.

hgm · Post by **hgm** » Tue Dec 24, 2013 12:07 am

Lyudmil Tsvetkov wrote:So, basically, additive piece values in your view, means basic piece values.

No, it means what it says: that you add them. The actual value does not matter. When you take P=9 and Q=1, that would still be additive when you don't do anything beyond adding them.

Imbalance bonuses are corrections to the additive terms, and cannot be assigned to any particular piece. Sometimes they are only dependent on a small subset of the pieces, but always on more than one. (E.g. the Bishop-pair bonus only depends on the number of Bishops of each side.)

Distinguishing end-game and opening values, and interpolating them with game phase, already gives non-linear terms in the material evaluation. With linear interpolation and a game phase that is linear in the piece counts (minors + 3*Rooks + 6*Queens, for example) you would get second-order terms (products of two piece counts) in the material evaluation.
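A small sketch of how the product terms arise, assuming the phase is summed over both sides and capped at its start-position value of 32. The names game_phase, tapered_pawn_score and MAX_PHASE are hypothetical, and the pawn values 80 (opening) and 100 (endgame) are simply borrowed from the values quoted earlier in the thread.

```c
#include <assert.h>

enum { MAX_PHASE = 32 };  /* 8 minors + 3*4 Rooks + 6*2 Queens */

/* Phase as a weighted sum of piece counts (both sides together). */
int game_phase(int minors, int rooks, int queens)
{
    int phase = minors + 3 * rooks + 6 * queens;
    return phase > MAX_PHASE ? MAX_PHASE : phase;  /* cap after promotions */
}

/* Linear interpolation between opening (80) and endgame (100) pawn value.
 * pawn_diff = white pawns - black pawns. */
int tapered_pawn_score(int pawn_diff, int phase)
{
    assert(phase >= 0 && phase <= MAX_PHASE);
    int per_pawn = (80 * phase + 100 * (MAX_PHASE - phase)) / MAX_PHASE;
    return pawn_diff * per_pawn;
}
```

Since per_pawn depends on phase, and phase is itself a sum of piece counts, multiplying out pawn_diff * per_pawn yields terms such as pawns*Rooks and pawns*Queens: products of two piece counts.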

cetormenter wrote:Edit:
Also another thing to note is that Nirvana has no other imbalance terms in its evaluations besides redundant rooks and knights.

Redundancy of Knights seems a shaky concept, considering how badly 3 Queens lose to 7 Knights (and that they in general do better than a mixture of Knights and Bishops).

lucasart · Post by **lucasart** » Tue Dec 24, 2013 5:22 am

hgm wrote:
cetormenter wrote:Edit:
Also another thing to note is that Nirvana has no other imbalance terms in its evaluations besides redundant rooks and knights.
Redundancy of Knights seems a shaky concept, considering how badly 3 Queens lose to 7 Knights (and that they in general do better than a mixture of Knights and Bishops).

Indeed.

In fact, the whole redundancy of major pieces theory is unproven. At least by MY standard of proof. I added a double rook penalty in DiscoCheck, based on that idea, but I've never managed to measure a clear gain from it. Assuming it is a gain (unproven), it could well be that the only reason is that it reduces the average value of a rook, and the value of a rook is too high in my program.

Really, being able to determine scientifically whether all this material imbalance stuff is good or not is a very difficult problem. Let's take the simplest example, the bishop pair. To demonstrate that the bishop pair has a value by itself, one only has to cancel out the first order to isolate the second order:

=> scenario 1: N=x B=y R=z Q=t. Optimize (using, say, CLOP) (x,y,z,t)
=> scenario 2: N=B=x R=y Q=z BPair=t. Optimize (x,y,z,t).

The test that would measure the value of the bishop pair per se, would be to match the scenario 1 and scenario 2 versions with their respective optimal (x,y,z,t).

All that is fine in theory. But in practice, the difficulty is that it's extremely hard to co-tune 4 parameters with high precision. At some point you reach a large indifference zone and convergence becomes so slow...
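To make the two parametrisations concrete, here is a rough sketch of the material sums being tuned. The struct and function names are hypothetical, and scenario 2 assumes the pair bonus applies whenever a side has two or more bishops (ignoring same-coloured bishops).

```c
#include <assert.h>

/* Piece counts for one side (pawns left out, as in the scenarios). */
typedef struct { int knights, bishops, rooks, queens; } Counts;

/* Scenario 1: N=x, B=y, R=z, Q=t. Purely additive, four free values. */
int material_s1(Counts c, int x, int y, int z, int t)
{
    assert(c.knights >= 0 && c.bishops >= 0 && c.rooks >= 0 && c.queens >= 0);
    return x * c.knights + y * c.bishops + z * c.rooks + t * c.queens;
}

/* Scenario 2: N=B=x, R=y, Q=z, BPair=t. Same number of parameters,
 * but one of them is a second-order term. */
int material_s2(Counts c, int x, int y, int z, int t)
{
    assert(c.knights >= 0 && c.bishops >= 0 && c.rooks >= 0 && c.queens >= 0);
    int score = x * (c.knights + c.bishops) + y * c.rooks + z * c.queens;
    if (c.bishops >= 2)
        score += t;   /* assumption: bonus for any two or more bishops */
    return score;
}
```

Both functions have four tunable parameters, which is what makes the match between the two optimised versions a like-for-like comparison.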

hgm · Post by **hgm** » Tue Dec 24, 2013 8:17 am

Actually it is very easy to measure the value of the B-pair. But of course tuning is not the way. Just compare the result from self-play of positions with BN-BB deleted (B vs N, no pair) with those that delete BN-NN (B-pair vs BN). The first one measures the basic B-N difference, the latter the B+pairBonus-N difference.

Then you will know how the possession of a B-pair affects your engine's winning chances.
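One way to reduce the two matches to a single number, assuming each match is summarised as wins/draws/losses from the point of view of the side that keeps the extra Bishop. The function names are hypothetical and this is only a sketch of the bookkeeping, not of the test setup itself.

```c
#include <assert.h>

/* Score fraction of a match: a win counts 1, a draw 0.5. */
double score_fraction(int wins, int draws, int losses)
{
    int games = wins + draws + losses;
    assert(games > 0);
    return (wins + 0.5 * draws) / games;
}

/* Difference between the "B-pair vs BN" match and the "B vs N, no pair"
 * match. A positive result means the pair adds something beyond the
 * plain B-N difference. */
double pair_effect(double score_b_vs_n, double score_pair_vs_bn)
{
    return score_pair_vs_bn - score_b_vs_n;
}
```

The result is a difference in score fraction, so it still has to be judged against the error bars of the two matches.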

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Tue Dec 24, 2013 11:21 am

hgm wrote: Redundancy of Knights seems a shaky concept, considering how badly 3 Queens lose to 7 Knights (and that they in general do better than a mixture of Knights and Bishops).

Redundancy is very important. But it depends on the larger pool of pieces. For example, if you have N,N,R,R, here redundancy might be painful; but if you have N,N,R,B, the redundancy of the knights here is already not so badly felt, as they operate in a pool with 3 piece types (so redundancy gets diluted), while in the first case they would operate just in a pool of 2 piece types.

Regarding what also Lucas said, I think that it is not difficult to tune imbalance values, but somehow the right approach is missing. There is enormous benefit to be gained from different imbalances, but even modern top engines do not understand sufficiently imbalances, and it is a known fact that you play according to the knowledge you have, i.e. engines miss a significant portion of promising positions, only because they lack the knowledge that those positions are promising.

It is still not a proven fact that 3Qs lose to 7Ns, and a mix of Ns and Bs definitely performs much, much better than a group of knights only. However, the particular imbalance with 7 minor pieces has nothing to do with real chess, as you have here another variable based on the very large quantity of pieces, while in real chess such very large quantities simply do not exist.

hgm · Post by **hgm** » Tue Dec 24, 2013 12:35 pm

Whether it occurs in orthodox Chess or not is not really relevant for the validity of the concept. Like everywhere in engineering, you test things by stressing them to their limits, as that is where construction faults most easily manifest themselves. That in real life they might never be exposed to such stress doesn't mean the faults were not there.

In all material-imbalance measurements I ever did, I never found any evidence whatsoever for any form of redundancy. Where Larry Kaufman invoked redundancy (e.g. to explain why Q-BNN is better than QR-RBNN), elephantiasis already explains the effect in a cleaner and less ad-hoc way. The hypothesis of redundancy of R next to Q is also easily falsified by noting that adding a pair of Rooks to an A-BNN imbalance causes a similar disadvantage for the Archbishop side as adding it to Q-BNN, despite the fact that there is obviously no redundancy between A (= B+N compound) and R, and a lot of redundancy with BNN.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Tue Dec 24, 2013 1:12 pm

hgm wrote: Whether it occurs in orthodox Chess or not is not really relevant for the validity of the concept. Like everywhere in engineering, you test things by stressing them to their limits, as that is where construction faults most easily manifest themselves. That in real life they might never be exposed to such stress doesn't mean the faults were not there.

In all material-imbalance measurements I ever did, I never found any evidence whatsoever for any form of redundancy. Where Larry Kaufman invoked redundancy (e.g. to explain why Q-BNN is better than QR-RBNN), elephantiasis already explains the effect in a cleaner and less ad-hoc way. The hypothesis of redundancy of R next to Q is also easily falsified by noting that adding a pair of Rooks to an A-BNN imbalance causes a similar disadvantage for the Archbishop side as adding it to Q-BNN, despite the fact that there is obviously no redundancy between A (= B+N compound) and R, and a lot of redundancy with BNN.

Harm often says stupid things, but I must acknowledge that he also gives good suggestions. One is using test positions to verify hypotheses. Below are my 4 positions that could be used for verifying whether redundancy matters, and whether an additional piece quite frequently benefits the side with more pieces. None of the positions contains direct attacks, so I think they are testable with both black and white.

[d]4k1n1/pppppppp/8/8/8/8/PPPPPPPP/4KB2 w - - 0 1
B vs N in a pure setting. I guess the bishop will perform better. One piece type each side, no repetitions/redundancies

[d]1n2k1n1/pppppppp/8/8/8/8/PPPPPPPP/1N2KB2 w - - 0 1
B+N vs N+N. Here already we have 2 piece types for one side with no redundancies, and one piece type for the other with one repetition. I guess here the B+N side will perform better than the B in a pure B vs N setting. If so, that would confirm that redundancies matter, and also piece types. Additive piece values could not help here, no matter how hard you try to tune them. You need something separate to explain the phenomenon.

[d]r3k3/pppppppp/8/8/8/8/P1PPP1PP/1N2KB2 w - - 0 1
B+N vs R+ 2 pawns in a pure setting. The imbalance should favour the rook side. 2 piece types with no repetitions for one side, one piece type with no repetitions for the other.

[d]r3k2r/pppppppp/8/8/8/8/P1PPP1PP/RN2KB2 w - - 0 1
B+N+R vs R+R. You add one piece to each side without changing anything else. 3 piece types with no repetitions for one side, and just one piece type with one repetition for the other. Here already I guess the additional piece will favour sensibly the side with the more pieces (and simultaneously the side with more piece types and less repetitions, so that quite often having a piece more is equivalent to having more piece types). I expect that the side with the 3 pieces could even win. You can not explain the phenomenon with basic additive piece values, you need something more, something separate, and that would be either the introduction of piece types and repetitions, or, in the case of unequal number of pieces, a trading down penalty for the side with the more pieces.

Those imbalances are very frequent, and if they do not occur so often in engine games, one of the main reasons is the engines' lack of understanding of them. You play what you know.

Maybe someone could test the above positions by playing some 100 games each. In the end, you will know what a redundancy is worth, if piece types matter, and if a trading down penalty for the side with more pieces is justifiable.
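Before running such a test it is easy to double-check that each position really contains the intended imbalance. The following is a minimal sketch that counts pawns and non-pawn, non-king pieces from the piece-placement field of a FEN string; the type and function names are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef struct { int pawns, pieces; } SideCount;

/* Reads only the first FEN field (up to the first space).
 * Upper-case letters are White, lower-case are Black; kings are skipped. */
void count_material(const char *fen, SideCount *white, SideCount *black)
{
    assert(fen != NULL && white != NULL && black != NULL);
    white->pawns = white->pieces = 0;
    black->pawns = black->pieces = 0;

    for (const char *p = fen; *p != '\0' && *p != ' '; ++p) {
        unsigned char ch = (unsigned char)*p;
        if (!isalpha(ch))
            continue;                       /* digits and '/' separators */
        SideCount *side = isupper(ch) ? white : black;
        int lower = tolower(ch);
        if (lower == 'p')
            side->pawns++;
        else if (lower != 'k')
            side->pieces++;
    }
}
```

For the fourth position this reports 3 pieces and 6 pawns for White against 2 pieces and 8 pawns for Black, which matches the B+N+R vs R+R description.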

Btw., someone suggested here that a bigger number of pieces could have an impact on the values of pawns. I do not think this is really the case. This will be so only when one of the sides has 2 pieces more, as in the Q vs 3 pieces imbalance, but not when one of the sides has only a single piece more, which is the predominant majority of cases.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Tue Dec 24, 2013 1:41 pm

Btw., in the case of Q vs 3 pieces imbalance, a sensible thing would be to assign additional 10cps bonus for the side with the 3 pieces for any piece after the 3rd. Thus, with 4 pieces, you will have, 10cps bonus, with 5 20cps, etc. This would be much more precise.

Same with an imbalance involving just a single piece more: instead of a trading down penalty, a wise thing could be to assign incremental bonus for any piece after the first. So, when you have a piece vs couple of pawns, you assign no bonus; when you have 2 pieces vs 1, you assign 5cps for the second piece, when you have 3 vs 2, you assign 10cps for the 3rd piece (and of course, both second pieces of both sides get equal 5cps bonus), when you have 4 vs 3, you assign 15cps bonus for the 4th piece (all others equal), when you have 5 vs 4, you assign 20cps for the 5th piece, with 6 vs 5, you assign 25 cps bonus for the 6th piece, and with 7vs 6 30cps bonus for the 7th piece. I think this way of handling would be much more accurate. It would reflect the fact that it is better to have 7 pieces vs 6 than 6vs 5, in turn 6vs 5 would be better than 5vs 4, etc. A larger quantity of pieces usually works better in a pool.
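The single-piece scheme above amounts to giving the n-th piece a bonus of 5*(n-1) cps. A minimal sketch, with hypothetical function names and counting only non-pawn pieces:

```c
#include <assert.h>

/* Total incremental bonus (in cps) for a side with `pieces` non-pawn
 * pieces: nothing for the 1st, 5 for the 2nd, 10 for the 3rd, and so on. */
int piece_count_bonus(int pieces)
{
    assert(pieces >= 0);
    int total = 0;
    for (int n = 2; n <= pieces; ++n)
        total += 5 * (n - 1);
    return total;
}

/* Net bonus from White's point of view. */
int piece_count_bonus_diff(int white_pieces, int black_pieces)
{
    return piece_count_bonus(white_pieces) - piece_count_bonus(black_pieces);
}
```

With equal counts the bonuses cancel, and with one piece more the net is exactly the bonus of the extra piece: 5 for 2 vs 1, 10 for 3 vs 2, up to 30 for 7 vs 6.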

Btw., I think testing imbalances as a whole would be extremely time sensitive, as they are generally more complex to play, while engines lack to an extent the necessary specific skills to play them; but this would be especially true of the Q vs 3 pieces imbalance, so that my guess is that testing at longer time control would provide better and more realistic results.

hgm · Post by **hgm** » Tue Dec 24, 2013 1:46 pm

Lyudmil Tsvetkov wrote:Btw., in the case of Q vs 3 pieces imbalance, a sensible thing would be to assign additional 10cps bonus for the side with the 3 pieces for any piece after the 3rd. Thus, with 4 pieces, you will have, 10cps bonus, with 5 20cps, etc. This would be much more precise.

Again, that is basically the same as increasing all piece values (except Pawns) by 10 cP. You don't need bonuses for that, just tune the piece values differently. Unless you mean to only do that against a Queen. But then it is equivalent to elephantiasis: subtract 10 cP from Q for each piece the opponent has. (Except that 10 cP seems too little.)
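The equivalence can be checked numerically. In this sketch (function names are hypothetical, and "piece" is assumed to mean non-pawn piece) the bonus for every piece after the 3rd and the per-piece deduction from the Queen differ only by a constant 30 cP once the opponent has 3 or more pieces, and a constant can be absorbed into the base Queen value.

```c
#include <assert.h>

/* Proposed bonus: 10 cP for every piece after the 3rd facing a Queen. */
int extra_piece_bonus(int pieces)
{
    assert(pieces >= 0);
    return pieces > 3 ? 10 * (pieces - 3) : 0;
}

/* Elephantiasis form: 10 cP off the Queen per opposing piece. */
int queen_penalty(int opponent_pieces)
{
    assert(opponent_pieces >= 0);
    return 10 * opponent_pieces;
}
```

For 3, 4, 5, ... opposing pieces the two give 0/30, 10/40, 20/50, ..., so the relative scores of those imbalances are identical.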
