An Evaluation Mystery ?

Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: An Evaluation Mystery ?

Post by Evert »

Chan Rasjid wrote: Some programs do immediately choose to exchange a rook. Still my exit(-1) was not triggered.
I can't believe there is a bug in numpc[][].
I have no idea, but the root search should probe RxR and the ply=1 search should probe KxR. Print out the tree for the short search and you can verify whether that position is indeed reached. Then you can test why it's not picked up by the evaluation. My guess would be an exit condition that gets triggered first...
Could you clarify the technique "just drag the score close to 0 (to make it clear that the position is "drawish") and leave the rest to the search"
So I actually forgot that I do something more complicated in Jazz, but the idea is to recognise certain combinations as "drawish" and then at the end of the evaluation do "if (position_is_drawish) score /= 16;" or something similar. Doesn't have to be 16 of course, could be larger or smaller. The main idea is that the score is not "two pawns ahead" but "almost drawn". Not exactly drawn, because you can still win the bishop. The point is that the program should avoid exchanging into positions like this so it doesn't matter too much how you do it, as long as it sees the drop in the score after the exchange.

A slightly more accurate idea is to set the material component of the evaluation to 0 in these cases, but leave the piece square tables (for instance) alone so the program still plays sensibly.
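
A minimal sketch of the two variants described above. The function names, the `position_is_drawish` flag as a parameter, and the split into a material and a positional term are assumptions for illustration only; this is not how Jazz actually does it.

```c
#include <assert.h>

/* Hypothetical divisor: the post suggests 16 but says larger or
   smaller values would also work. */
#define DRAWISH_DIVISOR 16

/* Variant 1: shrink the whole score towards 0 when the material
   combination was recognised as drawish. */
int scale_drawish(int score, int position_is_drawish)
{
    assert(DRAWISH_DIVISOR > 0);
    if (position_is_drawish)
        score /= DRAWISH_DIVISOR;   /* integer division truncates towards 0 */
    return score;
}

/* Variant 2: drop only the material term and keep the positional
   term (piece-square tables and the like), so the program still
   prefers sensible squares. */
int combine_eval(int material, int positional, int position_is_drawish)
{
    if (position_is_drawish)
        material = 0;
    return material + positional;
}
```

With a nominal two-pawn edge of 200 centipawns, variant 1 returns 12 for a drawish position, so a search that compares this leaf with a non-drawish leaf at 200 sees the drop after the exchange.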
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: An Evaluation Mystery ?

Post by Sven »

Chan Rasjid wrote:
AlvaroBegue wrote:It sounds like a bug.
Tell me I am that bad in programming!
numpc[0][0] is all white pieces, counting the king:

    if (numpc[0][0] == 2 && numpc[1][0] == 2
            &&
            ((numpc[0][Bishop] == 1 && numpc[1][Rook] == 1)
            || (numpc[1][Bishop] == 1 && numpc[0][Rook] == 1))

            ) {
        /* loss if wrong bishop */
        return evalBR(side);
    }
My code has passed many, many asserts and plays very normally. If the global array numpc[2][8] were wrong, I would have expected a big issue.

Since others suspect it is a bug, it somehow must be.

Best Regards,
Rasjid.
The condition looks correct to me. One question that comes to mind is _where_ the conditional code above is placed and _if_ that code is used at all. Add a printf("hi\n") immediately above the first line of the code you quoted, and if it is printed then change it to print all the numpc[][] values used in the condition. Maybe there is some other code in your program that prevents the code above from being called in a KRKB position.

Another question is whether your evalBR() function makes any assumption about which side has the rook and which the bishop.

Sven
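
The check suggested above could look roughly like this. It is only a sketch: the thread says index 0 counts all pieces including the king, but the values of `Bishop` and `Rook` below are assumed, and `is_krkb` and `format_numpc` are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed indices: only "index 0 = all pieces, king included" comes
   from the thread; the Bishop and Rook values are hypothetical. */
enum { All = 0, Bishop = 3, Rook = 4 };

/* Same test as the quoted condition: each side has a king plus one
   piece, one side a bishop and the other a rook. */
int is_krkb(int numpc[2][8])
{
    return numpc[0][All] == 2 && numpc[1][All] == 2
        && ((numpc[0][Bishop] == 1 && numpc[1][Rook] == 1)
         || (numpc[1][Bishop] == 1 && numpc[0][Rook] == 1));
}

/* Formats every counter the condition reads, so one debug line shows
   why the test passed or failed. Returns snprintf's result. */
int format_numpc(char *buf, size_t len, int numpc[2][8])
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "all=%d/%d bishop=%d/%d rook=%d/%d krkb=%d",
                    numpc[0][All], numpc[1][All],
                    numpc[0][Bishop], numpc[1][Bishop],
                    numpc[0][Rook], numpc[1][Rook],
                    is_krkb(numpc));
}
```

Printing the formatted buffer immediately above the `if` shows whether the code is reached at all and, if it is, which counter has an unexpected value.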
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: An Evaluation Mystery ?

Post by Sven »

Chan Rasjid wrote:
AlvaroBegue wrote:It sounds like a bug.
Tell me I am that bad in programming!
...
My code has passed many, many asserts!
Another hint: a successful development strategy includes assuming that even "good programmers" sometimes have bugs in their code. The ability to find and fix bugs is part of a programmer's skills, of course. Usually I am quite bad at that, so it sometimes takes me a couple of days or a week to find some hidden bug.

As to "asserts": they are a good tool to _reduce_ the number of bugs that go unnoticed. But they are not insurance against bugs, so don't rely on asserts alone. At the very least you also need a good testing strategy that covers as many different cases as possible, so that ideally all your code branches are visited during your tests. With low test coverage, buggy code may never be visited during your tests and only shows up in game play.

Sven
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: An Evaluation Mystery ?

Post by Chan Rasjid »

Hello,
Sven Schüle wrote:
The condition looks correct to me. One question that comes to mind is _where_ the conditional code above is placed and _if_ that code is used at all. Add a printf("hi\n") immediately above the first line of the code you quoted, and if it is printed then change it to print all the numpc[][] values used in the condition. Maybe there is some other code in your program that prevents the code above from being called in a KRKB position.

Another question is whether your evalBR() function makes any assumption about which side has the rook and which the bishop.

Sven
It is a little silly right now and I can't make out what's what.

In my eval(), a material flag picks out these cases:
1) board has at most 1 pawn - calls evalAtMost1Pawn().
2) board has only N/B - calls evalOnlyNB().
3) KPK.
4) definite material draws.

If not one of the above, then evaluation proceeds to the main body.

evalBR() is called within evalAtMost1Pawn() with the code shown.

int evalAtMost1P(const int side, const int w_mat_table) {
//...

    if (numpc[0][0] == 2 && numpc[1][0] == 2
            &&
            ((numpc[0][Bishop] == 1 && numpc[1][Rook] == 1)
            || (numpc[1][Bishop] == 1 && numpc[0][Rook] == 1))

            ) {
        exit(-1);/* this never triggered */
        return evalBR(side);
    }
    
    //exit(-1);/* this would be triggered */
    
    /* TEMP as the other codes not finished; return -INFI  means use the main evaluation  */
    return -INFI;

}
But if I put the condition block, etc. at the very beginning of eval(), then the condition is satisfied and evalBR() is used:
...
### BR
### BR
### BR
### BR
### BR
### BR
### BR
### BR
### BR
50 g8e6 score( -255) depth(12) pvL( 7) ply(19) nps( 2712827) pc(4) cttime(19) ctpoll(0)
sec(0.14, used 0.26) nodes(708048, q 0.1%, h 21.1%, f 78.8%) FH(12.1%) evasion(5.9%) invalid(4.9%)
Root-red(0.00%, rsc -nan%) Root-research(13.64%) BranchF(0.80%)
Hash-hit(full 81.5%, qs 4.6%, p( 100.0%, cover 0.0%), ev (97.3%) matOWrite( 0.35) draw(184)
ext(1.67%) see(qs 0.14%, delay 20.57%) zero-wnd(15.09%, rsc 8.38%)
null(0.0% hit -nan% verify -nan% ok -nan% ) lmr(0.00%, rsc -nan%) ply1_rsc(0) fut(0.00%) killer(1.92%)
EV call(2.16%) lazy(0.00%) cut(0.00%) fl(0.00%)
The ### BR lines mark calls to evalBR(). And is it not a smart evaluator? It held Komodo to a fifty-move draw. :P

To be embarrassingly honest, I don't really know the proper way to trace bugs - only the crude methods of assert, printf and some simple common-sense checks when needed.

I'll have to see what's what.

Best Regards,
Rasjid.
hgm
Posts: 27703
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: An Evaluation Mystery ?

Post by hgm »

I still find (conditional) printfs the most powerful debug method. Print all numpc[] values just before the posted code that tests them (or at the exit() call that is triggered). Some of the earlier code must change them, most likely through an = that should have been an ==.
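
To illustrate the kind of slip meant here, a sketch with hypothetical helper names. The first function contains the `=` for `==` typo on purpose; the second is a conditional printf that only reports when the counters differ from a snapshot taken earlier. The `numpc[2][8]` layout is taken from the thread, everything else is assumed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Intended as a pure test, but '=' overwrites the counter with 2 and
   the condition is always true. The doubled parentheses are only here
   to silence the compiler warning that normally flags this typo. */
int side_has_two_pieces_buggy(int numpc[2][8])
{
    if ((numpc[0][0] = 2))      /* BUG: should be numpc[0][0] == 2 */
        return 1;
    return 0;
}

/* Conditional printf: stays silent unless some counter changed since
   the snapshot was taken. Returns 1 if a change was reported. */
int report_if_changed(int before[2][8], int after[2][8], const char *where)
{
    assert(where != NULL);
    if (memcmp(before, after, sizeof(int[2][8])) == 0)
        return 0;
    fprintf(stderr, "numpc changed before %s: [0][0] %d -> %d\n",
            where, before[0][0], after[0][0]);
    return 1;
}
```

Taking the snapshot at the start of eval() and calling the report just before the KRKB test narrows down which earlier code touched the counters.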
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: An Evaluation Mystery ?

Post by AlvaroBegue »

tpetzke wrote:If you encounter such a position in search (not at a leaf node) continue searching. This avoids easy errors where the bishop moves into a pin or so.

If you want to improve that then you have to implement a recognizer that is able to classify a position as DRAW or UNKNOWN. In case of UNKNOWN you continue searching; in case of DRAW you can stop and return 0.
I understand all of that. The question was what the recognizer recognizes. I can't think of any simple conditions under which I know the result is a draw, and I was wondering if you could share some of yours, or at least give me an idea of what the rules look like.

EDIT: Never mind. I found some tutorials on YouTube that explain specific circumstances where the result is a draw.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: An Evaluation Mystery ?

Post by Sven »

Chan Rasjid wrote:evalBR() is called within evalAtMost1Pawn() with the code shown.
...
But if I put the condition block, etc. at the very beginning of eval(), then the condition is satisfied and evalBR() is used:
That means that the condition to call evalAtMost1Pawn() is false.

Sven
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: An Evaluation Mystery ?

Post by Chan Rasjid »

Hello,
Evert wrote: ...
So I actually forgot that I do something more complicated in Jazz, but the idea is to recognise certain combinations as "drawish" and then at the end of the evaluation do "if (position_is_drawish) score /= 16;" or something similar. Doesn't have to be 16 of course, could be larger or smaller. The main idea is that the score is not "two pawns ahead" but "almost drawn". Not exactly drawn, because you can still win the bishop. The point is that the program should avoid exchanging into positions like this so it doesn't matter too much how you do it, as long as it sees the drop in the score after the exchange.

A slightly more accurate idea is to set the material component of the evaluation to 0 in these cases, but leave the piece square tables (for instance) alone so the program still plays sensibly.
I think I have some idea now about this "dragging the score to zero if a position is drawish". If I am right, it does not help when the program starts play from a KBxKR position. It is a general evaluation technique.

The idea is closely related to how the piece-type weights vary with the board position. In KBxKR, the rook-minus-bishop difference might not really be worth 2 pawns if we know the position is very likely drawn, so we do a "score /= 16", etc., to "drag" it close to 0. Say a fairly deep search from a fairly full board reaches two leaves, one KBxKR and another that is also 2 pawns up but clearly not drawish; the evaluation then knows which leaf position is preferable.

If this explanation is correct, it still is a little "secret" that I missed. I did not come across discussions on this topic.

Best Regards,
Rasjid.
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: An Evaluation Mystery ?

Post by Evert »

Sven Schüle wrote: As to "asserts": they are a good tool to _reduce_ the number of bugs that go unnoticed. But they are not insurance against bugs, so don't rely on asserts alone. At the very least you also need a good testing strategy that covers as many different cases as possible, so that ideally all your code branches are visited during your tests. With low test coverage, buggy code may never be visited during your tests and only shows up in game play.
Absolutely true.

I have a fair number of assertions in my program and they are useful, but I invariably find that they're more often than not in the wrong place.

Two examples.

I recently fixed a bug that caused invalid ponder-moves to be played in some circumstances. This could lead to leaving the king in check, which leads to problems when the king is captured (because the code is littered with assumptions that the king is never captured). This was eventually caught by an assertion somewhere far away from where the bug actually was. I was able to find the problem by adding more and more assertions that the state of the board is consistent. This is all good and useful, but what I should have asserted in the first place is that the ponder move is legal before making it.

Just now I fixed a bug that was caused by setting a random en-passant target square while parsing a malformed FEN. It would have been caught much earlier if I'd done an assert() on the en-passant square read from the FEN.

Bottom line: add assertions, loads of them. But bear in mind that an assertion is most useful if it is where the bug actually occurs. Since you don't know where that is ahead of time, chances are they're not exactly where they need to be.
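
As an illustration of asserting where the bad value enters, here is a sketch for the en-passant case. The function names and the 0..63 square numbering (a1 = 0) are assumptions, not Jazz's actual code. In FEN the en-passant field is either "-" or a square on rank 3 or rank 6.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if `ep` is a plausible FEN en-passant field: "-" or a
   file letter a..h followed by rank 3 or 6, and nothing else. */
int ep_field_is_valid(const char *ep)
{
    if (ep[0] == '-')
        return ep[1] == '\0';
    return ep[0] >= 'a' && ep[0] <= 'h'
        && (ep[1] == '3' || ep[1] == '6')
        && ep[2] == '\0';
}

/* Converts the field to a square index 0..63 (a1 = 0), or -1 for "-".
   The assertion sits at the point where a malformed value would enter
   the board state, not somewhere downstream. */
int parse_ep_square(const char *ep)
{
    assert(ep != NULL && ep_field_is_valid(ep));
    if (ep[0] == '-')
        return -1;
    return (ep[1] - '1') * 8 + (ep[0] - 'a');
}
```

Note that assert() is compiled out when NDEBUG is defined, so for input that comes from outside the program it may be preferable to call the validity check directly and reject the FEN.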
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: An Evaluation Mystery ?

Post by Evert »

Chan Rasjid wrote: The idea is closely related to how the piece-type weights vary with the board position. In KBxKR, the rook-minus-bishop difference might not really be worth 2 pawns if we know the position is very likely drawn, so we do a "score /= 16", etc., to "drag" it close to 0. Say a fairly deep search from a fairly full board reaches two leaves, one KBxKR and another that is also 2 pawns up but clearly not drawish; the evaluation then knows which leaf position is preferable.
Correct.
If this explanation is correct, it still is a little "secret" that I missed. I did not come across discussions on this topic.
I think I got the idea from Ed's programming notes ("How Rebel plays chess"), but I'm fairly sure it's also discussed on the wiki. It's also described in the comments in Crafty, for instance. I'm sure there are other examples.

I think it's nice to have, but I'm not sure how many Elo points it's worth. I suspect not that many, though it must help some.