Scores for 'white has a decisive advantage' etc.?

trojanfoe

Scores for 'white has a decisive advantage' etc.?

Post by trojanfoe »

Would anyone care to hazard a guess at what approximate scores would be given to the standard Informant evaluations of:

White/Black has a Decisive/Clear/Slight Advantage?

Cheers,
Andy
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Scores for 'white has a decisive advantage' etc.?

Post by diep »

trojanfoe wrote:Would anyone care to hazard a guess at what approximate scores would be given to the standard Informant evaluations of:

White/Black has a Decisive/Clear/Slight Advantage?

Cheers,
Andy
Well, you touch a big problem here.

Something can look dead even to a player who has just started chess,
but "because it is easy to win" it gets a decisive advantage.

On the other hand, a totally suicidal, dead lost position gets "unclear", because either side could make a tactical error. That is something today's generation of software won't do any time soon.

It is totally impossible to have a "one to one" conversion of human conventions to computer chess, simply because human judgements are subjective. They are made in order to 'fool', so to speak, other players. A computer cannot get fooled. A computer is far more objective within its search space and far too strong to have that conversion.

To still give some estimates, which as I said are totally wrong objectively, but a programmer of course wants a conversion table:

In general one could say that in the middlegame an advantage of over +0.75 is on the brink of += / +-

A WON advantage is totally relative to the position. You really must think of something like +2.0 or above in a position where you cannot easily blunder.

Vincent
trojanfoe

Re: Scores for 'white has a decisive advantage' etc.?

Post by trojanfoe »

Thanks Vincent that is useful. I have created an EPD file of 100,000+ positions with a bespoke 'eval' opcode using the standard Informant symbols as operands (converted to ASCII: +- +/- +/= etc.) and have a special UCI instruction in my chess engine to evaluate the position without search. My test harness then maps the score back to the Informant-level evaluation and scores the engine for accuracy. I was hoping to use this during development of my evaluation function.
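A minimal sketch of how such an EPD record might be read by a test harness. The exact layout of the bespoke 'eval' opcode is not given above, so the quoted operand form (eval "+/=";) and the helper name parse_epd are assumptions made for illustration only. The parser also splits naively on semicolons, so it would break on an operand that itself contains one.

```python
# Sketch only: the 'eval' opcode syntax shown here is an assumption,
# not the format actually used in the poster's EPD file.

def parse_epd(line):
    """Split an EPD record into its four position fields and an opcode dict."""
    # EPD starts with four fields (pieces, side to move, castling, en passant);
    # everything after them is a run of "opcode operand;" entries.
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    ops = {}
    if len(fields) == 5:
        for op in fields[4].split(";"):
            op = op.strip()
            if not op:
                continue
            name, _, operand = op.partition(" ")
            ops[name] = operand.strip().strip('"')
    return position, ops


# Hypothetical record: the position after 1.e4, annotated as slightly better for White.
line = 'rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq - eval "+/="; id "demo 1";'
position, ops = parse_epd(line)
```

The harness would then compare ops["eval"] against whatever symbol it derives from the engine's static score.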

I am wondering if I can use the test harness with other UCI engines, which don't have the special evaluation command - perhaps using a limited-depth search. Is it fair to judge the 'evaluation' of a position on the score of a search when in fact the annotator really meant the evaluation relates to the current position, not the position after X best moves have been made? Or perhaps the evaluation relates to the possibilities in the position as demonstrated by the moves found during search?

Cheers,
Andy
Bill Rogers
Posts: 3562
Joined: Thu Mar 09, 2006 3:54 am
Location: San Jose, California

Re: Scores for 'white has a decisive advantage' etc.?

Post by Bill Rogers »

Because of the vagueness of your original post I could not answer immediately, but after thinking it over for a while I think I came up with a probable answer.
To me the side that obtains and keeps the tempo has the greatest chance of winning. Upon starting a new game white has the tempo because it has the first move but that can quickly change and does.

Any time you can put your opponent on the defensive you take away his tempo.

Bill
trojanfoe

Re: Scores for 'white has a decisive advantage' etc.?

Post by trojanfoe »

I don't think my original question was vague - it was very simple - however it's now changed as I work through the use of the Informant symbol evaluations into something more concrete that can be used to determine the accuracy of an engine's evaluation function.

I got the idea from EVALUATION FUNCTION TUNING VIA ORDINAL CORRELATION (http://www.cs.ualberta.ca/~mburo/ps/tcs-learn.pdf) in which Crafty was modified to be able to iteratively tune several evaluation-related weights to see how this improved its evaluation of a position. The experimenters used a limited-node search to get their score which they then mapped back into an Informant-level evaluation to compare with human master evaluations from Informant publications. Therefore I am inclined to think that it's OK to do this in place of a special evaluation command but I wanted to know others' thoughts on what a human evaluation really means.

Cheers,
Andy
Dave Gomboc

Re: Scores for 'white has a decisive advantage' etc.?

Post by Dave Gomboc »

If I recall correctly, thresholds for versions of Crafty 18 and 19 were about

equal                0 to 0.30 or 0.35
slight advantage     0.30 or 0.35 to 0.80 or 0.85
clear advantage      0.80 or 0.85 to 1.30 or 1.35
decisive advantage   1.30 or 1.35 and higher magnitude

Other chess programs (and even later versions of Crafty, depending on how Bob has adjusted the eval) would be different, though.
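As a rough illustration, the thresholds recalled above could be turned into a lookup like the one below. This is a sketch under assumptions: it takes the lower figure of each quoted pair (0.30, 0.80, 1.30), treats scores as pawns from White's point of view, and uses "-+", "-/+" and "=/+" as the ASCII symbols for Black's advantage. The function names are hypothetical, and any other engine would need its own cut-offs.

```python
# Cut-offs in pawns, taking the lower figure of each quoted pair (an arbitrary choice).
# Each entry: (lower bound on magnitude, symbol if White is better, symbol if Black is better).
THRESHOLDS = [
    (1.30, "+-", "-+"),    # decisive advantage
    (0.80, "+/-", "-/+"),  # clear advantage
    (0.30, "+/=", "=/+"),  # slight advantage
]


def informant_symbol(score):
    """Map a score in pawns (positive = good for White) to an ASCII Informant symbol."""
    for bound, white_sym, black_sym in THRESHOLDS:
        if abs(score) >= bound:
            return white_sym if score > 0 else black_sym
    return "="


def accuracy(samples):
    """samples: list of (annotated_symbol, engine_score). Fraction mapped to the same symbol."""
    if not samples:
        return 0.0
    hits = sum(1 for symbol, score in samples if informant_symbol(score) == symbol)
    return hits / len(samples)
```

A harness built this way is sensitive to where the cut-offs sit, which is one reason a fixed table tuned to one engine says little about another.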
Dave Gomboc

Re: Scores for 'white has a decisive advantage' etc.?

Post by Dave Gomboc »

That's not quite right, Andy: machine centipawn evaluations were not mapped back to Informant-level evaluations. It's more like this: if a human expert says that position A is "+/=" and position B is "=", then if the evaluation function scores position A better than position B, we are happy, but if the evaluation function scores position B better than position A, we are unhappy.

Now, take a huge pile of positions: there are many different ways to pick two positions A and B from them. Do all of them (in an efficient manner!) and this will tell you how close your evaluation function is to what the human expert thinks.

It's not actually necessary that the expert be a human: you could instead take the scores returned by a deep search by a computer chess engine.
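The pairwise comparison described above could be sketched as follows. This is a naive quadratic loop over all pairs, not the efficient method alluded to, and it is not taken from the paper. The rank table, the ASCII symbols for Black's advantage, the choice to skip tied pairs, and the name pairwise_agreement are all assumptions for illustration.

```python
from itertools import combinations

# Ordinal rank of each ASCII Informant symbol, from White's point of view (assumed encoding).
RANK = {"-+": -3, "-/+": -2, "=/+": -1, "=": 0, "+/=": 1, "+/-": 2, "+-": 3}


def pairwise_agreement(samples):
    """samples: list of (informant_symbol, engine_score) pairs.

    Returns (concordant - discordant) / (concordant + discordant), a value in [-1, 1].
    Pairs tied on either the human or the machine side are skipped.
    """
    concordant = discordant = 0
    for (sym_a, score_a), (sym_b, score_b) in combinations(samples, 2):
        human = RANK[sym_a] - RANK[sym_b]
        machine = score_a - score_b
        if human == 0 or machine == 0:
            continue  # a tie carries no ordering information in this simple version
        if (human > 0) == (machine > 0):
            concordant += 1  # engine orders the pair the same way as the expert
        else:
            discordant += 1  # engine orders the pair the opposite way
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0


# Made-up data: the last position is rated "-/+" by the expert but scored +0.20 by the engine.
samples = [("=", 0.10), ("+/=", 0.45), ("+-", 2.50), ("-/+", 0.20)]
agreement = pairwise_agreement(samples)
```

Note that no score-to-symbol thresholds are needed here: only the relative ordering of the engine's scores matters, which is the point of the correction above.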
trojanfoe

Re: Scores for 'white has a decisive advantage' etc.?

Post by trojanfoe »

Sorry Dave, I missed your reply originally. Thanks for the update - I am far away from evaluation at the moment though, having ripped my move generator to bits, and I have those annoying move generation bugs that only happen at 6-ply :shock:

I will update my test suite with your suggestion/correction when I am next looking at evaluation.

Cheers,
Andy