Kibitz score reporting in server play

How should position evaluation scores be reported when kibitzing in server play?

Poll ended at Fri Mar 30, 2007 5:38 pm

In centipawns, from the viewpoint of White: 3 votes (13%)
In centipawns, from the viewpoint of the program: 7 votes (29%)
In decimal pawns, from the viewpoint of White: 7 votes (29%)
In decimal pawns, from the viewpoint of the program: 6 votes (25%)
Other: 1 vote (4%)

Total votes: 24

hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Kibitz score reporting in server play

Post by hgm »

CentiPawns? Joker uses Pawn = 256! :lol: :lol: :lol:

Actually, Steven's last post touches on something that is much more fundamental than just display purposes: minimax should really be applied to the winning probability, and not to the "woodcounting score".

In the range of small scores, the two are proportional, but beyond a lead of about 2 Pawns the winning probability starts to saturate. So it would be more natural to use something like 2*arctan(score/2) (score given in Pawn units), which saturates towards +/- PI.
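
To make the shape of that curve concrete, here is a minimal Python sketch of the 2*arctan(score/2) mapping. The function name saturated_score is made up for illustration; only the formula itself comes from the post.

```python
import math

def saturated_score(pawns: float) -> float:
    """Map a material score in Pawn units to 2*atan(score/2).

    The result stays strictly inside (-pi, pi) and is roughly equal
    to the input for small scores.
    """
    return 2.0 * math.atan(pawns / 2.0)

# Print a few sample leads to see where the curve flattens out.
for lead in (0.25, 1.0, 2.0, 5.0, 20.0):
    print(f"{lead:5.2f} pawns -> {saturated_score(lead):.2f}")
```

A lead of 2 Pawns maps to exactly PI/2, halfway to the asymptote, which matches the remark that saturation sets in around that point.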

Now if scores are only compared, such a non-linear but monotonic transformation would not have any effect. But it will have an effect in combination with 'ageing' of future gains: in a discounted-cash-flow manner, reduce the value of a gain (= score minus current eval) if it lies in the future. uMax and Joker both use this mechanism to select between fast and slow paths to the same position (e.g. the checkmate) anyway. They just ameliorate scores below current eval a little bit.

In combination with the arctan correction to the score, this would prevent problems of the kind where the computer would 'defeat itself', by sacrificing a Rook now to postpone a checkmate in 25 to one in 30. Such a 'theoretically best' move of course only buys you a 100% certain loss, as even a patzer would take the Rook and have an easy win after that. If the opponent is so poor that he cannot win with the extra Rook, he certainly would not see the mate in 25...

Even with an ageing of 5 centiPawn per move, the mate-in-25 score (which is only the asymptotically high 3.14) would have deflated by 125 cP to 1.89, so that the engine would not give more than a Pawn to delay the checkmate, and at least retain a chance if the opponent does not see it. You could adapt the ageing rate to the expected search depth of the opponent, as a very coarse method of opponent modelling.
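
The arithmetic above can be checked with a short Python sketch. The name aged_score, the linear decay toward the current eval, and the symmetric treatment of gains and losses are assumptions made for illustration; this is not the code used in uMax or Joker.

```python
import math

def aged_score(score, current_eval, moves_ahead, rate=0.05):
    """Shrink the gain (score minus current eval) by `rate` pawns per move.

    The gain never changes sign: once fully decayed it is clamped at zero.
    """
    gain = score - current_eval
    decayed = max(abs(gain) - rate * moves_ahead, 0.0)
    return current_eval + math.copysign(decayed, gain)

# Saturated mate score (about 3.14) seen 25 moves ahead, aged at 5 cP per move.
mate_in_25 = aged_score(math.pi, current_eval=0.0, moves_ahead=25)
print(f"{mate_in_25:.2f}")  # prints 1.89
```

Raising the rate parameter models a weaker opponent who is less likely to find a distant win, which is the coarse opponent modelling suggested above.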
Tord Romstad
Posts: 1808
Joined: Wed Mar 08, 2006 9:19 pm
Location: Oslo, Norway

Re: Kibitz score reporting in server play

Post by Tord Romstad »

hgm wrote:CentiPawns? Joker uses Pawn = 256! :lol: :lol: :lol:
It does? I thought I was the only one to use pawn = 256. :)

I convert the scores to centipawns before printing them, though. Otherwise many GUIs would resign on behalf of my engine every time it lost a couple of pawns.
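
As a rough illustration of that conversion, a Python sketch follows. The constant and function names are hypothetical, and rounding to the nearest centipawn is an assumption; the engines discussed here may truncate instead.

```python
INTERNAL_PAWN = 256  # internal units per Pawn, as mentioned in the thread

def to_centipawns(internal_score: int) -> int:
    """Convert an internal score (Pawn = 256) to centipawns for output.

    Works on the magnitude and restores the sign afterwards, so that
    positive and negative scores round the same way.
    """
    sign = -1 if internal_score < 0 else 1
    scaled = abs(internal_score) * 100 + INTERNAL_PAWN // 2
    return sign * (scaled // INTERNAL_PAWN)

print(to_centipawns(-512))  # two Pawns down, reported as -200
```

Without this step, a GUI that expects centipawns would read a two-Pawn deficit (-512 internally) as more than five Pawns, which is how premature resignations can happen.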

Tord
mhull
Posts: 13447
Joined: Wed Mar 08, 2006 9:02 pm
Location: Dallas, Texas
Full name: Matthew Hull

Re: Kibitz score reporting in server play

Post by mhull »

sje wrote:How should position evaluation scores be reported when kibitzing in server play?
I like the idea of a score that reflects the position -- negative scores good for Black, positive for White.

For display purposes, I like the assumption that a pawn is roughly equal to 1, understanding that internal representations are working in centipawns or whatever.

It's also nice to see some indicator of search depth.

For display, I wish more programs put the ply depth and score at the front of the output string so you don't have to hunt for it at the end of a variable-length PV.
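
One way such a line could be laid out, sketched in Python. The function name and the "depth=... score=... pv=..." layout are invented for this example and are not a format defined by any server or GUI.

```python
from typing import Sequence

def format_kibitz(depth: int, score_pawns: float, pv: Sequence[str]) -> str:
    """Build a kibitz line with the fixed-width fields first and the PV last."""
    fields = [f"depth={depth}", f"score={score_pawns:+.2f}"]
    if pv:
        # The variable-length part goes at the end so depth and score
        # are always found at the same place.
        fields.append("pv=" + " ".join(pv))
    return " ".join(fields)

print(format_kibitz(12, -0.3, ["e4", "e5", "Nf3"]))
# prints: depth=12 score=-0.30 pv=e4 e5 Nf3
```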
Matthew Hull
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Kibitz score reporting in server play

Post by bob »

mhull wrote:
sje wrote:How should position evaluation scores be reported when kibitzing in server play?
I like the idea of a score that reflects the position -- negative scores good for Black, positive for White.

For display purposes, I like the assumption that a pawn is roughly equal to 1, understanding that internal representations are working in centipawns or whatever.

It's also nice to see some indicator of search depth.

For display, I wish more programs put the ply depth and score at the front of the output string so you don't have to hunt for it at the end of a variable-length PV.
I can't even understand why there's a poll on this. I started doing this _many_ years ago in Crafty, because of the real-time analysis I often have it give during important chess games between GM players. I got tired of the question "what does a -.3 score mean?" Now everyone knows that for Crafty, -.3 means black is a bit better.

The idea of displaying a score from the program's perspective is more confusing and after a couple of years, most people decided that they liked the way Crafty displays scores. If you don't do it like this, how do you "annotate" a game with meaningful scores???
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Kibitz score reporting in server play

Post by sje »

bob wrote:The idea of displaying a score from the program's perspective is more confusing and after a couple of years, most people decided that they liked the way Crafty displays scores. If you don't do it like this, how do you "annotate" a game with meaningful scores???
Because server play with one or two programs is fundamentally different from annotation. I always think in terms of how well the side to move is doing, and that means the program's POV at the root.

Also, for annotations, I think most humans would be better off with standardized symbols and localized text than numeric output. This has been the practice for centuries; maybe there's better, but I doubt that just reporting centipawns is what most humans want.
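
For readers who want to see what mapping numbers onto the traditional symbols might look like, here is a small Python sketch. The function name and the thresholds (0.3, 1.0 and 3.0 Pawns) are purely illustrative assumptions; there is no agreed standard for where one glyph ends and the next begins.

```python
def score_to_glyph(white_pov_pawns: float) -> str:
    """Translate a White-relative score in Pawns into an evaluation glyph.

    Thresholds are hypothetical and chosen only for demonstration.
    """
    magnitude = abs(white_pov_pawns)
    if magnitude < 0.3:
        return "="
    if magnitude < 1.0:
        white, black = "+=", "=+"    # slight advantage
    elif magnitude < 3.0:
        white, black = "+/-", "-/+"  # clear advantage
    else:
        white, black = "+-", "-+"    # decisive advantage
    return white if white_pov_pawns > 0 else black

print(score_to_glyph(1.5))   # prints +/-
print(score_to_glyph(-0.5))  # prints =+
```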
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Kibitz score reporting in server play

Post by bob »

sje wrote:
bob wrote:The idea of displaying a score from the program's perspective is more confusing and after a couple of years, most people decided that they liked the way Crafty displays scores. If you don't do it like this, how do you "annotate" a game with meaningful scores???
Because server play with one or two programs is fundamentally different from annotation. I always think in terms of how well the side to move is doing, and that means the program's POV at the root.

Also, for annotations, I think most humans would be better off with standardized symbols and localized text than numeric output. This has been the practice for centuries; maybe there's better, but I doubt that just reporting centipawns is what most humans want.
When I'm watching, I find it _far_ easier to interpret +1.5 as white is winning rather than having to figure out which side that program is playing. BTW I don't report centipawns. I have always reported a decimal value that is easier to interpret.

Crafty does multiple tasks:

(1) annotates games, where +=good for white is the only way to give computer scores. Anything else would be confusing. And I think a numeric (decimal) value is more informative than +/-. Is it _trivially_ won? Won, but hard? Etc....

(2) provide real-time analysis of live games. Same issue. It is confusing if it kibitzes scores from its own perspective as each time someone moves, the score sign will change. Been there. Done that. Got the T-shirt. I answered the same question dozens of times during each game Crafty provided commentary on using channel 211. Hardly ever get a question about what does -3.0 mean now.

(3) play games on a server and kibitz/whisper scores when appropriate. No confusion when +1 means white is ahead, -1 means black is ahead.

(4) play in tournaments. I find +=good for white natural. Took a while, as early on I would peek into a log file while a game was being played, see a -2.85 and think crafty was losing until I realized it was playing black and quite happy. Now it is simply second nature and is far less confusing to me. And others that operate the program.

(5) if you don't like +=good for white, then how could you like glyphs like +/- which are most definitely from white's perspective? I've always liked absolutes rather than relatives, as there is much less ambiguity and confusion.

(6) "program's POV" leads to confusion when two programs are providing analysis on the game they are playing. When you see a + score, you have to look to see which side that program is playing to interpret the score. With +=good for white, any score is interpretable by itself, without even knowing the names/colors of the two participants. Simplicity...
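
The two conventions argued over in this thread differ only by a sign flip, as this Python sketch shows. The function names are hypothetical, and the sketch assumes the engine reports its score from its own side's point of view.

```python
def white_pov(program_score: float, program_plays_white: bool) -> float:
    """Convert a program-relative score to the +=good-for-White convention."""
    return program_score if program_plays_white else -program_score

def program_pov(white_score: float, program_plays_white: bool) -> float:
    """Inverse mapping: White-relative score back to the program's own view."""
    return white_score if program_plays_white else -white_score

# A program playing Black and happy with its position (+2.85 for itself)
# shows up as -2.85 in a White-relative log.
print(white_pov(2.85, program_plays_white=False))  # prints -2.85
```

Because the mapping is its own inverse, an engine can keep either convention internally and pick the other at output time.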
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Kibitz score reporting in server play

Post by sje »

When my program is playing, I want to know how well it thinks it's doing, and I want that number to look the same regardless of which color it's playing. Same thing for someone else's program.

Annotation of games is a different matter. There is a long history of printed annotation and little (if any) use of numbers. I'm not going to suggest tossing hundreds of years of print heritage without a good reason to do so.

If humans in general wanted numbers mostly (or only) in annotations, we would have seen evidence of such in print. There's been plenty of time in the recent era for this to take place if it were a preferred choice.

High quality traditional annotations with symbols and text remain the standard. And high quality annotation means exclusively human annotation until programs become able to understand and explain chess play at a conceptual level.