score from White's perspective, or engine's perspective?

sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: score from White's perspective, or engine's perspective?

Post by sje »

bob wrote:Just like all chess diagrams show white at the bottom and black at the top for clarity.
Well, this is tradition, but tradition can be wrong. I suggest that it would be more realistic to always have a diagram oriented based on the side to move (and have this clearly labeled). Almost certainly this would be better for human players engaged in study; in real events a player sees things from Black's side half the time.
Dann Corbit
Posts: 12778
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: score from White's perspective, or engine's perspective?

Post by Dann Corbit »

sje wrote:In the Early Days long before the Internet, the literature -- and the few programs written -- consistently scored moves, positions, and searches from the White point of view. Authors who were active from Back Then or whose early influence was from literature of those days may unsurprisingly adopt White POV scoring.

But starting from the late 1970's, the literature started to change and algorithms were presented with a color neutral perspective. I believe that the idea of presenting scores from the viewpoint of the active color was easier to understand and slightly simpler to program. So more recent efforts (or at least efforts by more recent programmers) tend to the Active Color POV.

All of my code has been Active Color POV, except for opening book formation. That's because all positions are normalized for White To Move to enable extra hits for color reversed opening positions.
Of course, if you are analyzing an EPD record, then the standard (with which you are *cough* familiar) says that you *must* score from the point of view of the side to move.

When playing a game, we have something of a hole. I do not know of any standard to define how to describe the score accurately.

Hence, we see all sorts of nonsense when two engines play against each other and we probably won't really have a clear understanding of what they are really thinking until the score gets very polar.
hgm
Posts: 28356
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: score from White's perspective, or engine's perspective?

Post by hgm »

Marc Lacrosse wrote:
hgm wrote:What you are referring to is not in the definition of PGN but in the definition of EPD. PGN contains no opcodes, and neither WinBoard nor any other GUI I know of puts the score in the PGN in centipawns.
Both WinBoard I and II protocols, as written by Tim Mann, do not include anything on the precise topic of how an evaluation should be written (absolute: white better is always positive; or relative: white better is positive when white is to move, negative when black is to move).

So it is not uninteresting to have a look at what other standards say :

- EPD definition is part of the PGN definition document.
This is the only place in this document where the way to describe a position evaluation is mentioned.
It clearly recommends relative signs.

- UCI is much more than a standard of communication between engines and Polyglot. It is also simply the standard through which the majority of engines communicate with the majority of GUIs. UCI protocol explicitly recommends relative scores.

So if you decide that _your_ modified WinBoard requires and outputs evaluations the absolute way (white better is positive), how will you deal with EPDs passed through copy/paste from engines and/or GUIs that do follow the EPD spec and/or the UCI one? And those who copy EPDs from your WinBoard to anything else will have to edit the sign of the scores before pasting them into any PGN/EPD- or UCI-compliant application ...

My point is that the PGN/EPD specification, the UCI protocol and the WB protocol need to have the same interpretation of signed numerical evaluations, unless you accept that everybody will need a database of what is issued by whom any time evaluation data are exchanged.

If you wish that we go for the more logical "absolute" evaluations (which I also prefer), then there must be a consensus for changing all three protocols (WB, UCI, PGN/EPD) simultaneously ...

... and everybody has to accept that the majority of older applications will produce strange output every time a signed evaluation is concerned.

Marc
You are confusing matters completely.

How WinBoard protocol transmits scores is totally independent of, and has no effect on, how WinBoard saves or reads PGN or EPD, or how UCI engines communicate with UCI GUIs. If UCI reports the score from the POV of white, Polyglot will simply flip it for the black engine before passing it on to WinBoard. There is no need for the standards to be the same at all; they can be anything they want, and the adapters will make the necessary translation.
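
The translation step such an adapter performs is small. The following Python sketch illustrates the idea only; it is not Polyglot's actual code, and the function name, parameter names and the "white"/"own" labels are assumptions made for the example.

```python
def convert_pov(score_cp: int, source_pov: str, target_pov: str,
                engine_is_white: bool) -> int:
    """Convert a centipawn score between 'white' POV and the engine's 'own' POV.

    Hypothetical helper: the two conventions agree when the engine plays
    White and differ only in sign when it plays Black.
    """
    valid = ("white", "own")
    if source_pov not in valid or target_pov not in valid:
        raise ValueError("POV must be 'white' or 'own'")
    if source_pov == target_pov or engine_is_white:
        return score_cp          # nothing to translate
    return -score_cp             # Black engine: flip the sign


# A Black engine reporting +35 from White's POV thinks it is 35 cp worse.
print(convert_pov(35, "white", "own", engine_is_white=False))
```

Because the conversion is its own inverse, the same helper works in both directions, which is why the two protocols do not have to agree.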

Unfortunately, the document specifying WB protocol is unclear and ambiguous in many places. But WB protocol is defined de facto by the engines that use it. And the overwhelming majority of those use their own POV for the scores.

There is no standard defined for including the scores of each move in the PGN. Winboard_x was the first version of winboard that could save the engine scores in the PGN as comments to the moves, and it used the POV of the engine reporting the score. I have no idea if other GUIs save the score in a compatible format that WinBoard would recognize, and if they do, if they use the same sign convention as WinBoard, (or as each other).
hgm
Posts: 28356
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: score from White's perspective, or engine's perspective?

Post by hgm »

xsadar wrote:Personally, I think it would be better if the gui displays the score from white's point of view, but the most important thing is that it be consistent (which it isn't now). However, the way engines report scores to the gui does not need to be consistent as long as the gui knows how it's being reported. What you could do is make a feature called either 'whitepov' or 'ownpov' and recommend that engines explicitly set or clear this feature depending on what the engine does. Then, if the POV is not explicitly specified, just assume one or the other based on what's most common for engines that exist right now (which I would guess is own POV, which I believe is what polyglot does as well).
I adopted a policy of not unnecessarily bloating WinBoard protocol by supplying multiple ways to do the same thing and then letting engines choose how they want to do it. I think that in the long run such freedom would only serve to create confusion. There should be a unique standard, and if engines violate it, I don't think matters would improve by developing a secondary standard for how to violate the primary standard without consequences. This would lead to an infinite regress, as the engines could violate that secondary standard too, etc. If authors are prepared to build in feature commands telling WB they are going to violate the basic protocol, they might as well direct their programming effort to obeying the original protocol.

So the only concern is engines that violate protocol and are either no longer being developed or are developed by authors who intentionally want to violate protocol. For the benefit of people who want to use such engines, there is an option (which the user will have to specify, so that malevolent engine authors cannot frustrate the process) to flip the score of rogue engines. I think that is the best solution, and it is already in place.

The user cannot currently specify, though, how the score is represented in the display or PGN (own POV or white POV). It could be argued whether this is desirable. On the one hand it might satisfy the needs of individual users better. On the other hand, it reduces the portability of the PGN files, unless there were an official definition of how to include scores in PGN files, and that definition allowed freedom in choosing the sign (removing the ambiguity by requiring a tag that specifies the sign convention used, if it deviates from the default).
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: score from White's perspective, or engine's perspective?

Post by sje »

It needs a write-up to be sure, but PGN allows for "broket forms" to appear in the movetext, just like NAGs can appear.

A PGN broket form is simply an EPD operation enclosed in angle brackets.

Pertinent examples:

1. e4 e5 2. Nf3 <ce -15> Nc6 <ce 12> 3. Bc4 *
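
A reader of such movetext would need to pull the bracketed operations out again. Here is a minimal Python sketch, assuming the `<ce N>` form exactly as written in the example above; the function names and the regular expression are illustrative and not taken from any PGN library.

```python
import re
from typing import List

# Matches a broket form holding the EPD "ce" (centipawn evaluation) opcode.
BROKET_CE = re.compile(r"<\s*ce\s+([+-]?\d+)\s*>")


def extract_ce(movetext: str) -> List[int]:
    """Return the centipawn values of all <ce N> forms, in order."""
    return [int(value) for value in BROKET_CE.findall(movetext)]


def strip_brokets(movetext: str) -> str:
    """Remove every <...> form and normalize the remaining whitespace."""
    return " ".join(re.sub(r"<[^<>]*>", " ", movetext).split())


game = "1. e4 e5 2. Nf3 <ce -15> Nc6 <ce 12> 3. Bc4 *"
print(extract_ce(game))
print(strip_brokets(game))
```

Note that the sketch recovers only the numbers; whose point of view each number is given from is exactly what the rest of this thread is about.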
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

My POV

Post by sje »

No matter which choice is taken, there are going to be some who are unhappy. So a GUI or other analysis muncher should allow the user to select per-engine score POV behavior. An XML format RC file in the user's home directory or some preferences directory would be a good place for such customization.

Also, xboard (and Winboard perhaps) needs TWO window regions for displaying analysis postings when two engines are playing.

And consider that some engines will report certain scores as "MateIn4" or "LoseIn37", and these are more legible than some numeric equivalent.

Furthermore, there could be an engine that reports a score window, or a score and a certainty metric. And then there's the case of reporting no score data at all when the move is plucked from a book or is the only move available.
Dann Corbit
Posts: 12778
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: My POV

Post by Dann Corbit »

sje wrote:No matter which choice is taken, there are going to be some who are unhappy. So a GUI or other analysis muncher should allow the user to select per-engine score POV behavior. An XML format RC file in the user's home directory or some preferences directory would be a good place for such customization.

Also, xboard (and Winboard perhaps) needs TWO window regions for displaying analysis postings when two engines are playing.

And consider that some engines will report certain scores as "MateIn4" or "LoseIn37", and these are more legible than some numeric equivalent.

Furthermore, there could be an engine that reports a score window, or a score and a certainty metric. And then there's the case of reporting no score data at all when the move is plucked from a book or is the only move available.
What is needed is a formal method to define scores during game play so that the result is uniformly well understood.

Besides the things you have made mention of, there are also engines that score checkmate as 9999 or 99999.
Some engines give a score of zero for book moves and others do not.

I think it could be resolved by a dialog something like:
ahead_one_pawn_score?
ahead_one_pawn_score=1000 {answers are in millipawns}
score_type?
score_type=stm {scores are given from the standpoint of side to move}
checkmate_score?
checkmate_score=9999 {9999 represents checkmate}
checkmate_minus_one?
checkmate_minus_one=9998 {interval of 1 indicates one ply to cm}

The cause of our Tower of Babel is that there is currently no clear statement of how to indicate game state from a score standpoint that everyone can agree upon.

The ChessAssistant program has a clever, heuristic method to take guesses.
It feeds some EPD positions for which it knows the answer and then looks at the engine output to decide how the engine scores positions.
Of course, this only gives hints for EPD processing, but the same idea could be used for game play.
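
Once an engine has answered the questions in a dialog like the one above, a GUI could normalize every raw score it receives. The Python sketch below assumes the four answers have already been parsed; the type names, the `normalize` function and especially the `mate_window` cutoff (how close to `checkmate_score` a value must be to count as a mate score) are assumptions of this example and not part of the proposal.

```python
from typing import NamedTuple


class ScoreConvention(NamedTuple):
    ahead_one_pawn_score: int  # raw units per pawn (1000 means millipawns)
    score_type: str            # "stm" = side to move, "white" = White's POV
    checkmate_score: int       # raw value that represents checkmate


class Normalized(NamedTuple):
    is_mate: bool
    white_favoured: bool  # True when White is better (or the score is 0)
    magnitude: int        # centipawns, or plies to checkmate if is_mate


def normalize(raw: int, conv: ScoreConvention, white_to_move: bool,
              mate_window: int = 100) -> Normalized:
    """Translate a raw engine score into a White-POV description."""
    if conv.score_type not in ("stm", "white"):
        raise ValueError("score_type must be 'stm' or 'white'")
    # Does a positive raw score favour White under this convention?
    positive_is_white = conv.score_type == "white" or white_to_move
    white_favoured = (raw >= 0) == positive_is_white or raw == 0
    # An interval of 1 below checkmate_score means one ply to checkmate.
    plies = conv.checkmate_score - abs(raw)
    if 0 <= plies <= mate_window:
        return Normalized(True, white_favoured, plies)
    centipawns = round(abs(raw) * 100 / conv.ahead_one_pawn_score)
    return Normalized(False, white_favoured, centipawns)


conv = ScoreConvention(1000, "stm", 9999)
print(normalize(350, conv, white_to_move=False))
print(normalize(9998, conv, white_to_move=True))
```

With the sample answers from the dialog (millipawns and a checkmate score of 9999), large material scores and mate scores share the same range, so the cutoff is a guess; a real definition would have to keep the two apart.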
xsadar
Posts: 147
Joined: Wed Jun 06, 2007 10:01 am
Location: United States
Full name: Mike Leany

Re: score from White's perspective, or engine's perspective?

Post by xsadar »

hgm wrote:
xsadar wrote:Personally, I think it would be better if the gui displays the score from white's point of view, but the most important thing is that it be consistent (which it isn't now). However, the way engines report scores to the gui does not need to be consistent as long as the gui knows how it's being reported. What you could do is make a feature called either 'whitepov' or 'ownpov' and recommend that engines explicitly set or clear this feature depending on what the engine does. Then, if the POV is not explicitly specified, just assume one or the other based on what's most common for engines that exist right now (which I would guess is own POV, which I believe is what polyglot does as well).
I adopted a policy of not unnecessarily bloating WinBoard protocol by supplying multiple ways to do the same thing and then letting engines choose how they want to do it. I think that in the long run such freedom would only serve to create confusion. There should be a unique standard, and if engines violate it, I don't think matters would improve by developing a secondary standard for how to violate the primary standard without consequences. This would lead to an infinite regress, as the engines could violate that secondary standard too, etc. If authors are prepared to build in feature commands telling WB they are going to violate the basic protocol, they might as well direct their programming effort to obeying the original protocol.
That's a good point. The only issue is that there is no original standard, but if a new standard is developed and engine authors have motivation to follow it, then they probably will. And existing engines that do it differently will always have problems no matter what's done.
For what the standard would be, I would suggest engine's pov since that would work with more of the existing engines.
So the only concern is engines that violate protocol and are either no longer being developed or are developed by authors who intentionally want to violate protocol. For the benefit of people who want to use such engines, there is an option (which the user will have to specify, so that malevolent engine authors cannot frustrate the process) to flip the score of rogue engines. I think that is the best solution, and it is already in place.

The user cannot currently specify, though, how the score is represented in the display or PGN (own POV or white POV). It could be argued whether this is desirable. On the one hand it might satisfy the needs of individual users better. On the other hand, it reduces the portability of the PGN files, unless there were an official definition of how to include scores in PGN files, and that definition allowed freedom in choosing the sign (removing the ambiguity by requiring a tag that specifies the sign convention used, if it deviates from the default).
I think an option for users to select how they want scores displayed would be good. I think white POV would be best, but there are likely those that would prefer engine POV. And it would give engine authors motivation to follow a standard rather than just reporting scores how they like them to be displayed. As for scores in PGN comments, since that's a little more permanent, that's a little more complex of an issue but I think I would still lean toward users being able to decide. Perhaps in that case the scores could be preceded by an initial to clarify whose point of view it is -- W for white and B for black. So a score might look like W+3.50 or B-3.50 for the same position depending on pov.

While we're on the subject of scores: I'm not sure if you've already considered this, but it might also be beneficial to have a standard for reporting mate scores, and a user-friendly way of displaying them. That way we could read something like +M3 for white/engine mates in 3 rather than +327.62 or +327.61 or +99.97 or other confusing scores, depending on the engine.
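
Both display ideas are easy to sketch. The Python below is only an illustration: it assumes the GUI already holds the score in centipawns from White's POV and the mate distance in moves, and the function names are made up for the example.

```python
def format_score(white_cp: int, pov: str = "W") -> str:
    """Format a White-POV centipawn score with a W or B prefix."""
    if pov not in ("W", "B"):
        raise ValueError("pov must be 'W' or 'B'")
    value = white_cp if pov == "W" else -white_cp
    return f"{pov}{value / 100:+.2f}"


def format_mate(moves: int) -> str:
    """Format a mate distance in moves; a negative value means being mated."""
    if moves == 0:
        raise ValueError("mate distance must be non-zero")
    sign = "+" if moves > 0 else "-"
    return f"{sign}M{abs(moves)}"


print(format_score(350, "W"))   # same position ...
print(format_score(350, "B"))   # ... seen from Black's side
print(format_mate(3))
```

The prefix costs one character per score but makes each score self-describing, which matters most for PGN comments that outlive the GUI settings they were written with.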
hgm
Posts: 28356
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: score from White's perspective, or engine's perspective?

Post by hgm »

sje wrote:It needs a write-up to be sure, but PGN allows for "broket forms" to appear in the movetext, just like NAGs can appear.

A PGN broket form is simply an EPD operation enclosed in angle brackets.

Pertinent examples:

1. e4 e5 2. Nf3 <ce -15> Nc6 <ce 12> 3. Bc4 *
I am not in favor of indefinite repetition of prefixes. If the number given needs additional information for its interpretation, that information can be assumed to be the same in the entire file. So it should be specified only once, in a tag, for instance ([Scores "own"] or [Scores "white"]). For the same reason I am against repeating for every score from whose POV it is given. If convenience dictates that several POVs should be allowed, the POV in use should be defined globally, not for every move separately.

WinBoard currently uses a de-facto standard, which reports the score as the first item in a comment, in Pawn units, followed by a slash and the search depth.

[Event "ChessWar XIII Promotion 40m/20'"]
[Site "DELL-E6600"]
[Date "2008.10.08"]
[Round "7.47"]
[White "Micro-Max 1.6"]
[Black "Nero 6.1"]
[Result "1/2-1/2"]
[ECO "D02"]
[WhiteElo "1426"]
[BlackElo "1545"]
[Annotator "1. +0.02   3... +0.20"]
[PlyCount "129"]
[EventDate "2008.??.??"]


1. d4 {+0.02/8 1:07} 1... d5 2. Nf3 {+0.03/8 2:12} 2... Nf6 3. Bf4 {+0.08/7 15}
3... Bg4 {+0.20/7 28} 4. Ne5 {+0.02/7 31} 4... Nc6 {+0.15/7 28} 5. Nxg4 {
+0.18/7 17} 5... Nxg4 {-0.35/7 28} 6. e3 {+0.32/7 17} 6... Nf6 {-0.43/7 28} 7.
c3 {+0.13/7 24} 7... a6 {-0.12/7 28} 8. Qb3 {+0.25/7 26} 8... Qc8 {-0.15/7 28}
9. Nd2 {+0.11/7 40} 9... e6 {-0.19/7 28} 10. O-O-O {+0.09/7 48} 10... Ng4 {
+0.65/7 28} 11. Bg3 {-0.32/7 27} 11... Bd6 {+0.82/7 28} 12. Re1 {-0.17/7 38}
12... Bxg3 {+0.54/7 28} 13. fxg3 {-0.11/7 18} 13... O-O {+0.41/7 28} 14. e4 {
+0.14/7 20} 14... Nf2 {+0.44/7 28} 15. Rg1 {+0.02/7 21} 15... dxe4 {+0.24/7 28}
16. Nxe4 {+0.04/7 25} 16... Nxe4 {+0.02/7 28} 17. Rxe4 {+0.04/7 8} 17... Qd7 {
+0.23/7 28} 18. Bd3 {+0.17/7 24} 18... Qd6 {+0.23/6 28} 19. Kd2 {+0.05/6 8}
19... g6 {+0.24/6 28} 20. Ke3 {-0.02/7 30} 20... b6 {+0.26/6 28} 21. Qc4 {
-0.12/6 9} 21... e5 {+0.32/6 28} 22. Qa4 {-0.08/7 38} 22... exd4+ {+1.56/5 28}
23. cxd4 {-0.13/7 26} 23... f5 {+1.22/6 28} 24. Rh4 {-0.04/7 21} 24... b5 {
+1.11/5 28} 25. Qb3+ {-0.09/7 18} 25... Kg7 {+1.21/5 28} 26. Qc3 {-0.08/7 25}
26... b4 {+1.16/5 28} 27. Qc4 {-0.08/7 26} 27... Rfe8+ {+0.29/5 28} 28. Kd2 {
-1.01/7 15} 28... a5 {+0.25/6 28} 29. Rc1 {+0.03/7 25} 29... g5 {-0.87/6 28}
30. Qxc6 {-0.01/7 24} 30... gxh4 {-0.25/7 28} 31. Qxc7+ {-0.04/7 10} 31... Qxc7
{-0.10/7 28} 32. Rxc7+ {-0.13/8 9} 32... Kh6 {-0.04/7 28} 33. Bxf5 {+0.11/8 22}
33... hxg3 {+0.16/7 28} 34. hxg3 {+0.04/8 22} 34... Rh8 {-0.11/7 28} 35. Ke3 {
+0.03/8 19} 35... a4 {+0.01/7 28} 36. Be4 {+0.01/8 26} 36... Rag8 {-0.83/7 28}
37. Kf4 {+0.00/8 24} 37... Rf8+ {+0.06/7 24} 38. Ke3 {-0.01/8 12} 38... Rfg8 {
+0.00/7 28} 39. Kf4 {+0.00/8 24} 39... b3 {-0.73/7 28} 40. Rc6+ {+0.08/8 21}
40... Kg7 {-0.74/7 28} 41. Ra6 {+0.91/8 13} 41... bxa2 {-0.50/7 28} 42. Ra7+ {
+0.88/9 51} 42... Kh6 {-0.04/9 26} 43. Rxa4 {+0.85/9 53} 43... Rf8+ {-0.66/7 28
} 44. Ke3 {+0.82/9 53} 44... Re8 {-0.93/6 28} 45. Rxa2 {+0.92/9 45} 45... Re7 {
-0.95/7 28} 46. Kf4 {+0.94/9 49} 46... Rf8+ {-0.10/7 28} 47. Bf5 {+0.03/9 19}
47... Rd8 {-0.94/7 28} 48. Ra4 {+0.09/9 57} 48... Re2 {-0.09/6 28} 49. Ra6+ {
+0.03/8 11} 49... Kg7 {-0.09/7 28} 50. Ra7+ {+0.03/8 12} 50... Kf8 {+0.03/7 28}
51. Rxh7 {+0.00/8 15} 51... Rxd4+ {+0.19/6 28} 52. Kf3 {-0.02/9 36} 52... Rxb2
{+0.09/8 25} 53. Ke3 {-0.01/9 40} 53... Ra4 {+0.11/7 28} 54. Be4 {-0.06/9 39}
54... Rb3+ {+0.13/6 28} 55. Kf4 {-0.06/9 22} 55... Rbb4 {+2.96/6 28} 56. Rh8+ {
-0.10/9 12} 56... Kf7 {+2.97/9 28} 57. Rh7+ {-0.03/9 28} 57... Kf6 {+2.93/8 24}
58. Rh6+ {+0.00/9 34} 58... Ke7 {+0.00/9 28} 59. Rh7+ {-0.03/9 25} 59... Kd6 {
+0.02/8 28} 60. Rh6+ {+0.00/9 39} 60... Kd7 {+0.00/7 28} 61. Rh7+ {-0.04/9 23}
61... Ke6 {+0.04/8 28} 62. Rh6+ {+0.00/9 29} 62... Kf7 {+0.00/10 28} 63. Rh7+ {
-0.03/9 24} 63... Ke8 {+0.00/8 28} 64. Rh8+ {-0.09/9 15} 64... Kf7 {+0.00/25 0}
65. Rh7+ {-0.03/9 28 Draw by repetition} 1/2-1/2
WinBoard has no trouble understanding PGN files from other GUIs, so I guess others use this format too. If there were a need to indicate the sign convention used for the score on a per-move basis, I would propose using a backslash instead of a slash if the score is from the white POV, so that no extra characters have to be wasted on it.
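
For anyone who wants to read these comments programmatically, a rough Python sketch follows. It is not WinBoard's parser. It assumes the comment layout seen in the game above (score, slash, depth, then an optional field that looks like elapsed time in seconds or m:ss, which is a guess), and it also accepts the backslash variant, which is only the proposal made here and not an existing format.

```python
import re

# {score/depth [time] ...}; a backslash separator marks White POV (proposal only).
COMMENT = re.compile(
    r"\{\s*([+-]?\d+\.\d+)([/\\])(\d+)(?:\s+(\d+(?::\d+)*))?[^}]*\}"
)


def parse_annotations(movetext: str):
    """Return a list of (score_in_pawns, depth, seconds_or_None, pov)."""
    result = []
    for score, sep, depth, clock in COMMENT.findall(movetext):
        seconds = None
        if clock:
            seconds = 0
            for part in clock.split(":"):   # "1:07" -> 67, "15" -> 15
                seconds = seconds * 60 + int(part)
        pov = "white" if sep == "\\" else "own"
        result.append((float(score), int(depth), seconds, pov))
    return result


sample = "1. d4 {+0.02/8 1:07} 1... d5 2. Nf3 {+0.03/8 2:12} 2... Nf6"
for item in parse_annotations(sample):
    print(item)
```

The pattern tolerates comments that wrap across lines and trailing text such as a result message, both of which occur in the game above.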

As to standardizing mate scores: this would be nice. I don't see any merit in Dann's proposal: it is tantamount to formalizing chaos. We are not going to have every engine do as it pleases, and design a cumbersome protocol (and force these engines to use it) to tell what exactly they are going to do. Instead, we will prescribe a simple protocol for the primary task, and engines will have to abide by it.

If engines want to print mate scores, I propose they do it using scores in the 1,000,000 range, as follows:

1,000,001 = mate in 1 (1 ply)
1,000,002 = mate in 2 (3 ply)
...

-1,000,000 = checkmated
-1,000,001 = mated in 1 (2 ply)
...

The GUI can display these scores as M1, M2, ... and -M0, -M1, ..., respectively, if it wants.

This convention has the advantage that GUIs not implementing it would not choke on the engine output, as it is still a normal integer. And what they display in that case would still be easy to interpret by the user.
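
On the GUI side, decoding this convention takes only a few lines. The Python sketch below follows the table above; the function names are invented for the example, ordinary scores are assumed to be in centipawns, and +1,000,000 itself, which the table does not assign, is treated as an ordinary score.

```python
MATE_BASE = 1_000_000


def describe(score: int) -> str:
    """Render a score as M<n>, -M<n>, or a signed value in pawns."""
    if score > MATE_BASE:
        return f"M{score - MATE_BASE}"        # 1,000,001 -> M1
    if score <= -MATE_BASE:
        return f"-M{-score - MATE_BASE}"      # -1,000,000 -> -M0
    return f"{score / 100:+.2f}"              # assumes centipawns


def plies_to_mate(score: int) -> int:
    """Plies until checkmate for a mate score, following the table above."""
    if score > MATE_BASE:
        return 2 * (score - MATE_BASE) - 1    # mate in n moves = 2n-1 plies
    if score <= -MATE_BASE:
        return 2 * (-score - MATE_BASE)       # mated in n moves = 2n plies
    raise ValueError("not a mate score")


for s in (1_000_002, -1_000_000, -1_000_001, 35):
    print(s, describe(s))
```

A GUI without this logic would simply print the raw integer, which is the fallback behavior the convention is designed to keep readable.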