Towards a standard analysis output format

sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Towards a standard analysis output format

Post by sje »

When reporting a search result in character-formatted output, it can be convenient to have the reported analysis appear on a single line. This is certainly the case when a result is presented as a kibitz in server play.

Wouldn't it also be convenient if program authors were to adopt a standard for analysis reporting? This would make reports easier for programs to parse, including parsing to map a report into a graphical interface.

A sample position and its analysis as reported by some of my code:
[d]1N2k3/6r1/3pP3/p2P1p1P/PpBpP2b/1PP2b2/2K5/B3R1q1 b - - 1 42

Code:

[MateIn4/7/17.269/111,573/0] 42... Rg2+ 43 Be2 Bxe4+ 44 Kc1 Qxe1+ 45 Bd1 Bg5#

Key:

Inside the brackets, in order:

1) Expectation (decimal pawns or a special symbol like MateIn7, LoseIn2, Even, Checkmated)

2) Integer ply draft

3) Decimal seconds of CPU usage

4) Node count (commas inserted for human readability)

5) Tablebase probe count

After the bracket set, the predicted variation (if any) appears with move number labeling.
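
To show how a GUI might consume the line above, here is a minimal parsing sketch in Python. It assumes exactly the five slash-separated fields listed in the key; the names `Analysis` and `parse_analysis` are hypothetical and not part of any toolkit.

```python
import re
from dataclasses import dataclass

# Five slash-separated fields inside brackets, then the predicted variation.
LINE_RE = re.compile(
    r"^\[([^/\]]+)/(\d+)/([\d.]+)/([\d,]+)/([\d,]+)\]\s*(.*)$"
)

@dataclass
class Analysis:
    expectation: str    # e.g. "MateIn4", "Even", or decimal pawns as text
    draft: int          # ply draft
    cpu_seconds: float  # CPU usage in seconds
    nodes: int          # node count, commas removed
    tb_probes: int      # tablebase probe count, commas removed
    pv: list            # predicted variation tokens, move numbers included

def parse_analysis(line):
    """Parse one report line; raise ValueError if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("not an analysis line: " + line)
    exp, draft, secs, nodes, probes, pv = m.groups()
    return Analysis(
        exp,
        int(draft),
        float(secs),
        int(nodes.replace(",", "")),
        int(probes.replace(",", "")),
        pv.split(),
    )

sample = "[MateIn4/7/17.269/111,573/0] 42... Rg2+ 43 Be2 Bxe4+ 44 Kc1 Qxe1+ 45 Bd1 Bg5#"
report = parse_analysis(sample)
```

The expectation is kept as text here, since how to interpret it is still an open question.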
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Towards a standard analysis output format

Post by Dann Corbit »

One problem in chess analysis reporting is that there is no standard for "from whose side is it?", so some engines always report from White's perspective and others from the side to move's perspective.

For the score, I would prefer always numeric, because it is simpler to parse. People might like special symbols for special situations, though, and for those I would advocate shorter symbols:

+M<N> where <N> is the number of moves to mate
-M<N> where <N> is the number of moves before being mated by the opponent
=D<R> where <R> is the reason for the draw:
I is (I)nsufficient material
R is (R)epeated position 3 times
S is (S)talemate, or 50 full moves without an irreversible game state change

Or something like that.

The most important thing is to standardize for which side you are reporting the score.
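
A rough Python sketch of how the short symbols above could be read by a parser. The function name `parse_score` and the returned tuples are made up for illustration, and anything that is not a symbol is assumed to be a plain decimal number of pawns.

```python
import re

# Draw reason letters as listed in the post; the descriptions are paraphrased.
DRAW_REASONS = {
    "I": "insufficient material",
    "R": "repeated position",
    "S": "stalemate or 50 moves without irreversible change",
}

def parse_score(token):
    """Classify a score token as mate, mated, draw, or numeric pawns."""
    m = re.fullmatch(r"([+-])M(\d+)", token)
    if m:
        kind = "mate" if m.group(1) == "+" else "mated"
        return (kind, int(m.group(2)))
    m = re.fullmatch(r"=D([IRS])", token)
    if m:
        return ("draw", DRAW_REASONS[m.group(1)])
    # Fallback: a numeric score; float() raises ValueError on garbage.
    return ("pawns", float(token))
```

Note that the parser still cannot tell whose perspective the sign refers to; that has to come from the standard itself.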
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Towards a standard analysis output format

Post by sje »

Dann Corbit wrote:The most important thing is to standardize for which side you are reporting the score.
Yes; I suspect that there are nearly irreconcilable differences on this point. In part for reasons of symmetry, my code always reports scores from the perspective of the side to move. I'm not sure that this is much better than some alternatives, but it certainly isn't any worse.

For those who might choose White POV score reporting, I ask why is that any different from Black POV reporting? Either seems to be arbitrary.

And so what does "-MateIn1" really mean? It could be "LoseIn1" or maybe either of "LoseIn2" or "Checkmated" depending upon the coder's interpretation. That's another of the reasons I use side-to-move score perspective.

The tablebase probe count also has commas as needed. I'm afraid the use of commas over spaces is an Americanism and may not be the best.

Maybe there should be a book probe count as well.
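
Whichever perspective is chosen, converting between the two is mechanical as long as the side to move is known. A small Python sketch, assuming the side to move is read from the second field of the FEN; both function names are hypothetical.

```python
def side_to_move_from_fen(fen):
    """Return 'w' or 'b' from the second whitespace-separated FEN field."""
    return fen.split()[1]

def to_white_pov(score, side_to_move):
    """Convert a side-to-move score into a White point-of-view score."""
    if side_to_move not in ("w", "b"):
        raise ValueError("side to move must be 'w' or 'b'")
    return score if side_to_move == "w" else -score

fen = "1N2k3/6r1/3pP3/p2P1p1P/PpBpP2b/1PP2b2/2K5/B3R1q1 b - - 1 42"
stm = side_to_move_from_fen(fen)    # Black is to move in the sample position
white_score = to_white_pov(2.5, stm)
```

The conversion only works for numeric scores; symbols such as MateIn4 or LoseIn2 would need their own mapping.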
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

It's not *that* slow

Post by sje »

Lest the new Chess In Lisp toolkit receive unnecessary criticism (vs necessary criticism), let me point out that some serious performance problems in the demo forced mate finder have been fixed.

[d]1N2k3/6r1/3pP3/p2P1p1P/PpBpP2b/1PP2b2/2K5/B3R1q1 b - - 1 42

Old analysis:

Code:

[MateIn4/7/17.269/111,573/0] 42... Rg2+ 43 Be2 Bxe4+ 44 Kc1 Qxe1+ 45 Bd1 Bg5#

New analysis:

Code:

[MateIn4/7/0.108/633/0] 42... Rg2+ 43 Be2 Bxe4+ 44 Kc1 Qxe1+ 45 Bd1 Bg5#

A speed-up factor of about 160. The actual Lisp call in both cases:

Code:

(matefinder "1N2k3/6r1/3pP3/p2P1p1P/PpBpP2b/1PP2b2/2K5/B3R1q1 b - - 1 42" 4)

When running the matefinder routine on the 100,000 position matein3 FEN file, the toolkit solved all the problems with an average speed of about 16.1 Hz (period of about 62.1 msec). For the easier 100,000 position matein2 FEN file, the frequency was about 162 Hz (a period of about 6.16 msec). The matein1 problems zipped by at about 839 Hz with nearly all of the time eaten by I/O and decoding/encoding.

Why is this important? Consider a knowledge-based program with a time budget of (on average) 100 seconds per move and a node budget of (also on average) 1,000 nodes per search; this is 100 milliseconds total per node. To keep the program from making really dumb blunders due to gaps in the knowledge base, perhaps ten percent of each node time allocation can be spent on brute force calculation to help avoid stepping into obvious traps and mates. Therefore, there are about ten milliseconds per node that can be expended on, uh, cheating with brute force. These budget numbers are admittedly a bit arbitrary.

Anyway, with the above figures a mate-in-two search cheat at each node would be permissible, but not a mate in three search.
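
The arithmetic behind that conclusion, as a short Python check. The rates and budget figures are the ones quoted above; the function names are only illustrative.

```python
def per_node_budget_ms(seconds_per_move, nodes_per_search, brute_force_share):
    """Milliseconds of brute-force time available at each node."""
    return seconds_per_move * 1000.0 / nodes_per_search * brute_force_share

def period_ms(frequency_hz):
    """Average time per solved problem, in milliseconds."""
    return 1000.0 / frequency_hz

def affordable_searches(budget_ms, rates_hz):
    """Names of the searches whose average period fits within the budget."""
    return [name for name, hz in rates_hz.items() if period_ms(hz) <= budget_ms]

# Measured solve rates from the matein1/2/3 test files.
RATES_HZ = {"matein1": 839.0, "matein2": 162.0, "matein3": 16.1}

budget = per_node_budget_ms(100, 1000, 0.10)   # about 10 ms per node
fits = affordable_searches(budget, RATES_HZ)   # matein1 and matein2 only
speedup = 17.269 / 0.108                       # old vs. new CPU seconds
```

These are averages, so a hard mate-in-two position could still blow the ten millisecond budget at an individual node.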

Code snippet:


;; ---------- Simple mate finding routines

(defun matefinder-node (my-pos my-window my-ply my-depth)
  "Search for a mate attack/defend at the given ply and depth for the given position; return an analysis."
  (let
    (
      (result (make-analysis :draft my-depth :nodecount 1 :score (window-alfa my-window) :fgpair (pos->fgpair my-pos)))
    )
    (if (zero? my-depth)
      (setf (analysis-score result) (if (pos-is-checkmate? my-pos) checkmated-score even-score))
      (let*
        (
          (atkflag  (even? my-ply))
          (d1flag   (one? my-depth))
          (newply   (1+ my-ply))
          (newdepth (1- my-depth))
          (moves    (if d1flag (generate-checkers my-pos) (generate my-pos)))
          (basket
            (new-basket
              (if d1flag
                (movelist-assign-order-simple-est-gain moves)
                (if atkflag
                  (movelist-assign-order-negative-mobility moves my-pos)
                  (movelist-assign-order-simple-est-gain moves)))))
          (atleast1 nil)
        )
        (dowhile
          (and
            (basket-has-moves? basket)
            (not (is-window-cutoff? my-window))
            (if atkflag
              (not (is-score-mating? (window-alfa my-window)))
              (negative? (window-alfa my-window))))
          (let
            (
              (desc-analysis nil)
              (newbest       nil)
              (move          (basket-pick-best basket))
            )
            (setf atleast1 t)
            (pos-execute my-pos move)
            (setf desc-analysis (matefinder-node my-pos (downshift-window my-window) newply newdepth))
            (setf newbest (combine-node-desc-analyses move result desc-analysis))
            (when newbest
              (setf (window-alfa my-window) (analysis-score result)))
            (pos-retract my-pos)))
        (unless atleast1
          (setf (analysis-score result) (if (pos-inch my-pos) checkmated-score even-score)))))
    result))

(defun matefinder-root (my-pos my-fmvc)
  "Search for a mate in N for the given position at the root; return an analysis."
  (matefinder-node my-pos (new-full-window) 0 (1- (2* my-fmvc))))

(defun matefinder (my-fen my-fmvc)
  "Search for a mate in N for the given position; return an analysis."
  (let ((result nil) (pos (string->pos my-fen)) (st (new-cu-simpletimer t)))
    (setf result (matefinder-root pos my-fmvc))
    (setf (analysis-usage result) (simpletimer->msec st))
    (movelist-notate-sequence (analysis-pv result) pos)
    (if (is-score-mating? (analysis-score result))
      (format t "Found a mate in ~R: ~A~%"
        (calc-mate-distance (analysis-score result))
        (analysis-pv->string result))
      (setf result nil))
    result))

(defun matehammer (my-fen my-fmvc-upper)
  "Repeatedly search for a mate in full move number increments."
  (let ((result nil) (fmvc 1))
    (dowhile (and (not result) (<= fmvc my-fmvc-upper))
      (format t "Searching for a mate in ~R...~%" fmvc)
      (setf result (matefinder my-fen fmvc))
      (incf fmvc))
    result))
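
For readers who do not speak Lisp, here is the same attack/defend idea as a toy in Python. It is not a translation of the code above: there is no window, no move ordering and no real move generator, and positions are just names in a hand-written tree. What it does keep is the depth of 2N-1 plies for a mate in N and the repeated search at increasing N. All names are hypothetical.

```python
def forced_mate(tree, in_check, pos, depth, attacker_to_move=True):
    """True if the attacker can force checkmate within `depth` plies.

    tree maps a position name to its successor positions; in_check is the
    set of positions where the side to move is in check.
    """
    moves = tree.get(pos, [])
    if not moves:
        # No legal moves: checkmate only if the defender is stuck in check.
        return (not attacker_to_move) and pos in in_check
    if depth == 0:
        return False
    if attacker_to_move:
        # Attacker needs at least one move that still forces mate.
        return any(forced_mate(tree, in_check, nxt, depth - 1, False)
                   for nxt in moves)
    # Defender must be mated after every reply.
    return all(forced_mate(tree, in_check, nxt, depth - 1, True)
               for nxt in moves)

def mate_hammer(tree, in_check, root, max_moves):
    """Try mate in 1, 2, ... max_moves; return the first N found, or None."""
    for n in range(1, max_moves + 1):
        if forced_mate(tree, in_check, root, 2 * n - 1):
            return n
    return None

# Hand-made example: after "a", both defender replies allow a mating move.
TREE = {
    "root": ["a", "x"],
    "a": ["b1", "b2"],
    "b1": ["m1"],
    "b2": ["m2", "q"],
    "q": ["r"],
    "x": ["y"],
    "y": ["z"],
}
IN_CHECK = {"m1", "m2"}
found = mate_hammer(TREE, IN_CHECK, "root", 3)
```

In the example tree, "z" and "r" have no moves but are not in check, so they count as stalemate rather than mate.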
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Towards a standard analysis output format

Post by bob »

Dann Corbit wrote:
One problem in chess analysis reporting is that there is no standard for "from whose side is it?", so some engines always report from White's perspective and others from the side to move's perspective.

For the score, I would prefer always numeric, because it is simpler to parse. People might like special symbols for special situations, though, and for those I would advocate shorter symbols:

+M<N> where <N> is the number of moves to mate
-M<N> where <N> is the number of moves before being mated by the opponent
=D<R> where <R> is the reason for the draw:
I is (I)nsufficient material
R is (R)epeated position 3 times
S is (S)talemate, or 50 full moves without an irreversible game state change

Or something like that.

The most important thing is to standardize for which side you are reporting the score.
Numeric won't work. Not everyone uses the same mated-in-0 score. Some use 65535, some (à la Crafty) use 32768, some use 99999, etc. I like the way I report them, which is simply MatNN, which if + means White mates in NN moves (not plies), while -MatNN means White is mated in NN moves. "Mat" is about as short as you could get unless we go to #33 or -#33. I certainly don't like "MateinNN" or "MatedinNN". That's harder to parse; using a sign is easier IMHO.

But at least with the MatNN score, everybody understands mate, whereas with current scores, WinBoard and some others are royally confused.
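
A sketch in Python of why the symbolic form travels better than the raw number: the engine knows its own mate value, the reader does not. This assumes the common convention that a mate score is the mate value minus the number of plies to mate, which not every engine follows. The name `mate_token` and the `max_ply` cutoff are hypothetical.

```python
def mate_token(score, mate_value, max_ply=512):
    """Return 'MatNN' or '-MatNN' for a mate score, or None otherwise.

    Assumes |score| == mate_value - plies_to_mate for mate scores.
    """
    plies = mate_value - abs(score)
    if not 0 <= plies <= max_ply:
        return None                      # an ordinary evaluation
    moves = (plies + 1) // 2             # full moves, not plies
    sign = "-" if score < 0 else ""
    return f"{sign}Mat{moves:02d}"

# The same situation under two different internal mate values.
a = mate_token(32768 - 7, 32768)    # mate in 7 plies
b = mate_token(99999 - 7, 99999)    # same mate, different engine constant
```

Both calls give the same token even though the raw scores differ by tens of thousands.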
hgm
Posts: 28387
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Towards a standard analysis output format

Post by hgm »

I always use 100000+N for mate in N moves (= 2N-1 ply), and -(100000+N) for mated in N moves (= 2N ply), in WinBoard Thinking Output. GUIs can display that as Mate<N> or M<N> or #<N> when they want, but even if they don't, the only thing the user notices is that they spell "Mate" in a funny way (namely as "1000."), but the message (i.e. N) gets clearly across.

And 100,000 is safely outside the range [-32K, 32K-1] used for normal scoring, so that it is not likely to interfere with any normal output.

Btw, the kibitzing format used by WinBoard, and captured by the ICS, is

!!! <score>/<depth> (<time> sec, <nodes> nodes, <nodeRate> knps) PV = <PV>

e.g.

!!! -2.17/12 (1.33 sec, 135627 nodes, 101 knps) d4 Rxd4 Qf6 f3 ...
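
A small Python sketch of both ideas: the 100000+N mate encoding and the kibitz template. It assumes ordinary scores are in centipawns and that the node rate is truncated to whole knps; the function names are made up, and the output follows the template line, including "PV =".

```python
MATE_BASE = 100000  # safely outside the normal 16-bit score range

def encode_mate(n_moves, mating):
    """Encode mate in N as 100000+N and mated in N as -(100000+N)."""
    return MATE_BASE + n_moves if mating else -(MATE_BASE + n_moves)

def decode_score(score):
    """Split a reported score into (kind, value)."""
    if score >= MATE_BASE:
        return ("mate", score - MATE_BASE)
    if score <= -MATE_BASE:
        return ("mated", -score - MATE_BASE)
    return ("centipawns", score)

def format_kibitz(score_pawns, depth, seconds, nodes, pv):
    """Build a kibitz line following the template quoted above."""
    knps = int(nodes / seconds / 1000) if seconds > 0 else 0
    return (f"!!! {score_pawns:.2f}/{depth} "
            f"({seconds:.2f} sec, {nodes} nodes, {knps} knps) PV = {pv}")

line = format_kibitz(-2.17, 12, 1.33, 135627, "d4 Rxd4 Qf6 f3")
```

A GUI that does not know the convention would show a decoded mate score as a very large evaluation, which is the "funny spelling" mentioned above.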
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Towards a standard analysis output format

Post by Don »

sje wrote:The tablebase probe count also has commas as needed. I'm afraid the use of commas over spaces is an Americanism and may not be the best.
It's always best to do it the other guy's way and be the one to adapt or change. Then you are not depending on someone else to change. And chess is much bigger in Europe anyway than in America.

sje wrote:Maybe there should be a book probe count as well.
Make much of it optional and use the KISS principle so that people won't be too intimidated to use it.

Don
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Towards a standard analysis output format

Post by bob »

sje wrote:For those who might choose White POV score reporting, I ask why is that any different from Black POV reporting? Either seems to be arbitrary.
I think either is better than "side to move point of view" or "own point of view". That certainly makes parsing complex. Books use +/- to show that White is winning, not that the side on move is winning. There is a plus for consistency.

sje wrote:And so what does "-MateIn1" really mean? It could be "LoseIn1" or maybe either of "LoseIn2" or "Checkmated" depending upon the coder's interpretation. That's another of the reasons I use side-to-move score perspective.
I simply use MateNN or -MateNN. MateNN means that from this position, starting with the first move of the PV, I am going to play exactly NN moves and leave my opponent checkmated. -Mate05 means that I play the first move in the PV, and then my opponent is going to play exactly 5 moves and mate me. I sort of like the +M5 and -M5 better as it is shorter...

sje wrote:The tablebase probe count also has commas as needed. I'm afraid the use of commas over spaces is an Americanism and may not be the best.
Question is, who is this intended for? People like commas to make numbers easier to read. Computers do not like 'em and they have to be parsed out to interpret the number. I'd like to see a standard that a GUI can use and depend on, and that people might be able to read if they want (but in reality, the GUI should do the formatting to make it as readable as it wants).

sje wrote:Maybe there should be a book probe count as well.
Not sure how one would formalize that. Crafty will, if asked, show all the possible book moves, and information about them, such as how frequently they were played, etc. Specifying that might be problematic depending on who displays what data.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Towards a standard analysis output format

Post by Don »

bob wrote:Question is, who is this intended for? People like commas to make numbers easier to read. Computers do not like 'em and they have to be parsed out to interpret the number. I'd like to see a standard that a GUI can use and depend on, and that people might be able to read if they want (but in reality, the GUI should do the formatting to make it as readable as it wants).
Yes, this makes great sense. The commas can be placed by anything external that parses it.
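
A minimal Python sketch of that division of labour: the engine emits bare digits and whatever displays them picks the separator. The function names are hypothetical.

```python
def group_digits(count, separator=","):
    """Format an integer with a thousands separator chosen by the display."""
    return f"{count:,}".replace(",", separator)

def parse_count(text):
    """Read a count that may already contain commas or spaces."""
    return int(text.replace(",", "").replace(" ", ""))

american = group_digits(111573)        # comma-grouped
spaced = group_digits(111573, " ")     # space-grouped
raw = parse_count("111,573")
```

With bare digits on the wire, the comma versus space question stops being part of the standard at all.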
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Towards a standard analysis output format

Post by sje »

The Lisp structure that represents a search result analysis also has a member for recording the search termination status, although this is not printed in the one-line report.

There are sixteen of these currently in use:

Code:

;; ---------- Search termination status

(defconstant sts-all-bad-but-one  0 "All moves but one are bad")
(defconstant sts-all-certain      1 "All moves have certain scores")
(defconstant sts-book-move        2 "Book move located")
(defconstant sts-certain-lose     3 "Certain lose located")
(defconstant sts-certain-mate     4 "Certain mate located")
(defconstant sts-interrupt        5 "Interrupted")
(defconstant sts-maximum-depth    6 "Maximum depth limit")
(defconstant sts-no-moves         7 "No moves available")
(defconstant sts-param-fault      8 "Parameter fault")
(defconstant sts-program-error    9 "Program error detected")
(defconstant sts-random-move     10 "Random move selected")
(defconstant sts-re-nodes        11 "Resource exhaustion: nodes")
(defconstant sts-re-plies        12 "Resource exhaustion: plies")
(defconstant sts-re-time         13 "Resource exhaustion: time")
(defconstant sts-singleton       14 "Only one move available")
(defconstant sts-unterminated    15 "Search is still active")
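
For anyone wanting to carry the same status codes in a GUI or parser, a possible Python mirror of the list above. The numeric values follow the Lisp constants; the class name, member names and helper are hypothetical.

```python
from enum import IntEnum

class SearchStatus(IntEnum):
    """Search termination status codes, numbered as in the Lisp constants."""
    ALL_BAD_BUT_ONE = 0
    ALL_CERTAIN = 1
    BOOK_MOVE = 2
    CERTAIN_LOSE = 3
    CERTAIN_MATE = 4
    INTERRUPT = 5
    MAXIMUM_DEPTH = 6
    NO_MOVES = 7
    PARAM_FAULT = 8
    PROGRAM_ERROR = 9
    RANDOM_MOVE = 10
    RE_NODES = 11
    RE_PLIES = 12
    RE_TIME = 13
    SINGLETON = 14
    UNTERMINATED = 15

def is_resource_exhaustion(status):
    """True for the three 'Resource exhaustion' codes."""
    return status in (SearchStatus.RE_NODES,
                      SearchStatus.RE_PLIES,
                      SearchStatus.RE_TIME)
```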