Interpretation of a result for time and node count:
When reporting the result of a search, many programs (maybe nearly all of them) report the total time used and total node count. But are these an accurate description of the effort needed to produce the result?
I suggest that a better report would use the time/nodes numbers at the point in the search when the result PV and score were established. Alternatively, the time/nodes numbers reported would be from the point in the search after the point at which any PV could be established; i.e., the last time a score/bound was determined for a root move selection. The idea is to not credit any effort in which all data generated was thrown away when a resource limit (e.g., allocated elapsed time) was encountered.
The change is easy to implement. A program need only capture the time/nodes numbers at the time a PV is established and pass these along with the final PV when the result report is generated.
Interpretation of a result for time and node count
Moderators: hgm, Rebel, chrisw
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: Interpretation of a result for time and node count
Depends on what you mean by "result". If result is just the PV or the best move, then no. If the result is that plus the number of examined root moves at that depth, then yes.sje wrote: When reporting the result of a search, many programs (maybe nearly all of them) report the total time used and total node count. But are these an accurate description of the effort needed to produce the result?
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Interpretation of a result for time and node count
I don't think that following your proposal would be accurate enough. When a PV is established you don't know exactly whether this PV will be kept until the end of an iteration. In many cases it will but sometimes it won't. So the time needed to *confirm* the PV by proving that all other moves are not better is part of the time needed to produce the result.
You may argue that this "confirmation" is often wasted. But there you are using a posteriori knowledge.
You are right in general about the timeout issue. Here you have these cases:
A. timeout during search of first root move (no new PV),
B. timeout during search of second root move (could produce new PV but no new best move),
C. timeout after search of second root move (could produce new PV and sometimes also new best move).
What would you report in these cases? Would that match the expectations? According to your proposal it seems you would like to report the time/nodes needed to produce the PV of the previous iteration in case A, and that of the current iteration in case B+C, which might give surprising results for case A at least.
Let's assume an engine needs 60 seconds to complete all iterations until depth X-1, then 30 seconds until the PV of iteration X pops up, then another 30 seconds to confirm that the remaining moves in X are not better, then starts searching the best move from X to depth X+1, and gets a timeout after another 45 seconds right before completing that search.
Total time elapsed: 2:45 minutes
Your reported time: 1:30 minutes
So what would you think as a user about what the engine did in the remaining 1:15 minutes?
Engines that do not even start a new iteration if completing it is unlikely might avoid this problem and can also save a lot of time, especially in cases A and B.
Another issue, more from the statistical viewpoint: it is hard to compare reported times and node counts which are not referring to fully completed iterations. This also affects calculating the EBF.
Sven
You may argue that this "confirmation" is often wasted. But there you are using a posteriori knowledge.
You are right in general about the timeout issue. Here you have these cases:
A. timeout during search of first root move (no new PV),
B. timeout during search of second root move (could produce new PV but no new best move),
C. timeout after search of second root move (could produce new PV and sometimes also new best move).
What would you report in these cases? Would that match the expectations? According to your proposal it seems you would like to report the time/nodes needed to produce the PV of the previous iteration in case A, and that of the current iteration in case B+C, which might give surprising results for case A at least.
Let's assume an engine needs 60 seconds to complete all iterations until depth X-1, then 30 seconds until the PV of iteration X pops up, then another 30 seconds to confirm that the remaining moves in X are not better, then starts searching the best move from X to depth X+1, and gets a timeout after another 45 seconds right before completing that search.
Total time elapsed: 2:45 minutes
Your reported time: 1:30 minutes
So what would you think as a user about what the engine did in the remaining 1:15 minutes?
Engines that do not even start a new iteration if completing it is unlikely might avoid this problem and can also save a lot of time, especially in cases A and B.
Another issue, more from the statistical viewpoint: it is hard to compare reported times and node counts which are not referring to fully completed iterations. This also affects calculating the EBF.
Sven
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: Interpretation of a result for time and node count
There is another possible case: the search of another root move failed high in the zero window search, but the search times out when searching it with a full window.
In this case you have a new best move but no PV.
In this case you have a new best move but no PV.
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Interpretation of a result for time and node count
This change should be considered a protocol violation. In their thinking output engines are expected to print the true time and number of nodes upto that point.sje wrote:The change is easy to implement. A program need only capture the time/nodes numbers at the time a PV is established and pass these along with the final PV when the result report is generated.
If you want to let the user know how much time and nodes were needed to find a certain PV, you should simply have the engine emit a line of thinking output at the moment it finds that PV. You can then repeat that (with new times and nod counts) at the end of the iteration, if the PV did not change.
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
A few points
A few points:
1) What exactly is the interpretation of the result time? It could be wall clock time, CPU time, the sum of all CPU time for multiple CPUs, the sum of all core time (physical or virtual), or maybe something else.
2) What exactly is an iteration? Not all programs use an iterative search. Also, an iteration in an iterated best-first search is likely something quite different from that in an iterated depth-first search.
3) What exactly is a node? Is it the same as the number of static evaluations? Is it the same as the number of execute/retract pairs? Is it the count of unique positions created?
4) What exactly is the reported analysis draft? Is it the number of completed iterations? Or the number of started iterations? Is it the length of the PV?
Few will agree on answers for all of the above.
1) What exactly is the interpretation of the result time? It could be wall clock time, CPU time, the sum of all CPU time for multiple CPUs, the sum of all core time (physical or virtual), or maybe something else.
2) What exactly is an iteration? Not all programs use an iterative search. Also, an iteration in an iterated best-first search is likely something quite different from that in an iterated depth-first search.
3) What exactly is a node? Is it the same as the number of static evaluations? Is it the same as the number of execute/retract pairs? Is it the count of unique positions created?
4) What exactly is the reported analysis draft? Is it the number of completed iterations? Or the number of started iterations? Is it the length of the PV?
Few will agree on answers for all of the above.
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Sample statistics report form the new CIL Toolkit
Code: Select all
* (simpleton (string->pos "2kr3r/pp1q1ppp/5n2/1Nb5/2Pp1B2/7Q/P4PPP/1R3RK1 w - - 0 1"))
[si] Starting iteration one
[pv] 1 Bc1 Qxh3 2 gxh3 score: -1.348
[pv] 1 Bd2 Qxh3 2 gxh3 score: -1.068
[pv] 1 Kh1 Qxh3 2 gxh3 score: -0.978
[pv] 1 Qd3 score: -0.934
[pv] 1 Rfe1 Qxh3 2 gxh3 score: -0.858
[si] Starting iteration two
[pv] 1 Rfe1 d3 score: -0.972
[pv] 1 Qxd7+ Rxd7 2 Rfe1 score: -0.874
[si] Starting iteration three
[pv] 1 Qxd7+ Rxd7 2 Rfe1 d3 score: -1.092
[pv] 1 Rfe1 d3 2 Nc3 Qxh3 3 gxh3 score: -0.936
[pv] 1 Qg3 Qf5 2 Rfe1 score: -0.904
[si] Starting iteration four
[pv] 1 Qg3 g6 2 Bg5 Ne4 score: -1.018
[pv] 1 Qxd7+ Rxd7 2 Rfe1 d3 3 Nc3 score: -0.952
[si] Starting iteration five
[pv] 1 Qxd7+ Rxd7 2 Rfe1 d3 3 Rb2 h5 score: -1.124
[pv] 1 Qg3 g6 2 Rfe1 Rde8 3 h4 score: -0.956
[pv] 1 Rfe1 Qxh3 2 gxh3 d3 3 Nc3 score: -0.936
[st] Resource exhaustion (time) Elapsed: 000.00:05:00.000
Search statistics summary report at 2011.04.12 06:23:51
Root position FEN: 2kr3r/pp1q1ppp/5n2/1Nb5/2Pp1B2/7Q/P4PPP/1R3RK1 w - - 0 1
Root moves (forty-seven):
(Bb8 Bc1 Bc7 Bd2 Bd6 Be3 Be5 Bg3 Bg5 Bh6 Kh1 Na3 Nc3 Nc7 Nd6+ Nxa7+ Nxd4
Qa3 Qb3 Qc3 Qd3 Qe3 Qe6 Qf3 Qf5 Qg3 Qg4 Qh4 Qh5 Qh6 Qxd7+ Qxh7 Ra1 Rb2 Rb3
Rb4 Rbc1 Rbd1 Rbe1 Rfc1 Rfd1 Rfe1 a3 a4 f3 g3 g4)
PV: 1 Rfe1 Qxh3 2 gxh3 d3 3 Nc3
Expectation: -0.936
Analysis scalars:
Draft: 5 Nodes: 498,502 Time: 000.00:01:16.067
Frequency: 6.5534596 KHz Period: 152.59117 us
Total node counts:
All: 2,040,525 Capture: 1,495,178 Evasion: 292,019 General: 79,660 PlyZero: 5
Total elapsed time: 000.00:05:00.000
Termination status: Resource exhaustion (time)
Transposition table statistics (general position tables by color):
(white) size: 65,536 usage: 65,526 match: 108,175 probe: 806,434 store: 753,412
(black) size: 65,536 usage: 65,536 match: 166,725 probe: 1,233,190 store: 1,113,436
Transposition table statistics (pawn structure tables by color):
(white) size: 4,096 usage: 3,574 match: 665,879 probe: 688,627 store: 22,748
(black) size: 4,096 usage: 3,727 match: 783,963 probe: 806,551 store: 22,588
Transposition pruning counts:
Attempt: 2,039,624 Match: 274,900 Sufficient draft: 213,905 Success: 172,767
[-0.936/5/76.067/498,502/0] 1 Rfe1 Qxh3 2 gxh3 d3 3 Nc3
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: A few points
In WB protocol the time issue at least is defined: engines should report wall-clock time, unless a previous nps 0 command has instructed them to use user CPU time.sje wrote:A few points:
1) What exactly is the interpretation of the result time? It could be wall clock time, CPU time, the sum of all CPU time for multiple CPUs, the sum of all core time (physical or virtual), or maybe something else.
2) What exactly is an iteration? Not all programs use an iterative search. Also, an iteration in an iterated best-first search is likely something quite different from that in an iterated depth-first search.
3) What exactly is a node? Is it the same as the number of static evaluations? Is it the same as the number of execute/retract pairs? Is it the count of unique positions created?
4) What exactly is the reported analysis draft? Is it the number of completed iterations? Or the number of started iterations? Is it the length of the PV?
Few will agree on answers for all of the above.
I guess the specs can use some tightening up for the SMP case. In a future release I will specify the nps 0 reporting should be the sum of the user CPU time of all threads.
The other details are indeed implementation dependent. You can report what you think is most useful for your implementation. As long as you do it the same way every time, it does not matter.
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: A few points
Wall clock time. Never seen anything different.sje wrote:A few points:
1) What exactly is the interpretation of the result time? It could be wall clock time, CPU time, the sum of all CPU time for multiple CPUs, the sum of all core time (physical or virtual), or maybe something else.
I would guess that 98% of all chess engines use iterative deepening, and I think I can assume everyone knows what an iteration is in that context. It is better to have a notion for something that is used by 98% of all programs, than to have nothing just because 2% are different.sje wrote:2) What exactly is an iteration? Not all programs use an iterative search. Also, an iteration in an iterated best-first search is likely something quite different from that in an iterated depth-first search.
Making a move during search enters a new node of the search tree, regardless whether that node is being evaluated or not, and whether it is unique or has already been searched before.sje wrote:3) What exactly is a node? Is it the same as the number of static evaluations? Is it the same as the number of execute/retract pairs? Is it the count of unique positions created?
I agree, though, that there are probably too many different ways of node counting around. A common definition that is used by *everyone* (i.e., 98%) might certainly help. Some do not count illegal positions, some do not count hash hits as a node, and so on. This has already been discussed a couple of times.
Usually the number of started iterations, since it is reported together with the PV which can pop up at any time during an iteration. Surely it is not the length of the PV, due to things like search extensions, qsearch, or extending PV from hash.sje wrote:4) What exactly is the reported analysis draft? Is it the number of completed iterations? Or the number of started iterations? Is it the length of the PV?
Sven
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Re: Interpretation of a result for time and node count
You'd have a PV, but it would be only one move in length.rbarreira wrote:There is another possible case: the search of another root move failed high in the zero window search, but the search times out when searching it with a full window.
In this case you have a new best move but no PV.