Additional UCI output registering as massive elo losses

AlvaroBegue · Post by **AlvaroBegue** » Wed Apr 11, 2018 10:51 pm

RuyDos prints an `info' line every time alpha is updated (i.e., when the exploration of the first move finishes at any depth, and when a new best move is found), plus another one at the end of each depth. That's about the most output you can have. And yet, I don't get any Elo gain by suppressing the output altogether.

Why is my experience so different from those described by others in this thread? Does anyone have a theory to explain that?

AndrewGrant · Post by **AndrewGrant** » Wed Apr 11, 2018 10:56 pm

What time control are you testing this on?

Are you flushing stdout with each line, or just with bestmove?

How much are you really printing? I'm not familiar with your engine. Shallow or deep searching?

Are you doing anything else when reporting? For example, I look at the first 1000 table entries to get an estimator for hashfull.

AlvaroBegue · Post by **AlvaroBegue** » Wed Apr 11, 2018 11:12 pm

AndrewGrant wrote:What time control are you testing this on?

Are you flushing stdout with each line, or just with bestmove?

How much are you really printing? I'm not familiar with your engine. Shallow or deep searching?

Are you doing anything else when reporting? For example, I look at the first 1000 table entries to get an estimator for hashfull.

I am testing at 0:10+0.1s/move. I am flushing stdout after every line. I am printing something fairly typical, like this (from the initial position):

Code: Select all

info depth 1 nodes 1 score cp -17 time 0 nps 25698 pv f2f4
info depth 1 nodes 3 score cp 3 time 0 nps 39178 pv b1a3
info depth 1 nodes 5 score cp 28 time 0 nps 58824 pv b1c3
info depth 1 nodes 21 score cp 30 time 0 nps 207758 pv e2e3
info depth 1 nodes 23 score cp 30 time 0 nps 217231 pv e2e3
info depth 2 nodes 52 score cp 11 time 0 nps 403313 pv e2e3 g8f6
info depth 2 nodes 88 score cp 11 time 0 nps 575212 pv e2e3 g8f6
info depth 3 nodes 202 score cp 27 time 0 nps 982229 pv e2e3 e7e6 b1c3
info depth 3 nodes 236 score cp 27 time 0 nps 1010499 pv e2e3 e7e6 b1c3
info depth 4 nodes 454 score cp 8 time 0 nps 1359094 pv e2e3 b8c6 b1c3
info depth 4 nodes 617 score cp 11 time 0 nps 1525499 pv b1c3 b8c6 e2e3 e7e6
info depth 4 nodes 1132 score cp 11 time 1 nps 1867999 pv b1c3 b8c6 e2e3 e7e6
info depth 5 nodes 1644 score cp 19 time 1 nps 1995351 pv b1c3 g8f6 g1f3
info depth 5 nodes 1964 score cp 19 time 1 nps 2070291 pv b1c3 g8f6 g1f3
info depth 6 nodes 2941 score cp 9 time 1 nps 2212104 pv b1c3 b8c6 g1f3
info depth 6 nodes 4080 score cp 10 time 2 nps 2294113 pv g1f3 g8f6 d2d3 d7d6 b1c3 b8c6
info depth 6 nodes 5245 score cp 10 time 2 nps 2337148 pv g1f3 g8f6 d2d3 d7d6 b1c3 b8c6
info depth 7 nodes 6603 score cp 15 time 3 nps 2374647 pv g1f3 g8f6 d2d3 d7d6 b1c3 b8c6 h2h3
info depth 7 nodes 8915 score cp 15 time 4 nps 2404687 pv g1f3 g8f6 d2d3 d7d6 b1c3 b8c6 h2h3
info depth 8 nodes 11704 score cp 11 time 5 nps 2428191 pv g1f3 g8f6 d2d3 d7d6 b1c3 b8c6 h2h3 a7a6
info depth 8 nodes 19112 score cp 11 time 8 nps 2488651 pv g1f3 g8f6 d2d3 d7d6 b1c3 b8c6 h2h3 a7a6
info depth 9 nodes 24585 score cp 14 time 10 nps 2502906 pv g1f3 g8f6 d2d3 d7d6 b1c3 b8c6 h2h3 h7h6 c1e3
info depth 9 nodes 30319 score cp 14 time 12 nps 2500131 pv g1f3 g8f6 d2d3 d7d6 b1c3 b8c6 h2h3 h7h6 c1e3
info depth 10 nodes 47808 score cp 11 time 18 nps 2613366 pv g1f3 g8f6 b1c3 e7e6 h2h3
info depth 10 nodes 64806 score cp 11 time 22 nps 2913943 pv g1f3 g8f6 b1c3 e7e6 h2h3
info depth 11 nodes 90209 score cp 11 time 28 nps 3237440 pv g1f3 g8f6 b1c3 b8c6 e2e3 h7h6 a2a3 e7e6 f1d3 a7a6 e1g1
info depth 11 nodes 116000 score cp 11 time 34 nps 3428742 pv g1f3 g8f6 b1c3 b8c6 e2e3 h7h6 a2a3 e7e6 f1d3 a7a6 e1g1
info depth 12 nodes 182383 score cp 10 time 49 nps 3692791 pv g1f3 g8f6 d2d3 d7d6
info depth 12 nodes 267211 score cp 10 time 70 nps 3832384 pv g1f3 g8f6 d2d3 d7d6
info depth 13 nodes 402637 score cp 9 time 102 nps 3958661 pv g1f3 g8f6 d2d3 b8c6 h2h3 a7a6 c2c4
info depth 13 nodes 551486 score cp 9 time 137 nps 4019767 pv g1f3 g8f6 d2d3 b8c6 h2h3 a7a6 c2c4
info depth 14 nodes 892873 score cp 8 time 214 nps 4164650 pv g1f3 g8f6 b1c3 a7a6
info depth 14 nodes 1253817 score cp 8 time 301 nps 4159695 pv g1f3 g8f6 b1c3 a7a6

I am not sure what the boundary between shallow and deep search is, but the output above should give you an idea of how deep it searches.

I don't do anything expensive when printing the lines. In particular, I am not printing statistics about TT saturation.

AndrewGrant · Post by **AndrewGrant** » Wed Apr 11, 2018 11:20 pm

That is a fair bit of output. If your engine is stable at very low time control 1+.01 or something... you could play games at that TC, which should really highlight any system call overhead

I'll not that for a depth N search, I always print at least a depth N PV... one move at a time. Peraps that is were the time difference is, and why you see no (or less, assuming you;ve not played tens of thousands of games) impact.

I don't think it is anything to worry about. Consensus here is that UCI reporting time is of little impact at "real" time controls.

If it would make you feel better or be of interest to you, I can test Ethereal with and without Pv printing, and no printing at all.

AlvaroBegue · Post by **AlvaroBegue** » Wed Apr 11, 2018 11:45 pm

AndrewGrant wrote:That is a fair bit of output. If your engine is stable at very low time control 1+.01 or something... you could play games at that TC, which should really highlight any system call overhead

I'll not that for a depth N search, I always print at least a depth N PV... one move at a time. Peraps that is were the time difference is, and why you see no (or less, assuming you;ve not played tens of thousands of games) impact.

I don't think it is anything to worry about. Consensus here is that UCI reporting time is of little impact at "real" time controls.

If it would make you feel better or be of interest to you, I can test Ethereal with and without Pv printing, and no printing at all.

I only tested with a few thousand games, but that already ruled out any "massive" Elo improvement that could be expected by the title of this thread. I will try with even faster time controls, which my engine should be able to handle.

I agree with the consensus, but I care about these crazy-fast time controls because I use them [a lot] in testing.

Ras · Post by **Ras** » Wed Apr 11, 2018 11:58 pm

At least with very short time controls, I'd only print out the PV after each depth iteration, not inbetween. For other outputs like nodes, hash usage, current move etc, I do that once per second, but not during the first second, just as the UCI spec recommends. Which means not at all with move time below 1 second.

Already in the current release, I use direct fwrite/fread to and from stdio.

Apart from that, my current dev version gets optimised for TC as fast as 1 second per game without increments. I've thrown out every sprintf call in the critical path and am doing the necessary conversions myself. Plus that the printf family may even call malloc behind the scenes, which is nasty.

DustyMonkey · Post by **DustyMonkey** » Thu Apr 12, 2018 2:35 am

AndrewGrant wrote: For some reason, every single on of these attempts is a massive elo loser in a quick SPRT test.

Honestly it also depends strongly on the OS and the GUI interface you are using.

Stockfish for instance (at least as of last year) outputted significantly more junk (*) than other engines with a wide MultiPV, and this is where I noticed the severe problem with it.

Between the engine and the GUI is a limited I/O buffer and an engine thread will stall out (blocked writes) if this buffer fills up. Now some GUI's do polling to process the output while others are event driven, and especially in the later case the OS can have a significant impact especially if mixing cpu modes (64-bit engine, 32-bit gui, through windows side-by-side translation layer, the message passing driving the gui to drain the buffer has a significant latency)

(*) for some crazy reason SF re-outputted the entire multipv list again... and again... and again... and again.... and again... and again... and again... once for each pv it would output all N pv's on each and every iteration. O(N^2) output instead of O(n) or approximately O(n).

I tried to explain the issue to some SF devs at TCEC at one point, and while they had an explanation for why SF does it this way, they did not have a justification. Neither Komodo nor Houdini outputted even 10% as much as SF per iteration with a wide MultiPV. Both K9.2 and H4 were smashing SF8 (!!!) and AsmFish when all had a wide multipv.. and I mean real proper smashing... might as well have been rybka.

Patrice Duhamel · Post by **Patrice Duhamel** » Thu Apr 12, 2018 8:53 am

I don't know how much it cost, but in your uciReport function for the pv you call printf for each move, you could generate the pv string yourself without using printf.

Joerg Oster · Post by **Joerg Oster** » Thu Apr 12, 2018 10:18 am

DustyMonkey wrote:
AndrewGrant wrote: For some reason, every single on of these attempts is a massive elo loser in a quick SPRT test.
Honestly it also depends strongly on the OS and the GUI interface you are using.

Stockfish for instance (at least as of last year) outputted significantly more junk (*) than other engines with a wide MultiPV, and this is where I noticed the severe problem with it.

Between the engine and the GUI is a limited I/O buffer and an engine thread will stall out (blocked writes) if this buffer fills up. Now some GUI's do polling to process the output while others are event driven, and especially in the later case the OS can have a significant impact especially if mixing cpu modes (64-bit engine, 32-bit gui, through windows side-by-side translation layer, the message passing driving the gui to drain the buffer has a significant latency)

(*) for some crazy reason SF re-outputted the entire multipv list again... and again... and again... and again.... and again... and again... and again... once for each pv it would output all N pv's on each and every iteration. O(N^2) output instead of O(n) or approximately O(n).

I tried to explain the issue to some SF devs at TCEC at one point, and while they had an explanation for why SF does it this way, they did not have a justification. Neither Komodo nor Houdini outputted even 10% as much as SF per iteration with a wide MultiPV. Both K9.2 and H4 were smashing SF8 (!!!) and AsmFish when all had a wide multipv.. and I mean real proper smashing... might as well have been rybka.

From the UCI specifications:

* multipv <num>
this for the multi pv mode.
for the best move/pv add "multipv 1" in the string when you send the pv.
in k-best mode always send all k variants in k strings together.

mjlef · Post by **mjlef** » Fri Apr 13, 2018 12:06 am

hgm wrote:
Sven wrote:I think flushing output is only important for those lines that are relevant for the protocol. E.g. if you do not flush the "bestmove" output then you might lose on time. But flushing PVs and other info lines that are printed during search should not be necessary, maybe with the exception of the final PV (if the engine explicitly prints one at the very end of the search).
It is quite annoying when the engine thinks (perhaps for a few minutes) an you don't get to see anything at all, an then, when it finally moves, the PVs of all iterations appear at once. For interactive analysis this would even be intolerable.

In Komodo, I output an update to the node count every 3 seconds. This at lest updates that value and tell use humans it is still thinking.

Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses

Re: Additional UCI output registering as massive elo losses