Let's discuss a specific part of your blog; maybe this little drama thread will produce something good after all.
Checking the time and the keyboard (IO operations) is expensive, so here is what I do: I keep a counter in my eval and every 500 evals I check the time and the keyboard. Here are some results from playing with that counter, using a random position searched to 16 plies that always produces the same total node count.
Based on your data, checking IO every 500 evals seems OK (with today's hardware). However, we are really interested in checking IO a certain number of times per second. The beauty of the approach in Maverick (as discussed in the blog) is that it adjusts the node mask so that only a certain number of time checks are made per second.
But it makes sense to tune your engine so that it does not perform too many IO operations.
There's another solution that offers maximum performance: vary the counter value above based on the target time. If your target time is 60 seconds for a move, is going 15ms over really a problem? At 60 seconds a move you can get away with checking once a second with no real issue. Or you can start at one check per second until you are within 5 seconds of the target, then reduce the counter a bit after each check if you really care about hitting the move at 60.000000 seconds.
bob wrote:The best "fix" is to play game/1sec with no inc. Over and over. Occasionally you will lose on time, but it ought to be WAY into the game. Certainly it should last 100 moves if you use 10ms timing accuracy. This is really not that hard to solve correctly so that it works right on ALL GUIs, not just one with an ugly hack that covers up bugs.
Small correction: not 100 moves but 1000/15 ≈ 66.7 moves, since clock() and/or GetTickCount() operate in steps of roughly 15 ms, not 10 ms.
I don't use GetTickCount(). I've always used gettimeofday(), which returns seconds and microseconds in two separate values. Using the basic PC time-of-day clock, with its atrocious 16.667ms accuracy, leaves more than a lot to be desired for fast games.
Rebel wrote:2. Secondly some engines (like mine for instance) collect and store all kind of statistic stuff which causes overhead falling outside the limits of the time control.
If you do some expensive stuff in the time between terminating the search and actually sending the move to the GUI then your engine may occasionally lose on time. Would it help to send the move immediately and do the expensive stuff afterwards, when it is not your engine's turn (i.e. it is either idle or pondering, where in the latter case pondering could also start a little later)?
If that "overhead" you mentioned is performed while the GUI has your engine's clock still running then the "time" command (WB) would help you out, but you would still need to add a reliable estimate of the time needed for running that overhead activity once as a safety buffer in order to avoid losing on time. May I assume that your engine handles the "time" command in WB mode?
Sven
That is a TERRIBLE idea for ponder=off matches, unless you are trying to win in an underhanded way. For example, why not just sit and spin waiting on your opponent? Burn half the CPU cycles, give him the other half, and you get a 2:1 time handicap.
My suggestion is "do whatever you want" but complete EVERYTHING before sending the move to the GUI. If it is necessary to spend 1.5ms after your search completes, just subtract 1.5ms from your target time. In Unix, the IDEAL GUI would probably send a STOP signal to a process once it receives a move from it, and then send a CONT signal before sending it the next move; then the engine could not do any funny stuff. I used to teach the AI class every year, and the class project was an Othello program. At the end of each term, we had a tournament using the same interface I use on our cluster for my testing, and I had to deal with this issue carefully to avoid someone winning by trickery rather than by playing the best moves.
bob wrote:That is a TERRIBLE idea for ponder=off matches
Of course, not suitable for ponder=off single-core matches. That was not what I had in mind.
bob wrote:My suggestion is "do whatever you want" but complete EVERYTHING before sending the move to the GUI. If it is necessary to spend 1.5ms after your search completes, just subtract 1.5ms from your target time.
That was my alternative suggestion, see above.
bob wrote:In Unix, the IDEAL GUI would probably send a STOP signal to a process once it receives a move from it, and then send a CONT signal before sending it the next move, then the engine could not do any funny stuff. I used to teach the AI class every year, and the class project was an othello program. At the end of each term, we had a tournament using the same interface I use on our cluster for my testing. And I had to deal with this issue carefully to avoid someone winning by trickery rather than by playing the best moves.
I don't think that gathering statistics (an activity that does not influence move selection) while not on move is "trickery" as long as you don't steal CPU time from the opponent. You obviously need one full core per engine at least in order to do that.
Rebel wrote:Let's discuss a specific part of your blog, maybe this little drama thread will produce something good after all.
Oh, this thread has already produced something good: Quadrox is fixed, and that was the point of the thread.
Actually, in all seriousness, there has also been some agreement about cleaning up the specs for the protocol definition which is also important to me. (Quadrox is a secondary project - my main project is a new "universal chess" GUI, which hopefully will be ready for release some day soon ...)
bob wrote:In Unix, the IDEAL GUI would probably send a STOP signal to a process once it receives a move from it, and then send a CONT signal before sending it the next move, then the engine could not do any funny stuff.
An interesting idea. Unfortunately it would not work for engines that run through adapters such as Polyglot. The GUI would just stop the adapter, but not its engine child process. Or are there commands in Unix to stop an entire process tree?
If parent and child process are in the same process group (i.e. have the same PG ID) then it is possible to send a signal (like STOP) to all processes of the group with one command.
I am not sure this is useful. From glancing at the documentation it seems that processes are able to change the process group to which they belong. Of course Polyglot could be made to spawn the engine process in the same group (if it doesn't already), and the GUI to change the group of the spawned engine/adapter to a unique one that could be selectively stopped or killed, but a malicious engine could escape to a different process group (if I understand this concept correctly). Or just spawn a process in a different group to consume resources while the opponent is thinking.
Nevertheless, I should dig into this. XBoard now kills engines that do not respond to quit, but I think it would only kill its direct descendant. It would be nice if it also killed all non-malicious indirect offspring.
STOP also seems a very useful way to implement the GUI's Pause feature for engines that do not support the 'pause'/'resume' commands (for those I currently only pause after they give their move), but again only if it reaches beyond the adapter.