I used to call ftime() to get the time during the search. That function is deprecated under Linux, and in the Android NDK it is not available at all. So I switched to the recommended call, clock_gettime(CLOCK_MONOTONIC, &timebuffer).
This change reduced my NPS by 40%, and the time needed to reach a certain depth in a simple test position went up by a factor of 3.5.
It really is this change alone: when I reverted it, things were back to normal. Tested under Win7-x64 with MinGW.
Is this normal? What time function do you recommend? Should I stay with ftime under Windows and Linux? What should I use on Android? Or is this a MinGW problem?
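For reference, a minimal sketch of the call in question wrapped into a millisecond helper. The name now_ms is hypothetical (it does not come from the engine discussed here), and the sketch assumes a POSIX-style toolchain that provides clock_gettime and CLOCK_MONOTONIC.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose clock_gettime under strict -std=c99/c11 */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: milliseconds on a monotonic clock.
 * The start point is unspecified, so only differences are meaningful. */
static int64_t now_ms(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0); /* assumed to succeed for a supported clock id */
    (void)rc;        /* avoids an unused-variable warning with NDEBUG */
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}
```

Because the clock is monotonic, the difference between two readings is not affected by wall-clock adjustments, which is what a search deadline needs.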
clock_gettime extremely slow
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
-
- Posts: 411
- Joined: Thu Dec 30, 2010 4:48 am
Re: clock_gettime extremely slow
How often are you calling the function?
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: clock_gettime extremely slow
It's called in every node, inside the loop that goes over the available moves. Looks like that could be optimised so that the time is checked only before that loop?
But still, why is one time function so extremely different from the other? From what I know, both functions should be about equal in terms of burnt cycles.
-
- Posts: 1494
- Joined: Thu Mar 30, 2006 2:08 pm
Re: clock_gettime extremely slow
Ras wrote:
> It's called in every node, inside the loop that goes over the available moves. Looks like that could be optimised so that the time is checked only before that loop?
> But still, why is one time function so extremely different from the other? From what I know, both functions should be about equal in terms of burnt cycles.
Due to this overhead, most programs only check the time every XXXX nodes. You want to pick an XXXX that takes about 1 millisecond or a bit less to search, since UCI times are all in milliseconds. So if your program does, say, 1 million nodes per second, then polling the time every 1000 nodes should work.
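A minimal sketch of polling the clock only every so many nodes, as described above. All names here (SearchClock, POLL_INTERVAL, time_is_up, now_ms) are hypothetical, and the interval of 1024 is an assumption: a power of two close to the suggested 1000 so that a cheap bit mask can replace a modulo.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose clock_gettime under strict -std=c99/c11 */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define POLL_INTERVAL 1024u /* assumed: about 1 ms of search at ~1M nodes/s */

typedef struct {
    uint64_t nodes;       /* nodes visited so far */
    int64_t  deadline_ms; /* absolute deadline on the monotonic clock */
    bool     stop;        /* latched once the deadline has passed */
} SearchClock;

/* Hypothetical helper: milliseconds on a monotonic clock. */
static int64_t now_ms(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Call once per node. The clock is read only on every POLL_INTERVAL-th
 * call; all other calls cost an increment, a mask and a compare. */
static bool time_is_up(SearchClock *sc)
{
    if ((++sc->nodes & (POLL_INTERVAL - 1)) == 0 &&
        now_ms() >= sc->deadline_ms)
        sc->stop = true;
    return sc->stop;
}
```

With a fixed interval the deadline can be overshot by up to POLL_INTERVAL nodes, so the interval should be chosen for the slowest hardware the engine is expected to run on.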
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: clock_gettime extremely slow
kbhearn wrote:
> How often are you calling the function?
Just changed the code a bit and moved the time call outside the loop. It will still be called at the next depth level anyway, but no longer in the leaves.
Oh, and I had even been calling it in QS - I forgot to take that out when porting the engine back from the bare-metal microcontroller, where the call just reads an integer.
Now things are friendlier: clock_gettime slows down the NPS by only 6%, and the time to depth also rises by only 6%.
Next idea is Mark's suggestion to only make the actual call every so many nodes.
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: clock_gettime extremely slow
mjlef wrote:
> So if your program does, say, 1 million nodes per second, then polling the time every 1000 nodes should work.
Good idea - it would have to be calibrated to the NPS (which depends on position and hardware), but that data is available anyway.
-
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: clock_gettime extremely slow
Andscacs calculates nodes per ms during the search and, depending on the remaining time, decides whether to poll the time only once every 100 ms or more often.
Daniel José - http://www.andscacs.com
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: clock_gettime extremely slow
If you want to do it in an advanced way, you could just schedule an event at the time you want to alter the behavior (e.g. stop unconditionally, stop when the root score is above a certain level, or think unconditionally), and let the handler routine set the appropriate flags that the search tests to implement that behavior.
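The post does not name a mechanism, so the following is only one possible reading of it: a POSIX sketch that arms a one-shot interval timer and lets a SIGALRM handler set a flag the search can test. The names stop_search, on_alarm and schedule_stop are hypothetical, and setitimer/sigaction are POSIX calls, so a Windows build would need a different mechanism.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose sigaction under strict -std=c99/c11 */
#endif
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Flag tested by the search; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t stop_search = 0;

/* Handler: does nothing except set the flag. */
static void on_alarm(int signo)
{
    (void)signo;
    stop_search = 1;
}

/* Arms a one-shot timer that fires after ms milliseconds (ms must be > 0,
 * since a zero value disarms the timer). Returns 0 on success, -1 on error. */
static int schedule_stop(long ms)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    struct itimerval it;
    memset(&it, 0, sizeof it); /* it_interval stays zero: fire only once */
    it.it_value.tv_sec  = ms / 1000;
    it.it_value.tv_usec = (ms % 1000) * 1000;

    stop_search = 0;
    return setitimer(ITIMER_REAL, &it, NULL);
}
```

The search then only needs to test stop_search at each node, which is a plain memory read with no system call. Other behaviours mentioned above (for example stopping only when the root score is above a certain level) would use additional flags set the same way.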
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: clock_gettime extremely slow
mjlef wrote:
> So if your program does, say, 1 million nodes per second, then polling the time every 1000 nodes should work.
Just implemented that trick. I initialise NPS_1ms to 0 and always check whether more nodes than that have been searched since the last time check. So during the first one or two ms the value may be too low, but since I'm calibrating it against absolute nodes vs. absolute time, it quickly becomes accurate.
No difference anymore between the two time functions, and I get a nice 10% speedup compared to the version where I moved the time check outside the node loop.
Thanks, folks!
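A rough sketch of how such a self-calibrating check might look. This is a guess based on the description above, not the actual engine code; the names TimeCheck, poll_due and on_clock_read are hypothetical, and the clock reading is passed in as a parameter so the logic does not depend on any particular time function.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t nodes;        /* total nodes searched so far */
    uint64_t last_poll;    /* node count at the previous clock read */
    uint64_t nodes_per_ms; /* calibrated threshold, starts at 0 */
    int64_t  start_ms;     /* clock reading when the search started */
    int64_t  deadline_ms;  /* clock reading at which to stop */
} TimeCheck;

/* Call once per node. Returns true when more nodes than the calibrated
 * per-millisecond count have passed since the last clock read. */
static bool poll_due(TimeCheck *tc)
{
    tc->nodes++;
    return tc->nodes - tc->last_poll > tc->nodes_per_ms;
}

/* Call with a fresh clock reading whenever poll_due() returned true.
 * Recalibrates against absolute nodes vs. absolute elapsed time and
 * reports whether the deadline has been reached. */
static bool on_clock_read(TimeCheck *tc, int64_t now)
{
    int64_t elapsed = now - tc->start_ms;

    tc->last_poll = tc->nodes;
    if (elapsed > 0)
        tc->nodes_per_ms = tc->nodes / (uint64_t)elapsed;
    return now >= tc->deadline_ms;
}
```

In the search this would be used as "if (poll_due(&tc) && on_clock_read(&tc, current_time_in_ms)) stop", where the current time comes from whichever clock function the platform offers. While the threshold is still 0, the clock is read at every node, which matches the remark that the value is too low during the first one or two ms.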
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: clock_gettime extremely slow
hgm wrote:
> If you want to do it in an advanced way
Sounds like a lot of platform-dependent stuff. However, the code that interfaces threaded stdin scanning with the search is portable - if only because it's a total hack on every platform.