SF and SMP --- a blast from the past

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

SF and SMP --- a blast from the past

Post by zullil »

http://talkchess.com/forum/viewtopic.ph ... +stockfish

About the current SF SMP situation---I'm finding it hard to understand the "plan of attack". Probably just me, since I'm not really a chess programmer, just an interested onlooker.

Suppose, for the sake of discussion, we temporarily set aside measures derived from game play. We hear that SF scales badly from 8 to 16 cores. What one numerical statistic best illustrates this? NPS on some bench test? Time-to-depth on some set of positions? Something else? If there is a number that best encapsulates the problem, then perhaps that number can be used as a target for improvement. The hope is that this statistic might serve as a reasonable proxy for Elo, so that testing via game play can wait until the very end of the confirmation process (since it's especially difficult and time-consuming).
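
To make this concrete, here is the sort of crude measurement I have in mind (a rough Python sketch over plain UCI; it assumes an engine binary named ./stockfish, so treat the details as placeholders, not a polished harness):

Code:
# Rough sketch: measure raw NPS scaling over UCI.
# Assumes a UCI engine binary at ./stockfish (adjust as needed).
import subprocess

def nps_for(threads, movetime_ms=10000):
    """Run a fixed-time search from the start position; return the last reported nps."""
    p = subprocess.Popen(["./stockfish"], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, text=True)
    p.stdin.write(f"uci\nsetoption name Threads value {threads}\n"
                  f"ucinewgame\nposition startpos\ngo movetime {movetime_ms}\n")
    p.stdin.flush()
    nps = 0
    for line in p.stdout:
        if " nps " in line:                        # 'info ... nps N ...'
            nps = int(line.split(" nps ")[1].split()[0])
        if line.startswith("bestmove"):
            break
    p.stdin.write("quit\n")
    p.stdin.flush()
    p.wait()
    return nps

base = nps_for(1)
for n in (2, 4, 8, 16):
    print(f"{n:2d} threads: {nps_for(n) / base:.1f}x NPS")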

I have a 16-core Linux workstation with 32GB RAM, on which I can (occasionally at least) run tests. But it's hard to test when one doesn't know what to measure! Any guidance appreciated. Thanks.
BBauer
Posts: 658
Joined: Wed Mar 08, 2006 8:58 pm

Re: SF and SMP --- a blast from the past

Post by BBauer »

Hi Louis,
there is only one real measure:
time to best move.
Forget about artificial concepts like nps or depth.
In game play it is not the engine with the highest nps or the deepest search that wins, but the one that plays the best move in the shortest time.

Perhaps a set of test positions with verified solutions could help.
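
For example, a sketch like this (it assumes the solution move is already given in UCI long algebraic notation; the bm field of an EPD record is SAN and would need converting first):

Code:
# Sketch only: time until the engine settles on a known best move.
# Assumes ./stockfish and a solution move in UCI notation (e.g. "d6d1").
import subprocess, time

def time_to_best_move(fen, solution, max_ms=60000):
    p = subprocess.Popen(["./stockfish"], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, text=True)
    p.stdin.write(f"uci\nposition fen {fen}\ngo movetime {max_ms}\n")
    p.stdin.flush()
    start, found = time.time(), None
    for line in p.stdout:
        if " pv " in line:
            move = line.split(" pv ")[1].split()[0]   # first move of the PV
            if move == solution:
                if found is None:
                    found = time.time() - start       # switched to the solution
            else:
                found = None                          # switched away again
        if line.startswith("bestmove"):
            break
    p.stdin.write("quit\n")
    p.stdin.flush()
    p.wait()
    return found                                      # None = never settled on it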

Kind regards
Bernhard
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: SF and SMP --- a blast from the past

Post by Laskos »

BBauer wrote:Hi Louis,
there is only one real measure:
time to best move.
Forget about artificial concepts like nps or depth.
In game play it is not the engine with the highest nps or the deepest search that wins, but the one that plays the best move in the shortest time.

Perhaps a set of test positions with verified solutions could help.

Kind regards
Bernhard
Some would disagree that tactical suites are suitable. My guess is that they track Elo more closely than time-to-depth does (nps is just useless), but the only sure way is to test in games, and not at hyper-fast or fast time controls.
Robert Houdart wrote:One cannot reliably extrapolate tactical test suite results to Elo strength.
Adding more threads will enlarge the width of the search - great for tactical puzzles but not necessarily for real game play.

Robert
Robert Hyatt wrote:BTW, searching tactical positions is not a valid way of testing parallel search. The key there is that the best move is often ordered later in the list by the very nature of the position (the best move is usually some sort of 'surprise'). This plays right into the hands of a parallel search, which by its very nature tends to do better when move ordering is sub-optimal.
kbhearn
Posts: 411
Joined: Thu Dec 30, 2010 4:48 am

Re: SF and SMP --- a blast from the past

Post by kbhearn »

nps is a starting point (and what people at TCEC, rightly or wrongly, pointed to as SF 'scaling poorly').

Put simply, good nps scaling does not guarantee the algorithm is scaling well, but poor nps scaling does mean the algorithm is scaling poorly: the threads aren't doing more work per node, so fewer nodes per second means less total work done.
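
To put hypothetical numbers on it (made up purely to show the split between hardware scaling and search overhead):

Code:
# Hypothetical numbers only, to separate the two effects:
nps_1, nps_16 = 1.5e6, 18.0e6   # nodes/sec at 1 and 16 threads (made up)
ttd_1, ttd_16 = 600.0, 75.0     # seconds to reach the same depth (made up)

nps_speedup = nps_16 / nps_1                  # 12.0x - hardware/implementation side
ttd_speedup = ttd_1 / ttd_16                  #  8.0x - what actually matters
search_overhead = nps_speedup / ttd_speedup   #  1.5x - extra nodes burned by splitting

print(nps_speedup, ttd_speedup, search_overhead)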
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: SF and SMP --- a blast from the past

Post by bob »

zullil wrote:http://talkchess.com/forum/viewtopic.ph ... +stockfish

About the current SF SMP situation---I'm finding it hard to understand the "plan of attack". Probably just me, since I'm not really a chess programmer, just an interested onlooker.

Suppose, for the sake of discussion, we temporarily set aside measures derived from game play. We hear that SF scales badly from 8 to 16 cores. What one numerical statistic best illustrates this? NPS on some bench test? Time-to-depth on some set of positions? Something else? If there is a number that best encapsulates the problem, then perhaps that number can be used as a target for improvement. The hope is that this statistic might serve as a reasonable proxy for Elo, so that testing via game play can wait until the very end of the confirmation process (since it's especially difficult and time-consuming).

I have a 16-core Linux workstation with 32GB RAM, on which I can (occasionally at least) run tests. But it's hard to test when one doesn't know what to measure! Any guidance appreciated. Thanks.
The first step has to be NPS. If, on 16 cores, NPS is only 10x what it is on 1 core (beware of turbo-boost here; it must be disabled or this test is no good), then you already have a serious performance issue. If you can get reasonably close to 16x the NPS (reasonable might be 14-15x, whatever), then time-to-depth is the next measurement to deal with. These are two different issues: one is more architectural in nature (NPS, hurt by cache thrashing and the like), while the other is more parallel-search related (poor split points and such). It is not an easy problem to deal with.
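
For the turbo-boost caveat, a sanity check like this before each run keeps the numbers honest (the sysfs path assumes Linux's intel_pstate driver; other drivers expose a different knob, e.g. /sys/devices/system/cpu/cpufreq/boost):

Code:
# Sanity check before an NPS run: turbo boost must be off, or the 1-core
# baseline runs at a higher clock and the scaling numbers are garbage.
# Path below is specific to Linux's intel_pstate driver (an assumption).
from pathlib import Path

no_turbo = Path("/sys/devices/system/cpu/intel_pstate/no_turbo")
assert no_turbo.exists() and no_turbo.read_text().strip() == "1", \
    "disable turbo boost (write 1 to no_turbo as root) before measuring NPS"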