SF and SMP --- a blast from the past

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

SF and SMP --- a blast from the past

Post by zullil »

http://talkchess.com/forum/viewtopic.ph ... +stockfish

About the current SF SMP situation---I'm finding it hard to understand the "plan of attack". Probably just me, since I'm not really a chess programmer, just an interested onlooker.

Suppose, for the sake of discussion, we temporarily set aside measures derived from game play. We hear that SF scales badly from 8 to 16 cores. What one numerical statistic best illustrates this? NPS on some bench test? Time-to-depth on some set of positions? Something else? If there is a number that best encapsulates the problem, then perhaps that number can be used as a target for improvement. The hope is that this statistic might serve as a reasonable proxy for Elo, so that testing via game play can wait until the very end of the confirmation process (since it's especially difficult and time-consuming).
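
To make this concrete, here is the sort of crude measurement I have in mind (a rough Python sketch over plain UCI; it assumes an engine binary named ./stockfish, so treat the details as placeholders, not a polished harness):

Code:
# Rough sketch: measure raw NPS scaling over UCI.
# Assumes a UCI engine binary at ./stockfish (adjust as needed).
import subprocess

def nps_for(threads, movetime_ms=10000):
    """Run a fixed-time search from the start position; return the last reported nps."""
    p = subprocess.Popen(["./stockfish"], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, text=True)
    p.stdin.write(f"uci\nsetoption name Threads value {threads}\n"
                  f"ucinewgame\nposition startpos\ngo movetime {movetime_ms}\n")
    p.stdin.flush()
    nps = 0
    for line in p.stdout:
        if " nps " in line:                        # 'info ... nps N ...'
            nps = int(line.split(" nps ")[1].split()[0])
        if line.startswith("bestmove"):
            break
    p.stdin.write("quit\n")
    p.stdin.flush()
    p.wait()
    return nps

base = nps_for(1)
for n in (2, 4, 8, 16):
    print(f"{n:2d} threads: {nps_for(n) / base:.1f}x NPS")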

I have a 16-core Linux workstation with 32GB RAM, on which I can (occasionally at least) run tests. But it's hard to test when one doesn't know what to measure! Any guidance appreciated. Thanks.
BBauer
Posts: 658
Joined: Wed Mar 08, 2006 8:58 pm

Re: SF and SMP --- a blast from the past

Post by BBauer »

Hi Louis,
there is only one real measure:
time to best move.
Forget about artificial concepts like nps or depth.
In game play it is not the engine with the highest nps or the deepest search that wins, but the one that plays the best move in the shortest time.

Perhaps a set of test positions with verified solutions could help.
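
For example, a sketch like this (it assumes the solution move is already given in UCI long algebraic notation; the bm field of an EPD record is SAN and would need converting first):

Code:
# Sketch only: time until the engine settles on a known best move.
# Assumes ./stockfish and a solution move in UCI notation (e.g. "d6d1").
import subprocess, time

def time_to_best_move(fen, solution, max_ms=60000):
    p = subprocess.Popen(["./stockfish"], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, text=True)
    p.stdin.write(f"uci\nposition fen {fen}\ngo movetime {max_ms}\n")
    p.stdin.flush()
    start, found = time.time(), None
    for line in p.stdout:
        if " pv " in line:
            move = line.split(" pv ")[1].split()[0]   # first move of the PV
            if move == solution:
                if found is None:
                    found = time.time() - start       # switched to the solution
            else:
                found = None                          # switched away again
        if line.startswith("bestmove"):
            break
    p.stdin.write("quit\n")
    p.stdin.flush()
    p.wait()
    return found                                      # None = never settled on it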

Kind regards
Bernhard
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: SF and SMP --- a blast from the past

Post by Laskos »

BBauer wrote:Hi Louis,
there is only one real measure:
time to best move.
Forget about artificial concepts like nps or depth.
In game play it is not the engine with the highest nps or the deepest search that wins, but the one that plays the best move in the shortest time.

Perhaps a set of test positions with verified solutions could help.

Kind regards
Bernhard
Some would disagree that tactical suites are suitable. My guess is that they track Elo more closely than time-to-depth does (nps is just useless), but the only sure way is to test in games, and not at hyper-fast or fast time controls.
Robert Houdart wrote:One cannot reliably extrapolate tactical test suite results to Elo strength.
Adding more threads will enlarge the width of the search - great for tactical puzzles but not necessarily for real game play.

Robert
Robert Hyatt wrote:BTW, searching tactical positions is not a valid way of testing parallel search. The key there is that the best move is often ordered later in the list by the very nature of the position (the best move is usually some sort of 'surprise'). This plays right into the hands of a parallel search, which by its very nature tends to do better when move ordering is sub-optimal.
kbhearn
Posts: 411
Joined: Thu Dec 30, 2010 4:48 am

Re: SF and SMP --- a blast from the past

Post by kbhearn »

nps is a starting point (and what people at TCEC, rightly or wrongly, pointed to as SF 'scaling poorly').

Put simply, good nps scaling does not guarantee the algorithm is scaling well, but poor nps scaling does mean the algorithm is scaling poorly: the threads aren't doing more work per node, so fewer nodes per second means less total work done.
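
To put hypothetical numbers on it (made up purely to show the split between hardware scaling and search overhead):

Code:
# Hypothetical numbers only, to separate the two effects:
nps_1, nps_16 = 1.5e6, 18.0e6   # nodes/sec at 1 and 16 threads (made up)
ttd_1, ttd_16 = 600.0, 75.0     # seconds to reach the same depth (made up)

nps_speedup = nps_16 / nps_1                  # 12.0x - hardware/implementation side
ttd_speedup = ttd_1 / ttd_16                  #  8.0x - what actually matters
search_overhead = nps_speedup / ttd_speedup   #  1.5x - extra nodes burned by splitting

print(nps_speedup, ttd_speedup, search_overhead)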
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: SF and SMP --- a blast from the past

Post by bob »

zullil wrote:http://talkchess.com/forum/viewtopic.ph ... +stockfish

About the current SF SMP situation---I'm finding it hard to understand the "plan of attack". Probably just me, since I'm not really a chess programmer, just an interested onlooker.

Suppose, for the sake of discussion, we temporarily set aside measures derived from game play. We hear that SF scales badly from 8 to 16 cores. What one numerical statistic best illustrates this? NPS on some bench test? Time-to-depth on some set of positions? Something else? If there is a number that best encapsulates the problem, then perhaps that number can be used as a target for improvement. The hope is that this statistic might serve as a reasonable proxy for Elo, so that testing via game play can wait until the very end of the confirmation process (since it's especially difficult and time-consuming).

I have a 16-core Linux workstation with 32GB RAM, on which I can (occasionally at least) run tests. But it's hard to test when one doesn't know what to measure! Any guidance appreciated. Thanks.
The first step has to be NPS. If, on 16 cores, NPS is only 10x what it is on 1 core (beware of turbo-boost here; it must be disabled or this test is no good), then you already have a serious performance issue. If you can get reasonably close to 16x the NPS (reasonable might be 14-15x, whatever), then time-to-depth is the next measurement to deal with. These are two different issues: one is more architectural in nature (NPS, hurt by cache thrashing and the like), while the other is more parallel-search related (poor split points and such). It is not an easy problem to deal with.
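
For the turbo-boost caveat, a sanity check like this before each run keeps the numbers honest (the sysfs path assumes Linux's intel_pstate driver; other drivers expose a different knob, e.g. /sys/devices/system/cpu/cpufreq/boost):

Code:
# Sanity check before an NPS run: turbo boost must be off, or the 1-core
# baseline runs at a higher clock and the scaling numbers are garbage.
# Path below is specific to Linux's intel_pstate driver (an assumption).
from pathlib import Path

no_turbo = Path("/sys/devices/system/cpu/intel_pstate/no_turbo")
assert no_turbo.exists() and no_turbo.read_text().strip() == "1", \
    "disable turbo boost (write 1 to no_turbo as root) before measuring NPS"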