Lazy SMP

Werewolf · Post by **Werewolf** » Sat Mar 26, 2022 1:22 pm

Forgive me for asking ancient questions, but they are fairly crucial. In all instances I'm referring to Stockfish's implementation of Lazy SMP.
Could you please have a go at the following "multiple choice"?

1. Adding cores in a CPU to an engine ALWAYS increases search speed, providing that there's no loss in clock speed. This applies even if the new cores are slower than the old ones.
a) TRUE
b) FALSE
c) UNCLEAR

The newer CPUs with "efficiency" cores are in mind here.

2. Stockfish runs on two different PCs. In each case it searches at 5 MN/s. On the first PC it's running on one core, on the second on 4 cores.
a) Stockfish's search is roughly as fast on both machines
b) Stockfish's search is much faster on the first PC

3. Jack has a 16 core PC. He runs Stockfish on 8 cores and then later that day on all 16 cores. He notices an improvement in both nps and time to depth. Jack is trying to decide which one is more indicative of the speed-up.
a) Time to depth
b) NPS
c) They both need to be considered.

4. (Related to Q3) As thread count increases Stockfish's search "thickens".
a) TRUE
b) FALSE

Ovyron · Post by **Ovyron** » Wed Jun 01, 2022 12:49 pm

Werewolf wrote: ↑Sat Mar 26, 2022 1:22 pm In all instances I'm referring to Stockfish's implementation of Lazy SMP.

Time to depth is everything, so it'll trump all even if nodes searched are lower. For multicore CPUs you'll have cores searching stuff that turned out to be irrelevant, so a single core with the same relative speed is going to have lower time to depth because of the time saved not searching those irrelevant lines.

Nodes per second don't tell you anything, as most of them could have been spent on nodes that had no effect. Say 80% of search time is spent investigating an irrelevant move, then the best move fails high and the remaining 20% is spent making sure it holds: 80% of all nodes searched went to waste. Getting to the depth where that happens ASAP is crucial, as it minimizes the nodes wasted.
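The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration: the function names and all numbers are hypothetical, not measurements from Stockfish or any other engine. It shows why a raw nps figure and a time-to-depth figure can give different speed-up factors for the same pair of runs.

```python
def useful_nps(nps: float, wasted_fraction: float) -> float:
    """Nodes per second that actually contributed to the final result."""
    if not 0.0 <= wasted_fraction <= 1.0:
        raise ValueError("wasted_fraction must be between 0 and 1")
    return nps * (1.0 - wasted_fraction)


def speedups(t1: float, tn: float, nps1: float, npsn: float) -> tuple[float, float]:
    """Return (time-to-depth speed-up, nps speed-up) of run N over run 1."""
    return t1 / tn, npsn / nps1


# Hypothetical numbers: 5 MN/s raw speed with 80% of nodes wasted.
effective = useful_nps(5_000_000, 0.8)  # roughly 1 MN/s of useful work

# Hypothetical runs: 40 s vs 16 s to the same depth, 1.25 MN/s vs 5 MN/s.
ttd, nps = speedups(t1=40.0, tn=16.0, nps1=1_250_000, npsn=5_000_000)

print(f"useful nps: {effective:,.0f}")
print(f"time-to-depth speed-up: {ttd:.2f}x, nps speed-up: {nps:.2f}x")
```

With these made-up inputs the nps speed-up is larger than the time-to-depth speed-up, which is the gap the post is describing.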

Uri Blass · Post by **Uri Blass** » Wed Jun 01, 2022 1:29 pm

I disagree that time to depth is everything. I have read that more cores at fixed depth are better because the engine does less pruning.

smatovic · Post by **smatovic** » Wed Jun 01, 2022 2:59 pm

...I am not into SF dev, but as far as I understand it, they use an 'unsound' Lazy SMP approach, one which you cannot work out with pen and paper, compared to something like YBWC or ABDADA, hence time to depth is the wrong metric here, Elo is the right one. Bob used to say they just widen the search, but IMO it is not that simple. I guess the parallel search is intertwined with the selective search heuristics/eval, so funny effects might occur in SMP if things change. As long as there is an effective branching factor of ~2 there must IMO be something you can parallelize, and others seem to see an upper limit of ~128 cores for current AB-NNUE engines. It would be fun if some SF dev could write a paper on how SF Lazy SMP actually works, how it gains Elo without widening the search and lowering time to depth, or the like. My speculation: because SF devs cannot work out SF Lazy SMP with pen and paper, no one can answer your questions until they run specific tests for your given use cases.

--
Srdja

Modern Times · Post by **Modern Times** » Wed Jun 01, 2022 3:02 pm

smatovic wrote: ↑Wed Jun 01, 2022 2:59 pm ..... hence time to depth is the wrong metric here, Elo is the right one. ....
Srdja

Yes, at the end of the day Elo is the real-world measure of the effectiveness of an SMP implementation in chess, if not the technical one.
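As a rough illustration of measuring an SMP gain in Elo, the sketch below converts a match result into an Elo difference using the standard logistic Elo formula. The function name and the example scores are hypothetical. A real test would also need error bars, which this sketch leaves out.

```python
import math


def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match score (logistic Elo model)."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical match: multi-threaded build vs single-threaded build.
print(f"{elo_diff(wins=300, draws=600, losses=100):+.1f} Elo")
```

An even score gives a difference of zero, and a 75% score corresponds to roughly +191 Elo under this model.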

Ovyron · Post by **Ovyron** » Thu Jun 02, 2022 11:11 am

smatovic wrote: ↑Wed Jun 01, 2022 2:59 pm hence time to depth is the wrong metric here, Elo is the right one

And my claim is that the shorter the time to depth, the more Elo, as it leaves more time on the clock, so on other moves this engine will outsearch the one with the longer time to depth.

I would like to see an example of an engine reaching lower depth and beating the higher-depth one. All things being equal, more depth = more Elo.

Ronald · Post by **Ronald** » Thu Jun 02, 2022 12:56 pm

In current chess engines depth is a relative concept, with all the pruning and extensions going on. With Lazy SMP the depth that is reported by the engine is the depth of only 1 search thread (usually the main thread, or a helper thread with a higher depth/score), so this makes engine depth even more obscure.

I'm pretty sure that if you run a match on fixed depth with a single thread version against a multiple thread version with a lesser fixed depth, then up to a certain difference in depth/number of threads the multi threaded version will be better. When doing such an experiment, you have to make sure that the engine used always stops the helper threads at the fixed depth. Usually the search is only stopped when the main thread reaches the fixed depth, so other threads may have a deeper search depth than reported.
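A minimal sketch of how each side of such a fixed-depth experiment could be driven over UCI is shown below. The helper `fixed_depth_commands` is hypothetical. The strings it builds are standard UCI commands, and `Threads` is the option name Stockfish uses. The sketch only builds the command text and does not launch an engine, and it does nothing about the helper-thread caveat above, which depends on the engine.

```python
def fixed_depth_commands(threads: int, depth: int, moves: list[str]) -> list[str]:
    """Build the UCI commands for one fixed-depth search from the start position."""
    if threads < 1 or depth < 1:
        raise ValueError("threads and depth must be positive")
    position = "position startpos"
    if moves:
        # Moves are given in UCI long algebraic notation, e.g. "e2e4".
        position += " moves " + " ".join(moves)
    return [
        f"setoption name Threads value {threads}",
        position,
        f"go depth {depth}",
    ]


# Hypothetical setup: depth 12 on 1 thread vs depth 11 on 4 threads.
single = fixed_depth_commands(threads=1, depth=12, moves=["e2e4", "e7e5"])
multi = fixed_depth_commands(threads=4, depth=11, moves=["e2e4", "e7e5"])
for line in single + multi:
    print(line)
```

A match runner would send these lines to each engine process in turn and read back the `bestmove` reply.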

Of course it's safe to assume that "all things being equal" more depth usually means more Elo (except for mate in X positions...)

Sopel · Post by **Sopel** » Thu Jun 02, 2022 12:56 pm

Ovyron wrote: ↑Thu Jun 02, 2022 11:11 am
smatovic wrote: ↑Wed Jun 01, 2022 2:59 pm hence time to depth is the wrong metric here, Elo is the right one
And my claim is that the shorter the time to depth, the more Elo, as it leaves more time on the clock, so on other moves this engine will outsearch the one with the longer time to depth.

I would like to see an example of an engine reaching lower depth and beating the higher-depth one. All things being equal, more depth = more Elo.

Your understanding of modern chess engines is pretty inexistent if depth is the metric you use

Ovyron · Post by **Ovyron** » Fri Jun 03, 2022 3:44 pm

Ronald wrote: ↑Thu Jun 02, 2022 12:56 pm I'm pretty sure that if you run a match on fixed depth with a single thread version against a multiple thread version with a lesser fixed depth, then up to a certain difference in depth/number of threads the multi threaded version will be better.

I'd love to see this. I've never seen an engine with lower depth beat one with a higher one (which in this case would be the same engine under different conditions).

It could probably be run quickly (say, depth 11 multi-threaded beating depth 12 single-threaded).

Ovyron · Post by **Ovyron** » Fri Jun 03, 2022 3:46 pm

Sopel wrote: ↑Thu Jun 02, 2022 12:56 pm Your understanding of modern chess engines is pretty inexistent if depth is the metric you use

This isn't about understanding but about practice: nobody has tested for this case (where lower depth wins).
