Strategies for finding slowdowns in lazy SMP

flok
Posts: 145
Joined: Tue Jul 03, 2018 8:19 am
Full name: Folkert van Heusden

Post by flok » Wed Jun 05, 2019 8:29 am

Hi Dann,
Dann Corbit wrote:
Wed Jun 05, 2019 8:12 am
That graph showed the nps for 1 thread.
This new graph shows the average nps for all threads:
Something is very wrong with the calculation.
The aggregate NPS is the sum of the NPS for all threads.
How can it be less than the NPS for one thread?
In that graph it is not the aggregate, it is the average :D

Here's a combined graph of the average and the sum:

[Image: combined graph of the average and the summed nps]
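A minimal sketch of the difference between the two curves. The node counts are made-up numbers for illustration, not measurements from this thread, and `nps_summary` is a hypothetical helper name. It shows how the average nps per thread can fall below the single-thread figure while the sum still grows:

```python
def nps_summary(nodes_per_thread, elapsed_seconds):
    """Return (summed nps, average nps per thread) for one search run."""
    per_thread_nps = [nodes / elapsed_seconds for nodes in nodes_per_thread]
    total = sum(per_thread_nps)
    return total, total / len(per_thread_nps)


# Hypothetical node counts for a 1-second search (assumed values).
one_thread = nps_summary([1_000_000], 1.0)
four_threads = nps_summary([900_000, 850_000, 800_000, 750_000], 1.0)

print("1 thread :", one_thread)
print("4 threads:", four_threads)
```

With these assumed numbers the average per thread is lower with 4 threads than with 1, yet the sum is more than three times the single-thread value, so both graphs can be correct at the same time.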
www.vanheusden.com: Micah / Embla / PuppetMaster / DeepBrutePos / Pos / Feeks

smatovic
Posts: 716
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Post by smatovic » Wed Jun 05, 2019 8:44 am

flok wrote:
Tue Jun 04, 2019 7:06 pm
Now my question is: what are strategies for finding what causes this slowdown?
- implement a benchsmp command to reproduce results quickly on the command line
- as always in engine debugging, turn every extension off, bench only with a basic engine, then turn the extensions back on one at a time; you can also bench smp nps with the TT off

***edit***
- if it's not the TT or the extensions, then the IDF loop and the starting/terminating of threads are what is left
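The stepwise approach in the list above could be automated roughly as follows. This is only a sketch: `run_bench` stands for whatever a benchsmp command would report (total nps for a given feature set and thread count), and `fake_bench` is a made-up stand-in so the example runs on its own. Neither exists in any engine discussed here.

```python
def scaling_report(run_bench, features, threads):
    """Enable features one at a time and record the SMP efficiency after each.

    run_bench(enabled_features, thread_count) must return the total nps.
    Efficiency 1.0 means perfect scaling; lower values point at a slowdown.
    """
    enabled = []
    report = []
    for feature in [None] + list(features):
        if feature is not None:
            enabled.append(feature)
        single = run_bench(tuple(enabled), 1)
        multi = run_bench(tuple(enabled), threads)
        report.append((feature or "baseline", multi / (single * threads)))
    return report


def fake_bench(enabled, threads):
    """Hypothetical stand-in: pretends the TT halves per-thread speed in SMP."""
    per_thread = 1_000_000
    if "tt" in enabled and threads > 1:
        per_thread = 500_000
    return per_thread * threads


report = scaling_report(fake_bench, ["nullmove", "tt"], threads=4)
for name, efficiency in report:
    print(f"{name:10s} efficiency {efficiency:.2f}")
```

The first feature after which the efficiency drops is the place to start looking.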

--
Srdja

mar
Posts: 1970
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak


Post by mar » Wed Jun 05, 2019 10:34 am

flok wrote:
Tue Jun 04, 2019 7:06 pm
The dramatic slow-down is probably because other things were running on it (e.g. the Chrome browser).
First of all, don't mess with affinity (especially if you don't understand how it works).
Let's say your CPU has 2 logical cores per physical core. If you set the affinity mask of one worker to bit 0 and of another to bit 1, you force them both onto a single physical core, which is certainly not what you want.
So unless you know exactly what you're doing, simply trust the scheduler.
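To illustrate the pitfall, here is a small sketch that checks whether pinned workers end up on the same physical core. The `SIBLINGS` table assumes, as in the example above, that logical cores 0 and 1 belong to the same physical core; real systems may number them differently (on Linux the actual layout can be read from `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`). The function and worker names are hypothetical.

```python
def shared_physical_cores(pinned, logical_to_physical):
    """Return {physical core: [workers]} for every core hosting 2+ pinned workers.

    pinned maps a worker name to the single logical core it is pinned to.
    """
    by_physical = {}
    for worker, logical in pinned.items():
        by_physical.setdefault(logical_to_physical[logical], []).append(worker)
    return {core: workers for core, workers in by_physical.items() if len(workers) > 1}


# Assumed topology: 2 logical cores per physical core, numbered consecutively.
SIBLINGS = {0: 0, 1: 0, 2: 1, 3: 1}

print(shared_physical_cores({"worker0": 0, "worker1": 1}, SIBLINGS))
print(shared_physical_cores({"worker0": 0, "worker1": 2}, SIBLINGS))
```

The first call reports a collision on physical core 0; the second reports none.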
Martin Sedlak

flok
Posts: 145
Joined: Tue Jul 03, 2018 8:19 am
Full name: Folkert van Heusden

Post by flok » Wed Jun 05, 2019 10:42 am

mar wrote:
Wed Jun 05, 2019 10:34 am
flok wrote:
Tue Jun 04, 2019 7:06 pm
The dramatic slow-down is probably because other things were running on it (e.g. the Chrome browser).
First of all, don't mess with affinity (especially if you don't understand how it works).
Let's say your CPU has 2 logical cores per physical core. If you set the affinity mask of one worker to bit 0 and of another to bit 1, you force them both onto a single physical core, which is certainly not what you want.
But: let's say I have a system with 32 hardware threads (16 physical cores) on which I want to run 32 threads. In that case there will always be 2 threads on the same physical core.
Or are you suggesting not to use hyperthreading, but only 1 thread per physical core?

flok
Posts: 145
Joined: Tue Jul 03, 2018 8:19 am
Full name: Folkert van Heusden

Post by flok » Wed Jun 05, 2019 10:45 am

smatovic wrote:
Wed Jun 05, 2019 8:44 am
flok wrote:
Tue Jun 04, 2019 7:06 pm
Now my question is: what are strategies for finding what causes this slowdown?
- implement a benchsmp command to reproduce results quickly on the command line
- as always in engine debugging, turn every extension off, bench only with a basic engine, then turn the extensions back on one at a time; you can also bench smp nps with the TT off

***edit***
- if it's not the TT or the extensions, then the IDF loop and the starting/terminating of threads are what is left
What is an IDF loop? My googling did not turn up anything on that.

Starting/terminating threads: I start them once at the start of the whole calculation and stop them when time is up.

mar
Posts: 1970
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak


Post by mar » Wed Jun 05, 2019 11:01 am

flok wrote:
Wed Jun 05, 2019 10:42 am
mar wrote:
Wed Jun 05, 2019 10:34 am
flok wrote:
Tue Jun 04, 2019 7:06 pm
The dramatic slow-down is probably because other things were running on it (e.g. the Chrome browser).
First of all, don't mess with affinity (especially if you don't understand how it works).
Let's say your CPU has 2 logical cores per physical core. If you set the affinity mask of one worker to bit 0 and of another to bit 1, you force them both onto a single physical core, which is certainly not what you want.
But: let's say I have a system with 32 hardware threads (16 physical cores) on which I want to run 32 threads. In that case there will always be 2 threads on the same physical core.
Or are you suggesting not to use hyperthreading, but only 1 thread per physical core?
Of course I'm not. I'm suggesting you don't mess with affinity and let the scheduler do its job!
Let's say I have 8 logical cores and 4 physical:

L0 L1 L2 L3 L4 L5 L6 L7
P0 P0 P1 P1 P2 P2 P3 P3
And I want to run a 4-CPU tournament. The way you allocate the logical cores, you end up with the thread masks L0, L1, L2 and L3, which restricts the four threads to only two physical cores instead of four. A better choice would be mask L0L1 for thread 0, L2L3 for thread 1, and so on. (Of course, you could have more than 2 logical cores per physical core, so this is just an example.)

So simply let the OS scheduler handle it (plus it's less code :)
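For anyone who still wants to set masks by hand, the grouping described above can be sketched like this. `TOPOLOGY` encodes the 8-logical/4-physical layout from the table in this post and is an assumption about the machine, not something the code detects; the helper names are hypothetical.

```python
def masks_per_physical_core(logical_to_physical):
    """Build one affinity bitmask per physical core, covering all its logical cores."""
    groups = {}
    for logical, physical in sorted(logical_to_physical.items()):
        groups.setdefault(physical, []).append(logical)
    return [sum(1 << logical for logical in logicals)
            for _, logicals in sorted(groups.items())]


def physical_cores_used(masks, logical_to_physical):
    """Return the set of physical cores reachable through the given masks."""
    return {physical
            for mask in masks
            for logical, physical in logical_to_physical.items()
            if mask & (1 << logical)}


# Assumed layout: L0,L1 -> P0; L2,L3 -> P1; L4,L5 -> P2; L6,L7 -> P3.
TOPOLOGY = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}

naive_masks = [1 << thread for thread in range(4)]   # thread i pinned to logical core i
better_masks = masks_per_physical_core(TOPOLOGY)     # one mask per physical core

print(sorted(physical_cores_used(naive_masks, TOPOLOGY)))
print([bin(mask) for mask in better_masks])
print(sorted(physical_cores_used(better_masks, TOPOLOGY)))
```

The naive masks reach only physical cores 0 and 1, while the grouped masks reach all four. Leaving affinity alone gives the scheduler the same freedom without any of this code.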
Martin Sedlak

smatovic
Posts: 716
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Post by smatovic » Wed Jun 05, 2019 11:20 am

flok wrote:
Wed Jun 05, 2019 10:45 am
smatovic wrote:
Wed Jun 05, 2019 8:44 am
flok wrote:
Tue Jun 04, 2019 7:06 pm
Now my question is: what are strategies for finding what causes this slowdown?
- implement a benchsmp command to reproduce results quickly on the command line
- as always in engine debugging, turn every extension off, bench only with a basic engine, then turn the extensions back on one at a time; you can also bench smp nps with the TT off

***edit***
- if it's not the TT or the extensions, then the IDF loop and the starting/terminating of threads are what is left
What is an IDF loop? My googling did not turn up anything on that.

Starting/terminating threads: I start them once at the start of the whole calculation and stop them when time is up.
IDF - Iterative Deepening Framework

https://www.chessprogramming.org/Iterative_Deepening

Not sure what a lazy SMP implementation looks like without iterative deepening,
but if you have ID implemented, then maybe you want to implement a termination
strategy for all threads, for the case where a thread finishes the search of the
current ID iteration... but this may vary between lazy SMP derivatives.
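One possible termination strategy, sketched under the assumption that every thread searches the same depth and polls a shared object from time to time: the first thread to finish a depth publishes it, and the others abandon that depth and move on. `IterationGate` and its methods are hypothetical names, and lazy SMP derivatives that deliberately let helpers search different depths would need something else.

```python
import threading


class IterationGate:
    """Shared record of the deepest iteration any thread has completed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._completed_depth = 0

    def report_finished(self, depth):
        """Called by a thread that completed the search at `depth`."""
        with self._lock:
            if depth > self._completed_depth:
                self._completed_depth = depth

    def should_abort(self, depth):
        """Polled inside the search: True if `depth` was already completed elsewhere."""
        with self._lock:
            return depth <= self._completed_depth

    def next_depth(self):
        """Depth a thread should start on after finishing or aborting."""
        with self._lock:
            return self._completed_depth + 1
```

A search loop would call `should_abort(depth)` every few thousand nodes and, on `True`, unwind and restart at `next_depth()`.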

--
Srdja

flok
Posts: 145
Joined: Tue Jul 03, 2018 8:19 am
Full name: Folkert van Heusden

Post by flok » Wed Jun 05, 2019 11:27 am

smatovic wrote:
Wed Jun 05, 2019 11:20 am
IDF - Iterative Deepening Framework
https://www.chessprogramming.org/Iterative_Deepening
Not sure what a lazy SMP implementation looks like without iterative deepening,
Oh, it has IDF; I just didn't know it was called IDF. I thought it was ID. But never mind.
but if you have ID implemented, then maybe you want to implement a termination
strategy for all threads, for the case where a thread finishes the search of the
current ID iteration... but this may vary between lazy SMP derivatives.
Currently my main thread is the master-thread. If that one decides the search is finished, then all others terminate as well.

smatovic
Posts: 716
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Post by smatovic » Wed Jun 05, 2019 11:33 am

flok wrote:
Wed Jun 05, 2019 11:27 am
Currently my main thread is the master-thread. If that one decides the search is finished, then all others terminate as well.
And what happens if a helper finishes its search?

--
Srdja

flok
Posts: 145
Joined: Tue Jul 03, 2018 8:19 am
Full name: Folkert van Heusden

Post by flok » Wed Jun 05, 2019 11:36 am

smatovic wrote:
Wed Jun 05, 2019 11:33 am
flok wrote:
Wed Jun 05, 2019 11:27 am
Currently my main thread is the master-thread. If that one decides the search is finished, then all others terminate as well.
And what happens if a helper finishes its search?
It goes on with the next iteration if applicable. Otherwise it'll busy-loop :oops: until the main thread catches up.
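A busy-loop like that takes CPU time away from the threads that are still searching. A minimal sketch of a blocking alternative, assuming the helper only needs to know when the main thread has reached a given depth or the search was stopped; `DepthSync` and its method names are hypothetical:

```python
import threading


class DepthSync:
    """Lets a helper sleep until the main thread catches up or the search stops."""

    def __init__(self):
        self._cond = threading.Condition()
        self._main_depth = 0
        self._stopped = False

    def main_finished(self, depth):
        """Called by the main thread after it completes an iteration."""
        with self._cond:
            self._main_depth = depth
            self._cond.notify_all()

    def stop(self):
        """Called once when the search is over (e.g. time is up)."""
        with self._cond:
            self._stopped = True
            self._cond.notify_all()

    def wait_for_main(self, depth):
        """Block without spinning; return True to continue, False if stopped."""
        with self._cond:
            self._cond.wait_for(lambda: self._stopped or self._main_depth >= depth)
            return not self._stopped
```

A helper that has run out of iterations would call `wait_for_main(depth)` instead of looping, and the main thread would call `main_finished(depth)` after each iteration and `stop()` at the end.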
