Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Mon Oct 07, 2019 2:47 pm
by mwyoung
Nordlandia wrote:
Mon Oct 07, 2019 2:32 pm
Because any more than default may hurt raw performance.
Lc0 has many default settings. So I am not allowed to change them because it MAY hurt raw performance?

Are you saying I must test Lc0 at default only, even when my settings are stronger for Lc0?

Or in your mind are there certain settings I am allowed to change?

Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Mon Oct 07, 2019 3:06 pm
by zullil
mwyoung wrote:
Mon Oct 07, 2019 2:17 pm
Nordlandia wrote:
Mon Oct 07, 2019 10:36 am
4 threads for Lc0 = exercise in futility.

And so with your ponder match.
You always say this. But you can never answer why. Why is using 4 threads an exercise in futility?
For a fighting chance at getting a solid answer, try joining and posting the question at https://discordapp.com/invite/pKujYxD

Technical documentation regarding Lc0 seems to be rather lacking.

Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Mon Oct 07, 2019 5:56 pm
by Modern Times
How many threads do TCEC and the other tournament (chess.com?) use with Lc0?

Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Mon Oct 07, 2019 6:06 pm
by mwyoung
I am testing Lc0 at 2 threads vs. 4 threads, playing Stockfish.

4 threads scored -6 =43 +1

The 2-thread match is playing now on my channel.
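
A rough way to put the 4-thread result in context is to convert it to a score and an Elo difference with an error margin. The sketch below assumes "-6 =43 +1" means 6 losses, 43 draws and 1 win for Lc0, uses the standard logistic Elo formula, and uses a normal approximation for the 95% range. The function names are only for illustration.

```python
import math

def score_stats(wins, draws, losses):
    """Return (mean score, standard error) for a match result."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the outcomes 1, 0.5, 0 around the mean score.
    var = (wins * (1 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0 - mean) ** 2) / n
    return mean, math.sqrt(var / n)

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400 * math.log10(1 / score - 1)

# Assumed reading of the posted result: +1 =43 -6 from Lc0's side.
wins, draws, losses = 1, 43, 6
mean, se = score_stats(wins, draws, losses)
low, high = mean - 1.96 * se, mean + 1.96 * se
print(f"score {mean:.3f}, Elo {elo_from_score(mean):+.1f}")
print(f"95% range: {elo_from_score(low):+.1f} to {elo_from_score(high):+.1f}")
```

With only 50 games the range is wide, so a 2-thread match of similar length may not be enough to separate the two settings.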

Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Mon Oct 07, 2019 6:44 pm
by zullil
Modern Times wrote:
Mon Oct 07, 2019 5:56 pm
How many threads do TCEC and the other tournament (chess.com?) use with Lc0?
Good question. How many GPUs are being used? My understanding is that you want two CPU "worker threads" for each GPU that the system has.

[EDIT] It seems that AllieStein at TCEC is running on 2 GPUs and Threads is set at default (since it's not listed in the settings).

Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Mon Oct 07, 2019 10:15 pm
by mwyoung
zullil wrote:
Mon Oct 07, 2019 6:44 pm
Modern Times wrote:
Mon Oct 07, 2019 5:56 pm
How many threads do TCEC and the other tournament (chess.com?) use with Lc0?
Good question. How many GPUs are being used? My understanding is that you want two CPU "worker threads" for each GPU that the system has.

[EDIT] It seems that AllieStein at TCEC is running on 2 GPUs and Threads is set at default (since it's not listed in the settings).
There are many types of GPUs. Are we to assume that 2 threads work just as well on the 980, 1080, RTX 2060, RTX 2070, RTX 2080, and RTX 2080 Ti, even when the cards at the top of the performance stack run many times faster than the GPU cards at the lower end of the stack? Hmmm

But if I use 2 slower cards then it is ok to use 4 threads.

We have a tendency to always want to carve some ideas into stone, without actually testing if our assumptions are correct.

Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Tue Oct 08, 2019 3:53 am
by shrapnel
mwyoung wrote:
Mon Oct 07, 2019 10:15 pm
We have a tendency to always want to carve some ideas into stone, without actually testing if our assumptions are correct.
True. What's really surprising is that some otherwise very intelligent people also tend to have closed minds.
I still remember how I was laughed at by the Big Brains here when I suggested a few years ago that Chess Engines would become even stronger if they learnt how to use the Power of the GPU. They told me very kindly, with technical details, that chess engines couldn't possibly utilize the GPU, that they could only use the CPU, and that I didn't know what I was talking about.
I'm not a Software Programmer by Profession, but History has proved me right.

Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Tue Oct 08, 2019 5:21 am
by Modern Times
mwyoung wrote:
Mon Oct 07, 2019 10:15 pm

There are many types of GPUs. Are we to assume that 2 threads work just as well on the 980, 1080, RTX 2060, RTX 2070, RTX 2080, and RTX 2080 Ti, even when the cards at the top of the performance stack run many times faster than the GPU cards at the lower end of the stack? Hmmm
Yes. I use the default 2 threads on the GTX 1050; logically it should be more for an RTX 2080 Ti.

Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Tue Oct 08, 2019 6:01 am
by shrapnel
Modern Times wrote:
Tue Oct 08, 2019 5:21 am
logically it should be more for an RTX 2080 Ti.
Yes, that would explain mwyoung recommending 4 Threads, as he has a 2080 Ti.
At the same time, going overboard with the number of Threads also seems to be a bad idea.
One should find the correct balance.
Is using an odd number of threads a bad idea?

Re: Lc0 + 320x24.J13B.2-swa-136000 vs. Stockfish 250919 TC= 30m+30s

Posted: Tue Oct 08, 2019 10:23 am
by zullil
mwyoung wrote:
Mon Oct 07, 2019 10:15 pm
zullil wrote:
Mon Oct 07, 2019 6:44 pm
Modern Times wrote:
Mon Oct 07, 2019 5:56 pm
How many threads do TCEC and the other tournament (chess.com?) use with Lc0?
Good question. How many GPUs are being used? My understanding is that you want two CPU "worker threads" for each GPU that the system has.

[EDIT] It seems that AllieStein at TCEC is running on 2 GPUs and Threads is set at default (since it's not listed in the settings).
There are many types of GPUs. Are we to assume that 2 threads work just as well on the 980, 1080, RTX 2060, RTX 2070, RTX 2080, and RTX 2080 Ti, even when the cards at the top of the performance stack run many times faster than the GPU cards at the lower end of the stack? Hmmm

But if I use 2 slower cards then it is ok to use 4 threads.

We have a tendency to always want to carve some ideas into stone, without actually testing if our assumptions are correct.
Testing is good, provided it's done in a statistically valid way. On the other hand, when an Lc0 developer includes the following along with the code at GitHub, I tend to pay attention:
Number of (CPU) threads to use.
Default is 2. There's currently no use of making it more than 3 as it's limited by mutex contention which is yet to be optimized.
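
To try a different value, the thread count can be set over UCI before a game starts. A minimal sketch, assuming the option is exposed under the name Threads (as in the TCEC settings mentioned above) and that a GUI or script writes these lines to the engine's standard input. The helper name is hypothetical.

```python
def uci_setoption(name, value):
    """Format a standard UCI 'setoption' command line."""
    return f"setoption name {name} value {value}"

# Commands a script would send to the engine before starting a game.
# The value 2 is the documented default quoted above.
commands = ["uci", uci_setoption("Threads", 2), "isready"]
print("\n".join(commands))
```

Most GUIs expose the same option in their engine settings dialog, so the lines above are only needed when driving the engine from a script.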