LC0 ver 0.16.0 ID 10520 vs Deep Shredder 13 x64 40/4

kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: LC0 ver 0.16.0 ID 10520 vs Deep Shredder 13 x64 40/4

Post by kranium »

I'm quite sure that i7-4790 (being an Intel processor) ships with Hyperthreading 'on'
That would be 4 physical cores (8 logical cores) meaning 8 threads
Ted has carefully allocated 4 for LC0 and 4 for Shredder

Where's the 'contention'?...other than with a couple users here who need to bully and criticize what they don't understand.

Thanks for sharing Ted!
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: LC0 ver 0.16.0 ID 10520 vs Deep Shredder 13 x64 40/4

Post by mar »

kranium wrote: Sat Aug 11, 2018 10:14 pm I'm quite sure that i7-4790 (being an Intel processor) ships with Hyperthreading 'on'
That would be 4 physical cores (8 logical cores) meaning 8 threads
Ted has carefully allocated 4 for LC0 and 4 for Shredder

Where's the 'contention'?...other than with a couple users here who need to bully and criticize what they don't understand.

Thanks for sharing Ted!
Where's the contention? A factor of almost 1.4x slowdown on a quad with HT on is "carefully allocated" by your standards?
Plus one opponent is not only slowing you down but also pondering.
EDIT: 1.4x CPU only of course, so not that bad for Leela; not that it would matter much in 40 games anyway...
Martin Sedlak
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: LC0 ver 0.16.0 ID 10520 vs Deep Shredder 13 x64 40/4

Post by kranium »

mar wrote: Sat Aug 11, 2018 10:57 pm Where's the contention? A factor of almost 1.4x slowdown on a quad with HT on is "carefully allocated" by your standards?
Plus one opponent is not only slowing you down but also pondering.
EDIT: 1.4x CPU only of course, so not that bad for Leela; not that it would matter much in 40 games anyway...
1.4x sounds like a speedup, not a slowdown...but no matter, I have no idea where you got that number or what it is.
I can only speculate that you believe 1 physical core (HT off) is better or faster than 2 logical threads (HT on).

But that's not true...talk to the power users on playchess, for example. They're all using hyperthreading because it's a significant performance boost.
I realize there's a powerful misconception about HT being bad, largely propagated here on talkchess.
But there's also been plenty of new data presented here in the last couple of years that refutes that myth (perhaps from Kai L.?), if you care to search for it.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: LC0 ver 0.16.0 ID 10520 vs Deep Shredder 13 x64 40/4

Post by mar »

kranium wrote: Sat Aug 11, 2018 11:42 pm 1.4x sounds like a speedup, not a slowdown...but no matter I have no idea where you got that number, or what it is.
I can only speculate that you believe 1 physical core (HT off) is better or faster than 2 logical threads (HT on).

But that's not true...talk to the power users on playchess, for ex..they're all using hyperthreading because it's a significant performance boost.
I realize there's a powerful misconception about HT being bad, largely propagated here on talkchess.
But there's also been plenty of new data presented here in the last couple of years that refutes that myth (perhaps from Kai L.?), if you care to search for it.
Huh? 1.4x slower means 1.4 times slower, not faster.
Where did I get that number? I simply ran two programs using 4 threads each at the same time and compared that to running only one of them on 4 cores.
With both running at the same time, they took 1.4 times longer. That's expected, because 8 threads with HT on a quad core won't get you a 2x speedup (vs 4T) but only about 20-30%, which is what I typically get for my programs.
Martin Sedlak
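
The arithmetic behind the figures in the post above can be sketched as follows. This is only a simplified model, not a measurement: it assumes both programs are purely CPU-bound, do equal work, and split the four cores evenly. The function names are hypothetical and chosen for illustration.

```python
def slowdown_when_sharing(ht_speedup: float) -> float:
    """Slowdown of each of two 4-thread programs run together on a
    4-core/8-thread CPU, relative to running alone with 4 threads.

    ht_speedup: throughput of 8 HT threads relative to 4 threads
    (1.0 = HT gives nothing, 2.0 = HT behaves like 4 extra cores).
    Assumption: both programs do equal, purely CPU-bound work.
    """
    if ht_speedup <= 0:
        raise ValueError("ht_speedup must be positive")
    # Twice the total work is pushed through ht_speedup times the throughput.
    return 2.0 / ht_speedup


def ht_speedup_from_slowdown(slowdown: float) -> float:
    """Inverse of the model above: HT gain implied by a measured slowdown."""
    if slowdown <= 0:
        raise ValueError("slowdown must be positive")
    return 2.0 / slowdown


if __name__ == "__main__":
    for gain in (1.2, 1.3):
        print(f"HT speedup {gain:.1f}x -> each program runs "
              f"{slowdown_when_sharing(gain):.2f}x longer")
    print(f"Measured 1.4x slowdown -> implied HT speedup "
          f"{ht_speedup_from_slowdown(1.4):.2f}x")
```

Under this model, an HT gain of 20-30% corresponds to each program running roughly 1.5-1.7 times longer, while a measured 1.4x slowdown corresponds to an HT gain of roughly 1.4x, so the two figures quoted are in the same range but should be read as approximate. An engine that does most of its work on the GPU, as discussed for Leela in this thread, falls outside the CPU-bound assumption.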
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: LC0 ver 0.16.0 ID 10520 vs Deep Shredder 13 x64 40/4

Post by kranium »

mar wrote: Sat Aug 11, 2018 11:50 pm Where did I get that number? I simply ran two programs utilizing 4 threads together and compared to only one utilizing 4 cores,
so when two were running at the same time, they ran 1.4 times longer, but that's obvious because 8 threads HT on a quad core won't get you 2x speedup (vs 4T) but only about 20-30%, which I typically get for my programs.
that makes sense

Actually, Ted could have used fewer than 4 threads to feed his LC0 GPU.
If you assume he has only 1 GPU, then 1 or 2 threads is very likely more than sufficient to keep it fully utilized.

The point I'm trying to make is that with HT on, there are plenty of threads to run both programs simultaneously without thread contention
or any other issue.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: LC0 ver 0.16.0 ID 10520 vs Deep Shredder 13 x64 40/4

Post by mar »

kranium wrote: Sun Aug 12, 2018 12:10 am Actually Ted could have used fewer than 4 threads to feed his LC0 GPU.
If you assume he has only 1 GPU, then 1 or 2 threads is very likely more than sufficient to keep it fully utilized.

The point I'm trying to make is that with HT on, there are plenty of threads to run both programs simultaneously without thread contention
or any other issue.
Yes, it's probably not a big deal for Leela because most of the real work is done on the GPU.
I'm not criticizing Ted, he's of course free to use whatever conditions he chooses for his match and the result is perfectly valid under these conditions.

What I'm saying is that you have two logical cores constantly competing for the shared parts of one physical core, which is why you don't get the 2x speedup you would get with two physical cores.
Martin Sedlak
Javier Ros
Posts: 200
Joined: Fri Oct 12, 2012 12:48 pm
Location: Seville (SPAIN)
Full name: Javier Ros

Re: LC0 ver 0.16.0 ID 10520 vs Deep Shredder 13 x64 40/4

Post by Javier Ros »

kranium wrote: Sat Aug 11, 2018 10:14 pm I'm quite sure that i7-4790 (being an Intel processor) ships with Hyperthreading 'on'
That would be 4 physical cores (8 logical cores) meaning 8 threads
Ted has carefully allocated 4 for LC0 and 4 for Shredder

Where's the 'contention'?...other than with a couple users here who need to bully and criticize what they don't understand.

Thanks for sharing Ted!
If this is the case and Shredder played with 4 logical cores and lc0 with 4, then I admit the test is correct, taking into account permanent brain (pondering) if the engine has this feature.

I did not want to start any controversy.
Thanks to Ted for sharing his test.