Thank you for sharing these interesting results. There is little data on this topic, and it is becoming more and more relevant.
PK
Threads factor: Komodo, Houdini, Stockfish and Zappa
Moderator: Ras
- kasinp
- Posts: 261
- Joined: Sat Dec 02, 2006 10:47 pm
- Location: Toronto
- Full name: Peter Kasinski
- Modern Times
- Posts: 3756
- Joined: Thu Jun 07, 2012 11:02 pm
Re: Threads factor: Komodo, Houdini, Stockfish and Zappa
michiguel wrote: But that is not the point of the experiment. This tells us about the upper limit of scalability, which is useful to know. In addition, it tells us how that upper limits suffers from addition of cores. For instance, Houdini starts to have problems after exactly 16 cores. Before that, it is among the best.
Miguel
Yes, I agree, these tests are very interesting. Thanks to Andreas for running them.
- Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Threads factor: Komodo, Houdini, Stockfish and Zappa
Uri Blass wrote: Interesting information but the target of chess programs is not to search more nodes but to earn playing strength.
Nodes are not proportional to playing strength and I guess that for the same engine,
the same number of nodes with 1 thread is better than the same number of nodes with many threads.
michiguel wrote: But that is not the point of the experiment. This tells us about the upper limit of scalability, which is useful to know. In addition, it tells us how that upper limits suffers from addition of cores. For instance, Houdini starts to have problems after exactly 16 cores. Before that, it is among the best.
Miguel
Miguel, as you know:
1/ MP NPS are scaling with time or depth, so at 200s per position they will be different (and probably higher in most cases) from 20s per position.
2/ NPS of Rybka Cluster and Jonny on 2,000+ cores scaled almost linearly with the number of cores, and they lost to Xeon Houdini and Junior respectively. So, their effective speed-up is way below what NPS shows. And effective speed-up from MP will scale with time too, increasing with time per position.
- zullil
- Posts: 6442
- Joined: Tue Jan 09, 2007 12:31 am
- Location: PA USA
- Full name: Louis Zulli
Re: Threads factor: Komodo, Houdini, Stockfish and Zappa
Isaac wrote: It would be interesting to see how the newer SF dev versions are doing compared to SF DD.
Here are some data from the very latest Stockfish:
Code:
Threads          NPS          NPS/(NPS for 1 thread)
--------------------------------------------------------
     1            1 279 478                 1.00
     2            2 732 441                 2.14
     4            5 176 548                 4.05
     8            9 432 155                 7.37
    16           17 073 731                13.34
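As a rough cross-check of the last column, the ratios and a per-thread efficiency (ratio divided by thread count) can be recomputed from the NPS figures. This is only a sketch: the function name `nps_scaling` and the efficiency measure are illustrative additions, not something used in the thread.

```python
# NPS figures copied from the table above (threads -> nodes per second).
nps = {1: 1_279_478, 2: 2_732_441, 4: 5_176_548, 8: 9_432_155, 16: 17_073_731}


def nps_scaling(nps_by_threads):
    """Return {threads: (ratio, efficiency)} relative to the 1-thread run.

    ratio      = NPS(n threads) / NPS(1 thread)
    efficiency = ratio / n   (hypothetical helper metric, 1.0 = linear NPS scaling)
    """
    base = nps_by_threads[1]
    result = {}
    for threads, value in sorted(nps_by_threads.items()):
        ratio = value / base
        result[threads] = (ratio, ratio / threads)
    return result


scaling = nps_scaling(nps)
for threads, (ratio, efficiency) in scaling.items():
    print(f"{threads:>2} threads: ratio {ratio:5.2f}, efficiency {efficiency:4.2f}")
```

An efficiency above 1.0 means NPS grew faster than the thread count for that row. This says nothing about playing strength, only about raw node throughput.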
Code:
./stockfish bench 1024 threads 20 benchfile.txt time
- michiguel
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Threads factor: Komodo, Houdini, Stockfish and Zappa
Uri Blass wrote: Interesting information but the target of chess programs is not to search more nodes but to earn playing strength.
Nodes are not proportional to playing strength and I guess that for the same engine,
the same number of nodes with 1 thread is better than the same number of nodes with many threads.
michiguel wrote: But that is not the point of the experiment. This tells us about the upper limit of scalability, which is useful to know. In addition, it tells us how that upper limits suffers from addition of cores. For instance, Houdini starts to have problems after exactly 16 cores. Before that, it is among the best.
Miguel
Laskos wrote: Miguel, as you know
1/ MP NPS are scaling with time or depth, so at 200s per position they will be different (and probably higher in most of cases) from 20s per position
2/ NPS of Rybka Cluster and Jonny on 2,000+ cores scaled almost linearly with the number of cores, and they lost to Xeon Houdini and Junior respectively. So, their effective speed-up is way below what NPS shows. And effective speed-up from MP will scale with time too, increasing with time per position.
Absolutely. It would be interesting to repeat this at 200s (I do not believe it would be necessary to run it 5 times; once will be enough, and it could be done with 2, 4, 6, etc.). I bet all curves will go up.
Of course, the ceiling for Jonny or R. was much higher than reality.
Miguel
- Joerg Oster
- Posts: 982
- Joined: Fri Mar 10, 2006 4:29 pm
- Location: Germany
- Full name: Jörg Oster
Re: Threads factor: Komodo, Houdini, Stockfish and Zappa
Laskos wrote: These are NPS. Hard to tell strength-wise, or effective speed-up. Time to depth (TTD) won't help too much either, as even SF with Joona's patch widens a bit, without talking of Komodo.
What makes you think that SF now widens a bit?
Afaik, there is no change in the search algorithm for SMP ...
Jörg Oster
- Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Threads factor: Komodo, Houdini, Stockfish and Zappa
Laskos wrote: These are NPS. Hard to tell strength-wise, or effective speed-up. Time to depth (TTD) won't help too much either, as even SF with Joona's patch widens a bit, without talking of Komodo.
Joerg Oster wrote: What makes you think that SF now widens a bit?
Afaik, there is no change in the search algorithm for SMP ...
I posted right after the release of Joona's patch:
http://www.talkchess.com/forum/viewtopi ... 6&start=11
- Joerg Oster
- Posts: 982
- Joined: Fri Mar 10, 2006 4:29 pm
- Location: Germany
- Full name: Jörg Oster
Re: Threads factor: Komodo, Houdini, Stockfish and Zappa
Laskos wrote: These are NPS. Hard to tell strength-wise, or effective speed-up. Time to depth (TTD) won't help too much either, as even SF with Joona's patch widens a bit, without talking of Komodo.
Joerg Oster wrote: What makes you think that SF now widens a bit?
Afaik, there is no change in the search algorithm for SMP ...
Laskos wrote: I posted right after the release of Joona's patch:
http://www.talkchess.com/forum/viewtopi ... 6&start=11
Sorry, I missed that one.
Kind of strange, because as far as I understand Joona's patch, there is no change in the search algorithm. Idle threads can now try to actively join a split point. Maybe this is due to more search overhead?
Jörg Oster
- Laskos
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Threads factor: Komodo, Houdini, Stockfish and Zappa
Laskos wrote: These are NPS. Hard to tell strength-wise, or effective speed-up. Time to depth (TTD) won't help too much either, as even SF with Joona's patch widens a bit, without talking of Komodo.
Joerg Oster wrote: What makes you think that SF now widens a bit?
Afaik, there is no change in the search algorithm for SMP ...
Laskos wrote: I posted right after the release of Joona's patch:
http://www.talkchess.com/forum/viewtopi ... 6&start=11
Joerg Oster wrote: Sorry, I missed that one.
Kind of strange, because as far as I understand Joona's patch, there is no change in the search algorithm. Idle threads can now try to actively join a split point. Maybe this is due to more search overhead?
Maybe, I don't know. I was so surprised that I then set alpha and beta to 0.01 instead of 0.05 (LLR bound 4.59 instead of 2.94) for the SPRT, and H1=10 was again accepted, so the chance of a false positive is only 1%. Fixed depth:
Code:
    Program                            Score      %      Elo    +   -   Draws
  1 SF 4 threads                  : 2831.0/5425  52.2      8    6   6   60.6 %
  2 SF 1 thread                   : 2594.0/5425  47.8     -8    6   6   60.6 %
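For readers who want to check the numbers, here is a minimal sketch of where the LLR bounds and the Elo figures could come from. It assumes the standard Wald SPRT acceptance threshold log((1 - beta) / alpha) and the usual logistic Elo model; the function names are illustrative, and the thread does not say which tool produced the table.

```python
import math


def sprt_upper_bound(alpha, beta):
    """Wald SPRT threshold for accepting H1: log((1 - beta) / alpha)."""
    return math.log((1 - beta) / alpha)


def elo_difference(score_fraction):
    """Elo difference implied by a score fraction under the logistic model."""
    return -400 * math.log10(1 / score_fraction - 1)


# Bounds for alpha = beta = 0.05 and for alpha = beta = 0.01.
bound_005 = sprt_upper_bound(0.05, 0.05)
bound_001 = sprt_upper_bound(0.01, 0.01)
print(f"LLR bound at 0.05: {bound_005:.3f}")
print(f"LLR bound at 0.01: {bound_001:.3f}")

# Score of "SF 4 threads" taken from the table above.
elo_gap = elo_difference(2831.0 / 5425)
print(f"Elo gap between the two entries: {elo_gap:.1f}")
```

The bounds come out close to the 2.94 and 4.59 quoted in the post. The computed gap is the difference between the two entries, while the table lists each entry relative to the average, which is why it shows roughly half that value with opposite signs.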
- yolin
- Posts: 30
- Joined: Thu Mar 30, 2006 6:12 pm
Re: Threads factor: Komodo, Houdini, Stockfish and Zappa
For scalability, node count is only one aspect, though. Here is a suggestion to test the true scalability of the engines. Consider the following match conditions.
For a fixed engine (self-play):
1 thread 32 min/game vs 2 threads 16 min/game
1 thread 32 min/game vs 4 threads 8 min/game
1 thread 32 min/game vs 8 threads 4 min/game
1 thread 32 min/game vs 16 threads 2 min/game
1 thread 32 min/game vs 32 threads 1 min/game
The Elo difference for the different thread counts should give a pretty good indication of how well an engine scales. Perfect scaling would give a 50% score in every match.
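The schedule above keeps threads times minutes constant, so it can be generated mechanically. The following is only a sketch of that idea: the helper names are hypothetical, and the score-to-Elo conversion assumes the usual logistic model, which the post does not specify.

```python
import math


def equal_cpu_time_schedule(base_minutes=32, max_threads=32):
    """Pair 1 thread at base_minutes with n threads at base_minutes / n.

    Returns tuples (threads_a, minutes_a, threads_b, minutes_b),
    doubling the thread count of side B each step.
    """
    schedule = []
    threads = 2
    while threads <= max_threads:
        schedule.append((1, base_minutes, threads, base_minutes / threads))
        threads *= 2
    return schedule


def elo_from_score(score_fraction):
    """Elo difference implied by a score fraction (logistic model)."""
    return -400 * math.log10(1 / score_fraction - 1)


schedule = equal_cpu_time_schedule()
for threads_a, minutes_a, threads_b, minutes_b in schedule:
    print(f"{threads_a} thread {minutes_a} min/game vs "
          f"{threads_b} threads {minutes_b:g} min/game")
```

Under this convention, `elo_from_score` applied to the multi-threaded side's score would give 0 for perfect scaling (a 50% score), and a negative value would indicate how much strength is lost to imperfect scaling.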