Spock wrote:
Gian-Carlo Pascutto wrote:
If running Rybka on 4 cores gives you a speedup of 3, then they are claiming the new IDeA will give you a speedup of 4 instead. So that's one core more.
Of course the claim is totally bogus. If it were true, then Rybka should switch its SMP algorithm to IDeA by default, and the claim would be false again.
I don't see why it is bogus. Running on 4 cores gives you an effective speed-up of 3 due to SMP "losses", but 4 x 1 = 4, i.e. running 4 instances of single core, means there is no SMP loss. So I see the logic in that statement. Whether it gives a better result or not is another story.
There is _NO_ way (emphasis intended) to have zero overhead. With a _very_ good SMP algorithm, using much more regular trees than we see today, Cray Blitz got quite close to 4.0 on 4 cpus, but at 8 we were down, and when we got to 16, we were at roughly the same point some of us see today, which is 11-12x.
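To put the numbers quoted in this thread on a common scale, here is a minimal sketch that converts a speedup into parallel efficiency (speedup divided by CPU count). The function name and the labels are my own, chosen for illustration. The inputs are only the figures mentioned above: a speedup of 3 on 4 cores, and roughly 11-12x on 16 CPUs.

```python
def parallel_efficiency(speedup, cpus):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cpus


# Figures quoted in this thread; the 16-CPU case is a range ("11-12x").
quoted = [
    ("4 cores, speedup 3", 3.0, 4),
    ("16 CPUs, low end of range", 11.0, 16),
    ("16 CPUs, high end of range", 12.0, 16),
]

for label, speedup, cpus in quoted:
    eff = parallel_efficiency(speedup, cpus)
    print(f"{label}: {eff:.0%} of linear")
```

An efficiency of 100% would mean zero overhead, which is the case the claim under discussion quietly assumes.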
You are right, there is no SMP loss. There is a _HUGE_ cluster loss however. I don't know of a single programmer that would choose a 4-way cluster over a 4-core box. The idea is so far beyond silly, it takes sunlight 6 months to get from silly to that idea.
I have never seen so much disinformation from a single source. But once you notice the trend (bogus NPS, bogus total nodes, bogus PVs, bogus depth, bogus scores) then "bogus parallel search results" can't be too unexpected.
The statement made is simply a twisted and contorted piece of logic. If you have lots of things to analyze, yes you are better off analyzing each different thing with just one core. You analyze 4 at a time with no overhead. What is new about that? But if you want to analyze _one_ thing, which is where parallel search is important in the first place, this idea is worthless. It isn't something new. It is old. _very_ old. And misleading, of course.
If you can create a circumstance where you have 4 absolutely unrelated positions, and they are not derived from your own calculations (which would introduce a dependency, since you first must derive the positions before you can analyze them independently), then you might see something useful here. Or if you want to analyze 4 independent games, you might have something useful. Most are not interested in that, however, and the minute positions are dependent on previous positions, the search overhead comes into play. You can either do extra work, or you can wait until dependencies are resolved and accumulate idle time. Either way, it won't be 4x.
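The "wait until dependencies are resolved" case can be illustrated with a toy model. This is only a sketch under strong assumptions: every position costs one time step, each worker handles one position per step, and the position names and the makespan helper are hypothetical. It is not a chess search and says nothing about the "do extra work" alternative.

```python
def makespan(deps, workers):
    """Time steps needed to finish unit-cost tasks with greedy scheduling.

    deps maps each task to the set of tasks that must finish before it.
    """
    done = set()
    steps = 0
    while len(done) < len(deps):
        # A task is ready once everything it depends on has finished.
        ready = [t for t in deps if t not in done and deps[t] <= done]
        if not ready:
            raise ValueError("cyclic dependencies")
        done.update(ready[:workers])  # at most one task per worker per step
        steps += 1
    return steps


# Four unrelated positions versus four positions that each depend on the last.
independent = {"A": set(), "B": set(), "C": set(), "D": set()}
chained = {"A": set(), "B": {"A"}, "C": {"B"}, "D": {"C"}}

for name, deps in [("independent", independent), ("chained", chained)]:
    t = makespan(deps, workers=4)
    print(f"{name}: {len(deps)} positions in {t} step(s), {len(deps) / t:.1f}x")
```

In this model the independent set finishes in one step on 4 workers, while the chained set leaves three workers idle at every step. Real analysis trees fall somewhere in between.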