syzygy wrote: bob wrote: Laskos wrote: bob wrote: There you go! Answers the question I posed before I saw your reply. Seems like it is quite easy to jump to conclusions here. And then reality crashes in.
Bugs happen. Regularly when one is dealing with parallel search.
What "there you go"? The gain from width is real even in buggy Hannibal, it's not simply overhead, it's a 77 Elo point gain at fixed depth. The problem is that Hannibal barely goes deeper from 1 to 4 cores, the time-to-depth speedup being only 1.33. The overall effective speedup also suffers at 2.27, compared to 2.7-3.1 for most other engines. Yes, it is a bug, but not "there you go". Hannibal still has a 2.27 effective speedup, coming mostly from non-trivial (non-overhead) widening, which gains points.
And as Edsel points out, Komodo's implementation is probably different and not buggy.
As a programmer, you should know that sometimes a bug hurts performance, sometimes it helps. He explicitly stated that this "increase" was unintentional and was the direct result of a bug that he has now fixed.
Ergo, no conclusions can be drawn from it at all.
So what would you call the speed up of Hannibal?
As measured by Kai, when going from 1 to 4 cores time-to-depth improves by a factor of 1.33. But the 4-core search has the strength of a 2.27x faster 1-core search.
You can call the speed up 1.33, but that's just a useless number.
The number that measures the complete SMP implementation is 2.27.
The 2.27 is a "buggy Elo". Did you read the entire thread? 1.33 shows there is work left to be done, nothing more, nothing less...
BTW, a little math, just for fun.
The Elo gain suggests the 4-core search plays like a 1-core search that is 2.27x faster. Part of that comes from extra depth (how much is unknown) and part comes from a bug that makes the search wider. Since we have no ACTUAL speedup numbers (the 1 to 4 cpu number is meaningless with the aforementioned bug included), we might guess.
Last time I ran the test, 1 ply = 70 Elo for Crafty. Without an accurate number for Hannibal, I can only assume something similar. Which means he is using about 1/2 the available computing power. If, instead of 1.33x faster, he were hitting 3.3x (Crafty's actual number, for reference), that would be a ply, a full 2x improvement over the 1 cpu version.
Out of his current 2.27, what do you assume comes from additional depth and what comes from additional width? Half each? Then the "widening" is not as effective as going for the full 3x+ faster since that should give +70 Elo by itself, which is as good as the entire improvement he sees.
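For anyone who wants to redo the arithmetic, here is a rough sketch in Python. It assumes Crafty's 70 Elo per ply carries over to Hannibal and that one extra ply costs about 2x the time, so Elo grows with log2 of the speedup. Both are guesses, not measurements, and the function and variable names are made up for illustration.

```python
import math

# Assumptions (not measured for Hannibal):
ELO_PER_PLY = 70.0      # Crafty's figure quoted above, assumed to carry over
SPEEDUP_PER_PLY = 2.0   # assumed: one extra ply costs roughly 2x the time


def elo_from_speedup(speedup):
    """Convert a speedup factor to Elo under the log model assumed above."""
    plies = math.log(speedup, SPEEDUP_PER_PLY)
    return ELO_PER_PLY * plies


time_to_depth = 1.33   # Hannibal, 1 -> 4 cores (Kai's measurement)
effective = 2.27       # Hannibal, effective speedup from Elo
crafty = 3.3           # Crafty's time-to-depth speedup, for reference

depth_elo = elo_from_speedup(time_to_depth)   # part explained by extra depth
total_elo = elo_from_speedup(effective)       # whole 4-core gain
width_elo = total_elo - depth_elo             # remainder, attributed to width
width_factor = effective / time_to_depth      # same remainder as a speed factor
full_elo = elo_from_speedup(crafty)           # gain if he reached 3.3x

print(f"depth part:   {depth_elo:6.1f} Elo")
print(f"width part:   {width_elo:6.1f} Elo (factor {width_factor:.2f})")
print(f"total (2.27): {total_elo:6.1f} Elo")
print(f"at 3.3x:      {full_elo:6.1f} Elo")
```

Under these assumptions the 1.33 accounts for roughly 29 Elo, the full 2.27 for roughly 83, and a clean 3.3x would be worth roughly 121. Change the two constants if you have real numbers for Hannibal.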
As I said, the idea is there, but trading width for depth is a tradeoff that requires some serious analysis. In this case, I'd be after the full 3x+ speedup, losing the extra width, which is likely exactly what is going to happen when the author fixes the bug...
It is a serious decision to give up a ply. Serious enough to make one look carefully at why "wider" is worth more than the ply. At first look, it would seem that issue is worth addressing before taking a hammer to the problem and giving up the ply. I've always found positions in chess where another ply would have been just what was needed to spot something critical.