I'm not particularly a fan of using Elo to predict widening, then widening to predict Elo. Somewhat circular. Your time-to-depth approach is similar to what I do, except that I generally use more sample runs for more cores, since the variance is pretty significant at times.

Laskos wrote:
In Elo points gained at the fixed depth. The depth is chosen such that the games are not extremely fast, but fast enough to measure Elo with reasonable precision in a reasonable amount of time. Many engines do not show widening at all (as Elo gain); I just checked Shredder 12, and it does not.

bob wrote:
How is widening measured???

Laskos wrote:
Applying the above formula, I find [time(1CPU)/time(4CPU)] to equal _strength_, which gives the quality of the SMP implementation:

Laskos wrote:
One cannot rely on time-to-depth. Besides Komodo and Rybka, I found another program, Hiarcs 14, which widens and has a poor time-to-depth factor, although it appears to scale very well from 1 to 4 cores Elo-wise in the CEGT 40/4' list.

bob wrote:
For the record, Amdahl's law applies to ALL parallel programs. Or anything else related to efficiency, such as internal microprocessor design decisions.
The primary issue is: which is easier to measure, the parallel speedup / SMP efficiency using the standard time-to-depth metric that has been in use forever, or the Elo gain from adding CPUs, which takes a huge amount of computational resources to measure? For all programs except one, the traditional time-to-depth appears to be by far the better measure. At least until the point where one throws up his hands and says "I can't get any deeper by adding more processors, so what else can I use 'em for?"
Since for all programs except one, the easier-to-measure metric works just as well as the Elo measurement, who would want to go to that extreme? I'm able to do it frequently because I have access to large clusters, but even for me it is unpleasant. On the 70-node cluster with 8 cores per node, I can either play 560 games at a time, or 70 games using 8 threads each. That's a factor of 8x. I then need to run a BUNCH of games to get acceptable accuracy, further slowing testing. With a traditional time-to-depth measurement, I can run a thorough test in under an hour...
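A quick back-of-the-envelope sketch of why "a BUNCH of games" is needed: to resolve a few Elo between two evenly matched settings, the standard error of the match score must be driven very low. The draw ratio and game counts below are assumptions for illustration, not measured values from any of these engines.

```python
import math

def elo_standard_error(n_games, draw_ratio=0.4):
    """Rough 1-sigma error (in Elo) of a match between two roughly
    equal players after n_games, given an assumed draw ratio."""
    d = draw_ratio
    w = (1.0 - d) / 2.0                # win probability for each side
    var = (w + d / 4.0) - 0.25         # variance of one game's score (0, 0.5, 1)
    se_score = math.sqrt(var / n_games)
    # Slope of the logistic Elo curve at a 50% score: 4 * 400 / ln(10)
    return (1600.0 / math.log(10)) * se_score

# 2-sigma error bars for a few match lengths
for n in (1000, 5000, 10000, 20000):
    print(f"{n:6d} games -> +/- {2 * elo_standard_error(n):.1f} Elo")
```

Under these assumptions it takes on the order of 10,000-20,000 games before the 2-sigma error bar shrinks to roughly +/-5 Elo, which is why measuring a modest widening gain by direct play is so expensive.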
It appears that there are two factors: time-to-depth and widening from 1 to 4 cores. If the widening gives N Elo points at the same depth, and a doubling of time is worth 100 Elo points at fast time controls, then the formula for the effective speedup would be:

time-to-strength = (time-to-depth factor) * 2^(N/100)
I agree that playing games to find the gain is practically impossible for normal users. One needs thousands of multi-threaded games at a not-very-short control, say 40/4', a task which is too much even for developers with large machines. Komodo is the most extreme in deviating from the time-to-depth rule.

Code:
Engine         time-to-depth   widening   time-to-strength
Houdini 3          3.05            0           3.05
Stockfish 3        2.91            0           2.91
Rybka 4.1          2.32           24           2.74
Komodo 5.1         1.68           75           2.83
Hiarcs 14          2.10           42           2.81
Gaviota 0.86       2.74            0           2.74
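The time-to-strength column follows directly from the formula above. A small sketch, using the thread's convention that one doubling of time is worth about 100 Elo (the function name is mine, not from the post):

```python
def time_to_strength(time_to_depth, widening_elo, elo_per_doubling=100.0):
    """Effective speedup = (time-to-depth factor) * 2^(widening Elo / 100)."""
    return time_to_depth * 2.0 ** (widening_elo / elo_per_doubling)

# (time-to-depth, widening) pairs from the table above
engines = {
    "Houdini 3":    (3.05,  0),
    "Stockfish 3":  (2.91,  0),
    "Rybka 4.1":    (2.32, 24),
    "Komodo 5.1":   (1.68, 75),
    "Hiarcs 14":    (2.10, 42),
    "Gaviota 0.86": (2.74,  0),
}

for name, (ttd, widening) in engines.items():
    print(f"{name:13s} {time_to_strength(ttd, widening):.2f}")
```

Komodo's modest 1.68 time-to-depth, combined with 75 Elo of widening, lands at 2.83, right alongside the engines that get all of their scaling from depth.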
Note that EVERY parallel search widens to some extent, simply due to parallel search overhead.
Time-to-depth is measured on 150 positions (30 positions repeated 5 times), running for an hour or two on a single core (and less than that on 4 cores).
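Averaging the per-position speedups deserves a little care, since they are ratios: a geometric mean is the natural aggregate, and the spread across positions shows why repeated sample runs matter. The timings below are invented purely for illustration.

```python
import math

# Hypothetical seconds-to-fixed-depth for five positions (made-up numbers)
t_1cpu = [12.0, 30.5, 8.2, 44.1, 19.7]
t_4cpu = [ 4.4, 10.1, 3.5, 14.8,  6.9]

speedups = [a / b for a, b in zip(t_1cpu, t_4cpu)]

# Geometric mean: average the logs of the ratios, then exponentiate
geo_mean = math.exp(sum(math.log(s) for s in speedups) / len(speedups))
print(f"geometric-mean speedup: {geo_mean:.2f}")
print(f"spread: {min(speedups):.2f} .. {max(speedups):.2f}")
```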
For the "widening" I would be tempted to actually compare node counts, to see how much larger the tree gets, as opposed to just measuring the Elo gain.
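A node-count version of that check might look like the following; the counts and the raw speedup are invented, and the adjustment is my reading of the suggestion, not something measured in the thread.

```python
# Nodes searched to complete the same fixed depth (invented numbers)
nodes_1cpu = 8_500_000
nodes_4cpu = 11_900_000

growth = nodes_4cpu / nodes_1cpu
print(f"tree is {100 * (growth - 1):.0f}% larger at the same depth")

# If the extra nodes were pure search overhead, they would simply dilute
# the raw wall-clock speedup; if they are deliberate widening, some of
# that extra work buys Elo instead.
raw_speedup = 3.4   # time(1CPU)/time(4CPU), also invented
print(f"overhead-adjusted speedup: {raw_speedup / growth:.2f}")
```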
Right now, numbers (nodes, time, etc) are most helpful to the discussion since they can be used to derive a mathematical model that explains what is going on. If I could run Komodo on our cluster, I'd be generating enough data to analyze this to great precision...
BTW, this "Elo" measurement still leaves a lot to be desired. It only measures the Elo gain produced by looking at extra stuff, but it doesn't address the Elo loss that the extra stuff costs in terms of time. That skews this Elo measurement significantly. In your tests, does EVERY 4-cpu move to a fixed depth take less time than the equivalent single-cpu move? If not, this measurement really has a problem.
