syzygy wrote:That Komodo's SMP implementation is effective in terms of playing strength / elo gain per core follows from other threads.
Nope. Nothing to base that on.
What in "follows from other threads" is unclear to you?
bob wrote:Elo is based on time, where both players have equal time to play the game. This test does not even approximate that standard.
"Follows from other threads." I am sorry, but I am not going to spell it out any further.
syzygy wrote:
bob wrote:Ergo <zero information>
Nope. It does show that Komodo's SMP implementation is different. Komodo's 4-core search to a particular (reported) depth is clearly of higher quality than Komodo's 1-core search to the same (reported) depth.
Is a search without LMR "higher quality"? Most of us equate "quality" with "Elo". But we have ZERO data about the Elo gain... Because time is a part of the equation and it was left out.
This is a little bit like I am talking to a very small kid.
Do you know what was tested? Komodo at fixed depth with 4 threads versus Komodo at fixed depth with 1 thread. Komodo with 4 threads scored way better. This shows that "Komodo's 4-core search to a particular (reported) depth is clearly of higher quality than Komodo's 1-core search to the same (reported) depth".
If you insist on quibbling about these words then you're just slightly too sore for my taste.
Laskos wrote:And this +130 Elos going from 1 to 4 cores compares rather well with 90-110 Elos shown by "typical" engines like Houdini, Critter, Stockfish, Crafty, etc. on CCRL and CEGT 40/4.
The gain highly depends on the TC, 15"+0.2 is an order of magnitude faster than CEGT 40/4.
From preliminary results Komodo 5.1 with 4 cores is rated around 3115, whereas the 1 core engine will be situated between Komodo CCT and Komodo 5.0 at around 3025. This means about 90 Elo gain from 1 to 4 cores.
As comparison, on the CEGT 40/4 list Houdini gains about 105 Elo from 1 to 4 cores (3188 compared to 3082).
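The two gains quoted above are plain rating differences. A minimal sketch of the arithmetic, using only the ratings given in the post (the Komodo figures are described there as preliminary); the dictionary layout and the `elo_gain` helper are illustrative names, not from any rating-list tool:

```python
# Ratings as quoted in the post above; the Komodo numbers are preliminary.
ratings = {
    "Komodo 5.1": {"1 core": 3025, "4 cores": 3115},
    "Houdini": {"1 core": 3082, "4 cores": 3188},
}


def elo_gain(entry):
    """Rating difference between the 4-core and 1-core entries."""
    return entry["4 cores"] - entry["1 core"]


gains = {name: elo_gain(entry) for name, entry in ratings.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} Elo from 1 to 4 cores")
```

The two lists use different time controls, so the numbers are only roughly comparable, as the post itself notes.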
Yes, the comparison is unfair; the point was just that Komodo scales with the number of threads comparably with the best and "typical" engines regarding SMP. If one goes with Bob's argument, Komodo's SMP performance would be abysmal. I am away from my i7, so I cannot show that on time-to-depth alone Komodo would gain much less than what it is actually getting, but factually, it gains points from 1 to 4 cores comparably to the gains of Houdini. And again, I am not saying that Komodo's SMP implementation is the best. It's different.
You keep mischaracterizing what I said. I discussed a well-defined, often-used term. Namely "SMP efficiency". That has ZERO to do with Elo (at least not directly), or with anything else. It is solely used to measure how much faster a 4 cpu program is than a 1 cpu program, when doing the SAME computational task.
That's not what is being done here. So using "SMP efficiency" was the wrong term to use in the first post in this thread. As to the 1 vs 4 cpu Elo comparison, that is also broken. It shows only that "something strange is going on." Nothing about whether that strange stuff helps or hurts, etc...
That it is "different" is irrelevant if you want to use a well-known term like "SMP efficiency." Another commonly misused term is "branching factor" when what is really intended is "effective branching factor." The branching factor for chess cannot be altered, while the EBF certainly can be.
bob wrote:
SMP efficiency is ALWAYS defined as "time (1cpu) / time (Ncpus)"
ALWAYS.
I can provide citations if you want. The book I use in my parallel programming course this summer has it in chapter 2.
You seem to be confused by "time (1cpu) / time (Ncpus)". The definition "time (1cpu) / time (Ncpus)" is correct in the sense that the time ratio needed for 1cpu to reach THE STRENGTH of Ncpus defines the effectiveness of the SMP implementation. Not time-to-depth; time-to-depth in general is useless.
I used a precise term, the exact one you used, "SMP efficiency". And that is ALWAYS defined as time(1cpu) / time(ncpus) when both do exactly the same computational assignment (same data, etc, or in the case of chess, the same position to the same depth).
Ok, based on that definition you are correct. Can we end this conversation now?
Let's all tell Bob he is correct so that we can move on.
Time to depth is the ONLY way one can calculate SMP efficiency. I can't repeat that often enough. It is the ONLY way. Comparing 1 cpu to 4 cpu elo measures something else entirely. Not just the SMP efficiency (speed-up) but also other qualitative issues about the SMP search that do not just affect the speed. And even includes the presence of bugs that don't show up frequently, but do show up every now and then...
Unfortunately, there was never any doubt that I was correct here. If you want to mangle terms, that's fine. It just makes communication that much more difficult when everyone uses a different definition of a term. And sarcasm really doesn't help much here either...
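For readers following the definition argued over above, time(1cpu) / time(Ncpus) for the same position searched to the same depth, here is a minimal sketch. The function names and the timings are hypothetical, chosen only so that the ratio comes out at the 3.2 figure mentioned elsewhere in the thread. Note that many parallel-programming texts call this ratio "speedup" and reserve "efficiency" for speedup divided by the number of processors; both are shown.

```python
def speedup(time_1cpu, time_ncpus):
    """time(1cpu) / time(Ncpus) for the SAME task (same position, same depth)."""
    if time_1cpu <= 0 or time_ncpus <= 0:
        raise ValueError("timings must be positive")
    return time_1cpu / time_ncpus


def per_cpu_efficiency(time_1cpu, time_ncpus, ncpus):
    """Speedup divided by the number of CPUs (1.0 would be perfect scaling)."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    return speedup(time_1cpu, time_ncpus) / ncpus


# Hypothetical time-to-depth measurements in seconds, not real engine data.
t1, t4 = 64.0, 20.0
print(f"speedup on 4 cpus: {speedup(t1, t4):.2f}")
print(f"per-cpu efficiency: {per_cpu_efficiency(t1, t4, 4):.2f}")
```

This only measures how much faster the same search finishes; it says nothing by itself about playing strength, which is the point of contention in the thread.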
bob wrote:
SMP efficiency is ALWAYS defined as "time (1cpu) / time (Ncpus)"
ALWAYS.
I can provide citations if you want. The book I use in my parallel programming course this summer has it in chapter 2.
You seem to be confused by "time (1cpu) / time (Ncpus)". The definition "time (1cpu) / time (Ncpus)" is correct in the sense that the time ratio needed for 1cpu to reach THE STRENGTH of Ncpus defines the effectiveness of the SMP implementation. Not time-to-depth; time-to-depth in general is useless.
I used a precise term, the exact one you used, "SMP efficiency". And that is ALWAYS defined as time(1cpu) / time(ncpus) when both do exactly the same computational assignment (same data, etc, or in the case of chess, the same position to the same depth).
Time to depth is the ONLY way one can calculate SMP efficiency. I can't repeat that often enough.
It's getting pretty absurd.
Say, from 1 core to 4 cores time-to-depth gain is
Crafty: 3.2 (you yourself got this factor)
Komodo: 1.8
Elo points gain
Crafty: 100 points
Komodo: 100 points
1.8 is approximately sqrt(3.2), so the gain of Komodo should have been 100/2=50 points. What's more important, your absurd insistence on time-to-depth or the real Elo increase?
I think we should agree that SMP efficiency in chess is measured in Elos, not depths.
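The "100/2 = 50" step above relies on an unstated assumption: that Elo gain grows with the logarithm of the speedup (a fixed number of points per doubling of speed). A minimal sketch of that reasoning, using the illustrative figures from the post; `predicted_gain` is a hypothetical helper, and the logarithmic model is an assumption, not a measured law:

```python
import math


def predicted_gain(speedup, ref_speedup, ref_gain):
    """Elo gain predicted for `speedup`, assuming gain is proportional to
    log(speedup) and calibrated so that `ref_speedup` yields `ref_gain`."""
    return ref_gain * math.log(speedup) / math.log(ref_speedup)


# Figures from the post: Crafty speedup 3.2 with +100 Elo, Komodo speedup 1.8.
crafty_speedup, crafty_gain = 3.2, 100.0
komodo_speedup = 1.8

komodo_predicted = predicted_gain(komodo_speedup, crafty_speedup, crafty_gain)

print(f"sqrt(3.2) = {math.sqrt(crafty_speedup):.3f}")
print(f"predicted Komodo gain from time-to-depth alone: {komodo_predicted:.1f}")
```

Under this model a speedup of sqrt(3.2) is worth exactly half the Elo of a speedup of 3.2, which is where the "should have been 50 points" comes from.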
Not at all. "SMP efficiency" has a precise definition, period. Elo measures something else entirely, only a part of which is contributed by SMP efficiency.
Elo is the right thing to compare, something you did not do in your original post. It is the best measure of a chess engine's performance, whether the search be serial or parallel...
It is the ONLY way. Comparing 1 cpu to 4 cpu elo measures something else entirely. Not just the SMP efficiency (speed-up) but also other qualitative issues about the SMP search that do not just affect the speed. And even includes the presence of bugs that don't show up frequently, but do show up every now and then...
Laskos wrote:And this +130 Elos going from 1 to 4 cores compares rather well with 90-110 Elos shown by "typical" engines like Houdini, Critter, Stockfish, Crafty, etc. on CCRL and CEGT 40/4.
The gain highly depends on the TC, 15"+0.2 is an order of magnitude faster than CEGT 40/4.
From preliminary results Komodo 5.1 with 4 cores is rated around 3115, whereas the 1 core engine will be situated between Komodo CCT and Komodo 5.0 at around 3025. This means about 90 Elo gain from 1 to 4 cores.
As comparison, on the CEGT 40/4 list Houdini gains about 105 Elo from 1 to 4 cores (3188 compared to 3082).
Yes, the comparison is unfair; the point was just that Komodo scales with the number of threads comparably with the best and "typical" engines regarding SMP. If one goes with Bob's argument, Komodo's SMP performance would be abysmal. I am away from my i7, so I cannot show that on time-to-depth alone Komodo would gain much less than what it is actually getting, but factually, it gains points from 1 to 4 cores comparably to the gains of Houdini. And again, I am not saying that Komodo's SMP implementation is the best. It's different.
You keep mischaracterizing what I said. I discussed a well-defined, often-used term. Namely "SMP efficiency". That has ZERO to do with Elo (at least not directly), or with anything else. It is solely used to measure how much faster a 4 cpu program is than a 1 cpu program, when doing the SAME computational task.
Fine, I will go with Don on this, and agree with you. Komodo doesn't have the notion of SMP efficiency because it's not doing exactly the SAME computational task on 4 cpu. I guess only Crafty has SMP efficiency. That's all I have to explain to you.
That's not what is being done here. So using "SMP efficiency" was the wrong term to use in the first post in this thread. As to the 1 vs 4 cpu Elo comparison, that is also broken. It shows only that "something strange is going on." Nothing about whether that strange stuff helps or hurts, etc...
That it is "different" is irrelevant if you want to use a well-known term like "SMP efficiency." Another commonly misused term is "branching factor" when what is really intended is "effective branching factor." The branching factor for chess cannot be altered, while the EBF certainly can be.
bob wrote:So using "SMP efficiency" was the wrong term to use in the first post in this thread.
So this whole regrettable discussion turns around your personal definition of "SMP efficiency" that you have decided to impose upon all of us.
Any reasonable person understood perfectly well how the term was used in the first post.
This is typical of Bob's style: when he is backed into a corner, he finds some technicality and digs in. It would be far easier for him to just yield a little ground, and he could do this and still maintain his dignity, but he will keep going until he cannot back down.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
bob wrote:
The test did not address your speculation.
The correct test is to take a group of parallel programs and play 'em in a large match, using 1 core per engine. Then take each engine, one at a time, and run 'em at 4 cores. Measure the Elo improvement. If Komodo gains more than the others, I'll stand ready to eat my hat. The test, as run, doesn't show a single thing about this topic, however...
Is your hat safe? Or were you not referring to equivalent gain?
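The test proposed in the quoted post comes down to measuring each engine's Elo at 1 core and at 4 cores against the same pool and comparing the differences. A minimal sketch of the Elo side of that, using the standard logistic relation between expected score and rating difference; the function names and the win/draw/loss counts are hypothetical, not results from any match in this thread:

```python
import math


def match_score(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games


def elo_diff(score):
    """Rating difference implied by a score fraction (logistic Elo model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# Hypothetical results of one engine against the same pool of opponents.
elo_1core = elo_diff(match_score(wins=300, draws=400, losses=300))
elo_4core = elo_diff(match_score(wins=450, draws=400, losses=150))

print(f"gain from 1 to 4 cores: {elo_4core - elo_1core:+.0f} Elo")
```

Repeating this for every engine in the pool gives the per-engine gains the post asks to compare; a real test would also need error bars, which this sketch omits.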
bob wrote:So using "SMP efficiency" was the wrong term to use in the first post in this thread.
So this whole regrettable discussion turns around your personal definition of "SMP efficiency" that you have decided to impose upon all of us.
Any reasonable person understood perfectly well how the term was used in the first post.
Maybe one should say that time-to-depth is applicable to "SMP efficiency" engines, as defined by Bob, and doesn't apply to non-"SMP efficiency" engines like Komodo.