Don wrote: bob wrote: Don wrote:
Bob,
I'm not following this too closely any longer. I don't know to what extent you have taken these 2 things into consideration - maybe you already have but if not, here goes:
Crafty gets 100 Elo going from 1 to 4 processors. That is 2 doublings, and that means you get 50 Elo per doubling. If you go with MORE processors you get even less Elo per doubling. So the point is that you cannot mix and match any way you want to and call it science. I'm not saying you are doing that, as I am only quickly skimming these discussions. So if you talk about nodes per second, number of cores, or speedup per core, you have to separate them and make sure you are being scientifically rigorous, at least as much as tests like this can permit.
That is not "two doublings". This is, once again, apples and oranges. SMP overhead comes in and this changes things.
I didn't say you were doing anything wrong here, I'm only making the point that we must use great care. And since I don't have time to carefully parse all the posts flooding in right now and answer every point, I just wanted to remind everyone involved of this.
For example, it would be wrong to test a 1 cpu program against a 4 cpu program and say, "even after 2 doublings we only get 100 Elo", but then later say a doubling in hardware is worth a full 60 or 70 Elo without distinguishing what KIND of doubling it was.
I agree completely, and have tried to be quite clear. If you talk about pure hardware speed, a 6 core is close to 6x faster than a single core at the same speed. A lot depends on memory and such and also cache (two 4-core chips might be better than one 8-core, for example, since you suddenly have twice the paths to memory, if memory has been designed with interleaving or some such). If you talk about program speed, then I try to use (for SMP discussions) the actual SMP speedup numbers, which are always less than the number of cores, on average...
At the very beginning I stressed that we should not even be considering MP programs in all of this - it can be worked out later. Keep it simple, stupid: the KISS principle. Otherwise it gets terribly confusing. What you should have done is estimate the 1 cpu hardware improvement over 15 years, then the 1 cpu software improvement over 15 years, and left it at that. But I feel that you changed the point of reference in each case to suit whatever point you happened to be trying to make at the time. Whether you did or did not, you made it really confusing.
If you have noticed, _all_ of my testing has been single-cpu here. I do lots of SMP testing, but never in this discussion, and primarily only do that when I want to test SMP changes, since it slows testing down by 8x.
My "Crafty" numbers could be written two different ways. Clearly if you run Crafty on a single-cpu i7, it will run at N nodes per second. Just as clearly, a 6-core (or dual-6-core) i7 would be 6 or 12 times faster in overall computing power, and less than that in terms of chess-playing skill. Which number to use? I use Crafty's speedup numbers since they have been verified dozens of times, by many different people. But one could argue that since Crafty's NPS would actually be 6x or 12x faster on the above hardware, that is a software shortcoming, and the hardware should get full credit. If we can't use it effectively, is that the engineer's fault?
But for my summary of results, I certainly used the effective speedup number, which was about 1024, as opposed to the theoretical max, which was 1500x.
So let's please keep this real simple and leave out MP completely. Do the 1 core calculation only for old and new programs. THEN we can see how much we get for 6 more cores in 2010.
Already did that. If you want to remove the 4.5x speedup for 6 cores, we go back to 250x faster, which is 8 doublings, or something in the range of 560 Elo assuming just 70 per doubling. But there is definitely another two doublings without being optimistic, because we have such 6-core boxes already and have the performance data showing 4.5x (actually closer to 5.0 for a couple of hundred test positions, and not tactical positions that are well-suited).
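For anyone who wants to check the arithmetic, here is a rough sketch of how the speedup factors quoted in this thread (250x single-core, 4.5x effective and 6x theoretical for 6 cores) turn into doublings and Elo. The function names are made up for illustration, and the flat 70 Elo per doubling is only the assumption used above; as Don points out, the real gain per doubling is not constant and shrinks as you add processors.

```python
import math

def doublings(speedup: float) -> float:
    """Number of speed doublings represented by a speedup factor."""
    return math.log2(speedup)

def elo_gain(speedup: float, elo_per_doubling: float = 70.0) -> float:
    """Estimated Elo gain, ASSUMING a constant Elo per doubling.

    The 70 Elo default is the figure used in this thread, not a measured
    constant; treat the result as a ballpark only.
    """
    return doublings(speedup) * elo_per_doubling

# Figures quoted in the thread.
SINGLE_CORE_SPEEDUP = 250.0   # 1 cpu hardware speedup
SMP_EFFECTIVE = 4.5           # measured speedup on 6 cores
SMP_THEORETICAL = 6.0         # raw core count

cases = {
    "single core only": SINGLE_CORE_SPEEDUP,
    "with effective SMP speedup": SINGLE_CORE_SPEEDUP * SMP_EFFECTIVE,
    "with theoretical SMP speedup": SINGLE_CORE_SPEEDUP * SMP_THEORETICAL,
}

for label, factor in cases.items():
    print(f"{label}: {factor:.0f}x, "
          f"{doublings(factor):.2f} doublings, "
          f"{elo_gain(factor):.0f} Elo at 70 per doubling")
```

Under those assumptions, 250x comes out just under 8 doublings (a bit under 560 Elo), and the 4.5x SMP factor adds a little more than two doublings on top, which matches the numbers above.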
I cannot help but feel that your argument is really weak when you feel the need now to talk about the inferiority of various SMP machines and now are pushing to invalidate the results of the ratings agencies due to this.
Where am I pushing to invalidate anything? Someone questioned why my results are off by 50 or whatever compared to others. Simple questions:
What about the book? Do you believe it matters? I believe a good book is worth 100 Elo, from many years of doing this stuff. I don't release our tournament book, period, we keep it for tournaments.
What about learning? I can't count the number of mistakes I have made due to book and position learning. The first version looks bad, the new version looks better. Not because it is better, but because of the learning from the first version.
What about variable loads on a test machine? Run a short query while A is thinking and it gets hurt. Run it while B is running and B gets hurt. I don't have that problem at all.
I don't mix and match and change opponents all the time. I try to keep the same opponents, the same positions, the same hardware, the exact same time control, the same everything. And no books or other outside influences.
And exactly what "inferiority of various SMP machines" have I started to talk about? No idea what that means...