Deep Blue vs Rybka

bob · Post by **bob** » Wed Sep 15, 2010 2:21 am

jwes wrote:
bob wrote:
Don wrote:Bob,

I tried the link you gave me and it's not working. Is it working for anyone else or is there something wrong with my connection?
I just did this:

ftp ftp.cis.uab.edu
login: anonymous (or you can use "ftp" without quotes)
password: hyatt@cis.uab.edu (enter your email)
cd pub/hyatt/source
ls crafty*x.tar
227 Entering Passive Mode (138,26,66,6,74,169)
150 Here comes the directory listing.
-rw-r--r-- 1 500 30 7260160 Sep 14 10:52 Crafty-10.x.tar

So it works for me...
The link in your post is http://ftp.cis.uab.edu which does not work but ftp://ftp.cis.uab.edu does.

No http: typed on this end. Just this:

"ftp.cis.uab.edu/pub/hyatt/source"

(without the quotes). If the message board interpreted that as a hyper-link and added the http:// then that certainly would not work. But it didn't come from my end since I know we have no httpd running on that box, ever...
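The http-versus-ftp mix-up above can be sketched in a few lines of Python. This is only an illustration, assuming the standard library's `ftplib` and `urllib.parse`; the helper names `split_ftp_location` and `list_matching` are made up for this sketch, and the host from the thread may no longer be reachable.

```python
from ftplib import FTP
from urllib.parse import urlsplit


def split_ftp_location(location):
    """Split 'host/path' or 'ftp://host/path' into (host, path).

    A bare 'host/path' is treated as FTP, which is what the post intended;
    an explicit non-FTP scheme such as http:// is rejected.
    """
    if "://" not in location:
        location = "ftp://" + location
    parts = urlsplit(location)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP location: {location}")
    return parts.hostname, parts.path.lstrip("/")


def list_matching(location, pattern):
    """Anonymous login, change directory, and list names matching pattern.

    Hypothetical helper: it needs network access and a live server.
    """
    host, path = split_ftp_location(location)
    with FTP(host) as ftp:
        ftp.login()            # no arguments means anonymous login
        if path:
            ftp.cwd(path)
        return ftp.nlst(pattern)


print(split_ftp_location("ftp.cis.uab.edu/pub/hyatt/source"))
```

Calling `list_matching("ftp.cis.uab.edu/pub/hyatt/source", "crafty*x.tar")` would mirror the manual session quoted above, provided the server still accepts anonymous connections.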

bob · Post by **bob** » Wed Sep 15, 2010 2:23 am

Don wrote:
jwes wrote:
bob wrote:
Don wrote:Bob,

I tried the link you gave me and it's not working. Is it working for anyone else or is there something wrong with my connection?
I just did this:

ftp ftp.cis.uab.edu
login: anonymous (or you can use "ftp" without quotes)
password: hyatt@cis.uab.edu (enter your email)
cd pub/hyatt/source
ls crafty*x.tar
227 Entering Passive Mode (138,26,66,6,74,169)
150 Here comes the directory listing.
-rw-r--r-- 1 500 30 7260160 Sep 14 10:52 Crafty-10.x.tar

So it works for me...
The link in your post is http://ftp.cis.uab.edu which does not work but ftp://ftp.cis.uab.edu does.
Yes, I just assumed without looking that it was a web address - I've used ftp for years but forgot about this.

After downloading, none of it works anyway, and I cannot make it compile without putting some time into it - and I would be a fool to waste any more time on this.

If you use linux (64 bit only) just "make" ought to work, although I think I did use the intel C compiler rather than gcc...

gcc worked ok, but I can never get a PGO run to work with it. Although I didn't use PGO for my tests, I had planned on trying it at first, and Intel's compiler works perfectly for that, not to mention producing better code than gcc.

bob · Post by **bob** » Wed Sep 15, 2010 2:31 am

Milos wrote:
bob wrote:I do have 8 core boxes, and I'd be more than happy to run the test for 1, 2, 4 and 8. But I have already done that, with no surprises at all. I don't run many 8-core tests because that machine has 70 nodes, and I can only run 70 games at a time, which is slow enough to make me not do it unless I am testing something new in the parallel search and want to see how it does.
Well you only repeat enormous number of times you've done something, however, you never present any new results except what you had in your Cray Blitz paper from the last millennium.
I repeat I've never seen a grain of proof for what you are saying.

Perhaps opening your eyes and shutting your mouth would help? I have posted hundreds of parallel search performance runs here. Some 5-6 year old data was on my ftp box, showing 1-8 cores. Others have run crafty tests and posted the results. So just because you didn't see them, that's not a shortcoming on _my_ part. It is ignorance on yours. The data has been presented, and parallel search discussions have been taking place here for many years. You don't get a free pass to stumble around with your eyes closed and then claim, "You never provide any data" because you didn't see it. You _couldn't_ see it unless you looked. But it's been here.

So try again with trying to shift the topic to something that is false.

If what you claim is true taking even 4 and 8 core machine you would have (with your formula):
- on 4 core machine: speedup_a=1+3*0.7=3.1
- on 8 core machine: speedup_b=1+7*0.7=5.9

Amazing. You got it right. And had you seen the analysis by Martin F (I put the raw data from an 8-way opteron box run a few years ago) you would have seen that the speedup was actually _above_ that simple linear fit I have been quoting. But then again, you wouldn't actually be interested in real data, or you would have already seen more than enough.

log(speedup_b/speedup_a)/log(2)*70=65 elo points

There is simply no way you can get 65 elo improvement. And to cite you, been there done that. You don't get even 40 elo.

I know what a factor of 2.0 in speed gives in Elo. There's nothing else to say. You keep talking about me providing no data when I have provided hundreds of posts over the past 5 years on this very topic, then you provide _nothing_ but a vacuous declaration of "you don't even get 40." Perhaps you were trying to shift the discussion to your IQ or something. Otherwise I have no clue where "40" comes from with regard to anything. The 1.7x is less than 2x, which means less than 70 Elo. Probably in the range of 50 or so. Of course, I suppose that kind of math is a little over your head???

Once you finally run this kind of test (using programs from this millennium) and realize you are wrong, then you might be able to start thinking of the reasons why...

I'm only trying to think of reasons why I keep trying to discuss things with someone that has no clue about what is going on. That is a much more serious issue, in fact... talk about wasting time...
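The Elo arithmetic traded in this post can be checked with a short sketch. It assumes, as the thread does, roughly 70 Elo per doubling of speed and the 0.7 slope of the linear approximation; both are figures quoted by the posters, not measured constants, and the function names are only illustrative.

```python
import math

ELO_PER_DOUBLING = 70.0  # figure used in the thread; an assumption, not a constant
PER_CPU_GAIN = 0.7       # slope of the linear approximation quoted in the thread


def linear_speedup(ncpus):
    """speedup = 1 + (NCPUS - 1) * 0.7"""
    return 1.0 + (ncpus - 1) * PER_CPU_GAIN


def elo_gain(speed_ratio):
    """Estimated Elo gain for a speed ratio, assuming a fixed gain per doubling."""
    return math.log2(speed_ratio) * ELO_PER_DOUBLING


speedup_a = linear_speedup(4)  # about 3.1
speedup_b = linear_speedup(8)  # about 5.9

# Going from 4 to 8 cores under the formula (the 65 Elo figure in the quote).
print(round(elo_gain(speedup_b / speedup_a)))

# Going from 1 to 2 cores at 1.7x (the "range of 50 or so" in the reply).
print(round(elo_gain(linear_speedup(2))))
```

Under these assumptions the first value comes out at 65 and the second at about 54, which is consistent with both the quoted calculation and the "less than 70" remark.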

mhull · Post by **mhull** » Wed Sep 15, 2010 2:38 am

Don wrote:
mhull wrote:
Don wrote:What are you, Bob's groupie or something?
No, I'm just old. And remember, I bought a copy of Rex Chess (back when), so you can't be too hard on me.
Will you be my groupie too?

I will, as soon as some whippersnapper -- long on talk and short on data -- starts pontificating at you.

bob · Post by **bob** » Wed Sep 15, 2010 2:40 am

Milos wrote:
bob wrote:I have actually run on machines with up to 64 CPUs (not a cluster, a real shared-memory SMP box). And thru 64, the speedup pretty well matched my linear approximation of

speedup = 1 + (NCPUS - 1) * 0.7

Or, for the 64 CPU case, about 45x faster than one CPU. Not great, but seriously faster, still.
One more fishy thing related to your formula.

Speedup from 1 to 2 cores is 1.7x. Also speedup from 2 to 4 cores should be the same (1.7x). And the same for speedup from 4 to 8 cores.

Learn some math. Hint. 1 cpu has no overhead. two cpus has one cpu with no overhead, one with. What happens when you add the other two? Now you have 3 with overhead, one without. Nothing says that the speedup follows your interpretation. Everything is relative to 1, because any extra adds overhead. If you could double that 1.7, you would have 2 with practically no overhead...

I gave a simple linear approximation to speedup that has been tested thru 64 now. It has been beaten to death on 1-16. It is not "perfect". It is "close" and _is_ a "rough approximation." Unless you have something concrete to offer, other than just hot air and incorrect assumptions, what is the point of continuing this???

So the speedup from 1 to 8 cores is 1.7^3=4.9x.
Your formula gives 1+7*0.7=5.9x. And somehow you get 5.9x (or even 6.3) in your results.

So something is terribly wrong.

Yes it is, just not with _my_ calculations. Again, you get one processor for free. Every one after that adds overhead. You get more overhead going from 2 to 4 than from 1 to 2. You get more still going from 4 to 8 than from 2 to 4. I suppose this must be rocket science or something because it is not getting thru to you for some reason.

If you look at my formula, that's the reason for the 1 at the front. It works with little overhead, where each additional one doesn't get in for free. And again, we are not trying to integrate to find a closed solution for the area under the curve. This is just an approximation that is reasonably close. Don't make it into something that it is not. You want accurate numbers for 1-8? Go run several positions to a fixed depth and compute the speedup that way. Then there will be no error at all for the position set you chose.
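The fixed-depth measurement suggested above could be scored roughly as follows. This is a sketch under stated assumptions: speedup is taken as total 1-CPU time divided by total N-CPU time over the same positions (averaging per-position ratios is another reasonable choice), and the timings are placeholders invented to exercise the code, not Crafty results.

```python
def measured_speedup(times_1cpu, times_ncpu):
    """Total time on 1 CPU divided by total time on N CPUs, same positions."""
    if not times_1cpu or len(times_1cpu) != len(times_ncpu):
        raise ValueError("need the same non-empty set of positions for both runs")
    return sum(times_1cpu) / sum(times_ncpu)


def linear_estimate(ncpus, per_cpu_gain=0.7):
    """The linear approximation from the thread: 1 + (NCPUS - 1) * 0.7."""
    return 1.0 + (ncpus - 1) * per_cpu_gain


# Placeholder seconds-to-depth for four positions. Made-up numbers only.
t1 = [120.0, 95.0, 210.0, 60.0]  # hypothetical 1-CPU times
t4 = [40.0, 30.0, 70.0, 20.0]    # hypothetical 4-CPU times

observed = measured_speedup(t1, t4)
predicted = linear_estimate(4)
print(f"observed {observed:.2f}x, linear estimate {predicted:.2f}x")
```

With real timings in place of the placeholders, the gap between the two printed values shows how close the approximation is for that particular position set.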

bob · Post by **bob** » Wed Sep 15, 2010 2:44 am

Milos wrote:
bob wrote:I have actually run on machines with up to 64 CPUs (not a cluster, a real shared-memory SMP box). And thru 64, the speedup pretty well matched my linear approximation of

speedup = 1 + (NCPUS - 1) * 0.7

Or, for the 64 CPU case, about 45x faster than one CPU. Not great, but seriously faster, still.
More on Bob's formula...

speedup_2=1+0.7=1.7
speedup_32=1+31*0.7=22.7
speedup_64=1+63*0.7=45.1

speedup_64/speedup_32=1.99!!!
speedup_2=1.7

One doesn't compare speedup of 32 to speedup of 64. My formula compares speedup of 1 to any of the other choices (2 ... 64). So quit trying to make it into something it isn't, so you can show your ignorance further.

Do you understand the terminology "linear _approximation_"? That's not the same as "exact linear fit" or "straight line". It is an "approximation". I'll let you google that term to get a grip on things.

17% more gain when going from 32 to 64 than from 1 to 2 cores. Man can only laugh at this, nothing more...

Yes, but at your use of math, not at the formula which does work...

Seriously though I really hope you didn't try to publish a paper with this kind of result, because I don't see any reviewer with a grain of self-respect who could let this kind of result pass through.

Fortunately, the people that would be reviewing things would understand (a) how to test a parallel program for speedup (which has nothing to do with playing entire games or matches); (b) what the term "linear approximation of speedup" means; (c) how a parallel search actually works and where the overhead comes from. You, sadly, seem unaware of any of those critical bits of knowledge...

Milos · Post by **Milos** » Wed Sep 15, 2010 2:51 am

bob wrote:Yes it is, just not with _my_ calculations. Again, you get one processor for free. Every one after that adds overhead. You get more overhead going from 2 to 4 than from 1 to 2. You get more still going from 4 to 8 than from 2 to 4. I suppose this must be rocket science or something because it is not getting thru to you for some reason.

I don't know if you are playing stupid or what, but your formula and presented results suggest exactly opposite from what you are claiming here. You seem not to be able to understand your own simple formula. Miguel asked you the same question in the other thread and you seemed not to have understood it.
I've never ever seen so far somebody presenting results that are exactly contrary to his claims as they support his claims. This is really amazing.

speedup_1_to_2=1.7
speedup_2_to_4=3.1/1.7=1.82
speedup_4_to_8=5.9/3.1=1.9

More overhead higher speedup!?????
Please man put yourself together... This is so sad


mhull · Post by **mhull** » Wed Sep 15, 2010 3:55 am

Milos wrote:
bob wrote:Yes it is, just not with _my_ calculations. Again, you get one processor for free. Every one after that adds overhead. You get more overhead going from 2 to 4 than from 1 to 2. You get more still going from 4 to 8 than from 2 to 4. I suppose this must be rocket science or something because it is not getting thru to you for some reason.
I don't know if you are playing stupid or what, but your formula and presented results suggest exactly opposite from what you are claiming here. You seem not to be able to understand your own simple formula. Miguel asked you the same question in the other thread and you seemed not to have understood it.
I've never ever seen so far somebody presenting results that are exactly contrary to his claims as they support his claims. This is really amazing.

speedup_1_to_2=1.7
speedup_2_to_4=3.1/1.7=1.82
speedup_4_to_8=5.9/3.1=1.9

More overhead higher speedup!?????
Please man put yourself together... This is so sad .

Dividing one speedup (faster perf/cpu) into another speedup (slower perf/cpu) is dividing apples by oranges. So it's a ratio of unlike things, which is an invalid comparison.

bob · Post by **bob** » Wed Sep 15, 2010 4:31 am

Milos wrote:
bob wrote:Yes it is, just not with _my_ calculations. Again, you get one processor for free. Every one after that adds overhead. You get more overhead going from 2 to 4 than from 1 to 2. You get more still going from 4 to 8 than from 2 to 4. I suppose this must be rocket science or something because it is not getting thru to you for some reason.
I don't know if you are playing stupid or what, but your formula and presented results suggest exactly opposite from what you are claiming here. You seem not to be able to understand your own simple formula. Miguel asked you the same question in the other thread and you seemed not to have understood it.
I've never ever seen so far somebody presenting results that are exactly contrary to his claims as they support his claims. This is really amazing.

speedup_1_to_2=1.7
speedup_2_to_4=3.1/1.7=1.82
speedup_4_to_8=5.9/3.1=1.9

More overhead higher speedup!?????
Please man put yourself together... This is so sad .

I can't figure out whether you are (a) dense; (b) an ass; or (c) a troll. But let me run the math one more time, since you are incapable...

Using my formula

#cpus   speedup
    1       1.0
    2       1.7
    4       3.1
    8       5.9

Nothing in my discussion of this formula suggests that taking the speedup for 2 and comparing it to the speedup for 4 makes any sense. This is not a formula that is intended to do anything other than suggest a speedup for a given number of processors. What does "3.1 / 1.7" mean? What does "more overhead higher speedup" mean? If you double the number of processors, and don't double the overhead, of _course_ you can get a higher speedup. But you are trying to use numbers that are estimates, to compute something that doesn't mean anything at all. So keep trying. I'm not going to argue something so stupid over and over. You want to treat the formula like it is accurate to 3 decimal places and then use predictive values to compute something that is even more accurate. Good luck with that.

Meanwhile, if you want to stop writing nonsense, and run Crafty on an 8-way box and test it over a set of positions, you can decide whether the formula is "close" or "way off" the _correct_ way. The rest of us understand it is a linear approximation of a function that is _not_ a straight line. I suppose "linear approximation to a non-linear set of data" is a bit too complicated for you...
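The two readings argued over in this exchange can be put side by side. The sketch below tabulates the linear approximation from the thread against the "1.7x per doubling" reading, and adds speedup per CPU for the linear case. The function names are illustrative, and the per-CPU column is one simple way of expressing overhead, not something either poster computed.

```python
import math


def linear_speedup(ncpus, gain=0.7):
    """Linear approximation from the thread: 1 + (NCPUS - 1) * 0.7."""
    return 1.0 + (ncpus - 1) * gain


def compounded_speedup(ncpus, per_doubling=1.7):
    """The other reading: every doubling of CPUs multiplies speed by 1.7."""
    return per_doubling ** math.log2(ncpus)


print(f"{'#cpus':>5} {'linear':>8} {'1.7^k':>8} {'linear/cpu':>11}")
for n in (1, 2, 4, 8):
    lin = linear_speedup(n)
    comp = compounded_speedup(n)
    print(f"{n:5d} {lin:8.2f} {comp:8.2f} {lin / n:11.3f}")
```

Under the linear formula the speedup per CPU falls as CPUs are added (1.0, 0.85, then roughly 0.775 and 0.74), even though the ratio between successive doublings rises toward 2. That is one way to see how the same table gets read as both "more overhead" and "higher speedup".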

ernest · Post by **ernest** » Wed Sep 15, 2010 8:46 pm

bob wrote:Using my formula

#cpus   speedup
    1       1.0
    2       1.7
    4       3.1
    8       5.9

Hi Bob,
I wouldn't dream of being rude (I'm always astonished by that Milos...), but the numbers I have seen given for n-core efficiency (including by Vasik Rajlich) are the following:

single core   x1
dual   core   x1.7
quad          x2.8
octal         x4.4

So I must say I am a little surprised by your x3.1 and x5.9

Is that something new, or are we talking of something other than "efficiency" (average time to solution)?
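To make the size of that gap concrete, the sketch below works out what per-extra-CPU gain each of the figures in this post would imply if plugged into the same linear form, speedup = 1 + (NCPUS - 1) * gain. It is only arithmetic on numbers already quoted in the thread; `implied_gain` is a made-up helper name.

```python
def implied_gain(ncpus, speedup):
    """Solve speedup = 1 + (ncpus - 1) * gain for gain."""
    if ncpus < 2:
        raise ValueError("need at least 2 CPUs to infer a per-CPU gain")
    return (speedup - 1.0) / (ncpus - 1)


# Figures quoted in the post above (dual, quad, octal).
quoted = {2: 1.7, 4: 2.8, 8: 4.4}

for n, s in quoted.items():
    linear = 1.0 + (n - 1) * 0.7  # the 0.7 approximation for comparison
    print(f"{n} cores: quoted x{s}, linear 0.7 gives x{linear:.1f}, "
          f"implied gain {implied_gain(n, s):.2f}")
```

The quoted figures imply a gain per added core that shrinks from 0.7 at two cores to 0.6 at four and about 0.49 at eight, whereas the linear approximation holds it fixed at 0.7.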
