Deep Blue vs Rybka

bob · Post by **bob** » Mon Sep 13, 2010 7:56 pm

Don wrote:
rbarreira wrote:Hsu said that Deep Blue's parallel efficiency was about 10%, so those 200m - 1bn nodes per second should be scaled down by a factor of about 10 to get something similar to the effective NPS of a single-threaded search. This was due to the massive parallelism it employed, I guess.

I think I believed that Deep Blue was much better than it actually was. It was impressive in 1997, but it seems that Deep Blue was inferior in every chess-specific way except for raw nodes per second, due to the constraints of hardware.

So using Deep Blue as a reference point is not going to work.

I'm not saying it was "lousy software". But Hsu set out with a specific project in 1985, primarily to build a "Belle on a chip". He did, and used that to win a couple of ACM events before he got into the "multiple copies of Belle on a chip". IBM picked up on that, and the two-level parallel search was developed. So his "stuff" was based primarily on raw speed, trying to avoid any sort of "selectivity" and the inherent error that produced. It was _insanely_ strong, however, as all of us who played the thing remember, yourself included. It was somewhat similar to the Chess 4.x revelation from Slate/Atkin that ended the highly selective search approach that had been used in all programs prior to that.

I should add that many of their "issues" could have been addressed. Null-move could have been done in the software part of the search, which was basically about 2/3 of any pathway they searched; only the last 3 to 5 plies were hardware. Hsu had addressed the lack of hashing in the chess hardware, but did not have time to design the necessary memory system to support 16 simultaneous reads/writes from the group of chess processors that would connect to it.

Our current forward-pruning would be highly problematic, because what we are pruning is exactly that part of the tree (the last 4 plies) that would be given to the chess hardware. And the SP was not an optimal platform; a fully-shared-memory box would have been better. One can only dream about a T932 with 16 chess processors per CPU... That would be interesting...
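
As a rough illustration of the parallel-efficiency point quoted above, here is a minimal Python sketch. The function name `effective_nps` is made up for this example, and the 10% efficiency and the 200M to 1B nps range are simply the figures rbarreira attributed to Hsu, not measurements of mine.

```python
def effective_nps(raw_nps: float, parallel_efficiency: float) -> float:
    """Scale a raw node rate by parallel efficiency to approximate the
    node rate of an equivalent single-threaded search.

    parallel_efficiency is a fraction in (0, 1]; 0.10 means 10%.
    """
    if not 0.0 < parallel_efficiency <= 1.0:
        raise ValueError("parallel_efficiency must be in (0, 1]")
    return raw_nps * parallel_efficiency


if __name__ == "__main__":
    # Figures quoted in the thread (assumed, not verified here).
    for raw in (200e6, 1e9):
        eff = effective_nps(raw, 0.10)
        print(f"{raw:,.0f} raw nps -> about {eff:,.0f} effective nps")
```

This is only the simple scaling argument; it says nothing about how the extensions or the hardware/software split affected playing strength.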

Don · Post by **Don** » Mon Sep 13, 2010 8:19 pm

bob wrote:
Don wrote:
rbarreira wrote:Hsu said that Deep Blue's parallel efficiency was about 10%, so those 200m - 1bn nodes per second should be scaled down by a factor of about 10 to get something similar to the effective NPS of a single-threaded search. This was due to the massive parallelism it employed, I guess.

I think I believed that Deep Blue was much better than it actually was. It was impressive in 1997, but it seems that Deep Blue was inferior in every chess-specific way except for raw nodes per second, due to the constraints of hardware.

So using Deep Blue as a reference point is not going to work.
I'm not saying it was "lousy software". But Hsu set out with a specific project in 1985, primarily to build a "Belle on a chip". He did, and used that to win a couple of ACM events before he got into the "multiple copies of Belle on a chip". IBM picked up on that, and the two-level parallel search was developed. So his "stuff" was based primarily on raw speed, trying to avoid any sort of "selectivity" and the inherent error that produced. It was _insanely_ strong, however, as all of us who played the thing remember, yourself included. It was somewhat similar to the Chess 4.x revelation from Slate/Atkin that ended the highly selective search approach that had been used in all programs prior to that.

Those guys were good engineers, but as you have said the chess part was already out of date.

I had a lot of discussions with Murray and Hsu, and what stood out to me was that they were MUCH better hardware guys than computer chess guys. A case in point was that Hsu was afraid of null move pruning. I don't know how Murray felt, but Hsu was paranoid about protecting the weaknesses of the program instead of actually doing what was best. For example, he admitted to me that null move was a big improvement but thought that it might create some oversight that a micro would take advantage of.

With extensions he used similar reasoning. He wanted to extend everything imaginable in order to cover the possibility that a selective search micro might see something his hardware might miss. He also told me that they gave up several plies of search to do those extensions. Their "nominal" depth was not impressive at all. However, nominal depth does not tell the entire story, because they extended so much that a 6 ply search was probably really more like an 8 ply search or something.

I talked to Jonathan Schaeffer about this when he visited my home one year and he was aware of the same issues and we both speculated about how strong Deep Blue would be had it been programmed to win instead of to not lose.

However, I think they did end up using a very conservative form of null move pruning, and I cannot say for sure what ended up in the 1997 program as far as extensions and null move and so on. Perhaps you know more of the details. I can only report on what Hsu talked to me about. Of course I realize that null move pruning may not have been present in the chips for engineering reasons.

I cannot blast him too much because in all fairness null move was a lot newer then than now. I don't actually remember the degree of acceptance back in those days.

I have to say that what they did was incredibly impressive. I'm not ignorant of the fact that what they did was not easy.

rbarreira · Post by **rbarreira** » Mon Sep 13, 2010 8:24 pm

There is some interesting information about Deep Blue here, from the authors themselves:

http://sjeng.org/ftp/deepblue.pdf

bob · Post by **bob** » Mon Sep 13, 2010 9:00 pm

Don wrote:
bob wrote:
Don wrote:
rbarreira wrote:Hsu said that Deep Blue's parallel efficiency was about 10%, so those 200m - 1bn nodes per second should be scaled down by a factor of about 10 to get something similar to the effective NPS of a single-threaded search. This was due to the massive parallelism it employed, I guess.

I think I believed that Deep Blue was much better than it actually was. It was impressive in 1997, but it seems that Deep Blue was inferior in every chess-specific way except for raw nodes per second, due to the constraints of hardware.

So using Deep Blue as a reference point is not going to work.
I'm not saying it was "lousy software". But Hsu set out with a specific project in 1985, primarily to build a "Belle on a chip". He did, and used that to win a couple of ACM events before he got into the "multiple copies of Belle on a chip". IBM picked up on that, and the two-level parallel search was developed. So his "stuff" was based primarily on raw speed, trying to avoid any sort of "selectivity" and the inherent error that produced. It was _insanely_ strong, however, as all of us who played the thing remember, yourself included. It was somewhat similar to the Chess 4.x revelation from Slate/Atkin that ended the highly selective search approach that had been used in all programs prior to that.
Those guys were good engineers, but as you have said the chess part was already out of date.

I had a lot of discussions with Murray and Hsu, and what stood out to me was that they were MUCH better hardware guys than computer chess guys. A case in point was that Hsu was afraid of null move pruning. I don't know how Murray felt, but Hsu was paranoid about protecting the weaknesses of the program instead of actually doing what was best. For example, he admitted to me that null move was a big improvement but thought that it might create some oversight that a micro would take advantage of.

With extensions he used similar reasoning. He wanted to extend everything imaginable in order to cover the possibility that a selective search micro might see something his hardware might miss. He also told me that they gave up several plies of search to do those extensions. Their "nominal" depth was not impressive at all. However, nominal depth does not tell the entire story, because they extended so much that a 6 ply search was probably really more like an 8 ply search or something.

I talked to Jonathan Schaeffer about this when he visited my home one year and he was aware of the same issues and we both speculated about how strong Deep Blue would be had it been programmed to win instead of to not lose.

However, I think they did end up using a very conservative form of null move pruning, and I cannot say for sure what ended up in the 1997 program as far as extensions and null move and so on. Perhaps you know more of the details. I can only report on what Hsu talked to me about. Of course I realize that null move pruning may not have been present in the chips for engineering reasons.

To the best of my recollection, they did not use NM. I believe Hsu or Murray at some point told me they had played around with it (obviously only in the software part of the search), but not in the Kasparov match.

I cannot blast him too much because in all fairness null move was a lot newer then than now. I don't actually remember the degree of acceptance back in those days.


It wasn't that new. This was a late 80's thing. And as far as Murray goes, he really was (a) an excellent chess player and (b) an excellent software type. Hsu was more about hardware of course, and could not play chess very well at all.

I have to say that what they did was incredibly impressive. I'm not ignorant of the fact that what they did was not easy.

No. But it would be nice to see that hardware doing 30-35+ ply searches using today's approaches. It would bring back the dominance they had in the late 80's and early 90's all over again.

mhull · Post by **mhull** » Mon Sep 13, 2010 9:17 pm

bob wrote:But it would be nice to see that hardware doing 30-35+ ply searches using today's approaches. It would bring back the dominance they had in the late 80's and early 90's all over again.

Will a "cluster crafty" be able to reach those depths any time soon?

Dann Corbit · Post by **Dann Corbit** » Mon Sep 13, 2010 9:28 pm

Time for a "How many angels could dance on the head of a pin?" response...

The Deep Blue and Deeper Blue teams went ultra-conservative (pruning was at a very low level just to ensure that there were no tactical oversights).

Today, we could build the very same FPGAs and get a much higher clock rate. In addition, we could do a recompile of Ivanhoe or some such source code and get a branching factor of 2.

So without much effort, I guess that the Deep Blue team today could get +1000 Elo or so.

Of course, we'll never know the answers to these questions because there is no reason for IBM to repeat the experiment, since they have nothing to gain and it would be expensive.
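
To put the "faster clock plus a branching factor of 2" idea in numbers, here is a back-of-the-envelope sketch in Python. It assumes the usual simplification that tree size grows as `ebf ** depth`, so a speedup of S buys about log(S) / log(ebf) extra plies. The speedup values and the effective branching factor of 4 used for contrast are illustrative assumptions, not Deep Blue data, and the function name is made up for this example.

```python
import math


def extra_plies(speedup: float, ebf: float) -> float:
    """Approximate extra search depth gained from a node-rate speedup,
    assuming the tree size grows as ebf ** depth."""
    if speedup <= 0 or ebf <= 1:
        raise ValueError("need speedup > 0 and ebf > 1")
    return math.log(speedup) / math.log(ebf)


if __name__ == "__main__":
    # Hypothetical speedups; compare a low EBF of 2 with an assumed EBF of 4.
    for speedup in (10, 100, 1000):
        for ebf in (2.0, 4.0):
            gain = extra_plies(speedup, ebf)
            print(f"speedup x{speedup:>4}, EBF {ebf}: about {gain:.1f} extra plies")
```

The point of the comparison is that the same hardware speedup is worth roughly twice as many plies at an EBF of 2 as at an EBF of 4, which is why the search improvements matter at least as much as the clock rate.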

Gerd Isenberg · Post by **Gerd Isenberg** » Mon Sep 13, 2010 9:35 pm

rbarreira wrote:There is some interesting information about Deep Blue here, from the authors themselves:

http://sjeng.org/ftp/deepblue.pdf

Thanks, it seems to be a preprint of the paper published in 2002 in Artificial Intelligence:
http://www.math-info.univ-paris5.fr/~bo ... epBlue.pdf

bob · Post by **bob** » Mon Sep 13, 2010 10:32 pm

Don wrote:
Gerd Isenberg wrote:
bob wrote:
uaf wrote:
bob wrote: I recall them losing one game due to a power outage, and one game due to comm problems (Fritz in Hong Kong) which is an incredible streak over 11 years.
And IIRC it was Deep Thought II that lost to Fritz and not Deep Blue as always advertised by Chessbase. Deep Blue was not yet ready.
Confusion caused by IBM. That was "deep blue prototype" which Hsu/Campbell had said was "deep blue software running on deep thought hardware". So you are correct. I was lumping them all together. Chiptest first played in 1986 with serious bugs. It won the ACM event in 1987 and every year after that, only losing the two games I mentioned to the best of my recollection, one on time due to a power failure at the Watson center, one in Hong Kong primarily caused by a comm failure...
Was the game against Mephisto at ACM 1989 the one with the power failure?
Deep Blue was remarkably strong for 1997 but it was far from being unbeatable. It was rare but it suffered draws and losses. I think we can estimate that it was about 4-5 years ahead of the PC programs. By 2002 a lot of very smart people believed that Junior or Fritz would beat it in a match. No point arguing about it because we can never know for sure.

I read somewhere (and I'll try to find it) that if you consider the various incarnations of Deep Blue that actually played in tournaments, and performance-rate its total results, it is not particularly impressive, because it only indicates something like a 200 Elo superiority over the best. But I think that of all the games it lost, a lot of them were due to unfortunate issues, so this is probably far from a fair metric (also considering that so few games were played). I think in reality it was stronger than this. A crude calculation is that if it took programs 5 years to catch it, you can guesstimate its superiority, and I think that puts it at more like 400 Elo better than anyone else.

The Deep Blue team was very humble and were a joy to talk with. At the Hong Kong tournament Murray told me that they estimated their winning chances to be right around 50%. That sounds incredible at first unless you do the math. To survive a 5 round tournament with 24 players and have a 50% chance to be the winner you must not only be the best player, but best by a good margin. If their chances were 50%, the chances of the 23 other contestants were divided up among the remaining 50% so that is pretty impressive.

But this tells you that even the Deep Blue team expected to lose games relatively frequently, just much less frequently than anyone else! When it's all soberly analyzed and all the hype removed, Deep Blue stands out as the most outstanding program of its day, but no more. (I am not sure if some early programs stand out even more, such as Belle or, even before that, the Chess 4.7 program. They were also seemingly unbeatable, so this deserves a fact check.)

If you go back to 1997 when they won the Fredkin prize, they had a FIDE equivalent rating of 2650+. I don't remember the exact number but it was _well_ beyond the Fredkin prize requirement... What micro was close to that in long games? A couple of micros had beaten GM players in blitz (Cray Blitz defeated GM players all over the place in the 90's, as a reference). So they were very strong, and based on Deep Thought vs everyone else thru the 1994 ACM-sponsored events, they were clearly well "above and beyond."

How far is debatable. But I would not use that 200 number myself since we have no data for Micros playing super-GM players at 40 moves in 2 hours.
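
For reference, the 200 and 400 Elo gaps mentioned above translate into expected scores via the standard Elo logistic formula, E = 1 / (1 + 10^(-d/400)). A small Python sketch follows; the function name is just for illustration, and the formula gives an expected score per game (wins plus half of draws), not a tournament-winning probability.

```python
def expected_score(elo_diff: float) -> float:
    """Expected per-game score for the stronger side under the standard
    Elo logistic model, given its rating advantage in Elo points."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # The two gaps discussed in the thread.
    for diff in (200, 400):
        print(f"+{diff} Elo -> expected score {expected_score(diff):.3f}")
```

Under this model a 200 point edge is roughly a 76% expected score and a 400 point edge roughly 91%, so the two estimates describe noticeably different levels of dominance.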

Uri Blass · Post by **Uri Blass** » Mon Sep 13, 2010 11:04 pm

bob wrote:
Don wrote:
Gerd Isenberg wrote:
bob wrote:
uaf wrote:
bob wrote: I recall them losing one game due to a power outage, and one game due to comm problems (Fritz in Hong Kong) which is an incredible streak over 11 years.
And IIRC it was Deep Thought II that lost to Fritz and not Deep Blue as always advertised by Chessbase. Deep Blue was not yet ready.
Confusion caused by IBM. That was "deep blue prototype" which Hsu/Campbell had said was "deep blue software running on deep thought hardware". So you are correct. I was lumping them all together. Chiptest first played in 1986 with serious bugs. It won the ACM event in 1987 and every year after that, only losing the two games I mentioned to the best of my recollection, one on time due to a power failure at the Watson center, one in Hong Kong primarily caused by a comm failure...
Was the game against Mephisto at ACM 1989 the one with the power failure?
Deep Blue was remarkably strong for 1997 but it was far from being unbeatable. It was rare but it suffered draws and losses. I think we can estimate that it was about 4-5 years ahead of the PC programs. By 2002 a lot of very smart people believed that Junior or Fritz would beat it in a match. No point arguing about it because we can never know for sure.

I read somewhere (and I'll try to find it) that if you consider the various incarnations of Deep Blue that actually played in tournaments, and performance-rate its total results, it is not particularly impressive, because it only indicates something like a 200 Elo superiority over the best. But I think that of all the games it lost, a lot of them were due to unfortunate issues, so this is probably far from a fair metric (also considering that so few games were played). I think in reality it was stronger than this. A crude calculation is that if it took programs 5 years to catch it, you can guesstimate its superiority, and I think that puts it at more like 400 Elo better than anyone else.

The Deep Blue team was very humble and were a joy to talk with. At the Hong Kong tournament Murray told me that they estimated their winning chances to be right around 50%. That sounds incredible at first unless you do the math. To survive a 5 round tournament with 24 players and have a 50% chance to be the winner you must not only be the best player, but best by a good margin. If their chances were 50%, the chances of the 23 other contestants were divided up among the remaining 50% so that is pretty impressive.

But this tells you that even the Deep Blue team expected to lose games relatively frequently, just much less frequently than anyone else! When it's all soberly analyzed and all the hype removed, Deep Blue stands out as the most outstanding program of its day, but no more. (I am not sure if some early programs stand out even more, such as Belle or, even before that, the Chess 4.7 program. They were also seemingly unbeatable, so this deserves a fact check.)
If you go back to 1997 when they won the Fredkin prize, they had a FIDE equivalent rating of 2650+. I don't remember the exact number but it was _well_ beyond the Fredkin prize requirement... What micro was close to that in long games? A couple of micros had beaten GM players in blitz (Cray Blitz defeated GM players all over the place in the 90's, as a reference). So they were very strong, and based on Deep Thought vs everyone else thru the 1994 ACM-sponsored events, they were clearly well "above and beyond."

How far is debatable. But I would not use that 200 number myself since we have no data for Micros playing super-GM players at 40 moves in 2 hours.

We clearly have data about computers that played humans at the 120/40 time control.

I remember reading that Fritz3 on a P90 could get an IM norm in games at tournament time control, so it is not correct to say that programs did not play long games.

No micro was close to 2650, but I am sure that micros were at least at the 2400 level at that time, so a 200 Elo difference between Deep Thought or the Deep Blue prototype and the best micros of the same time is not illogical.

Uri

Gerd Isenberg · Post by **Gerd Isenberg** » Mon Sep 13, 2010 11:05 pm

bob wrote:If you go back to 1997 when they won the Fredkin prize, they had a FIDE equivalent rating of 2650+. I don't remember the exact number but it was _well_ beyond the Fredkin prize requirement... What micro was close to that in long games? A couple of micros had beaten GM players in blitz (Cray Blitz defeated GM players all over the place in the 90's, as a reference). So they were very strong, and based on Deep Thought vs everyone else thru the 1994 ACM-sponsored events, they were clearly well "above and beyond."

How far is debatable. But I would not use that 200 number myself since we have no data for Micros playing super-GM players at 40 moves in 2 hours.

The 89 and 90 ACM performance of Lang's Mephisto was astonishing as well. There were hard fights with Berliner's HiTech from Carnegie Mellon at that time too, with a lot of emotion, I guess. In 1990 the same three programs from 89 finished in the top three (in a different order). DT lost to HiTech but won the tournament, as HiTech lost to Mephisto, which in turn lost to DT.

DT lost to Mephisto (89), HiTech (90) and Mchess Pro (94, round 2), the last of which looks like a win by default according to Hsu's post in rgc.
