GM and Rybka vs. Stockfish

Uri Blass · Post by **Uri Blass** » Tue Aug 12, 2014 6:00 am

A 2500 Elo engine on CCRL may be stronger than a 2500 Elo human, but I think the opposite holds at high ratings, because a 100 Elo difference in computer ratings translates to clearly less than 100 Elo in human ratings.

I believe that 3600 is clearly too optimistic for Stockfish on 8 cores, and that 3200 is more realistic.

Laskos · Post by **Laskos** » Tue Aug 12, 2014 12:21 pm

Nitro wrote: The 8-core Mac Pro would be somewhere around 16x as powerful as the MacBook (Rybka 3 ran on one core). That's a rough estimate based on the Mac Pro having 8x as many active cores, and ~2x the performance per core (Ivy Bridge Xeon @ 3 GHz vs Core2 @ 2 GHz).

That performance difference plus the engine strength difference indicates that a score of 3.5/4 is exactly in line with what you'd expect just based on the two computers alone. So in this case, there's not much evidence that the GM helped or hurt Rybka. Of course, with only 4 games, it's very hard to draw any conclusions with confidence -- except that a human GM plus the best chess engine from 2008 on 2008 hardware is almost certainly much inferior to the best chess engine from today on modern hardware.

That conclusion should be unsurprising to those who closely follow computer chess. (Indeed, I am of the opinion that rating lists typically understate engine strength relative to their human counterparts; that is, an engine rated at 2500 Elo on CCRL is probably superior to a 2500 Elo human.) And with the latest version of Stockfish on 8 modern Xeon cores, we're likely pushing ~3600 Elo -- a level of skill which is hard to fathom in human terms. So it was not surprising to me that the match turned out the way it did, but some IMs and GMs I spoke to before the match really thought Daniel not only had a chance, but even an advantage!

Thank you very much for the match, very instructive.

As for ratings, lists like CCRL have an anchor at 2,800 or so Elo points (IIRC some Shredder on one core for CCRL). Computer ratings are somewhat dilated compared to the FIDE ratings of humans. So, even if this 2,800 anchor is more or less correct, an engine rated 2,400 Elo on CCRL is probably close to 2,600 FIDE, while 3,500 Elo on CCRL is probably close to 3,200 FIDE.
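To see what these two rough data points imply in between, here is a minimal sketch that draws a straight line through them. The linear shape and the function name `ccrl_to_fide` are assumptions made for illustration; the post gives only the two estimates and does not claim a linear relationship.

```python
# Two rough estimates quoted in the post: (CCRL Elo, approximate FIDE Elo).
# The straight line between them is an illustrative assumption, not a fitted model.
POINT_LOW = (2400.0, 2600.0)
POINT_HIGH = (3500.0, 3200.0)


def ccrl_to_fide(ccrl_elo):
    """Linearly interpolate (or extrapolate) a FIDE-like rating from a CCRL rating."""
    (x1, y1), (x2, y2) = POINT_LOW, POINT_HIGH
    slope = (y2 - y1) / (x2 - x1)  # FIDE points gained per CCRL point
    return y1 + slope * (ccrl_elo - x1)
```

Under this assumption the slope is 600/1100, so 100 Elo on CCRL corresponds to roughly 55 Elo on the human scale, and `ccrl_to_fide(2800)` comes out near 2818, which is close to the 2,800 anchor.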

But you are right, 0.5/4 is what Rybka 3 would get in a match with Stockfish 5 on 16x weaker hardware, so the GM's help was minimal, if there was any. Of course, the number of games is small.
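For reference, the standard Elo logistic formula relates a match score to a rating difference. The sketch below uses that formula only; the function names are made up for illustration, and with just 4 games the resulting number carries a very wide error margin.

```python
import math


def expected_score(elo_diff):
    """Expected score (0..1) for a player rated elo_diff points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_diff_from_score(score):
    """Invert the logistic formula: rating difference implied by a score fraction."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))
```

A 3.5/4 result is a score fraction of 0.875, so `elo_diff_from_score(0.875)` gives a nominal gap of about 338 Elo in favor of the winner.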

Adam Hair · Post by **Adam Hair** » Tue Aug 12, 2014 1:34 pm

Laskos wrote:
As for ratings, lists like CCRL have an anchor at 2,800 or so Elo points (IIRC some Shredder on one core for CCRL). Computer ratings are somewhat dilated compared to the FIDE ratings of humans. So, even if this 2,800 anchor is more or less correct, an engine rated 2,400 Elo on CCRL is probably close to 2,600 FIDE, while 3,500 Elo on CCRL is probably close to 3,200 FIDE.


CEGT, IPON, and I think Frank Quisinsky use Shredder 12 64-bit at 2800 Elo as their anchor. The CCRL uses the SSDF ratings as its reference. I have never asked why, but I suppose it is because the SSDF ratings have a tenuous connection to human ratings. However, we did lower the ratings by 100 Elo a couple of years ago.

Sedat Canbaz · Post by **Sedat Canbaz** » Thu Aug 14, 2014 12:08 am

I'd just like to add that
SCCT should not be forgotten )) and can be considered very close to human ratings too:
http://www.sedatcanbaz.com/chess/?page_id=82

As the reference,
I take Deep Fritz 10 on 6 cores, with a starting Elo of 3000 for the calculation.

SCCT (3m+2s) i7 980X 3.33GHz Elostat
Rybka 4.1 x64 6 cores	3308 Elo

SCCT (3m+2s) i7 980X 3.33GHz Ordo
Rybka 4.1 x64 6 cores	3372 Elo

CCRL (40/4) AMD 4600 2.40GHz
Rybka 4.1 64-bit 4CPU	3198 Elo

CCRL (40/40) AMD 4600 2.40GHz
Rybka 4 64-bit 4CPU	3161 Elo

SSDF (40/120) Q6600 2.40GHz
Deep Rybka 4 x64 4 cores	3209 Elo

Notes:
- When the processor speed is doubled, the MP Elo difference is mostly between 50 and 100 Elo
- It is true (in CCRL) that with slower time controls the engines' Elo ratings go down, by approx. 50 Elo
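The per-doubling figure in the notes can be turned into a rough estimate for any hardware ratio. This is a minimal sketch that assumes the gain per doubling stays constant (in practice it tends to shrink at longer time controls); `elo_from_speedup` is a hypothetical helper name.

```python
import math


def elo_from_speedup(speedup, elo_per_doubling):
    """Estimate the Elo gain from running `speedup` times faster.

    Assumes a constant gain per doubling of speed, which is a simplification.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return elo_per_doubling * math.log2(speedup)
```

For the roughly 16x hardware ratio estimated earlier in the thread, that is 4 doublings, so `elo_from_speedup(16, 50)` and `elo_from_speedup(16, 100)` bracket the gain at about 200 to 400 Elo.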

And as I mentioned before (apart from SCCT),
CCRL and SSDF ratings match human ratings almost perfectly!!

Sedat Canbaz · Post by **Sedat Canbaz** » Thu Aug 14, 2014 2:54 pm

Uri Blass wrote: A 2500 Elo engine on CCRL may be stronger than a 2500 Elo human, but I think the opposite holds at high ratings, because a 100 Elo difference in computer ratings translates to clearly less than 100 Elo in human ratings.

I believe that 3600 is clearly too optimistic for Stockfish on 8 cores, and that 3200 is more realistic.

Uri,

I have stated this several times... I can't remember exactly how many, but at least 20 times, and I feel like a parrot ))

Actually, I published a rating list as clear proof (with the useful database by Ed Schröder):
http://www.talkchess.com/forum/viewtopi ... ight=sedat

I even announced challenges several times...
for those who still don't believe that the difference will be more than 500 Elo

But it seems some chess friends love comments more than reality... ))!!

And why are we only talking??? "I believe", "I think", etc...
Believe me, I am tired of comments... no more, no less...

It's time to see this in reality...!
But next time, please, no comments... only a challenge, and my contact address is very clear!
http://www.sedatcanbaz.com/chess/?page_id=209

I also wonder why no GM is contacting me??
Note: I promise not to use the Perfect 16 book (out of respect... and it is outdated)
But if any GM is interested, he can play in gauntlet mode against the SCCT Book participants + Stockfish

Once more I'd like to mention:
I will give a prize of 10,000 USD, and a 10% commission to the person who finds me a GM
For more details:
http://www.talkchess.com/forum/viewtopi ... t&start=10

BTW, you can't say that 3600 is clearly too optimistic for Stockfish on 8 cores and that 3200 is more realistic.

Do you know why?
1) Hardware speed plays a BIG role
For example, the Elo difference between an AMD 4600 2.40 GHz and an i7 980X 3.33GHz is expected to be approx. 200-250 Elo

2) The opening book plays another BIG role too
Depending on the kind of opening book used,
engine X can perform approx. 0-200 Elo points better, or even more:
http://www.sedatcanbaz.com/chess/?page_id=473

3) Junior 6, Fritz 6, or Movei 00.8.438... are not going to be used
We plan to use the latest version of Stockfish, which is at least 700 Elo above the engines mentioned above!!!

And here are some more examples:

1) There will be no big Elo difference if we run 8 cores vs 2 cores

2x Intel Xeon E5310       1.60 GHz     8       7834
Intel Core 2 Duo E8500  @ 4.90 GHz     2       7105

2) There will be no big Elo difference if we run 8 cores vs 4 cores

AMD FX-8150 Zambezi       3.60 GHz     8      11867
Intel Core i5 4670         3.4 GHz     4      11700
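The two comparisons above can be checked with a little arithmetic. This sketch assumes the last column is a multi-threaded benchmark score that scales in proportion to engine speed (optimistic, since parallel search scales less than linearly), and it reuses the 50-100 Elo per doubling range from the earlier post. The name `elo_gap_range` is hypothetical.

```python
import math


def elo_gap_range(score_a, score_b, per_doubling=(50.0, 100.0)):
    """Return (low, high) Elo gap implied by two benchmark scores.

    Assumes engine speed is proportional to the benchmark score and that
    each doubling of speed is worth a fixed number of Elo points.
    """
    doublings = math.log2(score_a / score_b)
    return tuple(gain * doublings for gain in per_doubling)


# Benchmark scores copied from the two tables above.
gap_8_vs_2_cores = elo_gap_range(7834, 7105)
gap_8_vs_4_cores = elo_gap_range(11867, 11700)
```

Under these assumptions the first pair differs by roughly 7 to 14 Elo and the second by roughly 1 to 2 Elo, which matches the claim that the core count alone does not decide much when the overall benchmark scores are close.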

And lastly,
I expect Stockfish to be over 3500 Elo in blitz (when using a superior, strong book + i7 980 @ 4.33 GHz)

Hope this helps this time!

Modern Times · Post by **Modern Times** » Thu Aug 14, 2014 3:15 pm

Adam Hair wrote: The CCRL uses the SSDF ratings as its reference. I have never asked why, but I suppose it is because the SSDF ratings have a tenuous connection to human ratings. However, we did lower the ratings by 100 Elo a couple of years ago.

It was because at the time we started, CCRL 40/40 on the reference AMD X2 4600+ seemed very similar to SSDF 40/120 on an Athlon 1200.

Some years later we dropped everything by 100 Elo, as the ratings for the top engines seemed to be very high.

M ANSARI · Post by **M ANSARI** » Fri Aug 15, 2014 1:09 am

My guess is that Rybka 3 with a custom book would do much better in a match against SF 5 with no book. I really believe that in this case the human GM was the weak link in this equation. I think for a long time even the strongest engines had a weaker evaluation than the strongest humans but would compensate for it with dramatically stronger tactical abilities. This is not so and has not been so for quite some time. You will find some endgames that humans still understand better and could outplay the top engine, but these few areas of weakness are disappearing day by day.

wims · Post by **wims** » Fri Aug 15, 2014 4:02 am

M ANSARI wrote:My guess is that Rybka 3 with a custom book would do much better in a match against SF 5 with no book. I really believe that in this case the human GM was the weak link in this equation. I think for a long time even the strongest engines had a weaker evaluation than the strongest humans but would compensate for it with dramatically stronger tactical abilities. This is not so and has not been so for quite some time. You will find some endgames that humans still understand better and could outplay the top engine, but these few areas of weakness are disappearing day by day.

That's pretty much what I was thinking as well: the human is the weak link holding back the engine. I'm guessing a strong player with an excellent understanding of chess engines would do better than a GM with next to no understanding of engines.

Uri Blass · Post by **Uri Blass** » Fri Aug 15, 2014 10:29 am

Sedat Canbaz wrote:
BTW, you can't say that 3600 is clearly too optimistic for Stockfish on 8 cores and that 3200 is more realistic.

I expect Stockfish to be over 3500 Elo in blitz (when using a superior, strong book + i7 980 @ 4.33 GHz)


The match between the GM with Rybka and Stockfish was not blitz. So when I said that I believe top engines are at a level of 3200, I was not referring to blitz conditions of 3 minutes per game, but to the time control the GM played at in the match, when he and Rybka lost 3.5:0.5, which is practically 45 minutes per game + 30 seconds per move.

Uri Blass · Post by **Uri Blass** » Fri Aug 15, 2014 10:33 am

M ANSARI wrote:My guess is that Rybka 3 with a custom book would do much better in a match against SF 5 with no book. I really believe that in this case the human GM was the weak link in this equation. I think for a long time even the strongest engines had a weaker evaluation than the strongest humans but would compensate for it with dramatically stronger tactical abilities. This is not so and has not been so for quite some time. You will find some endgames that humans still understand better and could outplay the top engine, but these few areas of weakness are disappearing day by day.

My guess is that Rybka is going to score no better than 0.5 out of 4 if you use a book that is not designed to take advantage of Rybka's weaknesses (remember also that Stockfish gets a significant hardware advantage of 8 cores against 1 core, and the cores that Stockfish uses are twice as fast).
