Rybka odds matches and the strength of engines

Laskos · Post by **Laskos** » Sat Jun 09, 2012 7:16 pm

I ran a Rybka self-play randomizer match (search window 3cp) with the "pawn and move" handicap (f7 removed). I don't know whether this has been done before; maybe some Rybka Forum members would know.

[Diagram: starting position with Black's f7 pawn removed]

200 games at 1s/move

169:31
+145 =48 -7

That is a handicap of roughly 300 computer Elo points.
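The ~300 figure can be reproduced from the match result with the usual logistic Elo formula. This is a minimal sketch, assuming the standard 400-point logistic model; the function name `elo_diff_from_score` is only illustrative.

```python
import math


def elo_diff_from_score(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match result under the logistic Elo model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# +145 =48 -7 is 169 points out of 200, i.e. 84.5%
print(round(elo_diff_from_score(145, 48, 7)))  # 295
```

With only 200 games the figure carries a sizeable error margin, so "~300" is a rounding, not a precise measurement.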

In 2008 Rybka played 8 games with this handicap against GM Roman Dzindzichashvili, and 2 games against GM Vadim Milov, performing at a 2550 FIDE Elo level. This is, AFAIK, the last series of computer-GM games played in stable conditions.

Now a bit of speculation: engines have improved since 2008 by some ~150 Elo points, for a total difference of ~450 computer Elo points, meaning some ~350 human Elo points (could someone confirm that computer ratings exaggerate the differences?). Therefore a recent top engine on a quad (at tournament TC) would be ~2550+350 ≈ 2900 Elo points on the FIDE scale. That seems a bit low compared to the Elo 3200 that many assume for these engines.
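The arithmetic behind this estimate, written out as a small sketch. All inputs are the figures quoted in this post, not measurements; the ~350 human-Elo equivalent in particular is a guess, and the function name `fide_estimate` is only illustrative.

```python
def fide_estimate(base_performance: int, handicap_elo: int,
                  engine_gain_elo: int, human_equivalent_elo: int):
    """Combine the post's figures into a FIDE-scale estimate.

    Returns (total computer Elo gap, implied compression factor, estimate).
    """
    total_computer_elo = handicap_elo + engine_gain_elo
    # Ratio of human Elo to computer Elo implied by the guessed conversion
    compression = human_equivalent_elo / total_computer_elo
    estimate = base_performance + human_equivalent_elo
    return total_computer_elo, compression, estimate


# 2550 performance at odds, 300 handicap, 150 gain since 2008, 350 guessed
total, compression, estimate = fide_estimate(2550, 300, 150, 350)
print(total, round(compression, 2), estimate)  # 450 0.78 2900
```

So the 2900 figure rests on an assumed compression of about 0.78 human Elo points per computer Elo point; with no compression at all the same inputs would give 3000.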

Kai

hgm · Post by **hgm** » Sat Jun 09, 2012 8:50 pm

This has been done before (indeed on the Rybka forum), and the results were not as extreme as what you report. Over a far larger number of games the white advantage was ~72%, IIRC. This is in good agreement with what I found in self-play tests of Fairy-Max or Joker, and it seemed almost completely independent of the level of play.
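A rough way to see how far apart 84.5% over 200 games and ~72% are is a confidence interval on the match score. This is only a sketch, assuming independent games and a normal approximation; the function name `score_interval` and the 95% level (z = 1.96) are choices made for illustration.

```python
import math


def score_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Approximate confidence interval for the mean score per game."""
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    # Per-game variance of the outcome, which takes values 1, 0.5 or 0
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * (0.0 - mean) ** 2) / games
    std_error = math.sqrt(variance / games)
    return mean - z * std_error, mean + z * std_error


low, high = score_interval(145, 48, 7)
print(f"{low:.3f} .. {high:.3f}")
```

Under these assumptions the interval stays well above 0.72, so sampling noise alone would not account for the gap; differences in time control, randomizer settings or the exact handicap position are more likely explanations.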

Laskos · Post by **Laskos** » Sat Jun 09, 2012 9:14 pm

hgm wrote:This has been done before, (indeed on Rybka forum), and the results were not as extreme as what you report. In a far larger number of games the white advantage was ~72%, IIRC. This is in good agreement with what I found in self-play tests of Fairy-Max or Joker, and seemed almost completely independent on the level of play.

OK, but it seems to depend on the time control (more equal at longer TC, so it depends on the level?). So, is a 2900 FIDE Elo level plausible for present top engines on a quad? I am a bit surprised (the commonly accepted level is >3000).

Kai

Uri Blass · Post by **Uri Blass** » Sat Jun 09, 2012 9:48 pm

Laskos wrote:I put a Rybka self-play randomizer match (search window 3cp) with the "pawn and move" handicap (f7 removed). I don't know if this was done before, maybe some Rybka Forum members would know.

[d]

200 games at 1s/move

169:31
+145 =48 -7

Some ~300 computer Elo points handicap.

In 2008 Rybka played 8 games with this handicap against GM Roman Dzindzichashvili, and 2 games against GM Vadim Milov, performing at Elo 2550 FIDE level. This is AFAIK the last series of computer-GM games in stable conditions.

Now a bit of speculations: engines improved since 2008 by some ~150 Elo points, for a total difference of ~450 computer Elo points, meaning some ~350 human Elo points (could someone confirm that computer ratings are exaggerating the differences?). Therefore a recent top engine on a quad (and tournament TC) would be ~2550+350 ~ 2900 Elo points on FIDE ratings. Seems a bit low compared to Elo 3200 assumed by many for these engines.

Kai

I believe that the difference in playing strength between having and not having the pawn advantage is clearly larger against humans than in comp-comp games.

I also do not think that using a self-play randomizer at 1 second per move is a good way to estimate the computer Elo difference.

A self-play randomizer at 1 second per move means two things:
1) a very fast time control that you do not use against humans
2) weaker playing strength relative to not using a randomizer.

Sedat Canbaz · Post by **Sedat Canbaz** » Sat Jun 09, 2012 11:32 pm

Laskos wrote:
hgm wrote:This has been done before, (indeed on Rybka forum), and the results were not as extreme as what you report. In a far larger number of games the white advantage was ~72%, IIRC. This is in good agreement with what I found in self-play tests of Fairy-Max or Joker, and seemed almost completely independent on the level of play.
Ok, but seems to depend on the time control (more equal at longer TC, therefore depends on the level?). So, a 2900 Elo FIDE level seems plausible for present top engines on a quad? I am bit surprised (the commonly accepted level is >3000).

Kai

Actually, I tested Rybka without a full pawn, and its performance was approx. 240 Elo weaker than Rybka (default, with all pieces).

Rybka 4.1 WP x64 1c:
http://www.sedatcanbaz.com/chess/ratings/scct-auto232/
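To compare a rating gap like this with a match percentage, the Elo difference can be converted into an expected score. A minimal sketch, assuming the standard logistic Elo model; `expected_score` is an illustrative name, not something taken from the rating list above.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score of the stronger side for a given Elo advantage."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# A 240 Elo gap corresponds to an expected score of just under 80%
print(f"{expected_score(240):.3f}")  # 0.799
```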

For more details about Human vs Engine Elo calculations:
http://www.talkchess.com/forum/viewtopi ... 1&start=20
http://www.talkchess.com/forum/viewtopi ... 4&start=50

Best,
Sedat

Laskos · Post by **Laskos** » Sun Jun 10, 2012 1:41 am

Uri Blass wrote:
Laskos wrote:I put a Rybka self-play randomizer match (search window 3cp) with the "pawn and move" handicap (f7 removed). I don't know if this was done before, maybe some Rybka Forum members would know.

[d]

200 games at 1s/move

169:31
+145 =48 -7

Some ~300 computer Elo points handicap.

In 2008 Rybka played 8 games with this handicap against GM Roman Dzindzichashvili, and 2 games against GM Vadim Milov, performing at Elo 2550 FIDE level. This is AFAIK the last series of computer-GM games in stable conditions.

Now a bit of speculations: engines improved since 2008 by some ~150 Elo points, for a total difference of ~450 computer Elo points, meaning some ~350 human Elo points (could someone confirm that computer ratings are exaggerating the differences?). Therefore a recent top engine on a quad (and tournament TC) would be ~2550+350 ~ 2900 Elo points on FIDE ratings. Seems a bit low compared to Elo 3200 assumed by many for these engines.

Kai
I believe that the difference in playing strength against humans between having pawn advantage and not having pawn advantage is clearly
more than the difference in comp-comp games.

I don't know, are you sure?

I also do not think that using self-play randomizer at 1 second per move is a good way to estimate elo computer difference.

self play randomizer at 1 second per move
means 2 things:
1)very fast time control that you do not use against humans
2)weaker playing strength relative to not using a randomizer.

Actually 1s/move is not so fast; the difference at slower TC would be even smaller (Sedat is showing 240 points instead of 300), and I am already surprised that the difference is not larger. The weakening from the 3cp window is not that important (I guess).

Kai

Laskos · Post by **Laskos** » Sun Jun 10, 2012 1:43 am

Sedat Canbaz wrote:
Laskos wrote:
hgm wrote:This has been done before, (indeed on Rybka forum), and the results were not as extreme as what you report. In a far larger number of games the white advantage was ~72%, IIRC. This is in good agreement with what I found in self-play tests of Fairy-Max or Joker, and seemed almost completely independent on the level of play.
Ok, but seems to depend on the time control (more equal at longer TC, therefore depends on the level?). So, a 2900 Elo FIDE level seems plausible for present top engines on a quad? I am bit surprised (the commonly accepted level is >3000).

Kai
Actually i tested Rybka without full pawn and its performance was approx.240 elo weaker than Rybka (default-with all peaces)

Rybka 4.1 WP x64 1c:
http://www.sedatcanbaz.com/chess/ratings/scct-auto232/

For more details about Human vs Engine Elo calculations:
http://www.talkchess.com/forum/viewtopi ... 1&start=20
http://www.talkchess.com/forum/viewtopi ... 4&start=50

Best,
Sedat

Is this the same (f7) "move and pawn" handicap? Thanks for the links. I was not trying to define human ratings relative to computers in general, but it occurred to me that a GM rated 2500-something, obscure to me, actually drew an 8-game match against Rybka 3 on a quad (IIRC) at pawn odds, and a stronger GM beat Rybka. Also, I am trying to imagine what take-back odds (taking back a move N times) could mean.

Kai

Sedat Canbaz · Post by **Sedat Canbaz** » Sun Jun 10, 2012 1:50 am

Laskos wrote:
Sedat Canbaz wrote:
Laskos wrote:
hgm wrote:This has been done before, (indeed on Rybka forum), and the results were not as extreme as what you report. In a far larger number of games the white advantage was ~72%, IIRC. This is in good agreement with what I found in self-play tests of Fairy-Max or Joker, and seemed almost completely independent on the level of play.
Ok, but seems to depend on the time control (more equal at longer TC, therefore depends on the level?). So, a 2900 Elo FIDE level seems plausible for present top engines on a quad? I am bit surprised (the commonly accepted level is >3000).

Kai
Actually i tested Rybka without full pawn and its performance was approx.240 elo weaker than Rybka (default-with all peaces)

Rybka 4.1 WP x64 1c:
http://www.sedatcanbaz.com/chess/ratings/scct-auto232/

For more details about Human vs Engine Elo calculations:
http://www.talkchess.com/forum/viewtopi ... 1&start=20
http://www.talkchess.com/forum/viewtopi ... 4&start=50

Best,
Sedat
Is this the same (f7) "move and pawn" handicap?

Kai

Dear Kai,

Rybka 4.1 x64 1c played at a handicap, without a full pawn (e2 and e7). For more details:
http://rybkaforum.net/cgi-bin/rybkaforu ... 3;hl=sedat

Best,
Sedat

Uri Blass · Post by **Uri Blass** » Sun Jun 10, 2012 1:52 am

Laskos wrote:
Uri Blass wrote:
Laskos wrote:I put a Rybka self-play randomizer match (search window 3cp) with the "pawn and move" handicap (f7 removed). I don't know if this was done before, maybe some Rybka Forum members would know.

[d]

200 games at 1s/move

169:31
+145 =48 -7

Some ~300 computer Elo points handicap.

In 2008 Rybka played 8 games with this handicap against GM Roman Dzindzichashvili, and 2 games against GM Vadim Milov, performing at Elo 2550 FIDE level. This is AFAIK the last series of computer-GM games in stable conditions.

Now a bit of speculations: engines improved since 2008 by some ~150 Elo points, for a total difference of ~450 computer Elo points, meaning some ~350 human Elo points (could someone confirm that computer ratings are exaggerating the differences?). Therefore a recent top engine on a quad (and tournament TC) would be ~2550+350 ~ 2900 Elo points on FIDE ratings. Seems a bit low compared to Elo 3200 assumed by many for these engines.

Kai
I believe that the difference in playing strength against humans between having pawn advantage and not having pawn advantage is clearly
more than the difference in comp-comp games.
I don't know, are you sure?

I also do not think that using self-play randomizer at 1 second per move is a good way to estimate elo computer difference.

self play randomizer at 1 second per move
means 2 things:
1)very fast time control that you do not use against humans
2)weaker playing strength relative to not using a randomizer.
Actually 1s/move is not so fast, the difference at slower TC would be even smaller (Sedat is showing 240 points instead of 300), and I am already wondering that the difference is not very large. The weakening at 3cp window is not that important (I guess).

Kai

I suspect that it is not the same f7 pawn handicap.
A smaller difference at a slower time control does not make sense unless the position is a draw, so that the slower time control helps the weaker side find the right moves to draw.

Laskos · Post by **Laskos** » Sun Jun 10, 2012 1:56 am

Uri Blass wrote:
I suspect that it is not the same f7 pawn.
It does not make sense to have a smaller difference at slower time control unless the position is a draw so slower time control helps the weaker side to find the right moves to draw.

I actually tried faster controls than 1s/move to see what happens, and the difference was larger. At Rybka depth 5 (very fast games) with the 3cp window randomizer, it took some 40 games to get a single draw; I thought I had messed something up (still a possibility).

Kai
