Another Firebird-Rybka match (120 games @ 20mn+5s)


Andre
Posts: 98
Joined: Thu Jul 23, 2009 5:40 am

Re: Another Firebird-Rybka match (120 games @ 20mn+5s)

Post by Andre »

Same conditions and adding 240 more games

Firebird 1.0 - Rybka 3.0: +70 =125 -45 (132.5/107.5)
55.21%/44.79%

And for those who like statistics, here are some details:
- from 1 to 60:
Firebird 1.0 - Rybka 3.0: +16 =25 -19
- from 61 to 120:
Firebird 1.0 - Rybka 3.0: +20 =31 -09
- from 121 to 180:
Firebird 1.0 - Rybka 3.0: +21 =29 -10
- from 181 to 240:
Firebird 1.0 - Rybka 3.0: +13 =40 -07
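
For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the score percentage and the Elo difference it implies, for each 60-game block and for the full 240 games. It assumes the standard logistic Elo formula; the function names are made up for this illustration. A rating tool such as bayeselo fits its own model, so its figure need not match this simple conversion.

```python
import math

def score_fraction(wins, draws, losses):
    """Points scored divided by games played (a draw counts as half a point)."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def elo_diff(score):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Firebird's results per 60-game block, as posted: (wins, draws, losses)
blocks = [(16, 25, 19), (20, 31, 9), (21, 29, 10), (13, 40, 7)]

# Column-wise sums give the overall wins, draws and losses
total = tuple(sum(column) for column in zip(*blocks))
total_score = score_fraction(*total)

for i, block in enumerate(blocks):
    s = score_fraction(*block)
    print(f"games {60 * i + 1}-{60 * (i + 1)}: {s:.2%}  {elo_diff(s):+.0f} Elo")
print(f"all {sum(total)} games: {total_score:.2%}  {elo_diff(total_score):+.0f} Elo")
```

The block-by-block lines show how much a 60-game sample can swing, which is worth keeping in mind when reading the overall figure.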
govert
Posts: 270
Joined: Thu Jan 15, 2009 12:52 pm

Re: Another Firebird-Rybka match (120 games @ 20mn+5s)

Post by govert »

bayeselo on that:

Rank Name       Elo    +    - games score oppo. draws
   1 Firebird    15   17   17   240   55%   -15   52%
   2 Rybka      -15   17   17   240   45%    15   52%
ResultSet-EloRating>los
          Fi Ry
Firebird     95
Rybka      4
I would say that it is quite likely that Firebird is stronger given the parameters for these 240 games.
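
The "los" table is the likelihood of superiority. As a rough cross-check, a commonly used normal approximation estimates it from the decisive games only. This is a sketch of that shortcut, not of what bayeselo does internally, and the function name is an assumption for this example.

```python
import math

def los(wins, losses):
    """Approximate likelihood of superiority from decisive games only.

    Normal approximation: draws say nothing about which side is stronger,
    so only the wins and losses enter the formula.
    """
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# Firebird's wins and losses over the 240 games
firebird_los = los(70, 45)
print(f"LOS(Firebird > Rybka) ~ {firebird_los:.1%}")
```

With these inputs the shortcut comes out at about 99%, somewhat higher than the 95 that bayeselo prints, so treat it as an illustration of the idea rather than a reproduction of the table.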
CRoberson
Posts: 2094
Joined: Mon Mar 13, 2006 2:31 am
Location: North Carolina, USA

Re: Another Firebird-Rybka match (120 games @ 20mn+5s)

Post by CRoberson »

govert wrote:bayeselo on that:

Rank Name       Elo    +    - games score oppo. draws
   1 Firebird    15   17   17   240   55%   -15   52%
   2 Rybka      -15   17   17   240   45%    15   52%
ResultSet-EloRating>los
          Fi Ry
Firebird     95
Rybka      4
I would say that it is quite likely that Firebird is stronger given the parameters for these 240 games.
I certainly wouldn't agree with that.

You show a rating difference of 30 Elo with margins of +/-17 for only 240 games. I don't buy into that confidence level.

Take a look at the confidence margins used by the CCRL team: http://www.computerchess.org.uk/ccrl/4040/

After 1445 games, they use margins of +/-16. After 239 games, they use margins of +/-37.

Using margins of +/-37, a performance of +30 Elo does not pass the test for being better.
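
How wide the margin should be for this particular match can be estimated directly from the win/draw/loss counts. The sketch below uses a normal approximation on the per-game score and assumes the games are independent; z = 1.96 is the usual two-sided 95% level, and the function names are invented for this example. It is a back-of-the-envelope check, not the method used by CCRL or bayeselo.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

def match_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo difference for a head-to-head match."""
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * (0.0 - mean) ** 2) / games
    half_width = z * math.sqrt(variance / games)
    return (elo_from_score(mean - half_width),
            elo_from_score(mean),
            elo_from_score(mean + half_width))

low, mid, high = match_interval(70, 125, 45)
print(f"Elo difference: {mid:+.0f}  (95% interval {low:+.0f} to {high:+.0f})")
```

For +70 =125 -45 this gives an interval of roughly +6 to +67 Elo on the difference. Draws lower the per-game variance, so the width depends on the draw rate as well as on the number of games, and a margin quoted for a similar game count in another list need not carry over.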

Even more interesting, all the Ippo fans came out early claiming it is 100 Elo stronger than R3, yet the testing shows that it may not be better at all. That is, all the testing with a large number of games.
Spacious_Mind
Posts: 317
Joined: Mon Nov 02, 2009 12:05 am
Location: Alabama

Re: Another Firebird-Rybka match (120 games @ 20mn+5s)

Post by Spacious_Mind »

Andre wrote:Same conditions and adding 240 more games

Firebird 1.0 - Rybka 3.0: +70 =125 -45 (132.5/107.5)
55.21%/44.79%

And for those who like statistics, here are some details:
- from 1 to 60:
Firebird 1.0 - Rybka 3.0: +16 =25 -19
- from 61 to 120:
Firebird 1.0 - Rybka 3.0: +20 =31 -09
- from 121 to 180:
Firebird 1.0 - Rybka 3.0: +21 =29 -10
- from 181 to 240:
Firebird 1.0 - Rybka 3.0: +13 =40 -07
Hi Andre,

Is that a total of 360 games that you have now played?

regards

Nick
kingliveson

Re: Another Firebird-Rybka match (120 games @ 20mn+5s)

Post by kingliveson »

Hi André,

Thanks for these tests, but where can one get these games?


Franklin
govert
Posts: 270
Joined: Thu Jan 15, 2009 12:52 pm

Re: Another Firebird-Rybka match (120 games @ 20mn+5s)

Post by govert »

CRoberson wrote:
govert wrote:bayeselo on that:

Rank Name       Elo    +    - games score oppo. draws
   1 Firebird    15   17   17   240   55%   -15   52%
   2 Rybka      -15   17   17   240   45%    15   52%
ResultSet-EloRating>los
          Fi Ry
Firebird     95
Rybka      4
I would say that it is quite likely that Firebird is stronger given the parameters for these 240 games.
I certainly wouldn't agree with that.

You show a rating difference of 30 Elo with margins of +/-17 for only 240 games. I don't buy into that confidence level.
What about the likelihood of superiority, as reported by bayeselo: 95% in favor of Firebird?

Either you missed that metric, or I have missed something about that metric.