Another Firebird-Rybka match (120 games @ 20mn+5s)

Andre · Post by **Andre** » Wed Feb 03, 2010 5:45 am

Same conditions and adding 240 more games

Firebird 1.0 - Rybka 3.0: +70 =125 -45 (132.5/107.5)
55.21%/44.79%

And for those who like statistics, here are some details:
- from 1 to 60
Firebird 1.0 - Rybka 3.0: +16 =25 -19
- from 61 to 120:
Firebird 1.0 - Rybka 3.0: +20 =31 -09
- from 121 to 180:
Firebird 1.0 - Rybka 3.0: +21 =29 -10
- from 181 to 240:
Firebird 1.0 - Rybka 3.0: +13 =40 -07
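
For anyone who wants to re-check the totals, here is a minimal Python sketch that adds up the four 60-game blocks above and converts the overall score into an Elo difference with the standard logistic formula. The function names are made up for illustration and are not from any tool used in this thread. The plain logistic conversion has no prior and no draw model, so it should not be expected to match bayeselo's figures exactly.

```python
import math

# (wins, draws, losses) for Firebird in each 60-game block, copied from the post
blocks = [(16, 25, 19), (20, 31, 9), (21, 29, 10), (13, 40, 7)]

def totals(blocks):
    """Sum wins, draws and losses over all blocks."""
    wins = sum(b[0] for b in blocks)
    draws = sum(b[1] for b in blocks)
    losses = sum(b[2] for b in blocks)
    return wins, draws, losses

def score_fraction(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    return (wins + 0.5 * draws) / (wins + draws + losses)

def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

wins, draws, losses = totals(blocks)
score = score_fraction(wins, draws, losses)
print(f"+{wins} ={draws} -{losses}, score {score:.2%}, "
      f"Elo diff {elo_from_score(score):+.1f}")
```

The block totals reproduce the +70 =125 -45 headline result, and the implied Elo difference comes out at roughly +36 under this simple conversion.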

govert · Post by **govert** » Wed Feb 03, 2010 2:20 pm

bayeselo on that:

Rank Name       Elo    +    - games score oppo. draws
   1 Firebird    15   17   17   240   55%   -15   52%
   2 Rybka      -15   17   17   240   45%    15   52%
ResultSet-EloRating>los
          Fi Ry
Firebird     95
Rybka      4

I would say that it is quite likely that Firebird is stronger given the parameters for these 240 games.
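
As a rough cross-check on the LOS figure, here is a sketch of the common normal-approximation formula, which uses only the decisive games and ignores draws. This is not how bayeselo arrives at its number (it fits its own model to the results), so the two values need not agree. The function name is hypothetical.

```python
import math

def los_normal_approx(wins, losses):
    """Likelihood of superiority from decisive games only.

    Draws are ignored: in this approximation they carry no information
    about which side is stronger.
    """
    if wins + losses == 0:
        return 0.5  # no decisive games, no evidence either way
    z = (wins - losses) / math.sqrt(2.0 * (wins + losses))
    return 0.5 * (1.0 + math.erf(z))

# Firebird's result from the 240-game match: +70 -45
print(f"LOS ~ {los_normal_approx(70, 45):.3f}")
```

For +70 -45 this simple formula gives a value somewhat above the 95 that bayeselo prints, which is one reason to treat any single LOS number with care.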

CRoberson · Post by **CRoberson** » Fri Feb 05, 2010 3:01 am

govert wrote:bayeselo on that:
Rank Name       Elo    +    - games score oppo. draws
   1 Firebird    15   17   17   240   55%   -15   52%
   2 Rybka      -15   17   17   240   45%    15   52%
ResultSet-EloRating>los
          Fi Ry
Firebird     95
Rybka      4
I would say that it is quite likely that Firebird is stronger given the parameters for these 240 games.

I certainly wouldn't agree with that.

You show a rating difference of 30 Elo with margins of +/-17 for only 240 games. I don't buy into that confidence level.

Take a look at the confidence level by the CCRL team. http://www.computerchess.org.uk/ccrl/4040/

After 1445 games, they use margins of +/-16. After 239 games, they use margins of +/- 37.

Using margins of +/- 37, a performance of +30 Elo does not pass the test for being better.
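
To put a number on the margin argument, here is a sketch that estimates a 95% interval for the Elo difference directly from the match result (+70 =125 -45), using the per-game score variance and a normal approximation. It is a shortcut under stated assumptions (independent games, logistic Elo model, no prior) and is neither bayeselo's nor CCRL's method. The function names are hypothetical.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo difference for a head-to-head match.

    The interval is built on the mean score using the observed variance of
    per-game results (1, 0.5 or 0 points), then mapped to Elo.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(mean - z * se),
            elo_from_score(mean),
            elo_from_score(mean + z * se))

low, mid, high = elo_interval(70, 125, 45)
print(f"Elo diff {mid:+.1f}, approx. 95% interval [{low:+.1f}, {high:+.1f}]")
```

With these inputs the lower bound lands only a few Elo above zero, so under this approximation the result favours Firebird but narrowly. The high draw rate matters here: draws shrink the per-game variance, which is why the interval is tighter than a win/loss-only estimate would give.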

Even more interesting, the Ippo fans came out early claiming it is 100 Elo stronger than R3, yet all the testing with a large number of games shows that it may not be better at all.

Spacious_Mind · Post by **Spacious_Mind** » Fri Feb 05, 2010 5:19 am

Andre wrote:Same conditions and adding 240 more games

Firebird 1.0 - Rybka 3.0: +70 =125 -45 (132.5/107.5)
55.21%/44.79%

And for those who like statistics, here are some details:
- from 1 to 60
Firebird 1.0 - Rybka 3.0: +16 =25 -19
- from 61 to 120:
Firebird 1.0 - Rybka 3.0: +20 =31 -09
- from 121 to 180:
Firebird 1.0 - Rybka 3.0: +21 =29 -10
- from 181 to 240:
Firebird 1.0 - Rybka 3.0: +13 =40 -07

Hi Andre,

Is that a total of 360 games that you have now played?

regards

Nick

kingliveson · Post by **kingliveson** » Fri Feb 05, 2010 5:48 am

Hi André,

Thanks for these tests, but where can one get these games?

Franklin

govert · Post by **govert** » Fri Feb 05, 2010 8:56 am

CRoberson wrote:
govert wrote:bayeselo on that:
Rank Name       Elo    +    - games score oppo. draws
   1 Firebird    15   17   17   240   55%   -15   52%
   2 Rybka      -15   17   17   240   45%    15   52%
ResultSet-EloRating>los
          Fi Ry
Firebird     95
Rybka      4
I would say that it is quite likely that Firebird is stronger given the parameters for these 240 games.
I certainly wouldn't agree with that.

You show a rating difference of 30 Elo with margins of +/-17 for only 240 games. I don't buy into that confidence level.

What about the likelihood of superiority, as reported by bayeselo: 95% in favor of Firebird?

Either you missed that metric, or I have missed something about that metric.
