Same conditions and adding 240 more games
Firebird 1.0 - Rybka 3.0: +70 =125 -45 (132.5/107.5)
55.21%/44.79%
And for those who like statistics, here are some details:
- from 1 to 60:
Firebird 1.0 - Rybka 3.0: +16 =25 -19
- from 61 to 120:
Firebird 1.0 - Rybka 3.0: +20 =31 -09
- from 121 to 180:
Firebird 1.0 - Rybka 3.0: +21 =29 -10
- from 181 to 240:
Firebird 1.0 - Rybka 3.0: +13 =40 -07
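
For anyone who wants to check the percentages, here is a minimal sketch that recomputes the score of each 60-game batch and of the full 240 games from the win/draw/loss counts above. The Elo figure uses the plain logistic formula, which is an assumption on my part; it is not how bayeselo derives its ratings, so the numbers will differ somewhat. The function names are only illustrative.

```python
import math

def score_fraction(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def elo_diff(score):
    """Elo difference implied by a score fraction (plain logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Win/draw/loss counts from Firebird's point of view, as posted above.
batches = {
    "1-60": (16, 25, 19),
    "61-120": (20, 31, 9),
    "121-180": (21, 29, 10),
    "181-240": (13, 40, 7),
}

# Column-wise sum of the four batches gives the overall result.
total = tuple(sum(column) for column in zip(*batches.values()))

for label, wdl in list(batches.items()) + [("total", total)]:
    s = score_fraction(*wdl)
    print(f"{label:>8}: {s:6.2%}  {elo_diff(s):+6.1f} Elo")
```

The batch totals add up to +70 =125 -45, so the per-batch figures are consistent with the overall result.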
Another Firebird-Rybka match (120 games @ 20mn+5s)
Re: Another Firebird-Rybka match (120 games @ 20mn+5s)
bayeselo on that:

Rank Name      Elo   +   - games score oppo. draws
   1 Firebird   15  17  17   240   55%   -15   52%
   2 Rybka     -15  17  17   240   45%    15   52%

ResultSet-EloRating>los
          Fi Ry
Firebird     95
Rybka      4

I would say that it is quite likely that Firebird is stronger given the parameters for these 240 games.
Re: Another Firebird-Rybka match (120 games @ 20mn+5s)
govert wrote:
    bayeselo on that:

    Rank Name      Elo   +   - games score oppo. draws
       1 Firebird   15  17  17   240   55%   -15   52%
       2 Rybka     -15  17  17   240   45%    15   52%

    ResultSet-EloRating>los
              Fi Ry
    Firebird     95
    Rybka      4

    I would say that it is quite likely that Firebird is stronger given the parameters for these 240 games.

I certainly wouldn't agree with that.
You show a rating difference of 30 Elo with margins of +/-17 for only 240 games. I don't buy into that confidence level.
Take a look at the confidence level by the CCRL team. http://www.computerchess.org.uk/ccrl/4040/
After 1445 games, they use margins of +/-16. After 239 games, they use margins of +/- 37.
Using margins of +/- 37, a performance of +30 Elo does not pass the test for being better.
Even more interesting is that the Ippo fans came out early claiming it is 100 Elo stronger than R3, yet all the testing with a large number of games shows that it may not be better at all.
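
To put a rough number on the margin question, here is a minimal sketch that estimates a 95% interval for the Elo difference directly from the +70 =125 -45 result. It assumes independent games and a normal approximation to the mean score, then maps the interval ends through the plain logistic Elo formula. This is my own back-of-the-envelope method, not what bayeselo or CCRL actually compute, and the function names are hypothetical.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction (plain logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo difference for a single two-engine match.

    Assumes independent games and a normal approximation; z=1.96 gives
    an approximate 95% interval.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points) around the mean.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return elo_diff(mean - z * se), elo_diff(mean), elo_diff(mean + z * se)

low, mid, high = elo_interval(70, 125, 45)
print(f"Elo difference: {mid:+.0f} (95% interval {low:+.0f} to {high:+.0f})")
```

Note that the width depends on the draw rate as well as on the number of games: a match with 52% draws has less per-game variance than one with few draws, so margins from different rating lists are not directly comparable game for game.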
Re: Another Firebird-Rybka match (120 games @ 20mn+5s)
Andre wrote:
    Same conditions and adding 240 more games
    Firebird 1.0 - Rybka 3.0: +70 =125 -45 (132.5/107.5)
    55.21%/44.79%
    And for those who like statistics, here are some details:
    - from 1 to 60:
    Firebird 1.0 - Rybka 3.0: +16 =25 -19
    - from 61 to 120:
    Firebird 1.0 - Rybka 3.0: +20 =31 -09
    - from 121 to 180:
    Firebird 1.0 - Rybka 3.0: +21 =29 -10
    - from 181 to 240:
    Firebird 1.0 - Rybka 3.0: +13 =40 -07

Hi Andre,
Is that a total of 360 games that you have now played?
regards
Nick
Re: Another Firebird-Rybka match (120 games @ 20mn+5s)
Hi André,
Thanks for these tests, but where can one get these games?
Franklin
Re: Another Firebird-Rybka match (120 games @ 20mn+5s)
CRoberson wrote:
    govert wrote:
        bayeselo on that:

        Rank Name      Elo   +   - games score oppo. draws
           1 Firebird   15  17  17   240   55%   -15   52%
           2 Rybka     -15  17  17   240   45%    15   52%

        ResultSet-EloRating>los
                  Fi Ry
        Firebird     95
        Rybka      4

        I would say that it is quite likely that Firebird is stronger given the parameters for these 240 games.

    I certainly wouldn't agree with that.
    You show a rating difference of 30 Elo with margins of +/-17 for only 240 games. I don't buy into that confidence level.

What about the likelihood of superiority, as reported by bayeselo: 95% in favor of Firebird?
Either you missed that metric, or I have missed something about that metric.