Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Discussion of computer chess matches and engine tournaments.

JManion
Posts: 205
Joined: Wed Dec 23, 2009 8:53 am

Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by JManion »

I am playing a 2+2 match. I will post the games as soon as I am done.

So far Rybka is winning 15 to 11, with 22 draws, but it is still early.
JManion
Posts: 205
Joined: Wed Dec 23, 2009 8:53 am

Re: Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by JManion »

Well, only 178 games done so far. Then, sadly, FB had an error where it ran out of time game after game.

Firebird +53/-35/=90 55.06% 98.0/178
Rybka 3 +35/-53/=90 44.94% 80.0/178
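
For anyone checking the numbers, here is a minimal Python sketch of how the score line follows from the W/L/D counts. The function name `score_summary` is made up for illustration, and the Elo estimate uses the standard logistic Elo formula, which is an added assumption and not something posted in this thread.

```python
import math

def score_summary(wins, losses, draws):
    """Return (points, games, score fraction, estimated Elo difference)."""
    games = wins + losses + draws
    points = wins + 0.5 * draws          # a draw counts as half a point
    fraction = points / games
    # Standard logistic Elo relation (assumption, not from the thread);
    # undefined for a 0% or 100% score.
    elo_diff = -400.0 * math.log10(1.0 / fraction - 1.0)
    return points, games, fraction, elo_diff

# Firebird's line from the match above: +53/-35/=90
points, games, fraction, elo_diff = score_summary(53, 35, 90)
print(f"{points}/{games} = {fraction:.2%}, Elo diff about {elo_diff:+.0f}")
```

With only 178 games the Elo figure is a rough point estimate and carries a wide error margin.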
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by kranium »

JManion wrote:Well, only 178 games done so far. Then, sadly, FB had an error where it ran out of time game after game.

Firebird +53/-35/=90 55.06% 98.0/178
Rybka 3 +35/-53/=90 44.94% 80.0/178
Many thanks for beta testing and the feedback, Josh! We've learned a lot already.

I hope you'll also test 1.0 when available (possibly ready in a couple of weeks, after more testing)
...Sentinel and I have already fixed several bugs:

-no longer occasionally displays 0.1 in eval
-no longer drops a thread in analysis mode when left running more than 11-12 hours
-movelists have been enlarged to prevent overflow in really long games
-SMP changes/improvements
-new time management
-several new UCI options:
move on ponderhit
fritz mode
and much more
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by beram »

Final result: Nunn2 test match (50 games) at long time control: 40m/40/20m/40/20
(Centrino Dual Core T8300, 2100 MHz, 128 MB hash (Fritzmark 6.14))

1 FireBird 1.0 beta w32 +9/=34/-7 52.00% 26.0/50
2 Rybka 3 32-bit +7/=34/-9 48.00% 24.0/50

So Firebird 1.0 beta beats Rybka 3 in all 3 Nunn2 matches:
At 4m2sec with 31 - 19
At 25 min with 30 - 20
At 40m/40 with 26 - 24

So you can conclude that the ELO gap between Firebird and Rybka diminishes at longer time controls. But hats off to Firebird 1.0, not bad for a beta.
Grts, Bram
:)
ernest
Posts: 2041
Joined: Wed Mar 08, 2006 8:30 pm

Re: Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by ernest »

beram wrote:
1 FireBird 1.0 beta w32 +9/=34/-7 52.00% 26.0/50
2 Rybka 3 32-bit +7/=34/-9 48.00% 24.0/50
Please be aware of the statistical noise...
The standard deviation of the score is 0.5*Sqrt(9+7) = 2 points
So with 68% probability (1 SD) Firebird has 24 to 28/50
and with 95% probability (2 SD) Firebird has 22 to 30/50
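
The same arithmetic as a minimal Python sketch. The helper names are hypothetical, and the formula is the approximation used in this post: it treats the two engines as roughly equal in strength, so only decisive games (wins plus losses) contribute to the spread of the point total.

```python
import math

def score_sd(wins, losses):
    """Approximate SD of the point total; draws add no variance here."""
    return 0.5 * math.sqrt(wins + losses)

def score_interval(points, sd, k):
    """Range of point totals within k standard deviations."""
    return points - k * sd, points + k * sd

# 40m/40 match: Firebird +9/=34/-7, i.e. 26.0/50
sd = score_sd(9, 7)
one_sd = score_interval(26.0, sd, 1)   # roughly 68% band
two_sd = score_interval(26.0, sd, 2)   # roughly 95% band
print(sd, one_sd, two_sd)
```

Both bands include 25/50, so this 50-game result on its own does not separate the two engines.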
govert
Posts: 270
Joined: Thu Jan 15, 2009 12:52 pm

Re: Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by govert »

Now what would be really interesting would be if someone with some solid statistical knowledge could calculate the following:

Given the data:
At 4m2sec with 31 - 19
At 25 min with 30 - 20
At 40m/40 with 26 - 24
What is P("ELO diff is lower at long TC") ?

I hope that I have formulated the question correctly and that someone can answer it.

EDIT:typo
ernest
Posts: 2041
Joined: Wed Mar 08, 2006 8:30 pm

Re: Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by ernest »

govert wrote:What is P("ELO diff is lower at long TC") ?
Don't know if I have "solid statistical knowledge", but let's try! :wink:

Let's only take into account the 2 results
At 4m2sec with 31 - 19
At 40m/40 with 26 - 24

For the 26-24 match, I had the detail W/L/D and calculated the SD (Standard Deviation) to be 2.
Let's assume that for the 31-19 match, the D (draw) rate was about the same. Then the SD is also 2.
Then the difference between the (4m2sec) Gauss curve centered at 31 with SD=2 and the (40m/40) Gauss curve centered at 26 with SD=2
is a Gauss curve centered at +5 with SD=2*Sqrt(2)=2.8

The probability to be on the negative side of this last Gauss curve is
P (minus infinity to minus 1.77 SD) = 4%

So the probability that ELO diff is lower at (40m/40) TC than at (4m2sec) TC is 96%

This assumes that the so-called Nunn2 opening positions (named by Chessbase, but not created by Nunn; Nunn's "real" opening sets are 10 and 20 positions, not 25 !!!) do not introduce a bias in the matches.
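
A minimal Python sketch of the calculation above, using only the standard library. It keeps the same assumptions as the post: both match scores are treated as independent normal variables with SD = 2 (the SD for the 31-19 match is itself an assumption about its draw rate), and the variable names are illustrative.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

sd_each = 2.0                    # assumed SD of each 50-game score
diff_mean = 31 - 26              # 4m2sec score minus 40m/40 score
# Variances of independent scores add, so the SD of the difference is 2*sqrt(2)
diff_sd = math.sqrt(sd_each ** 2 + sd_each ** 2)

z = -diff_mean / diff_sd         # distance from the mean to zero, in SD units
p_negative = normal_cdf(z)       # chance the true difference is below zero
p_lower_at_long_tc = 1.0 - p_negative

print(f"SD of difference = {diff_sd:.2f}, z = {z:.2f}")
print(f"P(gap is lower at 40m/40) = {p_lower_at_long_tc:.0%}")
```

Strictly speaking this is the probability under a normal approximation centered on the observed scores, not a full Bayesian answer to the question as asked.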
alpha123
Posts: 660
Joined: Sat Dec 05, 2009 5:13 am
Location: Colorado, USA

Re: Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by alpha123 »

kranium wrote:
JManion wrote:Well, only 178 games done so far. Then, sadly, FB had an error where it ran out of time game after game.

Firebird +53/-35/=90 55.06% 98.0/178
Rybka 3 +35/-53/=90 44.94% 80.0/178
Many thanks for beta testing and the feedback, Josh! We've learned a lot already.

I hope you'll also test 1.0 when available (possibly ready in a couple of weeks, after more testing)
...Sentinel and I have already fixed several bugs:

-no longer occasionally displays 0.1 in eval
-no longer drops a thread in analysis mode when left running more than 11-12 hours
-movelists have been enlarged to prevent overflow in really long games
-SMP changes/improvements
-new time management
-several new UCI options:
move on ponderhit
fritz mode
and much more
Welcome back Norman :D!

I noticed that Firebird will occasionally hang while playing tournaments. Also, it crashed after I accidentally left it analyzing for 7 hours :P....

Good luck getting those bugs out of the way,
Peter
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by kranium »

alpha123 wrote:
Welcome back Norman :D!

I noticed that Firebird will occasionally hang while playing tournaments. Also, it crashed after I accidentally left it analyzing for 7 hours :P....

Good luck getting those bugs out of the way,
Peter
Thanks Peter!
I believe both 'issues' have been resolved with the SMP changes we've implemented...
I recently left it in analysis mode for 48 hours on 2 cores...ran perfectly.
govert
Posts: 270
Joined: Thu Jan 15, 2009 12:52 pm

Re: Rybka 3 64 bit 4 CPU vs Firebird 64 bit 4 CPU 2+2 match

Post by govert »

ernest wrote:So the probability that ELO diff is lower at (40m/40) TC than at (4m2sec) TC is 96%
Thank you!