Firebird 1.0 and 1.01: 180 games.


Uri Blass
Posts: 10268
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Firebird 1.0 and 1.01: 180 games.

Post by Uri Blass »

yanquis1972 wrote:I'm somewhat surprised that Robbo is so powerful on PPC. I understand Rybka as currently written does not translate well, while Hiarcs gets a large edge (compared to other programs) for some reason. Have you tested Robbo vs PF4? Could be interesting. At any rate, if Robbo scales linearly on a 500 MHz+ processor it would be a monster. Apparently Rybka doesn't, and Hiarcs does.

I am basing this on a HIARCS forum posting some time back of PF3 vs Rybka on one of the dedicated machines, with Hiarcs beating it pretty easily.
I think that Vas did not translate Rybka 3 to run on PPC, so your conclusion is probably wrong.

I fully expect Rybka 3 (translated to PPC) to beat PF3, but there is no Rybka 3 translated to PPC, so it is impossible to test it.

Uri
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: Firebird 1.0 and 1.01: 180 games.

Post by beram »

lkaufman wrote:"Robbo (and I presume also Firebird) is much faster than Rybka 3 but with lower quality, How can it play better than ?
so it will often see a ply deeper but with less positional knowledge and missing some tactical things as well ??? Examples please"

An extra ply is very important in computer chess. If one engine outsearches a similar one by a ply consistently, it will win decisively in a match. If you play Rybka vs. Robbo at any fixed depth, remembering to set Rybka for three plies less than Robbo as Rybka reports depth minus three, Rybka will always win any long match, typically by somewhere around 55% or so. This is the basis of my statement that the search is of higher quality in Rybka. I believe this is mostly due to less chess knowledge in Robbo, but also to other shortcuts that speed up Robbo but occasionally cause a tactic to remain hidden for an extra ply or so. The extra ply Robbo often gets is clearly enough to allow it to win timed matches at short time controls. I believe there exists some time control at which Rybka would start to win, because the value of an extra ply diminishes with depth while the value of chess knowledge does not, but what this time control is I do not know.

I do know that at about one minute per position the two programs become equal at matching moves from my huge database, so for my own analysis I would use Robbo (or a MP equivalent) if I want a quick answer but I would use Rybka (actually Rybka human as I consider it better for analyzing human games) if I intend to let the machine think for at least a minute.
I gave you just one, but very special, Anand - Kasparov example game, with a comparative analysis of Rybka and Firebird at two minutes a move in Fritz 12 (which in practice is closer to 3 minutes), and Firebird gave 7 moves different from Rybka's, all very good Kasparov moves, out of a total of 18. Earlier you said this was an isolated incident, but before this game I tried (and published) another one, between Botwinnik and Donner, and there too I found a huge difference at this long analysis time of two minutes a move.

Perhaps now you have something to chew on, because your claim of equality does not hold up anywhere in these two GM games.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Firebird 1.0 and 1.01: 180 games.

Post by lkaufman »

You don't set any time limit, only a fixed depth limit. Try setting Robbo for eight plies and Rybka 3 for five plies. You will need to use a book that has hundreds of openings, and of course play each opening twice, once with each color. I suppose the exact result may differ depending on the book used and the version of Robbo used; I used Robbo 9 for this test. Since these games go very fast, a thousand-game sample like I used shouldn't take terribly long, although I was able to run six such matches at once using my computer and software.
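
A rough sketch of the schedule described above, in Python: every opening is played twice with colors reversed, and each engine gets a fixed depth instead of a clock. The `Game` record, the `build_schedule` helper, and the three-line book are made-up names and data for illustration only, not part of any GUI or tournament tool; the depths 8 and 5 are the ones given in the post.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Game:
    opening: str      # opening line or position taken from the book
    white: str
    black: str
    white_depth: int  # fixed search depth in plies, no time limit
    black_depth: int


def build_schedule(openings: List[str], depths: Dict[str, int]) -> List[Game]:
    """Pair two engines so that each opening is played once with each color."""
    first, second = list(depths)
    games = []
    for opening in openings:
        games.append(Game(opening, first, second, depths[first], depths[second]))
        games.append(Game(opening, second, first, depths[second], depths[first]))
    return games


# Depths from the post: Rybka 3 is set three plies lower than Robbo.
DEPTHS = {"Robbo": 8, "Rybka 3": 5}

if __name__ == "__main__":
    # Hypothetical three-line book; a real test would use hundreds of openings.
    for game in build_schedule(["e4 e5", "d4 d5", "c4 e5"], DEPTHS):
        print(game)
```

Playing each opening with both colors means that an opening favoring one side does not favor one engine over the whole match.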
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Firebird 1.0 and 1.01: 180 games.

Post by lkaufman »

I don't claim that the two programs are the same or even nearly the same, because the derivatives have a simpler but faster evaluation which in timed play will often lead to a different move. These differences favored Firebird in the two games you mention, but based on literally thousands of such comparisons at one minute the two performed equally (I used Robbolito 9 but there appears to be hardly any difference in move choices of Robbo and Firebird (single core) at the same depth). I would say that a conclusion based on several thousand games is a lot more meaningful than a conclusion based on two games. It's just statistics, you need big samples.
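
To put a number on "you need big samples", here is a minimal Python sketch of the usual normal-approximation margin of error for a proportion such as a move-match rate. The helper name `margin_of_error` and the sample sizes in the loop are illustrative assumptions (18 is the move count of the single game mentioned earlier in the thread), and the approximation is crude for very small samples.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the approximate 95% interval for a proportion from n trials."""
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # p = 0.5 is the worst case, giving the widest interval.
    for n in (18, 100, 1000, 10000):
        print(f"n = {n:>5}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

With 18 moves the margin is roughly 23 percentage points either way, while with several thousand moves it shrinks to the order of one or two points.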
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: Firebird 1.0 and 1.01: 180 games.

Post by beram »

lkaufman wrote:I don't claim that the two programs are the same or even nearly the same, because the derivatives have a simpler but faster evaluation which in timed play will often lead to a different move. These differences favored Firebird in the two games you mention, but based on literally thousands of such comparisons at one minute the two performed equally (I used Robbolito 9 but there appears to be hardly any difference in move choices of Robbo and Firebird (single core) at the same depth). I would say that a conclusion based on several thousand games is a lot more meaningful than a conclusion based on two games. It's just statistics, you need big samples.
My point was that you said, "I do know that at about one minute per position the two programs become equal at matching moves," and this just is not true, as I showed you for Rybka and Firebird at 2-3 minutes of analysis per move.

Perhaps you mean that when I let Rybka analyse longer, say 30 - 60 minutes a move, it is going to match Kasparov's (and Firebird's) moves, so they become equal, but that testing I leave up to you.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Firebird 1.0 and 1.01: 180 games.

Post by lkaufman »

No, I mean that they are equal at one minute per move based on matching thousands of games, not based on two games.
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: Firebird 1.0 and 1.01: 180 games.

Post by beram »

lkaufman wrote:no, I mean that they are equal at one minute per move based on matching thousands of games, not based on two games.
How much less than 100% do you consider equal?
More than 95%, or 90, 80, or 70%?
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Firebird 1.0 and 1.01: 180 games.

Post by lkaufman »

I'm not sure what the percentages you mention refer to. My result was that Rybka 3 matched 58.7% of nearly 10,000 moves while Robbo 9 matched 58.15% at one minute. So Rybka 3 did better, but I called it equal as the difference of about half a percentage point is not highly significant for the sample size.
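
As a rough check of that statement, the two match rates can be compared with a two-proportion z-test. The Python sketch below rests on assumptions: the sample size is rounded to 10,000 for both engines, and the two samples are treated as independent, although both engines were tested on the same positions, so a paired test would be the stricter choice. The helper name `two_proportion_z` is made up for this example.

```python
import math


def two_proportion_z(p1: float, p2: float, n1: int, n2: int):
    """Return (z, two-sided p-value) for the difference of two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    N = 10_000  # assumption: "nearly 10,000 moves", same count for both engines
    z, p = two_proportion_z(0.587, 0.5815, N, N)
    print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

Under these assumptions z comes out below 1, well short of the conventional 1.96 threshold, which is consistent with calling the two results equal.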
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: Firebird 1.0 and 1.01: 180 games.

Post by beram »

lkaufman wrote:I'm not sure what the percentages you mention refer to. My result was that Rybka 3 matched 58.7% of nearly 10,000 moves while Robbo 9 matched 58.15% at one minute. So Rybka 3 did better, but I called it equal as the difference of about half a percentage point is not highly significant for the sample size.
Then we are indeed talking about different things here. I mean that in the two games where I compared the analysis moves of Rybka and Firebird, I found them equal only about 50 - 55% of the time.
frcha
Posts: 221
Joined: Thu Jan 28, 2010 5:47 pm

Re: Firebird 1.0 and 1.01: 180 games.

Post by frcha »

lkaufman wrote:You don't set any time limit, only fixed depth limit. Try setting Robbo for eight plies and Rybka 3 for five plies. You will need to use a book that has hundreds of openings, and of course play each opening twice, once with each color. I suppose the exact result may differ depending on the book used and the version of Robbo used; I used Robbo 9 for this test. Since these games go very fast, a thousand game sample like I used shouldn't take terribly long, although I was able to run six such matches at once using my computer and software.
OK, thanks. I have an interesting question for you. I did try what you said, but with not enough games; so far you could be right, with Rybka leading by one, but 40 games is too few. I don't have the time right now to finish it.

However, I thought of trying the same experiment with OTHER ENGINES. So I tried Crafty, Fritz 12, Toga and Hiarcs.
Crafty and Fritz at depth 8 kept getting beaten so badly by Rybka that I discontinued the test. Then I tried Hiarcs. Hiarcs won almost every game!

So my question is: since my sample size is way too small, have you tried Hiarcs 12 vs Rybka in this test, and what was your result?

If Hiarcs 12 has a far better evaluation than Rybka, then how does Rybka beat Hiarcs in the ratings?

And if the above is true (Hiarcs is better at eval), then Robbo might not lose to Rybka even at long tc!