Firebird 1.0 and 1.01: 180 games.


lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Firebird 1.0 and 1.01: 180 games.

Post by lkaufman »

Yes, Rybka was "self-tuned", but on the other hand Vas included lots of eval that wasn't strictly helpful in timed self-play, on the theory that it should help against other engines and at long time controls. Robbo removed much or all of this borderline eval.
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Firebird 1.0 and 1.01: 180 games.

Post by yanquis1972 »

that's good to hear, as it's always been my impression that vas is every bit the elo fiend the ippolit guys are. however i really don't see this 'missing knowledge' being much of a detriment to the eval in most positions, if any, but i haven't examined this thoroughly.

on a related subject, i noticed you keep mentioning time management, when it's always been my impression rybka was superior to all others in this regard (though robbo may have caught/surpassed it, at least short TC) -- again in relation to what i've perhaps mistakenly observed as vas's desire to dominate comp-comp rating lists above all else.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Firebird 1.0 and 1.01: 180 games.

Post by lkaufman »

Time management was terrible up through Rybka 2.3.2a. We did spend a few days on it at the end for Rybka 3, but it was pretty much an afterthought; Vas was busy getting R3 ready for release. Maybe it's much better for 40/x games like those CCRL and CEGT run than at game/x + increment levels; I don't know. Has there been much testing of Robbo et al. vs. Rybka at 40/x levels?
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Firebird 1.0 and 1.01: 180 games.

Post by yanquis1972 »

a considerable amount, unfortunately a lot of it is tucked away in a private rybka forum -- i'm sure you could get access but there's a 500 post requirement for viewing, i don't know if you have that or not? i'm also not sure that i'm supposed to talk about it but oh well.

anyway, the results of 40/20 - 40/40 - 30+15 matches, between robbo + ivanhoe vs R3 0 contempt, have demonstrated rough equality. one guy ran 300 30+15 games with an early ivanhoe against R3 and got a ~+25 elo result (4CPU each) for the deriv. paul has run matches of robbo vs r3 @ 40/40 (i think, possibly 40/20) with r3 coming out ahead by less than 10 elo. firebird in all the tests i've seen so far has been about = on 4 cores against R3 at LTC.

essentially, equality, although i still cannot figure out demonstrably why (outside of the common sense reasons given by some). i don't have the patience to run long sets of LTC games on my one machine, but robbo clearly has an edge at fast TC, and ivanhoe seems to be faster on the whole at solving test positions that require several minutes of think-time than rybka 3 is. maybe that last is incorrect, it's only my impression & it may be selective -- it also may not correspond to game performance.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Firebird 1.0 and 1.01: 180 games.

Post by lkaufman »

I can check on that private Rybka forum but it sounds like the type of time control doesn't matter much, it comes down to the derivatives being stronger at blitz and near-blitz and rough equality at intermediate speeds. This is consistent with what I've been saying. Presumably Rybka 3 would come out ahead at 40/2 if anyone ran hundreds of such games. On matching moves from high-level games at one minute I got a slightly better result for R3 over Robbo over nearly 10,000 positions. I expect Robbo would score better on positions selected for tactical reasons.
Karlo Bala
Posts: 373
Joined: Wed Mar 22, 2006 10:17 am
Location: Novi Sad, Serbia
Full name: Karlo Balla

Re: Firebird 1.0 and 1.01: 180 games.

Post by Karlo Bala »

lkaufman wrote: I can check on that private Rybka forum but it sounds like the type of time control doesn't matter much, it comes down to the derivatives being stronger at blitz and near-blitz and rough equality at intermediate speeds. This is consistent with what I've been saying. Presumably Rybka 3 would come out ahead at 40/2 if anyone ran hundreds of such games. On matching moves from high-level games at one minute I got a slightly better result for R3 over Robbo over nearly 10,000 positions. I expect Robbo would score better on positions selected for tactical reasons.
Time control by itself means nothing.
Which is the longer time control: 100+10 on an A2000+ or 10+1 on an i920?
I propose a different way of measuring time: meganodes per game.

For example,
Time 5+3
Robbolito, 2 Mn/s

Time control ~ (5*60 s + 3 s * 60 [average moves per game]) * 2 Mn/s = 960 Mn/game

This way you could compare engine strength on different hardware.
Best Regards,
Karlo Balla Jr.
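
The meganodes-per-game calculation in the post above can be sketched in a few lines of Python. This is only an illustration: the function name and the default of 60 moves per game are assumptions taken from the example in the post, not part of any testing tool.

```python
def meganodes_per_game(base_minutes, increment_seconds, speed_mn_per_s, avg_moves=60):
    """Estimate one side's search budget for a whole game, in meganodes.

    avg_moves is an assumed average game length (60, as in the post).
    """
    # Total thinking time in seconds: base time plus one increment per move.
    total_seconds = base_minutes * 60 + increment_seconds * avg_moves
    # Convert time to nodes using the engine's speed on this hardware.
    return total_seconds * speed_mn_per_s


# The 5+3 example at 2 Mn/s from the post.
print(meganodes_per_game(5, 3, 2.0))  # 960.0
```

Running the same function with each machine's measured speed would give a rough answer to the 100+10 versus 10+1 question, provided the speeds are measured with the same engine.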
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Firebird 1.0 and 1.01: 180 games.

Post by lkaufman »

I believe that some of the testing organizations already do something like this; they specify a time control for specific hardware, then run a test on an individual's hardware and have him adjust his time limit accordingly. But for casual conversation, it's not such a critical issue because most of the speed difference among processors comes from how many are used, so if we're talking about a single-core matchup (or a four-core matchup or whatever) there is only about a two to one ratio in speed between the fastest computer and any recent-vintage laptop.
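
The hardware adjustment described above might look roughly like the following sketch. It assumes the tester has a nodes-per-second figure for the reference machine and for his own machine from the same benchmark; the function name and the simple linear scaling are assumptions for illustration, not the documented procedure of any rating list.

```python
def adjust_time_control(base_minutes, increment_seconds, reference_nps, local_nps):
    """Scale a time control defined for reference hardware to local hardware.

    A slower local machine (local_nps < reference_nps) gets proportionally
    more time, so both machines search roughly the same number of nodes.
    """
    if local_nps <= 0 or reference_nps <= 0:
        raise ValueError("benchmark speeds must be positive")
    ratio = reference_nps / local_nps
    return base_minutes * ratio, increment_seconds * ratio


# A machine half as fast as the reference gets twice the time.
print(adjust_time_control(40, 0, 2_000_000, 1_000_000))  # (80.0, 0.0)
```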
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: Firebird 1.0 and 1.01: 180 games.

Post by beram »

lkaufman wrote: So you base the claim that Firebird is stronger than Rybka on a lead of 2 points after 72 games (at a fast tc)? The very fact that these are so close whereas head to head testing shows clear wins for the Robbo family supports my claim that Robbo's superiority largely vanishes against unrelated engines, and any that remains is easily attributed just to better TC. Let me know if you can point to any test where Firebird or any of the Robbo family has a statistically significant lead against a field of unrelated engines, preferably not at a blitz TC.
You should read my above-mentioned post: http://talkchess.com/forum/viewtopic.php?t=32514

this is an abstract: ...Currently we have a head-to head between Firebird and Rybka. ... The present tendency is a tie but not a superiority of Firebird. But there are differences in scoring against the opponents.
While Stockfish and Zappa are clearly kept in check by Firebird, Rybka doesn't score so well against them. Against Naum, the other way is round...
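
For a rough sense of what "a lead of 2 points after 72 games" means statistically, here is a minimal sketch. It assumes both engines played 72 independent games against the same field, and it uses 0.5 as the per-game standard deviation of the score, which is the worst case with no draws. The actual scores and draw rates are not given in the thread, so the function and its defaults are illustrative assumptions only.

```python
import math


def lead_z_score(lead_points, games_each, per_game_sd=0.5):
    """Approximate z-score for a points lead between two engines.

    Assumes each engine played games_each independent games and that
    per_game_sd (0.5 is the no-draws worst case) applies to both.
    """
    # Difference in average score per game.
    diff = lead_points / games_each
    # Standard error of the difference between two independent averages.
    se = per_game_sd * math.sqrt(2.0 / games_each)
    return diff / se


z = lead_z_score(2, 72)
print(round(z, 2))   # 0.33
print(z >= 1.96)     # False: well short of the usual 95% threshold
```

Draws lower the per-game spread, so the real z-score would be somewhat higher than this, but with only 72 games each a 2-point lead stays far from significance under these assumptions.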