CCRL update (1st September 2007)

Norm Pollock
Posts: 1031
Joined: Thu Mar 09, 2006 3:15 pm
Location: Long Island, NY, USA

Re: CCRL update (1st September 2007)

Post by Norm Pollock » Tue Sep 04, 2007 12:12 pm

Hi Kiril,

If both "ponder hits" and "draw analysis" point in the same direction, then the two engines are likely to think in similar ways.

Are there any "control" experiments with "ponder hits" to establish reference points for the stats? For example, calculating "ponder hits" with two identical engines, or with two intra-family engines with a 100 Elo difference in strength, or with two unrelated engines where the source code is published and there is no doubt that the engines are unrelated.

I think the "in between" issue with the "draw analysis" of Strelka printed above is due to the mixture of Rybka+Toga+Fruit. If I focus just on Strelka v Rybka 1.0 Beta, there is a clearer picture.

Strelka 1.0b 32-bit v Rybka 1.0 Beta 32-bit: 9 wins, 32 draws, 23 losses
draw rate: 50.0%

Strelka 1.8 32-bit v Rybka 1.0 Beta 32-bit: 10 wins, 33 draws, 21 losses
draw rate: 51.6%

(note: Rybka 1.0 Beta 32-bit was sometimes previously mentioned as Rybka Beta 32-bit)

(inter-family draw rate 29.8%, intra-family draw rate 50.1%)

(note: these Strelka v Rybka 1.0 games were included in the inter-family group, not the intra-family group)

Based on these 128 games, there is an indication of similar thinking. But I do not think the data is sufficient for a "statistical" conclusion. I imagine that it would need 20 times as many games.
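
As a rough way to put numbers on that uncertainty, here is a minimal Python sketch that recomputes the two draw rates above and puts a 95% Wilson score interval around the pooled draw rate. The Wilson interval, the pooling of the two matches, and the helper names are assumptions made for illustration, not something taken from the CCRL stats. It also treats every game as an independent trial, which may not hold when openings are repeated with colours reversed.

```python
import math

def draw_rate(wins, draws, losses):
    """Fraction of games that ended in a draw."""
    games = wins + draws + losses
    return draws / games

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# (wins, draws, losses) from Strelka's point of view, as quoted in this post.
matches = {
    "Strelka 1.0b v Rybka 1.0 Beta": (9, 32, 23),
    "Strelka 1.8 v Rybka 1.0 Beta": (10, 33, 21),
}

for name, (w, d, l) in matches.items():
    print(f"{name}: draw rate {draw_rate(w, d, l):.1%}")

# Pool both matches (an assumption: the two Strelka versions are treated alike).
total_draws = sum(d for _, d, _ in matches.values())
total_games = sum(w + d + l for w, d, l in matches.values())
low, high = wilson_interval(total_draws, total_games)
print(f"pooled: {total_draws}/{total_games} draws, "
      f"95% interval {low:.1%} to {high:.1%}")
```

The width of the printed interval can then be compared with the 29.8% and 50.1% reference rates given above.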

-Norm

Uri Blass
Posts: 8940
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: CCRL update (1st September 2007)

Post by Uri Blass » Tue Sep 04, 2007 12:23 pm

I think that draw analysis is a very poor way to decide if engines are in the same family.

I see no reason to assume that there will be more draws between program A and itself relative to draws between 2 other engines that have the same playing strength.

For example, I can imagine a small number of draws between an engine and itself if the engine has a big contempt factor that causes it to reject draws.

Uri

Spock

Re: CCRL update (1st September 2007)

Post by Spock » Tue Sep 04, 2007 12:52 pm

Uri Blass wrote: I think that draw analysis is a very poor way to decide if engines are in the same family.

I see no reason to assume that there will be more draws between program A and itself relative to draws between 2 other engines that have the same playing strength.

For example, I can imagine a small number of draws between an engine and itself if the engine has a big contempt factor that causes it to reject draws.

Uri

On its own, draw analysis does not tell you much, agreed.

Norm Pollock
Posts: 1031
Joined: Thu Mar 09, 2006 3:15 pm
Location: Long Island, NY, USA

Re: CCRL update (1st September 2007)

Post by Norm Pollock » Tue Sep 04, 2007 1:02 pm

Uri Blass wrote: I think that draw analysis is a very poor way to decide if engines are in the same family.

I see no reason to assume that there will be more draws between program A and itself relative to draws between 2 other engines that have the same playing strength.

For example, I can imagine a small number of draws between an engine and itself if the engine has a big contempt factor that causes it to reject draws.

Uri

Hi Uri,

If a game is deadlocked and impossible to win, there will be a draw even if there is a contempt factor. One thing about computer games: there are no "grandmaster" draws. A draw is really a draw. That alone gives some credence to "draw analysis".

The theory behind "draw analysis" is that engines that think alike will not find as many weaknesses in each other's positions as unrelated engines would. Finding a weakness in the opponent's position will lead to an attack and usually a win/loss result.

Unrelated engines that are closely rated will usually draw only 30-35% of the time, while engines that are related usually draw 50-60% of the time. These are experimentally obtained stats, using CCRL 40/40 results. The increase in draws does not materially affect the "score" between the engines, since the better engine will still win more often than it loses.
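
To illustrate the point about the score, here is a small Python sketch using the usual scoring of 1 point for a win and half a point for a draw. The two result lines in it are made-up numbers chosen for illustration, not CCRL data, and the function names are hypothetical.

```python
def score(wins, draws, losses):
    """Score fraction: 1 point per win, half a point per draw."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def draw_rate(wins, draws, losses):
    """Fraction of games that ended in a draw."""
    return draws / (wins + draws + losses)

# Hypothetical 100-game matches (invented numbers, not from CCRL).
unrelated = (40, 30, 30)   # 30% draws
related = (30, 50, 20)     # 50% draws

for label, result in (("unrelated", unrelated), ("related", related)):
    print(f"{label}: draw rate {draw_rate(*result):.0%}, "
          f"score {score(*result):.1%}")
```

In this invented example the draw rate goes from 30% to 50% while the score stays the same, because the extra draws replace wins and losses in equal numbers.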

-Norm
