FRC - Hiarcs 11.2 completed (24 ELO weaker than 11.1)

Spock · Post by **Spock** » Wed Aug 22, 2007 6:11 pm

Yes, unfortunately, in common with many other rating lists, including our own 40/4 standard chess list, Hiarcs 11.2 has turned in a performance here worse than 11.1 and also worse than 11.

So Hiarcs 11.1 retains its spot at the top of the pure list and best versions list.

A definite pattern can be seen with 11.2 - it underperformed against the stronger engines whilst over-performing against weaker opponents.

Most unfortunate, and the first time in the history of this list that a new engine version has performed worse than its predecessor. I do not, however, expect a repeat of this from Hiarcs, and I'm sure that the Hiarcs team will be back fighting with Hiarcs 12.

The main list is here
http://www.computerchess.org.uk/ccrl/404FRC/index.html

But you'll need to look at the "Complete List" to find Hiarcs 11.2

Scores against common opponents can be seen here

.

pichy · Post by **pichy** » Thu Aug 23, 2007 3:44 am

Spock wrote: Yes, unfortunately, in common with many other rating lists, including our own 40/4 standard chess list, Hiarcs 11.2 has turned in a performance here worse than 11.1 and also worse than 11.

So Hiarcs 11.1 retains its spot at the top of the pure list and best versions list.

A definite pattern can be seen with 11.2 - it underperformed against the stronger engines whilst over-performing against weaker opponents.

Most unfortunate, and the first time in the history of this list that a new engine version has performed worse than its predecessor. I do not, however, expect a repeat of this from Hiarcs, and I'm sure that the Hiarcs team will be back fighting with Hiarcs 12.

The main list is here
http://www.computerchess.org.uk/ccrl/404FRC/index.html

But you'll need to look at the "Complete List" to find Hiarcs 11.2

Scores against common opponents can be seen here

.

What would you estimate the newer Rybka's rating will be in FRC? Close to 2985?

Spock · Post by **Spock** » Thu Aug 23, 2007 8:57 am

pichy wrote:
What would you estimate the newer Rybka's rating will be in FRC? Close to 2985?

Not sure - the 32-bit version may struggle to get to that level, but the 64-bit version should certainly reach at least that.

Eelco de Groot · Post by **Eelco de Groot** » Thu Aug 23, 2007 2:37 pm

Thanks for testing Hiarcs 11.2 Ray! Even if the result is a bit unexpected it is important to know how strong the new version is.

I think that Mark should go back to trying to improve Hiarcs in Normal Playing Style, so that we would just have to switch on Hypermodern to get some extra Elo points. That seemed to work better.

By the way Ray, I think it may also be a good thing if, in your FRC testing, all engines played each other even when the Elo ratings show a big gap. If the Elo rating system works as it should, for computer chess too, it should of course not matter at all how large the difference between opponents is. And if it does make a difference to the final results, playing all against all, or something approaching it, may still be the best way. The comparison between versions that are close in the list may suffer a little, but the integrity of the whole list is probably better, especially with the large number of games that you are playing to reduce randomness.

I think it is possible Harm Geert Muller might say something similar, but at the moment I believe he is busier discussing multithreading issues with Robert Hyatt.

A propos, maybe you or Jorge know a bit more about this: I could not find anything about whether the Chess960 version of Rybka will be separately available or maybe just an option for Rybka 3.0. Maybe I missed it if Vasik Rajlich already said something about this? I have not yet bought Rybka 2.3, but 3.0 is still such a long time away to wait for.

Thanks for doing your tests Ray!

Fischer Random Regards,
Eelco

Spock · Post by **Spock** » Thu Aug 23, 2007 2:55 pm

Thanks for your interest in the FRC list.

I currently play pairs where the Elo gap is <= 300. It is probably something I can't win on: if I play pairs with a bigger Elo gap, then some people will criticise that, and in fact have criticised CCRL in the past for it. However, you would like to see them played... Is it really meaningful, for example, for Hiarcs to play Ayito, a 500+ difference? I'm certainly up for the discussion and for playing the games if, on balance, that is what the audience for this list would prefer. The FRC list is a bit unique and doesn't necessarily have to follow the same rules as our standard chess list.
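To put numbers on the 300-point cut-off and the 500+ example, here is a minimal Python sketch. It assumes the usual logistic Elo expectancy curve (400-point scale); the engine names and ratings in the usage lines are made up for illustration and are not taken from the list, and `pairs_within_gap` is a hypothetical helper, not how the list is actually scheduled.

```python
import itertools


def expected_score(rating_gap):
    """Expected score (win = 1, draw = 0.5) of the higher-rated side.

    Assumes the standard logistic Elo model with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


def pairs_within_gap(ratings, max_gap=300.0):
    """Return all engine pairs whose rating gap is at most max_gap."""
    names = sorted(ratings)
    return [
        (a, b)
        for a, b in itertools.combinations(names, 2)
        if abs(ratings[a] - ratings[b]) <= max_gap
    ]


# Hypothetical ratings, for illustration only.
example = {"EngineA": 2900, "EngineB": 2700, "EngineC": 2400}

print(round(expected_score(300), 3))  # about 0.849
print(round(expected_score(500), 3))  # about 0.947
print(pairs_within_gap(example))      # EngineA-EngineC (gap 500) is left out
```

Under this model the favourite is expected to score roughly 85% at a 300-point gap and roughly 95% at 500 points, so a lopsided match still carries some information as long as the weaker engine takes the occasional draw or win.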

The next development on the list is to include 64-bit engines. Currently it is 32-bit only, because that was the only machine I had spare when I started this list. Now all my machines are 64-bit, so I want to "upgrade" it. So I'll soon be playing Naum x64 and Glaurung x64.

Rybka 2.3.2 960 is a private engine. Rybka 3.0 will be the first public version to support FRC.

I have Rybka 2.3.2 960, and it is testing now. The 32-bit results will be on the list within the next 24 hrs or so hopefully. Then the 64-bit version together with 64-bit Naum and Glaurung as above.

Uri Blass · Post by **Uri Blass** » Thu Aug 23, 2007 3:39 pm

I think that no result is meaningless.
You are free to play matches when the difference is high (it may be interesting to see the minimal rating difference at which we get a 100-0 result), and you are also free to play matches at a longer time control so we can see if we get a different ranking at a different time control.

It may be possible that Deep Sjeng is better than Glaurung 1.2.1 at 40/40:

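| Rank | Engine | Rating | + | − | Score | Avg. opp. | Draws | Games |
|---|---|---|---|---|---|---|---|---|
| 7 | Glaurung 1.2.1 | 2767 | +14 | −14 | 48.0% | +9.9 | 21.6% | 2000 |
| 8 | Deep Sjeng 2.5 1CPU | 2745 | +14 | −14 | 47.0% | +15.6 | 22.0% | 1800 |

Likelihood of superiority between the two: 98.6%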

Note that in normal chess it seems that Deep Sjeng gained 70 Elo in the transition from 40/4 to 40/40, while Glaurung gained only 5 Elo.

It may also be possible that Naum has a better place at 40/40, because this engine seems to perform better at longer time controls based on CEGT and CCRL.

Uri

Andrew · Post by **Andrew** » Sun Aug 26, 2007 6:35 am

But isn't it true that 11.2 was just meant to fix up a few problems, and wasn't meant to be an improvement on 11.1?

Also, the errors on your list for the two versions are ±16 and ±20. If this is one standard deviation, then the observed difference has no statistical significance. "Most unfortunate" isn't a fair appraisal.

Andrew

Spock wrote: Yes, unfortunately, in common with many other rating lists, including our own 40/4 standard chess list, Hiarcs 11.2 has turned in a performance here worse than 11.1 and also worse than 11.

So Hiarcs 11.1 retains its spot at the top of the pure list and best versions list.

A definite pattern can be seen with 11.2 - it underperformed against the stronger engines whilst over-performing against weaker opponents.

Most unfortunate, and the first time in the history of this list that a new engine version has performed worse than its predecessor. I do not, however, expect a repeat of this from Hiarcs, and I'm sure that the Hiarcs team will be back fighting with Hiarcs 12.

The main list is here
http://www.computerchess.org.uk/ccrl/404FRC/index.html

But you'll need to look at the "Complete List" to find Hiarcs 11.2

Scores against common opponents can be seen here

.

Dirt · Post by **Dirt** » Sun Aug 26, 2007 6:55 am

Andrew wrote: But isn't it true that 11.2 was just meant to fix up a few problems, and wasn't meant to be an improvement on 11.1?

Also, the errors on your list for the two versions are ±16 and ±20. If this is one standard deviation, then the observed difference has no statistical significance. "Most unfortunate" isn't a fair appraisal.

Andrew

The standard in chess ratings seems to be two standard deviations (or 95% confidence). There would still be a fair chance of the difference being statistical error, but 11.2 is rated lower in some standard chess rating lists. A lower rating in FRC too isn't surprising.
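The two readings of the error bars can be checked with a few lines of Python. This is only a rough sketch: it takes the 24-point difference from the thread title and the ±16 and ±20 figures quoted above, and it assumes the two ratings are independent and normally distributed, which is not strictly true for engines rated in the same pool. The function names are illustrative.

```python
import math


def z_score(diff, err_a, err_b, sigmas_per_error):
    """z-score of a rating difference.

    sigmas_per_error says how many standard deviations each quoted
    error bar represents (1 or 2). Assumes independent ratings.
    """
    sd_a = err_a / sigmas_per_error
    sd_b = err_b / sigmas_per_error
    sd_diff = math.hypot(sd_a, sd_b)  # sqrt(sd_a**2 + sd_b**2)
    return diff / sd_diff


def two_sided_p(z):
    """Two-sided p-value for a normally distributed z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


for sigmas in (1, 2):
    z = z_score(24, 16, 20, sigmas)
    print(sigmas, round(z, 2), round(two_sided_p(z), 3))
```

If the error bars are one standard deviation, z comes out at about 0.94; if they are two standard deviations, z is about 1.87. Either way it stays below the 1.96 needed for 95% confidence under these assumptions, although the second reading comes close.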

Uri Blass · Post by **Uri Blass** » Sun Aug 26, 2007 7:20 am

Dirt wrote:
Andrew wrote: But isn't it true that 11.2 was just meant to fix up a few problems, and wasn't meant to be an improvement on 11.1?

Also, the errors on your list for the two versions are ±16 and ±20. If this is one standard deviation, then the observed difference has no statistical significance. "Most unfortunate" isn't a fair appraisal.

Andrew

The standard in chess ratings seems to be two standard deviations (or 95% confidence). There would still be a fair chance of the difference being statistical error, but 11.2 is rated lower in some standard chess rating lists. A lower rating in FRC too isn't surprising.

A lower rating in FRC is certainly surprising because 11.2 was clearly supposed to be an improvement:

http://64.68.157.89/forum/viewtopic.php ... 24&t=15384

Harvey Williamson

No, it is not the version that played in last weekend's event.

It is an improved version of 11.1 with a few enhancements.

Dirt · Post by **Dirt** » Sun Aug 26, 2007 8:45 am

Uri Blass wrote:
Dirt wrote:
Andrew wrote: But isn't it true that 11.2 was just meant to fix up a few problems, and wasn't meant to be an improvement on 11.1?

Also, the errors on your list for the two versions are ±16 and ±20. If this is one standard deviation, then the observed difference has no statistical significance. "Most unfortunate" isn't a fair appraisal.

Andrew

The standard in chess ratings seems to be two standard deviations (or 95% confidence). There would still be a fair chance of the difference being statistical error, but 11.2 is rated lower in some standard chess rating lists. A lower rating in FRC too isn't surprising.

A lower rating in FRC is certainly surprising because 11.2 was clearly supposed to be an improvement:

http://64.68.157.89/forum/viewtopic.php ... 24&t=15384

Harvey Williamson

No, it is not the version that played in last weekend's event.

It is an improved version of 11.1 with a few enhancements.

I think that was before the testing by the major testing groups. Since it is now seen to be performing worse in standard chess, why do you think it is surprising it is also performing worse in FRC?
