Cluster Testing Pitfalls?

Ajedrecista · Post by **Ajedrecista** » Sat Aug 27, 2011 12:17 am

Hi Don:

Maybe RobboLito 0.084 is getting weaker because it was the first version of RobboLito. Stronger versions are 0.085d3, 0.085e4, 0.085g3 and 0.09 (just four of them; maybe there are other strong versions, but these four should be strong enough).

Other IPPOLIT-family engines that should be strong are some of the latest compiles of IvanHoe (there are few differences between them) and also Fire 2.2 (Fire 1.5, 2.0, 2.1 and 2.2 are very similar to each other). I hope this info is useful.

Regards from Spain.

Ajedrecista.

Don · Post by **Don** » Sat Aug 27, 2011 12:23 am

Ajedrecista wrote:Hi Don:

Maybe RobboLito 0.084 is getting weaker because it was the first version of RobboLito. Stronger versions are 0.085d3, 0.085e4, 0.085g3 and 0.09 (just four of them; maybe there are other strong versions, but these four should be strong enough). I hope this info is useful.

Regards from Spain.

Ajedrecista.

My plan is to find a stronger clone, and that could simply be a better version of RobboLito.

In talking to Bob I realized that I can probably also improve Robbo by compiling with the hardware popcount and using PGO and better compiler options. That might extend its usefulness a bit more.

hgm · Post by **hgm** » Sat Aug 27, 2011 8:18 am

Laskos wrote:The number of games necessary to achieve the same error margins at a 95% result as at a 50% result would be (40/7)^2 ~ 33 times more, which is impractical, but up to 65%-75% results it all seems fine, so one doesn't have to constantly tune the engines up to 100-200 Elo points difference between them.

You forget to take into account that the variance of the result gets smaller as it approaches 0% or 100%. E.g., ignoring draws, the per-game variance would be 0.5*0.5 = 0.25 at 50%, but only 0.95*0.05 = 0.0475 at 95%. So you gain back a factor 0.0475/0.25 = 0.19, and you would need not 33 times as many games, but only about 6.3 times as many. Still enough not to do it, of course.
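For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It takes the 40/7 slope ratio from the quoted post as given, ignores draws as above, and treats each game as a win/loss trial with variance p*(1-p). The function names are made up for illustration only.

```python
def per_game_variance(p):
    """Variance of a single game result when draws are ignored (win=1, loss=0)."""
    return p * (1.0 - p)


def games_factor(slope_ratio, p_ref, p_new):
    """How many times more games are needed at score p_new than at p_ref
    for the same Elo error margin.

    slope_ratio is the ratio of Elo-per-percent slopes (p_new vs p_ref);
    it is an input here, taken from the quoted post, not derived.
    """
    variance_ratio = per_game_variance(p_new) / per_game_variance(p_ref)
    return slope_ratio ** 2 * variance_ratio


slope_ratio = 40 / 7                      # figure used in the quoted post
naive = slope_ratio ** 2                  # ignores the shrinking variance
corrected = games_factor(slope_ratio, 0.50, 0.95)

print(f"naive factor:     {naive:.1f}")      # about 33
print(f"corrected factor: {corrected:.1f}")  # about 6.2
```

Without rounding (40/7)^2 to 33 first, the corrected factor comes out near 6.2 rather than 6.3; the conclusion is the same either way.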

Laskos · Post by **Laskos** » Sun Aug 28, 2011 1:33 pm

Right, but I didn't forget. Taking the draws into account would give a correct factor somewhere between yours and mine. I just pointed to the order of magnitude without going into these subtleties; with them, Don's caution in choosing opponents seems even more exaggerated.

Kai
