## Critter 1.6 - Critter 1.4a ponder ON/OFF

Ajedrecista
Posts: 1988
Joined: Wed Jul 13, 2011 9:04 pm

### Re: Critter 1.6 - Critter 1.4a, ponder ON/OFF.

Hello:
> ernest wrote:
>
> > Ajedrecista wrote: Hi again!
>
> Hi Jesus,
>
> Thanks for this detailed post, I will study it carefully!
>
> Of course, your program gives more accurate numbers, I only get (not too bad) approximations.
>
> Do you have a comment on my section starting with
>
> > Now if we want to see if Ponder OFF or ON makes a significant difference in a match between Critter 1.6 and Critter 1.4a SSE4, we have to consider the global (sum) distribution
>
> which shows that so far (i.e. with only those 500+500 games) the difference is NOT significant?

My former post concerned my model but, under some assumptions (µ = 0.5 and many games), our nσ² values were identical, so it should also be valid for your model... but please take my words with extreme care, because I am not very gifted in statistics.

I read that section of your post and tried to dodge it because I have no idea! I thought I had managed to avoid it... but not anymore. Well, regarding error bars (which I think you are not asking about), in one of the posts I linked to, the last quote gives some details of what I think would be correct, although I am not sure:
> The subindex i is written as _i; i = 1, 2.
>
> <e> = sqrt{[n_1·(<e_1>)² + n_2·(<e_2>)²]/(n_1 + n_2)}
>
> I calculated the error bar of the tied 500-game match for 2-sigma confidence with my programme, and it is more or less ± 20.64 Elo; the other error bar was ± 20.48 Elo. So, knowing that n_1 = n_2 = 500:
>
> <e> ~ sqrt{[(20.64)² + (20.48)²]/2} = sqrt(422.72) ~ ± 20.56 Elo.

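
The arithmetic in that quote can be re-checked with a few lines of Python. This is only a sketch: the function name `pooled_error_bar` is made up for illustration, and it reproduces the weighted root-mean-square formula as quoted, without settling whether that is the statistically appropriate way to combine two matches.

```python
import math

def pooled_error_bar(errors, games):
    """Weighted root-mean-square of per-match error bars.

    errors: error bars <e_i> in Elo (hypothetical argument name)
    games:  number of games n_i in each match (weights)
    """
    total_games = sum(games)
    weighted_sq = sum(n * e * e for e, n in zip(errors, games))
    return math.sqrt(weighted_sq / total_games)

# The two 500-game matches from the quote: ± 20.64 and ± 20.48 Elo.
combined = pooled_error_bar([20.64, 20.48], [500, 500])
print(f"{combined:.2f} Elo")  # 20.56 Elo
```

With equal game counts the weights cancel, which is why the quote can simply divide by 2.
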
So, with those two matches of 500 games each (1000 games in total), I am not sure whether the error bar is roughly ± 20.56 Elo, or the ± 14.53 Elo of a single 1000-game match that I reported in my first post in this topic. If I replace <e> by kσ (I took k = 2, that is, ~ 95.45% confidence), which is what appears in my post from January:
> In the tied 500-game match, I get 2σ_1 ~ 0.029665; in the other 500-game match: 2σ_2 ~ 0.029314.
>
> 2σ ~ sqrt{[(0.029665)² + (0.029314)²]/2} ~ 0.02949
>
> µ = 0.5165 (for the two 500-game matches combined).
>
> |<e>| ~ 200·log10{(µ + 2σ)(1 - µ + 2σ)/[(µ - 2σ)(1 - µ - 2σ)]} ~ 20.54 Elo.

I think that this last quote is more accurate than the other, but it was faster for me to work directly with error bars than with standard deviations. Furthermore, 20.54 and 20.56 are close enough to consider them almost the same value. I cannot conclude anything and I am unable to say whether there is a significant difference in a match between Critter 1.6 and Critter 1.4a SSE4; maybe considering the global (sum) distribution is necessary, but this is only a random guess, so please do not take my words seriously. I have used too many words without saying anything really useful.
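
For anyone who wants to re-check the second quote, here is a minimal Python sketch. It assumes the usual logistic relation Elo = 400·log10(µ/(1 - µ)) and takes the error bar as half the distance between the Elo values at µ + 2σ and µ - 2σ, which is algebraically the same as the 200·log10 expression quoted above. The function name `elo_half_width` is hypothetical.

```python
import math

def elo_half_width(mu, k_sigma):
    """Half-width in Elo of the interval [mu - k_sigma, mu + k_sigma].

    mu:      average score per game (0 < mu < 1)
    k_sigma: k times the standard deviation of the score (here k = 2)
    """
    upper = 400 * math.log10((mu + k_sigma) / (1 - mu - k_sigma))
    lower = 400 * math.log10((mu - k_sigma) / (1 - mu + k_sigma))
    return (upper - lower) / 2

# Combine the two quoted 2-sigma values (equal game counts, so a plain RMS).
two_sigma = math.sqrt((0.029665**2 + 0.029314**2) / 2)

print(f"2 sigma   ~ {two_sigma:.5f}")
print(f"error bar ~ {elo_half_width(0.5165, two_sigma):.2f} Elo")
```

Run as written, this should land very close to the 0.02949 and 20.54 Elo given in the quote; it checks the arithmetic only and says nothing about whether the Ponder ON/OFF difference is significant.
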

SUMMARIZING: I accept my limitations and I do not have an answer to your question... I am sorry. Some help is needed and would be very much appreciated!

Regards from Spain.

Ajedrecista.

Mike S.
Posts: 1480
Joined: Thu Mar 09, 2006 5:33 am

### Re: Critter 1.6 - Critter 1.4a ponder ON/OFF

Meanwhile, Critter 1.6a has been released. Hopefully, this strange effect is gone... but it's not mentioned in the release notes at

http://www.vlasak.biz/critter/
Regards, Mike