Critter teaser

rvida · Post by **rvida** » Sun Nov 27, 2011 1:06 pm

Hi,

I have just answered some PMs about the status of Critter development. I thought I might post in public as well.

New version is due to be released next month. Not sure about exact ELO improvement, though its performance should be pretty close to H1.5.
To give an idea, here is a quick copy&paste from the results of current test run:

1.3.53c

Critter dev 64-bit - Critter 1.2 64-bit SSE4    116.0 - 106.0  52.25%		
Critter dev 64-bit - Stockfish 2.1.1 JA 64bit   132.0 -  90.0  59.46%		
Critter dev 64-bit - Komodo64 3 SSE             128.0 -  94.0  57.66%		
Critter dev 64-bit - Rybka 4 x64                121.5 -  99.5  54.98%		
Critter dev 64-bit - IvanHoe 9.50b x64          131.0 -  91.0  59.01%		
Critter dev 64-bit - Houdini 1.5 x64            107.5 - 113.5  48.64%		
                                                736.0 - 594.0  55.34%		

1330 out of 2100 games played
Level: 40 Moves in 1 min

40/1, 1 core, ponder off, without tablebases, Shredder classic default book
Non-PGO build without SSE4.

Richard

IWB · Post by **IWB** » Sun Nov 27, 2011 1:26 pm

Hi Richard,

rvida wrote: ...
New version is due to be released next month.
...

Why wait - just go for it!

Bye
Ingo

gerold · Post by **gerold** » Sun Nov 27, 2011 3:17 pm

Thanks for the update, Richard. Would that be in 3 days, on Dec. 1st?

Best,
Gerold.

Ajedrecista · Post by **Ajedrecista** » Sun Nov 27, 2011 10:18 pm

Hello Richard!

rvida wrote:Hi,

I have just answered some PMs about the status of Critter development. I thought I might post in public as well.

New version is due to be released next month. Not sure about exact ELO improvement, though its performance should be pretty close to H1.5.
To give an idea, here is a quick copy&paste from the results of current test run:
1.3.53c

Critter dev 64-bit - Critter 1.2 64-bit SSE4    116.0 - 106.0  52.25%		
Critter dev 64-bit - Stockfish 2.1.1 JA 64bit   132.0 -  90.0  59.46%		
Critter dev 64-bit - Komodo64 3 SSE             128.0 -  94.0  57.66%		
Critter dev 64-bit - Rybka 4 x64                121.5 -  99.5  54.98%		
Critter dev 64-bit - IvanHoe 9.50b x64          131.0 -  91.0  59.01%		
Critter dev 64-bit - Houdini 1.5 x64            107.5 - 113.5  48.64%		
                                                736.0 - 594.0  55.34%		

1330 out of 2100 games played
Level: 40 Moves in 1 min 
40/1, 1 core, ponder off, without tablebases, Shredder classic default book
Non-PGO build without SSE4.

Richard

First of all, congratulations on your achievements. 1330 games are a nice amount and I tried to find (with the only help of pencil, paper and Derive 6) the uncertainty of this test with ~ 95.45% confidence (2-sigma confidence). Of course, these uncertainties depend on the number of draws (the draw ratio). Looking at your results, it is easy to see that the minimum number of draws is 0 (0%; I guess this is not the case) and the maximum is 1181 (~ 88.8%; 594·2 - 7, if Critter 1.3.53c ended unbeaten in all its winning matches and was not able to win a single game against Houdini... I guess again that this is not the case). Here is what I found, given in steps of 5% (I know that 5% of 1330 = 66.5, an impossible number of draws, but I think a step of 5% is easy on the eyes) except for the last datum (hoping there are no typos and/or errors in my clumsy calculations):

Draw ratio = D; Elo difference is rounded (not exact numbers).
Rating difference = rd = 400·log10(736/594) ~ +37.2
Uncertainty (error given in Elo points) = |e|
Confidence ~ 95.45%  ===>  rd ± |e| (relative to the average rating of these six opponents).

D = 0%  --->  +37.2 ± 19.3 ~ ]+17.9, +56.5[
D = 5%  --->  +37.2 ± 18.8 ~ ]+18.4, +56[
D = 10%  --->  +37.2 ± 18.3 ~ ]+18.9, +55.5[
D = 15%  --->  +37.2 ± 17.8 ~ ]+19.4, +55[
D = 20%  --->  +37.2 ± 17.2 ~ ]+20, +54.4[
D = 25%  --->  +37.2 ± 16.7 ~ ]+20.5, +53.9[
D = 30%  --->  +37.2 ± 16.1 ~ ]+21.1, +53.3[
D = 35%  --->  +37.2 ± 15.5 ~ ]+21.7, +52.7[
D = 40%  --->  +37.2 ± 14.9 ~ ]+22.3, +52.1[
D = 45%  --->  +37.2 ± 14.2 ~ ]+23, +51.4[
D = 50%  --->  +37.2 ± 13.5 ~ ]+23.7, +50.7[
D = 55%  --->  +37.2 ± 12.8 ~ ]+24.4, +50[
D = 60%  --->  +37.2 ± 12.1 ~ ]+25.1, +49.3[
D = 65%  --->  +37.2 ± 11.3 ~ ]+25.9, +48.5[
D = 70%  --->  +37.2 ± 10.4 ~ ]+26.8, +47.6[
D = 75%  --->  +37.2 ± 9.4 ~ ]+27.8, +46.6[
D = 80%  --->  +37.2 ± 8.4 ~ ]+28.8, +45.6[
D = 85%  --->  +37.2 ± 7.2 ~ ]+30, +44.4[
D ~ 88.8% ---> +37.2 ± 6.1 ~ ]+31.1, +43.3[

So, those uncertainties should be between ± 6 and ± 19 (more or less); ignoring the cases with too many or too few draws, the range narrows to between ± 11 (2/3 of draws) and ± 15.7 (1/3 of draws), which seems pretty reasonable from my inexpert point of view. So, rounding again, around [+25, +50] ahead of the average rating of this bunch of pretty good engines... hats off to you.
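
For readers who want to reproduce figures like those in the table, here is a minimal Python sketch. It assumes a simple model that the post does not spell out: each game is an independent win/draw/loss trial, so the per-game variance of the score is s·(1 - s) - D/4, and the score error is converted to Elo using the slope of the logistic curve at s. The function names are illustrative only, and the output lands within a few tenths of an Elo point of the table rather than matching it digit for digit.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction (0 < score < 1)."""
    return 400 * math.log10(score / (1 - score))


def elo_margin(points, games, draw_ratio, z=2.0):
    """Approximate z-sigma error bar in Elo for a given draw ratio.

    Assumption: games are independent with outcomes 1, 0.5 and 0, so the
    per-game variance is s*(1 - s) - draw_ratio/4.
    """
    s = points / games
    variance = s * (1 - s) - draw_ratio / 4
    sigma = math.sqrt(variance / games)           # std. error of the score
    slope = 400 / (math.log(10) * s * (1 - s))    # d(Elo)/d(score) at s
    return z * sigma * slope


if __name__ == "__main__":
    points, games = 736.0, 1330
    rd = elo_diff(points / games)
    for d in (0.0, 0.25, 0.50, 0.75):
        e = elo_margin(points, games, d)
        print(f"D = {d:4.0%}  --->  {rd:+.1f} ± {e:.1f}")
```

More draws mean less variance per game, which is why the error bar shrinks as D grows.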

Just out of curiosity: the previous development version was Critter 1.1.36 (if I remember correctly) and finally Critter 1.2 was released; now, with the Critter 1.3.53c development version... will the next public release be Critter 1.4? Thanks in advance and congratulations once again.

Regards from Spain.

Ajedrecista.

Marek Soszynski · Post by **Marek Soszynski** » Mon Nov 28, 2011 8:50 am

Any chance of Critter Linux using Gaviota tablebases?

Dr.Wael Deeb · Post by **Dr.Wael Deeb** » Mon Nov 28, 2011 10:13 am

Great news, Richard!

Waiting eagerly for the new release.

Regards,
Dr.D

Ajedrecista · Post by **Ajedrecista** » Mon Nov 28, 2011 10:46 am

Hi again:

Finally, I was wrong about one thing: the maximum number of draws. With a result of 736 - 594, an odd number of draws is impossible; it must be even! The correct number is 2·(106 + 90 + 94 + 99.5 + 91 + 107.5) = 1176 (and not 1181), so around 88.42%. Now, the minimum uncertainty should be:

D ~ 88.42% ---> +37.2 ± 6.2 ~ ]+31, +43.4[
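
The corrected maximum can be checked mechanically: in each mini-match, the number of draws is at most twice the lower of the two scores, which is reached when the trailing side scores only through draws. A minimal Python sketch, with the per-match scores copied from Richard's table; the variable and function names are only illustrative:

```python
# Per-match scores (Critter dev, opponent), copied from the results table.
matches = {
    "Critter 1.2 64-bit SSE4": (116.0, 106.0),
    "Stockfish 2.1.1 JA 64bit": (132.0, 90.0),
    "Komodo64 3 SSE": (128.0, 94.0),
    "Rybka 4 x64": (121.5, 99.5),
    "IvanHoe 9.50b x64": (131.0, 91.0),
    "Houdini 1.5 x64": (107.5, 113.5),
}


def max_draws(results):
    """Upper bound on draws: in each match the lower scorer could have
    earned every point through draws, i.e. 2 * min(score_a, score_b)."""
    return int(sum(2 * min(a, b) for a, b in results.values()))


games = int(sum(a + b for a, b in matches.values()))
bound = max_draws(matches)
print(f"{bound} of {games} games ({100 * bound / games:.2f}%)")
```

This gives 1176 of 1330 games, about 88.42%, in line with the corrected figure.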

I hope the rest of the numbers in my former post are not wrong. Waiting for the next Critter (1.4?).

Regards from Spain.

Ajedrecista.

MM · Post by **MM** » Mon Nov 28, 2011 8:19 pm

rvida wrote:Hi,

I have just answered some PMs about the status of Critter development. I thought I might post in public as well.

New version is due to be released next month. Not sure about exact ELO improvement, though its performance should be pretty close to H1.5.
To give an idea, here is a quick copy&paste from the results of current test run:
1.3.53c

Critter dev 64-bit - Critter 1.2 64-bit SSE4    116.0 - 106.0  52.25%		
Critter dev 64-bit - Stockfish 2.1.1 JA 64bit   132.0 -  90.0  59.46%		
Critter dev 64-bit - Komodo64 3 SSE             128.0 -  94.0  57.66%		
Critter dev 64-bit - Rybka 4 x64                121.5 -  99.5  54.98%		
Critter dev 64-bit - IvanHoe 9.50b x64          131.0 -  91.0  59.01%		
Critter dev 64-bit - Houdini 1.5 x64            107.5 - 113.5  48.64%		
                                                736.0 - 594.0  55.34%		

1330 out of 2100 games played
Level: 40 Moves in 1 min 
40/1, 1 core, ponder off, without tablebases, Shredder classic default book
Non-PGO build without SSE4.

Richard

Hi,
Great news and excellent work as usual. I would love to see Critter - Houdini 2.0: 50.01%...

rvida · Post by **rvida** » Tue Nov 29, 2011 8:28 am

Marek Soszynski wrote:Any chance of Critter Linux using Gaviota tablebases?

Yes, Gaviota tablebases will be available on all supported platforms.

rvida · Post by **rvida** » Tue Nov 29, 2011 8:53 am

Ajedrecista wrote:Hello Richard!
...
1330 games are a nice amount and I tried to find (with the only help of pencil, paper and Derive 6) the uncertainty of this test with ~ 95.45% confidence (2-sigma confidence)
...
... a lot of math ...
...

Your math skills seem to be well beyond my level; I am not able to comment.

Ajedrecista wrote: Just out of curiosity: the previous development version was Critter 1.1.36 (if I remember correctly) and finally Critter 1.2 was released; now, with the Critter 1.3.53c development version... will the next public release be Critter 1.4?

Do not read too much into version numbers. They are quite arbitrary. I made up a scheme where odd numbers (1.1, 1.3, ...) are meant to be development versions and even numbers (1.0, 1.2, 1.4, ...) public releases.

IIRC the last beta version before 1.2 release was 1.1.49.
