Critter teaser

rvida
Posts: 481
Joined: Thu Apr 16, 2009 12:00 pm
Location: Slovakia, EU

Critter teaser

Post by rvida »

Hi,

I have just answered some PMs about the status of Critter development. I thought I might post in public as well.

New version is due to be released next month. Not sure about the exact Elo improvement, though its performance should be pretty close to Houdini 1.5.
To give an idea, here is a quick copy&paste from the results of the current test run:

1.3.53c

Critter dev 64-bit - Critter 1.2 64-bit SSE4    116.0 - 106.0  52.25%
Critter dev 64-bit - Stockfish 2.1.1 JA 64bit   132.0 -  90.0  59.46%
Critter dev 64-bit - Komodo64 3 SSE             128.0 -  94.0  57.66%
Critter dev 64-bit - Rybka 4 x64                121.5 -  99.5  54.98%
Critter dev 64-bit - IvanHoe 9.50b x64          131.0 -  91.0  59.01%
Critter dev 64-bit - Houdini 1.5 x64            107.5 - 113.5  48.64%
Total                                           736.0 - 594.0  55.34%

1330 out of 2100 games played
Level: 40 Moves in 1 min 
40/1, 1 core, ponder off, without tablebases, Shredder classic default book
Non-PGO build without SSE4.

Richard
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Critter teaser

Post by IWB »

Hi Richard,
rvida wrote: ...
New version is due to be released next month.
...
Why wait - just go for it! :-)

Bye
Ingo
gerold
Posts: 10121
Joined: Thu Mar 09, 2006 12:57 am
Location: van buren,missouri

Re: Critter teaser

Post by gerold »

Thanks for the update, Richard. Would that be in 3 days, Dec. 1st? :)

Best,
Gerold.
Ajedrecista
Posts: 2122
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Critter teaser

Post by Ajedrecista »

Hello Richard!
rvida wrote: ...
1330 out of 2100 games played
...
First of all, congratulations on your achievements. 1330 games is a nice amount, and I tried to find (with only the help of pencil, paper and Derive 6) the uncertainty of this test with ~ 95.45% confidence (2-sigma confidence). Of course, these uncertainties depend on the number of draws (the draw ratio). Looking at your results, it is easy to see that the minimum number of draws is 0 (0%, I guess this is not the case) and the maximum is 1181 (~ 88.8%; 594·2 - 7, if Critter 1.3.53c ended unbeaten in all its winning matches and was not able to win a single game against Houdini... I guess again that this is not the case). Here is what I found, given in steps of 5% (I know that 5% of 1330 = 66.5, an impossible number of draws, but I think a step of 5% is easy on the eyes) except for the last datum (hoping there are no typos and/or errors in my clumsy calculations):

Draw ratio = D; Elo difference is rounded (not exact numbers).
Rating difference = rd = 400·log10(736/594) ~ +37.2
Uncertainty (error given in Elo points) = |e|
Confidence ~ 95.45%  ===>  rd ± |e| (referred to the average rating of all these six engines).

D = 0%  --->  +37.2 ± 19.3 ~ ]+17.9, +56.5[
D = 5%  --->  +37.2 ± 18.8 ~ ]+18.4, +56[
D = 10%  --->  +37.2 ± 18.3 ~ ]+18.9, +55.5[
D = 15%  --->  +37.2 ± 17.8 ~ ]+19.4, +55[
D = 20%  --->  +37.2 ± 17.2 ~ ]+20, +54.4[
D = 25%  --->  +37.2 ± 16.7 ~ ]+20.5, +53.9[
D = 30%  --->  +37.2 ± 16.1 ~ ]+21.1, +53.3[
D = 35%  --->  +37.2 ± 15.5 ~ ]+21.7, +52.7[
D = 40%  --->  +37.2 ± 14.9 ~ ]+22.3, +52.1[
D = 45%  --->  +37.2 ± 14.2 ~ ]+23, +51.4[
D = 50%  --->  +37.2 ± 13.5 ~ ]+23.7, +50.7[
D = 55%  --->  +37.2 ± 12.8 ~ ]+24.4, +50[
D = 60%  --->  +37.2 ± 12.1 ~ ]+25.1, +49.3[
D = 65%  --->  +37.2 ± 11.3 ~ ]+25.9, +48.5[
D = 70%  --->  +37.2 ± 10.4 ~ ]+26.8, +47.6[
D = 75%  --->  +37.2 ± 9.4 ~ ]+27.8, +46.6[
D = 80%  --->  +37.2 ± 8.4 ~ ]+28.8, +45.6[
D = 85%  --->  +37.2 ± 7.2 ~ ]+30, +44.4[
D ~ 88.8% ---> +37.2 ± 6.1 ~ ]+31.1, +43.3[
So, those uncertainties should be between ± 6 and ± 19 (more or less); ignoring the cases with too many or too few draws, this range narrows to between ± 11 (2/3 draws) and ± 15.7 (1/3 draws), which seems pretty reasonable from my inexpert point of view. So, rounding again, Critter is around [+25, +50] ahead of the average rating of this bunch of pretty good engines... hats off to you.
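
For readers who want to check these figures without Derive, here is a minimal Python sketch. The post does not spell out its formula, so the model below is an assumption: each game is treated as an independent win/draw/loss trial, the per-game variance of the score s is taken as s·(1 - s) - D/4, and the margin is read off the upper side of the interval, Elo(s + 2·SE) - Elo(s). The names `elo_from_score` and `elo_margin` are made up for this example.

```python
import math

# Totals from Richard's test run: 736.0 - 594.0 over 1330 games.
POINTS_FOR = 736.0
POINTS_AGAINST = 594.0
GAMES = 1330


def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic model, base-10 log)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_margin(draw_ratio, sigmas=2.0):
    """Upper-side Elo margin at `sigmas` standard errors for a given draw ratio.

    Assumption (not stated in the post): games are independent, so the
    per-game variance of the score is s*(1 - s) - D/4, and the margin is
    Elo(s + sigmas*SE) - Elo(s).
    """
    score = POINTS_FOR / GAMES
    variance = score * (1.0 - score) - draw_ratio / 4.0
    std_error = math.sqrt(variance / GAMES)
    return elo_from_score(score + sigmas * std_error) - elo_from_score(score)


# 736 / 1330 as a score fraction is the same ratio as 736 / 594 in the post.
rating_diff = elo_from_score(POINTS_FOR / GAMES)

print(f"rd = {rating_diff:+.1f}")
for percent in range(0, 90, 5):
    margin = elo_margin(percent / 100.0)
    print(f"D = {percent}%  --->  {rating_diff:+.1f} ± {margin:.1f}")
```

Under these assumptions the sketch gives about ± 19.3 at D = 0% and ± 13.5 at D = 50%, in line with the table above. Because the margin is taken from the upper side only, the true interval is slightly asymmetric in Elo terms.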

Just out of curiosity: the previous development version was Critter 1.1.36 (if I remember correctly) and finally Critter 1.2 was released; now, with the Critter 1.3.53c development version... will the next public release be Critter 1.4? Thanks in advance and congratulations once again.

Regards from Spain.

Ajedrecista.
Marek Soszynski
Posts: 586
Joined: Wed May 10, 2006 7:28 pm
Location: Birmingham, England

Re: Critter teaser

Post by Marek Soszynski »

Any chance of Critter Linux using Gaviota tablebases?
Marek Soszynski
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: Critter teaser

Post by Dr.Wael Deeb »

Great News Richard :D

Waiting eagerly for the new release. Regards,
Dr.D
No one can hit as hard as life. But it ain’t about how hard you can hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward…
Ajedrecista
Posts: 2122
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Critter teaser

Post by Ajedrecista »

Hi again:
Ajedrecista wrote: ...
D ~ 88.8% ---> +37.2 ± 6.1 ~ ]+31.1, +43.3[
...
Finally, I was wrong about one thing: the maximum number of draws... with a result of 736 - 594, an odd number of draws is impossible; it must be even! The correct number is 2·(106 + 90 + 94 + 99.5 + 91 + 107.5) = 1176 (and not 1181), so around 88.42%. Now, the minimum uncertainty should be:

D ~ 88.42% ---> +37.2 ± 6.2 ~ ]+31, +43.4[
I hope the rest of the numbers in my previous post are not wrong. Waiting for the next Critter (1.4?).
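
The corrected draw count can be reproduced with a short Python sketch. It assumes the bound is reached when, in every match, the side with fewer points scores only draws, which is what the sum above does; the function name `max_draws` is made up for this example.

```python
# (Critter score, opponent score) per match, copied from the results table.
MATCHES = {
    "Critter 1.2 64-bit SSE4": (116.0, 106.0),
    "Stockfish 2.1.1 JA 64bit": (132.0, 90.0),
    "Komodo64 3 SSE": (128.0, 94.0),
    "Rybka 4 x64": (121.5, 99.5),
    "IvanHoe 9.50b x64": (131.0, 91.0),
    "Houdini 1.5 x64": (107.5, 113.5),
}


def max_draws(score_a, score_b):
    """Most draws compatible with a match score: the side with fewer
    points scores only draws, each worth half a point."""
    return int(2 * min(score_a, score_b))


total_games = int(sum(a + b for a, b in MATCHES.values()))
total_max_draws = sum(max_draws(a, b) for a, b in MATCHES.values())
max_draw_ratio = total_max_draws / total_games

print(total_max_draws, f"{100 * max_draw_ratio:.2f}%")  # prints: 1176 88.42%
```

Taking the minimum per match, rather than doubling the overall 594, is what handles the Houdini match correctly: there Critter is the side with fewer points.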

Regards from Spain.

Ajedrecista.
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Critter teaser

Post by MM »

rvida wrote: ...
Critter dev 64-bit - Houdini 1.5 x64            107.5 - 113.5  48.64%
...
Hi,
Great news and excellent work as usual. I would love to see Critter-Houdini 2.0 50.01%...
MM
rvida
Posts: 481
Joined: Thu Apr 16, 2009 12:00 pm
Location: Slovakia, EU

Re: Critter teaser

Post by rvida »

Marek Soszynski wrote:Any chance of Critter Linux using Gaviota tablebases?
Yes, Gaviota tablebases will be available on all supported platforms.
rvida
Posts: 481
Joined: Thu Apr 16, 2009 12:00 pm
Location: Slovakia, EU

Re: Critter teaser

Post by rvida »

Ajedrecista wrote:Hello Richard!
...
1330 games are a nice amount and I tried to find (with the only help of pencil, paper and Derive 6) the uncertainty of this test with ~ 95.45% confidence (2-sigma confidence)
...
... a lot of math ...
...
Your math skills seem to be well beyond my level; I am not able to comment.
Ajedrecista wrote: Just for curiosity: the previous development version was Critter 1.1.36 (if I remember well) and finally Critter 1.2 was released; now, with Critter 1.3.53c development version... the next public release will be Critter 1.4?
Do not read too much into version numbers. They are quite arbitrary. I made up a scheme where odd numbers (1.1, 1.3, ...) are meant to be development versions, and even numbers (1.0, 1.2, 1.4, ...) public releases.

IIRC the last beta version before the 1.2 release was 1.1.49.