Movei's CCRL rating list

Uri Blass · Post by **Uri Blass** » Fri Aug 31, 2007 8:48 am

http://www.computerchess.org.uk/ccrl/40 ... t_all.html

Movei 0.08.438 2815 +584 −331 100.0% −252.0 0.0% 2
50.8%

Movei beat 2 opponents (one rated 2606 and one rated 2510).

Even Rybka could not do better, so I wonder how the 2815 was calculated.
My opinion is that programs that scored 100% or 0% should be left out of the list and added as soon as their result differs from 0% or 100%.

Ryan Benitez · Post by **Ryan Benitez** » Fri Aug 31, 2007 8:52 am

You are of course right, 2 games is not enough for a stable rating. I hope it keeps playing well though.

Graham Banks · Post by **Graham Banks** » Fri Aug 31, 2007 8:55 am

Uri Blass wrote: http://www.computerchess.org.uk/ccrl/40 ... t_all.html

Movei 0.08.438 2815 +584 −331 100.0% −252.0 0.0% 2
50.8%

Movei beat 2 opponents (one rated 2606 and one rated 2510).

Even Rybka could not do better, so I wonder how the 2815 was calculated.
My opinion is that programs that scored 100% or 0% should be left out of the list and added as soon as their result differs from 0% or 100%.

Hi Uri,

just ignore the ratings of engines with a very small number of games.

Regards, Graham.

Uri Blass · Post by **Uri Blass** » Fri Aug 31, 2007 9:04 am

Hi Ryan,
Thanks for the encouragement.

The number of games is not the reason I am against including it in the full list. If a program scored 1.5/2, I would have no problem with including it in the full list based on performance, but with a 100% score the performance rating is infinite.
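
A minimal sketch of why a 100% score has no finite performance rating, assuming the common logistic Elo formula (the average opponent rating plus 400·log10(p/(1−p))). The function name `performance_rating` is hypothetical, and this is not necessarily how CCRL computes its list; the opponent ratings are the two mentioned above.

```python
import math

def performance_rating(opponent_ratings, score):
    """Logistic Elo performance: average opponent rating + 400*log10(p/(1-p)).

    Assumption: the plain logistic Elo model, not the CCRL/BayesElo method.
    """
    games = len(opponent_ratings)
    avg = sum(opponent_ratings) / games
    p = score / games  # fraction of points scored
    if p >= 1.0:
        return math.inf   # log10(p/(1-p)) diverges as p approaches 1
    if p <= 0.0:
        return -math.inf  # and diverges downward as p approaches 0
    return avg + 400.0 * math.log10(p / (1.0 - p))

opponents = [2606, 2510]  # the two opponents Movei beat
print(performance_rating(opponents, 2.0))         # 2/2: inf
print(round(performance_rating(opponents, 1.5)))  # 1.5/2: 2749
```

With 1.5/2 the formula gives a finite number, which is why a single draw would be enough to make the entry meaningful under this model.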

My estimate is that its rating after enough games should be slightly more than 2700 (based on CEGT, it is estimated to have an 87.9 Elo improvement relative to 403).

Uri

GenoM · Post by **GenoM** » Fri Aug 31, 2007 9:47 am

I ran a bullet Nunn match on a 2 GHz computer between Glaurung 2-epsilon/5 and the latest Movei. The result was 11.5-2.5 for Glaurung. I interrupted a similar match between Strelka 1.8 and the latest Movei because of a 9-0 result. So for me it's interesting to see such high results from Movei. Maybe because of a good opening book? Any ideas? [Or because of the bullet TC?]

PS: I'm at work now; when I get home I can publish the PGNs of these matches here (or send them to you, Uri).

Uri Blass · Post by **Uri Blass** » Fri Aug 31, 2007 10:25 am

GenoM wrote: I ran a bullet Nunn match on a 2 GHz computer between Glaurung 2-epsilon/5 and the latest Movei. The result was 11.5-2.5 for Glaurung. I interrupted a similar match between Strelka 1.8 and the latest Movei because of a 9-0 result. So for me it's interesting to see such high results from Movei. Maybe because of a good opening book? Any ideas? [Or because of the bullet TC?]

PS: I'm at work now; when I get home I can publish the PGNs of these matches here (or send them to you, Uri).

I have another explanation.

It is possible that Movei simply prints too much output relative to its opponents, and that printing a lot of output causes it to perform worse at bullet.

Movei prints every fail high, even a false fail high, and it also prints the PV at the end of every iteration.

I tested a private Movei that does not print the PV under Arena, and it performed clearly better in bullet than the previous version (I did not test against other opponents, but I remember a result of 40-12 against the version that prints the PV).

I think that the only reliable interface to use at bullet time controls is WinBoard (WinBoard steals relatively little time from engines that print the PV often).

Note that both Glaurung and Strelka are better than Movei, but I do not think they are better by the margin your results suggest.

Uri

Marc MP · Post by **Marc MP** » Fri Aug 31, 2007 10:54 am

Hi Uri,

Thank you for the new Movei!

I also ran (really fast) bullet games: 30 moves in 10 sec. LGPGNVER didn't find any incorrect behavior for the new Movei. Congratulations! Movei 403 lost two games on time and made an illegal move against Fruit. (I put the game below, but I'm not sure if this is still interesting for you.)

30m/10s, 12-move polyglot book, winboardx, 4M hash, no tablebases.

Code: Select all

Rank Name            Elo    +    - games score oppo. draws 
   1 Fruit 2.1      2760   39   37   200   71%  2610   23% 
   2 Naum 2.0       2713   37   36   200   64%  2610   23% 
   3 Movei00_8_438  2662   23   23   500   50%  2666   22% 
   4 List 512       2661   37   37   200   57%  2610   17% 
   5 Ruffian 1.0.5  2629   37   36   200   52%  2610   22% 
   6 Booot 4.13.1   2567   36   36   200   44%  2610   23% 
   7 Movei00_8_403  2559   24   24   500   36%  2666   21%

Code: Select all

   1 Fruit 2.1      2760 200.0 (141.5 :  58.5)
                         100.0 ( 68.0 :  32.0) Movei00_8_438  2662
                         100.0 ( 73.5 :  26.5) Movei00_8_403  2559
   2 Naum 2.0       2713 200.0 (128.0 :  72.0)
                         100.0 ( 52.5 :  47.5) Movei00_8_438  2662
                         100.0 ( 75.5 :  24.5) Movei00_8_403  2559
   3 Movei00_8_438  2662 500.0 (247.5 : 252.5)
                         100.0 ( 32.0 :  68.0) Fruit 2.1      2760
                         100.0 ( 47.5 :  52.5) Naum 2.0       2713
                         100.0 ( 60.5 :  39.5) List 512       2661
                         100.0 ( 50.0 :  50.0) Ruffian 1.0.5  2629
                         100.0 ( 57.5 :  42.5) Booot 4.13.1   2567
   4 List 512       2661 200.0 (113.0 :  87.0)
                         100.0 ( 39.5 :  60.5) Movei00_8_438  2662
                         100.0 ( 73.5 :  26.5) Movei00_8_403  2559
   5 Ruffian 1.0.5  2629 200.0 (104.5 :  95.5)
                         100.0 ( 50.0 :  50.0) Movei00_8_438  2662
                         100.0 ( 54.5 :  45.5) Movei00_8_403  2559
   6 Booot 4.13.1   2567 200.0 ( 88.0 : 112.0)
                         100.0 ( 42.5 :  57.5) Movei00_8_438  2662
                         100.0 ( 45.5 :  54.5) Movei00_8_403  2559
   7 Movei00_8_403  2559 500.0 (177.5 : 322.5)
                         100.0 ( 26.5 :  73.5) Fruit 2.1      2760
                         100.0 ( 24.5 :  75.5) Naum 2.0       2713
                         100.0 ( 26.5 :  73.5) List 512       2661
                         100.0 ( 45.5 :  54.5) Ruffian 1.0.5  2629
                         100.0 ( 54.5 :  45.5) Booot 4.13.1   2567
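
As a rough cross-check of the tables above, the sketch below compares a few head-to-head results with the score that the plain logistic Elo formula would predict from the listed ratings. This is only an illustration: the ratings in the list were fitted by a rating tool whose model may include draw and colour terms, so exact agreement is not expected. The helper name `expected_score` is hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the plain logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# (player A, rating A, player B, rating B, points A scored out of 100 games),
# copied from the tables above.
pairings = [
    ("Fruit 2.1", 2760, "Movei00_8_438", 2662, 68.0),
    ("Naum 2.0", 2713, "Movei00_8_438", 2662, 52.5),
    ("Movei00_8_438", 2662, "Booot 4.13.1", 2567, 57.5),
]

for a, ra, b, rb, actual in pairings:
    predicted = 100.0 * expected_score(ra, rb)
    print(f"{a} vs {b}: predicted {predicted:.1f}, actual {actual:.1f} of 100")
```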

Other errors found:
List 512 lost 8 times on time against Movei 408 and 10 times against Movei 403.

=============================================================================
Game 510: Movei00_8_403-Fruit 2.1

1. e4 c6 2. d4 d5 3. e5 Bf5 4. Nc3 a6 5. Be3 Qc7 6. Nf3 Nd7 7. Bd3 Bxd3
8. Qxd3 e6 9. O-O Bb4 10. Ne2 Ne7 11. a3 Ba5 12. b4 Bb6 13. Bd2 h6 14. Ng3
O-O 15. Kh1 a5 16. Nh5 Nf5 17. bxa5 Bxa5 18. Bxa5 Rxa5 19. g4 Ne7 20. g5
g6 21. Ng3 c5 22. gxh6 cxd4 23. Qxd4 Nc6 24. Qb2 Kh7 25. Rfe1 Kxh6 26. Re3
Kg7 27. Rae1 Nc5 28. Rd1 Rh8 29. Ree1 Ra4 30. Kf2

Result :*
Warning level :4
Final position: 7r/1pq2pk1/2n1p1p1/2npP3/r7/P4NN1/1QP2P1P/3RR2K w - - 8 30
Analyze result:0-1 {illegal move 30. Kf2 }
Inconsistent header result :0-1

hgm · Post by **hgm** » Fri Aug 31, 2007 11:59 am

Uri Blass wrote: Even Rybka could not do better, so I wonder how the 2815 was calculated.

By BayesElo. It adds a number of virtual draws to the result to mimic an a priori assumption about the rating likelihood. So the rating calculation never uses a 0% or 100% result, and as a result the rating remains finite. Another effect is that BayesElo takes a 100% score of 20-0 much more seriously (awards a higher rating for it) than a 2-0 score.

Basically BayesElo is reluctant to admit that a program could have a rating above average (which is ~2500 for the CCRL list), and doesn't assign a rating higher than necessary to credibly explain away the rest of the score as luck, rather than skill.
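
A simplified sketch of the virtual-draw idea, not BayesElo's actual algorithm: here the virtual draws are simply added to the game record before applying the logistic performance formula against the average opponent. The function name `shrunk_performance`, the default of 2 virtual draws, and the average opponent rating of 2558 (the mean of 2606 and 2510) are assumptions for illustration, and the numbers it produces are not meant to reproduce the 2815 in the list.

```python
import math

def shrunk_performance(avg_opponent, wins, draws, losses, virtual_draws=2.0):
    """Logistic performance rating after adding virtual draws to the record.

    Assumption: a toy stand-in for a prior, not the BayesElo implementation.
    """
    if virtual_draws <= 0:
        raise ValueError("virtual_draws must be positive to keep the result finite")
    score = wins + 0.5 * (draws + virtual_draws)
    games = wins + draws + losses + virtual_draws
    p = score / games  # strictly between 0 and 1 thanks to the virtual draws
    return avg_opponent + 400.0 * math.log10(p / (1.0 - p))

short_run = shrunk_performance(2558, wins=2, draws=0, losses=0)
long_run = shrunk_performance(2558, wins=20, draws=0, losses=0)
print(round(short_run), round(long_run))
```

Both results are finite, and the 20-0 record earns a clearly higher figure than the 2-0 record, which matches the behaviour described above.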
