Movei 438 FRC testing completed - +106 ELO after 1,200 games

Spock

Post by Spock »

As the title says, Movei 438 testing has been completed. Thanks to Uri for this new version. It is +106 ELO over Movei 383, which was the last version that I tested.

The pure list is my preferred one, here:
http://www.computerchess.org.uk/ccrl/40 ... _pure.html

Movei 438 moved up 2 places, into the top 10 at 9th position.

Next for the rating list will likely be Rybka FRC when it is released, which will almost certainly take the number one spot away from Hiarcs which has held it for so long.
Uri Blass

Re: Movei 438 FRC testing completed - +106 ELO after 1,200 g

Post by Uri Blass »

Thanks for your tests.

It seems that you used several computers to test Movei at the same time, because with one computer you would certainly not have had enough time to play 1,200 games at 40/4.

It is interesting to look for the biggest rating difference in a match that the lower-rated engine won, and also for the biggest surprises in performance relative to the real rating.

We have the following results:
Loop 10.32f 2888 - Hiarcs 11.1 2916: 50.5-49.5
Fruit 051103 2893 - Shredder 10 2902: 53-47
Naum 2.2 2886 - Loop 10.32f 2888: 51.5-48.5
Naum 2.2 2886 - Fruit 2.3 2887: 51-49
Spike 1.2 Turin 2849 - Fruit 2.2.1 2851: 51-49
Naum 2.1 2813 - Fruit 2.2.1 2851: 51-49
Deep Sjeng 2.5 1CPU 2746 - Glaurung 1.2.1 2768: 54.5-45.5
Movei 383 2603 - Ufim 8.02 2625: 52.5-47.5
Hamsters 0.4 2514 - Hermann 2.0 2538: 56.5-43.5
Aice 0.99.2 2395 - Hermann 1.7 2420: 51-49
Ayito 0.2.994 2378 - Hermann 1.7 2420: 54-46

Interestingly, there are no big surprises.
The largest rating difference is in the last match (42 Elo).
The most lopsided result is 56.5-43.5.

The largest differences between performance and rating are in the following matches (I included only cases where the difference is bigger than 70 Elo):

Hiarcs X54 2879 - Ufim 8.02 2625: 73-27 (-82 Elo for Hiarcs)
Fruit 2.2.1 2851 - Movei 383 2603: 86-14 (+73 Elo for Fruit)
Ayito 0.2.994 2378 - Hermann 1.7 2420: 54-46 (+73 Elo for Ayito)
Hamsters 0.4 2514 - Hermann 2.0 2538: 56.5-43.5 (+71 Elo for Hamsters)
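
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch in Python. It assumes the standard logistic Elo formula, which is not necessarily the formula used for the figures above, so the results land within a few Elo of the quoted numbers rather than matching them exactly. The function names are made up for illustration.

```python
import math

def performance_diff(score_fraction):
    """Rating difference implied by a match score (logistic Elo model).

    score_fraction must be strictly between 0 and 1.
    """
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

def surprise(rating_a, rating_b, points_a, games=100):
    """How many Elo engine A performed above (+) or below (-) expectation.

    rating_a, rating_b: list ratings of the two engines.
    points_a: points scored by engine A in the match.
    """
    rating_gap = rating_a - rating_b
    return performance_diff(points_a / games) - rating_gap

# Matches taken from the list above: (name, rating A, rating B, points for A)
matches = [
    ("Hiarcs X54 - Ufim 8.02", 2879, 2625, 73.0),
    ("Fruit 2.2.1 - Movei 383", 2851, 2603, 86.0),
    ("Ayito 0.2.994 - Hermann 1.7", 2378, 2420, 54.0),
    ("Hamsters 0.4 - Hermann 2.0", 2514, 2538, 56.5),
]

for name, ra, rb, pts in matches:
    print(f"{name}: {surprise(ra, rb, pts):+.1f} Elo")
```

Under this assumption the Hiarcs X54 match comes out at roughly -81 Elo, close to the -82 quoted above; the other three come out a few Elo lower than quoted, which suggests a slightly different conversion table was used for the original figures.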


It seems that, in practice, a single match of 100 FRC games gives the Elo rating with an error of less than 75 Elo with almost 100% confidence.

This is surprising, because I expected the rating to depend more on the opponent.
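
As a rough cross-check of the 75 Elo figure, the following Python sketch estimates the statistical error of an n-game match. It assumes independent games, the logistic Elo formula, and an even match; the draw ratio and the two-standard-deviation cutoff are illustrative assumptions, not values taken from the rating list.

```python
import math

def elo_from_score(score_fraction):
    """Rating difference implied by a score fraction (logistic Elo model)."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

def elo_margin(games, score=0.5, draw_ratio=0.0, z=2.0):
    """Upper Elo error margin of a match result at z standard deviations.

    games: number of games in the match.
    score: true expected score per game (0.5 means an even match).
    draw_ratio: assumed fraction of drawn games; it must not exceed
        2 * min(score, 1 - score), otherwise win or loss probability
        would be negative.
    """
    win = score - draw_ratio / 2.0
    loss = 1.0 - win - draw_ratio
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (win * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + loss * (0.0 - score) ** 2)
    std_error = math.sqrt(variance / games)
    return elo_from_score(score + z * std_error) - elo_from_score(score)

print(f"100 games, no draws:  +/- {elo_margin(100):.0f} Elo")
print(f"100 games, 30% draws: +/- {elo_margin(100, draw_ratio=0.3):.0f} Elo")
```

With no draws at all, which is the worst case for the variance, two standard deviations over 100 games correspond to about 70 Elo, and any realistic draw ratio makes the margin smaller. That is consistent with seeing no deviations above roughly 75 to 82 Elo in the matches listed, although two standard deviations is closer to 95% confidence than to 100%.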

Uri
Spock

Re: Movei 438 FRC testing completed - +106 ELO after 1,200 g

Post by Spock »

Hi Uri,

Yes, given the very wide ELO range that I play (up to a maximum of +/- 300 ELO), there is certainly the possibility of some funny results. But I figured that the 100-game pairings probably eliminate that, which indeed seems to be the case. I normally wouldn't play up to +/- 300 on our normal chess lists, but there are so few engines here that it's necessary. Of course, the fact that the opening positions are random also adds to the reliability.