Adam Hair wrote: There is a definite difference.
The ratings for the Also-Rans were actually computed from the entire CCRL 40/4 database, from which I am using the engines rated 2200 Elo or less (as computed with Ordo). Here are the Ordo ratings for the top and bottom of the entire database:
   # ENGINE                        : RATING   POINTS  PLAYED    (%)
   1 Houdini 3 64-bit 4CPU         : 3417.5   1320.5    1644  80.3%
   2 Houdini 2.0c 64-bit 4CPU      : 3360.1   1833.5    2465  74.4%
   3 Houdini 1.5a 64-bit 4CPU      : 3345.7   1196.5    1582  75.6%
   4 Critter 1.6a 64-bit 4CPU      : 3308.2    980.0    1450  67.6%
   5 Houdini 3 64-bit              : 3306.7    317.5     472  67.3%
   6 Stockfish 2.3.1 64-bit 4CPU   : 3298.7    566.0     951  59.5%
   7 Critter 1.2 64-bit 4CPU       : 3287.8    913.5    1331  68.6%
   8 Rybka 4.1 64-bit 4CPU         : 3283.2   1375.0    2072  66.4%
....................................................................
1198 MicroChess 1976               :  451.4    110.5     196  56.4%
1199 NEG 0.3d                      :  402.8    164.0     435  37.7%
1200 Ram 2.0                       :  375.2    144.5     435  33.2%
1201 LaMoSca 0.10                  :  305.4     89.0     286  31.1%
1202 CPP1                          :  282.2     73.0     255  28.6%
1203 ACE 0.1                       :  144.9     67.0     473  14.2%
1204 POS 1.20                      :  110.9     55.5     298  18.6%
1205 Brutus RND                    :    0.0     32.0     306  10.5%
Now, here are the BayesElo ratings, computed using the fitted drawelo and the default scale parameter:
Rank Name                          Elo    +    - games score  oppo. draws
   1 Houdini 3 64-bit 4CPU        3177   17   17  1644   80%   2944   29%
   2 Houdini 2.0c 64-bit 4CPU     3127   14   14  2465   74%   2943   32%
   3 Houdini 1.5a 64-bit 4CPU     3117   17   17  1582   76%   2920   31%
   4 Houdini 3 64-bit             3081   27   27   472   67%   2962   40%
   5 Critter 1.6a 64-bit 4CPU     3080   16   16  1450   68%   2956   42%
   6 Stockfish 2.3.1 64-bit 4CPU  3071   19   19   951   60%   3009   47%
   7 Critter 1.2 64-bit 4CPU      3060   17   17  1331   69%   2925   40%
   8 Rybka 4.1 64-bit 4CPU        3058   14   14  2072   66%   2940   39%
..........................................................................
1198 MicroChess 1976               373   55   55   196   56%    322   31%
1199 NEG 0.3d                      330   45   45   435   38%    473   29%
1200 Ram 2.0                       311   46   46   435   33%    489   30%
1201 LaMoSca 0.10                  257   49   49   286   31%    449   61%
1202 CPP1                          227   54   54   255   29%    447   23%
1203 ACE 0.1                       111   52   52   473   14%    609   22%
1204 POS 1.20                       80   56   56   298   19%    462   22%
1205 Brutus RND                      0   60   60   306   10%    462   21%
Michel wrote: Thanks. This is interesting. The difference seems to be about 10% over the entire Elo scale.

It could be interesting not only to compare [(maximum Ordo rating) - (minimum Ordo rating)] / [(maximum BayesElo rating) - (minimum BayesElo rating)], but also to look at the distribution of the ratings in a dimensionless way. I propose the following math:
o_i = [(Ordo rating_i) - (minimum rating of Ordo)]/[(maximum rating of Ordo) - (minimum rating of Ordo)]
b_i = [(BayesElo rating_i) - (minimum rating of BayesElo)]/[(maximum rating of BayesElo) - (minimum rating of BayesElo)]
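As a quick illustration, this min-max normalization can be sketched in Python (the `normalize` helper is my own name; the rating values are taken from a few engines in the tables above):

```python
def normalize(ratings):
    """Map each rating r to [0, 1] via (r - min) / (max - min)."""
    lo, hi = min(ratings), max(ratings)
    return [round((r - lo) / (hi - lo), 4) for r in ratings]

# (Ordo rating, BayesElo rating) for a few engines from the tables above
engines = {
    "Houdini 3 64-bit 4CPU": (3417.5, 3177),
    "MicroChess 1976":       (451.4,  373),
    "POS 1.20":              (110.9,  80),
    "Brutus RND":            (0.0,    0),
}

o = normalize([r[0] for r in engines.values()])
b = normalize([r[1] for r in engines.values()])
for name, oi, bi in zip(engines, o, b):
    print(f"{name:24s} {oi:.4f}  {bi:.4f}")
```

Because this subset still contains the maximum (Houdini 3 4CPU) and the minimum (Brutus RND) of each list, the printed values agree with the o_i and b_i columns below.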
Engine: o_i b_i
Houdini 3 64-bit 4CPU 1.0000 1.0000
Houdini 2.0c 64-bit 4CPU 0.9832 0.9843
Houdini 1.5a 64-bit 4CPU 0.9790 0.9811
Critter 1.6a 64-bit 4CPU 0.9680 0.9695
Houdini 3 64-bit 0.9676 0.9698
Stockfish 2.3.1 64-bit 4CPU 0.9652 0.9666
Critter 1.2 64-bit 4CPU 0.9620 0.9632
Rybka 4.1 64-bit 4CPU 0.9607 0.9625
[...]
MicroChess 1976 0.1321 0.1174
NEG 0.3d 0.1179 0.1039
Ram 2.0 0.1098 0.0979
LaMoSca 0.10 0.0894 0.0809
CPP1 0.0826 0.0715
ACE 0.1 0.0424 0.0349
POS 1.20 0.0325 0.0252
Brutus RND 0.0000 0.0000

You can see that the differences between the two columns are not negligible at all. Another question is whether this table is actually useful or not.
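As a minimal check of the overall scale comparison mentioned above: since both lists anchor Brutus RND at 0, the ratio of the rating spans reduces to the ratio of the top ratings (values taken from the tables in this post):

```python
# Span = maximum rating - minimum rating; both minima are 0 (Brutus RND).
ordo_span  = 3417.5 - 0.0   # Ordo:     Houdini 3 4CPU minus Brutus RND
bayes_span = 3177.0 - 0.0   # BayesElo: Houdini 3 4CPU minus Brutus RND

ratio = ordo_span / bayes_span
print(round(ratio, 4))      # ≈ 1.0757
```

So on this data the Ordo scale is roughly 7.6% wider than the BayesElo scale, in the same ballpark as the ~10% difference estimated above.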
Regards from Spain.
Ajedrecista.


