Well, I tried the following:
Based on the results from Chess War X Promo, I extracted ratings with BayesElo using prior = 0. The ratings are then spread out over about twice as large a range as with the default setting, namely from -1400 to +1000. I threw away all engines (versions) with 100% scores, as with prior=0 their ratings are indeterminate.
Then I used those ratings in the current Promo pairing. I took only the games between engines that had received a rating from the previous Promo. This left 42 games. The (new) rating difference in these games ranged from 506 to 1407, averaging 916 points. Now, with the score formula used by BayesElo (score = 100%/(1+10^(RatingDifference/400))), calculating the number of points expected to be salvaged by the weak group (on a per-game basis, and adding all games), we get 0.476 point (out of 42 games). So if the priorless ratings are any good, we should expect one draw.
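For anyone who wants to redo this sum, here is a minimal Python sketch of the per-game calculation, using the score formula quoted above. The function names and the three sample differences are my own illustration (the smallest, average and largest differences mentioned), not the actual list of 42 pairings, so the printed total is not the 0.476 figure.

```python
def expected_score(rating_difference):
    """Expected score (between 0 and 1) of the weaker engine.

    rating_difference is how many rating points the opponent is stronger,
    plugged into score = 1 / (1 + 10^(RatingDifference / 400)).
    """
    return 1.0 / (1.0 + 10.0 ** (rating_difference / 400.0))


def expected_points(rating_differences):
    """Sum the per-game expected scores over a list of rating differences."""
    return sum(expected_score(d) for d in rating_differences)


# Illustrative differences only: smallest, average and largest gap quoted above.
sample_differences = [506, 916, 1407]

for d in sample_differences:
    print(f"difference {d}: expected score {100.0 * expected_score(d):.3f}%")

print(f"expected points over these games: {expected_points(sample_differences):.4f}")
```

Feeding the real 42 rating differences to expected_points() would give the total expected for the weak group. Note that the games with the smallest differences dominate the sum, which is why the total is larger than 42 times the expected score at the average difference.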
In fact, we see one win (of the 3 'surprises' in the first Promo round, only Mooboo-BaChess had both participants rated: Trynyty-Vicki and Ananke-Akiba were dropped because Ananke and Vicki are new). This is not really a significant difference, and if you look at the game, even the win is suspicious: BaChess won on time in a totally lost position because neither engine claims or recognizes rep-draws, and they were just infinitely repeating the same position...
In fact the other 'surprises' were just as suspicious: Akiba managed to win on time because Ananke crashed after 9 moves, while Ananke was already a Knight ahead. Such wins really don't tell you anything about the rating difference: Ananke would have lost this game no matter how poor the opponent was, because even a random mover would not let itself be checkmated (by Ananke) within 9 moves. (Note that Eden2 managed to get itself checkmated against Blikskottel in 4 moves, though!)
So in conclusion, it seems that the rating differences are really much larger than even the results of this promo round (2.5 out of 73 = 3.4%) suggest, as this score of the weaker group seems to be due more to problems with the stronger engines ('defeating themselves', by crashing or opponent-independent idiocy), rather than due to any skill of the opponent.
If we use the rating of TSCP 1.81c given in Olivier's list (1638) as a calibration point, it means the bottom of the list ends somewhere around an Elo of 0:
Rating Engine
1804 Storm 0.6
1756 Mooboo 0.2b
1740 Milady 2.15
1739 MiniMax
1738 Damas 7c
1712 ChessRikus 1.4.66
1707 Clarabit 0.18
1688 Atlanchess 3.3
1675 Simple 0048
1657 Pooky 2.7
1638 TSCP 1.81c
1626 PolarChess 1.3
1614 Jester 0.83
1609 Golem 0.4
1591 SharpChess2 2.52
1587 Dimitri 1.34e
1587 Milady 2.1
1576 Simon 1.2
1552 Beaches 2.2
1546 JChess 1.0
1531 Hokus Pokus 0.6.3
1522 Hoplite 2.1.1
1515 LarsenVB 0.05.01
1514 Bace 0.45
1504 Roque 1.1
1470 Lovelace 1.0r1
1462 Jupiter 001
1458 Gedeone 1620
1431 MSCP 1.6g
1422 Piranha 0.5
1413 Rainman 0.7.5
1408 Pentagon 1.2
1364 Alice 0.3.5
1360 Cefap 0.72
1359 Nero 6.1
1318 The Lightning 2.04
1306 Skaki 1.22
1294 Braincrack
1272 Yawce 0.16
1263 APILchess 1.05r1b
1261 StAndersen 3.1
1247 Blikskottel 0.7
1245 Murderhole 1.0.10
1241 MiniMardi 1.3
1219 Eden 0.0.11_server
1218 Exacto 0.d
1209 Ozwald 0.43
1206 Sinapse 1.1
1175 MiniChessAI 1.19
1171 Zephyr 0.61
1162 Excelsior 2.32b
1161 Trueno 1.0
1132 SCP 1.0b
1104 Raffaela 0.14
1092 T.rex 1.9b
1073 Stan's Chess 1.42
1030 KillerQueen 2b3
1024 Carnivor
987 Pierre 1.7
987 BaChess 1.3
955 Tikov 0.6.3
949 BremboCE 0.4
944 Gringo 1.4.7
940 DarkFusch 0.9
904 SharpChess 0.0.6
901 O'Chess
897 Brama 051204
888 Matilde 2.6.1
887 Turing
881 Cassandre 0.24
876 Blitzter 2.0
876 Crux 5.0m
875 RoboKewlper 0.047a
855 Marquis 0.1.5
838 ZChess2 2004
836 Joana
806 Kace 0.8.1
787 Youk 1.05
777 Mystery 2.1
769 LTK 2.0
768 Dimitri 1.35e
749 Geko 0.4.3
730 MFChess 1.1
723 JaksaH 0.17
685 BabyChess 11.1
679 Pyotr Amateur 0.6
660 Tiffanys 0.2
655 Talvmenni 0.1
652 BigBook 3.1
650 Xadreco 5.0
642 StrategicDeep 1.31
619 Chad's Chess 0.15
616 NSVChess 0.14
582 Trynyty 1.0
569 Neophyte 0.1
547 Testina 2.2
512 Dreamer 0.1.0
490 Koenig Schwarz
487 Fianchetto
419 Cheops 1.1
381 Belofte 0.2.8
363 Protej 0.5.3
353 CS4210
346 Usurper 0.5
322 Akiba 0.0.20031118
317 Sachy 0.2
293 GiuChess 1.01b1
290 PreChess 0.7.8
194 LaMoSca 0.10
37 POS 1.10
-66 Etabeta 7.21
-78 RattateChess 0.666a
-94 Omar 3.1
-102 ECE 0.1
-156 CPP1 0.1038
-213 Eden2
-236 Gray Matter
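The calibration used for the list above is just a constant shift: every BayesElo rating gets the same offset, chosen so that the anchor engine lands on its known rating. A rough sketch, where the function name and the raw values are made up purely for illustration (they are not the actual BayesElo output):

```python
def calibrate(raw_ratings, anchor_engine, anchor_rating):
    """Shift all ratings by one offset so anchor_engine gets anchor_rating."""
    offset = anchor_rating - raw_ratings[anchor_engine]
    return {name: round(rating + offset) for name, rating in raw_ratings.items()}


# Hypothetical raw ratings, NOT the real BayesElo numbers.
raw = {"TSCP 1.81c": 200.0, "Engine A": 350.0, "Engine B": -900.0}

calibrated = calibrate(raw, "TSCP 1.81c", 1638)
for name, rating in sorted(calibrated.items(), key=lambda item: -item[1]):
    print(rating, name)
```

Since only rating differences enter the score formula, such a shift changes none of the expected scores.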
The second Promo round will provide a nice test to see if these huge rating differences are real, as now engines will be paired that are much closer in rating. So the scores between the weaker and stronger groups should stay far enough from 100% that meaningful conclusions about the rating difference can be drawn, rather than all scoring being due to flukes.