ChessWar XI Promotion : lust of participants

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

Olivier Deville
Posts: 937
Joined: Wed Mar 08, 2006 9:13 pm
Location: Aurec, France

ChessWar XI Promotion : lust of participants

Post by Olivier Deville »

ChessWar XI Promotion 40m/20'
Swiss system, 11 rounds
List of participants
Engines ranked 1 to 10 promote to ChessWar XII F


Games start today! The first round is broadcast at 10:00 French time (chesswar.dyndns.org port 16044).

More information about the tournament and on how to connect to the broadcast :
http://loirechecs.chez-alice.fr/chesswar/

Olivier

Code:

Nr. Name                 Elo     Fed              
1   FAUCE 0.41c          1716    ITA     
2   ROBIN 0.983          1704    POL     
3   ENIGMA 1.1.4         1700    POL     
4   BIKJUMP 1.4          1699*   NED     
5   MIZAR 3.0            1696    ITA     
6   SILKE CHESS 1.2.1209 1696    GER     
7   RDCHESS 3.23         1687    AUT     
8   STORM 0.6            1687    USA     
9   DCHESS 1.02          1676    USA     
10  IQ23 003             1672    GER     
11  CHESSRIKUS 1.4.66    1668    USA     
12  TAMERLANE 0.2        1667    ITA     
13  BRUTUS 4.3           1666    NED     
14  CANALLA 0.175        1666    GER     
15  MOOBOO 0.2b          1656    GER     
16  MILADY 2.15          1641    FRA     
17  TSCP 1.81c           1638    USA     
18  ZOTRON 4.4.6         1635    USA     
19  MINIMAX              1634    GER     
20  SDBC 0.4.14.0        1628    GER     
21  GOLEM 0.4            1627    ITA     
22  HOPLITE 2.1.1        1622    FIN     
23  ALDEBARAN 0.7.0      1619    ITA     
24  TJCHESS 0.78R2       1618    USA     
25  DEEPTROUBLE 1.00     1617    USA     
26  NEEDLE 0.53.1        1614    FIN     
27  JESTER 0.83          1605    USA     
28  AZRAEL               1599*   SCO     
29  JARS 1.69            1599*   FRA     
30  VICKI 0.031a         1599*   RSA     
31  BEACHES 2.2          1597    USA     
32  SIMON 1.2            1597    USA     
33  SIMPLE 0048          1592    FRA     
34  POLARCHESS 1.3       1591    NOR     
35  RAINMAN 0.7.5        1590    SWE     
36  AWESOME 1.71         1584    AUS     
37  GEDEONE 1620         1581    ITA     
38  POOKY 2.7            1580    USA     
39  HOKUS POKUS 0.6.3    1577    POL     
40  ATLANCHESS 3.3       1573    ENG     
41  PHILEMON C           1565    SUI     
42  JUPITER 001          1556    DEN     
43  MSCP 1.6g            1553    NED     
44  PIRANHA 0.5          1551    GER     
45  ECHELON 1.03         1548    USA     
46  JCHESS 1.0           1548    POL     
47  BACE 0.45            1546    USA     
48  LARSENVB 0.05.01     1530    ITA     
49  ROQUE 1.1            1525    ESP     
50  SKAKI 1.22           1521    GRE     
51  CEFAP 0.72           1520    SWE     
52  YAWCE 0.16           1517    DEN     
53  LOVELACE 1.0r1       1508    FRA     
54  EIRE 0.1             1499*   ESP     
55  JCHECS 0.0.8         1499*   FRA     
56  LITTLECLARA Final    1499*   ESP     
57  OZWALD 0.43          1499    FIN     
58  TONY'S CHESS 0.02    1499*   CAN     
59  MURDERHOLE 1.0.10    1498    USA     
60  PENTAGON 1.2         1492    ITA     
61  BRAINCRACK           1487    GER     
62  ALICE 0.3.5          1485    USA     
63  ANANKE 0.002         1477    ENG     
64  APILCHESS 1.05r1b    1460    GER     
65  NERO 6.1             1460    FIN     
66  BLIKSKOTTEL 0.7      1457    RSA     
67  MINIMARDI 1.3        1456    SWE     
68  CARNIVOR             1454    USA     
69  TRUENO 1.0           1453    USA     
70  STANDERSEN 1.31      1446    SWE     
71  EXACTO 0.d           1443    USA     
72  THE LIGHTNING 2.04   1442    GER     
73  MINICHESSAI 1.19     1432    POL     
74  EXCELSIOR 2.32b      1430    POL     
75  FIMBULWINTER 5.00    1429    USA     
76  DIMITRI 1.35-E       1417    ITA     
77  ZEPHYR 0.61          1408    GER     
78  T.REX 1.9b           1407    FRA     
79  CHESSCRAFT 96        1406    DEN     
80  STAN'S CHESS 1.42    1406    NED     
81  BREMBOCE 0.4         1403    ITA     
82  NANOOK 0.16          1399*   FRA     
83  GRINGO 1.4.7m        1390    AUT     
84  SINAPSE 1.1          1361    BRA     
85  CRUX 5.0m            1360    HUN     
86  DARKFUSCH 0.9        1358    GER     
87  PIERRE 1.7           1358    CAN     
88  BACHESS 1.3          1352    GER     
89  EDEN 0.0.11_server   1332    GER     
90  RAFFAELA 0.14        1328    ITA     
91  O'CHESS 1.0          1317    USA     
92  SHARPCHESS 0.06      1317    SWE     
93  BRAMA 051204         1313    ITA     
94  TIKOV 0.6.3          1301    ENG     
95  NUMPTY 0.21pr        1299*   ENG     
96  KILLERQUEEN 2b3      1294    ITA     
97  MYSTERY 2.1          1281    GER     
98  ROBOKEWLPER 0.047a   1267    USA     
99  BIGBOOK 3.1          1259    USA     
100 JOANA                1257    MEX     
101 MARQUIS 0.1.5        1256    SUI     
102 TURING               1256    GER     
103 TRYNYTY 1.0          1252    HUN     
104 LTK 2.0              1251    USA     
105 YOUK 1.05            1244    FRA     
106 BLITZTER 2.0         1243    USA     
107 ZCHESS2 2004         1240    ENG     
108 KACE 0.8.1           1238    USA     
109 MFCHESS 1.1          1238    SWE     
110 CASSANDRE 0.24       1231    FRA     
111 NSVCHESS 0.14        1226    FRA     
112 XADRECO 5.7          1223    BRA     
113 STRATEGICDEEP 1.31   1194    POL     
114 JAKSAH 0.17          1188    SER     
115 CHAD'S CHESS 0.15    1179    USA     
116 TIFFANYS 0.3         1160    SUI     
117 GEKO 0.4.3           1159    ITA     
118 TESTINA 2.2          1153    ITA     
119 PYOTR Amateur 0.6    1140    GRE     
120 FIANCHETTO           1134    AUS     
121 BABYCHESS 11.1       1129    GER     
122 DREAMER 0.1.0        1117    GER     
123 KOENIG SCHWARZ       1114    GER     
124 CHEOPS 1.1           1112    CAN     
125 TALVMENNI 0.1        1111    FAI     
126 ACE 0.1              1099*   USA     
127 MICROCHESS           1099*   USA     
128 NEG 0.3d             1099*   NED     
129 PULCHESS 0.2b        1099*   ITA     
130 NEOPHYTE 0.1         1093    USA     
131 BELOFTE 0.2.8        1044    BEL     
132 USURPER 0.5          1044    USA     
133 CS4210               1036    USA     
134 GIUCHESS 1.0b2       1008    ITA     
135 PRECHESS 0.7.8       1008    BRA     
136 AKIBA 0.0.20031118   1000    POL     
137 CPP1 0.1038          1000    NED     
138 ECE 0.1              1000    ITA     
139 EDEN2                1000    ITA     
140 ETABETA 7.21         1000    ITA     
141 GRAY MATTER R6       1000    USA     
142 LAMOSCA 0.10         1000    ITA     
143 OMAR 3.2             1000    ESP     
144 POS 1.14             1000    NED     
145 RATTATECHESS 0.666a  1000    ITA     
146 SACHY 0.2            1000    CZE     

* = provisional rating     
Tony Thomas

Re: ChessWar XI Promotion : lust of participants

Post by Tony Thomas »

I guess Fauce is the lustiest participant since it is currently the strongest rated. :lol: :lol:
hgm
Posts: 28356
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: ChessWar XI Promotion : lust of participants

Post by hgm »

Tony Thomas wrote:I guess Fauce is the lustiest participant since it is currently the strongest rated. :lol: :lol:
:lol:

I am afraid NEG is a bit overrated, though. It is true that in the test tourneys it could score draws even against some of the engines above it, but that was at blitz. Most opponents will improve at the time control used in the promo, while NEG will continue to move instantly.

Well, we will see...
hgm
Posts: 28356
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Rating-scale problem

Post by hgm »

Code:

1	[0]	FAUCE 0.41c	1716	1-0 	EXCELSIOR 2.32b	1430	[0]
2	[0]	FIMBULWINT 5.00	1429	0-1 	ROBIN 0.983	1704	[0]
3	[0]	ENIGMA 1.1.4	1700	1-0 	DIMITRI 1.35-E 	1417	[0]
4	[0]	ZEPHYR 0.61	1408	0-1 	BIKJUMP 1.4	1699	[0]
5	[0]	MIZAR 3.0	1696	1-0 	T.REX 1.9b	1407	[0]
6	[0]	CHESSCRAFT 96	1406	0-1 	SILKEC 1.2.1209	1696	[0]
7	[0]	RDCHESS 3.23	1687	1-0 	STAN'S CHE 1.42	1406	[0]
8	[0]	BREMBOCE 0.4	1403	0-1 	STORM 0.6	1687	[0]
9	[0]	DCHESS 1.02	1676	1-0 	NANOOK 0.16	1399	[0]
10	[0]	GRINGO 1.4.7m	1390	0-1 	IQ23 003	1672	[0]
11	[0]	CHESSRIK 1.4.66	1668	1-0 	SINAPSE 1.1	1361	[0]
12	[0]	CRUX 5.0m	1360	0-1 	TAMERLANE 0.2	1667	[0]
13	[0]	BRUTUS 4.3	1666	1-0 	DARKFUSCH 0.9	1358	[0]
14	[0]	PIERRE 1.7	1358	0-1 	CANALLA 0.175	1666	[0]
15	[0]	MOOBOO 0.2b	1656	0-1 	BACHESS 1.3	1352	[0]
16	[0]	EDEN 0.0.11_ser	1332	0-1 	MILADY 2.15	1641	[0]
17	[0]	TSCP 1.81c	1638	1-0 	RAFFAELA 0.14	1328	[0]
18	[0]	O'CHESS 1.0	1317	0-1 	ZOTRON 4.4.6	1635	[0]
19	[0]	MINIMAX 	1634	1-0 	SHARPCHESS 0.06	1317	[0]
20	[0]	BRAMA 051204	1313	0-1 	SDBC 0.4.14.0	1628	[0]
21	[0]	GOLEM 0.4	1627	1-0 	TIKOV 0.6.3	1301	[0]
22	[0]	NUMPTY 0.21pr	1299	0-1 	HOPLITE 2.1.1	1622	[0]
23	[0]	ALDEBARAN 0.7.0	1619	1-0 	KILLERQUEEN 2b3	1294	[0]
24	[0]	MYSTERY 2.1	1281	0-1 	TJCHESS 0.78R2 	1618	[0]
25	[0]	DEEPTROUBL 1.00	1617	1-0 	ROBOKEWL 0.047a	1267	[0]
26	[0]	BIGBOOK 3.1	1259	0-1 	NEEDLE 0.53.1	1614	[0]
27	[0]	JESTER 0.83	1605	1-0 	JOANA 		1257	[0]
28	[0]	MARQUIS 0.1.5	1256	0-1 	AZRAEL 		1599	[0]
29	[0]	JARS 1.69	1599	1-0 	TURING 		1256	[0]
30	[0]	TRYNYTY 1.0	1252	1/2 	VICKI 0.031a	1599	[0]
31	[0]	BEACHES 2.2	1597	1-0 	LTK 2.0		1251	[0]
32	[0]	YOUK 1.05	1244	0-1 	SIMON 1.2	1597	[0]
33	[0]	SIMPLE 0048	1592	1-0 	BLITZTER 2.0	1243	[0]
34	[0]	ZCHESS2 2004	1240	0-1 	POLARCHESS 1.3	1591	[0]
35	[0]	RAINMAN 0.7.5	1590	1-0 	KACE 0.8.1	1238	[0]
36	[0]	MFCHESS 1.1	1238	0-1 	AWESOME 1.71	1584	[0]
37	[0]	GEDEONE 1620	1581	1-0 	CASSANDRE 0.24	1231	[0]
38	[0]	NSVCHESS 0.14	1226	0-1 	POOKY 2.7	1580	[0]
39	[0]	HOKUS POK 0.6.3	1577	1-0 	XADRECO 5.7	1223	[0]
40	[0]	STRATEGICD 1.31	1194	0-1 	ATLANCHESS 3.3	1573	[0]
41	[0]	PHILEMON C	1565	1-0 	JAKSAH 0.17	1188	[0]
42	[0]	CHAD'S CHE 0.15	1179	0-1 	JUPITER 001	1556	[0]
43	[0]	MSCP 1.6g	1553	1-0 	TIFFANYS 0.3	1160	[0]
44	[0]	GEKO 0.4.3	1159	0-1 	PIRANHA 0.5	1551	[0]
45	[0]	ECHELON 1.03	1548	1-0 	TESTINA 2.2	1153	[0]
46	[0]	PYOTR Amate 0.6	1140	0-1 	JCHESS 1.0	1548	[0]
47	[0]	BACE 0.45	1546	1-0 	FIANCHETTO 	1134	[0]
48	[0]	BABYCHESS 11.1	1129	0-1 	LARSENV 0.05.01	1530	[0]
49	[0]	ROQUE 1.1	1525	1-0 	DREAMER 0.1.0	1117	[0]
50	[0]	KOENIG SCHWARZ 	1114	0-1 	SKAKI 1.22	1521	[0]
51	[0]	CEFAP 0.72	1520	1-0 	CHEOPS 1.1	1112	[0]
52	[0]	TALVMENNI 0.1	1111	0-1 	YAWCE 0.16	1517	[0]
53	[0]	LOVELACE 1.0r1	1508	1-0 	ACE 0.1		1099	[0]
54	[0]	MICROCHESS 	1099	0-1 	EIRE 0.1	1499	[0]
55	[0]	JCHECS 0.0.8	1499	1-0 	NEG 0.3d	1099	[0]
56	[0]	LITTLECLARA Fin	1499	1-0 	PULCHESS 0.2b	1099	[0]
57	[0]	OZWALD 0.43	1499	1-0 	NEOPHYTE 0.1	1093	[0]
58	[0]	BELOFTE 0.2.8	1044	0-1 	TONY'S CHE 0.02	1499	[0]
Something seems wrong with your rating scale. In the list above, the 58 engines of the upper half play the 58 engines of the lower half of the initial ranking (as is usual in the Swiss pairing system). There are only 2 cases where the stronger engine did not win: Mooboo-Bachess 0-1 and Trynyty-Vicki 1/2-1/2. The weak group thus scored only 1.5 pt out of 58 games = 2.59%.

This corresponds to an average rating distance of ~551 pts in the Elo system. The average listed rating difference of these groups is only 347 pts, however. With a 347 pt difference, one would have expected the weaker group to score 11.3% (6.5 pt), rather than 2.59% (1.5 pt). A very significant difference, as the standard error over 58 games at that score percentage is only ~3%.

So it seems your rating scale is way too compressed, and that actual rating differences are about a factor 1.58 larger than those listed. Thus, if the ratings at the top of the list (near 1700) are correct, what you list as 1000 would in truth be more like 587.
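For anyone who wants to redo the arithmetic, here is a minimal sketch. It assumes Elo's original Gaussian model with a standard deviation of 200 per player (so 200·√2 for the difference of two performances); the function names are only illustrative.

```python
from statistics import NormalDist

# Assumption: Elo's original model, where each player's performance is
# normally distributed with a standard deviation of 200 rating points.
# The difference of two performances then has sigma = 200 * sqrt(2).
SIGMA_DIFF = 200 * 2 ** 0.5
STD_NORMAL = NormalDist()

def expected_score(rating_diff):
    """Expected score of a player rated `rating_diff` points above the opponent."""
    return STD_NORMAL.cdf(rating_diff / SIGMA_DIFF)

def implied_rating_diff(score):
    """Rating difference that would produce the given score fraction."""
    return STD_NORMAL.inv_cdf(score) * SIGMA_DIFF

weak_points, games = 1.5, 58      # result of the weak group in round 1
listed_gap = 347                  # average listed rating difference

observed = weak_points / games
implied_gap = -implied_rating_diff(observed)

print(f"observed weak-group score: {observed:.2%}")
print(f"rating gap implied by that score: {implied_gap:.0f}")
print(f"expected weak-group score at {listed_gap} pts: {expected_score(-listed_gap):.1%}")
print(f"stretch factor: {implied_gap / listed_gap:.2f}")
```

The printed values should land close to the figures quoted above; small deviations are possible depending on the exact constants and rounding used.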

(Sorry about the layout; apparently tabs are not correctly transmitted even in a 'code' section. For a more readable list, look here.)
Uri Blass
Posts: 10815
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Rating-scale problem

Post by Uri Blass »

hgm wrote:

Something seems wrong with your rating scale. In the list above, the 58 engines of the upper half play the 58 engines of the lower half of the initial ranking (as is usual in the Swiss pairing system). There are only 2 cases where the stronger engine did not win: Mooboo-Bachess 0-1 and Trynyty-Vicki 1/2-1/2. The weak group thus scored only 1.5 pt out of 58 games = 2.59%.

This corresponds to an average rating distance of ~551 pts in the Elo system. The average listed rating difference of these groups is only 347 pts, however.

So it seems your rating scale is way too compressed, and that actual rating differences are about a factor 1.58 larger than those listed. Thus, if the ratings at the top of the list (near 1700) are correct, what you list as 1000 would in truth be more like 587.

(Sorry about the layout; apparently tabs are not correctly transmitted even in a 'code' section. For a more readable list, look here.)
I am not sure if your model is correct in predicting results for cases of big difference in rating.

The only way to build a correct model is from games, and the model may differ for different levels of playing strength.

The main question is the following:
suppose that the expected result of A against B is n1% and the expected result of B against C is n2%

What is the expected result of A against C?

The only way to find out is by games, and I suspect that the result at a high level may differ from the result at a low level.

The reason is that at a very high level, which we do not have yet today, White never loses, and you may find cases where
A(i) scores 60% against A(i+1) for 1<=i<=99 but
A(1) scores only 75% against A(100), by winning with White and drawing with Black.
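
To make the contrast concrete, here is a rough sketch of what an Elo-type model would predict for that chain. It assumes the common logistic form of the Elo curve and that rating gaps simply add up along the chain; both are assumptions for illustration, not measured facts.

```python
import math

def logistic_expected(diff):
    """Expected score for a rating advantage `diff` (logistic Elo curve, an assumption)."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def logistic_gap(score):
    """Rating advantage that yields the given expected score."""
    return 400.0 * math.log10(score / (1.0 - score))

step = logistic_gap(0.60)             # gap between neighbours A(i) and A(i+1)
total = 99 * step                     # gap between A(1) and A(100) if gaps add up
predicted = logistic_expected(total)  # model prediction for A(1) against A(100)

print(f"gap per step: {step:.1f}")
print(f"A(1) vs A(100) gap: {total:.0f}")
print(f"predicted score: {predicted:.6f}")
```

Under these assumptions the model predicts a score of practically 100% for A(1) against A(100), while in the scenario above it would be only 75%.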

Uri
hgm
Posts: 28356
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Rating-scale problem

Post by hgm »

I did the same test for the first round of ChessWar XI-F. There the listed average rating difference between the strong and the weak group is 137. That should have resulted in a score of more than 31.2%. In fact the score of the weak group was only 10 out of 41 = 24.4%. This despite the fact that some of the engines in the weak group were obviously underrated, as they were strongly improved versions using the rating of their predecessors (HfC, Mediocre). And it was mainly those engines that scored the points.

So it seems that there somehow is a systematic underestimate of rating differences.
hgm
Posts: 28356
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Rating-scale problem

Post by hgm »

Uri Blass wrote:I am not sure if your model is correct in predicting results for cases of big difference in rating.

The only way to build a correct model is from games, and the model may differ for different levels of playing strength.

The main question is the following:
suppose that the expected result of A against B is n1% and the expected result of B against C is n2%

What is the expected result of A against C?

The only way to find out is by games, and I suspect that the result at a high level may differ from the result at a low level.

The reason is that at a very high level, which we do not have yet today, White never loses, and you may find cases where
A(i) scores 60% against A(i+1) for 1<=i<=99 but
A(1) scores only 75% against A(100), by winning with White and drawing with Black.
Well, it is not my model but that of Prof. Elo. I am using his original Gaussian distribution for the results as a function of rating difference. There are alternative distributions in use (e.g. BayesElo uses one), and they have tails that decay less aggressively than those of a Gaussian. But this means that you would need an even larger rating difference before the score percentage would drop to such an exceedingly low level as observed in the promo.
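
A small sketch of that last point, using the logistic curve as a stand-in for a heavier-tailed alternative. The logistic form is an assumption chosen for illustration; it is not meant to reproduce the exact formula of BayesElo or any other program.

```python
import math
from statistics import NormalDist

# Assumption: Gaussian model with sigma = 200 per player, hence
# 200 * sqrt(2) for the difference between two performances.
SIGMA_DIFF = 200 * math.sqrt(2)

def gaussian_gap(score):
    """Rating deficit that gives the weaker side `score` under the Gaussian model."""
    return -NormalDist().inv_cdf(score) * SIGMA_DIFF

def logistic_gap(score):
    """Rating deficit that gives the weaker side `score` under a logistic curve."""
    return 400 * math.log10((1 - score) / score)

observed = 1.5 / 58   # weak-group score in the promo's first round

print(f"Gaussian model: {gaussian_gap(observed):.0f} pts")
print(f"Logistic model: {logistic_gap(observed):.0f} pts")
```

The heavier-tailed curve needs the larger gap to explain the same score, so switching models makes the mismatch with the listed 347 pts worse, not better.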

The validity of the model has to be checked on a vast number of games. Sonas has done this (for humans), and found that the original Elo formula leaves a lot to be desired. Nevertheless, even in his empirical formula, a win percentage of 2.5% (averaged over both colors) corresponds to a much larger rating difference than 350. (At 350 his curve gives a 20% score with white and 10% with black, so 15% on average. This is even farther from the observed 2.6% than the prediction of 11.3% from the Elo formula.)

Note that the listed rating difference of 350 cannot yet be described as extreme. It is in the range where the bulk of games are usually played, and where the models should be good and all approximately the same. This holds even more for the F-division result, where the rating difference is only 137.

Although in principle you are right about the model being strength dependent, you remark yourself that we are not yet in the regime where white is unbeatable. And the engines in the promo are indeed very very far from that regime, as the weaker ones still almost always lose even with white.

There just is something very wrong with the ratings. And that of course tends to be a self-perpetuating condition, as new ratings will be calculated from these faulty old ratings...
pijl

Re: Rating-scale problem

Post by pijl »

hgm wrote:I did the same test for the first round of ChessWar XI-F. There the listed average rating difference between the strong and the weak group is 137. That should have resulted in a score of more than 31.2%. In fact the score of the weak group was only 10 out of 41 = 24.4%. This despite the fact that some of the engines in the weak group were obviously underrated, as they were strongly improved versions using the rating of their predecessors (HfC, Mediocre). And it was mainly those engines that scored the points.

So it seems that there somehow is a systematic underestimate of rating differences.
This is a known problem with Elo calculations in computer chess, and it is very well known to those who collect bloated ratings on chess servers. If you look at their history, they play many engines rated about 200-300 points lower than themselves, as they gain the most that way.
Richard.
hgm
Posts: 28356
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Rating-scale problem

Post by hgm »

Hmm... The remarkable thing is that Sonas, in the paper I referred to, found that the stronger player is disadvantaged in the classical Elo system.

Is this because in general people cannot even properly calculate the rating as it should be in the Elo system? What happens if we feed the total database of all computer games on which these ratings are supposed to be based into BayesElo?
pijl

Re: Rating-scale problem

Post by pijl »

hgm wrote:Hmm... The remarkable thing is that Sonas, in the paper I referred to, found that the stronger player is disadvantaged in the classical Elo system.

Is this because in general people cannot even properly calculate the rating as it should be in the Elo system? What happens if we feed the total database of all computer games on which these ratings are supposed to be based into BayesElo?
I don't think that the Elo calculations are incorrect. I think the bigger problem is that the game collections on which the rating lists are based are not ideal. Usually there are only a few games with bigger Elo differences, so I guess you cannot expect the scaling of the list to be correct based on those few games.
Richard.