Intra OliThink Tournaments

Discussion of computer chess matches and engine tournaments.

OliverBr
Posts: 560
Joined: Tue Dec 18, 2007 8:38 pm
Location: Munich, Germany
Full name: Dr. Oliver Brausch

Re: Intra OliThink Tournaments

Post by OliverBr » Tue Sep 22, 2020 8:59 pm

So, here is a tournament of OliThink versions 5.7.0 through 5.8.0, with 2000 games for each version:

   # PLAYER            :  RATING  ERROR  POINTS  PLAYED   (%)    W    D    L  D(%)  CFS(%)
   1 OliThink 5.7.9    :      99     15  1157.0    2000  57.9  725  864  411  43.2      70
   2 OliThink 5.8.0    :      95     15  1144.0    2000  57.2  691  906  403  45.3      92
   3 OliThink 5.7.8    :      84     15  1110.5    2000  55.5  660  901  439  45.0      91
   4 OliThink 5.7.7    :      73     16  1076.5    2000  53.8  643  867  490  43.4     100
   5 OliThink 5.7.6    :      48     16  1000.0    2000  50.0  569  862  569  43.1      69
   6 OliThink 5.7.5    :      44     15   987.5    2000  49.4  552  871  577  43.5      50
   7 OliThink 5.7.4    :      44     16   987.5    2000  49.4  555  865  580  43.2      97
   8 OliThink 5.7.3    :      28     16   938.0    2000  46.9  516  844  640  42.2      93
   9 OliThink 5.7.2    :      16     15   901.5    2000  45.1  492  819  689  41.0      98
  10 OliThink 5.7.0    :       0   ----   852.5    2000  42.6  438  829  733  41.5      62
  11 OliThink 5.7.1    :      -2     16   845.0    2000  42.2  426  838  736  41.9     ---

White advantage = 32.92 +/- 2.56
Draw rate (equal opponents) = 44.17 % +/- 0.48

So now we see two apparently "deteriorating" versions: -2 Elo from 5.7.0 to 5.7.1 and -4 Elo from 5.7.9 to 5.8.0.
Now people are screaming: "How could you do that, a regression and all?"
But it's not that simple.
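One way to read those small gaps is to set them against the ERROR column. The following is only a rough sketch: it assumes the ERROR values can be treated as symmetric margins and simply adds them up, which is a conservative rule of thumb and not how the rating tool itself computes CFS. The function name `gap_within_margins` is made up for this illustration.

```python
def gap_within_margins(rating_a, error_a, rating_b, error_b):
    """Return True if the rating gap is no larger than the two margins combined."""
    return abs(rating_a - rating_b) <= error_a + error_b

# Values copied from the first table (RATING, ERROR).
# 5.7.0 is the anchor and shows "----" for its error; 0 is assumed here.
print(gap_within_margins(99, 15, 95, 15))  # 5.7.9 vs 5.8.0
print(gap_within_margins(0, 0, -2, 16))    # 5.7.0 vs 5.7.1
print(gap_within_margins(99, 15, 48, 16))  # 5.7.9 vs 5.7.6, for contrast
```

Under this rule of thumb, both "regressions" fall inside the margins, while the 51-point gap between 5.7.9 and 5.7.6 does not.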

I put all those questionable versions to the test:

   # PLAYER            :  RATING  ERROR  POINTS  PLAYED   (%)     W     D     L  D(%)  CFS(%)
   1 Glaurung 2.2      :     148     12  2836.5    4158  68.2  2448   777   933  18.7     100
   2 GreKo 2020.03     :      27     12  1936.0    4158  46.6  1527   818  1813  19.7     100
   3 OliThink 5.8.0    :      10     11  1807.5    4158  43.5  1210  1195  1753  28.7      95
   4 OliThink 5.7.9    :       0   ----  1736.0    4158  41.8  1127  1218  1813  29.3     ---

White advantage = 40.47 +/- 3.47
Draw rate (equal opponents) = 25.52 % +/- 0.48

Further:

   # PLAYER            :  RATING  ERROR  POINTS  PLAYED   (%)     W    D     L  D(%)  CFS(%)
   1 Glaurung 2.2      :     213     15  2242.5    3053  73.5  2001  483   569  15.8     100
   2 GreKo 2020.03     :      79     14  1547.0    3050  50.7  1280  534  1236  17.5     100
   3 OliThink 5.7.1    :      14     13  1195.0    3052  39.2   815  760  1477  24.9      98
   4 OliThink 5.7.0    :       0   ----  1118.5    3051  36.7   741  755  1555  24.7     ---

White advantage = 33.77 +/- 4.54
Draw rate (equal opponents) = 22.86 % +/- 0.57

So we see: beating itself is not enough for an engine. It should beat other engines as well :)
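To put the two views side by side, here is a small sketch that just subtracts ratings copied from the tables above. Each table has its own anchor, so only differences inside one table are compared. The dictionary layout and the names `self_play`, `gauntlet`, `gain` and `compare` are chosen for this illustration only.

```python
# (older version, newer version) -> (rating of older, rating of newer)
self_play = {
    ("5.7.0", "5.7.1"): (0, -2),
    ("5.7.9", "5.8.0"): (99, 95),
}
gauntlet = {  # tournaments that include Glaurung 2.2 and GreKo 2020.03
    ("5.7.0", "5.7.1"): (0, 14),
    ("5.7.9", "5.8.0"): (0, 10),
}

def gain(pair_ratings):
    """Rating of the newer version minus rating of the older one."""
    older, newer = pair_ratings
    return newer - older

def compare(self_play, gauntlet):
    """Map each version pair to (self-play gain, gain with other engines)."""
    return {pair: (gain(self_play[pair]), gain(gauntlet[pair])) for pair in self_play}

for (old, new), (sp, ga) in compare(self_play, gauntlet).items():
    print(f"{old} -> {new}: self-play {sp:+d}, with other engines {ga:+d}")
```

Both pairs come out slightly negative in self-play and positive once other engines are in the pool.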
Chess Engine OliThink: http://brausch.org/home/chess
OliThink GitHub: https://github.com/olithink

Post by OliverBr » Sat Oct 17, 2020 8:32 am

Here is a new tourney:

   # PLAYER            :  RATING  ERROR  POINTS  PLAYED   (%)     W     D     L  D(%)  CFS(%)
   1 OliThink 5.8.7    :     194     12  2253.5    3500  64.4  1493  1521   486  43.5     100
   2 OliThink 5.8.6    :     153     12  2035.5    3500  58.2  1260  1551   689  44.3      99
   3 OliThink 5.8.5    :     140     12  1963.0    3500  56.1  1185  1556   759  44.5     100
   4 OliThink 5.8.4    :     118     12  1846.5    3500  52.8  1051  1591   858  45.5      99
   5 OliThink 5.8.3    :     105     12  1772.5    3500  50.6   999  1547   954  44.2     100
   6 OliThink 5.8.2    :      70     11  1584.0    3500  45.3   859  1450  1191  41.4     100
   7 OliThink 5.8.1    :      24     12  1333.5    3500  38.1   639  1389  1472  39.7     100
   8 OliThink 5.8.0    :       0   ----  1211.5    3500  34.6   564  1295  1641  37.0     ---

White advantage = 34.76 +/- 2.21
Draw rate (equal opponents) = 45.53 % +/- 0.43

It looks as though the Elo differences between an engine's own versions are not the same as the Elo gain against other engines, because OliThink 5.8.7 is certainly less than 194 Elo stronger than 5.8.0 against other engines.
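For a sense of scale, this sketch uses the standard logistic Elo formula to show what a 194-point gap would mean as an expected score. That this formula matches the exact model behind the table is an assumption, and `expected_score` is a name picked for this example.

```python
def expected_score(elo_diff):
    """Expected score (0..1) of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# 194 is the self-play rating gap between 5.8.7 and 5.8.0 from the table above.
print(f"{expected_score(194):.3f}")
```

A 194 Elo gap corresponds to an expected score of roughly 75% for the stronger side, which is the kind of margin one would then also expect to see against other engines if the self-play gain carried over fully.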