CCRL 40/40, 40/4 and FRC lists updated (21st February 2015)

Modern Times · Post by **Modern Times** » Sun Feb 22, 2015 1:17 pm

Did you read the comments above?

The pure list has been updated. Everything is updated together; it is an automated process. If an engine doesn't have enough games against pure list opponents, it doesn't go on the list until it does. In the case of Texel:

"Texel 1.05 64-bit 4CPU" out (only 132 games), "Texel 1.04 64-bit 4CPU" in

Texel 1.05 64-bit 4CPU needs more games before it can replace 1.04 on the pure list.
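As a rough illustration of the rule described above, here is a minimal Python sketch. It is not the actual CCRL script: the threshold `MIN_GAMES`, the function names, the single opponent, and the 300-game count for 1.04 are all made up for the example. Only the 132 games for Texel 1.05 come from the post.

```python
MIN_GAMES = 200  # hypothetical threshold; the real CCRL value is not stated in this thread


def games_vs_pool(games, engine, pool):
    """Count the games `engine` played against members of `pool` (excluding itself)."""
    opponents = set(pool) - {engine}
    count = 0
    for white, black in games:
        if white == engine and black in opponents:
            count += 1
        elif black == engine and white in opponents:
            count += 1
    return count


def pick_version(games, versions_newest_first, pool, min_games=MIN_GAMES):
    """Return the newest version with enough games against the pool, or None."""
    for version in versions_newest_first:
        if games_vs_pool(games, version, pool) >= min_games:
            return version
    return None


# Illustrative data: 132 games for the new version, an invented 300 for the old one.
games = [("Texel 1.05", "Gull 3")] * 132 + [("Texel 1.04", "Gull 3")] * 300
chosen = pick_version(games, ["Texel 1.05", "Texel 1.04"], {"Gull 3"})
print(chosen)  # Texel 1.04
```

With these assumed numbers the older version stays on the list until the newer one has accumulated enough games.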

The pure list is not necessarily better. You reduce the distortion, yes, but you replace it with far greater margins of error, because the smaller database contains far fewer games.
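To put rough numbers on the margin-of-error point, here is a small Python sketch. It uses a simple normal approximation for a single head-to-head match, which is not how bayeselo computes its intervals, and the win/draw/loss splits are invented. The point is only that the margin shrinks roughly with the square root of the number of games.

```python
import math


def elo_from_score(score):
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(wins, draws, losses, z=1.96):
    """Approximate Elo difference and +/- margin for one head-to-head match."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    low = elo_from_score(score - z * stderr)
    high = elo_from_score(score + z * stderr)
    return elo_from_score(score), (high - low) / 2.0


# Invented, perfectly balanced results: 132 games versus ten times as many.
diff_small, margin_small = elo_margin(40, 52, 40)
diff_large, margin_large = elo_margin(400, 520, 400)
print(round(margin_small), round(margin_large))
```

The second margin comes out at roughly a third of the first, which is what a tenfold increase in games would be expected to give under this approximation.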

petero2 · Post by **petero2** » Sun Feb 22, 2015 3:03 pm

Modern Times wrote: The pure list is not necessarily better. You reduce the distortion, yes, but you replace it with far greater margins of error, because the smaller database contains far fewer games.

Currently the 40/40 pure list for free engines has a problem. All 4CPU engines seem to be around 600-700 Elo below their expected level. The problem is that even though all participants in this list have played a sufficient number of games against other participants in the list, there are no games between the 4CPU and the 1CPU engines.

Modern Times · Post by **Modern Times** » Sun Feb 22, 2015 3:24 pm

petero2 wrote: Currently the 40/40 pure list for free engines has a problem. All 4CPU engines seem to be around 600-700 Elo below their expected level. The problem is that even though all participants in this list have played a sufficient number of games against other participants in the list, there are no games between the 4CPU and the 1CPU engines.

Testers are very aware of the need to play 1CPU vs 4CPU to connect the engines. If they don't, then there is a problem for sure!

michiguel · Post by **michiguel** » Sun Feb 22, 2015 3:37 pm

petero2 wrote:
Modern Times wrote: The pure list is not necessarily better. You reduce the distortion, yes, but you replace it with far greater margins of error, because the smaller database contains far fewer games.
Currently the 40/40 pure list for free engines has a problem. All 4CPU engines seem to be around 600-700 Elo below their expected level. The problem is that even though all participants in this list have played a sufficient number of games against other participants in the list, there are no games between the 4CPU and the 1CPU engines.

Right. Ordo can be useful to detect this automatically with the switch -g.

Running
ordo -p free.[37786].pgn -g g.txt

gives
(see groups = 2)

Total games            37786
 - White wins          14141
 - Draws               12204
 - Black wins          11441
 - Truncated               0
Unique head to head    17.01%
Reference rating      2300.0 (average of the pool)

Loose Anchors = none
Relative Anchors = none
groups=2
Encounters, Total=6427, Main=6427, @ Interface between groups=0

The output is written to the file g.txt, which lists the two groups, so you know exactly which engines need connecting games (see the note at the bottom of each group).

Group 1
 | AnMon 5.60
 | Aristarch 4.50
 | Pseudo 0.7c
 | SOS 5.1
 | Jonny 2.83 32-bit
 | Slow Chess Blitz WV2.1
 | Little Goliath Evolution 3.12
 | Snitch 1.6.2 32-bit
 | Quark 2.35
 | Anechka 0.08
 | Bruja 1.9.1 32-bit
 | Muse 0.899b
 | Dragon 4.6
 | Gaia 3.5 32-bit
 | Ant 2006-F
 | Amateur 2.82
 | Ufim 8.02
 | Ruffian 1.0.5
 | Aice 0.99.2
 | Arion 1.7
 | Monarch 1.7
 | Pharaon 3.5.1
 | Kiwi 0.6d 32-bit
 | Gosu 0.16
 | Diablo 0.5.1
 | Patzer 3.80
 | Zappa 1.1 64-bit
 | Trace 1.37a
 | WildCat 7
 | Comet B68
 | Ayito 0.2.994
 | Zeus 1.29
 | Natwarlal 0.14
 | PostModernist 1016
 | Prophet 2.0 32-bit
 | Marvin 1.3.0
 | Smash 1.0.3
 | Clueless 1.4
 | Alaric 707
 | Asterisk 0.6
 | Homer 2.01
 | Parrot 07.07.22
 | Flux 2.2.1
 | Petir 4.999999
 | BigLion 2.23x
 | Alf 1.09
 | ProDeo 1.6
 | Movei 00.8.438 (10 10 10)
 | Queen 4.03
 | Mustang 4.97
 | Joker 1.1.14
 | Matheus 2.3
 | E.T. Chess 13.01.08
 | Horizon 4.4
 | Clarabit 1.00 32-bit
 | OBender 3.2.4.2
 | Anatoli 0.35k
 | ChessAlex 2.0r4
 | Uralochka 1.1b
 | Popochin 3.2
 | Sage 3.53
 | NagaSkaki 5.12
 | Sorgenkind 0.4
 | Twisted Logic 20080620
 | Cerebro 3.03d
 | Arasan 10.4
 | Surprise 4.3b13
 | Rival 1.18
 | Warrior 1.0.3
 | Lime 66
 | BibiChess 0.5
 | Typhoon 1.00-358
 | Delfi 5.4
 | Yace 0.99.87
 | Tao 5.6
 | Pepito 1.59 32-bit
 | Cyrano 0.6b17 32-bit
 | Colossus 2008b
 | Matacz 1.4
 | Amy 0.87b
 | Phalanx XXII
 | Hamsters 0.7.1
 | Micro-Max 4.8 (DM-PII)
 | BikJump 2.01 32-bit
 | Feuerstein 0.4.6.1
 | Heracles 0.6.16
 | Counter 1.1
 | Neurosis 2.5
 | Brutus 7.02b
 | Amundsen 0.80
 | BlackBishop 1.0
 | The Baron 2.23 64-bit
 | Sloppy 0.2.2 64-bit
 | Schola 1.1.0 32-bit
 | Reger 0.09
 | Hermann 2.5 32-bit
 | Xpdnt 091007
 | Amyan 1.72
 | OliThink 5.3.0 32-bit
 | KMTChess 1.2.1 32-bit
 | PLP 1661571
 | Jabba 1.0 32-bit
 | Protej 0.5.7
 | RattateChess 1.0 Nosferatu
 | Slibo 0.5.1
 | MatMoi 7.15.0-cct 32-bit
 | N2 0.4 64-bit
 | Ares 1.004
 | LittleThought 1.052 64-bit
 | Daydreamer 1.75 64-bit
 | KnockOut 0.7.1
 | Sungorus 1.4 32-bit
 | ChessMind 0.82
 | Ghost 2.0.1
 | Beowulf 2.4a 64-bit
 | Atak 6.8
 | Chronos 1.9.9 64-bit
 | Kurt 0.9.2 32-bit
 | Waxman 2010
 | Eeyore 1.52 64-bit
 | Dolphin 1.0
 | Plisk 0.2.7d 32-bit
 | Predateur 2.0
 | Spike 1.4 Leiden
 | Spark 1.0 64-bit
 | Adam 3.3
 | GChess IV 0.2.0
 | Bubble 1.5
 | AliChess 4.25
 | BugChess2 1.9 64-bit
 | DoubleCheck 1.0 64-bit
 | Alex 2.14a 64-bit
 | Naraku 1.4
 | GarboChess 3.0 64-bit
 | Francesca MAD 0.19
 | Simplex 0.9.8 64-bit
 | Nemo 1.0.1 64-bit
 | ECE 12.01
 | Philou 3.7.1 64-bit
 | Quazar 0.4 64-bit
 | Tornado 4.88 64-bit
 | TJchess 1.1 64-bit
 | Frenzee 3.5.19 64-bit
 | Bison 9.11 64-bit
 | MinkoChess 1.3 64-bit
 | Bobcat 3.25 64-bit
 | Loop 13.6 (Loop 2007) 64-bit
 | RomiChess P3L 64-bit
 | Butcher 1.64 64-bit
 | Ifrit m1.8 64-bit
 | nanoSzachy 4.0 64-bit
 | Gibbon 2.57a 64-bit
 | Bagatur 1.3a 64-bit
 | ChessKISS 1.7 64-bit
 | Mediocre 0.5
 | Pupsi2 0.09
 | Alfil 13.1 64-bit
 | ProChess 1.02AD
 | Myrddin 0.86 64-bit
 | Betsabe II 1.30 64-bit
 | Glass 2.0 64-bit
 | Rotor 0.8
 | Nebula 2.0 64-bit
 | Jazz 721 64-bit
 | EveAnn 1.71a
 | Toga II 3.0
 | Dirty 20Apr2013 64-bit
 | GNU Chess 5.50 64-bit
 | Booot 5.2.0 64-bit
 | Scorpio 2.7.6 64-bit
 | Pawny 1.0 64-bit
 | Carballo 0.8
 | Murka 3 64-bit
 | Fischerle 0.9.30b 64-bit
 | Shallow r688 64-bit
 | DanaSah 5.07
 | Tigran 2.4n 64-bit
 | Octochess r5190 64-bit
 | Mango Paola Ajedrez 4.1
 | Maverick 0.51 64-bit
 | Delphil 3.1 64-bit
 | Absolute Zero 2.4.0.0 64-bit
 | RedQueen 1.1.97 64-bit
 | Cheng4 0.36c 64-bit
 | FireFly 2.7.0 64-bit
 | Godel 3.4.9 64-bit
 | Gaviota 1.0 64-bit
 | Arminius 2014-01-18 64-bit
 | Exacto 0.e 64-bit
 | Djinn 1.021 64-bit
 | GreKo 12.0 64-bit
 | MadChess 1.4 64-bit
 | Capivara LK 0.09b02c 64-bit
 | Cheese 1.6.1 64-bit
 | Rodin 7.0
 | Orion 0.2 64-bit
 | Hakkapeliitta 1.0 64-bit
 | Jellyfish 1.0
 | Atlas 3.70em 64-bit
 | Vajolet2 1.45 64-bit
 | Tucano 5.00 64-bit
 | iCE 2.0 64-bit
 | Pedone 0.5 64-bit
 | Rhetoric 1.4.1 64-bit
 | Sjakk 2.2 64-bit
 | Deuterium 14.3.34.130 64-bit
 \---> this group is isolated from the rest

Group 2
 | Hannibal 1.4b 64-bit 4CPU
 | Critter 1.6a 64-bit 4CPU
 | Fire 3.0 64-bit 4CPU
 | BlackMamba 2.0 64-bit 4CPU
 | Gull 3 64-bit 4CPU
 | Stockfish 5 64-bit 4CPU
 | Texel 1.04 64-bit 4CPU
 | Naum 4.6 64-bit 4CPU
 | Protector 1.7.0 64-bit 4CPU
 \---> this group is isolated from the rest

petero2 · Post by **petero2** » Sun Feb 22, 2015 3:39 pm

Modern Times wrote:
petero2 wrote: Currently the 40/40 pure list for free engines has a problem. All 4CPU engines seem to be around 600-700 Elo below their expected level. The problem is that even though all participants in this list have played a sufficient number of games against other participants in the list, there are no games between the 4CPU and the 1CPU engines.
Testers are very aware of the need to play 1CPU vs 4CPU to connect the engines. If they don't, then there is a problem for sure!

Yes, and in the full list I have never seen this to be a problem. It may not be a problem in the pure lists most of the time either. This is the first time I have seen it. In this particular case the lowest ranked 4CPU engine is Protector 1.7.0 (Elo 3091 in the full list) and the highest ranked 1CPU engine is Spike 1.4 Leiden (Elo 2924 in the full list). It is understandable if they have not been played against each other.
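For a sense of how lopsided even this closest 4CPU vs 1CPU pairing would be, here is a tiny Python sketch using the standard logistic Elo formula. The list ratings themselves come from bayeselo, which models draws differently, so treat the result as an approximation only.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the standard logistic Elo formula."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Full-list ratings quoted above: Protector 1.7.0 4CPU vs Spike 1.4 Leiden.
gap = 3091 - 2924
print(gap, round(expected_score(3091, 2924), 2))  # 167 0.72
```

Any other pairing across the two groups would have a larger gap than this one.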

Overall the CCRL list is great. I just wanted to illustrate one more reason why the pure lists are not necessarily better than the full lists.

Modern Times · Post by **Modern Times** » Sun Feb 22, 2015 4:05 pm

I haven't seen it before either.

The scripts will, for example, automatically exclude two new engines that have only played each other, since bayeselo cannot give a rating in that situation. So they do check that the database is connected, but only in the pre-processing stage, on the overall database. There are clearly no such checks when the pure lists are being created. Thanks for pointing this out.
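A minimal Python sketch of what such a check could look like if it were re-run on the pure-list subset. This is neither the CCRL scripts nor Ordo's implementation, and the engine names and games are hypothetical. It groups engines with a union-find structure and ignores any game involving an engine outside the subset.

```python
from collections import defaultdict


def connected_groups(games, engines):
    """Group `engines` into sets linked by at least one chain of games.

    `games` is an iterable of (white, black) name pairs; games involving an
    engine outside `engines` are ignored.
    """
    engines = set(engines)
    parent = {e: e for e in engines}

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]  # path halving
            e = parent[e]
        return e

    for white, black in games:
        if white in engines and black in engines:
            parent[find(white)] = find(black)

    groups = defaultdict(set)
    for e in engines:
        groups[find(e)].add(e)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical database: the only 1CPU vs 4CPU link runs through an old version.
games = [
    ("Alpha 1CPU", "Beta 1.0 4CPU"),
    ("Beta 1.0 4CPU", "Gamma 4CPU"),
    ("Beta 2.0 4CPU", "Gamma 4CPU"),
]
everyone = {"Alpha 1CPU", "Beta 1.0 4CPU", "Beta 2.0 4CPU", "Gamma 4CPU"}
pure = everyone - {"Beta 1.0 4CPU"}  # keep only the newest Beta

print(len(connected_groups(games, everyone)))  # 1
print(len(connected_groups(games, pure)))      # 2
```

In this toy case the overall database is connected, but dropping the older version removes the only link between the 1CPU and 4CPU engines, which is the kind of situation a check on the overall database alone would miss.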

Modern Times · Post by **Modern Times** » Sun Feb 22, 2015 6:14 pm

michiguel wrote: Right. Ordo can be useful to detect this automatically with the switch -g

Very useful to know, but we don't use Ordo.

michiguel · Post by **michiguel** » Sun Feb 22, 2015 6:58 pm

Modern Times wrote:
michiguel wrote: Right. Ordo can be useful to detect this automatically with the switch -g
Very useful to know, but we don't use Ordo.

It does not matter. You can use Ordo just to figure out whether you need connecting games. The -g switch does not produce a ranking. For the ranking, you can use anything else, of course.

Miguel

Graham Banks · Post by **Graham Banks** » Sun Feb 22, 2015 7:19 pm

lucasart wrote:
Graham Banks wrote:The latest CCRL Rating Lists and Statistics are available for viewing from the following links:
http://computerchess.org.uk/ccrl/4040/ (40/40)
http://www.computerchess.org.uk/ccrl/404/ (40/4)
http://www.computerchess.org.uk/ccrl/404FRC/ (FRC 40/4)
Thanks. But it seems the "pure list" was not updated. For example, I can't see Texel 1.05 in the pure list:
http://computerchess.org.uk/ccrl/4040/r ... _pure.html

On a side note, I think the default view should be the pure list, not the distorted list. With so many versions of the same engines (or clones) being tested, rating distortions become quite significant. What do you think?

I never really take much notice of the pure list when organising my testing to be honest. Perhaps I should.

Clones/derivatives are always a tricky subject.

Which engines qualify as clones/derivatives? Who decides? On what basis and what criteria is the decision made? Is there unanimous agreement over such criteria?

lech · Post by **lech** » Mon Feb 23, 2015 11:51 pm

Graham, I asked you to continue the test of Sting SF 4.8.3 and I am very happy that you did (and are still doing) it. Version 4.8.3 is not stable and can hang. But the result of this version is much more important for me than that of the next one, 4.8.4. Believe me, I respect you much more than the prominent authors of the prominent engines.

I know programmers, they feel like Gods!
