An idea for ranking rating list for chess programs

Uri Blass · Post by **Uri Blass** » Wed Sep 19, 2007 11:55 am

Give micromax a ranking of 0.

Now give rank 1 to every program that can score more than 50/80 in the Silver suite against micromax.

Give rank N to every program that can score more than 50/80 against one of the programs that achieved rank N-1.
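The rule above can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the post: the `scores` dictionary, the function name `chain_ranks`, and the `max_rank` cap are hypothetical, and a program that qualifies at several levels is assumed to keep the highest rank it reaches.

```python
def chain_ranks(scores, base="micromax", threshold=50, max_rank=100):
    """Assign chain ranks from 80-game match results.

    scores[(a, b)] is the number of points engine a scored against engine b.
    An engine gets rank N if it scored more than `threshold` points against
    some engine that holds rank N-1. The highest rank reached is kept.
    `max_rank` is a safety cap (an assumption, not part of the proposal).
    """
    engines = {name for pair in scores for name in pair}
    best = {base: 0}
    level = {base}  # engines that achieved the previous rank
    for n in range(1, max_rank + 1):
        nxt = {
            a
            for a in engines
            for b in level
            if scores.get((a, b), 0) > threshold
        }
        if not nxt:
            break
        for a in nxt:
            best[a] = n
        level = nxt
    return best


# Made-up results, for illustration only.
example = {
    ("A", "micromax"): 55, ("micromax", "A"): 25,
    ("B", "micromax"): 70, ("micromax", "B"): 10,
    ("B", "A"): 60, ("A", "B"): 20,
}
ranks = chain_ranks(example)
```

Here A reaches rank 1 by beating micromax, and B reaches rank 2 by beating A.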

I wonder what ranking Rybka would get, and whether she could achieve a higher ranking at a longer time control when the same programs are used.

For the test I suggest using only the 78 programs that are ranked by the CCRL with more than 200 games, based on the following link:

http://www.computerchess.org.uk/ccrl/4040.live/

Uri

hgm · Post by **hgm** » Wed Sep 19, 2007 2:13 pm

It seems to me that what you suggest is not very different from the ordinary determination of ratings. (Many rating lists use their own books rather than those of the engines, and starting from the Silver positions is logically equivalent to having a small book.)

With your 'chain of engines' you could try to make use of relatively weak engines that play particularly well against one particular opponent (for their Elo difference) to make the chain longer. If there is a loop of 3 engines that beat each other 50-30, the chain could even be infinitely long. But what is the point of it?
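Whether such a loop exists in a given set of results can be checked with a standard depth-first search on the "beats" graph. The sketch below is an assumption-laden illustration: the `scores` layout and the name `has_beating_cycle` are hypothetical, and it uses the strict "more than 50/80" criterion from the first post, so the example loop is 51-29 rather than 50-30. A loop only lengthens the chain if one of its members is itself reachable from micromax.

```python
def has_beating_cycle(scores, threshold=50):
    """Return True if some engines beat each other in a closed loop.

    scores[(a, b)] is the number of points a scored against b.
    An edge a -> b exists when a scored more than `threshold` points.
    """
    beats = {}
    for (a, b), points in scores.items():
        if points > threshold:
            beats.setdefault(a, set()).add(b)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state = {}

    def visit(node):
        state[node] = GREY
        for nxt in beats.get(node, ()):
            colour = state.get(nxt, WHITE)
            if colour == GREY:
                return True  # back edge: a loop was found
            if colour == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state.get(n, WHITE) == WHITE and visit(n) for n in list(beats))


# Made-up results: X beats Y, Y beats Z, Z beats X, each 51-29.
loop = {
    ("X", "Y"): 51, ("Y", "X"): 29,
    ("Y", "Z"): 51, ("Z", "Y"): 29,
    ("Z", "X"): 51, ("X", "Z"): 29,
}
no_loop = {("X", "Y"): 51, ("Y", "Z"): 51, ("X", "Z"): 60}
```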

BTW, it is much more interesting to do this downward, as there are many engines much weaker than uMax, and their ratings are very poorly known. It would be interesting to know what the rating of a random mover is, but there is still an incredible gap between Chad's Chess and engines like Pos or NEG. Also, the common failure of engines in this range to recognize and avoid rep-draws makes quantitative measurement very difficult, as such engines tend to behave differently than the common rating models assume. In particular, you could have a very long chain of engines decisively beating each next-lower member of the chain, but the engine at the bottom would still score a sizable fraction of rep-draws against the top engine. Many engines also suffer from self-inflicted losses, through illegal moves or spontaneous resignations in a totally won position, that occur with a frequency nearly independent of the capability of the opponent.

Ovyron · Post by **Ovyron** » Fri Sep 21, 2007 11:28 am

What about not counting draws? You could run the test and then get rid of the games that ended in a draw.
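One way to read this suggestion is to score a match on decisive games only. The helper below is a hypothetical sketch (the name `decisive_score` and the use of a win fraction instead of a fixed 50/80 cutoff are assumptions, since the post does not say how the threshold would be adapted).

```python
def decisive_score(wins, draws, losses):
    """Fraction of decisive games won, ignoring draws entirely.

    Returns None when every game was drawn, since no fraction is defined.
    """
    decisive = wins + losses
    if decisive == 0:
        return None
    return wins / decisive


# Made-up 80-game match: 40 wins, 30 draws, 10 losses.
fraction = decisive_score(40, 30, 10)
```

With draws counted as half points this match would be 55/80, but on decisive games alone the winner took 40 of 50.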
