Give micromax a rank of 0.
Then give rank 1 to every program that can score more than 50/80 on the Silver suite against micromax.
Give rank N to every program that can score more than 50/80 against one of the programs that achieved rank N-1.
I wonder what Rybka's rank would be, and whether she could achieve a higher rank at a longer time control when the same programs are used.
For the test, I suggest using only the 78 programs that the CCRL ranks with more than 200 games, based on the following link:
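The rule above can be sketched as a breadth-first pass over match results. This is only an illustration under assumptions that the post does not state: the `scores` layout, the engine names "A", "B" and "C", and all the numbers are made up, and each program is assumed to keep the first (lowest) rank it qualifies for.

```python
from collections import deque


def assign_ranks(scores, root, threshold=50):
    """Assign ranks by the 'more than 50/80 against a rank N-1 program' rule.

    scores[a][b] is the number of points (out of 80) that program a scored
    against program b. This layout is a hypothetical choice for the sketch.
    Assumption: a program keeps the first (lowest) rank it qualifies for.
    """
    ranks = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for challenger, results in scores.items():
            if challenger in ranks:
                continue  # already ranked at an earlier (lower) level
            if results.get(current, 0) > threshold:
                ranks[challenger] = ranks[current] + 1
                queue.append(challenger)
    return ranks


# Made-up results, for illustration only.
example_scores = {
    "micromax": {},
    "A": {"micromax": 55},
    "B": {"micromax": 40, "A": 52},
    "C": {"micromax": 30, "A": 45, "B": 50},  # exactly 50 is not "more than 50"
}
example_ranks = assign_ranks(example_scores, "micromax")
print(example_ranks)
```

In this made-up data "C" stays unranked, because it never scores strictly more than 50 against a ranked program.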
http://www.computerchess.org.uk/ccrl/4040.live/
Uri
An idea for ranking rating list for chess programs
Moderator: Ras
Uri Blass
- Posts: 10980
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
hgm
- Posts: 28413
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: An idea for ranking rating list for chess programs
It seems to me that what you suggest is not very different from the ordinary determination of ratings. (Many rating lists use their own books rather than those of the engines, and starting from the Silver positions is logically equivalent to having a small book.)
With your 'chain of engines' you could try to make use of relatively weak engines that play particularly well against one particular opponent (for their Elo difference) to make the chain longer. If there is a loop of 3 engines that beat each other 50-30, the chain could even be infinitely long. But what is the point of it?
BTW, it is much more interesting to do this downward, as there are many engines much weaker than uMax, and their ratings are very poorly known. It would be interesting to know what the rating of a random mover is, but there is still an incredible gap between Chad's Chess and engines like Pos or NEG. Also, the common failure of engines in this range to recognize and avoid rep-draws makes quantitative measurement very difficult, as such engines tend to behave differently than the common rating models assume. In particular, you could have a very long chain of engines decisively beating each next-lower member of the chain, but the engine at the bottom would still score a sizable fraction of rep-draws against the top engine. Many engines also suffer from self-inflicted losses, through illegal moves or spontaneous resignations in a totally won position, that occur with a frequency nearly independent of the capability of the opponent.
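The loop problem can be checked mechanically. Below is a rough sketch, not anything used by an actual rating list: it treats "X scored more than 50/80 against Y" as a directed edge from Y to X and looks for a cycle with a depth-first search. If a cycle exists and ranks are read as "longest chain", the rank has no upper bound. The `scores` layout and engine names are hypothetical, and the loop uses 51 points so that it passes the strict "more than 50" test.

```python
def has_qualifying_cycle(scores, threshold=50):
    """Return True if the 'beats by more than threshold' relation has a loop.

    scores[a][b] is the number of points (out of 80) that a scored against b.
    This layout is a hypothetical choice for the sketch.
    """
    # Edge loser -> winner: the winner can extend a chain ending at the loser.
    graph = {program: [] for program in scores}
    for winner, results in scores.items():
        for loser, points in results.items():
            if points > threshold:
                graph.setdefault(loser, []).append(winner)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {program: WHITE for program in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True  # reached a node on the current path: a loop
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[p] == WHITE and visit(p) for p in graph)


# Made-up data: three engines that each beat the next one 51-29.
loop_scores = {"X": {"Y": 51}, "Y": {"Z": 51}, "Z": {"X": 51}}
# Made-up data: a plain chain with no loop.
chain_scores = {"A": {"micromax": 55}, "B": {"A": 52}}

print(has_qualifying_cycle(loop_scores), has_qualifying_cycle(chain_scores))
```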
Ovyron
- Posts: 4562
- Joined: Tue Jul 03, 2007 4:30 am
Re: An idea for ranking rating list for chess programs
What about not counting draws? You could run the test and then discard the games that ended in a draw.
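To show what discarding draws does to a result, here is a small sketch. The function names and the 40/20/20 numbers are made up, and the thread does not say how the 50/80 threshold would be restated once drawn games are removed, so this only compares the two score fractions.

```python
def conventional_score(wins, draws, losses):
    """Usual scoring: a win counts 1 point, a draw counts half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games if games else None


def decisive_score(wins, draws, losses):
    """Score fraction over decisive games only; drawn games are discarded."""
    decisive = wins + losses
    if decisive == 0:
        return None  # every game was drawn, so nothing is left to score
    return wins / decisive


# Made-up 80-game match: 40 wins, 20 draws, 20 losses.
print(conventional_score(40, 20, 20))  # 50 points out of 80
print(decisive_score(40, 20, 20))      # 40 wins out of 60 decisive games
```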
Your beliefs create your reality, so be careful what you wish for.