The grand unified rating list

pedrox · Post by **pedrox** » Fri Feb 03, 2012 10:23 pm

2412 DanaSah 2.26/16                        2406   41   41   188   48%  2412   38%
2454 DanaSah 2.26                           2394   61   61   100   64%  2286   22%

4032 DanaSah 1.3.4                          1963  271  271     5   40%  2043    0%
4121 DanaSah 1.34                           1918  130  130    31   68%  1724    6%

You can combine the data for each pair and name the results DanaSah 2.26 and DanaSah 1.3.4.

It's fun: the list contains versions that I did not remember.

On the other hand, I find that my best version with enough games may have been a private one, DanaSah 4.07. I will have to review the source code.

Pedro

Evert · Post by **Evert** » Fri Feb 03, 2012 11:19 pm

Very useful indeed!
One correction: there is no "Jazz 4.44"; it is the same version as "Jazz 444". I never bothered to give Jazz (or Sjaak) proper version numbers. The number associated with each version is just the SVN revision number corresponding to that release. This is confusing for many, something that hadn't occurred to me.

lkaufman · Post by **lkaufman** » Tue Feb 07, 2012 3:51 am

First of all, thanks for this nice work! I think it is a better overall rating list than any single list at this time. It tells me that we need 35 Elo over Komodo 4 to be number one on your single-core list. That won't be easy, but at least it is a realistic target. Maybe MP will show a different picture.
My only complaint is that by mixing the IPON blitz ratings with the other slower rating lists, you can get distortion, and also the mere fact that IPON can play more games due to the faster time limit gives the IPON ratings too much weight. I wonder if you would consider publishing an alternate list that had a minimum time limit of 40/20' or the equivalent? Then people could decide for themselves if they wanted the larger samples that including blitz games gives, or wanted the purity of only longer time control games. Or they could consider both lists and take the average.

Mark Mason · Post by **Mark Mason** » Tue Feb 07, 2012 9:13 am

Great listing and invaluable as a single-source lookup. Thank you!

JVMerlino · Post by **JVMerlino** » Fri Feb 10, 2012 12:40 am

Excellent work, Vincent!

You can combine:

Myrddin 0.84g 64-bit
Myrddin 0.85 64-bit

and call them "Myrddin 0.85 64-bit", as they are the same version.

Thanks very much!

jm

jdart · Post by **jdart** » Fri Feb 10, 2012 3:36 am

I think you should just remove engines that have >100 Elo error bars. There is no sense in ranking these.

bob · Post by **bob** » Fri Feb 10, 2012 3:58 am

I would second Jon's idea. For example, there is a "loop" in the list with 2 games. One simple way to cull would be requiring 100 games or something similar to make sure it is not just noise...
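A minimal sketch of the two culling rules suggested above (error bars no larger than 100 Elo, and at least 100 games). The rows are the DanaSah entries quoted earlier in the thread; the reading of the columns as rating, plus/minus error bars and game count is an assumption, and `Entry` and `cull` are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    name: str
    elo: int
    plus: int    # upper error bar (assumed meaning of the "+" column)
    minus: int   # lower error bar (assumed meaning of the "-" column)
    games: int


def cull(entries: List[Entry], max_error: int = 100, min_games: int = 100) -> List[Entry]:
    """Keep entries with error bars <= max_error and at least min_games games."""
    return [
        e for e in entries
        if max(e.plus, e.minus) <= max_error and e.games >= min_games
    ]


# Rows copied from the list excerpt quoted earlier in the thread.
entries = [
    Entry("DanaSah 2.26/16", 2406, 41, 41, 188),
    Entry("DanaSah 2.26", 2394, 61, 61, 100),
    Entry("DanaSah 1.3.4", 1963, 271, 271, 5),
    Entry("DanaSah 1.34", 1918, 130, 130, 31),
]

kept = cull(entries)
for e in kept:
    print(e.name, e.elo, e.games)
```

With both thresholds set to 100, only the two DanaSah 2.26 entries survive; the 5-game and 31-game entries are dropped as noise.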

lkaufman · Post by **lkaufman** » Fri Feb 10, 2012 4:18 am

I have another proposal, to address what I consider the most significant potential bias of the list. I'm referring to the fact that the list automatically underweights longer time limit games simply because fewer can be played. There are "universal" rating lists for human games (in Germany at least) based on the notion that some fairly slow time limit is considered standard, and that all faster games can still be rated, but only with a weighting in proportion to the time limit of "standard" games. So for example if game in one hour is standard, then game in five minutes can be rated, but each such game gets only 1/12 the weight of a standard game. In this manner, a tournament gets the same weight for a given amount of total playing time, regardless of whether they play 5 slow games or 60 blitz games in the same time. This seems fair to me, and counteracts the overweighting of blitz.

I think you could do the same here. Of course some judgments would need to be made, such as equating 40/20' with 20' + 10" increment, for example. Also unequal hardware must be allowed for, and testing with ponder on should count something like 1.3 times as much as with ponder off, based on a 30% ponder hit rate. So the person in charge of the list must make some tough calls, but they only need be made at the beginning, unless a new list is added or a list changes its conditions. Once the weighting is decided, it is no more work than the present system.
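A rough sketch of the weighting described above, assuming game in one hour as the "standard" time control and the 1.3 ponder factor mentioned in the post. The function name `game_weight` and its parameters are hypothetical; hardware differences and the equating of time controls such as 40/20' with increment controls are left out, since those are the judgment calls the list maintainer would have to make.

```python
import math


def game_weight(minutes_per_side: float,
                standard_minutes: float = 60.0,
                ponder: bool = False,
                ponder_factor: float = 1.3) -> float:
    """Weight of one game relative to a standard-time-control game.

    The weight is proportional to the time control, so a 5-minute game
    counts 5/60 = 1/12 of a 60-minute game. Games played with ponder on
    are scaled by ponder_factor (1.3 is the figure assumed in the post).
    """
    if minutes_per_side <= 0 or standard_minutes <= 0:
        raise ValueError("time controls must be positive")
    weight = minutes_per_side / standard_minutes
    return weight * ponder_factor if ponder else weight


# 60 blitz games of 5 minutes carry the same total weight as 5 standard games.
blitz_total = sum(game_weight(5) for _ in range(60))
standard_total = sum(game_weight(60) for _ in range(5))
print(math.isclose(blitz_total, standard_total))
```

The sketch does not cap the weight for games slower than the standard; whether such games should count more than one standard game is not specified in the proposal.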

As a matter of disclosure, I should say that since I believe that the overweighting of blitz hurts Komodo relative to Houdini and Critter, I would benefit from anything that corrects for it. If as some claim there is no difference in scaling among these programs, then there should be no objection from them. But regardless, I think it is the right way to do it.

IWB · Post by **IWB** » Fri Feb 10, 2012 7:45 am

Hello Vincent,

Just my 2 cents about this attempt.

A lot of work, and no doubt done for good reasons, but I started my list years ago and made it public later exactly because I was unhappy with the established lists. Now you throw everything in a pot and stir it. I have some doubt that the outcome is any good.
Even though I know that I can't stop you from doing it, I would rather not have the IPON games in that list. It simply makes no sense to mix ALL those different conditions.

Thx
Ingo

Graham Banks · Post by **Graham Banks** » Fri Feb 10, 2012 8:00 am

IWB wrote: Hello Vincent,

Just my 2 cents about this attempt.

A lot of work, and no doubt done for good reasons, but I started my list years ago and made it public later exactly because I was unhappy with the established lists. Now you throw everything in a pot and stir it. I have some doubt that the outcome is any good.
Even though I know that I can't stop you from doing it, I would rather not have the IPON games in that list. It simply makes no sense to mix ALL those different conditions.

Thx
Ingo

No harm in what he's doing, Ingo, and a lot of people seem to appreciate it.
IPON is still IPON regardless, so there is nothing to worry about here really.
We wouldn't like it if certain engine authors objected to particular rating lists including their engine. Just an analogy.
