Very useful indeed!
One correction: there is no "Jazz 4.44", it is the same version as "Jazz 444". I never bothered to give Jazz (or Sjaak) proper version numbers; the number associated with each version is just the SVN revision number corresponding to that release. This is confusing for many, something that hadn't occurred to me.
First of all, thanks for this nice work! I think it is a better overall rating list than any single list at this time. It tells me that we need 35 Elo over Komodo 4 to be number one on your single-core list. That won't be easy, but at least it is a realistic target. Maybe MP will show a different picture.
My only complaint is that by mixing the IPON blitz ratings with the other slower rating lists, you can get distortion, and also the mere fact that IPON can play more games due to the faster time limit gives the IPON ratings too much weight. I wonder if you would consider publishing an alternate list that had a minimum time limit of 40/20' or the equivalent? Then people could decide for themselves if they wanted the larger samples that including blitz games gives, or wanted the purity of only longer time control games. Or they could consider both lists and take the average.
I would second Jon's idea. For example, there is a "loop" in the list with 2 games. One simple way to cull would be requiring 100 games or something similar to make sure it is not just noise...
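A minimal sketch of that kind of cull, assuming the list keeps a game count per engine. The function name, the dictionary layout and the "Engine A" entry are made up for illustration; only the 100-game threshold and the 2-game "loop" entry come from the post above.

```python
def cull_small_samples(games_played, min_games=100):
    """Keep only engines with at least min_games games.

    games_played maps an engine name to its number of games (hypothetical layout).
    """
    return {name: n for name, n in games_played.items() if n >= min_games}


# Hypothetical counts: "Engine A" is invented, "loop" has the 2 games mentioned above.
counts = {"Engine A": 1200, "loop": 2}
kept = cull_small_samples(counts)
```

With these counts only "Engine A" survives, and the threshold can be changed through `min_games`.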
I have another proposal, to address what I consider the most significant potential bias of the list. I'm referring to the fact that the list automatically underweights longer time limit games simply because fewer can be played. There are "universal" rating lists for human games (in Germany at least) based on the notion that some fairly slow time limit is considered standard, and that all faster games can still be rated, but only with a weighting in proportion to the time limit of "standard" games. So for example if game in one hour is standard, then game in five minutes can be rated, but each such game gets only 1/12 the weight of a standard game. In this manner, a tournament gets the same weight for a given amount of total playing time, regardless of whether they play 5 slow games or 60 blitz games in the same time. This seems fair to me, and counteracts the overweighting of blitz.
I think you could do the same here. Of course some judgments would need to be made, such as equating 40/20' with 20' + 10" increment, for example. Also unequal hardware must be allowed for, and testing with ponder on should count something like 1.3 times as much as with ponder off, based on a 30% ponder hit rate. So the person in charge of the list must make some tough calls, but they only need be made at the beginning, unless a new list is added or a list changes its conditions. Once the weighting is decided, it is no more work than the present system.
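To make the proposal concrete, here is a rough sketch of the weighting. It assumes game in one hour as the "standard" (the example given above), a 1.3 factor for ponder on, and a single speed factor for hardware. The names `game_weight`, `STANDARD_MINUTES`, `PONDER_FACTOR` and `hardware_speedup` are made up for illustration, and leaving the weight uncapped for games slower than standard is also an assumption.

```python
STANDARD_MINUTES = 60.0  # assumed "standard" time per game, from the game-in-one-hour example
PONDER_FACTOR = 1.3      # assumed, based on the 30% ponder hit rate suggested above


def game_weight(minutes, ponder=False, hardware_speedup=1.0):
    """Weight of one game relative to a standard game.

    minutes: time limit of the game, already converted to an equivalent
             game-in-N-minutes figure (that conversion is a judgment call).
    hardware_speedup: hypothetical speed factor relative to reference hardware.
    """
    effective = minutes * hardware_speedup
    if ponder:
        effective *= PONDER_FACTOR
    return effective / STANDARD_MINUTES


# 60 blitz games at 5 minutes carry the same total weight as 5 standard games.
blitz_total = 60 * game_weight(5)
slow_total = 5 * game_weight(60)
```

Each game's result would then enter the rating calculation multiplied by its weight, so a list contributes in proportion to total playing time rather than number of games.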
As a matter of disclosure, I should say that since I believe that the overweighting of blitz hurts Komodo relative to Houdini and Critter, I would benefit from anything that corrects for it. If, as some claim, there is no difference in scaling among these programs, then there should be no objection from them. But regardless, I think it is the right way to do it.
A lot of work, and no doubt done for good reasons, but I started my list years ago and made it public later exactly because I was unhappy with the established lists. Now you throw everything in a pot and stir it. I have some doubt that the outcome is any good.
Even though I know that I can't stop you from doing it, I would rather not have the IPON games in that list. It simply makes no sense to mix all those different conditions.
Thx
Ingo
No harm in what he's doing, Ingo, and a lot of people seem to appreciate it.
IPON is still IPON regardless, so nothing to worry about here really.
We wouldn't like it if certain engine authors objected to particular rating lists including their engine. Just an analogy.