FRC/960 elo 4000?

lkaufman · Post by **lkaufman** » Sat Oct 02, 2021 6:47 am

Currently Komodo Dragon 2.5 tops the CCRL Blitz FRC (960) rating list at 3872, but that rating is for just one thread. To see where it would be on a good modern laptop using all the cores, I ran a long (2200-game) match of Dragon 2.5 on eight threads against the same engine on one thread at the CCRL blitz time control of 2' + 1". The eight-thread Dragon won by 117 elo, which is not surprising, since 100 elo gains are common from 1 to 8 threads in standard chess, and FRC has fewer draws and hence larger rating gaps. 3872 plus 117 equals 3989. Going from eight to ten threads should be worth about the eleven elo needed to reach 4000. So 4000 elo is a reasonable estimate for the CCRL Blitz FRC rating of Dragon 2.5 on ten threads on a ten-core i7 or comparable computer. To be fair, the latest Stockfish would probably also reach or exceed 4000 under the same conditions.
To anticipate likely objections, first let's consider whether the list is inflated relative to humans. Well, Shredder 10 is rated 2771 on that list. Shredder 10 is newer and clearly stronger than Fritz 7, 8, and 9, which drew matches with Kasparov and Kramnik and had other 2800-level performances. Those matches were run on four threads, but four threads of nearly twenty-year-old hardware are no better than one thread of the modern i7 used by CCRL. Also, those were standard chess; FRC/960 is clearly harder for a human to play well than for an engine, so I have no doubt that Shredder 10 on a modern i7, one thread, would win a classical time limit FRC match against any human, even Magnus Carlsen. But this is a blitz list, which greatly favors engines. Can anyone doubt that an FRC blitz match between Shredder 10 on one thread and even the best human FRC player (Carlsen or Wesley So perhaps) would result in an overwhelming score for Shredder? So the CCRL FRC blitz list is clearly way too LOW in human terms!
A more serious objection is that engine vs. engine results overstate rating differences. This is true, but should be less true of FRC than of normal chess, because in normal chess memorized openings tend to lead to draws and reduce rating differences, which the engine lists avoid by forcing suboptimal openings to be played (with colors reversed). With FRC, humans and engines are treated the same. While there is still probably some exaggeration of rating differences, the use of BayesElo by CCRL offsets this significantly, and as already pointed out the list "starts" at too low a rating in human terms. So I think it is fair to say that a 4000 human-equivalent rating has now been achieved in FRC blitz on reasonably affordable hardware.
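For readers who want to check the arithmetic in the post above, here is a minimal Python sketch. It uses the standard logistic Elo expectation formula; the variable names are my own, and the eleven elo gain from eight to ten threads is the post's estimate, not a measured result.

```python
def expected_score(elo_diff: float) -> float:
    """Standard logistic Elo expectation for a player rated elo_diff higher."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# Figures taken from the post.
single_thread_rating = 3872   # Dragon 2.5, CCRL Blitz FRC, one thread
gain_1_to_8_threads = 117     # measured in the 2200-game match
gain_8_to_10_threads = 11     # ASSUMPTION: the post's estimate, not measured

estimate_8_threads = single_thread_rating + gain_1_to_8_threads
estimate_10_threads = estimate_8_threads + gain_8_to_10_threads

print(estimate_8_threads, estimate_10_threads)

# Score fraction that a 117 elo edge corresponds to under the Elo formula.
print(round(expected_score(gain_1_to_8_threads), 3))
```

Under this formula a 117 elo edge corresponds to scoring roughly 66% of the points, which gives a feel for how lopsided the eight-thread vs. one-thread match was.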

Modern Times · Post by **Modern Times** » Sat Oct 02, 2021 11:00 am

The FRC list still uses a repeating time control; we never switched that over. What I'm thinking I might do next year is start an additional new 8CPU or 10CPU FRC list with a slightly longer time control, but restricted to the Top 10 or similar due to the extra resources required. That has to wait until early January, which is when I'll have access to my multi-core machines again.

Modern Times · Post by **Modern Times** » Sat Oct 02, 2021 11:36 am

lkaufman wrote: ↑Sat Oct 02, 2021 6:47 am ..... A more serious objection is that engine vs. engine results overstate rating differences. ..........
While there is still probably some exaggeration of rating differences, the use of BayesElo by CCRL offsets this significantly,

So using bayeselo is a good thing then

Using Ordo on that list would probably already show Dragon and SF over 4000 Elo with say Shredder 10 as the anchor.

Modern Times · Post by **Modern Times** » Sat Oct 02, 2021 11:58 am

Using Shredder 10 as the anchor, running the database through Ordo:

 # PLAYER                   :  RATING  POINTS  PLAYED   (%)
   Dragon 2.5 by Komodo     :  3976.7  1470.5    1918    77
   Stockfish 14             :  3975.1  1974.0    2588    76
   Shredder 10              :  2771.0  1133.5    1900    60

lkaufman · Post by **lkaufman** » Sat Oct 02, 2021 5:59 pm

Modern Times wrote: ↑Sat Oct 02, 2021 11:36 am
lkaufman wrote: ↑Sat Oct 02, 2021 6:47 am ..... A more serious objection is that engine vs. engine results overstate rating differences. ..........
While there is still probably some exaggeration of rating differences, the use of BayesElo by CCRL offsets this significantly,
So using bayeselo is a good thing then

Using Ordo on that list would probably already show Dragon and SF over 4000 Elo with say Shredder 10 as the anchor.

If the goal is to have engine vs. engine rating differences correspond to the actual Elo system, so that for example a 76% score will show up as a 200 elo difference as the Elo tables say, then Ordo (as used by CEGT and FastGM) is correct; BayesElo gives smaller differences. But it happens that giving smaller differences makes for better correlation with results of engines vs. humans. So the CCRL/BayesElo approach gives numbers that are more closely aligned with human ratings. With CEGT or FastGM roughly the same thing can be done by moving the ratings 10% of the way towards the anchor engine. Another point is that (if I understand everything correctly) BayesElo weights draws more heavily than normal (Ordo) Elo, which I consider an undesirable property. Anyway, there is a case for both approaches; it pretty much comes down to whether you want to better simulate what ratings would be against humans (BayesElo) or whether you want ratings to match the results of engine matches rated by the Elo system (Ordo). As a Komodo partner, I don't like the fact that BayesElo makes our Elo gains look smaller than the actual results suggest under the Elo system, but in human terms there is a justification for it.
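The two numeric claims in this post (76% corresponds to about 200 elo, and the "move 10% towards the anchor" adjustment) can be sketched in Python as follows. The function names are hypothetical. Applying the 10% shrink to the Ordo figure posted earlier in the thread is only an illustration on my part; the post suggests the adjustment for CEGT and FastGM lists and describes it as rough.

```python
import math

def elo_diff_from_score(score: float) -> float:
    """Invert the logistic Elo formula: score fraction -> rating difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def shrink_towards_anchor(rating: float, anchor: float, fraction: float = 0.10) -> float:
    """Move a rating the given fraction of the way towards the anchor rating.

    ASSUMPTION: a literal reading of "moving the ratings 10% of the way
    towards the anchor engine"; the post calls this only a rough equivalent.
    """
    return rating - fraction * (rating - anchor)

# A 76% score maps to roughly 200 elo under the Elo formula.
print(round(elo_diff_from_score(0.76), 1))

# Illustration with the Ordo numbers from earlier in the thread
# (Dragon 2.5 at 3976.7, Shredder 10 anchored at 2771.0).
print(round(shrink_towards_anchor(3976.7, 2771.0), 1))
```

With those inputs the shrunken figure comes out around 3856, in the same general range as the 3872 that the BayesElo list shows for Dragon 2.5, though this is a rough check rather than a derivation of how BayesElo works.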

Uri Blass · Post by **Uri Blass** » Sat Oct 02, 2021 6:31 pm

Modern Times wrote: ↑Sat Oct 02, 2021 11:00 am The FRC list is still repeating time control, we never switched that over. What I'm thinking I might do next year is start an additional new 8CPU or 10CPU FRC list with a slightly longer time control, but restricted to Top 10 or similar due to the extra resources required. That has to wait until early January next year which is when I'll have access to my multi-core machines again.

I would prefer to have all entries in the same list, with a few matches between engines at different time controls to link the lists. For example, something like Ethereal on 10 CPU at 40 moves/10 minutes vs. Stockfish on 1 CPU at 40 moves/2 minutes, assuming you want to add 10 CPU at 40/10 for the top engines when you already have 1 CPU at 40/2.
