ELO progression by year, period 2006-2021

Jouni
Posts: 3462
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: ELO progression by year, period 2006-2021

Post by Jouni »

NCM testing shows +135 progression for 2020!
Jouni
lkaufman
Posts: 6078
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: ELO progression by year, period 2006-2021

Post by lkaufman »

CMCanavessi wrote: Fri Oct 15, 2021 1:43 pm It is very weird that we don't see the NNUE jump in elo. It's almost linear in that progression.
Probably due to using BayesElo. With Ordo, the gains should match those observed in matches. As I understand it, BayesElo effectively double-counts draws, which is bad for NNUE engines, since they tend to have higher draw percentages against moderately lower-rated engines. Overweighting draws seems like the opposite of what we should do in chess.
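
For anyone who wants to check what "gains observed in matches" means in numbers, here is a minimal sketch. It assumes the standard logistic Elo curve and counts a draw as half a point; it is not code from Ordo or BayesElo, and the win/draw/loss counts are made up for illustration.

```python
import math

def elo_diff_from_score(wins, draws, losses):
    """Elo difference implied by a match result on the logistic Elo curve.

    A draw counts as half a point, so only the overall score fraction matters.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("a 0% or 100% score has no finite Elo difference")
    return -400.0 * math.log10(1.0 / score - 1.0)

# Hypothetical 100-game match with many draws: 30 wins, 60 draws, 10 losses.
print(round(elo_diff_from_score(30, 60, 10), 1))
```

The example match scores 60%, which comes out at about +70 Elo no matter how many of those points came from draws.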
Komodo rules!
Rebel
Posts: 7231
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: ELO progression by year, period 2006-2021

Post by Rebel »

CMCanavessi wrote: Fri Oct 15, 2021 1:43 pm It is very weird that we don't see the NNUE jump in elo. It's almost linear in that progression.
CCRL 40/15
SF12 - 3476
SF11 - 3433
Only +43

CEGT 40/20
SF12 - 3530
SF11 - 3435
+95

GRL 40/2
SF12 - 3622
SF11 - 3506
+116

This seems to confirm Larry's point about BayesElo vs Ordo: CCRL uses BayesElo, while CEGT and the GRL use Ordo.
90% of coding is debugging, the other 10% is writing bugs.
Rebel
Posts: 7231
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: ELO progression by year, period 2006-2021

Post by Rebel »

CEGT numbers 2018-2021

Year  ELO-SP Gain Engine           Year  ELO-MP4 Gain Engine             Year  ELO-MP8 Gain Engine
2018   3359  ---- Stockfish 9      2018   3468   ---- Stockfish 10       2018   3464   ----  Stockfish 9
2019   3405   +46 Stockfish 10     2019   3468   ---- Stockfish 10       2019   3495   +31   Stockfish 10
2020   3530  +125 Stockfish 12     2020   3573   +105 Stockfish 12       2020   3495   ----  Stockfish 10
2021   3585   +55 Stockfish 14     2021   3584    +11 KomodoDragon 2.5   2021   3495   ----  Stockfish 10
The NNUE progress is clearly visible.
90% of coding is debugging, the other 10% is writing bugs.
Colin-G
Posts: 191
Joined: Mon Oct 31, 2016 6:30 pm
Location: England

Re: ELO progression by year, period 2006-2021

Post by Colin-G »

Some of you may be interested to see how the ELO list for my engine-vs-engine matches has evolved over the last 15 years.
I now have well over 40,000 games played on my various computers since 2000. All were played at the equivalent of about 40 moves in 4 minutes on a single core, with nearly 350 different engines/versions.
The figures were computed using ELOstat.exe, which was all there was when I started 20 years ago, and I have stuck with it for consistency. I find it a lot easier to use than BayesElo. The ratings were computed at the end of each year in the list.
Although engines obviously get stronger over time, the maximum ELO rating does not seem to change much.
All results use the same seed value of 2400, which was the default value when I started using it.
I do play the newer engines against a lot of the old ones as well as the recent ones. I believe a wide spread of opponents gives a more accurate result.
Although the top ELO rating does not seem to change, the ratings of the older engines get driven down the list over time.
For example, comparing the 2006 results with my current ratings:
Spike 1.2 now has a rating of 2435, down 312
Chessmaster9000 now has a rating of 2347, down 348
Fritz 7 now has a rating of 2453, down 207

Image
lkaufman
Posts: 6078
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: ELO progression by year, period 2006-2021

Post by lkaufman »

Colin-G wrote: Sat Oct 16, 2021 4:37 pm Some of you may be interested to see how the ELO list for my engine-vs-engine matches has evolved over the last 15 years.
I now have well over 40,000 games played on my various computers since 2000. All were played at the equivalent of about 40 moves in 4 minutes on a single core, with nearly 350 different engines/versions.
The figures were computed using ELOstat.exe, which was all there was when I started 20 years ago, and I have stuck with it for consistency. I find it a lot easier to use than BayesElo. The ratings were computed at the end of each year in the list.
Although engines obviously get stronger over time, the maximum ELO rating does not seem to change much.
All results use the same seed value of 2400, which was the default value when I started using it.
I do play the newer engines against a lot of the old ones as well as the recent ones. I believe a wide spread of opponents gives a more accurate result.
Although the top ELO rating does not seem to change, the ratings of the older engines get driven down the list over time.
For example, comparing the 2006 results with my current ratings:
Spike 1.2 now has a rating of 2435, down 312
Chessmaster9000 now has a rating of 2347, down 348
Fritz 7 now has a rating of 2453, down 207

Image
As I recall, ELOstat averages the ratings of the opposition, which is unsound when there is a large spread in their ratings. If you are playing today's top engines against ancient ones, using ELOstat will give completely misleading ratings. Ordo uses the same Elo formula as ELOstat but applies it correctly, without this averaging of ratings. BayesElo is just a different rating system. I strongly recommend Ordo.
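
To illustrate why averaging the opposition goes wrong with a wide rating spread, here is a rough sketch. It is not the actual code of ELOstat or Ordo. It assumes the standard logistic Elo formula and holds the opponents' ratings fixed, whereas a real rating tool fits all engines at once. The function names and the example numbers are made up.

```python
import math

def expected_score(rating, opponent):
    """Expected score of `rating` against `opponent` on the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))

def rating_from_average_opponent(opponents, score_fraction):
    """Performance rating computed against the AVERAGE opponent rating."""
    avg = sum(opponents) / len(opponents)
    return avg - 400.0 * math.log10(1.0 / score_fraction - 1.0)

def rating_from_individual_opponents(opponents, score_fraction, lo=0.0, hi=5000.0):
    """Rating whose summed expected score over each opponent equals the actual score.

    Solved by bisection, since the summed expected score rises with the rating.
    """
    target = score_fraction * len(opponents)
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(expected_score(mid, o) for o in opponents) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical engine: one game each against a 2000 and a 3400 opponent,
# winning the first and drawing the second (75% overall).
opponents = [2000.0, 3400.0]
print(round(rating_from_average_opponent(opponents, 0.75)))
print(round(rating_from_individual_opponents(opponents, 0.75)))
```

With these made-up numbers the averaged method gives roughly 2891, while matching the expected score game by game gives roughly 3400, which is what you would expect from an engine that draws a 3400 opponent.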
Komodo rules!
Colin-G
Posts: 191
Joined: Mon Oct 31, 2016 6:30 pm
Location: England

Re: ELO progression by year, period 2006-2021

Post by Colin-G »

Thanks for the info. I have just downloaded Ordo and will try it later.
I had not heard of it previously.
Colin-G
Posts: 191
Joined: Mon Oct 31, 2016 6:30 pm
Location: England

Re: ELO progression by year, period 2006-2021

Post by Colin-G »

I tried the Ordo executable on my main PGN file, which has nearly 43,000 games and 350 different engines/versions.
The rating list produced had the engines placed in a very similar order to that of my list made using ELOstat.
The same seed value of 2400 was used by both programs.
The ELO ratings from Ordo were between 300 and 500 points higher than the ELOstat ratings for the top dozen engines, and between 0 and 100 points lower for the bottom dozen engines.
The engines with an ELO rating close to my seed value of 2400 in my ELOstat list were all about 150 points higher in the Ordo list.
The engines in my Ordo list had ratings between 150 and 200 points higher than those in the current CCRL Blitz list, which I assume is because my seed value is higher than the one CCRL uses. I checked a mix of higher, middle and lower rated engines.

I will continue to use ELOstat as well as Ordo, since ELOstat creates some extra useful files when it runs.
For example, it produces a large text file which shows every engine that each engine has ever played against, with results.
I have written a script in Tcl/Tk that interrogates this file and, with a couple of button clicks, instantly displays all of the engines I currently use that have not yet played a particular engine.
This helps me decide which engine matches to play next. It can also display ratings for different groups of engines, and ratings for just the different versions of the same engine. I have added buttons to it over the years, the latest feature being a list of engines that have all played fewer than a certain number of games.
The ELOstat Cluster file is also useful: it helps me spot typos when an engine gets given 2 different names between my many computers in different GUIs.
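
The "who has not played whom" lookup described above could be sketched roughly as follows. This is in Python rather than Tcl/Tk and is not the actual script. It does not parse the ELOstat output file, whose format is not shown here; instead it assumes the pairings have already been extracted into a list of (engine, opponent) tuples. The pairings in the example are invented, not real results.

```python
from collections import defaultdict

def build_opponent_map(pairings):
    """Map each engine name to the set of opponents it has already played."""
    opponents = defaultdict(set)
    for a, b in pairings:
        opponents[a].add(b)
        opponents[b].add(a)
    return opponents

def unplayed_opponents(pairings, engine, active_engines):
    """Active engines that `engine` has not yet met, sorted by name."""
    played = build_opponent_map(pairings)[engine]
    return sorted(e for e in active_engines if e != engine and e not in played)

# Hypothetical pairings, one tuple per pair of engines that has already met.
pairings = [
    ("Spike 1.2", "Fritz 7"),
    ("Spike 1.2", "Chessmaster9000"),
    ("Fritz 7", "Chessmaster9000"),
]
active = ["Spike 1.2", "Fritz 7", "Chessmaster9000", "Stockfish 14"]
print(unplayed_opponents(pairings, "Spike 1.2", active))
```

An engine that appears in the active list but in no pairing yet simply gets an empty "played" set, so every other active engine is reported for it.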