Fide equivalent "slow blitz" rating list

lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Fide equivalent "slow blitz" rating list

Post by lkaufman »

Here is my rating list for a sampling of engines (single thread), based on 500-game matches at 5' + 3" against various rating levels of Dragon 2.6. The list is anchored by the weakest engine tested, Safrad 2.2.40, set to 1723, its FIDE-equivalent Lichess blitz rating based solely on all 104 blitz games played there against human opponents rated over 1400 in the last six months (actual Lichess performance was 1848, translated to FIDE 1723 by the formula given in another thread). Note that the spread of ratings is considerably smaller than in normal rating lists; I think this is due to the great difference between conventional engines and an NNUE engine searching to a much shorter depth for similar strength. Similarity of opponents seems to spread the ratings apart (unless they are so strong that most games are drawn). Note also that only engines near the human range were tested. In theory these ratings should predict what FIDE rating a human needs to score 50% against the engine at 5' + 3". So my question is: are these ratings generally realistic for that goal, or are they too high or too low?

1. Rybka 2.3.2a 3016
2. Fruit 2.2.1 2832
3. Benjamin 1.0 2784
4. Rebel Century 2739
5. Nebula 2 2726
6. Pawny 0.2 2550
7. Baislicka 1.0 2474
8. Bikjump 2.01 64b 2338
9. Snowy 0.2 2235
10. Pigeon 1.5.1 2130
11. Irina 0.15 1872
12. Sargon 1.01 1819
13. Safrad 2.2.40 1723
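
For readers who want to see the arithmetic behind a list like this, here is a minimal sketch. It assumes the standard logistic Elo curve; the function names and the raw input numbers are hypothetical, and it does not reproduce the Lichess-to-FIDE formula from the other thread.

```python
import math

def elo_diff_from_score(score):
    """Elo difference implied by a match score fraction (0 < score < 1)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def anchor_ratings(raw, anchor_name, anchor_value):
    """Shift a set of relative ratings so that one engine lands on a fixed value."""
    shift = anchor_value - raw[anchor_name]
    return {name: rating + shift for name, rating in raw.items()}

# Hypothetical relative ratings (Safrad as zero point), for illustration only.
raw = {"Safrad 2.2.40": 0.0, "Sargon 1.01": 96.0}
anchored = anchor_ratings(raw, "Safrad 2.2.40", 1723.0)
print(anchored)
# A human rated equal to the engine is expected to score 50%.
print(expected_score(1723.0, 1723.0))
```

Anchoring only shifts the whole list up or down; the spacing between engines comes entirely from the match results.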
Komodo rules!
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: Fide equivalent "slow blitz" rating list

Post by Raphexon »

No hardware normalization?
Chessqueen
Posts: 5685
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: Fide equivalent "slow blitz" rating list

Post by Chessqueen »

Are you also calibrating for the rating range from 1700 down to 800? According to this chart, the majority of players fall between 800 and 1700, and the same goes for FIDE ratings; that range is where players need UCI_Elo the most to improve their Elo progressively: http://www.uschess.org/archive/ratings/ratedist.php
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Fide equivalent "slow blitz" rating list

Post by lkaufman »

Raphexon wrote: Mon Jan 03, 2022 11:05 am No hardware normalization?
Normalization to match what? The hardware described as the one used for Safrad's Lichess games, perhaps? No, I didn't try to match that; I can check later, based on the claim that it runs at about a million nps. I don't think we're far apart. I suppose this could cause an error of 20 Elo or so in the level of the list, as could quite a few other things.
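
To put an error of that size in context, here is a back-of-the-envelope sketch of the uncertainty that comes from sample size alone. It assumes the standard logistic Elo curve and treats every game as a win or a loss (draws would reduce the figure). The 50% score is a hypothetical input, not a number from this thread, and the function name is made up for illustration.

```python
import math

def elo_standard_error(score, games):
    """Approximate standard error, in Elo, of a performance rating.

    Assumes independent win/loss games (no draws), which overstates
    the error somewhat when many games are drawn.
    """
    # Standard error of the score fraction.
    score_se = math.sqrt(score * (1.0 - score) / games)
    # Slope of the logistic Elo curve: d(Elo)/d(score).
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return slope * score_se

# 104 games (the size of the anchor sample) at a hypothetical 50% score.
print(round(elo_standard_error(0.5, 104)))  # about 34
# 500 games (the size of each engine match) at a hypothetical 50% score.
print(round(elo_standard_error(0.5, 500)))  # about 16
```

Under these assumptions, the sampling error of a 104-game anchor is of the same order as the hardware effect discussed here.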
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Fide equivalent "slow blitz" rating list

Post by lkaufman »

Chessqueen wrote: Mon Jan 03, 2022 3:46 pm
Are you also calibrating for the rating range from 1700 down to 800? According to this chart, the majority of players fall between 800 and 1700, and the same goes for FIDE ratings; that range is where players need UCI_Elo the most to improve their Elo progressively: http://www.uschess.org/archive/ratings/ratedist.php
I don't have any reliable CCRL rated engines below 1000 (so 1700 FIDE); do you know of any that are easy to download, UCI, reasonably bug free, and play somewhat sensible looking chess (to an amateur)? I do have a way to extrapolate the ratings of the Elo levels in Dragon down to 800 and below, but it would be more accurate if I had a suitably weak CCRL engine to use in the tests.
Chessqueen
Posts: 5685
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: Fide equivalent "slow blitz" rating list

Post by Chessqueen »

lkaufman wrote: Mon Jan 03, 2022 5:49 pm
I don't have any reliable CCRL rated engines below 1000 (so 1700 FIDE); do you know of any that are easy to download, UCI, reasonably bug free, and play somewhat sensible looking chess (to an amateur)? I do have a way to extrapolate the ratings of the Elo levels in Dragon down to 800 and below, but it would be more accurate if I had a suitably weak CCRL engine to use in the tests.
What about comparing the rating lists from CCRL and CEGT, for instance using Cassandre 0.24, which appears on both? Can you use lower-rated engines from CEGT?

This is from CEGT:
No.   Engine                  Elo    +    -  Games  Score  Av.Op.  Draws
407   Pigeon 1.36 x64        1222   21   21    900  49.5%    1226  30.6%
408   Minimardi 1.3          1172   18   18   1844  37.9%    1306  18.4%
409   Satana 2.0.8 x64       1097   19   19   1400  36.1%    1223  29.9%
410   Sargon 1978 1.00 UCI   1093   22   22    900  34.6%    1218  51.4%
411   T.rex 1.9 beta         1081   21   21   1200  33.0%    1253  22.1%
412   Cassandre 0.24         1016   21   21   1682  30.9%    1197  31.2%
413   Chads Chess 0.15        854   32   32    400  37.4%     961  25.3%
414   Fimbulwinter 5.04 w32   838   25   25   1600  13.4%    1344   8.8%
415   Testina 2.2             834   26   26    770  24.5%    1073  21.3%


And this is from CCRL:
Rank  Engine           Elo     +     −  Score  Av.Op. diff  Draws  Games
688   Cassandre 0.24  1133   +28   −17  57.7%        −59.8  43.6%   1458

Or you could use the engines from Chess.com's special HOLIDAY settings, from 800 to 1600: https://www.chess.com/play/computer
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Fide equivalent "slow blitz" rating list

Post by lkaufman »

The bottom engine on that CEGT list is also on CCRL, but its rating there is nearly as high as that of the Safrad 2.2 engine I'm already using (both very near 1000), so it won't help for rating engines much weaker than FIDE 1700 (apparently about CCRL 1000). I need some reliable, easily downloaded UCI engine below 800 CCRL (which would be about 600 CEGT). Your suggestion of Chess.com bots is pretty funny, since they are Komodo skill levels with some modifications! Anyway, I need engines I can put on my computer and test using CuteChess, not online bots.
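
For anyone wanting to try a similar local test, here is a minimal sketch that builds a `cutechess-cli` command line for one 500-game match at 5' + 3". The engine paths, names, and the Elo level are placeholders. It also assumes that the Dragon build in use exposes the standard `UCI_LimitStrength` and `UCI_Elo` options; this is not necessarily the exact setup used for the list above.

```python
def build_cutechess_command(engine_cmd, engine_name, dragon_cmd, dragon_elo,
                            games=500, pgn_out="match.pgn"):
    """Return an argv list for a two-engine cutechess-cli match.

    All paths and names are supplied by the caller; nothing is executed here.
    """
    return [
        "cutechess-cli",
        # Engine under test, with default settings.
        "-engine", f"cmd={engine_cmd}", f"name={engine_name}", "proto=uci",
        # Reference engine, limited to a chosen Elo level on one thread.
        "-engine", f"cmd={dragon_cmd}", f"name=Dragon-{dragon_elo}", "proto=uci",
        "option.UCI_LimitStrength=true", f"option.UCI_Elo={dragon_elo}",
        "option.Threads=1",
        # 5 minutes plus 3 seconds increment for both sides.
        "-each", "tc=5:00+3",
        # Two games per round so each round is played with colors reversed.
        "-rounds", str(games // 2), "-games", "2",
        "-pgnout", pgn_out,
    ]

cmd = build_cutechess_command("./candidate_engine", "Candidate", "./dragon", 1500)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd)` once the placeholder paths point at real binaries.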
Odd Gunnar Malin
Posts: 310
Joined: Wed Mar 08, 2006 9:59 pm
Location: Norway, Vadsø
Full name: Odd Gunnar Malin

Re: Fide equivalent "slow blitz" rating list

Post by Odd Gunnar Malin »

There is also the SSDF rating list, with pretty stable computers down to 1471 ( https://ssdf.bosjo.net/long.txt ). The list is for a 2h time control, I think, so maybe not so useful. Can't this MESS setup run these as UCI engines? I couldn't find the lowest-rated computer in CB-emu Messui32, but I found the machine itself gathering dust in my basement. It was missing the silver pieces and the power adapter, but I put in some batteries and it worked like a charm. I only need my finger to play blindfold against it.
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Fide equivalent "slow blitz" rating list

Post by lkaufman »

Odd Gunnar Malin wrote: Tue Jan 04, 2022 12:45 am There is also the SSDF rating list, with pretty stable computers down to 1471 ( https://ssdf.bosjo.net/long.txt ). The list is for a 2h time control, I think, so maybe not so useful. Can't this MESS setup run these as UCI engines? I couldn't find the lowest-rated computer in CB-emu Messui32, but I found the machine itself gathering dust in my basement. It was missing the silver pieces and the power adapter, but I put in some batteries and it worked like a charm. I only need my finger to play blindfold against it.
I don't actually use the CCRL, CEGT, or SSDF ratings for this human rating list; I just need a reliable UCI engine that would be roughly 600 to 800 on the CCRL blitz list. There are several such engines on the list, but either I can't find where to download them, or I get "malicious software" warnings, or they crash when I test them in CuteChess. Maybe one or two are reliable and I just haven't found them yet.
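
One quick way to screen a candidate engine before a long match is to check that it completes the UCI handshake. The sketch below is a hypothetical helper, not part of CuteChess: it sends `uci` and `quit`, then looks for the engine name, its options, and `uciok` in the reply. A pass only shows that the engine starts and speaks UCI; it says nothing about stability over 500 games.

```python
import subprocess

def parse_uci_handshake(lines):
    """Extract the engine name, option names, and uciok flag from UCI output."""
    name = None
    options = []
    uciok = False
    for line in lines:
        line = line.strip()
        if line.startswith("id name "):
            name = line[len("id name "):]
        elif line.startswith("option name "):
            rest = line[len("option name "):]
            # The option name runs up to the " type " keyword.
            options.append(rest.split(" type ", 1)[0])
        elif line == "uciok":
            uciok = True
            break
    return {"name": name, "options": options, "uciok": uciok}

def probe_engine(path, timeout=10.0):
    """Run an engine binary briefly and parse its UCI handshake.

    Raises subprocess.TimeoutExpired if the engine does not exit in time.
    """
    result = subprocess.run([path], input="uci\nquit\n", capture_output=True,
                            text=True, timeout=timeout)
    return parse_uci_handshake(result.stdout.splitlines())

# Parsing a made-up reply, so the example runs without any engine installed.
sample = [
    "id name Hypothetical 1.0",
    "id author Nobody",
    "option name Hash type spin default 16 min 1 max 1024",
    "option name UCI_Elo type spin default 1500 min 800 max 2800",
    "uciok",
]
info = parse_uci_handshake(sample)
print(info)
```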
mbabigian
Posts: 220
Joined: Tue Oct 15, 2013 2:34 am
Location: US
Full name: Mike Babigian

Re: Fide equivalent "slow blitz" rating list

Post by mbabigian »

Larry, I would also recommend the CB-emu collection. My wife wanted to learn chess, so I've been coaching her. She now plays in the 1100s, and I use the old Mephisto, Novag, etc. engines for her to play against. You can use them stock or make them run 5x faster than they did originally, etc.

They have estimated ratings from publications you once edited. I have found several that play slightly better than my wife so she gets some wins. All work in UCI (Arena).
Mike