Fide equivalent "slow blitz" rating list

Chessqueen
Posts: 5685
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: Fide equivalent "slow blitz" rating list

Post by Chessqueen »

lkaufman wrote: Mon Jan 03, 2022 5:49 pm
Chessqueen wrote: Mon Jan 03, 2022 3:46 pm
lkaufman wrote: Mon Jan 03, 2022 7:02 am Here is my rating list for a sampling of engines (single thread) based on 500-game matches at 5' + 3" against various rating levels of Dragon 2.6, with the list anchored by the weakest engine tested, Safrad 2.2.40, set to 1723, its FIDE-equivalent Lichess blitz rating based solely on all 104 blitz games played there against human opponents over 1400 in the last six months (actual Lichess performance was 1848, translated to FIDE 1723 by the formula given in another thread). Note that the spread of ratings is considerably smaller than in normal rating lists; I think this is due to the great difference between conventional engines and an NNUE engine searching to a much shorter depth for similar strength. Similarity of opponents seems to spread the ratings apart (unless they are so strong that most games are drawn). In theory these ratings should predict what FIDE rating a human needs to score 50% against the engine at 5' + 3". So my question is: are these ratings generally realistic for that goal, or are they too high or too low? Note that only engines near the human range were tested.

1. Rybka 2.3.2a 3016
2. Fruit 2.2.1 2832
3. Benjamin 1.0 2784
4. Rebel Century 2739
5. Nebula 2 2726
6. Pawny 0.2 2550
7. Baislicka 1.0 2474
8. Bikjump 2.01 64b 2338
9. Snowy 0.2 2235
10. Pigeon 1.5.1 2130
11. Irina 0.15 1872
12. Sargon 1.01 1819
13. Safrad 2.2.40 1723
Are you also calibrating for the rating range from 1700 down to 800? According to this chart, the majority of players fall between 800 and 1700, and the same goes for FIDE ratings; that range is where players need UCI_ELO the most to improve their Elo progressively: http://www.uschess.org/archive/ratings/ratedist.php
I don't have any reliable CCRL-rated engines below 1000 (so 1700 FIDE); do you know of any that are easy to download, UCI, reasonably bug-free, and play somewhat sensible-looking chess (to an amateur)? I do have a way to extrapolate the ratings of the Elo levels in Dragon down to 800 and below, but it would be more accurate if I had a suitably weak CCRL engine to use in the tests.
My friend, rated around 1645, played against DASH (rated 1600) on chess.com and emailed me this game:

[pgn][Event "Chess.com DASH"]
[Date "2022.01.02"]
[Round "?"]
[White "Jose"]
[Black "DASH"]
[Result "1-0"]
[BlackElo "1600"]
[ECO "D00"]
[Opening "Queen's Pawn"]
[Time "22:38:01"]
[Variation "2.e3 Nf6"]
[WhiteElo "1645"]
[TimeControl "300+3"]
[Termination "normal"]
[WhiteType "human"]
[BlackType "program"]

1. d4 d5 2. e3 Nf6 3. c4 e6 4. c5 Nbd7 5. b4 Be7 6. a3 O-O 7. Nc3 e5
8. Be2 a5 9. Qa4 axb4 10. Qxa8 bxc3 11. c6 bxc6 12. Qxc6 Nb8 13. Qxc3 Ne4
14. Qb2 Bh4 15. g3 exd4 16. gxh4 Qxh4 17. Bd3 Nc5 18. Bb5 Qe4 19. f3 Qf5
20. Ra2 c6 21. Qc2 cxb5 22. Qxc5 Na6 23. Qxd4 Qb1 24. Qb2 Qf5 25. Qc2 Qh5
26. Ne2 Nb4 27. axb4 Qxf3 28. Rg1 Bg4 29. Nd4 Qh3 30. Qg2 h5 31. Qxh3 Bxh3
32. Rg3 Bg4 33. Ra7 Rc8 34. Bd2 Re8 35. Bc3 Rc8 36. Nxb5 d4 37. Bxd4 Rc2
38. h3 Rc1+ 39. Kd2 Rd1+ 40. Kc2 Re1 41. hxg4 hxg4 42. Rxg4 Rh1 43. Ra8+
Kh7 44. Rxg7+ Kh6 45. Rh8# 1-0[/pgn]
Ferdy
Posts: 4851
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Fide equivalent "slow blitz" rating list

Post by Ferdy »

lkaufman wrote: Mon Jan 03, 2022 7:02 am Here is my rating list for a sampling of engines (single thread) based on 500-game matches at 5' + 3" against various rating levels of Dragon 2.6, with the list anchored by the weakest engine tested, Safrad 2.2.40, set to 1723, its FIDE-equivalent Lichess blitz rating based solely on all 104 blitz games played there against human opponents over 1400 in the last six months (actual Lichess performance was 1848, translated to FIDE 1723 by the formula given in another thread). Note that the spread of ratings is considerably smaller than in normal rating lists; I think this is due to the great difference between conventional engines and an NNUE engine searching to a much shorter depth for similar strength. Similarity of opponents seems to spread the ratings apart (unless they are so strong that most games are drawn). In theory these ratings should predict what FIDE rating a human needs to score 50% against the engine at 5' + 3". So my question is: are these ratings generally realistic for that goal, or are they too high or too low? Note that only engines near the human range were tested.

1. Rybka 2.3.2a 3016
2. Fruit 2.2.1 2832
3. Benjamin 1.0 2784
4. Rebel Century 2739
5. Nebula 2 2726
6. Pawny 0.2 2550
7. Baislicka 1.0 2474
8. Bikjump 2.01 64b 2338
9. Snowy 0.2 2235
10. Pigeon 1.5.1 2130
11. Irina 0.15 1872
12. Sargon 1.01 1819
13. Safrad 2.2.40 1723
So this involves Dragon at different levels and at 5+3. Is the hardware used close to that of CCRL Blitz? Or is the 5+3 TC on that hardware equivalent to CCRL blitz?

Just for comparison, I downloaded all CCRL blitz games and generated a rating list using Ordo with Safrad anchored at 1723. This is what I get:

[Image: rating list generated with Ordo]

I have launched a bot on Lichess called cdroid, designed to keep the human happy by playing suboptimal moves. It hasn't played many games against humans yet. This engine is only comparable to CCRL blitz ratings of 1000 and below.
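
As an illustration of the anchoring step mentioned above (not Ordo itself, just the arithmetic that fixing an anchor implies): every rating is shifted by the same constant so that the anchor engine lands on the chosen value, while all rating differences stay unchanged. The engine names and raw ratings below are made-up placeholders, not CCRL data.

```python
def anchor_ratings(ratings, anchor_name, anchor_value):
    """Shift every rating by one constant so anchor_name ends up at anchor_value."""
    offset = anchor_value - ratings[anchor_name]
    return {name: rating + offset for name, rating in ratings.items()}


# Hypothetical raw ratings (placeholders, not taken from any real list).
raw = {"EngineA": 2500.0, "EngineB": 1900.0, "AnchorEngine": 1000.0}

anchored = anchor_ratings(raw, "AnchorEngine", 1723.0)
for name, rating in sorted(anchored.items(), key=lambda item: -item[1]):
    print(f"{name:14s} {rating:7.1f}")
```

Because only a constant is added, the choice of anchor moves the whole list up or down but cannot change the spread between the top and bottom engines.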
Guenther
Posts: 4718
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Fide equivalent "slow blitz" rating list

Post by Guenther »

lkaufman wrote: Tue Jan 04, 2022 12:51 am
Odd Gunnar Malin wrote: Tue Jan 04, 2022 12:45 am There is also the SSDF rating list with pretty stable computers down to 1471 ( https://ssdf.bosjo.net/long.txt ). The list is for a 2h time control, I think, so maybe not so useful. Can't this MESS setup run these as UCI engines? I couldn't find the lowest-rated computer in CB-emu Messui32, but I found it gathering dust in my basement, without the silver pieces and power adapter. I put in some batteries and it worked like a charm. I only need my finger to play blindfold against it.
I don't actually use the CCRL or CEGT or SSDF rating for this human rating list; I just need a reliable UCI engine that would be roughly 600 to 800 on the CCRL blitz list. There are several such engines on the list, but either I can't find where to download them, or I get "malicious software" warnings, or they crash when I test them in Cutechess. Maybe one or two are reliable and I just haven't found them yet.
https://lichess.org/@/monchester/perf/blitz

712	Monchester 1.0 64-bit	848	+307	−302	42.0%	+61.3	27.5%	743
https://github.com/unserializable/monch ... naries/1.0

One caveat seems to be asymmetric PG books. I haven't used PG books for over a decade; I use PGN start positions instead, which work with Monchester.
https://github.com/unserializable/monchester/issues/12

forum3/viewtopic.php?f=2&t=75636&p=8699 ... er#p869958
https://rwbc-chess.de

[Trolls n'existent pas...]
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Fide equivalent "slow blitz" rating list

Post by lkaufman »

Ferdy wrote: Tue Jan 04, 2022 10:46 am
lkaufman wrote: Mon Jan 03, 2022 7:02 am Here is my rating list for a sampling of engines (single thread) based on 500-game matches at 5' + 3" against various rating levels of Dragon 2.6, with the list anchored by the weakest engine tested, Safrad 2.2.40, set to 1723, its FIDE-equivalent Lichess blitz rating based solely on all 104 blitz games played there against human opponents over 1400 in the last six months (actual Lichess performance was 1848, translated to FIDE 1723 by the formula given in another thread). Note that the spread of ratings is considerably smaller than in normal rating lists; I think this is due to the great difference between conventional engines and an NNUE engine searching to a much shorter depth for similar strength. Similarity of opponents seems to spread the ratings apart (unless they are so strong that most games are drawn). In theory these ratings should predict what FIDE rating a human needs to score 50% against the engine at 5' + 3". So my question is: are these ratings generally realistic for that goal, or are they too high or too low? Note that only engines near the human range were tested.

1. Rybka 2.3.2a 3016
2. Fruit 2.2.1 2832
3. Benjamin 1.0 2784
4. Rebel Century 2739
5. Nebula 2 2726
6. Pawny 0.2 2550
7. Baislicka 1.0 2474
8. Bikjump 2.01 64b 2338
9. Snowy 0.2 2235
10. Pigeon 1.5.1 2130
11. Irina 0.15 1872
12. Sargon 1.01 1819
13. Safrad 2.2.40 1723
So this involves Dragon at different levels and at 5+3. Is the hardware used close to that of CCRL Blitz? Or is the 5+3 TC on that hardware equivalent to CCRL blitz?

Just for comparison, I downloaded all CCRL blitz games and generated a rating list using Ordo with Safrad anchored at 1723. This is what I get:

[Image: rating list generated with Ordo]

I have launched a bot on Lichess called cdroid, designed to keep the human happy by playing suboptimal moves. It hasn't played many games against humans yet. This engine is only comparable to CCRL blitz ratings of 1000 and below.
My hardware for this was running at close to the same speed as the CCRL reference engines, so it is as if CCRL had a rating list about midway between their Blitz and Rapid rating lists. It is interesting to see that on your list the Elo spread between the top and bottom engines from my list is over 2000 Elo, vs. about 1300 Elo for me. Maybe the 2000 spread would shrink to 1900 or 1800 at 5' + 3", but not more than that. The point is that when normal engines play each other, the one that looks a ply or two deeper tends to win, but when they play against different NNUE engines doing shorter but "smarter" searches, the benefit of an extra ply is much smaller, so the Elo differences are much smaller on my list. That is why I believe that my method produces relative ratings much more in line with how those engines would perform vs. humans. Based on the numbers on my list and on my judgment of how these engines would actually do vs. humans at 5' + 3", I think the numbers seem credible, whereas the Elo numbers you get from the raw CCRL data with Safrad = 1723 are obviously absurdly high in human terms near the top. Of course, it's been known for about four decades that engine vs. engine ratings produce larger Elo gaps than the same engines vs. humans; this just makes it obvious.
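
For readers who want to relate match scores to rating gaps, here is a minimal sketch of the standard logistic Elo relation. It is only the textbook formula; the thread does not say exactly how the list above was computed, and the formula by itself does not model the compression of engine-vs-engine gaps against humans discussed here. The function names are illustrative.

```python
import math


def expected_score(rating_diff):
    """Expected score for the side that is rating_diff Elo points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def elo_diff_from_score(score):
    """Rating gap implied by a match score strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# A 50% score implies no rating gap, which is the sense in which a rating
# "predicts what FIDE rating a human needs to score 50% against the engine".
print(elo_diff_from_score(0.5))

# Round trip: convert a score to a gap and back again.
gap = elo_diff_from_score(0.64)
print(round(gap, 1), round(expected_score(gap), 2))
```

Under this model a 400-point gap corresponds to an expected score of 10/11, so a spread of 2000 points versus 1300 points between the same two engines implies very different predicted results against an opponent in between.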
Komodo rules!
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Fide equivalent "slow blitz" rating list

Post by lkaufman »

Chessqueen wrote: Tue Jan 04, 2022 6:10 am
lkaufman wrote: Mon Jan 03, 2022 5:49 pm
Chessqueen wrote: Mon Jan 03, 2022 3:46 pm
lkaufman wrote: Mon Jan 03, 2022 7:02 am Here is my rating list for a sampling of engines (single thread) based on 500-game matches at 5' + 3" against various rating levels of Dragon 2.6, with the list anchored by the weakest engine tested, Safrad 2.2.40, set to 1723, its FIDE-equivalent Lichess blitz rating based solely on all 104 blitz games played there against human opponents over 1400 in the last six months (actual Lichess performance was 1848, translated to FIDE 1723 by the formula given in another thread). Note that the spread of ratings is considerably smaller than in normal rating lists; I think this is due to the great difference between conventional engines and an NNUE engine searching to a much shorter depth for similar strength. Similarity of opponents seems to spread the ratings apart (unless they are so strong that most games are drawn). In theory these ratings should predict what FIDE rating a human needs to score 50% against the engine at 5' + 3". So my question is: are these ratings generally realistic for that goal, or are they too high or too low? Note that only engines near the human range were tested.

1. Rybka 2.3.2a 3016
2. Fruit 2.2.1 2832
3. Benjamin 1.0 2784
4. Rebel Century 2739
5. Nebula 2 2726
6. Pawny 0.2 2550
7. Baislicka 1.0 2474
8. Bikjump 2.01 64b 2338
9. Snowy 0.2 2235
10. Pigeon 1.5.1 2130
11. Irina 0.15 1872
12. Sargon 1.01 1819
13. Safrad 2.2.40 1723
Are you also calibrating for the rating range from 1700 down to 800? According to this chart, the majority of players fall between 800 and 1700, and the same goes for FIDE ratings; that range is where players need UCI_ELO the most to improve their Elo progressively: http://www.uschess.org/archive/ratings/ratedist.php
I don't have any reliable CCRL-rated engines below 1000 (so 1700 FIDE); do you know of any that are easy to download, UCI, reasonably bug-free, and play somewhat sensible-looking chess (to an amateur)? I do have a way to extrapolate the ratings of the Elo levels in Dragon down to 800 and below, but it would be more accurate if I had a suitably weak CCRL engine to use in the tests.
My friend, rated around 1645, played against DASH (rated 1600) on chess.com and emailed me this game:

[pgn][Event "Chess.com DASH"]
[Date "2022.01.02"]
[Round "?"]
[White "Jose"]
[Black "DASH"]
[Result "1-0"]
[BlackElo "1600"]
[ECO "D00"]
[Opening "Queen's Pawn"]
[Time "22:38:01"]
[Variation "2.e3 Nf6"]
[WhiteElo "1645"]
[TimeControl "300+3"]
[Termination "normal"]
[WhiteType "human"]
[BlackType "program"]

1. d4 d5 2. e3 Nf6 3. c4 e6 4. c5 Nbd7 5. b4 Be7 6. a3 O-O 7. Nc3 e5
8. Be2 a5 9. Qa4 axb4 10. Qxa8 bxc3 11. c6 bxc6 12. Qxc6 Nb8 13. Qxc3 Ne4
14. Qb2 Bh4 15. g3 exd4 16. gxh4 Qxh4 17. Bd3 Nc5 18. Bb5 Qe4 19. f3 Qf5
20. Ra2 c6 21. Qc2 cxb5 22. Qxc5 Na6 23. Qxd4 Qb1 24. Qb2 Qf5 25. Qc2 Qh5
26. Ne2 Nb4 27. axb4 Qxf3 28. Rg1 Bg4 29. Nd4 Qh3 30. Qg2 h5 31. Qxh3 Bxh3
32. Rg3 Bg4 33. Ra7 Rc8 34. Bd2 Re8 35. Bc3 Rc8 36. Nxb5 d4 37. Bxd4 Rc2
38. h3 Rc1+ 39. Kd2 Rd1+ 40. Kc2 Re1 41. hxg4 hxg4 42. Rxg4 Rh1 43. Ra8+
Kh7 44. Rxg7+ Kh6 45. Rh8# 1-0[/pgn]
Is there a working download link for a UCI version of this engine? Otherwise it's of no use for this purpose.
Komodo rules!
Guenther
Posts: 4718
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Fide equivalent "slow blitz" rating list

Post by Guenther »

lkaufman wrote: Tue Jan 04, 2022 6:08 pm

Is there a working download link for a UCI version of this engine? Otherwise it's of no use for this purpose.
Why? I thought you use CuteChess; moreover, it will of course work with WB2UCI...
https://rwbc-chess.de

[Trolls n'existent pas...]
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Fide equivalent "slow blitz" rating list

Post by lkaufman »

Guenther wrote: Tue Jan 04, 2022 6:21 pm
lkaufman wrote: Tue Jan 04, 2022 6:08 pm

Is there a working download link for a UCI version of this engine? Otherwise it's of no use for this purpose.
Why? I thought you use CuteChess; moreover, it will of course work with WB2UCI...
OK, I was unfamiliar with WB2UCI, but I don't see "Dash" as a download on that site. Do you have a link to it, or can you suggest another engine from that site that would be rated below 800 CCRL blitz or 1500 FIDE blitz if it has such a rating?
Komodo rules!
Guenther
Posts: 4718
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Fide equivalent "slow blitz" rating list

Post by Guenther »

lkaufman wrote: Tue Jan 04, 2022 6:56 pm
Guenther wrote: Tue Jan 04, 2022 6:21 pm
lkaufman wrote: Tue Jan 04, 2022 6:08 pm

Is there a working download link for a UCI version of this engine? Otherwise it's of no use for this purpose.
Why? I thought you use CuteChess; moreover, it will of course work with WB2UCI...
OK, I was unfamiliar with WB2UCI, but I don't see "Dash" as a download on that site. Do you have a link to it, or can you suggest another engine from that site that would be rated below 800 CCRL blitz or 1500 FIDE blitz if it has such a rating?

Oh, I thought you were talking about my previous post, or that the forum had a glitch. I just overlooked your 'UCI' condition, which is moot anyway.
https://talkchess.com/forum3/viewtopic. ... 10#p917264
Last edited by Guenther on Tue Jan 04, 2022 7:36 pm, edited 1 time in total.
https://rwbc-chess.de

[Trolls n'existent pas...]
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Fide equivalent "slow blitz" rating list

Post by lkaufman »

Guenther wrote: Tue Jan 04, 2022 11:06 am
lkaufman wrote: Tue Jan 04, 2022 12:51 am
Odd Gunnar Malin wrote: Tue Jan 04, 2022 12:45 am There is also the SSDF rating list with pretty stable computers down to 1471 ( https://ssdf.bosjo.net/long.txt ). The list is for a 2h time control, I think, so maybe not so useful. Can't this MESS setup run these as UCI engines? I couldn't find the lowest-rated computer in CB-emu Messui32, but I found it gathering dust in my basement, without the silver pieces and power adapter. I put in some batteries and it worked like a charm. I only need my finger to play blindfold against it.
I don't actually use the CCRL or CEGT or SSDF rating for this human rating list; I just need a reliable UCI engine that would be roughly 600 to 800 on the CCRL blitz list. There are several such engines on the list, but either I can't find where to download them, or I get "malicious software" warnings, or they crash when I test them in Cutechess. Maybe one or two are reliable and I just haven't found them yet.
https://lichess.org/@/monchester/perf/blitz

712	Monchester 1.0 64-bit	848	+307	−302	42.0%	+61.3	27.5%	743
https://github.com/unserializable/monch ... naries/1.0

One caveat seems to be asymmetric PG books. I haven't used PG books for over a decade; I use PGN start positions instead, which work with Monchester.
https://github.com/unserializable/monchester/issues/12

forum3/viewtopic.php?f=2&t=75636&p=8699 ... er#p869958
I downloaded the Monchester Windows Haswell build, but it won't install as an engine in Cutechess. I don't know why.
Komodo rules!
Guenther
Posts: 4718
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Fide equivalent "slow blitz" rating list

Post by Guenther »

lkaufman wrote: Tue Jan 04, 2022 7:35 pm
Guenther wrote: Tue Jan 04, 2022 11:06 am
lkaufman wrote: Tue Jan 04, 2022 12:51 am
Odd Gunnar Malin wrote: Tue Jan 04, 2022 12:45 am There is also the SSDF rating list with pretty stable computers down to 1471 ( https://ssdf.bosjo.net/long.txt ). The list is for a 2h time control, I think, so maybe not so useful. Can't this MESS setup run these as UCI engines? I couldn't find the lowest-rated computer in CB-emu Messui32, but I found it gathering dust in my basement, without the silver pieces and power adapter. I put in some batteries and it worked like a charm. I only need my finger to play blindfold against it.
I don't actually use the CCRL or CEGT or SSDF rating for this human rating list; I just need a reliable UCI engine that would be roughly 600 to 800 on the CCRL blitz list. There are several such engines on the list, but either I can't find where to download them, or I get "malicious software" warnings, or they crash when I test them in Cutechess. Maybe one or two are reliable and I just haven't found them yet.
https://lichess.org/@/monchester/perf/blitz

712	Monchester 1.0 64-bit	848	+307	−302	42.0%	+61.3	27.5%	743
https://github.com/unserializable/monch ... naries/1.0

One caveat seems to be asymmetric PG books. I haven't used PG books for over a decade; I use PGN start positions instead, which work with Monchester.
https://github.com/unserializable/monchester/issues/12

forum3/viewtopic.php?f=2&t=75636&p=8699 ... er#p869958
I downloaded the Monchester Windows Haswell build, but it won't install as an engine in Cutechess. I don't know why.
It is an xboard engine...
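
In practice this means the engine has to be registered with the xboard protocol rather than UCI. A rough sketch of a matching cutechess-cli invocation, built as an argument list in Python: the executable paths, engine names, and output file are placeholders I have assumed, and the time control mirrors the 5' + 3" used in this thread. The command is only printed here, not executed.

```python
import shlex


def cutechess_command(xboard_cmd, uci_cmd, games=2, tc="300+3"):
    """Build a cutechess-cli argument list pairing an xboard engine with a UCI one."""
    return [
        "cutechess-cli",
        # The xboard engine must be declared with proto=xboard, not proto=uci.
        "-engine", f"cmd={xboard_cmd}", "proto=xboard", "name=Monchester",
        "-engine", f"cmd={uci_cmd}", "proto=uci", "name=Opponent",
        # Same time control for both engines: 300 seconds plus 3 seconds per move.
        "-each", f"tc={tc}",
        "-games", str(games),
        "-pgnout", "games.pgn",
    ]


# Placeholder paths; substitute the real engine executables.
args = cutechess_command("./monchester", "./opponent-engine")
print(shlex.join(args))
```

The Cutechess GUI has an equivalent protocol setting in its engine configuration dialog, so the same choice should apply there.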
https://rwbc-chess.de

[Trolls n'existent pas...]