Humanized Engine Rating List

lkaufman
Posts: 4884
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Humanized Engine Rating List

Post by lkaufman » Sat Jun 12, 2021 7:24 pm

These are the ratings that various engines and Skill levels would be expected to obtain at 3' + 2" Blitz and at 15' + 10" Rapid, running on one thread of a modern i7, against humans with FIDE Blitz/Rapid ratings in the 2200 to 2900 range, as appropriate. The engines are assumed to have only an 8-move-deep variety opening book.

The relative ratings are based solely on 200-game matches against Lc0 (CPU version 27, network 69200) using a three-move book of popular openings. This large network gets only around ten nodes per second on a typical i7, which makes it a good proxy for a top human player, in that both will make occasional tactical blunders in fast games. This leads to a much smaller range of ratings than playing the standard (alpha-beta) engines against each other, presumably because the Monte Carlo search and neural net of Lc0 are so different from standard engines that adding one more ply doesn't give the same Elo gain as it does in a direct match. The contraction of the rating spread is roughly in line with existing estimates, notably by Kai Laskos, of how much standard engine-vs-engine lists must be contracted to simulate human ratings.

Lc0 on CPU performs much better in Rapid than in Blitz against these standard engines, by roughly 160 Elo. That is about the amount humans gain against standard engines from the longer time limit, so it is likely that the Rapid rating of Lc0 cpu is very close to the Blitz rating it would earn against human opposition. Therefore I'm assigning the same initial rating to Lc0 cpu for both Rapid and Blitz, which makes the Rapid ratings of standard engines nearly a class (about 160 Elo) lower than their Blitz ratings, consistent with human results against engines. To set the level of the two lists, I picked a rating of 3100 for Lc0 cpu because it gives ratings that are pretty consistent with human results against those engines that have played Rapid or Blitz against humans.

This can be adjusted as we get more data on engine vs. human results. To reduce the large error bars, I may add games against a different network. To rate weak engines that lose nearly every game to Lc0 cpu, I can rate them against the same net set to look at just a single node (setting max nps to 0.001 accomplishes this). Many engines still need Rapid tests; all have Blitz ratings listed.

Comments are welcome, particularly on whether the level of the lists is about right, too high, or too low, ideally supported by data.

Engine: Blitz (3' + 2") : Rapid (15' + 10")

Stockfish 13: 3594 : 3412
KomodoDragon 2: 3594 : 3412
KomodoDragon 2 MCTS: 3502
Stockfish 11: 3450 : 3322
Komodo 14.1: 3391 : 3227
Stockfish 9: 3372 : 3253
Stockfish 13 elo2850: 3239 : 3056
Critter 1.6a: 3233
Wasp 4.5: 3231 : 3086
Wasp 4: 3225
Gull 3: 3221 : 3084
Fritz 15: 3195
DeepRybka 4: 3154 : 3020
Wasp 3: 3123
Wasp 2: 3105 : 2938
KomodoDragon 2 Sk 24: 3102 : 2862
Lc0cpu v27 net 69200: 3100 : 3100
Rybka 2.3.2a: 3095 : 2953
Stockfish 13 elo2500: 3046
Rybka 1: 3031
Komodo 14.1 Skill 24: 3019 : 2776
Fruit 2.2.1: 2987 : 2777
KomodoDragon 2 Sk 23: 2971 : 2737
Benjamin 1.0: 2955
Komodo 14.1 Skill 23: 2907 : 2569
KomodoDragon 2 Sk 22: 2885 : 2559
Komodo 14.1 Skill 22: 2799
Zahak 3.0: 2750 : 2536
Pawny 0.2: 2728
Baislicka 1.0: 2703
KomodoDragon 2 Sk 21: 2670
Komodo 14.1 Skill 21: 2670
BikJump 2.01: 2630
KomodoDragon 2 Sk 20: 2588
Komodo 14.1 Skill 20: 2559
Zahak 1.0.0: 2548
Snowy 0.2: 2536 : 2180
Stockfish 13 elo2000: 2536
Komodo 14.1 Skill 19: 2510
Pigeon-1.5.1: 2481
CDrill1800-32b: 2464
KomodoDragon 2 Sk 19: 2464
KomodoDragon 2 Sk 18: 2251
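
For anyone who wants to reproduce the arithmetic behind a list like this, here is a minimal sketch of how a match result against the anchor might be turned into a rating. It assumes the standard logistic Elo formula and a normal approximation for the error bars. The function names and the win/draw/loss counts in the example are made up for illustration; they are not the actual match data, and this is not necessarily the exact procedure used for the list.

```python
import math

LC0_ANCHOR = 3100  # rating assigned to Lc0 cpu in the post


def elo_diff(score):
    """Elo difference implied by a score fraction, using the logistic model."""
    # Clamp so that 0% or 100% scores do not blow up the logarithm.
    score = min(max(score, 1e-6), 1.0 - 1e-6)
    return -400.0 * math.log10(1.0 / score - 1.0)


def rating_with_margin(wins, draws, losses, anchor=LC0_ANCHOR, z=1.96):
    """Return (rating, low, high) for an engine that played the anchor.

    The interval is a rough 95% range from a normal approximation
    to the per-game score distribution.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    std_err = math.sqrt(variance / games)
    rating = anchor + elo_diff(score)
    low = anchor + elo_diff(score - z * std_err)
    high = anchor + elo_diff(score + z * std_err)
    return rating, low, high


if __name__ == "__main__":
    # Hypothetical 200-game match result, for illustration only.
    rating, low, high = rating_with_margin(wins=100, draws=60, losses=40)
    print(f"rating {rating:.0f}, rough 95% range {low:.0f} to {high:.0f}")
```

With only 200 games per engine the range comes out tens of Elo wide on each side, which is why the error bars on the list are described as large.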
Komodo rules!

Uri Blass
Posts: 9111
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: Humanized Engine Rating List

Post by Uri Blass » Sat Jun 12, 2021 7:47 pm

I wonder how you arrived at this 160 Elo figure for how much better humans do at 15' + 10" than at 3' + 2".

I do not believe that Lc0 emulates humans well.
I suspect that Lc0 is basically better in the opening, while humans are better in the endgame (I have not used Lc0 recently, but that was my impression in the past).

lkaufman

Re: Humanized Engine Rating List

Post by lkaufman » Sat Jun 12, 2021 8:03 pm

The 160 figure is calculated from the Lc0 data. For humans there is no precise comparison of Blitz to Rapid, but it is well known that humans do substantially better against engines with more time. "Substantially" isn't a number, but based on more than thirty years of experience with engines playing strong humans, it is surely more than 100 Elo (for Rapid vs. Blitz) and probably below 200 Elo, so 160 at least isn't way off the mark. Engines reached GM level in Blitz around 1990 and took another three or four years of hardware and software advances to do the same in Rapid, which seems roughly consistent with the 160 Elo gap.
I agree that Lc0 isn't a great human emulator, for the reason you mention, but using it for this is much better than using standard engines, where seeing one ply deeper is critical and which never make shallow blunders. It should at least make the scale of the list correct, even if it might favor or disfavor particular engines a bit. The low NPS of the CPU version makes even its opening play somewhat dubious, since it misses some simple tactics.
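
As a rough check of the Blitz-to-Rapid gap, here is a small sketch that averages the difference over some entries from the list in the first post. The choice of entries is my own assumption (the ten highest-rated unrestricted engines that have both ratings, leaving out the Skill and elo-limited settings), and the names `RATINGS` and `blitz_rapid_gaps` are only for illustration.

```python
# (Blitz, Rapid) ratings copied from the list in the first post.
RATINGS = {
    "Stockfish 13": (3594, 3412),
    "KomodoDragon 2": (3594, 3412),
    "Stockfish 11": (3450, 3322),
    "Komodo 14.1": (3391, 3227),
    "Stockfish 9": (3372, 3253),
    "Wasp 4.5": (3231, 3086),
    "Gull 3": (3221, 3084),
    "DeepRybka 4": (3154, 3020),
    "Wasp 2": (3105, 2938),
    "Rybka 2.3.2a": (3095, 2953),
}


def blitz_rapid_gaps(ratings):
    """Return {engine: Blitz rating minus Rapid rating}."""
    return {name: blitz - rapid for name, (blitz, rapid) in ratings.items()}


if __name__ == "__main__":
    gaps = blitz_rapid_gaps(RATINGS)
    for name, gap in gaps.items():
        print(f"{name}: {gap}")
    mean_gap = sum(gaps.values()) / len(gaps)
    print(f"mean gap over {len(gaps)} engines: {mean_gap:.0f}")
```

For this particular subset the individual gaps run from 119 to 182 and average 150. The Skill-limited and weaker entries on the list show larger gaps, so the average depends on which engines are included.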

Chessqueen
Posts: 2217
Joined: Wed Sep 05, 2018 12:16 am
Full name: George Pichard

Re: Humanized Engine Rating List

Post by Chessqueen » Sun Jun 13, 2021 1:24 pm

Now I have a question: when humans play engines like Komodo Dragon 2 MCTS at 3' + 2" versus 15' + 10", what Elo gain do they get from the increased time? For instance, if IM Andras Toth plays Komodo Dragon 2 MCTS at 3' + 2" and then at 15' + 10", what score do you expect out of 6 games?
Arizona will have to learn from the Chinese to regain Greenland from desert ==>
https://www.youtube.com/watch?v=KTpaJn22w4I

lkaufman

Re: Humanized Engine Rating List

Post by lkaufman » Sun Jun 13, 2021 7:26 pm

Well, I estimated above that the human will perform about 160 Elo better in Rapid than in Blitz against the same engine. Presumably the human improves by something like 260 while the engine gains maybe 100. But with knight odds the extra time won't help the engine much, so I would expect the human's performance gain to be close to 260 rather than 160 Elo.
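
To put a 6-game score on this, here is a minimal sketch using the standard logistic expected-score formula. The starting rating difference of -100 (human 100 Elo below the engine's effective strength in Blitz) is purely a hypothetical input, since the effective rating of an engine giving knight odds is not established in this thread; the function names are also just for illustration.

```python
def expected_score(rating_diff):
    """Expected score fraction for a player rated rating_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def expected_points(rating_diff, games=6):
    """Expected total points over a short match of the given length."""
    return games * expected_score(rating_diff)


if __name__ == "__main__":
    blitz_diff = -100  # hypothetical: human minus engine in Blitz
    for label, gain in (("Blitz", 0), ("Rapid, +160", 160), ("Rapid, +260", 260)):
        points = expected_points(blitz_diff + gain)
        print(f"{label}: about {points:.1f} out of 6")
```

A 6-game match is far too short to confirm a gain of this size, so the figures are only expectations, and the actual result could easily differ by a point or more either way.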
