Carlsen vs. CCRL 2850 engines in Rapid?


lkaufman
Posts: 5966
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Carlsen vs. CCRL 2850 engines in Rapid?

Post by lkaufman »

How would Magnus Carlsen (FIDE Classical and Rapid ratings both close to 2850) perform in a serious Rapid (15' + 10") match with engines rated around 2850 on the CCRL Rapid (40/15) rating list (assuming they used a good but varied modern opening book), and ran on the reference i7 computer used by that list? Some example engines running on just one thread include Deep Shredder 11, Stockfish 1.4 (not 14!), Minic 1.39, Fritz 11, Toga II 3.0, Naum 3, Gaviota 1.0, and Arasan 18.0. I realize that there is probably no data on this exactly, opinions must be based on results of engines against other top players with a bit of extrapolation. We know that the rating lists spread out the ratings relative to human scale, but I'm trying to determine at least whether this list is accurate relative to FIDE ratings of the best human player. What do you think?
Komodo rules!
j.t.
Posts: 240
Joined: Wed Jun 16, 2021 2:08 am
Location: Berlin
Full name: Jost Triller

Re: Carlsen vs. CCRL 2850 engines in Rapid?

Post by j.t. »

Maybe one could check whether an engine is better or worse than Magnus Carlsen by using the current best engine to calculate the centipawn loss of Carlsen's moves and the centipawn loss of the engine's moves in the same positions. If an engine has a consistently better or worse centipawn loss, that should be a good indicator of whether it is better than Magnus Carlsen or not. I am not sure, though, what the best way would be to compare similar centipawn losses between Carlsen and the engine, as centipawn losses in some positions could be more important than in others.

Judging by two game analyses of Magnus's games that I made with Nalwald, I believe that CCRL 40/40 2700 is already better than Carlsen. Maybe even 2600.
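
A minimal sketch of the comparison described above, in Python. It assumes a strong reference engine has already produced, for each position, the evaluation after the best move and the evaluation after the move actually played (both in centipawns, from the mover's point of view). The function name, the 300 cp cap, and the sample numbers are hypothetical and only for illustration; they are not taken from any real analysis.

```python
def average_centipawn_loss(evals, cap=300):
    """Average centipawn loss over a list of (best_cp, played_cp) pairs.

    best_cp   -- reference-engine eval after the best move (assumed input)
    played_cp -- reference-engine eval after the move actually played
    cap       -- hypothetical per-move cap so one blunder does not dominate
    """
    if not evals:
        raise ValueError("need at least one position")
    # A move can never be better than the best move, so clamp at 0.
    losses = [min(max(best - played, 0), cap) for best, played in evals]
    return sum(losses) / len(losses)


# Made-up evaluations for the same four positions.
human_moves = [(30, 30), (45, 20), (10, -5), (60, 60)]
engine_moves = [(30, 25), (45, 45), (10, 10), (60, 40)]

print(average_centipawn_loss(human_moves))   # 10.0
print(average_centipawn_loss(engine_moves))  # 6.25
```

This plain average treats every position alike, so it does not address the open question of weighting positions where a small loss matters more.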
MonteCarlo
Posts: 188
Joined: Sun Dec 25, 2016 4:59 pm

Re: Carlsen vs. CCRL 2850 engines in Rapid?

Post by MonteCarlo »

Deep Fritz 10 is also around that rating, and was the version that played Kramnik in 2006 (or at least was released around the time of that match, if not the exact product used).

That was slow classical (40 moves in 2 hours, I think?), and faster TCs should benefit the engine.

The previous versions of Fritz/Junior that played Kramnik/Kasparov were weaker, and were still competitive at classical.

Very small samples, but I'd guess the engines would be heavily favored in rapid based on that.

Having said that, it might depend on the engine. In my experience some of the lower-end engines (2400-2700) seem to have weaknesses more easily exploited by humans than others, so if that is also true around 2850 it might vary a lot from engine to engine.

What my "heavily favored" means exactly is not clear even to me; if I had to make a SWAG, I'd say that the score of the last Kramnik classical TC match would be a sensible lower bound for rapid, so something 70%+.

Cheers!
amanjpro
Posts: 883
Joined: Sat Mar 13, 2021 1:47 am
Full name: Amanj Sherwany

Re: Carlsen vs. CCRL 2850 engines in Rapid?

Post by amanjpro »

j.t. wrote: Mon Aug 16, 2021 9:10 pm Maybe one could check whether an engine is better or worse than Magnus Carlsen by using the current best engine to calculate the centipawn loss of Carlsen's moves and the centipawn loss of the engine's moves in the same positions. If an engine has a consistently better or worse centipawn loss, that should be a good indicator of whether it is better than Magnus Carlsen or not. I am not sure, though, what the best way would be to compare similar centipawn losses between Carlsen and the engine, as centipawn losses in some positions could be more important than in others.

Judging by two game analyses of Magnus's games that I made with Nalwald, I believe that CCRL 40/40 2700 is already better than Carlsen. Maybe even 2600.
This actually doesn't take into account the psychological aspect... when a human plays against another human, he doesn't try to play the best move, but the move that has the best chance of making the opponent go wrong...

You cannot say Tal was a bad player because his moves were not accurate; what he did was amazing... The same applies to Magnus: he often goes for slightly worse endgames because he is sure that he can outplay the opponent easily.
Chessqueen
Posts: 5618
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: Carlsen vs. CCRL 2850 engines in Rapid?

Post by Chessqueen »

lkaufman wrote: Mon Aug 16, 2021 8:21 pm How would Magnus Carlsen (FIDE Classical and Rapid ratings both close to 2850) perform in a serious Rapid (15' + 10") match with engines rated around 2850 on the CCRL Rapid (40/15) rating list (assuming they used a good but varied modern opening book), and ran on the reference i7 computer used by that list? Some example engines running on just one thread include Deep Shredder 11, Stockfish 1.4 (not 14!), Minic 1.39, Fritz 11, Toga II 3.0, Naum 3, Gaviota 1.0, and Arasan 18.0. I realize that there is probably no data on this exactly, opinions must be based on results of engines against other top players with a bit of extrapolation. We know that the rating lists spread out the ratings relative to human scale, but I'm trying to determine at least whether this list is accurate relative to FIDE ratings of the best human player. What do you think?
Maybe one could check whether any of those engines performs better against Komodo Dragon 2 set to the Skill level whose FIDE-equivalent rating (Skill level x 114) matches Magnus Carlsen's 2850. But even though chess programming technique has advanced tremendously in the last 11 years, for some reason I still believe that Deep Shredder 11 would come out ahead of all the other engines that you listed.
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Carlsen vs. CCRL 2850 engines in Rapid?

Post by xr_a_y »

Chessqueen wrote: Mon Aug 16, 2021 10:16 pm
lkaufman wrote: Mon Aug 16, 2021 8:21 pm How would Magnus Carlsen (FIDE Classical and Rapid ratings both close to 2850) perform in a serious Rapid (15' + 10") match with engines rated around 2850 on the CCRL Rapid (40/15) rating list (assuming they used a good but varied modern opening book), and ran on the reference i7 computer used by that list? Some example engines running on just one thread include Deep Shredder 11, Stockfish 1.4 (not 14!), Minic 1.39, Fritz 11, Toga II 3.0, Naum 3, Gaviota 1.0, and Arasan 18.0. I realize that there is probably no data on this exactly, opinions must be based on results of engines against other top players with a bit of extrapolation. We know that the rating lists spread out the ratings relative to human scale, but I'm trying to determine at least whether this list is accurate relative to FIDE ratings of the best human player. What do you think?
Maybe one could check whether any of those engines performs better against Komodo Dragon 2 set to the Skill level whose FIDE-equivalent rating (Skill level x 114) matches Magnus Carlsen's 2850. But even though chess programming technique has advanced tremendously in the last 11 years, for some reason I still believe that Deep Shredder 11 would come out ahead of all the other engines that you listed.
In the CCRL data, Deep Shredder 11 "only" scored (if I read it correctly):
- 45.3% against SF1.4
- 42.6% against Toga II 1.4.1SE, which seems to be weaker than Toga II 3.0
- 45.8% versus Naum 3

But this says nothing about how this whole field of engines would fare against each other, and of course not many games have been played. This test could surely be run by someone.
mehmet123
Posts: 676
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: Carlsen vs. CCRL 2850 engines in Rapid?

Post by mehmet123 »

In 1999 Fritz 5.32 beat Judit Polgar 5.5-2.5 in a match at 30 minutes per game. Fritz 5.32's performance in this match was 2814 Elo.
Fritz 5.32 played on Pentium II/350 MHz hardware. Fritz 5.32 is ~100 Elo weaker than Fritz 8. Fritz 8 Bilbao is rated 2700 Elo on the CCRL 40/15 rating list.
In 1998 Rebel beat Anand 1.5-0.5 in a match of 2 semi-blitz games (15 minutes), a 2986 Elo performance. In 4 blitz games (5 min + 5 sec) Rebel beat Anand 3-1 (also a 2986 Elo performance). Rebel played on K6-2 450 MHz hardware. Anand was rated 2795 Elo in July 1998.
Fritz 5.32 on a Pentium II/350 MHz and Rebel on a K6-2 450 MHz would probably be rated around 2400 Elo on the CCRL 40/15 list.

Considering that Magnus Carlsen is the best rapid player in history, a 2400-2450 Elo chess engine has a low chance of beating him.
But a chess engine around 2600 CCRL Elo could beat Magnus Carlsen at rapid without much difficulty.
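
For anyone who wants to check the performance figures, here is a small Python sketch. It assumes the plain logistic Elo formula (performance = opponent rating + 400 * log10(p / (1 - p))); FIDE's published conversion table can differ from this by a few points. The function name is hypothetical, and the only rating used is Anand's 2795 quoted above.

```python
import math


def performance_rating(opponent_rating, score, games):
    """Performance rating from the logistic Elo formula (an assumption here;
    official FIDE tables may give slightly different values)."""
    p = score / games
    if not 0 < p < 1:
        raise ValueError("undefined for 0% or 100% scores")
    return opponent_rating + 400 * math.log10(p / (1 - p))


# Rebel vs. Anand (rated 2795): 1.5/2 and 3/4 are both 75% scores.
print(round(performance_rating(2795, 1.5, 2)))  # 2986
print(round(performance_rating(2795, 3, 4)))    # 2986
```

Both matches give the same number because the formula depends only on the percentage score, not on the number of games, which is also why such tiny samples carry large error bars.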
lkaufman
Posts: 5966
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Carlsen vs. CCRL 2850 engines in Rapid?

Post by lkaufman »

j.t. wrote: Mon Aug 16, 2021 9:10 pm Maybe one could check whether an engine is better or worse than Magnus Carlsen by using the current best engine to calculate the centipawn loss of Carlsen's moves and the centipawn loss of the engine's moves in the same positions. If an engine has a consistently better or worse centipawn loss, that should be a good indicator of whether it is better than Magnus Carlsen or not. I am not sure, though, what the best way would be to compare similar centipawn losses between Carlsen and the engine, as centipawn losses in some positions could be more important than in others.

Judging by two game analyses of Magnus's games that I made with Nalwald, I believe that CCRL 40/40 2700 is already better than Carlsen. Maybe even 2600.
That is a useful approach, though probably biased in favor of the engines. To some extent, all engines think alike, at least more so than engines vs. humans, so if the engine and the human were equally strong, I would expect the engine to match some other engine more than the human would. Were you using Carlsen Rapid games and engine Rapid games for a fair comparison?
lkaufman
Posts: 5966
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Carlsen vs. CCRL 2850 engines in Rapid?

Post by lkaufman »

Chessqueen wrote: Mon Aug 16, 2021 10:16 pm
lkaufman wrote: Mon Aug 16, 2021 8:21 pm How would Magnus Carlsen (FIDE Classical and Rapid ratings both close to 2850) perform in a serious Rapid (15' + 10") match with engines rated around 2850 on the CCRL Rapid (40/15) rating list (assuming they used a good but varied modern opening book), and ran on the reference i7 computer used by that list? Some example engines running on just one thread include Deep Shredder 11, Stockfish 1.4 (not 14!), Minic 1.39, Fritz 11, Toga II 3.0, Naum 3, Gaviota 1.0, and Arasan 18.0. I realize that there is probably no data on this exactly, opinions must be based on results of engines against other top players with a bit of extrapolation. We know that the rating lists spread out the ratings relative to human scale, but I'm trying to determine at least whether this list is accurate relative to FIDE ratings of the best human player. What do you think?
Maybe one could check whether any of those engines performs better against Komodo Dragon 2 set to the Skill level whose FIDE-equivalent rating (Skill level x 114) matches Magnus Carlsen's 2850. But even though chess programming technique has advanced tremendously in the last 11 years, for some reason I still believe that Deep Shredder 11 would come out ahead of all the other engines that you listed.
This would be circular reasoning; I estimated the rating of KDragon 2 Skill levels partly by results vs. engines with known ratings, so basing engine ratings on those ratings is circular logic to some extent (yes, the skill levels 21 and 22 were tested vs. your friend Jorge, but that's a tiny sample and just two levels). It may well be that Shredder and some of the other engines that played vs. top humans around two decades ago were relatively better vs. humans than vs. engines since that was more of a goal for programmers at that time. This might mean that we are overrating current engines in the 2800 ballpark if we base the ratings on results vs. engines whose ratings are determined by human games played back then.
lkaufman
Posts: 5966
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Carlsen vs. CCRL 2850 engines in Rapid?

Post by lkaufman »

mehmet123 wrote: Mon Aug 16, 2021 10:53 pm In 1999 Fritz 5.32 beat Judit Polgar 5.5-2.5 in a match at 30 minutes per game. Fritz 5.32's performance in this match was 2814 Elo.
Fritz 5.32 played on Pentium II/350 MHz hardware. Fritz 5.32 is ~100 Elo weaker than Fritz 8. Fritz 8 Bilbao is rated 2700 Elo on the CCRL 40/15 rating list.
In 1998 Rebel beat Anand 1.5-0.5 in a match of 2 semi-blitz games (15 minutes), a 2986 Elo performance. In 4 blitz games (5 min + 5 sec) Rebel beat Anand 3-1 (also a 2986 Elo performance). Rebel played on K6-2 450 MHz hardware. Anand was rated 2795 Elo in July 1998.
Fritz 5.32 on a Pentium II/350 MHz and Rebel on a K6-2 450 MHz would probably be rated around 2400 Elo on the CCRL 40/15 list.

Considering that Magnus Carlsen is the best rapid player in history, a 2400-2450 Elo chess engine has a low chance of beating him.
But a chess engine around 2600 CCRL Elo could beat Magnus Carlsen at rapid without much difficulty.
The first result you mention is the most relevant, since 15' + 10" Rapid is roughly equivalent to game in 25'. I think that a 2800 player today is a stronger Rapid player than a 2800 player of 20 years ago, due to getting so much more practice at fast play on the internet, but even so I agree with you that somewhere between 2450 and 2600 CCRL Rapid (let's say 2525, the middle) would be an equal opponent for Magnus in Rapid based on this historical evidence. But that's roughly Skill level 22 on Komodo Dragon 2, which was a close opponent for Jorge Sammour, whose FIDE rating is only 2458, four hundred Elo below Carlsen! So I'm having a really tough time reconciling these facts. Obviously the crippled Dragon plays quite differently than a full-strength twenty-year-old engine, but it's not obvious why it would perform much worse vs. humans than an engine of equal strength based on direct play. Some mystery here....
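
To put rough numbers on the arithmetic above, here is a small Python sketch. It assumes the standard logistic Elo expectation and a 60-move game for converting an increment into an equivalent sudden-death time; both the function names and the 60-move figure are illustrative assumptions, not something measured.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (logistic Elo model)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def equivalent_minutes(base_minutes, increment_seconds, moves=60):
    """Rough sudden-death equivalent of base + increment.

    moves=60 is an assumed typical game length.
    """
    return base_minutes + increment_seconds * moves / 60


print(equivalent_minutes(15, 10))               # 25.0
print((2450 + 2600) / 2)                        # 2525.0
print(f"{expected_score(2458, 2850):.3f}")      # 0.095
```

On this model a 2458 player would be expected to score under 10% against a 2850 player, which is why the same engine level being a close match for both looks contradictory.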