which is more realistic in human terms, ccrl or cegt?

lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Post by lkaufman »

The CCRL 40/15 list and the CEGT "40/20" (maybe 40/8 or so on modern hardware) lists pretty much agree around the 3500 level, but for engines in the range of strong human amateur players, say 2100-2400 FIDE or so, the CCRL ratings for most engines are far higher than the CEGT ratings, by maybe 200 to 250 points. This may be due to using BayesElo vs Ordo, but whatever the reason, I'm simply asking strong human players, or people who have observed games by strong human players against such engines, to express their opinion as to which list is closer to human FIDE ratings in that rating range, for games played at fairly slow "rapid" time controls on a typical modern 3 GHz machine. I suspect that the truth is somewhere in between the two lists, but I have very limited data to go by. It would be very nice to know just what level of engine is actually a good match for, say, a 2300 FIDE player.
Komodo rules!
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Post by carldaman »

I think on average even the CCRL ratings are a little deflated in the 2100-2400 range, which makes CEGT too deflated.

I know some strong players who have opined that Romichess, for example, is at least FIDE master strength, especially with learning enabled. You could try it on for size yourself, Larry. :)
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Post by lkaufman »

carldaman wrote: Thu Jul 23, 2020 3:57 am I think on average even the CCRL ratings are a little deflated in the 2100-2400 range, which makes CEGT too deflated.

I know some strong players who have opined that Romichess, for example, is at least FIDE master strength, especially with learning enabled. You could try it on for size yourself, Larry. :)
But the CCRL ratings for Romichess (different versions) are mostly around 2400, which is the IM standard; the FM standard is 2300. So this would contradict what you are saying. We certainly don't want to consider programs with learning for this, since the rating lists presumably test them with learning turned off. I played a couple of fast rapid games with Gaviota 0.80 (an old version, but that should be irrelevant to the accuracy of its rating), whose CCRL rating is 2535, and I just can't believe it would earn the GM title against humans. I don't doubt that later versions were GM strength, but not this one. I'll play other such engines as time permits, but due to my age I'm not a typical player for a given rating. I have the feeling that a typical 2200 or 2300 FIDE player now is a much stronger player than such a player was thirty years ago, but I can't really prove it. I don't think engines from 1990 would perform nearly as well against humans today on the same hardware; everyone just knows so much more about chess now and has so much more practice, as well as knowing how to play against engines.
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Post by carldaman »

lkaufman wrote: Thu Jul 23, 2020 5:14 am
But the CCRL ratings for Romichess (different versions) are mostly around 2400, which is the IM standard; the FM standard is 2300. So this would contradict what you are saying. We certainly don't want to consider programs with learning for this, since the rating lists presumably test them with learning turned off. I played a couple of fast rapid games with Gaviota 0.80 (an old version, but that should be irrelevant to the accuracy of its rating), whose CCRL rating is 2535, and I just can't believe it would earn the GM title against humans. I don't doubt that later versions were GM strength, but not this one. I'll play other such engines as time permits, but due to my age I'm not a typical player for a given rating. I have the feeling that a typical 2200 or 2300 FIDE player now is a much stronger player than such a player was thirty years ago, but I can't really prove it. I don't think engines from 1990 would perform nearly as well against humans today on the same hardware; everyone just knows so much more about chess now and has so much more practice, as well as knowing how to play against engines.
I was mentioning others' impressions about Romichess, with an assessment of FM level at the very least, as a lower boundary. Quite possibly it's stronger, maybe IM, or weak GM, but it would be nice if that could be verified.

Anyway, one can certainly play Romi with learning turned off, but then you might run into some determinism issues.
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Post by Dr.Wael Deeb »

carldaman wrote: Thu Jul 23, 2020 5:57 am
I was mentioning others' impressions about Romichess, with an assessment of FM level at the very least, as a lower boundary. Quite possibly it's stronger, maybe IM, or weak GM, but it would be nice if that could be verified.

Anyway, one can certainly play Romi with learning turned off, but then you might run into some determinism issues.
I'll take on Romi with learning turned on when I reach it on the CCRL rating list, although I doubt it will benefit from this feature, as it will be only 2 games ....

Cheers,
Dr.D
No one can hit as hard as life. But it ain’t about how hard you can hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward….
Modern Times
Posts: 3550
Joined: Thu Jun 07, 2012 11:02 pm

Post by Modern Times »

lkaufman wrote: Thu Jul 23, 2020 3:49 am The CCRL 40/15 list and the CEGT "40/20" (maybe 40/8 or so on modern hardware)
My recollection is this: the CCRL and CEGT 40/40 lists were originally benchmarked on similar Athlon X2 hardware. They renamed theirs a while back to 40/20 to take account of hardware improvements, and we recently renamed ours to 40/15 based on an Intel i7-4770K. So I think they are still broadly similar.
jdart
Posts: 4367
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Post by jdart »

I occasionally get asked what the rating of my engine is in human terms. My standard answer has always been that there really is no accurate answer I can give. If you put one of these engines in a series of FIDE rated tournaments against a variety of human players, you'd get it to converge to a proper FIDE rating. But there is really no reason to expect a good correlation between the CEGT/CCRL list and FIDE ratings, among other reasons because they are completely different rating pools.

--Jon
Rebel
Posts: 6995
Joined: Thu Aug 18, 2011 12:04 pm

Post by Rebel »

jdart wrote: Thu Jul 23, 2020 10:33 am I occasionally get asked what the rating of my engine is in human terms. My standard answer has always been that there really is no accurate answer I can give. If you put one of these engines in a series of FIDE rated tournaments against a variety of human players, you'd get it to converge to a proper FIDE rating. But there is really no reason to expect a good correlation between the CEGT/CCRL list and FIDE ratings, among other reasons because they are completely different rating pools.

--Jon
Totally agree. For example, in 2001 Rebel Century (on a poor Athlon) played a 4-game match at tournament time control (40 moves in 2 hours) against Loek van Wely, rated 2714 and number 10 on the FIDE rating list at that time. The result was 2 wins and 2 losses, 2-2. And yet:

Rebel Century CEGT elo 2379
Rebel Century CCRL elo 2543

The links are:
https://en.wikipedia.org/wiki/Loek_van_Wely
http://rebel13.nl/dos/rebel%20century%204.html

And that's just one example; Junior-Kasparov and Fritz-Kramnik around the same time are others.
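
For a rough sense of how far apart these numbers are, here is a minimal sketch using the standard Elo logistic expected-score formula. It treats the list ratings as if they were FIDE ratings, which is exactly the assumption under dispute, and a 4-game match is far too small a sample to prove anything. The function and variable names are illustrative only.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo logistic expected score (per game) for A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Figures quoted in the post above.
VAN_WELY_FIDE = 2714
LIST_RATINGS = {"CEGT": 2379, "CCRL": 2543}
GAMES = 4
ACTUAL_POINTS = 2.0  # 2 wins, 2 losses


def predicted_points(list_rating: float,
                     opponent: float = VAN_WELY_FIDE,
                     games: int = GAMES) -> float:
    """Points the Elo formula predicts over the match, assuming
    (hypothetically) that the list rating were a FIDE rating."""
    return games * expected_score(list_rating, opponent)


if __name__ == "__main__":
    for name, rating in LIST_RATINGS.items():
        print(f"{name} {rating}: predicted {predicted_points(rating):.2f} "
              f"of {GAMES}, actual {ACTUAL_POINTS}")
```

A 2-2 result is a 50% score, so the performance rating for the match equals the opponent's rating, 2714; the formula predicts well under 2 points for either list rating.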
90% of coding is debugging, the other 10% is writing bugs.
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Post by Vinvin »

At the top, the ratings are about synchronized (3500 for SF 11 4 CPUs).
But the further down the list you go, the bigger the difference is.

And to compare with humans, I'm taking the example of Fritz, which played a lot of games at the highest level.

Kramnik vs Deep Fritz, Bahrain, October 2002 (4 - 4)
Kasparov versus X3D Fritz 2003 (2-2)
Fritz 8 in Bilbao (2004) vs Ponomariov, Karjakin and Topalov (3.5/4)
Fritz 9 in Bilbao (2005) vs Kasimdzhanov, Ponomariov and Khalifman (2/4)
Kramnik versus Deep Fritz, Bonn December 2006 (2 - 4)

CCRL

Deep Fritz 10 4CPU    2830
Fritz 10              2778
Fritz 9               2742
Fritz 8 Bilbao        2700

CEGT

Deep Fritz 10 4CPU    2659
Fritz 10              2622
Fritz 9               2576
Deep Fritz 8 2CPU     2562
Fritz in Bahrain      2524
Fritz 8 Bilbao        2506
Deep Fritz 8 1CPU     2489
Fritz 6               2358

The CCRL ratings are way more realistic than the CEGT ones.
Moreover, if you consider the time control (15 min/40 moves), the rating for Deep Fritz 10 4CPU is probably close to 3000 compared to FIDE.
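
To put a number on the gap for the engines that appear in both tables above, here is a small sketch. The dictionary names are just illustrative, and the ratings are copied from the two lists as quoted.

```python
# Ratings as quoted in the two tables above.
CCRL = {
    "Deep Fritz 10 4CPU": 2830,
    "Fritz 10": 2778,
    "Fritz 9": 2742,
    "Fritz 8 Bilbao": 2700,
}
CEGT = {
    "Deep Fritz 10 4CPU": 2659,
    "Fritz 10": 2622,
    "Fritz 9": 2576,
    "Deep Fritz 8 2CPU": 2562,
    "Fritz in Bahrain": 2524,
    "Fritz 8 Bilbao": 2506,
    "Deep Fritz 8 1CPU": 2489,
    "Fritz 6": 2358,
}

# CCRL minus CEGT, only for engines listed under the same name in both.
gaps = {name: CCRL[name] - CEGT[name] for name in CCRL if name in CEGT}
mean_gap = sum(gaps.values()) / len(gaps)

if __name__ == "__main__":
    for name, gap in gaps.items():
        print(f"{name:<20} {gap:+d}")
    print(f"{'mean':<20} {mean_gap:+.1f}")
```

For these four engines the CCRL figure is 156 to 194 points higher, about 172 on average.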
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Post by Dr.Wael Deeb »

Vinvin wrote: Thu Jul 23, 2020 1:16 pm At the top, the ratings are about synchronized (3500 for SF 11 4 CPUs).
But the further down the list you go, the bigger the difference is.

And to compare with humans, I'm taking the example of Fritz, which played a lot of games at the highest level.

Kramnik vs Deep Fritz, Bahrain, October 2002 (4 - 4)
Kasparov versus X3D Fritz 2003 (2-2)
Fritz 8 in Bilbao (2004) vs Ponomariov, Karjakin and Topalov (3.5/4)
Fritz 9 in Bilbao (2005) vs Kasimdzhanov, Ponomariov and Khalifman (2/4)
Kramnik versus Deep Fritz, Bonn December 2006 (2 - 4)

CCRL

Deep Fritz 10 4CPU    2830
Fritz 10              2778
Fritz 9               2742
Fritz 8 Bilbao        2700

CEGT

Deep Fritz 10 4CPU    2659
Fritz 10              2622
Fritz 9               2576
Deep Fritz 8 2CPU     2562
Fritz in Bahrain      2524
Fritz 8 Bilbao        2506
Deep Fritz 8 1CPU     2489
Fritz 6               2358

The CCRL ratings are way more realistic than the CEGT ones.
Moreover, if you consider the time control (15 min/40 moves), the rating for Deep Fritz 10 4CPU is probably close to 3000 compared to FIDE.
You are probably right ....

I am an unrated, self-taught chess player, but nevertheless I have discovered, and am still discovering, an obvious distortion in the ratings of the chess engines in the lower sections of the CCRL rating list ....

But in general, CCRL is reasonably realistic, as far as realism can be achieved when there are no humans involved ....

Cheers,
Dr.D