Estimated Elo Perfect Play

megamau
Posts: 37
Joined: Wed Feb 10, 2016 6:20 am
Location: Singapore

Estimated Elo Perfect Play

Post by megamau »

I have done some analysis of the CCRL rating list data, correlating the draw rate with the strength of the programs.

As Elo increases, so does the draw rate, which supports the hypothesis that the game is a draw with perfect play. Under the same hypothesis, we can extrapolate the draw-rate curve (to the point where it reaches 100%) and infer what the Elo rating of perfect play should be.

In this case, I have filtered out the programs that played much stronger or much weaker opponents on average (as this would skew the draw rate down). Only programs whose average opposition is within 20 Elo of their own rating have been considered.

With all the caveats of extrapolation, it seems that perfect play is nearer than I thought, somewhere between 3800 and 5000 Elo.

Image
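
A rough sketch of this kind of extrapolation, for anyone who wants to try it on their own export of the list. It assumes a straight-line fit of draw rate against Elo, which is only one possible choice (the post does not say which curve was fitted). The field names `elo`, `avg_opp_elo` and `draw_rate` are hypothetical, and the numbers are made-up placeholders, not CCRL data.

```python
# Hypothetical sketch: the entries below are placeholders, NOT real CCRL data.

def filter_balanced(entries, max_gap=20):
    """Keep programs whose average opposition is within max_gap Elo of their own rating."""
    return [e for e in entries if abs(e["elo"] - e["avg_opp_elo"]) <= max_gap]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def elo_at_full_draw_rate(entries, max_gap=20):
    """Extrapolate the fitted line to a draw rate of 1.0 (assumed linear trend)."""
    kept = filter_balanced(entries, max_gap)
    slope, intercept = fit_line(
        [e["elo"] for e in kept], [e["draw_rate"] for e in kept]
    )
    return (1.0 - intercept) / slope


entries = [
    {"elo": 2400, "avg_opp_elo": 2410, "draw_rate": 0.30},
    {"elo": 2800, "avg_opp_elo": 2790, "draw_rate": 0.40},
    {"elo": 3200, "avg_opp_elo": 3205, "draw_rate": 0.50},
    # Dropped by the filter: average opposition is 300 Elo weaker.
    {"elo": 3000, "avg_opp_elo": 2700, "draw_rate": 0.20},
]
print(round(elo_at_full_draw_rate(entries)))
```

The result is very sensitive to the curve chosen, which is presumably why the estimate spans a range as wide as 3800 to 5000.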
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Estimated Elo Perfect Play

Post by Laskos »

Yes, your result (3800 to 5000 CCRL) is very close to what I got doing a different extrapolation (ELO gain per doubling time). IIRC my results were a bit closer to 5000 than to 3800.
yurikvelo
Posts: 710
Joined: Sat Dec 06, 2014 1:53 pm

Re: Estimated Elo Perfect Play

Post by yurikvelo »

I have a different interpretation of the reasons for draws, and a method to verify or reject it.

The draw rate of 2500 engine A vs 2500 engine B might be low because they have different strengths and different weaknesses (it's like in triathlon: two athletes finish with the same time, but one is a very bad swimmer and the other is a very bad runner).
Weak engines are usually immature: they lack implementation of various kinds of knowledge and lack testing under different conditions.
They usually behave quite differently under different conditions, so there is a high probability that a strength of engine A meets a weakness of engine B in a particular game.

Strong engines aren't only strong; historically they have been developed using the very same ideas, techniques and testing methods. They might draw not because of perfect play, but because they exploit the very same weaknesses in all types of positions.

To verify my hypothesis, compare the draw rate of the same pair of engines (e.g. latest K vs SF) at 0.1 MN (million nodes) per move and at 100 MN per move (HW+TC combination). This is about +500...600 Elo and should give a 20-30% increase in draw rate.

Also, I suggest that if someone developed an engine of the same strength as SF/K but employing extremely different techniques, e.g. the AlphaGo approach, the draw rate might be much lower despite equal overall Elo strength.
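
A minimal sketch of how the two draw rates from such a test could be compared, assuming the results of each match are available as a list of PGN-style result strings. The function names and the sample lists are hypothetical placeholders, not measured games; the standard error uses the usual normal approximation for a difference of two proportions.

```python
import math


def draw_rate(results):
    """Fraction of games that ended '1/2-1/2'."""
    results = list(results)
    return sum(r == "1/2-1/2" for r in results) / len(results)


def draw_rate_difference(fast_results, slow_results):
    """Return (slow - fast draw rate, standard error of that difference)."""
    p_fast = draw_rate(fast_results)
    p_slow = draw_rate(slow_results)
    n_fast, n_slow = len(fast_results), len(slow_results)
    se = math.sqrt(
        p_fast * (1 - p_fast) / n_fast + p_slow * (1 - p_slow) / n_slow
    )
    return p_slow - p_fast, se


# Placeholder results only: 100 games at the low node budget, 100 at the high one.
fast = ["1/2-1/2"] * 40 + ["1-0"] * 30 + ["0-1"] * 30
slow = ["1/2-1/2"] * 70 + ["1-0"] * 15 + ["0-1"] * 15

diff, se = draw_rate_difference(fast, slow)
print(f"draw rate change: {diff:+.2f} (standard error {se:.3f})")
```

With only a few hundred games the standard error is several percentage points, so a large number of games at each node budget would be needed to tell a 20% increase from a 30% one.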
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Estimated Elo Perfect Play

Post by cdani »

yurikvelo wrote:[…]

I think the same as you. Probably the best engines in a few years will exploit at least some different areas of play, and consequently push back how far away the horizon seems to be.
gerold
Posts: 10121
Joined: Thu Mar 09, 2006 12:57 am
Location: van buren,missouri

Re: Estimated Elo Perfect Play

Post by gerold »

yurikvelo wrote:[…]
Plus one.
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: Estimated Elo Perfect Play

Post by Ozymandias »

yurikvelo wrote:[…]
I've heard this sort of argument before, in some form or another, but still haven't seen any indication of a paradigm shift. Most probably because it can't be done.
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Estimated Elo Perfect Play

Post by cdani »

Ozymandias wrote: I've heard this sort of argument before, in some form or another, but still haven't seen any indication of a paradigm shift. Most probably because it can't be done.
I can propose, for example, that in many games there should be some lines that can win a position, but as they are not related in any way to tactics or typical weaknesses known by engines, but only to deep ideas, the engines simply cannot see them. So something big, or a lot of little improvements, would be needed to reach this.
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: Estimated Elo Perfect Play

Post by Ozymandias »

cdani wrote:[…] in many games there should be some lines that can win a position, but as they are not related in any way to tactics or typical weaknesses known by engines, but only to deep ideas, the engines simply cannot see them.
I guess you mean even deeper ideas, because engines already handle pretty deep "ideas" nowadays. That could happen, but you'd need very complex positions for those ideas to thrive, and books already take care of those. I hope there are some uncharted ones left, but I'm afraid they're nowhere near "many".
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: Estimated Elo Perfect Play

Post by mjlef »

Laskos wrote:Yes, your result (3800 to 5000 CCRL) is very close to what I got doing a different extrapolation (ELO gain per doubling time). IIRC my results were a bit closer to 5000 than to 3800.
Andreas at fastgm.de gathered some data here using Komodo 9.3:

http://fastgm.de/time-control4.html

There is a clear decline in Elo gain from successive doublings. Of course it all depends on what kind of curve you fit to the data, but it seems that, at least for Komodo 9.3, you would eventually reach a point where more CPU power/time would not help improve Elo.

Then again, this is just one program. But interesting.

Mark
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Estimated Elo Perfect Play

Post by Laskos »

mjlef wrote:[…]
Thanks, very interesting, cleaner results than mine (more games, up to longer time controls).

Several programs seem to converge to a similar limiting value; I did tests with Houdini, Komodo and SF.

From Andreas's results, a rough extrapolation gives a limiting value of about 4400-4800 Elo points on CCRL 40/40, very similar to my results.
Although some may doubt the extrapolation, if one assumes that chess from the standard opening position is a draw, then the extrapolation is natural and straightforward.
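
To illustrate why shrinking gains per doubling imply a finite ceiling, here is a small sketch. It assumes the gain shrinks by a constant factor with every doubling (a geometric decay), which is only one of the curves one could fit and is not necessarily the one used above. The function names, the gains and the starting rating are placeholders, not the fastgm.de measurements.

```python
def estimate_ratio(gains):
    """Average ratio between successive Elo gains per doubling."""
    ratios = [later / earlier for earlier, later in zip(gains, gains[1:])]
    return sum(ratios) / len(ratios)


def limiting_elo(current_elo, next_gain, ratio):
    """Ceiling if future gains follow next_gain * (1 + r + r^2 + ...).

    The geometric series sums to next_gain / (1 - r) when 0 <= r < 1.
    """
    if not 0 <= ratio < 1:
        raise ValueError("ratio must be in [0, 1) for the gains to converge")
    return current_elo + next_gain / (1 - ratio)


# Placeholder numbers: Elo gained by each of the last three doublings.
gains = [80.0, 64.0, 51.2]
ratio = estimate_ratio(gains)
next_gain = gains[-1] * ratio

# 3300 is an arbitrary placeholder for the rating at the longest tested time control.
print(round(limiting_elo(3300, next_gain, ratio), 1))
```

If the fitted ratio comes out at 1 or above, the gains do not converge under this model and no ceiling can be read off, which is the "depends on the curve you fit" caveat in practice.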