Strong engines make more draws

Thomas Mayer
Posts: 385
Joined: Thu Mar 09, 2006 6:45 pm
Location: Nellmersbach, Germany

Re: Strong engines make more draws

Post by Thomas Mayer »

Hi Dann,

In addition, here is the same calculation done with the CEGT 40/4 games:

Elo+/-50        #ofGames        #ofDraw         %

1750            25              5               20,00%
1800            27              3               11,11%
1850                    
1900                    
1950                    
2000                    
2050                    
2100                    
2150            50              2               4,00%
2200                    
2250            3               0               0,00%
2300            212             56              26,42%
2350            294             77              26,19%
2400            1308            326             24,92%
2450            761             196             25,76%
2500            1457            388             26,63%
2550            4635            1294            27,92%
2600            3846            1168            30,37%
2650            5070            1443            28,46%
2700            4766            1449            30,40%
2750            1793            598             33,35%
2800            4925            1711            34,74%
2850            3817            1336            35,00%
2900            1332            477             35,81%
2950            146             68              46,58%
3000            50              20              40,00%
3050                    
3100                    
3150                    
3200                    		
Same tendency, but as I said in my earlier post, there are definitely several possible reasons.

Interesting observation, by the way: in both lists the draw rate when the opponents differ by 51 to 100 Elo is slightly higher than when the Elo difference is 50 or below. (Slightly means about 1%, which isn't enough to be sure, in my opinion.)
Even for a difference of 101 to 150 Elo the draw rate doesn't change much; it starts to fall significantly only when the difference is above 150 Elo. Does anyone know whether this is comparable to humans?
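
A rough way to judge whether a gap of about 1% means anything is to put an error bar on each draw rate. The sketch below uses the normal approximation to the binomial, which is an assumption made here (it treats games as independent and is not how the table was produced). The function name `draw_rate_interval` is made up for illustration; the two bands are copied from the CEGT 40/4 table above.

```python
import math

def draw_rate_interval(draws, games, z=1.96):
    """Return (rate, low, high) using the normal approximation to the binomial."""
    rate = draws / games
    half_width = z * math.sqrt(rate * (1.0 - rate) / games)
    return rate, max(0.0, rate - half_width), min(1.0, rate + half_width)

# Two bands copied from the CEGT 40/4 table above: band -> (draws, games).
bands = {2850: (1336, 3817), 2950: (68, 146)}

for elo, (draws, games) in sorted(bands.items()):
    rate, low, high = draw_rate_interval(draws, games)
    print(f"{elo}: {rate:.2%} (95% interval {low:.2%} to {high:.2%})")
```

Even with about 3800 games the interval is roughly 1.5 percentage points either side of the measured rate, and with 146 games it is several times wider, so a 1% difference between two bins is hard to separate from noise.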

Greets, Thomas
Kirill Kryukov
Posts: 518
Joined: Sun Mar 19, 2006 4:12 am
Full name: Kirill Kryukov

Re: Strong engines make more draws

Post by Kirill Kryukov »

towforce wrote:
smirobth wrote:
towforce wrote:
Kirill Kryukov wrote:Logically it is to be expected, but still it's nice to see some data.
This provides handy evidence in support of my "chess might be almost solved" assertion (click here).
To my way of thinking this result doesn't provide any support for your assertion. Rather it seems to support what most people already believe, the game theoretic result of chess with perfect play is a draw.
Robin,

The two assertions are mutually inclusive!

One cannot assert that stronger chess engines obtaining more draws is evidence of chess being a theoretical draw unless one also accepts that chess computers are getting close to playing "near-perfect" chess as defined here.
Yes, I think these draw rates support the theory that chess is a theoretical draw. But I don't think this data does much to help us estimate how close we are to perfect (or near-perfect) chess.

Hm... Perhaps by plotting draw rates versus engine strength we could see the shape of a curve? Then, by extending it towards larger ratings, we could guess how soon we can reach "near-perfect" chess, or how close we will get at a certain Elo.
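
As a minimal sketch of that idea, one could fit a straight line to draw rate versus rating band and extend it. The straight line is purely an assumption: the real curve has to flatten before 100%, and list ratings are relative, so the extrapolated numbers should not be read as predictions. The rows are copied from the CEGT 40/4 table earlier in the thread (only bands with more than 1000 games); `fit_line` and `predicted_draw_rate` are names invented here.

```python
# (elo_band, games, draws) copied from the CEGT 40/4 table earlier in the thread,
# keeping only bands with more than 1000 games.
ROWS = [
    (2400, 1308, 326),
    (2500, 1457, 388),
    (2550, 4635, 1294),
    (2600, 3846, 1168),
    (2650, 5070, 1443),
    (2700, 4766, 1449),
    (2750, 1793, 598),
    (2800, 4925, 1711),
    (2850, 3817, 1336),
    (2900, 1332, 477),
]

def fit_line(points):
    """Ordinary (unweighted) least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

points = [(elo, draws / games) for elo, games, draws in ROWS]
slope, intercept = fit_line(points)

def predicted_draw_rate(elo):
    """Linear extrapolation, clamped to [0, 1]; an assumption, not a model of chess."""
    return min(1.0, max(0.0, slope * elo + intercept))

print(f"slope: {slope * 100:.2f} percentage points per 100 Elo")
for elo in (3000, 3200, 3400):
    print(elo, f"{predicted_draw_rate(elo):.1%}")
```

On these rows the fitted slope is on the order of two percentage points per 100 Elo. A weighted fit or a curve that saturates would be a more defensible choice if the shape of the curve is what matters.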
Kirill Kryukov
Posts: 518
Joined: Sun Mar 19, 2006 4:12 am
Full name: Kirill Kryukov

Re: Strong engines make more draws

Post by Kirill Kryukov »

Uri Blass wrote:
Kirill Kryukov wrote:Logically it is to be expected, but still it's nice to see some data.

This draw percentage table is constructed for my selection of well-tested free single-CPU engines tested under CCRL 40/4 conditions.

You can scroll around or use the zoom function if your browser has it. Although there is a large variation from one engine pair to another, overall it is clear that stronger engines make many more draws.

This is one of the reasons why I am becoming more interested in weaker engines recently - the games are more interesting to watch. :-)

I see that most discussions are about stronger engines. Do you think the quality of games played by stronger engines makes up for the number of draws they make?

Best, Kirill
I think that some weak engines also make more draws when they play against engines of similar strength (mainly weak engines that are unable to detect repetition, so they get a better position only to allow the opponent to draw by repetition).

I think it is better simply to drop engines that do this from the rating list.

I believe that they are responsible for the fact that the ratings of very weak engines are too high: if an engine plays like 2300 in the middlegame but allows draws by repetition because it has no hash tables, then it may perform almost like 2300 against 2500 players, while the same engine may concede many draws against a 1900 engine and perform like 2100 against it.

In practice, the engine may score 25% against 2500 and 75% against 1900.

If we decide to give the 2500 engine a rating of 2500, then the 1900 engine may get a rating near 2100 based only on this information.

I think that we need some rule that only stable engines that do not often show stupid bugs are allowed to enter the rating lists.

Uri
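
The arithmetic in Uri's example can be checked with the standard logistic Elo formula. This is only a sketch: rating lists are computed with their own tools, which need not match this formula exactly, and the function names are chosen here for illustration.

```python
import math

def expected_score(rating_diff):
    """Expected score for a player rated `rating_diff` Elo above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))

def performance_rating(opponent_rating, score):
    """Rating at which the logistic Elo curve predicts `score` (0 < score < 1)."""
    return opponent_rating + 400.0 * math.log10(score / (1.0 - score))

# The repetition-prone engine scores 25% against a 2500 opponent...
buggy_engine = performance_rating(2500, 0.25)
print(round(buggy_engine))                      # 2309
# ...and 75% against a 1900 opponent.
print(round(performance_rating(1900, 0.75)))    # 2091

# Anchoring the strong engine at 2500, the 1900-strength engine scores 25%
# against the buggy one, so it lands about 191 Elo below it.
weak_engine = performance_rating(buggy_engine, 0.25)
print(round(weak_engine))                       # 2118
```

So a 25% score corresponds to roughly 191 Elo below the opponent, which gives "almost like 2300" against 2500 and a rating near 2100 for the engine that really plays at about 1900.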
I think no engine should be punished for having a drawish style. As long as an engine is stable, it should be allowed to make any legal moves it likes. Whether it prefers draws or not, its rating can still be determined (assuming it has no learning).

BTW, please note that my own testing strategy is to test each engine against both stronger and weaker opponents (at least the 16 closest stronger and the 16 closest weaker opponents).
Kirill Kryukov
Posts: 518
Joined: Sun Mar 19, 2006 4:12 am
Full name: Kirill Kryukov

Re: Strong engines make more draws

Post by Kirill Kryukov »

Thomas Mayer wrote:
here is a list which I created from the complete PGN of CCRL 40/4:

Rating  #ofGames        #ofDraws        %
1750                    
1800                    
1850    16              3               18,75%
1900    16              5               31,25%
1950    1               0               0,00%
2000    87              16              18,39%
2050    16              4               25,00%
2100    33              7               21,21%
2150    77              16              20,78%
2200    5               1               20,00%
2250                    
2300    187             47              25,13%
2350    688             159             23,11%
2400    79              15              18,99%
2450    538             158             29,37%
2500    796             209             26,26%
2550    856             251             29,32%
2600    3415            1054            30,86%
2650    5192            1590            30,62%
2700    5458            1768            32,39%
2750    4125            1640            39,76%
2800    1166            407             34,91%
2850    8110            2963            36,54%
2900    4705            1801            38,28%
2950    3075            1254            40,78%
3000    3865            2054            53,14%
3050    214             99              46,26%
3100    61              24              39,34%
3150    1023            737             72,04%
3200                    

Tot     43804           16282           37,17%
well, it seems to support Kirill's opinion. But there are some things we shouldn't forget:
a) Lower-rated engines may have opening books of different quality, which might lead to stranger lines.
Thanks for counting, Thomas! I personally use the same book for weak and strong engines. However, other testers who use different books may tend to test more with stronger or weaker engines, and that could in theory create a slight shift.
Thomas Mayer wrote:b) Especially in the higher Elo ranges we have a very low number of engine families, e.g. in the 3150 band AFAIK only Rybka. The last number in particular seems to support my opinion that a large number of engines with similar roots also leads to a higher number of draws. IMO there is no doubt that the variety is bigger in the lower ranges.

So after all I think that Kirill's observation might have several causes, maybe not only strength.

Greets, Thomas
Yeah, the second reason is a possibility if you use all games. When two versions of the same engine play each other, the draw rate is known to be high. To take this into account, only games where the two sides are from different families should be counted for the draw rate. I may do it when I have some time. Now I have become curious about this too.
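
A minimal sketch of such a filter follows. It assumes each game is available as a (white, black, result) tuple and that an engine's family can be guessed by stripping trailing version tokens from its name; both are assumptions, as are the function names and the made-up engine names. Derivatives released under a different name, or versions that a tester treats as separate engines, would need a hand-written mapping instead.

```python
def family(engine_name):
    """Crude family key (assumption): the name with trailing version tokens removed."""
    tokens = engine_name.split()
    while len(tokens) > 1 and any(ch.isdigit() for ch in tokens[-1]):
        tokens.pop()
    return " ".join(tokens).lower()

def cross_family_draw_rate(games):
    """games: iterable of (white, black, result), result as in PGN ('1/2-1/2' for a draw)."""
    total = draws = 0
    for white, black, result in games:
        if family(white) == family(black):
            continue  # skip games between two versions of the same engine
        total += 1
        if result == "1/2-1/2":
            draws += 1
    return draws / total if total else None

# Made-up engines and results, only to exercise the function.
sample = [
    ("Alpha 1.0", "Alpha 2.0", "1/2-1/2"),
    ("Alpha 2.0", "Beta 3.1", "1-0"),
    ("Gamma 0.9", "Beta 3.1", "1/2-1/2"),
    ("Gamma 0.9", "Gamma 1.1", "1/2-1/2"),
]
print(cross_family_draw_rate(sample))  # 0.5
```

Running the same count with and without the filter, band by band, would show how much of the rise in draws at the top comes from same-family pairings.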

The selection of free engines I linked in my first post does not have this problem because all engines there are from different families. (With the exception of Glaurung 2.0.1 and Glaurung 1.2.1, which I consider to be two different engines.)

Actually there is one more problem in using all games for draw rate counting. Commercial engines (which are at the top of the list) are developed in a slightly different way from amateur (free) engines. Commercial engine authors are very much motivated to create the strongest engine possible, by whatever means are available to them. So borrowing ideas from other engines (Fruit in particular) is a very common trend, because it is productive in creating a stronger engine. This makes a number (though not all) of the commercial engines play in very similar styles. (I won't go into detail here; this is just my personal observation.)

On the other hand, amateur engine authors are more often trying to work out their own ideas. This means that their engines are not necessarily the strongest they could achieve, but they have more diversity in ideas and playing style. So using only free engines for draw rate calculation seems reasonable to me.
Dann Corbit
Posts: 12815
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Strong engines make more draws

Post by Dann Corbit »

Kirill Kryukov wrote:
towforce wrote:
smirobth wrote:
towforce wrote:
Kirill Kryukov wrote:Logically it is to be expected, but still it's nice to see some data.
This provides handy evidence in support of my "chess might be almost solved" assertion (click here).
To my way of thinking this result doesn't provide any support for your assertion. Rather it seems to support what most people already believe, the game theoretic result of chess with perfect play is a draw.
Robin,

The two assertions are mutually inclusive!

One cannot assert that stronger chess engines obtaining more draws is evidence of chess being a theoretical draw unless one also accepts that chess computers are getting close to playing "near-perfect" chess as defined here.
Yes, I think these draw rates support the theory that chess is a theoretical draw. But I don't think this data does much to help us estimate how close we are to perfect (or near-perfect) chess.

Hm... Perhaps by plotting draw rates versus engine strength we could see the shape of a curve? Then, by extending it towards larger ratings, we could guess how soon we can reach "near-perfect" chess, or how close we will get at a certain Elo.
To solve the game, we do not have to play perfect chess, only to *prove* what the outcome of the game is (won, lost or drawn for White from the opening position).

Playing perfect chess will not solve the game unless we can *prove* that it is perfect chess.

So I don't think that the draw tendency helps either goal, other than by providing empirical evidence that seems to support it. But such information can only be used in an inductive and not a deductive manner.
Kirill Kryukov
Posts: 518
Joined: Sun Mar 19, 2006 4:12 am
Full name: Kirill Kryukov

Re: Strong engines make more draws

Post by Kirill Kryukov »

Hi Thomas!
Thomas Mayer wrote:Interesting observation, by the way: in both lists the draw rate when the opponents differ by 51 to 100 Elo is slightly higher than when the Elo difference is 50 or below. (Slightly means about 1%, which isn't enough to be sure, in my opinion.)
Even for a difference of 101 to 150 Elo the draw rate doesn't change much; it starts to fall significantly only when the difference is above 150 Elo.
Very interesting! If you have the data showing this, could you post the numbers? Thanks!
smirobth
Posts: 2307
Joined: Wed Mar 08, 2006 8:41 pm
Location: Brownsville Texas USA

Re: Strong engines make more draws

Post by smirobth »

towforce wrote:
smirobth wrote:
towforce wrote:
Kirill Kryukov wrote:Logically it is to be expected, but still it's nice to see some data.
This provides handy evidence in support of my "chess might be almost solved" assertion (click here).
To my way of thinking this result doesn't provide any support for your assertion. Rather it seems to support what most people already believe, the game theoretic result of chess with perfect play is a draw.
Robin,

The two assertions are mutually inclusive!

One cannot assert that stronger chess engines obtaining more draws is evidence of chess being a theoretical draw unless one also accepts that chess computers are getting close to playing "near-perfect" chess as defined here.
As computers get stronger they make mistakes less often. But that does not necessarily translate into them getting 'close to playing "near-perfect" chess', merely into stronger chess than they previously played. Closer to perfect is not at all the same thing as close to perfect, and there are clearly still positions that occur relatively frequently in actual games where computers persist in showing very poor understanding and play (fortresses are a prime example). It seems this situation is unlikely to change dramatically any time soon, and until it does, claims that computers are near playing perfect chess are very premature, IMO.
- Robin Smith
Uri Blass
Posts: 11153
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Strong engines make more draws

Post by Uri Blass »

Kirill Kryukov wrote:
Uri Blass wrote:
Kirill Kryukov wrote:Logically it is to be expected, but still it's nice to see some data.

This draw percentage table is constructed for my selection of well-tested free single-CPU engines tested under CCRL 40/4 conditions.

You can scroll around or use the zoom function if your browser has it. Although there is a large variation from one engine pair to another, overall it is clear that stronger engines make many more draws.

This is one of the reasons why I am becoming more interested in weaker engines recently - the games are more interesting to watch. :-)

I see that most discussions are about stronger engines. Do you think the quality of games played by stronger engines makes up for the number of draws they make?

Best, Kirill
I think that some weak engines also make more draws when they play against engines of similar strength (mainly weak engines that are unable to detect repetition, so they get a better position only to allow the opponent to draw by repetition).

I think it is better simply to drop engines that do this from the rating list.

I believe that they are responsible for the fact that the ratings of very weak engines are too high: if an engine plays like 2300 in the middlegame but allows draws by repetition because it has no hash tables, then it may perform almost like 2300 against 2500 players, while the same engine may concede many draws against a 1900 engine and perform like 2100 against it.

In practice, the engine may score 25% against 2500 and 75% against 1900.

If we decide to give the 2500 engine a rating of 2500, then the 1900 engine may get a rating near 2100 based only on this information.

I think that we need some rule that only stable engines that do not often show stupid bugs are allowed to enter the rating lists.

Uri
I think no engine should be punished for having a drawish style. As long as an engine is stable, it should be allowed to make any legal moves it likes. Whether it prefers draws or not, its rating can still be determined (assuming it has no learning).

BTW, please note that my own testing strategy is to test each engine against both stronger and weaker opponents (at least the 16 closest stronger and the 16 closest weaker opponents).
The main problem is that engines with serious bugs distort the rating list.

As an extreme case: if an engine draws all its games against opponents rated 2000-2400 by forcing repetition in better positions, then the ratings of the 2000 players go up and the ratings of the 2400 players go down.

This does not happen with humans; no human will often force repetition in superior positions as counter0.1 does (note that counter0.1 is not tested by CCRL, but I am not sure that there are no engines with the same problem).

I think that we need engines with no serious bugs to have a reliable rating list for the weak engines, and a good idea may be to use Rybka or other strong programs at fixed depth.

The only problem is how to play games between an engine at fixed depth and engines with a specific time control (it may be interesting to have CCRL ratings for Rybka at different depths, and such games take very little time when we use small depths).

I believe that Rybka2.3.2a at depth 11 may get a blitz CCRL rating above 2700 while using less than 4 minutes per 40 moves in most cases.

Games at smaller depths can be played even faster and can quickly give reliable ratings for those depths, which may in turn help to get more reliable ratings for the weak engines.

If the problem of playing fixed depth against a specific time control is solved, then I suggest including
Rybka2.3.2a depth 11
Rybka2.3.2a depth 10
Rybka2.3.2a depth 9
Rybka2.3.2a depth 8
and other strong engines (not Rybka) at fixed depth in both the CCRL 40/4 and CCRL 40/40 lists.

It may also give us better data about the rating difference between 40/4 and 40/40 (of course I expect the same entry at fixed depth to have a lower rating at 40/40, but the question is how much lower).

Uri
towforce
Posts: 12760
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: Strong engines make more draws

Post by towforce »

smirobth wrote:
towforce wrote:
smirobth wrote:
towforce wrote:
Kirill Kryukov wrote:Logically it is to be expected, but still it's nice to see some data.
This provides handy evidence in support of my "chess might be almost solved" assertion (click here).
To my way of thinking this result doesn't provide any support for your assertion. Rather it seems to support what most people already believe, the game theoretic result of chess with perfect play is a draw.
Robin, the two assertions are mutually inclusive! One cannot assert that stronger chess engines obtaining more draws is evidence of chess being a theoretical draw unless one also accepts that chess computers are getting close to playing "near-perfect" chess as defined here.
As computers get stronger they make mistakes less often. But that does not necessarily translate into them getting 'close to playing "near-perfect" chess', merely into stronger chess than they previously played. Closer to perfect is not at all the same thing as close to perfect, and there are clearly still positions that occur relatively frequently in actual games where computers persist in showing very poor understanding and play (fortresses are a prime example). It seems this situation is unlikely to change dramatically any time soon, and until it does, claims that computers are near playing perfect chess are very premature, IMO.
If what you say is correct, then you have also attacked your own assertion that stronger chess computers getting more draws is evidence of chess being a theoretical draw.

But actually, I don't agree that you are correct: with extra depth of search, two things will happen:

1. The number of positions where computers make mistakes will continue to fall away (as suddenly they see far enough ahead to see the mistake)

2. It becomes less easy to get them into positions where they can make these mistakes
Human chess is partly about tactics and strategy, but mostly about memory
Thomas Mayer
Posts: 385
Joined: Thu Mar 09, 2006 6:45 pm
Location: Nellmersbach, Germany

Re: Strong engines make more draws

Post by Thomas Mayer »

Hi Kirill,
Kirill Kryukov wrote:Hi Thomas!
Thomas Mayer wrote:Interesting observation, by the way: in both lists the draw rate when the opponents differ by 51 to 100 Elo is slightly higher than when the Elo difference is 50 or below. (Slightly means about 1%, which isn't enough to be sure, in my opinion.)
Even for a difference of 101 to 150 Elo the draw rate doesn't change much; it starts to fall significantly only when the difference is above 150 Elo.
Very interesting! If you have the data showing this, could you post the numbers? Thanks!
Thank god I don't have to do the counting myself; a nice little Excel script written on the fly does the job. I exported everything I got into some HTML files.
Just a note for clarification: in the top row, 50 means an Elo difference of 0-49, 100 means 50-99, 150 means 100-149, and so on.
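
The binning described in the note can be sketched in a few lines. This is a Python stand-in, not the Excel script mentioned above, and it assumes each game is given as (white_elo, black_elo, result); the function names and the sample games are made up for illustration.

```python
from collections import defaultdict

def diff_bin(white_elo, black_elo, width=50):
    """Upper-edge label of the bin: 0-49 -> 50, 50-99 -> 100, 100-149 -> 150, ..."""
    return (abs(white_elo - black_elo) // width + 1) * width

def draw_rate_by_diff(games, width=50):
    """Map each bin label to (games, draws, draw_rate)."""
    counts = defaultdict(lambda: [0, 0])  # label -> [games, draws]
    for white_elo, black_elo, result in games:
        entry = counts[diff_bin(white_elo, black_elo, width)]
        entry[0] += 1
        entry[1] += result == "1/2-1/2"
    return {label: (n, d, d / n) for label, (n, d) in sorted(counts.items())}

# Made-up games, only to show the shape of the output.
sample = [
    (2600, 2580, "1/2-1/2"),
    (2600, 2570, "1-0"),
    (2700, 2630, "1/2-1/2"),
    (2500, 2700, "0-1"),
]
for label, (n, d, rate) in draw_rate_by_diff(sample).items():
    print(label, n, d, f"{rate:.0%}")
```

Labelling each bin by its upper edge matches the column headings described in the note, so a difference of exactly 50 falls into the column labelled 100.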

Here are the links (attention: huge tables!):

For CCRL 40/04: http://www.quarkchess.de/quark/CCRL-404 ... istics.htm
For CEGT 40/04: http://www.quarkchess.de/quark/CEGT-404 ... istics.htm
For CEGT 40/20: http://www.quarkchess.de/quark/CEGT-402 ... istics.htm

It isn't easy to interpret the material. Any idea whether I should present it differently? I have tried a diagram, but you can't really read anything out of it... So for now, just the raw data.

Greets, Thomas