Difference between computer and human chess players

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Dann Corbit, Harvey Williamson

Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Difference between computer and human chess players

Post by Frank Quisinsky »

Hi Larry,

OK, with Wasp I have what I was looking for here.
But I also like Komodo, Slow, Booot, Revenge, Rubi and so many others.
Having more engines like this, which can reduce their strength in this tricky way with a human-like style, would be great.
It is very important that the strongest engines, and Komodo is one of them, can do it.

For a while now, another question has perhaps been more interesting.

As far as I know, Kortschnoj played more than 3,300 tournament games.
With AI it should be possible to simulate the style of great players ... 3,300 games = x moves.
That should be enough for a simulation.

Mr. Kasparow could then follow games played by an engine that clones his own style.

:-)

And so could we, for sure!
A dream of mine, because we could mix the styles of different players in one engine.
Example: Tal, Shirow, Christiansen and other great attackers.

Best
Frank
lkaufman
Posts: 5942
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Difference between computer and human chess players

Post by lkaufman »

The biggest difference between Dragon 2.5 and Dragon 2.6 (or 2.6.1) is that 2.6 has a much stronger, "smarter" net, trained in a much more intelligent way that considers game stage. So probably it plays the endgame much better. I don't know why that would lead to longer games, but perhaps because it is much better in the endgame it is more prone to simplify to slightly better endgames and play on forever trying to win them. Rather like Carlsen does, though I'm sure you would say that even Carlsen wouldn't play out many of those endgames.

It is quite easy to make the elo levels relatively stronger in the middlegame and weaker in the endgame; all that's needed is to use depth rather than nodes, as indeed we used to do with our skill levels. In the endgame you can reach much greater depths with the same nodes or time, so cutting off by depth would do what you want. But I'm not convinced that this would be better. From what I've seen, games between humans and the corresponding Elo levels of Dragon tend to be won by Dragon in the middlegame, not in the endgame. I'm talking about games with reasonably strong players, like maybe 2000 elo or better; this might be different at 1000 elo level. Dragon seems to be generally more accurate tactically than equally rated (strong) players, even though Dragon is only doing maybe 4 or 5 ply searches (depending on the elo set). I haven't seen enough games even reach the endgame to make a judgment on relative endgame play. Anyway, if people show me game scores where they outplayed Dragon at the same elo setting (or at least close) in the middlegame but then lost in the endgame, I might be convinced. At least, now I'll pay attention to this issue.
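Larry's point, that a fixed node budget reaches much deeper in the endgame than in the middlegame while a depth cutoff does not, can be sketched with a back-of-envelope calculation. The branching factors below are rough illustrative assumptions, not measured values:

```python
import math

def reachable_depth(node_budget: int, branching_factor: float) -> int:
    """Deepest full-width depth d such that branching_factor**d <= node_budget."""
    return int(math.log(node_budget, branching_factor))

NODES = 1_000_000  # same node budget in both game phases

# assumed average legal-move counts (illustration only)
middlegame_depth = reachable_depth(NODES, 30)  # ~30 moves available per position
endgame_depth = reachable_depth(NODES, 8)      # ~8 moves available per position

print(middlegame_depth, endgame_depth)  # the endgame search goes noticeably deeper
```

With a depth cutoff, both phases stop at the same depth, so the extra endgame depth that a node-limited engine gets for free disappears. That is why depth-based levels weaken the endgame relative to the middlegame.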
Komodo rules!
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Difference between computer and human chess players

Post by Frank Quisinsky »

Hi Larry,

About the net:
So the power of the stronger net is the whole explanation!
I often thought it could not be the net alone.
But in reality it is.
Without the nets it was easier to find out a bit about engines with statistics.
It is more and more complex today!

Thank you!

And about the playing strength ...
Your explanation is strong, I can understand it.

My explanation was not good enough.
Briefly, once again ...

Adjusting the human-like style is only possible if we start from the maximum strength of the top human level.
How can we simulate the 2850 Elo the World Champion has?

And here I am quite sure that the best chess programs are too strong in game phase 3 (the transition into the endgame) and game phase 4 (the endgame).

If we can simulate exactly the strength the best chess players have, we can then reduce the strength to all the other levels, 2700, 2600, 2500 ... 1500, with nodes per second.

A good example of unbalanced strength is how the Novag Super Constellation played in the past. Its endgame is maybe 1000 Elo, the transition into the endgame maybe 1200 Elo, and the middlegame maybe 1900 Elo, with the final result that this old chess computer produced 1550-1600 Elo. Completely unrealistic compared to what humans produce, but great for testing back when Wasp offered Elo levels for the first time.

Wasp at 1500 Elo vs. Super Constellation in a match:
Super Constellation is usually stronger than Wasp in the middlegame, but once the game goes into the endgame the difference is enormous. Super Conny has no chance in endgames. And all this at 35 nodes per second. I think: super ... exactly this is right. If I play against Wasp at 2200 Elo I score around 30%. I have the feeling that a stronger human is sitting on the other side. I can only rarely beat the 2200 Elo level in the middlegame or in the endgame. Most games I lose to blunders in the middlegame. Most of my wins come in the late middlegame with very aggressive chess. So for me everything is OK with the "human-like style".

Back to Dragon:
The difference from the other chess programs is the gigantic strength Dragon has exactly in playing phase 3 (the transition into the endgame, or the very late middlegame). In my opinion Dragon is the number 1 in the world here (around 30-50 Elo stronger than Stockfish). Mind you, my stats need not be right.

I believe 100% what you wrote here if you are speaking about 2000 Elo. I think adjusting Elo strength with a human-like style ... in the case of Komodo ... is a real challenge. Again, the gigantic strength Dragon has exactly in playing phase 3 can be a bit of a "sticking point".

We handled this rather stepmotherly back when the commercial Ruffian was available.

I gave the programmers the requirement:
I would only make a commercial release if Ruffian could reduce its own strength via a UCI feature.

For some days I discussed with PerOla and Martin Blume how we could do it.
Martin had a great idea and PerOla implemented it. The final result was close to OK, and better than having nothing.
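For reference, the standard UCI way to do this today is the `UCI_LimitStrength` / `UCI_Elo` option pair; whether Ruffian's old feature looked like this is my assumption. A minimal sketch of the option dialogue a GUI would send, as raw UCI text:

```python
# Raw UCI commands a GUI sends to cap an engine's strength.
# "UCI_LimitStrength" and "UCI_Elo" are the standard UCI option names;
# the Elo value itself is arbitrary.
def limit_strength_commands(elo: int) -> list[str]:
    return [
        "uci",
        "setoption name UCI_LimitStrength value true",
        f"setoption name UCI_Elo value {elo}",
        "isready",
    ]

for line in limit_strength_commands(1500):
    print(line)
```

Engines that support the pair ignore `UCI_Elo` unless `UCI_LimitStrength` is set to true first.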

Best
Frank
lkaufman
Posts: 5942
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Difference between computer and human chess players

Post by lkaufman »

I am not sure that I fully understand you, but I think you are saying that Dragon is stronger than even Stockfish in the later stages of the game, but weaker in the earlier stages. Then you say that the later-stage play must be weakened to simulate Magnus Carlsen. But I don't follow the logic here. Just because Dragon is better than other engines in the late stages, how does this tell us anything about what stages it is relatively strongest against the top human? I suppose you could analyze the handicap games it has played vs. Nakamura and other grandmasters, but even that would be misleading as handicap play is obviously very different from even game play in how the play varies by stage. It seems to be your claim that if we slow Dragon down to some small percentage of normal speed (or just cut off the search after N nodes), whatever percentage would make it equal with Magnus Carlsen (my estimate is 0.2% of normal speed, on one thread, backed by data vs. strong GM), that it would generally get losing positions against Magnus but turn them around in the late stages. But where is the evidence for this claim? Actually I may soon be able to test this out, not against Magnus but against a pretty strong GM. If so I'll let you know the results.
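The 0.2%-of-normal-speed estimate can be sanity-checked against the common rule of thumb that each doubling of thinking time is worth on the order of 50-70 Elo at these levels. The per-doubling value of 60 and the 3400 single-thread baseline below are my assumptions, purely for illustration:

```python
import math

def elo_drop(speed_fraction: float, elo_per_doubling: float = 60.0) -> float:
    """Elo lost when running at speed_fraction of normal speed,
    assuming a fixed Elo gain per doubling of time/speed."""
    halvings = math.log2(1.0 / speed_fraction)
    return halvings * elo_per_doubling

FULL_STRENGTH = 3400.0                        # assumed single-thread rating
simulated = FULL_STRENGTH - elo_drop(0.002)   # 0.2% of normal speed
print(round(simulated))  # → 2862 under these assumptions, near Carlsen's rating
```

So a factor-of-500 slowdown (about nine doublings) lands the engine in the high 2800s, consistent with the estimate in the post.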
Komodo rules!
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Difference between computer and human chess players

Post by Frank Quisinsky »

Hi Larry,

I am not sure that I fully understand you, but I think you are saying that Dragon is stronger than even Stockfish in the later stages of the game, but weaker in the earlier stages.

Comparing the stats for Stockfish 15.11.2022 and Dragon 2.51, and the versions before that I tested ... yes ... I think so!

Maybe this has changed with newer versions?
At the moment I am waiting for the final results of run-1 (FCP Tourney-KI is still running).
For the moment I think the latest Dragon 2.61 is much better in the first game phase than before.
But here I have not looked at the stats in detail.

Then you say that the later-stage play must be weakened to simulate Magnus Carlsen.

Yes, Magnus Carlsen or other very strong players ... I think so!

But I don't follow the logic here. Just because Dragon is better than other engines in the late stages, how does this tell us anything about what stages it is relatively strongest against the top human?

The strength must be brought down in exactly this game phase to simulate humans. The secret is to simulate exactly the strengths the best humans have. If engines like Dragon are too strong in the transition into the endgame, that imbalance stays if we simply reduce the overall strength with nodes only.

It seems to be your claim that if we slow Dragon down to some small percentage of normal speed (or just cut off the search after N nodes), whatever percentage would make it equal with Magnus Carlsen (my estimate is 0.2% of normal speed, on one thread, backed by data vs. strong GM), that it would generally get losing positions against Magnus but turn them around in the late stages.

That would be step 2.
In step 1, the Elo must generally go down for the "transition into the endgame".

That is the problem, I think, in the case of Dragon: Dragon is too strong in game phase 3.

Example:
If I look at games by Iwantchuk or So (I like Iwantchuk and So more than Carlsen), I more often find (with the strongest engines) a way to win games in game phase 3 (often in 4) than in game phases 1 and 2.

What I wrote in my first message to the topic:

I wrote:

1. Early middlegame after the opening ... phase 1
2. Middlegame ... phase 2
3. Transition into the endgame ... phase 3
4. Endgame ... phase 4

Humans: 08 (phase 1) / 07 (2) / 05 (3) / 06 (4)
Engines: 07 (phase 1) / 08 (2) / 09 (3) / 10 (4)
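These four phases could be approximated programmatically, for example from the number of pieces still on the board. The thresholds below are my own guesses, purely for illustration:

```python
def game_phase(piece_count: int) -> int:
    """Map the number of pieces on the board (kings and pawns included)
    to one of the four phases above. Thresholds are illustrative assumptions."""
    if piece_count >= 28:
        return 1  # early middlegame
    if piece_count >= 20:
        return 2  # middlegame
    if piece_count >= 12:
        return 3  # transition into the endgame
    return 4      # endgame

print([game_phase(n) for n in (32, 24, 14, 7)])  # → [1, 2, 3, 4]
```

A real partition would need tuning (e.g. weighting pieces by value rather than counting them), but any fixed rule of this shape is enough to bucket games for the statistics discussed in the thread.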

...

It is very interesting what you organize all the time with GM games.
You can be sure I will follow all the experiments, because this topic is indeed very interesting for most of us (I think).

Best
Frank
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Difference between computer and human chess players

Post by Frank Quisinsky »

One of the easiest stats I made (FCP Tourney-2022).
Roughly the same with FCP Tourney-2020 & FCP Tourney-2021:

Games ended without resign mode below 60 moves:

Code: Select all

   # Player                           :      Elo  Games  Score%  won  draw  lost  Points  Draw%  Error   OppAvg   OppE   OppD
   1 Stockfish 151121 NN x64          :  3416.39    217    80.2  131    86     0   174.0   39.6  34.80  3157.24  21.47   35.9
   2 Dragon 2.5.1 by Komodo NN AVX    :  3386.77    174    78.4   99    75     0   136.5   43.1  37.27  3148.37  21.76   34.0
 
Games ended without resign mode between 60-79 moves:

Code: Select all

   # Player                           :      Elo  Games  Score%  won  draw  lost  Points  Draw%  Error   OppAvg   OppE   OppD
   1 Stockfish 151121 NN x64          :  3583.88    452    90.6  367    85     0   409.5   18.8  38.16  3137.90  28.67   38.3
   2 Dragon 2.5.1 by Komodo NN AVX    :  3551.81    408    89.3  323    83     2   364.5   20.3  39.82  3139.16  28.70   38.1
Games ended without resign mode between 80-99 moves:

Code: Select all

   # Player                           :      Elo  Games  Score%  won  draw  lost  Points  Draw%  Error   OppAvg   OppE   OppD
   1 Dragon 2.5.1 by Komodo NN AVX    :  3582.07    308    89.0  244    60     4   274.0   19.5  47.83  3159.49  32.45   37.7
   2 Stockfish 151121 NN x64          :  3537.52    282    86.7  208    73     1   244.5   25.9  44.75  3156.48  32.57   37.6
Games ended without resign mode between 100-300 moves:

Code: Select all

   # Player                           :      Elo  Games  Score%  won  draw  lost  Points  Draw%  Error   OppAvg   OppE   OppD
   1 Fire 8 NN MC.3 x64               :  3306.79    303    65.5   98   201     4   198.5   66.3  22.04  3180.45  18.73   36.2
   2 Stockfish 151121 NN x64          :  3300.87    249    62.0   61   187     1   154.5   75.1  23.83  3201.84  18.85   34.3
   3 Dragon 2.5.1 by Komodo NN AVX    :  3297.26    310    61.9   75   234     1   192.0   75.5  20.35  3196.32  18.94   35.5
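The bucketing behind these tables can be reproduced from any game database with a few lines. Here is a minimal sketch over (move_count, score) pairs, where the score is from one engine's point of view; the sample data is invented:

```python
from collections import defaultdict

# move-count buckets matching the tables above
BUCKETS = [(0, 59), (60, 79), (80, 99), (100, 300)]

def bucket_scores(games):
    """games: iterable of (move_count, score) with score in {1.0, 0.5, 0.0}.
    Returns {bucket: (game_count, score_percent)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for moves, score in games:
        for lo, hi in BUCKETS:
            if lo <= moves <= hi:
                totals[(lo, hi)][0] += 1
                totals[(lo, hi)][1] += score
                break
    return {b: (n, 100.0 * s / n) for b, (n, s) in totals.items()}

# invented sample: two quick wins, two long draws
sample = [(45, 1.0), (58, 1.0), (61, 0.5), (120, 0.5)]
print(bucket_scores(sample))
```

The full tables additionally feed these per-bucket results into a rating calculation per opponent, but the partition itself is just this.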
I can do the same with pieces on the board (partitioned into the 4 game phases).
But it is always the same.

Maybe this will change with newer versions of Dragon and Stockfish.
Have a look at game phase 3.

Best
Frank
RubiChess
Posts: 562
Joined: Fri Mar 30, 2018 7:20 am
Full name: Andreas Matthies

Re: Difference between computer and human chess players

Post by RubiChess »

Frank Quisinsky wrote: Sat Jan 29, 2022 8:54 am Games ended without resign mode between 100-300 moves:

Code: Select all

   # Player                           :      Elo  Games  Score%  won  draw  lost  Points  Draw%  Error   OppAvg   OppE   OppD
   1 Fire 8 NN MC.3 x64               :  3306.79    303    65.5   98   201     4   198.5   66.3  22.04  3180.45  18.73   36.2
   2 Stockfish 151121 NN x64          :  3300.87    249    62.0   61   187     1   154.5   75.1  23.83  3201.84  18.85   34.3
   3 Dragon 2.5.1 by Komodo NN AVX    :  3297.26    310    61.9   75   234     1   192.0   75.5  20.35  3196.32  18.94   35.5
I can do the same with pieces on the board (partitioned into the 4 game phases).
But it is always the same.

Maybe this will change with newer versions of Dragon and Stockfish.
Have a look at game phase 3.

Best
Frank
What do you conclude from these statistics? Writing an Elo ranking in this table is imo just wrong.
What I conclude is that Stockfish and Dragon have finished most of the winnable games before move 100, while Fire needs longer to find the win/mate.

Regards, Andreas
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Difference between computer and human chess players

Post by Frank Quisinsky »

Hi Andreas,

Again ...
1. Games ended after x moves ... split into the 4 game phases
2. Games ended with x pieces on the board ... split into the 4 game phases

For FCP Tourney-2020, FCP Tourney-2021, FCP Tourney-2022 ..., each engine has 40 opponents.

The results are always the same!

That is one way to do it.
Another way is to look at the eval and score 1:0, draw, or 0:1 at x pieces on the board, again split into the 4 game phases.

And the result is again the same!
The Elo is not very important here; the rank in the table matters more.

With databases of every engine against every other, and without resign, a lot of interesting stats are possible.
Nothing is new ... I can look at really boring ratings, or I can do more with the games than just produce ratings.

Best
Frank

PS:
Most interesting are the stats on pawn structures, not this one.
RubiChess
Posts: 562
Joined: Fri Mar 30, 2018 7:20 am
Full name: Andreas Matthies

Re: Difference between computer and human chess players

Post by RubiChess »

Frank Quisinsky wrote: Sat Jan 29, 2022 1:16 pm Hi Andreas,

Again ...
1. Games ended after x moves ... split into the 4 game phases
2. Games ended with x pieces on the board ... split into the 4 game phases

For FCP Tourney-2020, FCP Tourney-2021, FCP Tourney-2022 ..., each engine has 40 opponents.

The results are always the same!
Then please explain: what ARE the results?
Especially the last table, the only one with Fire in it and "leading": what is the result of this table? Is it that Fire is very good here? Or are Stockfish and Komodo just too strong, so that they do not delay wins into this >100-moves stage? What is wrong with my conclusion that Fire simply carries far more winnable games into this "game phase" instead of winning them at an earlier stage?

I'm sure I could easily write a Rubi-Master-of-4th-phase that gets the highest score percentage in this >100-moves table just by shuffling won games into this stage. But that isn't what chess is about. Chess is about win/draw/lose, nothing else.
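The selection-bias point can be demonstrated with a toy simulation: two engines with identical overall results, one of which converts its wins before move 100 and one which drags them out past it. The late-bucket score then differs sharply even though both engines are equally strong (all numbers below are invented):

```python
def late_bucket_score(games):
    """Score% over only the games that lasted 100+ moves."""
    late = [score for moves, score in games if moves >= 100]
    return 100.0 * sum(late) / len(late) if late else None

# both engines: 50 wins and 50 draws overall, i.e. identical strength
fast_finisher = [(60, 1.0)] * 50 + [(120, 0.5)] * 50  # wins finished early
slow_finisher = [(120, 1.0)] * 50 + [(120, 0.5)] * 50  # wins dragged past move 100

print(late_bucket_score(fast_finisher))  # 50.0 — only the draws reach move 100
print(late_bucket_score(slow_finisher))  # 75.0 — the wins were shuffled into the bucket
```

So a per-bucket "ranking" measures when an engine finishes its games at least as much as how well it plays in that stage.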

Regards, Andreas

PS. I know you won't agree, especially with this last statement, so discussion is probably useless.
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Difference between computer and human chess players

Post by Frank Quisinsky »

Hi Andreas,

It is self-explanatory.

The results for the engines, i.e. their rank in the tables, are about 95% the same across the different ways I have tried.

The different ways:
- quantity of pieces on the board in the final position of the games
- games ended after x moves (examples you can see in my posting comparing Stockfish with Komodo)

A bit more complicated is:
- the combination of "eval" with "quantity of pieces on the board" (without looking at the end of the game)

Much more is possible with a wish tool Ferdinand Mosca created for me.
I added all the tools to my download section.

The topic is:
I split a chess game into 4 parts:

- (1) early middlegame
- (2) middlegame
- (3) transition into the endgame
- (4) endgame

In all three of the ways I described before, Dragon by Komodo is on rank 1 for part 3 (the transition into the endgame).

All that is very self-explanatory.

Another example:
Engines like Gogobello or Zahak, to name two (they lost many games very fast), cannot be the hero of the early middlegame (phase/part 1).
I think this is a good example.

The question in the case of Gogobello or Zahak is: where are their strong points?
In the case of Gogobello I will wait for the end of run-1 (the still-running tournament).

In the case of Zahak ... (FCP Tourney-2022)

(1) = place 33
(2) = place 32
(3) = place 20
(4) = place 16

And about Fire on place 1 in game phase 4 (endgame) ...
The endgame is very special, because here the differences between the best engines are very small.
For a long time while the tournament was still running, Koivisto was on rank 1.

This just shows that we have many different engines of around the same strength.

Have a look at the Elo (place 1 = 3306, place 41 = 3048):
the difference is only 258 Elo.

Code: Select all

   # Player                           :      Elo  Games  Score%  won  draw  lost  Points  Draw%  Error   OppAvg   OppE   OppD
   1 Fire 8 NN MC.3 x64               :  3306.79    303    65.5   98   201     4   198.5   66.3  22.04  3180.45  18.73   36.2
   2 Stockfish 151121 NN x64          :  3300.87    249    62.0   61   187     1   154.5   75.1  23.83  3201.84  18.85   34.3
   3 Dragon 2.5.1 by Komodo NN AVX    :  3297.26    310    61.9   75   234     1   192.0   75.5  20.35  3196.32  18.94   35.5
   4 Koivisto 7.5 NN AVX2 x64         :  3291.75    474    65.4  153   314     7   310.0   66.2  17.41  3175.19  18.74   38.1
   5 SlowChess Blitz 2.8 NN AVX2 x    :  3287.02    359    62.8   96   259     4   225.5   72.1  19.61  3188.08  18.70   37.0
   6 Ethereal 13.25 NN PEXT x64       :  3284.69    425    64.4  132   283    10   273.5   66.6  18.18  3174.83  18.78   38.0
   7 Berserk 7 NN PEXT x64            :  3276.23    439    62.8  121   309     9   275.5   70.4  17.68  3178.64  18.70   37.9
   8 Nemorino 6.09 NN x64             :  3260.65    338    63.0  110   206    22   213.0   60.9  20.64  3166.09  18.67   38.4
   9 Seer 2.4.0 NN AVX2 x64           :  3254.31    356    60.3   95   239    22   214.5   67.1  19.53  3176.31  18.74   38.0
  10 RubiChess 2.2 NN x64             :  3253.99    327    59.8   79   233    15   195.5   71.3  19.32  3180.52  18.70   36.8
  11 Igel 3.0.10 NN BMI2 x64          :  3243.68    284    57.7   63   202    19   164.0   71.1  20.88  3180.85  18.54   35.9
  12 Lc0 0.28.0 NN CPU-dnnl           :  3218.12    365    55.2   63   277    25   201.5   75.9  17.99  3174.35  18.65   38.7
  13 rofChade 2.310 NN x64            :  3211.11    380    53.9   63   284    33   205.0   74.7  18.03  3181.31  18.74   37.6
  14 Revenge 1.0 NN x64               :  3207.81    298    52.2   45   221    32   155.5   74.2  20.59  3187.82  18.62   36.9
  15 Arasan 23.0.1 NN AVX2 x64        :  3192.40    334    51.8   52   242    40   173.0   72.5  18.75  3179.37  18.60   37.9
  16 Zahak 8.6 AMD x64                :  3185.59    352    53.4   66   244    42   188.0   69.3  18.27  3160.63  18.41   36.8
  17 Xiphos 0.6 BMI2 x64              :  3182.23    357    49.7   50   255    52   177.5   71.4  18.87  3179.81  18.52   36.2
  18 Minic 3.17 NN x64                :  3181.22    365    52.9   55   276    34   193.0   75.6  17.72  3163.46  18.57   37.6
  19 Shredder 13 POPCNT x64           :  3151.62    344    45.9   46   224    74   158.0   65.1  18.62  3184.42  18.71   38.2
  20 Weiss 2.0 PEXT x64               :  3150.02    413    47.2   48   294    71   195.0   71.2  15.98  3170.66  18.59   38.0
  21 Defenchess 2.3 dev BMI2 x64      :  3148.43    397    47.5   46   285    66   188.5   71.8  16.77  3169.43  18.55   38.2
  22 Booot 6.5 POPCNT x64             :  3146.87    356    45.1   35   251    70   160.5   70.5  18.95  3182.96  18.56   37.5
  23 Fritz 18 (Ginkgo) x64            :  3144.37    414    46.0   39   303    72   190.5   73.2  17.00  3176.36  18.69   38.6
  24 Clover 2.4 x64                   :  3140.40    384    47.1   37   288    59   181.0   75.0  17.13  3165.27  18.61   37.7
  25 Marvin 5.2.0 NN AVX2 x64         :  3139.90    336    45.8   36   236    64   154.0   70.2  18.64  3172.36  18.51   37.3
  26 Halogen 10 NN PEXT x64           :  3134.84    293    46.4   33   206    54   136.0   70.3  19.99  3162.11  18.46   37.0
  27 Wasp 5.00 NN AVX2 x64            :  3133.87    334    45.2   28   246    60   151.0   73.7  17.76  3170.09  18.49   37.4
  28 Laser 1.7 BMI2 x64               :  3132.90    349    44.6   47   217    85   155.5   62.2  17.68  3174.01  18.61   38.5
  29 Winter 0.9 BMI2 x64              :  3131.94    318    44.5   28   227    63   141.5   71.4  19.16  3176.76  18.52   37.6
  30 Orion 0.8 NN FMA x64             :  3128.39    293    45.7   21   226    46   134.0   77.1  19.14  3161.63  18.51   36.4
  31 GullChess 3.0 Sy BMI2 x64        :  3127.71    341    46.0   35   244    62   157.0   71.6  17.83  3161.90  18.48   37.3
  32 Chiron 5 x64                     :  3122.24    376    42.6   43   234    99   160.0   62.2  17.49  3179.54  18.65   38.0
  33 Combusken 1.4.0 AMD x64          :  3120.93    289    45.8   28   209    52   132.5   72.3  19.03  3156.23  18.57   36.0
  34 Fizbo 2.0 BMI2 x64               :  3119.39    304    43.4   32   200    72   132.0   65.8  19.18  3171.18  18.47   36.5
  35 Schooner 2.2 XB SSE x64          :  3108.76    363    40.4   30   233   100   146.5   64.2  18.71  3180.66  18.68   38.0
  36 Andscacs 0.95.123 x64            :  3108.61    372    41.8   34   243    95   155.5   65.3  17.62  3172.57  18.50   38.1
  37 Dark Toga 1.1 NN AVX2 x64        :  3107.42    451    42.6   46   292   113   192.0   64.7  16.80  3167.47  18.56   38.0
  38 DanaSah 9.0 NN AVX2 x64          :  3091.42    314    39.3   19   209    86   123.5   66.6  20.46  3169.91  18.48   37.2
  39 Stash 31.16 x64                  :  3075.53    451    37.3   21   294   136   168.0   65.2  17.50  3171.84  18.58   38.3
  40 Nirvanachess 2.5 POPCNT x64      :  3066.19    351    35.9   21   210   120   126.0   59.8  19.92  3175.40  18.55   37.6
  41 Demolito 2021-07-09 x64          :  3048.54    446    34.4   14   279   153   153.5   62.6  18.66  3170.11  18.51   37.6

White advantage = 31.84 +/- 2.28
Draw rate (equal opponents) = 79.42 % +/- 0.60
Best
Frank