which program is best in endgames


Uri Blass
Posts: 10301
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

which program is best in endgames

Post by Uri Blass »

I wonder whether anybody has run endgame tests between the top programs to see which one is best in that stage.

I know that Houdini is better overall than Stockfish or Critter, but it is not clear to me that Houdini is best in all stages of the game. It is possible for a program to be number 1 simply because it is better in the early stages of the game.
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: which program is best in endgames

Post by Albert Silver »

Uri Blass wrote:I wonder if somebody did endgame tests between top programs to test which program is best in that stage.

I know that houdini is better than stockfish or critter but it is not clear to me that houdini is best in all stages of the games and it is possible that a program is number 1 because it is better in the early stages of the game.
Rybka 4 and Houdini. They have different strengths in the endgame depending on the pieces on the board. Note that I am talking about play, not pure evaluation.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Sean Evans
Posts: 1777
Joined: Thu Jun 05, 2008 10:58 pm
Location: Canada

Re: which program is best in endgames

Post by Sean Evans »

Uri Blass wrote:I wonder if somebody did endgame tests between top programs to test which program is best in that stage.

I know that houdini is better than stockfish or critter but it is not clear to me that houdini is best in all stages of the games and it is possible that a program is number 1 because it is better in the early stages of the game.
Why? They are all using TBs!
Jouni
Posts: 3293
Joined: Wed Mar 08, 2006 8:15 pm

Re: which program is best in endgames

Post by Jouni »

I took 10 non-trivial endgame positions and played 4 engines with white and black (of course with tablebases). Result:

                 
1   Stockfish 2.0 JA 64bit   12.5 - 7.5    10.0 - 10.0   11.5 - 8.5     34.0/60
2   Rybka 4                   7.5 - 12.5   11.5 - 8.5    11.5 - 8.5     30.5/60
3   Houdini 1.5a x64         10.0 - 10.0    8.5 - 11.5    9.5 - 10.5    28.0/60
4   Critter 0.90 64-bit       8.5 - 11.5    8.5 - 11.5   10.5 - 9.5     27.5/60



Personally I trust Rybka 4 most, because I have more Nalimov files than Gaviota ones.

Jouni
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: which program is best in endgames

Post by Laskos »

Jouni wrote:I took 10 non-trivial endgame positions and played 4 engines with white and black (of course with tablebases). Result:

                 
1   Stockfish 2.0 JA 64bit   12.5 - 7.5    10.0 - 10.0   11.5 - 8.5     34.0/60
2   Rybka 4                   7.5 - 12.5   11.5 - 8.5    11.5 - 8.5     30.5/60
3   Houdini 1.5a x64         10.0 - 10.0    8.5 - 11.5    9.5 - 10.5    28.0/60
4   Critter 0.90 64-bit       8.5 - 11.5    8.5 - 11.5   10.5 - 9.5     27.5/60



Personally I trust Rybka 4 most, because I have more Nalimov files than Gaviota ones.

Jouni
All engines are equal here, well within the 95% confidence intervals. With 60 games, a separation of ~10 points would be needed to fall outside the 95% error margins.
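
For anyone who wants to check the order of magnitude, here is a rough Python sketch of such a margin. It assumes independent games, evenly matched engines, a normal approximation, and that the two totals being compared are independent (not strictly true in a round robin). The post does not say how the ~10 points was obtained, so the function names and the `draw_ratio` parameter are illustrative assumptions only.

```python
import math

def score_margin_95(games, draw_ratio=0.0, expected=0.5):
    """Half-width, in points, of a 95% interval on one engine's total score.

    Assumes independent games with per-game outcomes 0, 0.5 or 1.
    """
    win_ratio = expected - draw_ratio / 2
    # Per-game variance: E[X^2] - E[X]^2
    variance = win_ratio * 1.0 + draw_ratio * 0.25 - expected ** 2
    return 1.96 * math.sqrt(games * variance)

def separation_95(games, draw_ratio=0.0):
    """Approximate gap between two engines' totals needed for 95% confidence.

    Treats the two totals as independent, hence the sqrt(2) factor.
    """
    return math.sqrt(2) * score_margin_95(games, draw_ratio)

# Worst case (no draws), 60 games per engine: roughly 10.7 points
print(round(separation_95(60), 1))
```

With a realistic draw ratio the required gap shrinks, so the no-draw case is the conservative one. The margin grows with the square root of the number of games, so it becomes a smaller fraction of the total as more games are played.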

An endgame test at 5' + 5'', 1 core, 30 endgame positions, reversed colours:

Endgame testing

 1  Houdini 1.5 x64                97.0/159
 2  Rybka 4 x64                    93.0/159  7191.25
 3  Ivanhoe B47cBx64a              93.0/159  7190.75
 4  Stockfish 2.0.1 JA 64bit       89.0/159
 5  Critter 0.90 64-bit SSE4       85.0/159
 6  Komodo64 1.3 JA                82.0/159
 7  Fritz 12                       74.0/159
 8  spark-1.0-win64-mp-corei       72.5/159
 9  Protector 140_64_JA            71.5/159
10  Gull 1.2 x64                   71.0/159
11  HIARCS 13.2 SP                 69.0/159
12  Deep Junior 12 UCI             57.0/159
Here the separation needed at 95% confidence would be ~16 points.

Kai
Jouni
Posts: 3293
Joined: Wed Mar 08, 2006 8:15 pm

Re: which program is best in endgames

Post by Jouni »

Yes, 10 endgame positions are not enough. So I took another 25 endings, and Stockfish is still stunning at a fast time control:

Stockfish - Rybka4 : 27.0/50 11-7-32 (======10011=10=01=1===1=====10====10=1===1=====0==)  54%   +28

Stockfish - Houdini15 : 27.0/50 14-10-26 (1====01==010======1=01====1==010=1==01==1=100101=1)  54%   +28
Jouni
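
For reference, the "54%  +28" at the end of each result line is consistent with the usual logistic Elo conversion. A minimal Python sketch, assuming the game string uses `1` for a win, `0` for a loss and `=` for a draw from the first engine's side (the function names are made up for illustration):

```python
import math

def tally(results):
    """Count (wins, losses, draws) in a string such as '1=0='."""
    return results.count("1"), results.count("0"), results.count("=")

def elo_from_wld(wins, losses, draws):
    """Elo difference implied by a match score, using the logistic model.

    Undefined for 0% or 100% scores (division by zero or log of zero).
    """
    games = wins + losses + draws
    fraction = (wins + 0.5 * draws) / games
    return 400 * math.log10(fraction / (1 - fraction))

# 11 wins, 7 losses, 32 draws is 27.0/50 = 54%
print(round(elo_from_wld(11, 7, 32)))  # prints 28
```

Both matches above score 27.0/50, which is why they show the same +28 despite different win/loss/draw splits.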
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: which program is best in endgames

Post by Laskos »

Jouni wrote:Yes 10 endgame positions are not enough. So I took 25 another endings and still Stockfish is stunning with fast time control:

Stockfish - Rybka4 : 27.0/50 11-7-32 (======10011=10=01=1===1=====10====10=1===1=====0==)  54%   +28

Stockfish - Houdini15 : 27.0/50 14-10-26 (1====01==010======1=01====1==010=1==01==1=100101=1)  54%   +28
Jouni
Still within all meaningful error margins, but my feeling is the same: Stockfish evaluates some endgames better. It is just a feeling, and the first 4-5 engines are probably pretty equal. I may conduct a larger test, but first I have to find some relevant, balanced, non-trivial endgame positions.

Kai
gaard
Posts: 447
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: which program is best in endgames

Post by gaard »

Uri Blass wrote:I wonder if somebody did endgame tests between top programs to test which program is best in that stage.

I know that houdini is better than stockfish or critter but it is not clear to me that houdini is best in all stages of the games and it is possible that a program is number 1 because it is better in the early stages of the game.
Houdini, hands down. This is its strongest point relative to other programs. It is marginally better than Rybka in the middle game, but it really shines in the endgame. The evaluations even look much more sensible.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: which program is best in endgames

Post by Laskos »

I ran a more extensive endgame test at a short time control with 50 initial positions (plus reversed colours). Even though the same position was played several times, no two games were identical; the randomness is very high. Each engine played 1,600 games:

    Program                            Score      %    Av.Op.  Elo    +   -    Draws

  1 Houdini 1.5a x64               : 945.5/1600  59.1   3190   3254   13  13   45.3 %
  2 Rybka 4_x64                    : 924.0/1600  57.8   3191   3246   12  12   51.2 %
  3 Ivanhoe B49jA                  : 893.5/1600  55.8   3193   3234   12  12   50.1 %
  4 Stockfish 2.0.1 JA 64bit       : 750.0/1600  46.9   3204   3182   12  12   49.1 %
  5 Komodo64 1.3 JA                : 727.0/1600  45.4   3206   3174   13  13   45.6 %
  6 Critter 0.90 64-bit            : 560.0/1600  35.0   3218   3111   13  13   43.2 %
Houdini seems just a bit stronger than Rybka in the endgame, if at all. The gap between the first three and the rest is large, and Stockfish seems not as good in the endgame as I suspected.
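
The Elo and +/- columns can be roughly sanity-checked from the Score, Draws and Av.Op. columns. The sketch below assumes a plain performance-rating calculation (average opponent plus the logistic Elo difference) and a normal approximation for the 95% margins. The rating tool that produced the table is not named in the post, so this is only an approximation of whatever it does, and `row_estimate` is a hypothetical helper.

```python
import math

def elo_diff(fraction):
    """Logistic Elo difference for a score fraction strictly between 0 and 1."""
    return 400 * math.log10(fraction / (1 - fraction))

def row_estimate(score, games, draw_pct, avg_opp):
    """Return (rating, plus, minus) for one table row.

    plus/minus are 95% margins in Elo from a normal approximation
    on the score fraction, assuming independent games.
    """
    p = score / games
    d = draw_pct / 100
    w = p - d / 2                      # win fraction
    variance = w + d / 4 - p * p       # per-game variance of the score
    half = 1.96 * math.sqrt(variance / games)
    rating = avg_opp + elo_diff(p)
    plus = elo_diff(p + half) - elo_diff(p)
    minus = elo_diff(p) - elo_diff(p - half)
    return rating, plus, minus

# Houdini row: 945.5/1600, 45.3 % draws, average opponent 3190
rating, plus, minus = row_estimate(945.5, 1600, 45.3, 3190)
print(round(rating), round(plus, 1), round(minus, 1))
```

For the Houdini row this lands at about 3254 with margins of roughly 13 Elo either way, in line with the table. The 8-point gap between Houdini and Rybka is therefore smaller than either engine's own margin, which fits the "if at all" above.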

It was interesting to note that Houdini searched 1-1.5 plies less deep than Rybka (adjusted +3 for depth) and IvanHoe, and 2-2.5 plies less deep than Stockfish, although its nps was the highest. Houdini seems to prune less in the endgame than some other engines. In the middlegame, Houdini's search depth is roughly equal to that of Rybka (+3) and IvanHoe. Contrary to what I felt, the main strength of Houdini (those +60 Elo points) seems to come more from the middlegame, although it is still strong in the endgame.

Kai
Jouni
Posts: 3293
Joined: Wed Mar 08, 2006 8:15 pm

Re: which program is best in endgames

Post by Jouni »

I played more games between Stockfish and Houdini, and Stockfish still won, but of course the result depends a lot on position selection. Critter is interesting: in some test suites like EET it's simply the best, but not in real games!

Jouni