which program is best in endgames
-
- Posts: 10301
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
I wonder if somebody has run endgame tests between top programs to see which program is best in that stage.
I know that Houdini is better than Stockfish or Critter, but it is not clear to me that Houdini is best in all stages of the game; it is possible that a program is number 1 because it is better in the early stages of the game.
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: which program is best in endgames
Uri Blass wrote:
I wonder if somebody did endgame tests between top programs to test which program is best in that stage.
I know that houdini is better than stockfish or critter but it is not clear to me that houdini is best in all stages of the games and it is possible that a program is number 1 because it is better in the early stages of the game.

Rybka 4 and Houdini. They have different strengths in the endgame depending on the pieces on the board. Note that I am talking about play, not pure evaluation.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
-
- Posts: 1777
- Joined: Thu Jun 05, 2008 10:58 pm
- Location: Canada
Re: which program is best in endgames
Uri Blass wrote:
I wonder if somebody did endgame tests between top programs to test which program is best in that stage.
I know that houdini is better than stockfish or critter but it is not clear to me that houdini is best in all stages of the games and it is possible that a program is number 1 because it is better in the early stages of the game.

Why? They are all using TBs!
-
- Posts: 3293
- Joined: Wed Mar 08, 2006 8:15 pm
Re: which program is best in endgames
I took 10 non-trivial endgame positions and played 4 engines with white and black (of course with tablebases). Result:
Code:
    Engine                  vs 1         vs 2         vs 3         vs 4         Total
1   Stockfish 2.0 JA 64bit  **           12.5 - 7.5   10.0 - 10.0  11.5 - 8.5   34.0/60
2   Rybka 4                 7.5 - 12.5   **           11.5 - 8.5   11.5 - 8.5   30.5/60
3   Houdini 1.5a x64        10.0 - 10.0  8.5 - 11.5   **           9.5 - 10.5   28.0/60
4   Critter 0.90 64-bit     8.5 - 11.5   8.5 - 11.5   10.5 - 9.5   **           27.5/60
Personally I trust Rybka 4 most, because I have more Nalimov files than Gaviota ones.
Jouni
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: which program is best in endgames
Jouni wrote:
I took 10 non-trivial endgame positions and played 4 engines with white and black (of course with tablebases). Result:
Code:
    Engine                  vs 1         vs 2         vs 3         vs 4         Total
1   Stockfish 2.0 JA 64bit  **           12.5 - 7.5   10.0 - 10.0  11.5 - 8.5   34.0/60
2   Rybka 4                 7.5 - 12.5   **           11.5 - 8.5   11.5 - 8.5   30.5/60
3   Houdini 1.5a x64        10.0 - 10.0  8.5 - 11.5   **           9.5 - 10.5   28.0/60
4   Critter 0.90 64-bit     8.5 - 11.5   8.5 - 11.5   10.5 - 9.5   **           27.5/60
Personally I trust Rybka 4 most, because I have more Nalimov files than Gaviota ones.
Jouni

All engines are equal here, well within the 95% confidence intervals. For 60 games, a separation of about 10 points would be needed to fall outside the 95% confidence interval.
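The "~10 points for 60 games" figure can be sanity-checked with a rough calculation. This is only a sketch, not necessarily how the figure was obtained (the thread does not say): it assumes independent games between equally strong engines, a guessed draw rate, and it treats two engines' totals as independent, which is only approximately true in a round robin. The function names are made up for illustration.

```python
import math

def score_margin(games, draw_rate, z=1.96):
    """Approximate 95% margin (in points) on one engine's total score.

    Assumes equal engines (expected score 0.5 per game), so the
    per-game variance is 0.25 - draw_rate / 4.
    """
    if not 0.0 <= draw_rate <= 1.0:
        raise ValueError("draw_rate must be between 0 and 1")
    variance_per_game = 0.25 - draw_rate / 4.0
    return z * math.sqrt(games * variance_per_game)

def separation_margin(games, draw_rate, z=1.96):
    """Approximate margin on the gap between two totals (independence assumed)."""
    return math.sqrt(2.0) * score_margin(games, draw_rate, z)

# The draw rates below are guesses; the 60-game test does not report one.
for draw_rate in (0.0, 0.25, 0.5):
    single = score_margin(60, draw_rate)
    gap = separation_margin(60, draw_rate)
    print(f"draw rate {draw_rate:.2f}: +/-{single:.1f} points, gap needed ~{gap:.1f}")
```

Under these assumptions the required gap comes out at roughly 8 to 11 points depending on the draw rate, which is in the same range as the figure quoted above.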
An endgame test at 5' + 5'', 1 core, 30 endgame positions, reversed colours:
Code:
Endgame testing
 1  Houdini 1.5 x64             97.0/159
 2  Rybka 4 x64                 93.0/159  7191.25
 3  Ivanhoe B47cBx64a           93.0/159  7190.75
 4  Stockfish 2.0.1 JA 64bit    89.0/159
 5  Critter 0.90 64-bit SSE4    85.0/159
 6  Komodo64 1.3 JA             82.0/159
 7  Fritz 12                    74.0/159
 8  spark-1.0-win64-mp-corei    72.5/159
 9  Protector 140_64_JA         71.5/159
10  Gull 1.2 x64                71.0/159
11  HIARCS 13.2 SP              69.0/159
12  Deep Junior 12 UCI          57.0/159
Kai
-
- Posts: 3293
- Joined: Wed Mar 08, 2006 8:15 pm
Re: which program is best in endgames
Yes, 10 endgame positions are not enough. So I took another 25 endings, and Stockfish is still stunning at fast time control:
Code:
Stockfish - Rybka4 : 27.0/50 11-7-32 (======10011=10=01=1===1=====10====10=1===1=====0==) 54% +28
Code:
Stockfish - Houdini15 : 27.0/50 14-10-26 (1====01==010======1=01====1==010=1==01==1=100101=1) 54% +28
Jouni
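The "54% +28" at the end of each result line matches the usual logistic conversion from score fraction to Elo difference. A minimal sketch of that conversion follows, assuming the three numbers after the score are wins-losses-draws; the function names are illustrative and not taken from any tool mentioned in the thread.

```python
import math

def score_fraction(wins, losses, draws):
    """Score as a fraction of games played, counting a draw as half a point."""
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games

def elo_diff(fraction):
    """Elo difference implied by a score fraction under the logistic model."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(fraction / (1.0 - fraction))

# W-L-D triples as read from the two result lines (an interpretation).
matches = {
    "Stockfish - Rybka4": (11, 7, 32),
    "Stockfish - Houdini15": (14, 10, 26),
}
for name, (wins, losses, draws) in matches.items():
    fraction = score_fraction(wins, losses, draws)
    print(f"{name}: {fraction:.0%} {elo_diff(fraction):+.0f}")
```

Both matches give 27/50, so the same Elo difference comes out for each, even though the win/draw split differs.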
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: which program is best in endgames
Jouni wrote:
Yes 10 endgame positions are not enough. So I took 25 another endings and still Stockfish is stunning with fast time control:
Code:
Stockfish - Rybka4 : 27.0/50 11-7-32 (======10011=10=01=1===1=====10====10=1===1=====0==) 54% +28
Code:
Stockfish - Houdini15 : 27.0/50 14-10-26 (1====01==010======1=01====1==010=1==01==1=100101=1) 54% +28
Jouni

Still within all meaningful error margins, but my feeling is the same, that Stockfish evaluates some endgames better. Just a feeling, probably the first 4-5 engines are pretty equal. Possibly I will conduct a larger test, but I have to find some relevant, balanced, non-trivial endgame positions.
Kai
-
- Posts: 447
- Joined: Mon Jun 07, 2010 3:13 am
- Location: Holland, MI
- Full name: Martin W
Re: which program is best in endgames
Uri Blass wrote:
I wonder if somebody did endgame tests between top programs to test which program is best in that stage.
I know that houdini is better than stockfish or critter but it is not clear to me that houdini is best in all stages of the games and it is possible that a program is number 1 because it is better in the early stages of the game.

Houdini, hands down. This is its strongest point relative to other programs. It is marginally better than Rybka in the middle game, but it really shines in the endgame. The evaluations even look much more sensible.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: which program is best in endgames
I performed more extensive endgame testing at short time control with 50 initial positions (+reversed colours). Even though the same position was played several times, no two games are identical; the randomness is very high. Each engine played 1,600 games:
Code:
   Program                     Score        %     Av.Op.  Elo    +    -    Draws
1  Houdini 1.5a x64          : 945.5/1600   59.1  3190    3254   13   13   45.3 %
2  Rybka 4_x64               : 924.0/1600   57.8  3191    3246   12   12   51.2 %
3  Ivanhoe B49jA             : 893.5/1600   55.8  3193    3234   12   12   50.1 %
4  Stockfish 2.0.1 JA 64bit  : 750.0/1600   46.9  3204    3182   12   12   49.1 %
5  Komodo64 1.3 JA           : 727.0/1600   45.4  3206    3174   13   13   45.6 %
6  Critter 0.90 64-bit       : 560.0/1600   35.0  3218    3111   13   13   43.2 %
Houdini seems just a bit stronger than Rybka in the endgame, if at all. The gap between the first three and the rest is large; Stockfish seems not as good in the endgame as I suspected.
It was interesting to note that Houdini was 1-1.5 plies short of Rybka (adjusted +3 for depth) and IvanHoe, and 2-2.5 plies short of Stockfish, although its nps was the highest. Houdini seems to prune less in the endgame compared to some other engines. In the middlegame the search depth of Houdini is ~ equal to Rybka (+3) and IvanHoe. Contrary to what I felt, the main strength (those +60 Elo points) of Houdini seems to come more from the middlegame (although it's still strong in the endgame).
Kai
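The Elo and +/- columns in the 1,600-game table can be roughly reproduced from the score, the draw percentage and the average opponent rating. The thread does not say which rating tool produced the table, so the following is only a sketch under common assumptions: a logistic Elo model, independent games, and a first-order (delta method) approximation for the error margin. The function name is hypothetical.

```python
import math

def elo_with_margin(points, games, draw_rate, avg_opponent, z=1.96):
    """Return (performance Elo, approximate 95% margin in Elo).

    Per-game variance of the score is p * (1 - p) - draw_rate / 4,
    where p is the score fraction; the margin on p is converted to Elo
    using the slope of the logistic curve at p.
    """
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    elo = avg_opponent + 400.0 * math.log10(p / (1.0 - p))
    variance = p * (1.0 - p) - draw_rate / 4.0
    std_error = math.sqrt(variance / games)
    elo_per_unit_score = 400.0 / (math.log(10.0) * p * (1.0 - p))
    return elo, z * std_error * elo_per_unit_score

# Top two rows of the table: points, games, draw rate, average opponent.
rows = {
    "Houdini 1.5a x64": (945.5, 1600, 0.453, 3190),
    "Rybka 4_x64": (924.0, 1600, 0.512, 3191),
}
for name, args in rows.items():
    elo, margin = elo_with_margin(*args)
    print(f"{name}: {elo:.0f} +/- {margin:.0f}")
```

With these inputs the two intervals overlap heavily, which fits the reading that Houdini is only slightly ahead of Rybka in this test, if at all. Small differences from the table (a point or so) are expected because the average opponent ratings are rounded.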
-
- Posts: 3293
- Joined: Wed Mar 08, 2006 8:15 pm
Re: which program is best in endgames
I played more games between Stockfish and Houdini, and Stockfish still won, but of course the result depends a lot on position selection. Critter is interesting: in some test suites like EET it's simply the best, but not in real games!
Jouni