CCRL 40/4 testing Houdini, Stockfish and Komodo

Shaun
Posts: 322
Joined: Wed Mar 08, 2006 9:55 pm
Location: Brighton - UK

CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by Shaun »

I have been running some 1CPU tests using the Nunn v2 suite*

EDIT: Ignore the actual Elo ratings. I just used an offset of 3000 in bayeselo, so the ratings will not match the CCRL list.

Results to date

Code:

Rank Name                            Elo    +    - games score oppo. draws 
   1 Houdini 4 x64 sz 1CPU          3045   27   27   400   59%  2992   44% 
   2 Houdini 4 x64 1CPU             3021   29   29   350   56%  2988   45% 
   3 Stockfish DD 64 SSE42 Sz 1CPU  3018   29   29   350   54%  2993   49% 
   4 Stockfish 131113 64 Sz 1CPU    3018   35   35   250   55%  2987   44% 
   5 Stockfish DD 64 SSE4.2 1CPU    3014   29   29   350   54%  2993   48% 
   6 Komodo TCEC 64-bit 1CPU        2996   31   31   300   48%  3007   46% 
   7 Houdini 3 x64 1CPU             2996   25   25   450   49%  3000   48% 
   8 Komodo 6 64-bit 1CPU           2985   27   27   400   47%  3002   46% 
   9 Komodo CCT 64-bit              2960   27   27   400   43%  3006   47% 
  10 Stockfish 4 64 SSE4.2 1CPU     2949   26   26   450   40%  3006   48% 
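
For anyone wanting to sanity-check the score column against the Elo column, here is a minimal sketch of the usual logistic conversion from a score fraction to an Elo difference. It is not how bayeselo computes its ratings (bayeselo also models draws and applies a prior), so expect a somewhat different number from the table. The function name `elo_diff_from_score` is made up for this illustration; the 235.0 out of 400 is the Houdini 4 sz total from the pairings listing in the same post.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Houdini 4 x64 sz 1CPU scored 235.0 points from 400 games (about 59%).
diff = elo_diff_from_score(235.0 / 400.0)
print(f"naive Elo difference vs. average opponent: {diff:.0f}")
```

The naive figure comes out at roughly 61 Elo, against the 53 Elo gap between 3045 and the 2992 average opponent in the table, which is the sort of discrepancy one would expect from a simplified model.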

Code:

                               Ho Ho St St St Ko Ho Ko Ko St
Houdini 4 x64 sz 1CPU             87 92 90 94 99 99 99 99 99
Houdini 4 x64 1CPU             12    57 56 64 89 91 97 99 99
Stockfish DD 64 SSE42 Sz 1CPU   7 42    50 57 85 88 95 99 99
Stockfish 131113 64 Sz 1CPU     9 43 49    56 81 85 94 99 99
Stockfish DD 64 SSE4.2 1CPU     5 35 42 43    80 83 93 99 99
Komodo TCEC 64-bit 1CPU         0 10 14 18 19    50 70 95 99
Houdini 3 x64 1CPU              0  8 11 14 16 49    73 97 99
Komodo 6 64-bit 1CPU            0  2  4  5  6 29 26    90 97
Komodo CCT 64-bit               0  0  0  0  0  4  2  9    73
Stockfish 4 64 SSE4.2 1CPU      0  0  0  0  0  0  0  2 26
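
The matrix above looks like bayeselo's likelihood-of-superiority output, in percent (row engine better than column engine). As a rough cross-check, and assuming the +/- columns are approximately 95% intervals and the two ratings are independent (neither is guaranteed here), the same kind of number can be approximated from the rating table with a normal distribution. `approx_los` is a hypothetical helper, not part of bayeselo.

```python
import math


def approx_los(elo_a: float, margin_a: float,
               elo_b: float, margin_b: float,
               z95: float = 1.96) -> float:
    """Approximate probability that A is stronger than B.

    Treats each rating as normal, with the reported margin taken as a
    95% interval, and assumes the two estimates are independent.
    """
    sigma_a = margin_a / z95
    sigma_b = margin_b / z95
    z = (elo_a - elo_b) / math.hypot(sigma_a, sigma_b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Houdini 4 x64 sz (3045 +/- 27) versus Houdini 4 x64 (3021 +/- 29)
los = approx_los(3045, 27, 3021, 29)
print(f"approximate LOS: {100 * los:.0f}%")
```

For this pair the approximation lands in the high 80s, close to the 87 shown in the matrix.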

Code:

   1 Houdini 4 x64 sz 1CPU          3045 400.0 (235.0 : 165.0)
                                          50.0 ( 26.0 :  24.0) Stockfish DD 64 SSE42 Sz 1CPU  3018
                                          50.0 ( 27.5 :  22.5) Stockfish 131113 64 Sz 1CPU    3018
                                          50.0 ( 26.0 :  24.0) Stockfish DD 64 SSE4.2 1CPU    3014
                                          50.0 ( 30.5 :  19.5) Komodo TCEC 64-bit 1CPU        2996
                                          50.0 ( 32.5 :  17.5) Houdini 3 x64 1CPU             2996
                                          50.0 ( 33.0 :  17.0) Komodo 6 64-bit 1CPU           2985
                                          50.0 ( 30.5 :  19.5) Komodo CCT 64-bit              2960
                                          50.0 ( 29.0 :  21.0) Stockfish 4 64 SSE4.2 1CPU     2949
   2 Houdini 4 x64 1CPU             3021 350.0 (194.5 : 155.5)
                                          50.0 ( 24.5 :  25.5) Stockfish DD 64 SSE42 Sz 1CPU  3018
                                          50.0 ( 27.0 :  23.0) Stockfish DD 64 SSE4.2 1CPU    3014
                                          50.0 ( 27.0 :  23.0) Komodo TCEC 64-bit 1CPU        2996
                                          50.0 ( 27.0 :  23.0) Houdini 3 x64 1CPU             2996
                                          50.0 ( 25.5 :  24.5) Komodo 6 64-bit 1CPU           2985
                                          50.0 ( 31.5 :  18.5) Komodo CCT 64-bit              2960
                                          50.0 ( 32.0 :  18.0) Stockfish 4 64 SSE4.2 1CPU     2949
   3 Stockfish DD 64 SSE42 Sz 1CPU  3018 350.0 (190.0 : 160.0)
                                          50.0 ( 24.0 :  26.0) Houdini 4 x64 sz 1CPU          3045
                                          50.0 ( 25.5 :  24.5) Houdini 4 x64 1CPU             3021
                                          50.0 ( 27.0 :  23.0) Komodo TCEC 64-bit 1CPU        2996
                                          50.0 ( 28.5 :  21.5) Houdini 3 x64 1CPU             2996
                                          50.0 ( 25.5 :  24.5) Komodo 6 64-bit 1CPU           2985
                                          50.0 ( 29.0 :  21.0) Komodo CCT 64-bit              2960
                                          50.0 ( 30.5 :  19.5) Stockfish 4 64 SSE4.2 1CPU     2949
   4 Stockfish 131113 64 Sz 1CPU    3018 250.0 (138.0 : 112.0)
                                          50.0 ( 22.5 :  27.5) Houdini 4 x64 sz 1CPU          3045
                                          50.0 ( 29.0 :  21.0) Houdini 3 x64 1CPU             2996
                                          50.0 ( 28.0 :  22.0) Komodo 6 64-bit 1CPU           2985
                                          50.0 ( 29.0 :  21.0) Komodo CCT 64-bit              2960
                                          50.0 ( 29.5 :  20.5) Stockfish 4 64 SSE4.2 1CPU     2949
   5 Stockfish DD 64 SSE4.2 1CPU    3014 350.0 (187.5 : 162.5)
                                          50.0 ( 24.0 :  26.0) Houdini 4 x64 sz 1CPU          3045
                                          50.0 ( 23.0 :  27.0) Houdini 4 x64 1CPU             3021
                                          50.0 ( 25.5 :  24.5) Komodo TCEC 64-bit 1CPU        2996
                                          50.0 ( 24.0 :  26.0) Houdini 3 x64 1CPU             2996
                                          50.0 ( 30.5 :  19.5) Komodo 6 64-bit 1CPU           2985
                                          50.0 ( 29.0 :  21.0) Komodo CCT 64-bit              2960
                                          50.0 ( 31.5 :  18.5) Stockfish 4 64 SSE4.2 1CPU     2949
   6 Komodo TCEC 64-bit 1CPU        2996 300.0 (144.0 : 156.0)
                                          50.0 ( 19.5 :  30.5) Houdini 4 x64 sz 1CPU          3045
                                          50.0 ( 23.0 :  27.0) Houdini 4 x64 1CPU             3021
                                          50.0 ( 23.0 :  27.0) Stockfish DD 64 SSE42 Sz 1CPU  3018
                                          50.0 ( 24.5 :  25.5) Stockfish DD 64 SSE4.2 1CPU    3014
                                          50.0 ( 25.0 :  25.0) Houdini 3 x64 1CPU             2996
                                          50.0 ( 29.0 :  21.0) Stockfish 4 64 SSE4.2 1CPU     2949
   7 Houdini 3 x64 1CPU             2996 450.0 (221.0 : 229.0)
                                          50.0 ( 17.5 :  32.5) Houdini 4 x64 sz 1CPU          3045
                                          50.0 ( 23.0 :  27.0) Houdini 4 x64 1CPU             3021
                                          50.0 ( 21.5 :  28.5) Stockfish DD 64 SSE42 Sz 1CPU  3018
                                          50.0 ( 21.0 :  29.0) Stockfish 131113 64 Sz 1CPU    3018
                                          50.0 ( 26.0 :  24.0) Stockfish DD 64 SSE4.2 1CPU    3014
                                          50.0 ( 25.0 :  25.0) Komodo TCEC 64-bit 1CPU        2996
                                          50.0 ( 28.0 :  22.0) Komodo 6 64-bit 1CPU           2985
                                          50.0 ( 29.5 :  20.5) Komodo CCT 64-bit              2960
                                          50.0 ( 29.5 :  20.5) Stockfish 4 64 SSE4.2 1CPU     2949
   8 Komodo 6 64-bit 1CPU           2985 400.0 (188.0 : 212.0)
                                          50.0 ( 17.0 :  33.0) Houdini 4 x64 sz 1CPU          3045
                                          50.0 ( 24.5 :  25.5) Houdini 4 x64 1CPU             3021
                                          50.0 ( 24.5 :  25.5) Stockfish DD 64 SSE42 Sz 1CPU  3018
                                          50.0 ( 22.0 :  28.0) Stockfish 131113 64 Sz 1CPU    3018
                                          50.0 ( 19.5 :  30.5) Stockfish DD 64 SSE4.2 1CPU    3014
                                          50.0 ( 22.0 :  28.0) Houdini 3 x64 1CPU             2996
                                          50.0 ( 26.0 :  24.0) Komodo CCT 64-bit              2960
                                          50.0 ( 32.5 :  17.5) Stockfish 4 64 SSE4.2 1CPU     2949
   9 Komodo CCT 64-bit              2960 400.0 (170.5 : 229.5)
                                          50.0 ( 19.5 :  30.5) Houdini 4 x64 sz 1CPU          3045
                                          50.0 ( 18.5 :  31.5) Houdini 4 x64 1CPU             3021
                                          50.0 ( 21.0 :  29.0) Stockfish DD 64 SSE42 Sz 1CPU  3018
                                          50.0 ( 21.0 :  29.0) Stockfish 131113 64 Sz 1CPU    3018
                                          50.0 ( 21.0 :  29.0) Stockfish DD 64 SSE4.2 1CPU    3014
                                          50.0 ( 20.5 :  29.5) Houdini 3 x64 1CPU             2996
                                          50.0 ( 24.0 :  26.0) Komodo 6 64-bit 1CPU           2985
                                          50.0 ( 25.0 :  25.0) Stockfish 4 64 SSE4.2 1CPU     2949
  10 Stockfish 4 64 SSE4.2 1CPU     2949 450.0 (181.5 : 268.5)
                                          50.0 ( 21.0 :  29.0) Houdini 4 x64 sz 1CPU          3045
                                          50.0 ( 18.0 :  32.0) Houdini 4 x64 1CPU             3021
                                          50.0 ( 19.5 :  30.5) Stockfish DD 64 SSE42 Sz 1CPU  3018
                                          50.0 ( 20.5 :  29.5) Stockfish 131113 64 Sz 1CPU    3018
                                          50.0 ( 18.5 :  31.5) Stockfish DD 64 SSE4.2 1CPU    3014
                                          50.0 ( 21.0 :  29.0) Komodo TCEC 64-bit 1CPU        2996
                                          50.0 ( 20.5 :  29.5) Houdini 3 x64 1CPU             2996
                                          50.0 ( 17.5 :  32.5) Komodo 6 64-bit 1CPU           2985
                                          50.0 ( 25.0 :  25.0) Komodo CCT 64-bit              2960

It certainly seems that H4 gains more from the syzygy bases - my conclusion/guess is that H4 has been optimised for them, i.e. I assume that knowledge that is no longer required for the endgame has been removed speeding up the program.

Ignoring ratings, for analysis I would love to see a UCI option in all engines that allows us to turn the knowledge on.

It's a shame that knowledge seems to hurt performance on average, although in individual positions it can make the difference between the engine appearing dumb or intelligent.

Anyway, I will be repeating most of the pairings for the 40/40 list - let's see if 10x the think time alters the rankings.

Shaun

* Nunn v2: 25 positions with colours reversed (50 games). For CCRL, 3 (6) games from each pairing have to be removed as they go beyond 12 moves.

P.S. If anyone wants the games, let me know and I will provide a link.
ouachita
Posts: 454
Joined: Tue Jan 15, 2013 4:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by ouachita »

Shaun wrote:I have been running some 1CPU tests using the Nunn v2 suite*
Shaun - it's early here and I'm probably just still asleep and not seeing it, but would you list the specs and settings for this test: CPU, speed, TC, engine and GUI settings, etc.?
SIM, PhD, MBA, PE
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by Houdini »

Thanks for the test!
Three comments.

1) It certainly seems that H4 gains more from the syzygy bases...
The error levels in your ratings are 25 to 35 Elo, which is very large compared to the observed rating differences.
This makes all conclusions very tentative - your guess that engine A gains more from Syzygy than engine B involves a combination of 4 ratings, with a combined error level of 50 to 70 Elo.

2) "I assume that knowledge that is no longer required for the endgame has been removed speeding up the program. "
Based on a test with high error margins you entertain some wild speculation, which of course is incorrect.

3) If you plan on playing only with Houdini/Stockfish/Komodo you'd better use Contempt 0 for Houdini, it provides the highest level of play against more or less equal opponents.

Cheers,
Robert
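
To put numbers on point 1: the "Syzygy gain" comparison is a difference of two differences, so four error margins feed into it. A minimal sketch, assuming the four margins are independent and add in quadrature (a simplification, since all four ratings come from the same pool of games), using the Elo and +/- values from the rating table in the first post:

```python
import math

# (Elo, +/- margin) copied from the rating table in the first post
h4_sz = (3045, 27)   # Houdini 4 x64 sz 1CPU
h4 = (3021, 29)      # Houdini 4 x64 1CPU
sf_sz = (3018, 29)   # Stockfish DD 64 SSE42 Sz 1CPU
sf = (3014, 29)      # Stockfish DD 64 SSE4.2 1CPU


def gain(with_tb, without_tb):
    """Rating gain from enabling tablebases for one engine."""
    return with_tb[0] - without_tb[0]


# How much more Houdini gains than Stockfish, in Elo
gain_difference = gain(h4_sz, h4) - gain(sf_sz, sf)

# Independent margins combined in quadrature (assumption, see above)
combined_margin = math.sqrt(sum(m ** 2 for _, m in (h4_sz, h4, sf_sz, sf)))

print(f"extra gain for H4: {gain_difference} Elo, "
      f"combined margin about {combined_margin:.0f} Elo")
```

The extra gain works out to 20 Elo against a combined margin of about 57 Elo, which sits inside the 50 to 70 range quoted above.
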
Shaun
Posts: 322
Joined: Wed Mar 08, 2006 9:55 pm
Location: Brighton - UK

Re: CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by Shaun »

ouachita wrote:
Shaun wrote:I have been running some 1CPU tests using the Nunn v2 suite*
Shaun - it's early here and I'm probably just still asleep and not seeing it, but would you list the specs and settings for this test: CPU, speed, TC, engine and GUI settings, etc.?
Adjusted time control for CCRL 40/4: we run a benchmark on our different machines and adjust the time control to match 40/4 on an AMD X2 4600+. On this laptop it works out at 40 moves in 2 minutes (rounded).
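
The adjustment described here amounts to dividing the reference time by how much faster the test machine is than the reference machine. A minimal sketch of that arithmetic, where the speed ratio of 2.0 is only inferred from the "40 moves in 2 minutes" figure and is not a measured benchmark result, and `scale_time_control` is a hypothetical helper:

```python
def scale_time_control(reference_minutes: float, speed_ratio: float) -> float:
    """Minutes per 40 moves on a machine `speed_ratio` times faster
    than the reference machine."""
    if speed_ratio <= 0:
        raise ValueError("speed_ratio must be positive")
    return reference_minutes / speed_ratio


reference_minutes = 4.0  # CCRL 40/4 on the AMD X2 4600+ reference machine
speed_ratio = 2.0        # hypothetical: laptop roughly twice as fast

adjusted = scale_time_control(reference_minutes, speed_ratio)
print(f"40 moves in {adjusted:g} minutes")
```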

Laptop i5 3210M (16GB Ram / Samsung 840 EVO 1TB SSD)

Fritz 13 GUI
256MB hash for each engine
5 men Nalimov and 6 men syzygy on SSD
Engine defaults - threads to 1
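
For reference, the hash and thread settings above correspond to plain UCI `setoption` commands that a GUI sends to the engine. A small sketch that builds those strings: `Hash` is a standard UCI option name, while `Threads` is assumed here as the name these engines use. Tablebase path options are left out because their names differ between engines.

```python
def uci_setoption(name: str, value) -> str:
    """Format a UCI setoption command line."""
    return f"setoption name {name} value {value}"


# Settings from the post: 256MB hash, one thread (option names assumed)
settings = {"Hash": 256, "Threads": 1}

commands = [uci_setoption(name, value) for name, value in settings.items()]
for command in commands:
    print(command)
```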

Shaun
Shaun
Posts: 322
Joined: Wed Mar 08, 2006 9:55 pm
Location: Brighton - UK

Re: CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by Shaun »

Houdini wrote:Thanks for the test!
Three comments.

1) It certainly seems that H4 gains more from the syzygy bases...
The error levels in your ratings are 25 to 35 Elo, which is very large compared to the observed rating differences.
This makes all conclusions very tentative - your guess that engine A gains more from Syzygy than engine B involves a combination of 4 ratings, with a combined error level of 50 to 70 Elo.
Agreed, more games are required, but it is looking that way currently.
Houdini wrote:
2) "I assume that knowledge that is no longer required for the endgame has been removed speeding up the program. "
Based on a test with high error margins you entertain some wild speculation, which of course is incorrect.
Does this mean you don't test with the TBs by default? And I agree it is 'speculation'.
Houdini wrote:
3) If you plan on playing only with Houdini/Stockfish/Komodo you'd better use Contempt 0 for Houdini, it provides the highest level of play against more or less equal opponents.
I will probably test this - but I do plan to run additional opponents
Houdini wrote:
Cheers,
Robert
Anyway, congrats: Houdini 4 is certainly doing well, although with the improvements in Stockfish and Komodo you will not find it easy to stay ahead - exciting times.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by lkaufman »

Houdini wrote:Thanks for the test!
Three comments.

1) It certainly seems that H4 gains more from the syzygy bases...
The error levels in your ratings are 25 to 35 Elo, which is very large compared to the observed rating differences.
This makes all conclusions very tentative - your guess that engine A gains more from Syzygy than engine B involves a combination of 4 ratings, with a combined error level of 50 to 70 Elo.

2) "I assume that knowledge that is no longer required for the endgame has been removed speeding up the program. "
Based on a test with high error margins you entertain some wild speculation, which of course is incorrect.

3) If you plan on playing only with Houdini/Stockfish/Komodo you'd better use Contempt 0 for Houdini, it provides the highest level of play against more or less equal opponents.

Cheers,
Robert
Regarding your last point, the same would apply for Komodo; if you test Houdini with Contempt 0 against these opponents, you should do the same for the same reason with Komodo. Then you have to maintain separate ratings for contempt 0 for each engine, and then what happens when another tester uses a wider range of opponents? It's a real can of worms. Maybe all engines should always be tested with zero contempt.
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by Dr.Wael Deeb »

Shaun wrote:
Houdini wrote:Thanks for the test!
Three comments.

1) It certainly seems that H4 gains more from the syzygy bases...
The error levels in your ratings are 25 to 35 Elo, which is very large compared to the observed rating differences.
This makes all conclusions very tentative - your guess that engine A gains more from Syzygy than engine B involves a combination of 4 ratings, with a combined error level of 50 to 70 Elo.
Agreed, more games are required, but it is looking that way currently.
Houdini wrote:
2) "I assume that knowledge that is no longer required for the endgame has been removed speeding up the program. "
Based on a test with high error margins you entertain some wild speculation, which of course is incorrect.
Does this mean you don't test with the TBs by default? And I agree it is 'speculation'.
Houdini wrote:
3) If you plan on playing only with Houdini/Stockfish/Komodo you'd better use Contempt 0 for Houdini, it provides the highest level of play against more or less equal opponents.
I will probably test this - but I do plan to run additional opponents
Houdini wrote:
Cheers,
Robert
Anyway, congrats: Houdini 4 is certainly doing well, although with the improvements in Stockfish and Komodo you will not find it easy to stay ahead - exciting times.
I give him maximum 2-3 months and he'll fall behind....

So breaks of one year to release a new version won't be advised nowadays....
Dr.D
No one can hit as hard as life. But it ain’t about how hard you can hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward…
shrapnel
Posts: 1339
Joined: Fri Nov 02, 2012 9:43 am
Location: New Delhi, India

Re: CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by shrapnel »

Shaun wrote: It certainly seems that H4 gains more from the syzygy bases
Not just H4, even Stockfish becomes remarkably strong when Syzygy TBs are enabled.
In fact I agree with Mr Houdart that error levels are large in your tests, but not for the reasons he thinks! :P
Actually, in my tests and online games, a certain version of Stockfish with Syzygy support enabled is EASILY and CONSISTENTLY beating Houdini 4!
And Mr Kaufman, I would STRONGLY suggest that the Komodo Team implement Syzygy support in future Komodo versions.
As someone who has been using Nalimov EGTBs for a long time, I can confidently state that Syzygy TB support remarkably increases the strength of the engine!
I'm just using the Syzygy 3-4-5 men (still downloading the 6-men) and already see a remarkable increase in the number of online games I win!
Regards
i7 5960X @ 4.1 Ghz, 64 GB G.Skill RipJaws RAM, Twin Asus ROG Strix OC 11 GB Geforce 2080 Tis
ouachita
Posts: 454
Joined: Tue Jan 15, 2013 4:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by ouachita »

Dr.:
Dr.Wael Deeb wrote:I give him maximum 2-3 months and he'll fall behind....
I assume you're referring to blitz and TC <60, wherein H4 is clearly the King. However, the tests at this and other sites at >60 TC show that the latest versions of SF have caught up with, if not surpassed, H4.
Dr.Wael Deeb wrote:So breaks of one year to release a new version won't be advised nowadays....
Pretty sure Robert concluded the same long ago.
SIM, PhD, MBA, PE
Graham Banks
Posts: 41460
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: CCRL 40/4 testing Houdini, Stockfish and Komodo

Post by Graham Banks »

Houdini wrote:....... If you plan on playing only with Houdini/Stockfish/Komodo you'd better use Contempt 0 for Houdini, it provides the highest level of play against more or less equal opponents.

Cheers,
Robert
Hi Robert,

If Shaun tests with settings other than default, the results will be reported separately. We can't combine them. Pretty sure that you realise this, but thought I'd mention it.

Graham.
gbanksnz at gmail.com