AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Fri Nov 29, 2013 9:16 pm
by Aser Huerga

i7-3930K CPUs 4.25 GHz
90'+30" TC
1 core for all engines
Ponder off
1024 Hash
3-4-5 EGTBs (when available) on SSDs
150 Early Starting Positions Suite, slightly tuned to avoid transpositions (checked with engine vs. same-engine matches) and built as a proportional representation of the most-played openings/variations in high-quality human chess tournaments of recent years (source: TWIC, 2400+ Elo players only): AH_150_Opening_Suite
All positions are played with Switched Colors for a total of 300 games per match
Games available for download, including eval/time/depth and PV for each move

Results:

1 Houdini 4             +64/-61/=175 50.50% 151.5/300
2 Stockfish_SZ 13110122 +61/-64/=175 49.50% 148.5/300

Games: Houdini 4 vs Stockfish_SZ 13110122

(Both SF and Houdini 4 are using Syzygy 3-4-5 EGTBs)
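For readers who want to check the arithmetic behind the result line, here is a minimal Python sketch. It assumes the usual logistic Elo expectation and a simple normal approximation for the error margin; the function names `elo_diff` and `match_summary` are made up for illustration and are not part of any rating tool.

```python
import math

def elo_diff(score_fraction):
    """Logistic Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

def match_summary(wins, losses, draws):
    """Points, score fraction and an approximate 95% Elo interval for one match."""
    games = wins + losses + draws
    points = wins + 0.5 * draws
    score = points / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / games
    # Normal approximation: 1.96 standard errors on the score fraction.
    margin = 1.96 * math.sqrt(var / games)
    return {
        "points": points,
        "games": games,
        "score": score,
        "elo": elo_diff(score),
        "elo_low": elo_diff(score - margin),
        "elo_high": elo_diff(score + margin),
    }

# Houdini 4's side of the match: +64/-61/=175
h4 = match_summary(wins=64, losses=61, draws=175)
print(f"{h4['points']}/{h4['games']} = {h4['score']:.2%}")
print(f"Elo diff {h4['elo']:+.1f} "
      f"(approx. 95%: {h4['elo_low']:+.1f} .. {h4['elo_high']:+.1f})")
```

The interval straddles zero, so under these assumptions a 300-game match at this draw rate cannot separate the two engines.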

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Fri Nov 29, 2013 9:21 pm
by ouachita
Aser,
is anyone performing LTC testing of H4, SF, K, etc.? I've seen the current nTEC tournament.

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Fri Nov 29, 2013 9:33 pm
by Aser Huerga
ouachita wrote:Aser,
is anyone performing LTC testing of H4, SF, K, etc.? I've seen the current nTEC tournament.
... Me? My matches are LTC tests of H4, SF, K ... In the near future I plan to run other engines too ...

Sorry, maybe I don't understand your question.

Edit: if you mean hours-per-move time controls, I don't think it would be practical to get statistically reasonable results that way.

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Fri Nov 29, 2013 9:40 pm
by majortom
Thx, Aser!

All Aser's 90'+30'' games and statistics

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Stockfish SZ 13110122          : 2969   15  15   900    54.0 %   2942   58.0 %
  2 Houdini 4 x64A                 : 2963   19  19   600    51.1 %   2956   55.5 %
  3 Komodo 6                       : 2942   15  15   900    48.8 %   2951   57.1 %
  4 Houdini 3                      : 2919   19  19   600    44.8 %   2956   54.5 %

1 Stockfish SZ 13110122     : 2969  900 (+225,=522,-153), 54.0 %

Houdini 3                     : 300 (+ 94,=159,- 47), 57.8 %
Komodo 6                      : 300 (+ 70,=188,- 42), 54.7 %
Houdini 4 x64A                : 300 (+ 61,=175,- 64), 49.5 %

2 Houdini 4 x64A            : 2963  600 (+140,=333,-127), 51.1 %

Stockfish SZ 13110122         : 300 (+ 64,=175,- 61), 50.5 %
Komodo 6                      : 300 (+ 76,=158,- 66), 51.7 %

3 Komodo 6                  : 2942  900 (+182,=514,-204), 48.8 %

Stockfish SZ 13110122         : 300 (+ 42,=188,- 70), 45.3 %
Houdini 3                     : 300 (+ 74,=168,- 58), 52.7 %
Houdini 4 x64A                : 300 (+ 66,=158,- 76), 48.3 %

4 Houdini 3                 : 2919  600 (+105,=327,-168), 44.8 %

Stockfish SZ 13110122         : 300 (+ 47,=159,- 94), 42.2 %
Komodo 6                      : 300 (+ 58,=168,- 74), 47.3 %
http://treu.ru/pgn/AH_LTC_all_games.7z
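The totals, score and draw percentages in the list can be rebuilt from the per-opponent rows. The sketch below does only that bookkeeping in Python; it does not reproduce the Elo column, since the thread does not say which rating tool produced it. The names `RESULTS`, `totals` and `mirrored` are illustrative.

```python
# (wins, draws, losses) per opponent, copied from the crosstable above.
RESULTS = {
    "Stockfish SZ 13110122": {
        "Houdini 3": (94, 159, 47),
        "Komodo 6": (70, 188, 42),
        "Houdini 4 x64A": (61, 175, 64),
    },
    "Houdini 4 x64A": {
        "Stockfish SZ 13110122": (64, 175, 61),
        "Komodo 6": (76, 158, 66),
    },
    "Komodo 6": {
        "Stockfish SZ 13110122": (42, 188, 70),
        "Houdini 3": (74, 168, 58),
        "Houdini 4 x64A": (66, 158, 76),
    },
    "Houdini 3": {
        "Stockfish SZ 13110122": (47, 159, 94),
        "Komodo 6": (58, 168, 74),
    },
}

def totals(engine):
    """Sum wins, draws and losses over all of an engine's matches."""
    rows = list(RESULTS[engine].values())
    return tuple(sum(column) for column in zip(*rows))

def mirrored(results):
    """Check that A's record against B is B's record against A, reversed."""
    for a, opponents in results.items():
        for b, (w, d, l) in opponents.items():
            if results[b][a] != (l, d, w):
                return False
    return True

for engine in RESULTS:
    w, d, l = totals(engine)
    games = w + d + l
    score_pct = 100.0 * (w + 0.5 * d) / games
    draw_pct = 100.0 * d / games
    print(f"{engine:<24} {games:4d} (+{w},={d},-{l}) "
          f"score {score_pct:.1f}%  draws {draw_pct:.1f}%")
print("mirrored:", mirrored(RESULTS))
```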

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Fri Nov 29, 2013 9:58 pm
by ouachita
900 games of 90+30 is a good start. How about 90+30 on a 16 or 24 core machine? Where can I find Stockfish SZ 13110122?
majortom wrote:Thx, Aser!


Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Fri Nov 29, 2013 10:07 pm
by Aser Huerga
ouachita wrote:900 games of 90+30 is a good start. How about 90+30 on a 16 or 24 core machine?
Fantastic! The only problem is that I lack 16 or 24 core machines :D and it would take 6 times as long to get the same number of games.
ouachita wrote: Where can I find Stockfish SZ 13110122?
http://abrok.eu/stockfish_syzygy/808517 ... _sse42.exe

And all other SZ dev. versions: http://abrok.eu/stockfish_syzygy/

Thanks Andrey for the statistics!

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Fri Nov 29, 2013 10:12 pm
by ouachita
How long from start to finish did the 900-game 90+30 test take?

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Fri Nov 29, 2013 10:16 pm
by Aser Huerga
ouachita wrote:How long from start to finish did the 900, 90+30 test take?
Each match takes me around 36-48 hours. For non-top engines it will take longer to get results, because I plan to run those tests only at night (I'm a correspondence chess player too).

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Fri Nov 29, 2013 11:00 pm
by ouachita
Well, if I knew how to set it up properly, I could run an H4, SF (whatever version) and K 1142 (when released) 40/120 or 40/160 test on my 16 core machine.

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Posted: Sat Nov 30, 2013 5:18 pm
by lkaufman
Aser Huerga wrote: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Assuming you plan to continue to run matches at this level among top engines (as they are released), why not have a rating list at this level based on your games? It isn't much work I think. Actually anyone could make this list based on your posts, but it makes more sense for you to do it. I think the only major issue is whether to use BayesElo or Ordo for ratings; I personally would use Ordo as there are no decisions to make regarding parameters and the ratings for a match always correspond with the ratings you get with the standard Elo/FIDE formula.
This would be the only standard time control list that is even close to being up to date. I think it would fill an important void. True, it might be limited to new versions of SF, Houdini, and Komodo unless another unrelated engine is even close to competitive, but those are the three that most people are interested in I suppose. As long as you play each new version against the two best unrelated versions, each rating would be based on at least 600 games.
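As a rough illustration of the point about a match rating corresponding to the standard formula, the Python sketch below uses the logistic form of the Elo expectation and its inverse. This is an assumption about which curve is meant: it is not Ordo's or BayesElo's actual code, and the function names are invented here. The two ratings are taken from the list posted earlier in the thread.

```python
import math

def expected_score(rating_a, rating_b):
    """Expected score fraction of A against B under the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def rating_gap(score_fraction):
    """Inverse of expected_score: rating difference implied by a score fraction."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

# Ratings from the list earlier in the thread.
stockfish, houdini4 = 2969, 2963

p = expected_score(stockfish, houdini4)
print(f"expected score for the higher-rated engine: {p:.4f}")
print(f"expected points over 300 games: {300 * p:.1f}")

# Round trip: for a single match, the gap recovered from the score
# equals the rating difference that was fed in.
print(f"gap recovered from that score: {rating_gap(p):.1f} Elo")
```

With only two engines and one match, the rating gap is fixed entirely by the match score, which is the sense in which no parameter choices are involved.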
