AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Discussion of computer chess matches and engine tournaments.

Moderators: bob, hgm, Harvey Williamson

Aser Huerga
Posts: 812
Joined: Tue Jun 16, 2009 8:09 am
Location: Spain

AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by Aser Huerga » Fri Nov 29, 2013 9:16 pm

AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

i7-3930K CPUs 4.25 GHz
90'+30" TC
1 core for all engines
Ponder off
1024 Hash
3-4-5 EGTBs (when available) on SSDs
150 Early Starting Positions Suite, slightly tuned to avoid transpositions (checked with engine vs same-engine matches), and created as a proportional representation of the most played openings/variations in recent years in high-quality human chess tournaments (source: TWIC, only 2400+ Elo players): AH_150_Opening_Suite
All positions are played with Switched Colors for a total of 300 games per match
Games available for download, including eval/time/depth and PV for each move

Results:

1 Houdini 4             +64/-61/=175 50.50% 151.5/300
2 Stockfish_SZ 13110122 +61/-64/=175 49.50% 148.5/300

Games: Houdini 4 vs Stockfish_SZ 13110122

(Both SF and Houdini 4 are using Syzygy 3-4-5 EGTBs)
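
A note on reading the result above: with 300 games, a 50.5% score carries a sizeable statistical margin. The following is a minimal sketch of one common way to estimate it, assuming a normal approximation over the per-game outcomes (1, 0.5 or 0); the function name `score_interval` and the 95% factor `z=1.96` are illustrative choices, not part of the original test setup.

```python
import math

def score_interval(wins, draws, losses, z=1.96):
    """Return (score, margin) for a match result, both as fractions of 1.

    Assumption: games are treated as independent samples and the margin
    uses a normal approximation (z=1.96 for roughly 95% confidence).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Mean of the squared per-game outcome: 1 for a win, 0.25 for a draw.
    mean_sq = (wins + 0.25 * draws) / n
    variance = mean_sq - score ** 2
    margin = z * math.sqrt(variance / n)
    return score, margin

# Houdini 4's result from the table above: +64 / -61 / =175
score, margin = score_interval(wins=64, draws=175, losses=61)
print(f"Houdini 4: {score:.2%} +/- {margin:.2%}")
```

Under these assumptions the margin comes out at about 3.65 percentage points, so a 50% score lies well inside the interval and the match does not separate the two engines.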

ouachita
Posts: 454
Joined: Tue Jan 15, 2013 3:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by ouachita » Fri Nov 29, 2013 9:21 pm

Aser,
is anyone performing LTC testing of H4, SF, K, etc.? I've seen the current nTEC tournament.
SIM, PhD, MBA, PE

Aser Huerga
Posts: 812
Joined: Tue Jun 16, 2009 8:09 am
Location: Spain

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by Aser Huerga » Fri Nov 29, 2013 9:33 pm

ouachita wrote:Aser,
is anyone performing LTC testing of H4, SF, K, etc.? I've seen the current nTEC tournament.
Me? My matches are LTC tests of H4, SF and K. In the near future I plan to run other engines too.

Sorry, maybe I don't understand your question.

Edit: if you mean time controls of hours per move, I don't think it would be practical to get statistically meaningful results that way.
Last edited by Aser Huerga on Fri Nov 29, 2013 9:45 pm, edited 2 times in total.

majortom
Posts: 669
Joined: Mon Nov 04, 2013 9:19 pm

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by majortom » Fri Nov 29, 2013 9:40 pm

Thx, Aser!

All of Aser's 90'+30" games and statistics

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Stockfish SZ 13110122          : 2969   15  15   900    54.0 %   2942   58.0 %
  2 Houdini 4 x64A                 : 2963   19  19   600    51.1 %   2956   55.5 %
  3 Komodo 6                       : 2942   15  15   900    48.8 %   2951   57.1 %
  4 Houdini 3                      : 2919   19  19   600    44.8 %   2956   54.5 %

1 Stockfish SZ 13110122     : 2969  900 (+225,=522,-153), 54.0 %

Houdini 3                     : 300 (+ 94,=159,- 47), 57.8 %
Komodo 6                      : 300 (+ 70,=188,- 42), 54.7 %
Houdini 4 x64A                : 300 (+ 61,=175,- 64), 49.5 %

2 Houdini 4 x64A            : 2963  600 (+140,=333,-127), 51.1 %

Stockfish SZ 13110122         : 300 (+ 64,=175,- 61), 50.5 %
Komodo 6                      : 300 (+ 76,=158,- 66), 51.7 %

3 Komodo 6                  : 2942  900 (+182,=514,-204), 48.8 %

Stockfish SZ 13110122         : 300 (+ 42,=188,- 70), 45.3 %
Houdini 3                     : 300 (+ 74,=168,- 58), 52.7 %
Houdini 4 x64A                : 300 (+ 66,=158,- 76), 48.3 %

4 Houdini 3                 : 2919  600 (+105,=327,-168), 44.8 %

Stockfish SZ 13110122         : 300 (+ 47,=159,- 94), 42.2 %
Komodo 6                      : 300 (+ 58,=168,- 74), 47.3 %
http://treu.ru/pgn/AH_LTC_all_games.7z
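
For readers who want to check how the summary rows follow from the head-to-head lines, here is a small sketch that adds up the Stockfish SZ 13110122 pairings listed above. The dictionary layout and the helper name `totals` are illustrative only; the numbers are copied from the crosstable.

```python
# Head-to-head results for Stockfish SZ 13110122, copied from the crosstable
# above as (wins, draws, losses) from Stockfish's point of view.
pairings = {
    "Houdini 3": (94, 159, 47),
    "Komodo 6": (70, 188, 42),
    "Houdini 4 x64A": (61, 175, 64),
}

def totals(results):
    """Sum wins, draws and losses over all pairings."""
    wins = sum(w for w, _, _ in results)
    draws = sum(d for _, d, _ in results)
    losses = sum(l for _, _, l in results)
    return wins, draws, losses

wins, draws, losses = totals(pairings.values())
games = wins + draws + losses
score = (wins + 0.5 * draws) / games   # draws count as half a point
draw_rate = draws / games

print(f"+{wins} ={draws} -{losses}  score {score:.1%}  draws {draw_rate:.1%}")
```

The totals reproduce the +225 =522 -153 line, the 54.0 % score and the 58.0 % draw rate shown for Stockfish in the summary table.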

ouachita
Posts: 454
Joined: Tue Jan 15, 2013 3:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by ouachita » Fri Nov 29, 2013 9:58 pm

900 games of 90+30 is a good start. How about 90+30 on a 16 or 24 core machine? Where can I find Stockfish SZ 13110122?

Aser Huerga
Posts: 812
Joined: Tue Jun 16, 2009 8:09 am
Location: Spain

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by Aser Huerga » Fri Nov 29, 2013 10:07 pm

ouachita wrote:900 games of 90+30 is a good start. How about 90+30 on a 16 or 24 core machine?
Fantastic! The only problem is that I lack 16- or 24-core machines :D and it would take 6 times as long to get the same number of games.
ouachita wrote: Where can I find Stockfish SZ 13110122?
http://abrok.eu/stockfish_syzygy/808517 ... _sse42.exe

And all other SZ dev. versions: http://abrok.eu/stockfish_syzygy/

Thanks Andrey for the statistics!
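
On the "6 times as long" remark: the post does not spell out the reasoning, but one plausible reading is that a 6-core i7-3930K can run six single-core games side by side, while a game with engines using all cores occupies the whole machine. A rough sketch under that assumption (the helper `concurrent_games` and the one-game-per-free-core scheduling are hypothetical, not the poster's stated setup):

```python
def concurrent_games(physical_cores, threads_per_engine):
    """Estimate how many games fit on one machine at the same time.

    Assumption: with ponder off only one engine thinks at a time, so a
    game needs about as many cores as a single engine uses.
    """
    return max(1, physical_cores // threads_per_engine)

single_core = concurrent_games(physical_cores=6, threads_per_engine=1)
all_cores = concurrent_games(physical_cores=6, threads_per_engine=6)

# Relative wall-clock time for the same number of games.
slowdown = single_core / all_cores
print(f"{single_core} vs {all_cores} concurrent games -> x{slowdown:.0f} time")
```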

ouachita
Posts: 454
Joined: Tue Jan 15, 2013 3:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by ouachita » Fri Nov 29, 2013 10:12 pm

How long from start to finish did the 900, 90+30 test take?

Aser Huerga
Posts: 812
Joined: Tue Jun 16, 2009 8:09 am
Location: Spain

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by Aser Huerga » Fri Nov 29, 2013 10:16 pm

ouachita wrote:How long from start to finish did the 900, 90+30 test take?
Each match takes me around 36-48 hours. For non-top engines it will take me longer to get results because I plan to run those tests only at night (I'm a correspondence chess player too).

ouachita
Posts: 454
Joined: Tue Jan 15, 2013 3:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by ouachita » Fri Nov 29, 2013 11:00 pm

Well, if I knew how to set it up properly, I could run an H4, SF (whatever version) and K 1142 (when released) 40/120 or 40/160 test on my 16-core machine.

lkaufman
Posts: 4079
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Re: AH_LTC Match: Houdini 4 vs Stockfish_SZ 13110122

Post by lkaufman » Sat Nov 30, 2013 5:18 pm

Assuming you plan to continue to run matches at this level among top engines (as they are released), why not have a rating list at this level based on your games? It isn't much work I think. Actually anyone could make this list based on your posts, but it makes more sense for you to do it. I think the only major issue is whether to use BayesElo or Ordo for ratings; I personally would use Ordo as there are no decisions to make regarding parameters and the ratings for a match always correspond with the ratings you get with the standard Elo/FIDE formula.
This would be the only standard time control list that is even close to being up to date. I think it would fill an important void. True, it might be limited to new versions of SF, Houdini, and Komodo unless another unrelated engine is even close to competitive, but those are the three that most people are interested in I suppose. As long as you play each new version against the two best unrelated versions, each rating would be based on at least 600 games.
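
To illustrate what "corresponds with the standard Elo formula" means for a single match, here is a minimal sketch using the commonly quoted logistic form of the Elo expectation. Treating this as the exact curve Ordo applies is an assumption made here, and FIDE's own conversion tables differ slightly from it; the function names are illustrative.

```python
import math

def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_difference(score):
    """Rating difference implied by a match score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

# Houdini 4 scored 151.5/300 against Stockfish_SZ 13110122.
diff = elo_difference(151.5 / 300)
print(f"Implied difference: {diff:+.2f} Elo")

# Feeding the difference back in recovers the match score.
print(f"Expected score at that difference: {expected_score(diff, 0.0):.3f}")
```

With this formula a 50.5% score maps to a difference of roughly 3.5 Elo, and the mapping is invertible, so a two-engine rating list built this way reproduces the match score exactly.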