Houdini-4 Syzygy vs Stockfish_Syzygy_14012021 Dev

RJN
Posts: 303
Joined: Fri Jun 21, 2013 5:18 am
Location: Orion Spiral Arm

Post by RJN »

So far in my "Big 3" EGTB-OnOff tournament, H4 Syzygy has been doing the best. However, the competition is against various instances of SF-DD, which according to some claims is about 30 Elo weaker than the latest SF dev version.

Link to the Big-3 EGTB-OnOff 10 engine tournament:

http://talkchess.com/forum/viewtopic.ph ... =&start=20

I wanted to see how the latest SF dev version does. Conditions are basically the same as the above tournament, except now I am using Graham's opening book. The tournament above used no book.

Current Tournament Settings

Two cores per engine
Hash=512
Ponder Off
Contempt 0 for engines
HT off, SF "idle threads sleep"=true
6-piece Syzygy, all on SSD
Graham2014-1F opening book, randomly selected, openings repeated by both sides. (Thanks Graham!)

TC: 40 moves in 5 minutes, repeating.

Results so far are below. It's early (please no comments that not enough games yet, which is obvious), but SF is off to a good start. The PGN will be posted when more games have been played. The SF version is Ronald de Man's fork from January 20, 2014 (the latest version as of this post).

Code:

60 games

1/22/2014 6:28:34 AM :

    Program                            Elo    +   -   Games   Score   Av.Op.  Draws

  1 Stockfish_Syzygy_14012021-SSE42 :   26   53  51    60    57.5 %    -26   65.0 %
  2 Houdini_4_Pro_x64B-Syzygy :        -26   51  53    60    42.5 %     26   65.0 %
ouachita
Posts: 454
Joined: Tue Jan 15, 2013 4:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: Houdini-4 Syzygy vs Stockfish_Syzygy_14012021 Dev

Post by ouachita »

RJN wrote: please no comments that not enough games yet, which is obvious
Couldn't help but laugh at this, because it's so true: no one EVER has enough games, particularly those who don't like the results!
SIM, PhD, MBA, PE
RJN
Posts: 303
Joined: Fri Jun 21, 2013 5:18 am
Location: Orion Spiral Arm

Re: Houdini-4 Syzygy vs Stockfish_Syzygy_14012021 Dev

Post by RJN »

Indeed! Not enough games, or the TC is too short ("low quality" games), or fill-in-the-blank...

Anyway, all we can do is keep plugging away; an update:

100 Games, PGN link

https://www.sugarsync.com/pf/D0094094_75971673_810554

Code:

100 games

1/22/2014 7:08:04 PM :

    Program                             Elo    +   -   Games   Score   Av.Op.  Draws

  1 Stockfish_Syzygy_14012021-SSE42 :    19   42  41   100    55.5 %    -19   63.0 %
  2 Houdini_4_Pro_x64B-Syzygy :         -19   41  42   100    44.5 %     19   63.0 %

  
Stockfish_Syzygy_14012021-SSE42:         19  100 (+ 24,= 63,- 13), 55.5 %
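
As a sanity check on the numbers above, the score percentage and the Elo gap can be recomputed from the win/draw/loss counts. This is only a rough sketch: it assumes the textbook logistic Elo formula, which is not necessarily what the rating tool behind the table does internally, and the function names are made up for illustration.

```python
import math

def score_fraction(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def elo_difference(score):
    """Elo gap implied by a score fraction under the logistic Elo model.

    Assumes 0 < score < 1. This is the textbook formula, not necessarily
    the method used by the rating tool that produced the table.
    """
    return -400.0 * math.log10(1.0 / score - 1.0)

# W/D/L for Stockfish after 100 games, taken from the summary line above.
sf_score = score_fraction(24, 63, 13)
sf_gap = elo_difference(sf_score)

print(f"score: {sf_score:.1%}")
print(f"Elo gap: {sf_gap:.0f}")
```

With +24 =63 -13 this gives 55.5 % and a gap of roughly 38 Elo, which fits the table if the two ratings are centered on zero (+19 and -19).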
Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: Houdini-4 Syzygy vs Stockfish_Syzygy_14012021 Dev

Post by Ozymandias »

Not related to your test, but could you include a "Houdini_4_Pro_x64B-Syzygy-nobook" participant? That way we would see how the candle is burning from both ends ;)
RJN
Posts: 303
Joined: Fri Jun 21, 2013 5:18 am
Location: Orion Spiral Arm

Re: Houdini-4 Syzygy vs Stockfish_Syzygy_14012021 Dev

Post by RJN »

No book for all the engines in the (paused) tournament linked to below, just several EGTB formats, on and off. However, it was running various instances of SF-DD, and not the latest dev version:

http://talkchess.com/forum/viewtopic.ph ... =&start=20

When I'm done with this one in a week or so, I can possibly add the latest SF using no book to that established tournament, giving the Dev version a wider field of competitors.
Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: Houdini-4 Syzygy vs Stockfish_Syzygy_14012021 Dev

Post by Ozymandias »

I think my metaphor was lost there. The candle is the amount of Elo an engine can offer: as we reduce the number of moves the engines have to play themselves, the relevance of engines in computer chess shrinks too.
Your other test quantifies the amount of Elo we get from EGTB; in this one you add a book for ALL participants. If you include the participant I was suggesting in THIS test, we could also see the amount of Elo we get from the book. Just for fun.