Houdini beats Rybka4 with 57 – 43 %

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

notyetagm
Posts: 253
Joined: Mon Jan 25, 2010 3:11 am

Re: Houdini beats Rybka4 with 57 – 43 %

Post by notyetagm »

Dr.Wael Deeb wrote:
notyetagm wrote: Where can I find this Houdini 1.01 engine?

Thanks
http://talkchess.com/forum/viewtopic.ph ... 56&start=0
Thanks, Dr. D.

:-)
ozziejoe
Posts: 811
Joined: Wed Mar 08, 2006 10:07 pm

Re: Houdini beats Rybka4 with 57 – 43 %

Post by ozziejoe »

Using the Silver test suite at 4 min + 2 sec on decent dual-core hardware (3059 MHz), I have so far:

Houdini      15.5
Rybka 3      11.5
Stockfish    11
Bright 5c    2 (Bright does not seem to use enough time)
Albert Silver
Posts: 3026
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Houdini beats Rybka4 with 57 – 43 %

Post by Albert Silver »

Dann Corbit wrote:
beram wrote: Stunning results so far in my private engine matches for Houdini 1.01 32-bit against Deep Rybka 4 32-bit. Houdini so far beats Rybka at blitz and at LTC, with 57 – 43 %!! And that's approx. a 56 ELO difference.

Houdini 1.01 - Rybka 4

Blitz games

T8100 (6,29 Fritzmark) Blitz 4m+2s Nunn2

1   Houdini 1.01 w32 2_CPU  +17/=27/-6   61.00%   30.5/50
2   Deep Rybka 4 w32        +6/=27/-17   39.00%   19.5/50

T8100 (6,29 Fritzmark) Blitz 4m+2s Bram privat suite 1.2

1   Houdini 1.01 w32 2_CPU  +14/=30/-6   58.00%   29.0/50
2   Deep Rybka 4 w32        +6/=30/-14   42.00%   21.0/50

T8100 (6,29 Fritzmark), Blitz 4m+2s Nunn2 and Bram privat suite 1.2 - 116 games

1   Houdini 1.01 w32 2_CPU  +33/=67/-16  57.33%   66.5/116
2   Deep Rybka 4 w32        +16/=67/-33  42.67%   49.5/116

LTC games

T8100 (6,29 Fritzmark), LTC - 20m/40+10m/20+(10m+12s) Nunn2 – after 23 games

1   Houdini 1.01 w32 2_CPU  +7/=13/-3    58.70%   13.5/23
2   Deep Rybka 4 w32        +3/=13/-7    41.30%    9.5/23

Games can be provided.
Regards, Bram
I get similar results. You might check for losses on time in the PGN if the time control is very fast (e.g. 4+2"). All of these engines suffer from it, but Rybka 4 suffers the worst.
Actually, I believe the results. Using the default Time Management, at 2+0 using the SilverSuite, I also got:

1   Houdini x64 2_CPU  +29/=53/-18 55.50%   55.5/100
2   Deep Rybka 4 x64   +18/=53/-29 44.50%   44.5/100
However with my custom time management (the best one I have tested so far for fast time controls), I got:

1   Deep Rybka 4 x64   +25/=58/-17 54.00%   54.0/100
2   Houdini x64 2_CPU  +17/=58/-25 46.00%   46.0/100
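
For readers who want to turn the percentages quoted in this thread into rating differences, here is a minimal sketch. It assumes the usual logistic Elo model and a rough normal approximation for the error margin; the function names `elo_diff` and `score_interval` are made up for this illustration and do not come from any testing tool mentioned here.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))


def score_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Mean score per game plus a rough normal-approximation interval around it."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    half_width = z * math.sqrt(var / n)
    return mean, mean - half_width, mean + half_width


# W/D/L records quoted in the thread, all from Houdini's side.
matches = {
    "Blitz 4m+2s, 116 games": (33, 67, 16),
    "2+0 SilverSuite, default TM": (29, 53, 18),
    "2+0 SilverSuite, custom TM": (17, 58, 25),
}

for label, (w, d, l) in matches.items():
    mean, lo, hi = score_interval(w, d, l)
    print(f"{label}: score {mean:.2%}, Elo {elo_diff(mean):+.0f} "
          f"(about {elo_diff(lo):+.0f} to {elo_diff(hi):+.0f})")
```

Under this model a 57% score corresponds to roughly 50 Elo, and for the two 100-game 2+0 matches the approximate interval still straddles 50%, so those results are suggestive rather than conclusive.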
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
kingliveson

Re: Houdini beats Rybka4 with 57 – 43 %

Post by kingliveson »

Could you share these custom time management settings? With R4's default time management, it is not outperforming other top engines. I would like suggested settings for a few time controls: 4+2, 5+0, 15+10, and 90+30. Or perhaps a formula to derive the right time management settings.

Thanks.
Albert Silver
Posts: 3026
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Houdini beats Rybka4 with 57 – 43 %

Post by Albert Silver »

Please note that there are no universal settings AFAIK. For example, my best settings at repeating time controls, minimum speed being CCRL/CEGT Blitz, just got published at CCRL Blitz, with a huge ELO leap, but they aren't anywhere near as good at 2+0 for example.

For 2+0, no increments, my best is TC Buffer = 1, Normal Move = 72, Max Move = 115.

These were tested with single-CPU (no SSE42 or Large Pages BTW), ponder off, and 512 MB hash.
kingliveson

Re: Houdini beats Rybka4 with 57 – 43 %

Post by kingliveson »

As end users, we have fun finding optimal settings for particular play. It Should Be Absolutely Unacceptable For Independent Testers To Tweak Settings And Configurations For Any Single Engine. By doing so, you are distorting outcomes and perverting results. Are they also going to tweak Shredder, Naum, Stockfish to find optimal settings, and would there be consensus that those are actually optimal?! If they continue to tweak a single engine, or any engine for that matter, they risk losing credibility as independent testers. Please, stop and think about it. Default ought to be used always. For us, however, it should be a fun thing to play with.
Vytron on Rybka Forum wrote:
TCBuffer - How many seconds are subtracted from the time Rybka thinks she has on the clock. Useful for bullet (1+0) and blitz (3+0) games so Rybka avoids losing on time (through time stolen by the GUI or the opposing engine, like clones), and may also give generally better time management when set at 3 (the setting does nothing on incremental or repeating time controls).

TCNormal Move Time - The rate at which Rybka should play the game. Lower values will make the engine play faster, and vice versa.

TC Max Move Time - The amount of time Rybka is willing to spend in critical positions (higher values will make her think longer on such cases).
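
For anyone who wants to try the 2+0 values quoted earlier in the thread (TC Buffer 1, Normal Move 72, Max Move 115) outside a GUI dialog, here is a small sketch that formats them as UCI `setoption` commands. The command syntax is standard UCI, but the option names below are assumptions taken from the wording in this thread; the names the engine actually reports in reply to `uci` should be used instead.

```python
def setoption(name: str, value) -> str:
    """Format one UCI 'setoption' command line."""
    return f"setoption name {name} value {value}"


# Values quoted in the thread for 2+0 (no increment).
# The option names are assumptions based on the thread's wording;
# check the 'option name ...' lines the engine prints after 'uci'.
rybka_2_plus_0 = {
    "TCBuffer": 1,
    "TC Normal Move Time": 72,
    "TC Max Move Time": 115,
}

commands = [setoption(name, value) for name, value in rybka_2_plus_0.items()]
print("\n".join(commands))
```

The lines would be sent to the engine after `uci` and before `isready`, which is where a GUI normally applies engine parameters.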
Albert Silver
Posts: 3026
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Houdini beats Rybka4 with 57 – 43 %

Post by Albert Silver »

kingliveson wrote:
It should be fun for us to play with finding optimal settings for particular play. It Should Be Absolutely Unacceptable For Independent Testers To Tweak Settings And Configurations For One Or Any Engine. By doing so, you are distorting outcomes and perverting results. It should not be allowed and should be stopped immediately. Are they also going to tweak Shredder, Naum, Stockfish to find optimal settings, and would there be consensus that those are actual optimal settings?! Please, it should not be done. If they continue to tweak a single engine, or any engine for that matter, they risk losing credibility as independent testers. Default should always be used. For us, however, it should be a fun thing to play with.
Ok, just to be clear, these are my best overall settings at 2+0 and co. I published the Houdini results, as that was the topic of the thread, but when I say best, I mean best overall after testing against Houdini, FB 1.2, and Stockfish. Results were improved against all 3 opponents.

As to your comment on the Blitz list: clearly you never look at these lists, or you would refrain from such a comment. There are TONS of variations of tons of engines. You'll find plenty of Chessmaster profile results, Shredder has indeed been tested with options on and off, Hiarcs too, and many more. You are way off base here.
kingliveson

Re: Houdini beats Rybka4 with 57 – 43 %

Post by kingliveson »

I have built many books in the past using PGN databases from CCRL, CEGT, and SSDF. I am indeed familiar with these lists (games and results). Normally, for example, when Shredder OA (Opening Advice) is turned on/off, results are separated and noted. The potential issue here is tweaking parameters and combining the results -- such should not be done -- just as results of R3 Dynamic, Human, and Default are not combined.
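
The point about keeping results separate can be sketched as a tally keyed on both engine and settings. This is only an illustration: the records below are made-up placeholders, not results from CCRL, CEGT, or SSDF, and the labels are hypothetical.

```python
from collections import defaultdict

# Each record is (engine, settings label, wins, draws, losses).
# Made-up placeholder numbers, not taken from any rating list.
records = [
    ("Engine A", "default", 10, 20, 5),
    ("Engine A", "custom TM", 12, 18, 5),
    ("Engine A", "default", 4, 9, 2),
]

tallies = defaultdict(lambda: [0, 0, 0])
for engine, settings, w, d, l in records:
    # Key on the (engine, settings) pair so tuned runs never merge with default runs.
    entry = tallies[(engine, settings)]
    entry[0] += w
    entry[1] += d
    entry[2] += l

for (engine, settings), (w, d, l) in sorted(tallies.items()):
    print(f"{engine} [{settings}]: +{w}/={d}/-{l}")
```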
Last edited by kingliveson on Sat Jun 05, 2010 2:52 am, edited 1 time in total.
Albert Silver
Posts: 3026
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Houdini beats Rybka4 with 57 – 43 %

Post by Albert Silver »

So you are saying they should stop testing what they want because of a potential issue you imagined?
kingliveson

Re: Houdini beats Rybka4 with 57 – 43 %

Post by kingliveson »

If CCRL wants to become an arm of the Rybka enterprise, that is of course its business. By tweaking Rybka parameters to play against other engines' defaults, again, you are perverting the ranking table. This should be easy to understand. They just will not be known as independent testers once it becomes obvious to everyone how these results come about.
Last edited by kingliveson on Sat Jun 05, 2010 3:04 am, edited 1 time in total.