Houdini beats Rybka4 with 57 – 43 %

notyetagm · Post by **notyetagm** » Fri Jun 04, 2010 11:12 pm

Dr.Wael Deeb wrote:
notyetagm wrote:Where can I find this Houdini 1.01 engine?

Thanks
http://talkchess.com/forum/viewtopic.ph ... 56&start=0

Thanks, Dr. D.

ozziejoe · Post by **ozziejoe** » Fri Jun 04, 2010 11:19 pm

Using the Silver test suite at 4 min + 2 sec on decent dual-core hardware (3059 MHz), I have so far:

houdini 15.5
rybka 3 11.5
stockfish 11
bright 5c 2 (bright does not seem to use enough time)

Albert Silver · Post by **Albert Silver** » Sat Jun 05, 2010 12:32 am

Dann Corbit wrote:
beram wrote:Stunning results so far in my private engine matches for Houdini 1.01 32 bit, against Deep Rybka 4 32 bit. Houdini so far beats Rybka at blitz and at LTC, with 57 – 43 % !! And that's approx. a 56 ELO difference

Houdini 1.01- Rybka 4

Blitz games
T8100 (6,29 Fritzmark) Blitz 4m+2s Nunn2
1   Houdini 1.01 w32 2_CPU  +17/=27/-6 61.00%   30.5/50
2   Deep Rybka 4 w32        +6/=27/-17 39.00%   19.5/50
T8100 (6,29 Fritzmark) Blitz 4m+2s Bram privat suite 1.2
1   Houdini 1.01 w32 2_CPU  +14/=30/-6 58.00%   29.0/50
2   Deep Rybka 4 w32        +6/=30/-14 42.00%   21.0/50
T8100 (6,29 Fritzmark), Blitz 4m+2s Nunn2 and Bram privat suite 1.2 - 116 games
1   Houdini 1.01 w32 2_CPU  +33/=67/-16 57.33%   66.5/116
2   Deep Rybka 4 w32        +16/=67/-33 42.67%   49.5/116
LTC games
T8100 (6,29 Fritzmark), LTC - 20m/40+10m/20+(10m+12s) Nunn2 – after 23 games
1   Houdini 1.01 w32 2_CPU  +7/=13/-3 58.70%   13.5/23
2   Deep Rybka 4 w32        +3/=13/-7 41.30%    9.5/23
games can be provided
regards Bram
I get similar results. You might check for losses on time in the PGN if the time control is very fast (e.g. 4+2"). All of these engines suffer from it, but Rybka 4 suffers the worst.
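The check for losses on time suggested above could be sketched roughly as follows. This is only an illustration: it assumes the GUI records a time loss either in the PGN `Termination` tag (the PGN standard lists the value "time forfeit") or in a movetext comment such as "loses on time". The comment wording and the function name `count_time_losses` are assumptions, and different GUIs may write something else entirely.

```python
import re
from collections import Counter

# Matches PGN tag pairs such as [White "Houdini 1.01"]
TAG_RE = re.compile(r'\[(\w+)\s+"([^"]*)"\]')

# Assumed markers; the exact wording depends on the GUI that wrote the PGN.
TIME_MARKERS = ("time forfeit", "loses on time", "lost on time")


def count_time_losses(pgn_text):
    """Return a Counter mapping engine name -> number of games lost on time."""
    losses = Counter()
    # Split the text into games at each [Event ...] tag.
    for game in re.split(r'(?=\[Event\s)', pgn_text):
        tags = dict(TAG_RE.findall(game))
        if "Result" not in tags:
            continue  # leading whitespace or an incomplete game
        if not any(marker in game.lower() for marker in TIME_MARKERS):
            continue  # no sign of a time loss in this game
        # The loser is the side that did not score the point.
        if tags["Result"] == "1-0":
            losses[tags.get("Black", "?")] += 1
        elif tags["Result"] == "0-1":
            losses[tags.get("White", "?")] += 1
    return losses
```

To use it, read the match PGN into a string and pass it in, for example `count_time_losses(open("match.pgn").read())`, where `match.pgn` is a placeholder file name.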

Actually, I believe the results. Using the default Time Management, at 2+0 using the SilverSuite, I also got:

1   Houdini x64 2_CPU  +29/=53/-18 55.50%   55.5/100
2   Deep Rybka 4 x64   +18/=53/-29 44.50%   44.5/100

However with my custom time management (the best one I have tested so far for fast time controls), I got:

1   Deep Rybka 4 x64   +25/=58/-17 54.00%   54.0/100
2   Houdini x64 2_CPU  +17/=58/-25 46.00%   46.0/100
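
For reference, the percentage scores quoted in this thread can be turned into an approximate Elo difference with the usual logistic rating formula, D = 400 · log10(p / (1 − p)). The sketch below is not anyone's testing tool; the function names are made up for illustration, and the win/draw/loss figures are simply copied from the tables above.

```python
import math


def score_fraction(wins, draws, losses):
    """Score as a fraction of games played (a draw counts as half a point)."""
    return (wins + 0.5 * draws) / (wins + draws + losses)


def elo_diff(score):
    """Elo difference implied by a score fraction under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # (label, wins, draws, losses) taken from the results posted in the thread
    matches = [
        ("Houdini 1.01 vs Deep Rybka 4, blitz, 116 games", 33, 67, 16),
        ("Houdini x64 vs Deep Rybka 4, 2+0, default TM", 29, 53, 18),
        ("Deep Rybka 4 vs Houdini x64, 2+0, custom TM", 25, 58, 17),
    ]
    for label, w, d, l in matches:
        p = score_fraction(w, d, l)
        print(f"{label}: {100 * p:.2f}% -> {elo_diff(p):+.0f} Elo")
```

Under this model a 57% score works out to a little under 50 Elo, so the 56 quoted in the opening post looks slightly high.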

kingliveson · Post by **kingliveson** » Sat Jun 05, 2010 12:41 am

Could you share these custom time management settings? With R4's default time management, it's not outperforming other top engines. I would like suggested settings for a few time controls: 4+2, 5+0, 15+10, and 90+30. Or perhaps a formula to derive the right time management settings.

Thanks.

Albert Silver · Post by **Albert Silver** » Sat Jun 05, 2010 12:55 am

Please note that there are no universal settings AFAIK. For example, my best settings at repeating time controls, minimum speed being CCRL/CEGT Blitz, just got published at CCRL Blitz, with a huge ELO leap, but they aren't anywhere near as good at 2+0 for example.

For 2+0, no increments, my best is TC Buffer = 1, Normal Move = 72, Max Move = 115.

These were tested with single-CPU (no SSE42 or Large Pages BTW), ponder off, and 512 MB hash.

kingliveson · Post by **kingliveson** » Sat Jun 05, 2010 2:10 am

As end users, we have fun finding optimal settings for particular play. It Should Be Absolutely Unacceptable For Independent Testers To Tweak Settings And Configurations For Any Single Engine. By doing so, you are distorting outcomes and perverting results. Are they also going to tweak Shredder, Naum, Stockfish to find optimal settings, and would there be consensus that those are actually optimal?! If they continue to tweak a single engine or any engine for that matter, they risk losing credibility as independent testers. Please, stop and think about it. Default ought to be used always. For us however, it should be a fun thing to play with.

Vytron on Rybka Forum wrote:
TCBuffer - How many seconds Rybka thinks she has subtracted from the clock. Useful for bullet (1 '0) and blitz games (3 '0) so Rybka avoids losing on time (by time stolen by the GUI or the opposing engine, like clones), and also may give a general better time management when set at 3 (setting does nothing on incremental or repeating time controls.)

TCNormal Move Time - The rate at which Rybka should play the game. Lower values will make the engine play faster, and vice versa.

TC Max Move Time - The amount of time Rybka is willing to spend in critical positions (higher values will make her think longer on such cases).

Albert Silver · Post by **Albert Silver** » Sat Jun 05, 2010 2:16 am

kingliveson wrote:
It should be fun for us to play with finding optimal settings for particular play. It Should Be Absolutely Unacceptable For Independent Testers To Tweak Settings And Configurations For One Or Any Engine. By doing so, you are distorting outcomes and perverting results. It should not be allowed and should be stopped immediately. Are they also going to tweak Shredder, Naum, Stockfish to find optimal settings, and would there be consensus that those are the actual optimal settings?! Please, it should not be done. If they continue to tweak a single engine or any engine for that matter, they risk losing credibility as independent testers. Default should always be used. For us however, it should be a fun thing to play with.

Ok, just to be clear, these are my best overall settings at 2+0 and co. I published the Houdini results, as that was the topic of the thread, but when I say best, I mean best overall after testing against Houdini, FB 1.2, and Stockfish. Results were improved against all 3 opponents.

As to your comment on the Blitz list. Clearly you never ever look at these lists or you would refrain from such a comment. There are TONS of variations of tons of engines. You'll find plenty of Chessmaster profile results, Shredder has indeed been tested with options on and off, Hiarcs too, and many more. You are way off base here.

kingliveson · Post by **kingliveson** » Sat Jun 05, 2010 2:37 am

I have built many books in the past using pgn databases from CCRL, CEGT, SSDF. I am indeed familiar with these lists (games and results). Normally for example, when Shredder OA (Opening Advice) is turned on/off, results are separated and noted. The potential issue here is tweaking parameters and combining the results -- such should not be done -- just as results of R3 Dynamic, Human, and Default are not combined.

Albert Silver · Post by **Albert Silver** » Sat Jun 05, 2010 2:52 am

So you are saying they should stop testing what they want because of a potential issue you imagined?

kingliveson · Post by **kingliveson** » Sat Jun 05, 2010 3:01 am

If CCRL wants to become an arm of the Rybka enterprise, that is of course its business. But by tweaking Rybka parameters to play against other engines' defaults, again, you are perverting the ranking table. This should be easy to understand. They just will not be known as independent testers once it becomes obvious to everyone how these results come about.