albert silver wrote:
Please note that there are no universal settings AFAIK. For example, my best settings at repeating time controls, minimum speed being CCRL/CEGT Blitz, just got published at CCRL Blitz, with a huge Elo leap, but they aren't anywhere near as good at 2+0, for example.
For 2+0, no increments, my best is TC Buffer = 1, Normal Move = 72, Max Move = 115.
These were tested with single-CPU (no SSE42 or Large Pages BTW), ponder off, and 512 MB hash.

kingliveson wrote:
As end users, we have fun finding optimal settings for particular play. It Should Be Absolutely Unacceptable For Independent Testers To Tweak Settings And Configurations For Any Single Engine. By doing so, you are distorting outcomes and perverting results. Are they also going to tweak Shredder, Naum, Stockfish to find optimal settings, and would there be consensus that those are actual optimal?! If they continue to tweak a single engine or any engine for that matter, they risk losing credibility as independent testers. Please, stop and think about it. Default ought to be used always. For us however, it should be a fun thing to play with.

Albert Silver wrote:
Ok, just to be clear, these are my best overall settings at 2+0 and co. I published the Houdini results, as that was the topic of the thread, but when I say best, I mean best overall after testing against Houdini, FB 1.2, and Stockfish. Results were improved against all 3 opponents.
As to your comment on the Blitz list. Clearly you never ever look at these lists or you would refrain from such a comment. There are TONS of variations of tons of engines. You'll find plenty of Chessmaster profile results, Shredder has indeed been tested with options on and off, Hiarcs too, and many more. You are way off base here.

kingliveson wrote:
I have built many books in the past using pgn databases from CCRL, CEGT, SSDF. I am indeed familiar with these lists (games and results). Normally for example, when Shredder OA (Opening Advice) is turned on/off, results are separated. The potential issue here is tweaking parameters and combining the results -- such should not be done -- just as results of R3 Dynamic, Human, and Default are not combined.

Albert Silver wrote:
So you are saying they should stop testing what they want because of a potential issue you imagined?

kingliveson wrote:
If CCRL wants to become an arm of Rybka enterprise, that is of course its business. Tweaking Rybka parameters to play against other engines' default, again, you are perverting the ranking table. This should be easy to understand. They just will not be known as independent testers as it becomes known to everyone how these results come about.

Ah... Now it becomes clear. It is ok to test options with other engines, just not Rybka. Kind of interesting perspective.
Houdini beats Rybka4 with 57 – 43 %
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: Houdini beats Rybka4 with 57 – 43 %
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Re: Houdini beats Rybka4 with 57 – 43 %
Albert Silver wrote:
Ah... Now it becomes clear. It is ok to test options with other engines, just not Rybka. Kind of interesting perspective.

If a separate table is created to note that parameters were tweaked, there would not be an issue.
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: Houdini beats Rybka4 with 57 – 43 %
kingliveson wrote:
If a separate table is created to note parameters were tweaked, there would not be an issue.

I'm guessing you didn't look at the list at all, since the title of the engine has the parameters in the name? As opposed to the results of the default settings of course, for all to see.
http://www.computerchess.org.uk/ccrl/40 ... t_all.html
Re: Houdini beats Rybka4 with 57 – 43 %
Albert Silver wrote:
I'm guessing you didn't look at the list at all, since the title of the engine has the parameters in the name? As opposed to the results of the default settings of course, for all to see.
http://www.computerchess.org.uk/ccrl/40 ... t_all.html

I was just at CCRL, looked at the live results and didn't see it. But good that it's now updated and the parameters do appear in the engine's name. Nothing personal -- just want transparency.
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: Houdini beats Rybka4 with 57 – 43 %
kingliveson wrote:
I was just at CCRL, looked at live results and didn't see it. But good that it's now updated and parameters do appear on the engine's name. Nothing personal -- just want transparency.

I'm guessing you were looking at either a non-updated list in your browser's cache, or the Best list, which shows only the best results of any particular engine, or variation thereof.
http://www.computerchess.org.uk/ccrl/40 ... t_all.html
-
- Posts: 1187
- Joined: Wed Jan 06, 2010 3:11 pm
Re: Houdini beats Rybka4 - AT LTC - with 57 – 43 %
LTC games
T8100 (6,29 Fritzmark), LTC 20m/40+10m/20+(10m+12s), Nunn2
36 of 50 games played so far
1 Houdini 1.01 w32 2_CPU +10/=22/-4 58,33 % 21/36
2 Deep Rybka 4 w32 +4/=22/-10 41,67 % 15/36
Regarding the special Rybka 4 time control settings for short time controls: these don't apply to the LTC I used (20m/40+10m/20+(10m+12s)).
I also tested Fire 1.31 earlier at this LTC, and there Deep Rybka wins:
T8100 (6,29 Fritzmark) LTC 20m/40+10m/20+(10m+12s) 50 games Bram privat suite 1.2
1 Deep Rybka 4 w32 +15/=28/-7 58.00% 29.0/50
2 Fire 1.3 w32 +7/=28/-15 42.00% 21.0/50
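For readers who want to judge how much weight a 36- or 50-game match can carry, here is a minimal sketch that turns a +W/=D/-L record into a score with a rough 95% interval. It assumes independent games and a normal approximation; the function name `score_with_margin` is hypothetical and not taken from any tool used in this thread.

```python
import math


def score_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (score, low, high) as fractions of the available points.

    Each game is treated as an independent draw from {1, 0.5, 0}; the
    interval is the usual normal approximation mean +/- z * standard error.
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    mean = (wins + 0.5 * draws) / games
    # Per-game variance: E[x^2] - mean^2, with x^2 equal to 1, 0.25 or 0.
    variance = (wins + 0.25 * draws) / games - mean ** 2
    std_err = math.sqrt(variance / games)
    return mean, mean - z * std_err, mean + z * std_err


# The two match records quoted in the post above.
for label, record in [("Houdini 1.01 vs Deep Rybka 4", (10, 22, 4)),
                      ("Deep Rybka 4 vs Fire 1.3", (15, 28, 7))]:
    score, low, high = score_with_margin(*record)
    print(f"{label}: {score:.2%} (approx. 95% interval {low:.2%} to {high:.2%})")
```

Under these assumptions the interval shrinks only with the square root of the number of games, so short matches leave a wide margin.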
-
- Posts: 1539
- Joined: Thu Mar 09, 2006 2:02 pm
Re: Houdini beats Rybka4 with 57 – 43 %
If I were to play, hypothetically, 1000 games with one thread at 5+3, ponder on, with Houdini 1.01, they would most likely end like this:

Houdini 1.01 x64 1_CPU 2935 1000.0 (716.0 : 284.0)
100.0 ( 49.5 : 50.5) Deep Rybka 4 2948
100.0 ( 59.5 : 40.5) Stockfish 1.7.1 JA 2883
100.0 ( 68.0 : 32.0) Naum 4.2 2818
100.0 ( 69.0 : 31.0) Komodo 1.2 JA 2801
100.0 ( 68.0 : 32.0) Deep Shredder 12 2797
100.0 ( 76.0 : 24.0) Critter 0.70 2788
100.0 ( 82.5 : 17.5) HIARCS 13.1 MP 32b 2731
100.0 ( 82.0 : 18.0) spark-0.4 2713
100.0 ( 78.0 : 22.0) Zappa Mexico II 2710
100.0 ( 83.5 : 16.5) Deep Onno 1-2-70 2681

This would result in exactly the 20-30 Elo area, like all the other Littos since Robbo 83 in the past - but agreed, at the upper end of that 20 Elo frame:

1 Deep Rybka 4 2948 15 15 2000 79% 2724 29%
2 Rybka 3 mp 2T 2943 13 13 2100 75% 2764 34%
3 Houdini 1.01 x64 1_CPU 2935 19 18 1000 72% 2787 39%
4 Stockfish 1.7.1 JA 2T 2928 14 14 1800 73% 2766 38%
5 Rybka 3 mp 2898 9 9 5000 74% 2725 34%
6 Stockfish 1.7.1 JA 2883 11 11 3500 70% 2735 35%
7 Naum 4.2 2T 2882 13 13 1900 64% 2786 42%
8 Stockfish 1.6.x JA 2T 2863 14 14 1800 65% 2764 42%
9 Rybka 3 32b 2848 14 14 1800 70% 2713 36%
10 Deep Shredder 12 2T 2835 13 13 2100 58% 2777 39%
11 Stockfish 1.6.x JA 2831 11 10 3200 65% 2723 39%
12 Naum 4 2T 2829 14 14 1600 60% 2761 41%
13 Deep Fritz 12 32b 2T 2823 13 13 1900 55% 2790 44%
14 Naum 4.2 2818 10 10 3300 62% 2732 40%
15 Rybka 2.3.2a mp 2802 11 11 3100 67% 2691 40%
16 Komodo 1.2 JA 2801 12 12 2300 59% 2735 40%
17 Deep Shredder 12 UCI 32b 2800 9 9 4000 62% 2720 38%

It is a pity I don't do these tests; they would end all that baseless and "amateurish" speculation (why is someone testing against only ONE engine - looks like intention?).
Bye
Ingo
PS: I forgot to mention that the engine is crashing from time to time and remains in memory under full load.
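As a rough cross-check of how the list ratings relate to the individual match rows above, the standard logistic Elo formula gives the score one engine is expected to make against another. This is only a sketch: the function name `expected_score` is hypothetical, and since list ratings are fitted over all games at once, single 100-game matches need not line up exactly.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (0..1) of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings and observed scores copied from the tables above
# (Houdini 1.01 x64 1_CPU is listed at 2935).
houdini = 2935
opponents = [
    ("Deep Rybka 4", 2948, 49.5),
    ("Stockfish 1.7.1 JA", 2883, 59.5),
    ("Naum 4.2", 2818, 68.0),
]

for name, rating, observed in opponents:
    expected = 100.0 * expected_score(houdini, rating)
    print(f"vs {name}: expected {expected:.1f}%, observed {observed:.1f}%")
```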
-
- Posts: 1187
- Joined: Wed Jan 06, 2010 3:11 pm
Re: Houdini beats Rybka4 with 57 – 43 %
Thanks for your very mature reply. It helps a lot when people clarify matters with hypothetical match results. It would contribute more, though, if you tested such things for real.
For instance, why are the match results - at LTC - against Rybka 4 so different between Fire 1.31 and Houdini 1.01?
Houdart is certainly more than just a copy-paste program. I am, just like other people, fascinated by these new and incredibly strong programs.
-
- Posts: 95
- Joined: Sun Jan 10, 2010 6:10 am
- Location: Lamar, Colorado, USA
Re: Houdini beats Rybka4 with 57 – 43 %
Deep Rybka 4 SSE42 x64 TC3100150 vs. Houdini x64 POPCNT_4CPU
2 CPU each, ponder off TC 1/0
Quad i7-920 4.0 GHz
GUI Aquarium 4.0.5
Narrowbook, played each opening twice
117-113, + 6 elo
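The "+6 Elo" figure follows from the match score via the usual logistic Elo relation. A minimal sketch, assuming that model (the helper name `elo_diff_from_points` is hypothetical):

```python
import math


def elo_diff_from_points(points_a: float, points_b: float) -> float:
    """Elo difference (A minus B) implied by a match score, logistic model."""
    score = points_a / (points_a + points_b)
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# The 117-113 result reported above.
print(f"{elo_diff_from_points(117, 113):+.1f} Elo")
```

The same function applied to the 57 - 43 score in the thread title gives a difference of roughly 49 Elo.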
-
- Posts: 1187
- Joined: Wed Jan 06, 2010 3:11 pm
Re: Houdini beats Rybka4 with 57 – 43 %
This is fairly in line with my results over 100 games (Nunn2 and private book) at 4m+2s on my T4300, Win7 64-bit (Fritzmark 5,9):
Deep Rybka 4 64-bit 2CPU - Houdini 1.01 64-bit 2CPU, result 48,5 - 51,5
It seems Houdini 32-bit does much better against Deep Rybka 4 on the 32-bit platform.
The difference between Deep Rybka 4 TC3100150 64-bit and DR4 64-bit standard is 21 Elo at the moment on the CCRL 40/4 list of 4 June.
So DR4 and Houdini 1.01 are about equally strong according to their blitz matches on 64-bit systems.
Kind regards, Bram