Ivan The Terrible
Moderator: Ras
Re: Ivan The Terrible
Please see this post for the conditions and settings for these 50 games: http://www.talkchess.com/forum/viewtopi ... 481#332481
Re: Ivan The Terrible
Long Time Control: Rybka vs IvanHoe 40+2
Elostat: IvanHoe +48
Bayesian: IvanHoe +36
The matches use the KLOECOA00E97V test suite.
- 25 positions with switched colors
- Ponder off
- No EGTB
- 512 MB hash
- AMD Athlon 64 X2 3800+ @ 2.0 GHz
- Engines use 2 cores
- Time control: 40m + 2s increment
Games
Note: The results seem consistent with other tests, including the 50 games (6 crashed) posted previously, where IvanHoe 9.65b scored +8/=32/-4 against Rybka 3. We are now beginning to see IvanHoe's strength at longer time controls as more bugs get corrected.
Code:
   Program              Elo    +    -   Games   Score   Av.Op.   Draws
 1 IvanHoe 9.63b x64 :   24   54   51      50   57.0 %     -24   70.0 %
 2 Rybka 3 x64       :  -24   51   54      50   43.0 %      24   70.0 %
Code:
Individual statistics:
 1 IvanHoe 9.63b x64 :   24   50 (+ 11,= 35,-  4), 57.0 %
   Rybka 3 x64       :        50 (+ 11,= 35,-  4), 57.0 %
 2 Rybka 3 x64       :  -24   50 (+  4,= 35,- 11), 43.0 %
   IvanHoe 9.63b x64 :        50 (+  4,= 35,- 11), 43.0 %
Code:
Games        : 50 (finished)
White Wins   : 13 (26.0 %)
Black Wins   :  2 ( 4.0 %)
Draws        : 35 (70.0 %)
Unfinished   :  0
White Perf.  : 61.0 %
Black Perf.  : 39.0 %
ECO A =  6 Games (12.0 %)
ECO B = 18 Games (36.0 %)
ECO C = 10 Games (20.0 %)
ECO D =  8 Games (16.0 %)
ECO E =  8 Games (16.0 %)
Code:
Bayesian Elo Rating:
Rank Name                Elo    +    -  games  score  oppo.  draws
   1 IvanHoe 9.63b x64    18   35   35     50    57%    -18    70%
   2 Rybka 3 x64         -18   35   35     50    43%     18    70%
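For anyone wondering how the 57.0 % score relates to the Elo figures, here is a minimal sketch. It assumes the standard logistic Elo formula; the post does not say which model Elostat applies, and the function names are made up for illustration. With +11 =35 -4 it gives a gap of roughly 49 points, which is in line with Elostat's +24/-24. The Bayesian estimate comes from a different model, so it need not match.

```python
import math


def score_fraction(wins, draws, losses):
    """Fraction of available points scored; a draw counts as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Rating gap implied by a score fraction under the logistic Elo model.

    Assumption: the thread does not state which model Elostat uses.
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# IvanHoe 9.63b vs Rybka 3, taken from the tables in this post: +11 =35 -4
ivanhoe_score = score_fraction(11, 35, 4)
gap = elo_difference(ivanhoe_score)
print(f"score {ivanhoe_score:.1%}, implied Elo gap {gap:.1f}")
```

With only 50 games the error bars in the tables (+54/-51 for Elostat) are wider than the gap itself, so the sign of the result is more informative than its size.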
Re: Ivan The Terrible
kingliveson wrote:
[...]
Note: The results seem consistent with other tests including 50 games (6 crashed) previously posted where IvanHoe 9.65b scored +8/=32/-4 against Rybka 3. We are now beginning to notice the strength of Ivanhoe at longer time controls as more bugs get corrected.

Out of 50 games Ivanhoe crashed 6 times?
Miguel
Re: Ivan The Terrible
michiguel wrote:
Out of 50 games Ivanhoe crashed 6 times?
Miguel

Ivanhoe has major problems in two areas:
1. At very fast time control it will lose *most* of its games on time. If you get fast enough time control, it will lose *all* of its games on time (even against itself).
2. At slow time controls, with lots of threads in use, it will crash frequently (most of its losses under these conditions will be attributed to crashes). Clearly, an SMP flaw of some sort (probably a public variable written to and not gated in any way).
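To illustrate what "written to and not gated" means in general, here is a minimal sketch in Python. It is not IvanHoe's code (the engine is written in C), and the class and function names are hypothetical. It only shows the difference between an unprotected update of shared state and one protected by a lock; it is not a diagnosis of the actual crash.

```python
import threading


class SearchStats:
    """Hypothetical statistic shared by all search threads."""

    def __init__(self):
        self.nodes = 0
        self._lock = threading.Lock()

    def add_ungated(self, count):
        # Read-modify-write with no lock: concurrent updates may be lost.
        self.nodes += count

    def add_gated(self, count):
        # The lock lets only one thread update the counter at a time.
        with self._lock:
            self.nodes += count


def run_threads(update, n_threads=4, n_updates=20_000):
    """Call update(1) n_updates times from each of n_threads threads."""
    def worker():
        for _ in range(n_updates):
            update(1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


gated = SearchStats()
run_threads(gated.add_gated)
print(gated.nodes)  # 80000: every update is applied when the lock is held
```

Passing `add_ungated` instead may or may not lose updates, depending on the interpreter and timing. In a C engine the same kind of unsynchronized write to shared data can corrupt state rather than just miscount, which is why it would show up mostly with many threads.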
Re: Ivan The Terrible
michiguel wrote:
Out of 50 games Ivanhoe crashed 6 times?
Miguel

I recently upgraded that system to Windows 7 and don't really know what happened with the first tournament. There is no log. In the 6 games, IvanHoe beta 999965 ran out of time 4 times and Rybka twice. Neither engine had any significant lead in those games. Updates and patches have now been applied to that system. The games are posted here. As Dann pointed out, there are still some issues with IvanHoe in general.
Re: Ivan The Terrible
Dann Corbit wrote:
2. At slow time controls, with lots of threads in use, it will crash frequently (most of its losses under these conditions will be attributed to crashes). Clearly, an SMP flaw of some sort (probably a public variable written to and not gated in any way).

a) True
b) It's not a public variable, it's something else....
Re: Ivan The Terrible
Dann Corbit wrote:
Ivanhoe has major problems in two areas:
1. At very fast time control it will lose *most* of its games on time. If you get fast enough time control, it will lose *all* of its games on time (even against itself).
2. At slow time controls, with lots of threads in use, it will crash frequently (most of its losses under these conditions will be attributed to crashes). Clearly, an SMP flaw of some sort (probably a public variable written to and not gated in any way).

I wonder if these bugs are more prevalent in the Windows compile. I have been testing Ivanhoe v999963, with some minor source modifications mentioned on the wiki, on a quad-core Linux system. At both very fast (1 min for 100 moves) and fairly long (40/40) time controls I have had zero time losses and only one crash in several days of matches under xboard/polyglot. Even against Stockfish 1.6.3, the results are so one-sided in favor of Ivanhoe that I haven't been tracking them in detail. Following up on your suggestion to look at the SMP.c code, it seems the Windows and Linux locking implementations are quite different: the locks are differently #defined macros in each, and the code has a lot of #ifdef blocks for the different operating systems.
Do people generally count crashes as losses in tournament results?
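To put a number on that question, here is a small sketch using the figures given earlier in this thread for the first 50-game match: +8/=32/-4 for IvanHoe 9.65b in the 44 finished games, with IvanHoe running out of time 4 times and Rybka twice in the other 6. The helper name is made up, and scoring each unfinished game as a loss for the side that timed out is just one possible convention.

```python
def match_score(wins, draws, losses):
    """Return (points, games, percentage) for one engine."""
    games = wins + draws + losses
    points = wins + 0.5 * draws
    return points, games, 100.0 * points / games


# Earlier 50-game match from IvanHoe 9.65b's side: the 44 finished games.
finished = (8, 32, 4)

# The 6 unfinished games: IvanHoe ran out of time 4 times, Rybka twice.
ivanhoe_forfeits, rybka_forfeits = 4, 2

# Option 1: leave the unfinished games out of the result.
excluded = match_score(*finished)

# Option 2: score each unfinished game as a loss for the side that timed out.
wins, draws, losses = finished
included = match_score(wins + rybka_forfeits, draws, losses + ivanhoe_forfeits)

print(f"excluded: {excluded[0]}/{excluded[1]} = {excluded[2]:.1f} %")
print(f"included: {included[0]}/{included[1]} = {included[2]:.1f} %")
```

On these numbers IvanHoe's score drops from 24/44 to 26/50 once the forfeits are counted, so the choice of convention does move the result.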
Re: Ivan The Terrible
benkidwell wrote:
[...]
Do people generally count crashes as losses in tournament results?

They should, I think, even though there are still rating lists out there which count time losses as a result, altering the whole sense of the testing process....
I even check my games for bad opening lines if they use my q82010.abk or q8.ctg, which thankfully happens so rarely....
Dr.D
No one can hit as hard as life. But it ain’t about how hard you can hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward…
Re: Ivan The Terrible
Hi, I guess that you have not changed the Rybka contempt to 0?
I am assuming the clones have removed the contempt factor from Rybka.
Re: Ivan The Terrible
Titu wrote:
Hi, I guess that you have not changed the Rybka contempt to 0?
I am assuming the clones have removed the contempt factor from Rybka.

If you want to discuss whether or not Rybka is a clone of Fruit, you can do so on the engine origins sub-forum. Please take all such topics to that forum.