Ivan The Terrible

kingliveson

Re: Ivan The Terrible

Post by kingliveson »

Please see this post for the conditions and settings for these 50 games: http://www.talkchess.com/forum/viewtopi ... 481#332481
kingliveson

Re: Ivan The Terrible

Post by kingliveson »

Long Time Control: Rybka vs IvanHoe 40+2

Elostat: IvanHoe +48
Bayesian: IvanHoe +36

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 IvanHoe 9.63b x64              :   24   54  51    50    57.0 %    -24   70.0 %
  2 Rybka 3 x64                    :  -24   51  54    50    43.0 %     24   70.0 %

The matches use the KLOECOA00E97V test suite.

- 25 positions with switched colors
- ponder off
- no EGTB
- 512 MB hash
- AMD Athlon 64 X2 3800+ @ 2.0 GHz
- engines use 2 cores
- time control: 40m + 2s inc

Individual statistics:

1 IvanHoe 9.63b x64         :   24   50 (+ 11,= 35,-  4), 57.0 %

Rybka 3 x64                   :  50 (+ 11,= 35,-  4), 57.0 %

2 Rybka 3 x64               :  -24   50 (+  4,= 35,- 11), 43.0 %

IvanHoe 9.63b x64             :  50 (+  4,= 35,- 11), 43.0 %

Games        :     50 (finished)

White Wins   :     13 (26.0 %)
Black Wins   :      2 ( 4.0 %)
Draws        :     35 (70.0 %)
Unfinished   :      0

White Perf.  : 61.0 %
Black Perf.  : 39.0 %

ECO A =      6 Games (12.0 %)
ECO B =     18 Games (36.0 %)
ECO C =     10 Games (20.0 %)
ECO D =      8 Games (16.0 %)
ECO E =      8 Games (16.0 %)

Bayesian Elo Rating:

Rank Name                Elo    +    - games score oppo. draws
   1 IvanHoe 9.63b x64    18   35   35    50   57%   -18   70%
   2 Rybka 3 x64         -18   35   35    50   43%    18   70%
Games

Note: The results seem consistent with other tests including 50 games (6 crashed) previously posted where IvanHoe 9.65b scored +8/=32/-4 against Rybka 3. We are now beginning to notice the strength of Ivanhoe at longer time controls as more bugs get corrected.
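
For readers who want to check the headline figure, here is a minimal sketch of the standard logistic Elo conversion from a match score. It is not Elostat's or Bayeselo's actual code, and the function names are made up for illustration. Applied to the +11/=35/-4 record above, it gives a difference of roughly +49, in line with the Elostat figure of +48 (each engine's rating is rounded to 24 separately). The Bayesian figure of +36 comes from a different model, which this sketch does not reproduce.

```python
import math


def score_from_wdl(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_diff(score):
    """Elo difference implied by a score under the standard logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# IvanHoe's record from the tables above: +11 =35 -4 over 50 games.
ivanhoe_score = score_from_wdl(11, 35, 4)
print(f"score {ivanhoe_score:.1%}, Elo difference {elo_diff(ivanhoe_score):+.1f}")
```

With only 50 games the error bars in the tables (about ±50 Elo) are as large as the measured difference, so the sign of the result is more trustworthy than its size.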
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Ivan The Terrible

Post by michiguel »

kingliveson wrote:
Note: The results seem consistent with other tests including 50 games (6 crashed) previously posted where IvanHoe 9.65b scored +8/=32/-4 against Rybka 3. We are now beginning to notice the strength of Ivanhoe at longer time controls as more bugs get corrected.
Out of 50 games Ivanhoe crashed 6 times?

Miguel
Dann Corbit
Posts: 12778
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Ivan The Terrible

Post by Dann Corbit »

michiguel wrote:
kingliveson wrote:
Note: The results seem consistent with other tests including 50 games (6 crashed) previously posted where IvanHoe 9.65b scored +8/=32/-4 against Rybka 3. We are now beginning to notice the strength of Ivanhoe at longer time controls as more bugs get corrected.
Out of 50 games Ivanhoe crashed 6 times?

Miguel
Ivanhoe has major problems in two areas:

1. At very fast time control it will lose *most* of its games on time. If you get fast enough time control, it will lose *all* of its games on time (even against itself).

2. At slow time controls, with lots of threads in use, it will crash frequently (most of its losses under these conditions will be attributed to crashes). Clearly, an SMP flaw of some sort (probably a public variable written to and not gated in any way).
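
To make "written to and not gated" concrete, here is a small sketch of gating a shared variable with a lock. It is written in Python for brevity and is not IvanHoe's code; the class and function names are invented, and the actual cause of the crashes is not established in this thread. The point is only that every thread takes the same lock before touching the shared value.

```python
import threading


class SharedCounter:
    """A value shared between search threads (hypothetical example)."""

    def __init__(self):
        self.nodes = 0
        self._lock = threading.Lock()

    def add(self, n):
        # The read-modify-write below is gated: only one thread
        # at a time can be inside this block.
        with self._lock:
            self.nodes += n


def run_workers(counter, workers=4, iterations=10_000):
    """Start several threads that all update the same counter."""

    def work():
        for _ in range(iterations):
            counter.add(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


counter = SharedCounter()
run_workers(counter)
print(counter.nodes)  # 40000: no update is lost while the lock is held
```

Without the lock, two threads can read the same old value and one of the updates is lost. In a C engine the same kind of unsynchronized access to shared state can corrupt data and crash rather than just miscount.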
kingliveson

Re: Ivan The Terrible

Post by kingliveson »

michiguel wrote: Out of 50 games Ivanhoe crashed 6 times?

Miguel
I recently upgraded that system to Windows 7 and don't really know what happened in the first tournament. There is no log. In the 6 games, IvanHoe beta 999965 ran out of time 4 times and Rybka twice. Neither engine had any significant lead in those games. Updates and patches have now been applied to that system. The games are posted here. As Dann pointed out, there are still some issues with IvanHoe in general.
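
As a rough illustration of how much those 6 games matter, here is a sketch that folds forfeits back into a win/draw/loss record. It assumes the +8/=32/-4 record covers only the 44 games that finished normally, and that each time forfeit is scored as a loss for the engine that ran out of time. Both are assumptions, and the function name is hypothetical.

```python
def rescore_with_forfeits(wins, draws, losses, own_forfeits, opp_forfeits):
    """Add forfeited games to a W/D/L record as decisive results.

    own_forfeits: games this engine lost by crash or time forfeit
    opp_forfeits: games the opponent lost the same way
    """
    wins += opp_forfeits
    losses += own_forfeits
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    return wins, draws, losses, score


# Earlier IvanHoe record: +8 =32 -4, plus 4 forfeits by IvanHoe and 2 by Rybka.
w, d, l, score = rescore_with_forfeits(8, 32, 4, own_forfeits=4, opp_forfeits=2)
print(f"+{w}/={d}/-{l} over {w + d + l} games, score {score:.1%}")
# prints "+10/=32/-8 over 50 games, score 52.0%"
```

Under these assumptions the earlier match drops from about 54.5% (24 points out of 44) to 52% (26 out of 50), so whether forfeits are counted changes the picture noticeably.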
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Ivan The Terrible

Post by Milos »

Dann Corbit wrote:2. At slow time controls, with lots of threads in use, it will crash frequently (most of its losses under these conditions will be attributed to crashes). Clearly, an SMP flaw of some sort (probably a public variable written to and not gated in any way).
a) True
b) It's not a public variable, it's something else....
benkidwell

Re: Ivan The Terrible

Post by benkidwell »

Dann Corbit wrote:
Ivanhoe has major problems in two areas:

1. At very fast time control it will lose *most* of its games on time. If you get fast enough time control, it will lose *all* of its games on time (even against itself).

2. At slow time controls, with lots of threads in use, it will crash frequently (most of its losses under these conditions will be attributed to crashes). Clearly, an SMP flaw of some sort (probably a public variable written to and not gated in any way).
I wonder if these bugs are more prevalent in the Windows compile. I have been testing Ivanhoe v999963, with some minor source modifications mentioned on the wiki, on a quad-core Linux system. At both very fast (1 min for 100 moves) and fairly long (40/40) time controls I have had zero time losses and only one crash in several days of matches under xboard/polyglot. Even against Stockfish 1.6.3, the results are so one-sided in favor of Ivanhoe that I haven't been tracking them in detail. Following up on your suggestion to look at the SMP.c code, it seems the Windows and Linux locking implementations are quite different: the locks are differently #defined macros in each, and the code has a lot of #ifdef blocks for the different operating systems.

Do people generally count crashes as losses in tournament results?
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: Ivan The Terrible

Post by Dr.Wael Deeb »

benkidwell wrote:
Do people generally count crashes as losses in tournament results?
They should, I think, even though there are still rating lists out there that count time losses as a result, which alters the whole sense of the testing process....
I even check my games for bad opening lines if they use my q82010.abk or q8.ctg, which thankfully happens very rarely....
Dr.D
Titu

Re: Ivan The Terrible

Post by Titu »

Hi, I guess that you have not changed the Rybka contempt to 0?

I am assuming the clones have removed the contempt factor from Rybka.
kingliveson

Re: Ivan The Terrible

Post by kingliveson »

Titu wrote:Hi, I guess that you have not changed the Rybka contempt to 0?

I am assuming the clones have removed the contempt factor from Rybka.
If you want to discuss whether or not Rybka is a clone of Fruit, you can on the engine origins sub-forum. Please take all such topics to that forum.