interesting test data.

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

interesting test data.

Post by bob »

Tracy sent me a couple of versions to test. It's going slow as only 1/4 of the cluster is back up, but I started it and as I watched, I saw this. Each output is produced about once every 45 seconds or so:

Code:

Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2-4        2641    3    3 40000   58%  2583   23% 
   3 Crafty-23.2-2        2639    3    3 40000   57%  2583   23% 
   4 Crafty-23.1-4        2639    3    3 40000   57%  2583   23% 
   5 Crafty-23.2-1        2639    3    3 40000   57%  2583   22% 
   6 Crafty-23.2-3        2639    3    3 40000   57%  2583   23% 
   7 Crafty-23.1-1        2639    3    3 40000   57%  2583   22% 
   8 Crafty-23.1-2        2637    3    3 40000   57%  2583   23% 
   9 Crafty-23.1-3        2636    3    3 40000   57%  2583   23% 
  12 Crafty-23.0-1        2546    3    3 40000   45%  2583   21% 
  13 Crafty-23.0-2        2544    3    3 40000   45%  2583   21% 
  14 Crafty-23.0-4        2544    3    3 40000   45%  2583   21% 
  15 Crafty-23.0-3        2543    3    3 40000   45%  2583   21% 
-----------------------  currently using 29 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2-4        2639    6    6 40000   58%  2581   23% 
   3 Crafty-23.2-2        2637    6    6 40000   57%  2581   23% 
   4 Crafty-23.1-4        2637    6    6 40000   57%  2581   23% 
   5 Crafty-23.2-1        2637    6    6 40000   57%  2581   22% 
   6 Crafty-23.2-3        2637    6    6 40000   57%  2581   23% 
   7 Crafty-23.1-1        2637    6    6 40000   57%  2581   22% 
   8 Crafty-23.2T1        2636   84   84    51   52%  2588    6% 
   9 Crafty-23.1-2        2635    6    6 40000   57%  2581   23% 
  10 Crafty-23.1-3        2634    6    6 40000   57%  2581   23% 
  13 Crafty-23.0-1        2544    6    6 40000   45%  2581   21% 
  14 Crafty-23.0-2        2542    6    6 40000   45%  2581   21% 
  15 Crafty-23.0-4        2542    6    6 40000   45%  2581   21% 
  16 Crafty-23.0-3        2541    6    6 40000   45%  2581   21% 
-----------------------  currently using 29 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2T1        2639   46   46   164   57%  2577   18% 
   3 Crafty-23.2-4        2639    4    4 40000   58%  2580   23% 
   4 Crafty-23.2-2        2637    4    4 40000   57%  2580   23% 
   5 Crafty-23.1-4        2637    4    4 40000   57%  2580   23% 
   6 Crafty-23.2-1        2637    4    4 40000   57%  2580   22% 
   7 Crafty-23.2-3        2637    4    4 40000   57%  2580   23% 
   8 Crafty-23.1-1        2637    4    4 40000   57%  2580   22% 
   9 Crafty-23.1-2        2635    4    4 40000   57%  2580   23% 
  10 Crafty-23.1-3        2634    4    4 40000   57%  2580   23% 
  13 Crafty-23.0-1        2544    4    4 40000   45%  2580   21% 
  14 Crafty-23.0-2        2542    4    4 40000   45%  2580   21% 
  15 Crafty-23.0-4        2542    4    4 40000   45%  2580   21% 
  16 Crafty-23.0-3        2540    4    4 40000   45%  2580   21% 
-----------------------  currently using 29 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2T1        2652   35   35   285   59%  2576   20% 
   3 Crafty-23.2-4        2638    4    4 40000   58%  2580   23% 
   4 Crafty-23.2-2        2636    4    4 40000   57%  2580   23% 
   5 Crafty-23.1-4        2636    4    4 40000   57%  2580   23% 
   6 Crafty-23.2-1        2636    4    4 40000   57%  2580   22% 
   7 Crafty-23.2-3        2636    4    4 40000   57%  2580   23% 
   8 Crafty-23.1-1        2636    4    4 40000   57%  2580   22% 
   9 Crafty-23.1-2        2634    4    4 40000   57%  2580   23% 
  10 Crafty-23.1-3        2633    4    4 40000   57%  2580   23% 
  13 Crafty-23.0-1        2543    4    4 40000   45%  2580   21% 
  14 Crafty-23.0-2        2541    4    4 40000   45%  2580   21% 
  15 Crafty-23.0-4        2541    4    4 40000   45%  2580   21% 
  16 Crafty-23.0-3        2540    4    4 40000   45%  2580   21% 
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2T1        2656   29   29   402   60%  2575   19% 
   3 Crafty-23.2-4        2638    3    3 40000   58%  2579   23% 
   4 Crafty-23.2-2        2636    3    3 40000   57%  2579   23% 
   5 Crafty-23.1-4        2636    3    3 40000   57%  2579   23% 
   6 Crafty-23.2-1        2636    3    3 40000   57%  2579   22% 
   7 Crafty-23.2-3        2636    3    3 40000   57%  2579   23% 
   8 Crafty-23.1-1        2636    3    3 40000   57%  2579   22% 
   9 Crafty-23.1-2        2634    3    3 40000   57%  2579   23% 
  10 Crafty-23.1-3        2633    3    3 40000   57%  2579   23% 
  13 Crafty-23.0-1        2543    3    3 40000   45%  2579   21% 
  14 Crafty-23.0-2        2541    3    3 40000   45%  2579   21% 
  15 Crafty-23.0-4        2541    3    3 40000   45%  2579   21% 
  16 Crafty-23.0-3        2539    3    3 40000   45%  2579   21% 
-----------------------  currently using 29 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2T1        2644   25   25   529   58%  2576   20% 
   3 Crafty-23.2-4        2638    3    3 40000   58%  2580   23% 
   4 Crafty-23.2-2        2637    3    3 40000   57%  2580   23% 
   5 Crafty-23.1-4        2637    3    3 40000   57%  2580   23% 
   6 Crafty-23.2-1        2637    3    3 40000   57%  2580   22% 
   7 Crafty-23.2-3        2637    3    3 40000   57%  2580   23% 
   8 Crafty-23.1-1        2636    3    3 40000   57%  2580   22% 
   9 Crafty-23.1-2        2634    3    3 40000   57%  2580   23% 
  10 Crafty-23.1-3        2634    3    3 40000   57%  2580   23% 
  13 Crafty-23.0-1        2544    3    3 40000   45%  2580   21% 
  14 Crafty-23.0-2        2542    3    3 40000   45%  2580   21% 
  15 Crafty-23.0-4        2541    3    3 40000   45%  2580   21% 
  16 Crafty-23.0-3        2540    3    3 40000   45%  2580   21% 
-----------------------  currently using 29 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2-4        2639    3    3 40000   58%  2581   23% 
   3 Crafty-23.2-2        2637    3    3 40000   57%  2581   23% 
   4 Crafty-23.1-4        2637    3    3 40000   57%  2581   23% 
   5 Crafty-23.2-1        2637    3    3 40000   57%  2581   22% 
   6 Crafty-23.2-3        2637    3    3 40000   57%  2581   23% 
   7 Crafty-23.1-1        2637    3    3 40000   57%  2581   22% 
   8 Crafty-23.1-2        2635    3    3 40000   57%  2581   23% 
   9 Crafty-23.1-3        2634    3    3 40000   57%  2581   23% 
  10 Crafty-23.2T1        2633   23   23   654   57%  2577   20% 
  13 Crafty-23.0-1        2544    3    3 40000   45%  2581   21% 
  14 Crafty-23.0-2        2542    3    3 40000   45%  2581   21% 
  15 Crafty-23.0-4        2542    3    3 40000   45%  2581   21% 
  16 Crafty-23.0-3        2541    3    3 40000   45%  2581   21% 
-----------------------  currently using 29 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2-4        2639    3    3 40000   58%  2581   23% 
   3 Crafty-23.2-2        2637    3    3 40000   57%  2581   23% 
   4 Crafty-23.1-4        2637    3    3 40000   57%  2581   23% 
   5 Crafty-23.2-1        2637    3    3 40000   57%  2581   22% 
   6 Crafty-23.2-3        2637    3    3 40000   57%  2581   23% 
   7 Crafty-23.1-1        2637    3    3 40000   57%  2581   22% 
   8 Crafty-23.1-2        2635    3    3 40000   57%  2581   23% 
   9 Crafty-23.1-3        2634    3    3 40000   57%  2581   23% 
  10 Crafty-23.2T1        2633   21   21   766   57%  2577   20% 
  13 Crafty-23.0-1        2544    3    3 40000   45%  2581   21% 
  14 Crafty-23.0-2        2542    3    3 40000   45%  2581   21% 
  15 Crafty-23.0-4        2542    3    3 40000   45%  2581   21% 
  16 Crafty-23.0-3        2541    3    3 40000   45%  2581   21% 
-----------------------  currently using 27 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2-4        2639    3    3 40000   58%  2581   23% 
   3 Crafty-23.2-2        2638    3    3 40000   57%  2581   23% 
   4 Crafty-23.1-4        2638    3    3 40000   57%  2581   23% 
   5 Crafty-23.2-1        2637    3    3 40000   57%  2581   22% 
   6 Crafty-23.2-3        2637    3    3 40000   57%  2581   23% 
   7 Crafty-23.1-1        2637    3    3 40000   57%  2581   22% 
   8 Crafty-23.1-2        2635    3    3 40000   57%  2581   23% 
   9 Crafty-23.1-3        2635    3    3 40000   57%  2581   23% 
  10 Crafty-23.2T1        2630   20   20   868   56%  2580   21% 
  13 Crafty-23.0-1        2544    3    3 40000   45%  2581   21% 
  14 Crafty-23.0-2        2543    3    3 40000   45%  2581   21% 
  15 Crafty-23.0-4        2542    3    3 40000   45%  2581   21% 
  16 Crafty-23.0-3        2541    3    3 40000   45%  2581   21% 
-----------------------  currently using 27 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2-4        2639    3    3 40000   58%  2581   23% 
   3 Crafty-23.2-2        2637    3    3 40000   57%  2581   23% 
   4 Crafty-23.1-4        2637    3    3 40000   57%  2581   23% 
   5 Crafty-23.2-1        2637    3    3 40000   57%  2581   22% 
   6 Crafty-23.2-3        2637    3    3 40000   57%  2581   23% 
   7 Crafty-23.1-1        2637    3    3 40000   57%  2581   22% 
   8 Crafty-23.1-2        2635    3    3 40000   57%  2581   23% 
   9 Crafty-23.1-3        2634    3    3 40000   57%  2581   23% 
  10 Crafty-23.2T1        2631   19   19   953   56%  2583   22% 
  13 Crafty-23.0-1        2544    3    3 40000   45%  2581   21% 
  14 Crafty-23.0-2        2543    3    3 40000   45%  2581   21% 
  15 Crafty-23.0-4        2542    3    3 40000   45%  2581   21% 
  16 Crafty-23.0-3        2541    3    3 40000   45%  2581   21% 
-----------------------  currently using 29 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2-4        2639    3    3 40000   58%  2580   23% 
   3 Crafty-23.2T1        2638   18   18  1056   57%  2581   22% 
   4 Crafty-23.2-2        2637    3    3 40000   57%  2580   23% 
   5 Crafty-23.1-4        2637    3    3 40000   57%  2580   23% 
   6 Crafty-23.2-1        2637    3    3 40000   57%  2580   22% 
   7 Crafty-23.2-3        2637    3    3 40000   57%  2580   23% 
   8 Crafty-23.1-1        2637    3    3 40000   57%  2580   22% 
   9 Crafty-23.1-2        2635    3    3 40000   57%  2580   23% 
  10 Crafty-23.1-3        2634    3    3 40000   57%  2580   23% 
  13 Crafty-23.0-1        2544    3    3 40000   45%  2580   21% 
  14 Crafty-23.0-2        2542    3    3 40000   45%  2580   21% 
  15 Crafty-23.0-4        2542    3    3 40000   45%  2580   21% 
  16 Crafty-23.0-3        2540    3    3 40000   45%  2580   21% 
-----------------------  currently using 29 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2T1        2640   17   17  1177   58%  2580   22% 
   3 Crafty-23.2-4        2639    3    3 40000   58%  2580   23% 
   4 Crafty-23.2-2        2637    3    3 40000   57%  2580   23% 
   5 Crafty-23.1-4        2637    3    3 40000   57%  2580   23% 
   6 Crafty-23.2-1        2637    3    3 40000   57%  2580   22% 
   7 Crafty-23.2-3        2637    3    3 40000   57%  2580   23% 
   8 Crafty-23.1-1        2637    3    3 40000   57%  2580   22% 
   9 Crafty-23.1-2        2635    3    3 40000   57%  2580   23% 
  10 Crafty-23.1-3        2634    3    3 40000   57%  2580   23% 
  13 Crafty-23.0-1        2544    3    3 40000   45%  2580   21% 
  14 Crafty-23.0-2        2542    3    3 40000   45%  2580   21% 
  15 Crafty-23.0-4        2541    3    3 40000   45%  2580   21% 
  16 Crafty-23.0-3        2540    3    3 40000   45%  2580   21% 
-----------------------  currently using 29 nodes.
Rank Name                  Elo    +    - games score oppo. draws
   2 Crafty-23.2T1        2641   16   16  1292   58%  2580   22% 
   3 Crafty-23.2-4        2639    3    3 40000   58%  2580   23% 
   4 Crafty-23.2-2        2637    3    3 40000   57%  2580   23% 
   5 Crafty-23.1-4        2637    3    3 40000   57%  2580   23% 
   6 Crafty-23.2-1        2637    3    3 40000   57%  2580   22% 
   7 Crafty-23.2-3        2637    3    3 40000   57%  2580   23% 
   8 Crafty-23.1-1        2637    3    3 40000   57%  2580   22% 
   9 Crafty-23.1-2        2635    3    3 40000   57%  2580   23% 
  10 Crafty-23.1-3        2634    3    3 40000   57%  2580   23% 
  14 Crafty-23.0-2        2542    3    3 40000   45%  2580   21% 
  15 Crafty-23.0-4        2541    3    3 40000   45%  2580   21% 
  16 Crafty-23.0-3        2540    3    3 40000   45%  2580   21% 
Crafty-23.0-1/2/3/4, Crafty-23.1-1/2/3/4 and Crafty-23.2-1/2/3/4 are the old results previously computed. 23.0 and 23.1 are the two released versions. Each was run 4 times to verify that the results were similar and that nothing unusual was going on. Crafty-23.2T1 is the latest. For each sample, look at where it is in the overall standings based on Elo and error bar. By the time we had completed 400 games, it appeared to be almost +20 Elo better than the best version. By around 900 games, it was about 10 Elo worse than the best. It is now settling down and may be very slightly better, although I won't know until the entire thing is completed.

This does show the danger of relying on small numbers of games to predict whether a change is good or bad...

Code:

   2 Crafty-23.2T1        2640   11   11  3049   58%  2580   21% 
   3 Crafty-23.2-4        2639    3    3 40000   58%  2580   23% 
   4 Crafty-23.2-2        2637    3    3 40000   57%  2580   23% 
   5 Crafty-23.1-4        2637    3    3 40000   57%  2580   23% 
   6 Crafty-23.2-1        2637    3    3 40000   57%  2580   22% 
   7 Crafty-23.2-3        2637    3    3 40000   57%  2580   23% 
   8 Crafty-23.1-1        2637    3    3 40000   57%  2580   22% 
   9 Crafty-23.1-2        2635    3    3 40000   57%  2580   23% 
  10 Crafty-23.1-3        2634    3    3 40000   57%  2580   23% 
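
As a rough illustration of why a few hundred games cannot resolve differences this small, here is a minimal sketch that estimates the 95% error margin from the number of games, the score and the draw rate. It assumes independent games, the plain logistic Elo formula and a normal approximation. This is not the model the rating tool above uses, and the function names are made up for this example, so expect only rough agreement with the +/- columns.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an expected score under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(games, score, draw_rate, z=1.96):
    """Approximate +/- margin in Elo for a match result.

    Assumption: games are independent and the normal approximation holds.
    """
    wins = score - draw_rate / 2.0
    losses = 1.0 - wins - draw_rate
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - score) ** 2
                + draw_rate * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2)
    std_err = math.sqrt(variance / games)
    # Slope of the Elo curve at this score turns score error into Elo error.
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * std_err * slope


if __name__ == "__main__":
    # Score and draw rate roughly as in the tables above (58%, 22%).
    for games in (400, 1000, 3000, 40000):
        print(games, round(elo_margin(games, 0.58, 0.22), 1))
```

With these inputs the margin at 400 games is ten times the margin at 40000 games, since it scales with the inverse square root of the game count.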
liuzy

Re: interesting test data.

Post by liuzy »

Bob, why don't you improve ipplite using your cluster?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: interesting test data.

Post by bob »

liuzy wrote:Bob, why don't you improve ipplite using your cluster?
Because we are busy improving Crafty, which is my own code. :) What would be the benefit of improving IP* if we one day discover it is derived from Rybka??? Also, what would be the motivation? I want to win with my code. Not much point in copying or using what someone else has done, IMHO. This is about enjoying a hobby, not copying what others have done. (Of course, not _everyone_ believes in that philosophy, but that's a separate issue.)
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: interesting test data.

Post by diep »

bob wrote:
liuzy wrote:Bob, why don't you improve ipplite using your cluster?
Because we are busy improving Crafty, which is my own code. :) What would be the benefit of improving IP* if we one day discover it is derived from Rybka??? Also, what would be the motivation? I want to win with my code. Not much point in copying or using what someone else has done, IMHO. This is about enjoying a hobby, not copying what others have done. (Of course, not _everyone_ believes in that philosophy, but that's a separate issue.)
That wasn't a very ethical question in your direction Bob!

These games are at which time control for each game?

Vincent
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: interesting test data.

Post by hgm »

Bit of an expensive way to calculate a square root. :lol:

As expected, the value nicely drifts in the range of about +/- sigma.
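
The square-root remark can be checked directly against the numbers posted in this thread. The sketch below takes the (games, error bar) pairs reported for Crafty-23.2T1, including the finished 40000-game run, and multiplies each error bar by the square root of the game count. If the error bar shrinks as 1/sqrt(games), the product should stay roughly constant. The variable and function names are chosen for this example only.

```python
import math

# (games, reported +/- margin) for Crafty-23.2T1, copied from the
# snapshots in this thread; the last pair is the completed run.
SNAPSHOTS = [
    (51, 84), (164, 46), (285, 35), (402, 29), (529, 25),
    (654, 23), (766, 21), (868, 20), (953, 19), (1056, 18),
    (1177, 17), (1292, 16), (3049, 11), (40000, 3),
]


def scaled_margins(snapshots):
    """Return (games, margin, margin * sqrt(games)) for each snapshot."""
    return [(g, m, m * math.sqrt(g)) for g, m in snapshots]


if __name__ == "__main__":
    for games, margin, product in scaled_margins(SNAPSHOTS):
        print(f"{games:6d} games  +/-{margin:3d}  "
              f"margin*sqrt(games) = {product:6.1f}")
```

The reported margins are rounded to whole Elo points, so some scatter in the product is expected.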
jdart
Posts: 4397
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: interesting test data.

Post by jdart »

bob wrote:Each output is produced about once every 45 seconds or so
Yikes. What is the cluster size, and what game time control are you using?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: interesting test data.

Post by bob »

diep wrote:
bob wrote:
liuzy wrote:Bob, why don't you improve ipplite using your cluster?
Because we are busy improving Crafty, which is my own code. :) What would be the benefit of improving IP* if we one day discover it is derived from Rybka??? Also, what would be the motivation? I want to win with my code. Not much point in copying or using what someone else has done, IMHO. This is about enjoying a hobby, not copying what others have done. (Of course, not _everyone_ believes in that philosophy, but that's a separate issue.)
That wasn't a very ethical question in your direction Bob!

These games are at which time control for each game?

Vincent
Those were 30 seconds + 0.2 seconds increment, I believe.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: interesting test data.

Post by bob »

hgm wrote:Bit of an expensive way to calculate a square root. :lol:

As expected, the value nicely drifts in the range of about +/- sigma.
I know, but it does graphically show this. When you read "300 games might give you a hint" it makes you cringe. That version actually finished like this, for reference:

Code:

   2 Crafty-23.2-4        2637    3    3 40000   58%  2579   23% 
   3 Crafty-23.2-2        2636    3    3 40000   57%  2579   23% 
   4 Crafty-23.1-4        2636    3    3 40000   57%  2579   23% 
   5 Crafty-23.2-1        2636    3    3 40000   57%  2579   22% 
   6 Crafty-23.2-3        2636    3    3 40000   57%  2579   23% 
   7 Crafty-23.1-1        2635    3    3 40000   57%  2579   22% 
   8 Crafty-23.1-2        2633    3    3 40000   57%  2579   23% 
   9 Crafty-23.1-3        2633    3    3 40000   57%  2579   23% 
  10 Crafty-23.2T1-1      2631    3    3 40000   57%  2579   23% 
So the "hints" were all "bad hints" as it ended up a little worse than the best 23.2 versions so far...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: interesting test data.

Post by bob »

jdart wrote:
bob wrote:Each output is produced about once every 45 seconds or so
Yikes. What is the cluster size, and what game time control are you using?
Those were 30 sec + 0.2 sec games. Something over 32 nodes is up, about 1/4 of the cluster. We are bringing it up slowly to monitor the A/C behaviour. Normally I see a couple of hundred games every update.
Edsel Apostol
Posts: 803
Joined: Mon Jul 17, 2006 5:53 am
Full name: Edsel Apostol

Re: interesting test data.

Post by Edsel Apostol »

It would be interesting to see this data shown in a graph, including the error bars. I'm interested to see how consistent (straight) the Elo line is and how the error bars converge toward the Elo line as the number of games increases.
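
A minimal text-only sketch of such a graph, using only the Crafty-23.2T1 numbers already posted in this thread: each row is one snapshot, `*` marks the Elo estimate and `[---]` spans the reported error bar. The axis range, the width and the function names are arbitrary choices for this example. Note that each snapshot's ratings are relative to the pool at that moment, which shifts by a point or two, so the comparison is approximate.

```python
# (games, Elo, +/- margin) for Crafty-23.2T1 from the snapshots in this
# thread; the last entry is the finished 40000-game run.
SERIES = [
    (51, 2636, 84), (164, 2639, 46), (285, 2652, 35), (402, 2656, 29),
    (529, 2644, 25), (654, 2633, 23), (766, 2633, 21), (868, 2630, 20),
    (953, 2631, 19), (1056, 2638, 18), (1177, 2640, 17), (1292, 2641, 16),
    (3049, 2640, 11), (40000, 2631, 3),
]


def column(value, lo, hi, width):
    """Map an Elo value to a character column, clamped to the axis."""
    value = min(max(value, lo), hi)
    return round((value - lo) / (hi - lo) * width)


def render(series, lo=2550, hi=2730, width=60):
    """Return one text row per snapshot showing estimate and error bar."""
    lines = []
    for games, elo, margin in series:
        row = [" "] * (width + 1)
        left = column(elo - margin, lo, hi, width)
        right = column(elo + margin, lo, hi, width)
        for i in range(left, right + 1):
            row[i] = "-"
        row[left], row[right] = "[", "]"
        row[column(elo, lo, hi, width)] = "*"
        lines.append(f"{games:6d} " + "".join(row).rstrip())
    return lines


def final_inside_every_bar(series):
    """Check whether the last Elo value lies inside every earlier error bar."""
    final_elo = series[-1][1]
    return all(elo - m <= final_elo <= elo + m for _, elo, m in series[:-1])


if __name__ == "__main__":
    print("\n".join(render(SERIES)))
    print("final value inside every earlier bar:",
          final_inside_every_bar(SERIES))
```

For a real plot, the same `SERIES` list could be fed to any plotting library that supports error bars.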