A while back you mentioned that I should move from the older Glaurung 2-epsilon version to the most recent one. I didn't switch at the time because I didn't want to alter a constant opponent that is represented in a lot of old data.
With the new testing approach, I am now in the process of re-evaluating the opponents, and perhaps adding a few more (playing fewer games per opponent to keep the total computation about the same).
One oddity I found is this:
crafty-22.2R5
Rank Name                   Elo    +    - games score oppo. draws
   1 Glaurung 2-epsilon/5   115    9    9  3894   70%   -34   20% 
   2 Fruit 2.1               68    9    9  3894   64%   -34   24% 
   3 opponent-21.7           20    8    8  3894   58%   -34   34% 
   4 Glaurung 1.1 SMP        14    9    9  3894   57%   -34   20% 
   5 Crafty-22.2            -34    5    5 19470   44%     7   23% 
   6 Arasan 10.0           -184    9    9  3894   30%   -34   20% 
Rank Name               Elo    +    - games score oppo. draws
   1 Glaurung 2.1        95   11   11  2271   65%   -22   18% 
   2 Fruit 2.1           52   11   11  2267   60%   -22   24% 
   3 Glaurung 1.1 SMP    27   11   11  2263   57%   -22   21% 
   4 opponent-21.7       16   11   10  2269   56%   -22   35% 
   5 Crafty-22.2        -22    5    6 11344   46%     4   24% 
   6 Arasan 10.0       -169   11   11  2274   30%   -22   20% 
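For anyone who wants to sanity-check ratings like these against the raw score percentages, here is a rough sketch using the plain logistic Elo formula. This is only an illustration under that assumption: BayesElo also models draws and the advantage of moving first, so its numbers need not match exactly. The function name `expected_score` is made up for this example.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score under the plain logistic Elo model (no draw model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Rating gaps to Crafty-22.2, read off the two tables above.
old_gap = 115 - (-34)  # Glaurung 2-epsilon/5 vs. Crafty-22.2
new_gap = 95 - (-22)   # Glaurung 2.1 vs. Crafty-22.2

print(f"old: gap {old_gap:+d} Elo, expected score {expected_score(old_gap):.1%}")
print(f"new: gap {new_gap:+d} Elo, expected score {expected_score(new_gap):.1%}")
```

Under this simplified model the two gaps work out to roughly 70% and 66%, which is close to the 70% and 65% in the score column.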
I could, if you are interested, run a round-robin in which everybody plays everybody a ton of games, to see how the old and new versions compare.
It is always possible that the results will change when all games are played, so I will post a follow-up when it finishes, probably late tonight.
The cluster is currently running at half speed, with half the nodes powered down until the A/C problem is resolved.
