lkaufman wrote: It seems at first that we have a bit of a mystery, in that Stockfish 4 beats Houdini 3 in your test but does worse against almost every other opponent.
Please let me help you re-read the data. This is the table of score improvements from SF 3 to SF 4, as downloaded from Ingo's site.
4 Stockfish 4 : 3016 3000 (+1660,=1113,-227), 73.9 %
8 Stockfish 3 : 2977 3450 (+1568,=1490,-392), 67.0 %
Houdini 3 STD : 37.3 -> 52.0
Komodo CCT : 41.3 -> 48.3
Critter 1.4a : 51.7 -> 54.3
Deep Rybka 4.1 : 55.0 -> 59.3
Gull 2.1 : 59.0 -> 61.0
Chiron 1.5 : 68.0 -> 74.0
Protector 1.5.0 : 71.7 -> 77.7
Naum 4.2 : 71.3 -> 76.7
Hannibal 1.3 : 72.7 -> 74.3
Deep Fritz 13 32b : 74.7 -> 77.3
HIARCS 14 WCSC 32b : 71.3 -> 75.7
Deep Shredder 12 : 70.0 -> 77.3
Deep Sjeng c't 2010 32b : 75.7 -> 81.7
Spike 1.4 32b : 82.0 -> 83.3
spark-1.0 : 81.7 -> 80.3 *
Deep Junior 13.3 : 80.7 -> 82.0
Booot 5.2.0 : 78.7 -> 80.3
Quazar 0.4 : 84.7 -> 88.7
Toga II 3.0 32b : 82.7 -> 87.7
Zappa Mexico II : 81.0 -> 85.7
So SF has done dramatically better against H3 and much better against Komodo, and it has improved substantially against every opponent except spark. It is also interesting to note that the improvement is spread more or less evenly across the whole range, against strong and weak opponents alike. For instance, we have also gained a lot against Toga, Deep Sjeng and Shredder.
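For anyone who wants to check the reading above, here is a rough Python sketch that recomputes the per-opponent gains from the table. The figures are copied from the list; the names `scores`, `gains` and `score_pct` are just mine for illustration, and I am assuming the two numbers after each engine name are the rating and the number of games, which matches wins + draws + losses.

```python
# Score percentages against each opponent, copied from the table above:
# (Stockfish 3 score, Stockfish 4 score).
scores = {
    "Houdini 3 STD": (37.3, 52.0),
    "Komodo CCT": (41.3, 48.3),
    "Critter 1.4a": (51.7, 54.3),
    "Deep Rybka 4.1": (55.0, 59.3),
    "Gull 2.1": (59.0, 61.0),
    "Chiron 1.5": (68.0, 74.0),
    "Protector 1.5.0": (71.7, 77.7),
    "Naum 4.2": (71.3, 76.7),
    "Hannibal 1.3": (72.7, 74.3),
    "Deep Fritz 13 32b": (74.7, 77.3),
    "HIARCS 14 WCSC 32b": (71.3, 75.7),
    "Deep Shredder 12": (70.0, 77.3),
    "Deep Sjeng c't 2010 32b": (75.7, 81.7),
    "Spike 1.4 32b": (82.0, 83.3),
    "spark-1.0": (81.7, 80.3),
    "Deep Junior 13.3": (80.7, 82.0),
    "Booot 5.2.0": (78.7, 80.3),
    "Quazar 0.4": (84.7, 88.7),
    "Toga II 3.0 32b": (82.7, 87.7),
    "Zappa Mexico II": (81.0, 85.7),
}


def score_pct(wins, draws, losses):
    """Overall score in percent, counting a draw as half a point."""
    return 100.0 * (wins + 0.5 * draws) / (wins + draws + losses)


# Gain in percentage points from SF 3 to SF 4 against each opponent.
gains = {name: round(sf4 - sf3, 1) for name, (sf3, sf4) in scores.items()}

# Opponents against which SF 4 scored worse than SF 3.
regressions = [name for name, gain in gains.items() if gain < 0]

# Average gain across all opponents.
mean_gain = sum(gains.values()) / len(gains)

if __name__ == "__main__":
    for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
        print(f"{name:26s} {gain:+5.1f}")
    print("regressions:", regressions)
    print(f"mean gain: {mean_gain:.2f} points")
    print(f"SF 4 overall: {score_pct(1660, 1113, 227):.1f} %")
    print(f"SF 3 overall: {score_pct(1568, 1490, 392):.1f} %")
```

Sorting by gain puts Houdini 3 at the top and spark as the only negative entry, and the two overall percentages recomputed from the win/draw/loss counts agree with the 73.9 % and 67.0 % shown in the list.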
Ingo, thanks a lot for running your tournament in a timely and efficient way as usual, and sorry if the binary issue has caused you some trouble. I will try to manage it differently in the future.