by lucasart » Wed Feb 01, 2012 11:27 am
My testing seems to confirm the rumours that SF 2.2.2 is significantly stronger than 2.2.1 (20-30 Elo).
Remember to look at the error bars, however.
Rank Name                  Elo    +    - games score oppo. draws
   1 Critter 1.4          3240   36   35   300   75%  3031   31%
   2 Stockfish 2.2.2      3189   41   39   250   71%  3011   27%
   3 IvanHoe 999946h      3182   35   34   300   65%  3051   33%
   4 Protector 1.4        2920   32   32   400   51%  2915   21%
   5 Umko 1.2             2867   27   27   550   53%  2854   24%
   6 Toga 1.4.1           2847   25   25   650   57%  2804   24%
   7 Daydreamer 1.75      2741   25   25   550   61%  2662   29%
   8 Fruit 2.1            2700   23   23   650   52%  2682   26%
   9 Crafty 23.4          2667   25   25   550   45%  2706   23%
  10 Cheng3 1.07          2655   26   26   500   54%  2628   25%
  11 GNU Chess 5.07.173b  2645   23   23   600   47%  2667   28%
  12 Arasan 13.4          2640   25   25   550   47%  2660   22%
  13 Rodent 0.10          2633   24   24   550   55%  2598   26%
  14 Scorpio 2.7          2628   22   22   700   44%  2677   27%
  15 Pepito 1.59          2596   23   23   650   52%  2578   23%
  16 Sloppy 0.2.2         2540   22   22   750   41%  2604   24%
  17 EXchess 6.10         2535   24   23   650   59%  2463   24%
  18 Greko 9.0            2495   22   22   750   41%  2567   23%
  19 Pawny 0.3.1          2475   27   27   500   48%  2495   19%
  20 DoubleCheck 2.5.2    2464   27   26   500   56%  2418   21%
  21 Olithink 5.3.0       2385   29   29   400   49%  2391   21%
  22 Sungorus 1.4         2342   26   27   500   38%  2430   23%
  23 Jazz 501             2317   30   30   400   39%  2399   20%
  24 Beowulf 2.4          2254   35   36   300   34%  2381   19%
  25 KMT Chess 1.2.1      2246   35   36   300   33%  2383   17%
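For readers wondering where error bars of this size come from, here is a rough Python sketch. It assumes the standard logistic Elo model and a normal approximation to the per-game score distribution; the rating tool behind the table may well use a different model, so the numbers will not match exactly. The function names `elo_diff` and `elo_interval` are made up for this illustration.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction (0 < score < 1),
    assuming the logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(score, draw_ratio, games, z=1.96):
    """Approximate (low, mid, high) Elo difference versus the opposition.

    Assumption: games are independent and the mean score is roughly
    normally distributed, so z=1.96 gives about a 95% interval.
    """
    wins = score - draw_ratio / 2.0
    losses = 1.0 - score - draw_ratio / 2.0
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + losses * score ** 2)
    std_err = math.sqrt(variance / games)
    return (elo_diff(score - z * std_err),
            elo_diff(score),
            elo_diff(score + z * std_err))


# Stockfish 2.2.2 row above: 250 games, 71% score, 27% draws.
low, mid, high = elo_interval(0.71, 0.27, 250)
print(f"{mid:+.0f} Elo vs opposition, interval {low:+.0f} .. {high:+.0f}")
```

With only 250 games the interval should come out at roughly 40 Elo on either side, the same order as the +41/-39 in the table, which is why a 20-30 Elo gain cannot be confirmed from this run alone.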
by mcostalba » Wed Feb 01, 2012 8:29 pm
lucasart wrote: My testing seems to confirm the rumours that SF 2.2.2 is significantly stronger than 2.2.1 (20-30 Elo).
Perhaps that little easy-move detector bug has more impact than one would assume: I noticed it while watching a game on Playchess between SF and Houdini.
Houdini pushed a pawn in a very complicated position where SF had the advantage; SF immediately took it and immediately found itself fighting for a draw!
I understand it happens rarely, but when it does, a single bad move can compromise an entire game, especially at mid/long TC: against these monsters a single weak move is enough to lose a game.
by lucasart » Thu Feb 02, 2012 12:35 am
I played another 50 games vs Critter, IvanHoe, and Protector, so that SF 2.2.2 has now played exactly the same games as SF 2.2.1. About +10 Elo now. Anyway, the easy-move fix is certainly a good bugfix.
Rank Name                  Elo    +    - games score oppo. draws
   1 Critter 1.4          3245   33   32   350   73%  3052   32%
   2 IvanHoe 999946h      3186   32   31   350   64%  3069   35%
   3 Stockfish 2.2.2      3178   30   30   400   65%  3052   31%
   4 Protector 1.4        2921   30   30   450   47%  2944   22%
   5 Umko 1.2             2867   27   27   550   53%  2854   24%
   6 Toga 1.4.1           2847   25   25   650   57%  2804   24%
   7 Daydreamer 1.75      2741   25   25   550   61%  2662   29%
   8 Fruit 2.1            2700   23   23   650   52%  2682   26%
   9 Crafty 23.4          2667   25   25   550   45%  2706   23%
  10 Cheng3 1.07          2655   26   26   500   54%  2628   25%
  11 GNU Chess 5.07.173b  2644   23   23   600   47%  2667   28%
  12 Arasan 13.4          2640   25   25   550   47%  2660   22%
  13 Rodent 0.10          2632   24   24   550   55%  2598   26%
  14 Scorpio 2.7          2628   22   22   700   44%  2677   27%
  15 Pepito 1.59          2596   23   23   650   52%  2578   23%
  16 Sloppy 0.2.2         2540   22   22   750   41%  2604   24%
  17 EXchess 6.10         2535   24   23   650   59%  2462   24%
  18 Greko 9.0            2495   22   22   750   41%  2567   23%
  19 Pawny 0.3.1          2475   27   27   500   48%  2494   19%
  20 DoubleCheck 2.5.2    2464   27   26   500   56%  2418   21%
  21 Olithink 5.3.0       2385   29   29   400   49%  2391   21%
  22 Sungorus 1.4         2342   26   27   500   38%  2430   23%
  23 Jazz 501             2317   30   30   400   39%  2399   20%
  24 Beowulf 2.4          2253   35   36   300   34%  2381   19%
  25 KMT Chess 1.2.1      2246   35   36   300   33%  2383   17%
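A quick way to read the error bars in a list like this is to check whether two engines' intervals overlap. The Python sketch below does that for the top three rows; the helper names `interval` and `intervals_overlap` are hypothetical, and overlap is only a crude heuristic (non-overlap suggests a real difference, while overlap just means the data cannot separate the two).

```python
def interval(elo, plus, minus):
    """Return (low, high) from an Elo estimate and its +/- error bars."""
    return (elo - minus, elo + plus)


def intervals_overlap(a, b):
    """True if the two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Values copied from the top three rows of the table above.
critter = interval(3245, 33, 32)
ivanhoe = interval(3186, 32, 31)
stockfish = interval(3178, 30, 30)

print("Critter vs Stockfish overlap:", intervals_overlap(critter, stockfish))
print("IvanHoe vs Stockfish overlap:", intervals_overlap(ivanhoe, stockfish))
```

By the same reasoning, a gain of about 10 Elo between two versions is far smaller than error bars of around 30 Elo, so it would take many more games to tell it apart from noise.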