Testing Stockfish 11-03-13. 480 Games.


Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

TESTING STOCKFISH IPMAN COMPILE 180114X IP = 480 GAMES.

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Large Pages allowed.
Hash 512
Relative Speed: 28.66
Knodes per second: 13.759

Time Control= 4+0

Stockfish 180114IP 64 SSE4.2L - Houdini 4 x64_st_X6_CT0 22.0 - 18.0 +10/=24/-6 55.00%
Stockfish 180114IP 64 SSE4.2L - Komodo TCECr 64-bitx6 24.5 - 15.5 +14/=21/-5 61.25%
Stockfish 180114IP 64 SSE4.2L - Critter 1.6a 64-bitX6_NOB 25.0 - 15.0 +13/=24/-3 62.50%

Time Control= 2+2

Stockfish 180114IP 64 SSE4.2L - Houdini 4 x64_st_X6_CT0 20.0 - 20.0 +7/=26/-7 50.00%
Stockfish 180114IP 64 SSE4.2L - Komodo TCECr 64-bitx6 26.5 - 13.5 +18/=17/-5 66.25%
Stockfish 180114IP 64 SSE4.2L - Critter 1.6a 64-bitX6_NOB 25.0 - 15.0 +17/=16/-7 62.50%

240 Games = http://www.mediafire.com/view/5f372i43u ... 0games.pgn
Score using 6 Cores= 143.0 – 97.0= 59.58%

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Large Pages allowed
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control = 4+0

Stockfish 180114IP 64 SSE4.2L - Houdini 4 x64xCT0 19.5 - 20.5 +9/=21/-10 48.75%
Stockfish 180114IP 64 SSE4.2L - Komodo TCECr 64-bitx4 23.0 - 17.0 +14/=18/-8 57.50%
Stockfish 180114IP 64 SSE4.2L - Critter 1.6a 64-bitnob_4 24.0 - 16.0 +13/=22/-5 60.00%

Time Control= 2+2

Stockfish 180114IP 64 SSE4.2L - Houdini 4 x64xCT0 19.0 - 21.0 +9/=20/-11 47.50%
Stockfish 180114IP 64 SSE4.2L - Komodo TCECr 64-bitx4 24.5 - 15.5 +15/=19/-6 61.25%
Stockfish 180114IP 64 SSE4.2L - Critter 1.6a 64-bitnob_4 26.0 - 14.0 +13/=26/-1 65.00%

240 Games= http://www.mediafire.com/view/91seo8s5i ... 0Games.pgn
Score using 4 Cores= 136.0 – 104.0 = 56.67%

Segmenting by Time Control:

Fixed TC = 138.0 – 102.0 = 57.50%
Incremental TC = 141.0 – 99.0 = 58.75%

Global Score= 279.0 – 201.0= 58.12%

Against : Houdini 4.0 St. Ct0 (3233) =50.31% ; Komodo TCECr (3178) =61.56% ; Critter 1.6a (3093) = 62.50%

Average Estimated Elo Opponents = 3168
Estimated Elo Performance= 3225
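
For readers who want to reproduce the two figures above, here is a minimal sketch. It assumes the common logistic Elo model (performance = average opponent rating + 400·log10(p/(1−p))). The post does not say which formula was used, so treat the result as an approximation that may differ by a point or so. The function name `performance_rating` is made up for this example.

```python
import math

def performance_rating(points, games, avg_opponent_elo):
    """Estimate a performance rating under the logistic Elo model (an assumption)."""
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return avg_opponent_elo + 400.0 * math.log10(p / (1.0 - p))

# Opponent ratings as quoted in the post.
opponents = {"Houdini 4.0 St. Ct0": 3233, "Komodo TCECr": 3178, "Critter 1.6a": 3093}
avg_opp = sum(opponents.values()) / len(opponents)

# Global score from the post: 279.0 points out of 480 games.
perf = performance_rating(279.0, 480, avg_opp)
print(f"average opponent: {avg_opp:.0f}, performance: {perf:.0f}")
```

With the global score of 279.0 out of 480 this lands on roughly 3225 against an average opponent of 3168, in line with the numbers quoted.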


18 EEP (Estimated Elo Points) below the best SF Ipman compile, 050114 (3243)
20 EEP below the best engine ever in my tests: SF Development, 080114 (3245)

Regards,

Tom.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by duncan »

Tomcass wrote:TESTING STOCKFISH IPMAN COMPILE 180114X IP = 480 GAMES.

Do you know the longest time it has taken Stockfish to gain an Elo point over its previous best version? 2 weeks? 3 weeks?

duncan
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

Hi Duncan,

I have not calculated the longest time for SF to gain an Elo point over its previous version. I cannot do it from my tests. They have an obvious limitation regarding the number of games (480 each).

Only as a reference, the average Elo improvement per week for Stockfish in the last 10 months is roughly 3 points. :-)
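
To put the 480-game limitation in numbers, here is a rough sketch of how many games it would take to resolve a small Elo difference. It is not Tom's method. It assumes independent games, a normal approximation, the logistic Elo model, and illustrative values for the expected score (0.58) and the per-game standard deviation (0.31), both roughly typical of the results in this thread. The function and parameter names are invented for the example.

```python
import math

def games_needed(elo_margin, score=0.58, per_game_sd=0.31, z=1.96):
    """Games needed for a ~95% margin of +/- elo_margin (normal approximation)."""
    # Slope of the logistic Elo curve: Elo points per unit of score fraction.
    elo_per_score = 400.0 / (math.log(10) * score * (1.0 - score))
    # Convert the Elo margin into a margin on the score fraction.
    score_margin = elo_margin / elo_per_score
    return math.ceil((z * per_game_sd / score_margin) ** 2)

for margin in (20, 10, 3):
    print(f"+/-{margin} Elo: about {games_needed(margin)} games")
```

Under these assumptions a match of around 480 games resolves about ±20 Elo, while a 3-point gain would need on the order of 20,000 games.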

Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

Hi,

Someone has reported problems downloading the 4 cores link in my latest test. If you cannot download it, please try this one:

http://www.mediafire.com/view/pwrmppl14 ... mes(2).pgn

Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

TESTING STOCKFISH ROCKWOOD COMPILE 210114 = 480 GAMES.

i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Large Pages allowed.
Hash 512
Relative Speed: 28.66
Knodes per second: 13.759

Time Control= 4+0

Stockwood210114 64 SSE4.2L - Houdini 4 x64_st_X6_CT0 23.5 - 16.5 +12/=23/-5 58.75%
Stockwood210114 64 SSE4.2L - Komodo TCECr 64-bitx6 25.0 - 15.0 +14/=22/-4 62.50%
Stockwood210114 64 SSE4.2L - Critter 1.6a 64-bitX6_NOB 28.5 - 11.5 +18/=21/-1 71.25%

Time Control= 2+2

Stockwood210114 64 SSE4.2L - Houdini 4 x64_st_X6_CT0 20.5 - 19.5 +8/=25/-7 51.25%
Stockwood210114 64 SSE4.2L - Komodo TCECr 64-bitx6 25.0 - 15.0 +13/=24/-3 62.50%
Stockwood210114 64 SSE4.2L - Critter 1.6a 64-bitX6_NOB 25.5 - 14.5 +14/=23/-3 63.75%

240 Games = http://www.mediafire.com/view/ger6n4w43 ... 0games.pgn
Score using 6 Cores= 148.0 – 92.0 = 61.67%

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Large Pages allowed
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control = 4+0

Stockwood210114 64 SSE4.2L - Houdini 4 x64xCT0 22.5 - 17.5 +10/=25/-5 56.25%
Stockwood210114 64 SSE4.2L - Komodo TCECr 64-bitx4 26.0 - 14.0 +16/=20/-4 65.00%
Stockwood210114 64 SSE4.2L - Critter 1.6a 64-bitnob_4 29.0 - 11.0 +19/=20/-1 72.50%

Time Control= 2+2

Stockwood210114 64 SSE4.2L - Houdini 4 x64xCT0 18.0 - 22.0 +5/=26/-9 45.00%
Stockwood210114 64 SSE4.2L - Komodo TCECr 64-bitx4 23.5 - 16.5 +11/=25/-4 58.75%
Stockwood210114 64 SSE4.2L - Critter 1.6a 64-bitnob_4 26.0 - 14.0 +15/=22/-3 65.00%

240 Games=
http://www.mediafire.com/view/30r8fu7xx ... 0games.pgn
Score using 4 Cores= 145.0 – 95.0 = 60.42%

Segmenting by Time Control:

Fixed TC = 154.5 – 85.5 = 64.37%
Incremental TC = 138.5 – 101.5 = 57.71%

Global Score= 293.0 – 187.0 = 61.0417%

Against : Houdini 4.0 St. Ct0 (3233) = 52.81% ; Komodo TCECr (3178) = 62.19% ; Critter 1.6a (3093) = 68.12%

Average Estimated Elo Opponents = 3168
Estimated Elo Performance= 3245


This version of Rockwood improved the best score so far in my tests, but by only half a point in 480 games. Obviously within error bars.

Brilliant! Again, a big difference in score between Fixed and Incremental Time Control.
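
The "within error bars" remark can be checked with a quick sketch. The win/draw/loss totals below are summed from the twelve match lines above (+155/=276/-49, which gives 293.0 points). The normal approximation and the 1.96 factor for a roughly 95% interval are assumptions of this example, not something stated in the post.

```python
import math

def score_margin(wins, draws, losses, z=1.96):
    """Return (mean score, ~95% margin), treating games as independent."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of a result that takes the values 1, 0.5 or 0.
    var = (wins * (1 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * mean ** 2) / n
    return mean, z * math.sqrt(var / n)

mean, margin = score_margin(155, 276, 49)
print(f"score {mean:.2%} +/- {margin:.2%}")

# Half a point in 480 games, expressed as a score fraction.
half_point = 0.5 / 480
print(f"half-point gap {half_point:.2%}")
```

The margin comes out at about 2.7 percentage points, far larger than the 0.1 percentage point that half a point in 480 games represents.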

Regards,

Tom.

-------------------------------------------------------------------

Only as a matter of comparison, the best scorer till now was Stockfish Development 080114, Bench: 8502826 Timestamp: 1389220684. It got this score in my test:

Fixed TC = 149.0 – 91.0 = 62.08%
Incremental TC = 143.5 – 96.5 = 59.79%

Global Score= 292.5 – 187.5 = 60.9375%

Against : Houdini 4.0 St. Ct0 (3233) = 51.25% ; Komodo TCECr (3178) = 61.25% ; Critter 1.6a (3093) = 70.31%

Average Estimated Elo Opponents = 3168
Estimated Elo Performance= 3245.
ouachita
Posts: 454
Joined: Tue Jan 15, 2013 4:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: Testing Stockfish 11-03-13. 480 Games.

Post by ouachita »

Keep up the good work Tom, I'm depending on you
SIM, PhD, MBA, PE
ernest
Posts: 2053
Joined: Wed Mar 08, 2006 8:30 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by ernest »

ouachita wrote:I'm depending on you
...for what ?... 8-)
ouachita
Posts: 454
Joined: Tue Jan 15, 2013 4:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: Testing Stockfish 11-03-13. 480 Games.

Post by ouachita »

to find truth and justice . . .
SIM, PhD, MBA, PE
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

RETESTING THE BEST STOCKFISHS= 1.440 GAMES. FIRST HALF.

The best Stockfish versions so far in my tests have been:

Stockwood210114 64 SSE4.2L 293.0 – 187.0 = 61.0417% Estimated Elo= 3245
Stockfish Development 080114 292.5 – 187.5 = 60.9375% Estimated Elo= 3245
Stockfish Ipman LP 050114 291.5 – 188.5 = 60.7292% Estimated Elo= 3243

Since the differences are so small, I decided to retest those three SF versions to clarify which is the best under my testing conditions and to reduce statistical noise. Here are the scores after the first half of this test: 720 games at Fixed Time Control.


i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 512
Relative Speed: 28.66
Knodes per second: 13.759

Time Control= 4+0

Houdini 4 x64_st_X6_CT0 - Stockfish 080114 64 SSE4.2X 22.0 - 18.0 +9/=26/-5 55.00%
Houdini 4 x64_st_X6_CT0 - Stockwood210114 64 SSE4.2L 16.0 - 24.0 +6/=20/-14 40.00%
Houdini 4 x64_st_X6_CT0 - Stockfish 050114IP 64 SSE4.2L 20.0 - 20.0 +9/=22/-9 50.00%


Komodo TCECr 64-bitx6_NOB - Stockfish 080114 64 SSE 15.5 - 24.5 +2/=27/-11 38.75%
Komodo TCECr 64-bitx6_NOB - Stockwood210114 64 SSE 21.0 - 19.0 +10/=22/-8 52.50%
Komodo TCECr 64-bitx6_NOB - Stockfish 050114IP 64 SSE 19.5 - 20.5 +9/=21/-10 48.75%


Critter 1.6a 64-bitX6_NOB_ok - Stockfish 080114 64 SSE 14.5 - 25.5 +3/=23/-14 36.25%
Critter 1.6a 64-bitX6_NOB_ok - Stockwood210114 64 SSE 11.0 - 29.0 +3/=16/-21 27.50%
Critter 1.6a 64-bitX6_NOB_ok - Stockfish 050114IP 64 SSE 14.0 - 26.0 +4/=20/-16 35.00%

Total Score at Fixed Time Control using 6 Cores=

Stockwood 210114 SSE4.2L = 72.0 – 48.0 = 60.00%
Stockfish 080114 64 SSE4.2X = 68.0 – 52.0 = 56.67%
Stockfish 050114 IP 64 SSE4.2L = 66.5 – 53.5 = 55.42%

360 Games= http://www.mediafire.com/view/395l4c...6_360games.pgn
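
The match lines above are written from the opponent's side, while the totals are from each Stockfish build's side. Here is a small sketch of that bookkeeping, using the 6-core 4+0 scores from the table. The short labels and the tuple layout are just choices made for this example.

```python
from collections import defaultdict

# (opponent, Stockfish build, opponent points, Stockfish points), 40 games each.
matches = [
    ("Houdini 4", "Stockfish 080114", 22.0, 18.0),
    ("Houdini 4", "Stockwood 210114", 16.0, 24.0),
    ("Houdini 4", "Stockfish 050114 IP", 20.0, 20.0),
    ("Komodo TCECr", "Stockfish 080114", 15.5, 24.5),
    ("Komodo TCECr", "Stockwood 210114", 21.0, 19.0),
    ("Komodo TCECr", "Stockfish 050114 IP", 19.5, 20.5),
    ("Critter 1.6a", "Stockfish 080114", 14.5, 25.5),
    ("Critter 1.6a", "Stockwood 210114", 11.0, 29.0),
    ("Critter 1.6a", "Stockfish 050114 IP", 14.0, 26.0),
]

# build -> [points for, points against], flipped to the Stockfish side.
totals = defaultdict(lambda: [0.0, 0.0])
for _opponent, build, opp_pts, sf_pts in matches:
    totals[build][0] += sf_pts
    totals[build][1] += opp_pts

# Print builds from highest to lowest score.
for build, (pts_for, pts_against) in sorted(totals.items(), key=lambda kv: -kv[1][0]):
    pct = 100.0 * pts_for / (pts_for + pts_against)
    print(f"{build}: {pts_for:.1f} - {pts_against:.1f} = {pct:.2f}%")
```

Each build's two totals should add up to 120 (three matches of 40 games), which is a handy sanity check when copying scores by hand.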

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control = 4+0

Houdini 4 x64xCT0 - Stockfish 080114 64 SSE4.2x 17.5 - 22.5 +6/=23/-11 43.75%
Houdini 4 x64xCT0 - Stockwood210114 64 SSE4.2L 18.5 - 21.5 +7/=23/-10 46.25%
Houdini 4 x64xCT0 - Stockfish 050114IP 64 SSE4.2x 20.0 - 20.0 +8/=24/-8 50.00%

Komodo TCECr 64-bitx4_NOB - Stockfish 080114 64 SSE 20.0 - 20.0 +9/=22/-9 50.00%
Komodo TCECr 64-bitx4_NOB - Stockwood210114 64 SSE 16.5 - 23.5 +8/=17/-15 41.25%
Komodo TCECr 64-bitx4_NOB - Stockfish 050114IP 64 SSE 16.5 - 23.5 +8/=17/-15 41.25%

Critter 1.6a 64-bitnob_4 - Stockfish 080114 64 SSE4.2x 14.0 - 26.0 +3/=22/-15 35.00%
Critter 1.6a 64-bitnob_4 - Stockwood210114 64 SSE4.2L 10.5 - 29.5 +0/=21/-19 26.25%
Critter 1.6a 64-bitnob_4 - Stockfish 050114IP 64 SSE4.2x 14.0 - 26.0 +5/=18/-17 35.00%

Total Score at Fixed Time Control using 4 Cores=

Stockwood 210114 SSE4.2L = 74.5 – 45.5 = 62.08%
Stockfish 050114 IP 64 SSE4.2L = 69.5 – 50.5 = 57.92%
Stockfish 080114 64 SSE4.2X = 68.5 – 51.5 = 57.08%

360 Games=
http://www.mediafire.com/view/iaaue7...20_Gamesx4.pgn

Global Score at Fixed Time Control after 720 Games=

Stockwood 210114 SSE4.2L = 146.5 – 93.5 = 61.04%
Stockfish 080114 64 SSE4.2X = 136.5 – 103.5 = 56.87%
Stockfish 050114 IP 64 SSE4.2L = 136.0 – 104.0 = 56.67%


Regards,

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

RETESTING THE BEST STOCKFISHS= 1.440 GAMES. SECOND HALF AT INCREMENTAL TIME CONTROL


i7 980 3.33 Ghz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 512
Relative Speed: 28.66
Knodes per second: 13.759

Time Control= 2+2

Houdini 4 x64_st_X6_CT0 - Stockfish 080114 64 SSE4.2X 22.5 - 17.5 +11/=23/-6 56.25%
Houdini 4 x64_st_X6_CT0 - Stockwood210114 64 SSE4.2L 20.0 - 20.0 +7/=26/-7 50.00%
Houdini 4 x64_st_X6_CT0 - Stockfish 050114IP 64 SSE4.2L 19.0 - 21.0 +6/=26/-8 47.50%

Komodo TCECr 64-bitx6_NOB - Stockfish 080114 64 SSE 15.5 - 24.5 +4/=23/-13 38.75%
Komodo TCECr 64-bitx6_NOB - Stockwood210114 64 18.5 - 21.5 +7/=23/-10 46.25%
Komodo TCECr 64-bitx6_NOB - Stockfish 050114IP 64 SSE 17.5 - 22.5 +6/=23/-11 43.75%

Critter 1.6a 64-bitX6_NOB_ok - Stockfish 080114 64 SSE4.2 10.0 - 30.0 +1/=18/-21 25.00%
Critter 1.6a 64-bitX6_NOB_ok - Stockwood210114 64 L 13.0 - 27.0 +1/=24/-15 32.50%
Critter 1.6a 64-bitX6_NOB_ok - Stockfish 050114IP 64 14.5 - 25.5 +4/=21/-15 36.25%

Total Score at Incremental Time Control using 6 Cores=

Stockfish 080114 64 SSE4.2X = 72.0 – 48.0 = 60.00%
Stockwood 210114 SSE4.2L = 68.5 – 51.5 = 57.08%
Stockfish 050114 IP 64 SSE4.2L = 68.0 – 52.0 = 56.67%

360 Games=
http://www.mediafire.com/view/acm8wkrit ... _2%2B2.pgn

i7 975 3.33 Ghz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control = 2+2

Houdini 4 x64xCT0 - Stockfish 080114 64 SSE4.2x 18.5 - 21.5 +7/=23/-10 46.25%
Houdini 4 x64xCT0 - Stockwood210114 64 SSE4.2L 21.0 - 19.0 +8/=26/-6 52.50%
Houdini 4 x64xCT0 - Stockfish 050114IP 64 SSE4.2x 22.0 - 18.0 +8/=28/-4 55.00%


Komodo TCECr 64-bitx4_NOB - Stockfish 080114 64 2x 19.5 - 20.5 +6/=27/-7 48.75%
Komodo TCECr 64-bitx4_NOB - Stockwood210114 64 17.0 - 23.0 +7/=20/-13 42.50%
Komodo TCECr 64-bitx4_NOB - Stockfish 050114IP 64 16.0 - 24.0 +5/=22/-13 40.00%

Critter 1.6a 64-bitnob_4 - Stockfish 080114 64 SSE4.2x 11.5 - 28.5 +6/=11/-23 28.75%
Critter 1.6a 64-bitnob_4 - Stockwood210114 64 SSE4.2L 10.0 - 30.0 +2/=16/-22 25.00%
Critter 1.6a 64-bitnob_4 - Stockfish 050114IP 64 SSE4.2x 15.5 - 24.5 +4/=23/-13 38.75%

Total Score at Incremental Time Control using 4 Cores=

Stockwood 210114 SSE4.2L = 72.0 – 48.0 = 60.00%
Stockfish 080114 64 SSE4.2X = 70.5 – 49.5 = 58.75%
Stockfish 050114 IP 64 SSE4.2L = 66.5 – 53.5 = 55.42%

360 Games=
http://www.mediafire.com/view/pvtapayqa ... _2%2B2.pgn

Global Score at Incremental Time Control after 720 Games=

Stockwood 210114 SSE4.2L = 140.5 – 99.5 = 58.54%
Stockfish 080114 64 SSE4.2X = 142.5 – 97.5 = 59.37%
Stockfish 050114 IP 64 SSE4.2L = 134.5 – 105.5 = 56.04%


TOTAL SCORE AFTER 1440 GAMES =

Stockwood 210114 SSE4.2L = 287.0 – 193.0 = 59.79% (3237)
Stockfish 080114 64 SSE4.2X = 279.0 – 201.0 = 58.12% (3225)
Stockfish 050114 IP 64 SSE4.2L = 270.5 – 209.5 = 56.35% (3212)


All three SF versions scored worse than in my previous test. Stockwood 210114 is confirmed as the best scorer so far, 14 Estimated Elo Points ahead of Houdini 4.0 St. Contempt 0.

My current rating list including these 1440 games is:

1.- Stockwood 210114 SSE4.2L = 3241
2.- Stockfish 080114 64 SSE4.2X = 3235
3.- Stockfish 050114 IP 64 SSE4.2L = 3228
4.- Houdini 4.0 St. Contempt 0 = 3227
5.- Stockfish DD = 3212
6.- Houdini 4.0 Pro B = 3210
7.- Komodo TCECr = 3181
8.- Houdini 3.0 Pro = 3172
9.- Amitis 16-10-2013 = 3152
10.- Stockfish 4 = 3119
11.- Critter 1.6a = 3104
12.- Deep Rybka 4.1 = 3015

Regards,

Tom.