Testing Stockfish 11-03-13. 480 Games.

Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Sedat Canbaz »

Tomcass wrote: Thanks for your kind 'welcome back', Bobby, Sedat and Kah Huat. :D

Here you have a new test:

TESTING STOCKFISH DEVELOPMENT 050314= 480 GAMES

Bench: 8430785 Timestamp: 1394006112

i7 980 3.33 GHz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 512
Relative Speed: 28.66
Knodes per second: 13,759

Time Control= 4+0

Stockfish 050314 64 SSE4.2x - Houdini 4 x64_st_X6_CT0 26.0 - 14.0 +15/=22/-3 65.00%
Stockfish 050314 64 SSE4.2x - Komodo TCECr 64-bitx6 25.0 - 15.0 +18/=14/-8 62.50%
Stockfish 050314 64 SSE4.2x - Critter 1.6a 64-bitX6_NOB 27.0 - 13.0 +17/=20/-3 67.50%

Time Control= 2+2

Stockfish 050314 64 SSE4.2x - Houdini 4 x64_st_X6_CT0 22.0 - 18.0 +10/=24/-6 55.00%
Stockfish 050314 64 SSE4.2x - Komodo TCECr 64-bitx6 23.5 - 16.5 +14/=19/-7 58.75%
Stockfish 050314 64 SSE4.2x - Critter 1.6a 64-bitX6_NOB 29.5 - 10.5 +20/=19/-1 73.75%

Score using 6 cores: 153.0 – 87.0 = 63.75%
240 Games: http://www.mediafire.com/view/8oi3beohm ... 0games.pgn

i7 975 3.33 GHz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9,899

Time Control: 4+0

Stockfish 050314 64 SSE4.2x - Houdini 4 x64xCT0 21.0 - 19.0 +8/=26/-6 52.50%
Stockfish 050314 64 SSE4.2x - Komodo TCECr 64-bitx4 26.0 - 14.0 +15/=22/-3 65.00%
Stockfish 050314 64 SSE4.2x - Critter 1.6a 64-bitnob_4 28.0 - 12.0 +19/=18/-3 70.00%


Time Control: 2+2

Stockfish 050314 64 SSE4.2x - Houdini 4 x64xCT0 19.0 - 21.0 +5/=28/-7 47.50%
Stockfish 050314 64 SSE4.2x - Komodo TCECr 64-bitx4 25.0 - 15.0 +16/=18/-6 62.50%
Stockfish 050314 64 SSE4.2x - Critter 1.6a 64-bitnob_4 24.5 - 15.5 +14/=21/-5 61.25%

Score using 4 Cores= 143.5 – 96.5 = 59.79%
240 Games:
http://www.mediafire.com/view/vv21rfbdf ... amesx4.pgn

Segmenting by Time Control:

Fixed TC = 153.0 – 87.0 = 63.75%
Incremental TC = 143.5 – 96.5 = 59.79%

Global Score= 296.5 – 183.5 = 61.77%

Against: Houdini 4.0 St. Ct0 (3227) = 55.00%; Komodo TCECr (3181) = 62.19%; Critter 1.6a (3104) = 68.12%

Average Estimated Elo Opponents = 3171
Estimated Elo Performance= 3253


Error bars: +/- 23 EEP

Regards,

Tom.

Thanks for the update, dear Tom.

You could also try Gull 2.8b as a participant; it is another serious opponent.

Best,
Sedat
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Post by Tomcass »

Thanks for your suggestion, Sedat. I have no doubt that Gull 2.8b is a strong engine, but it has performed below Critter 1.6a so far. For example:

http://www.computerchess.org.uk/ccrl/40 ... .8b+64-bit

However, other reliable testers, such as Ipman, found a better score for Gull 2.8b:

http://www.ipmanchess.yolasite.com/list-i3-m380.php

Since Critter is clearly weaker than the other engines I test against, I would like to find a better option for my tests, but so far I have not found one. Perhaps I will take a look at this Gull... :wink:

Regards,

Tom.

Post by Sedat Canbaz »

Not at all...

CCRL rates Gull higher than Critter; please check:
http://www.computerchess.org.uk/ccrl/404/

Please also see the SCCT rating list, where Gull performed 50 Elo better than Critter:
http://www.sedatcanbaz.com/chess/?page_id=515

Happy testing,
Sedat

Post by Tomcass »

Hmm, why not try it? You have convinced me. It seems to be around 30 Elo points stronger than Critter 1.6a. I have an S.F. Rockwood version in my queue, after the Ipman compile I am testing right now. After that I will test the latest Gull. Thanks, Sedat!

Tom.

Post by Sedat Canbaz »

Yes, Tom, Gull is another super strong engine.

Btw, Gull finished 3rd in my latest Swiss competition:
http://www.talkchess.com/forum/viewtopic.php?t=51480

Best,
Sedat

Post by Tomcass »

TESTING STOCKFISH IPMAN COMPILE (NO LARGE PAGES) 230214 = 480 GAMES.

i7 980 3.33 GHz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014c, limited to 5 moves (Thanks, Sedat!)
No tablebases. No RTB used.
Large Pages NOT allowed.
Hash 512
Relative Speed: 28.66
Knodes per second: 13,759

Time Control= 4+0

SF 230214IPx 64 SSE4.2NLP - Houdini 4 x64_st_X6_CT0 22.5 - 17.5 +10/=25/-5 56.25%
SF 230214IPx 64 SSE4.2NLP - Komodo TCECr 64-bitx6 25.5 - 14.5 +13/=25/-2 63.75%
SF 230214IPx 64 SSE4.2NLP - Critter 1.6a 64-bitX6_NOB 26.0 - 14.0 +16/=20/-4 65.00%

Time Control= 2+2

SF 230214IPx 64 SSE4.2NLP - Houdini 4 x64_st_X6_CT0 22.5 - 17.5 +10/=25/-5 56.25%
SF 230214IPx 64 SSE4.2NLP - Komodo TCECr 64-bitx6 24.0 - 16.0 +14/=20/-6 60.00%
SF 230214IPx 64 SSE4.2NLP - Critter 1.6a 64-bitX6_NOB 27.0 - 13.0 +14/=26/-0 67.50%

240 Games = http://www.mediafire.com/download/7p7bd ... 0games.pgn
Score using 6 Cores= 147.5 – 92.5 = 61.46%

i7 975 3.33 GHz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014c, limited to 5 moves
No tablebases. No RTB used.
Large Pages NOT allowed.
Hash 256
Relative Speed: 20.62
Knodes per second: 9,899

Time Control = 4+0

SF 230214IPx 64 SSE4.2NLP - Houdini 4 x64xCT0 24.5 - 15.5 +14/=21/-5 61.25%
SF 230214IPx 64 SSE4.2NLP - Komodo TCECr 64-bitx4 26.5 - 13.5 +17/=19/-4 66.25%
SF 230214IPx 64 SSE4.2NLP - Critter 1.6a 64-bitnob_4 26.5 - 13.5 +16/=21/-3 66.25%

Time Control= 2+2

SF 230214IPx 64 SSE4.2NLP - Houdini 4 x64xCT0 19.5 - 20.5 +10/=19/-11 48.75%
SF 230214IPx 64 SSE4.2NLP - Komodo TCECr 64-bitx4 23.5 - 16.5 +12/=23/-5 58.75%
SF 230214IPx 64 SSE4.2NLP - Critter 1.6a 64-bitnob_4 29.0 - 11.0 +18/=22/-0 72.50%

240 Games=
http://www.mediafire.com/view/i7ydmiwkd ... amesx4.pgn

Score using 4 Cores= 149.5 – 90.5 = 62.29%

Segmenting by Time Control:

Fixed TC = 151.5 – 88.5 = 63.12%
Incremental TC = 145.5 – 94.5 = 60.62%

Global Score= 297.0 – 183.0 = 61.87%

Against: Houdini 4.0 St. Ct0 (3227) = 55.62%; Komodo TCECr (3181) = 62.19%; Critter 1.6a (3104) = 67.81%

Average Estimated Elo Opponents = 3171
Estimated Elo Performance= 3254


Error bars= +/- 23 EEP
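
For anyone who wants to reproduce the Estimated Elo Performance figure: the thread does not say how it is calculated, so the following is only a sketch. It assumes the standard logistic Elo formula applied to the global score, added to the unweighted average of the three opponent ratings. The function names are illustrative.

```python
import math

def elo_diff_from_score(score_fraction: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

def performance(points: float, games: int, opponent_elos: list) -> float:
    """Estimated performance = mean opponent rating + implied Elo difference."""
    average_opponent = sum(opponent_elos) / len(opponent_elos)
    return average_opponent + elo_diff_from_score(points / games)

# Figures from the summary above: 297.0 points in 480 games
# against opponents listed at 3227, 3181 and 3104.
opponents = [3227, 3181, 3104]
estimate = performance(297.0, 480, opponents)
print(f"Average opponent: {sum(opponents) / len(opponents):.0f}")
print(f"Estimated performance: {estimate:.0f}")
```

Under these assumptions the result lands within about a point of the figure posted above; a small difference is expected if the original calculation used a lookup table or a per-opponent method instead.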

Regards,

Tom.

Post by Tomcass »

TESTING STOCKFISH ROCKWOOD COMPILE 140307 = 440 GAMES (Not 480 GAMES THIS TIME).

i7 980 3.33 GHz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014c (Sedat), limited to 5 moves
No tablebases. No RTB used.
Large Pages allowed.
Hash 512
Relative Speed: 28.66
Knodes per second: 13,759

Time Control= 4+0

StockfishRW 140307 64X6 - Houdini 4 x64_st_X6_CT0 22.5 - 17.5 +9/=27/-4 56.25%
StockfishRW 140307 64X6 - Komodo TCECr 64-bitx6 B 25.5 - 14.5 +17/=17/-6 63.75%
StockfishRW 140307 64X6 - Critter 1.6a 64-bitX6_NOBk 25.0 - 15.0 +11/=28/-1 62.50%

Time Control= 2+2

StockfishRW 140307 64X6 - Houdini 4 x64_st_X6_CT0 18.5 - 21.5 +6/=25/-9 46.25%
StockfishRW 140307 64X6 - Komodo TCECr 64-bitx6 24.0 - 16.0 +14/=20/-6 60.00%
StockfishRW 140307 64X6 - Critter 1.6a 64-bitX6_NOB 28.0 - 12.0 +19/=18/-3 70.00%

240 Games = http://www.mediafire.com/view/8f6cg9n15 ... 0games.pgn
Score using 6 Cores= 143.5 – 96.5 = 59.79%

i7 975 3.33 GHz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014c (Sedat), limited to 5 moves
No tablebases. No RTB used.
Large Pages allowed.
Hash 256
Relative Speed: 20.62
Knodes per second: 9,899

Time Control = 4+0

StockfishRW 140307 64X - Houdini 4 x64xCT0 21.5 - 18.5 +13/=17/-10 53.75%
StockfishRW 140307 64X - Komodo TCECr 64-bitx4 22.0 - 18.0 +12/=20/-8 55.00%
StockfishRW 140307 64X - Critter 1.6a 64-bitnob 28.5 - 11.5 +20/=17/-3 71.25%

Time Control= 2+2

StockfishRW 140307 64X - Houdini 4 x64xCT0 11.5 - 15.5 +4/=15/-8 42.59%
StockfishRW 140307 64X - Komodo TCECr 64-bitx4 17.0 - 9.0 +11/=12/-3 65.38%
StockfishRW 140307 64X - Critter 1.6a 64-bitnob 21.0 - 6.0 +15/=12/-0 77.78%

200 Games= http://www.mediafire.com/view/hl62zcr43 ... 307_x4.pgn
Score using 4 Cores= 121.5 – 78.5 = 60.75%

Segmenting by Time Control:

Fixed TC = 145.0 – 95.0 = 60.42%
Incremental TC = 120.0 – 80.0 = 60.00%

Global Score= 265.0 – 175.0 = 60.23%

Against: Houdini 4.0 St. Ct0 (3227) = 50.34%; Komodo TCECr (3181) = 60.62%; Critter 1.6a (3104) = 69.73%

Average Estimated Elo Opponents = 3171
Estimated Elo Performance= 3243


Error bars= +/- 24 EEP

After 440 games, SF Rockwood failed on one of my computers and I was not able to restart the test. Perhaps I did not choose the best version for my old 4-core computer. The score is not brilliant, anyway: about 18 EEP below the leading scorers, although 16 points ahead of the best Houdini 4.0.
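
Because the last three mini-matches were cut short, the pairings in this test do not all have the same number of games. In that situation a per-opponent percentage is safest computed as total points over total games rather than as an average of mini-match percentages. A minimal sketch, with the W/D/L figures copied from the result lines above and an illustrative helper name:

```python
from collections import defaultdict

# (opponent, wins, draws, losses) for each mini-match in the Rockwood test above.
matches = [
    ("Houdini 4", 9, 27, 4), ("Komodo TCECr", 17, 17, 6), ("Critter 1.6a", 11, 28, 1),    # 6 cores, 4+0
    ("Houdini 4", 6, 25, 9), ("Komodo TCECr", 14, 20, 6), ("Critter 1.6a", 19, 18, 3),    # 6 cores, 2+2
    ("Houdini 4", 13, 17, 10), ("Komodo TCECr", 12, 20, 8), ("Critter 1.6a", 20, 17, 3),  # 4 cores, 4+0
    ("Houdini 4", 4, 15, 8), ("Komodo TCECr", 11, 12, 3), ("Critter 1.6a", 15, 12, 0),    # 4 cores, 2+2 (cut short)
]

def pooled_scores(rows):
    """Return {opponent: (points, games)} summed over all mini-matches."""
    totals = defaultdict(lambda: [0.0, 0])
    for opponent, wins, draws, losses in rows:
        totals[opponent][0] += wins + 0.5 * draws
        totals[opponent][1] += wins + draws + losses
    return {name: (points, games) for name, (points, games) in totals.items()}

scores = pooled_scores(matches)
for name, (points, games) in scores.items():
    print(f"{name}: {points:.1f}/{games} = {100 * points / games:.2f}%")

total_points = sum(points for points, _ in scores.values())
total_games = sum(games for _, games in scores.values())
print(f"Global: {total_points:.1f}/{total_games} = {100 * total_points / total_games:.2f}%")
```

Pooling this way weights every game equally, so a shortened mini-match counts for less than a full 40-game one.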

Regards,

Tom.

Post by Tomcass »

TESTING GULL 2.8 Beta = 480 GAMES

i7 980 3.33 GHz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014c (Sedat)
No tablebases. No RTB used.
Hash 512
Relative Speed: 28.66
Knodes per second: 13,759

Time Control= 4+0

Gull 2.8 beta x64X6 - Houdini 4 x64_st_X6_CT0 15.0 - 25.0 +5/=20/-15 37.50%
Gull 2.8 beta x64X6 - Komodo TCECr 64-bitx6_NOB 19.0 - 21.0 +8/=22/-10 47.50%
Gull 2.8 beta x64X6 - Critter 1.6a 64-bitX6_NOB_ok 19.5 - 20.5 +9/=21/-10 48.75%

Time Control= 2+2

Gull 2.8 beta x64X6 - Houdini 4 x64_st_X6_CT0 18.0 - 22.0 +8/=20/-12 45.00%
Gull 2.8 beta x64X6 - Komodo TCECr 64-bitx6_NOB 15.0 - 25.0 +3/=24/-13 37.50%
Gull 2.8 beta x64X6 - Critter 1.6a 64-bitX6_NOB_ok 18.5 - 21.5 +8/=21/-11 46.25%

Score using 6 cores: 105.0 – 135.0 = 43.75%
240 Games:
http://www.mediafire.com/view/4a5sm4ttb ... 0games.pgn

i7 975 3.33 GHz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2014c (Sedat)
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9,899

Time Control: 4+0

Gull 2.8 beta x64 - Houdini 4 x64xCT0 14.5 - 25.5 +5/=19/-16 36.25%
Gull 2.8 beta x64 - Komodo TCECr 64-bitx4_NOB 20.5 - 19.5 +10/=21/-9 51.25%
Gull 2.8 beta x64 - Critter 1.6a 64-bitnob_4 25.5 - 14.5 +15/=21/-4 63.75%

Time Control: 2+2

Gull 2.8 beta x64 - Houdini 4 x64xCT0 13.5 - 26.5 +3/=21/-16 33.75%
Gull 2.8 beta x64 - Komodo TCECr 64-bitx4_NOB 16.0 - 24.0 +7/=18/-15 40.00%
Gull 2.8 beta x64 - Critter 1.6a 64-bitnob_4 24.5 - 15.5 +14/=21/-5 61.25%

Score using 4 Cores= 114.5 – 125.5 = 47.71%
240 Games:
http://www.mediafire.com/view/6w7gecfau ... eta_x4.pgn

Segmenting by Time Control:

Fixed TC = 114.0 – 126.0 = 47.50%
Incremental TC = 105.5 – 134.5 = 43.96%

Global Score= 219.5 – 260.5 = 45.73%

Against: Houdini 4.0 St. Ct0 (3227) = 38.12%; Komodo TCECr (3181) = 44.06%; Critter 1.6a (3104) = 55.00%

Average Estimated Elo Opponents = 3171
Estimated Elo Performance= 3141


Error bars: +/- 23 EEP
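
The thread does not state how the error bars are obtained. As a rough cross-check only, the sketch below estimates a 95% interval from the win/draw/loss totals, assuming independent games, a normal approximation on the mean score (z = 1.96) and the logistic Elo formula. All names are illustrative and the W/D/L triples are copied from the result lines above.

```python
import math

# (wins, draws, losses) for the twelve Gull 2.8 beta mini-matches listed above.
results = [
    (5, 20, 15), (8, 22, 10), (9, 21, 10),   # 6 cores, 4+0
    (8, 20, 12), (3, 24, 13), (8, 21, 11),   # 6 cores, 2+2
    (5, 19, 16), (10, 21, 9), (15, 21, 4),   # 4 cores, 4+0
    (3, 21, 16), (7, 18, 15), (14, 21, 5),   # 4 cores, 2+2
]

def elo_from_score(p):
    """Elo difference implied by score fraction p (logistic model)."""
    return 400.0 * math.log10(p / (1.0 - p))

def elo_interval(wins, draws, losses, z=1.96):
    """Return (elo_diff, low, high) from a normal approximation on the mean score."""
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    # Per-game variance of a result that is 1, 0.5 or 0.
    variance = (wins * (1 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * mean ** 2) / games
    margin = z * math.sqrt(variance / games)
    return elo_from_score(mean), elo_from_score(mean - margin), elo_from_score(mean + margin)

wins = sum(r[0] for r in results)
draws = sum(r[1] for r in results)
losses = sum(r[2] for r in results)
diff, low, high = elo_interval(wins, draws, losses)
print(f"+{wins}/={draws}/-{losses}: {diff:+.0f} Elo (95% interval {low:+.0f} to {high:+.0f})")
```

With these assumptions the half-width of the interval comes out in the low twenties, a similar size to the posted error bars but not necessarily identical, since the original method may differ.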

My friend Sedat Canbaz suggested that I introduce Gull 2.8 beta into my tests in place of my beloved but already old Critter 1.6a. This test is good enough for me to make the change. Gull 2.8 beta seems to be around 37 EEP stronger than Critter 1.6a, and only 40 EEP below Komodo TCECr. If the expected new Gull 3 MP is even a bit stronger… well, let’s wait for this new release. :wink:

Regards,

Tom.
carldaman
Posts: 2287
Joined: Sat Jun 02, 2012 2:13 am

Post by carldaman »

Hi Tom,

Interesting results, as Gull does appear stronger than Critter at faster TCs.
However, you could still use Critter for your tests, alongside Houdini, Komodo and Gull. You can have the same *total* number of test games, but spread across 4 engines rather than 3. :D

Regards,
CL

Post by Tomcass »

Hi, Carl.

Thanks for your suggestion. This was my first thought, but I decided to keep my 480 games format with the three strongest available engines.

Best regards,

Tom.