Testing Stockfish 11-03-13. 480 Games.


Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

fauzi wrote:the second file is set to private, cannot download
I see that the problem appears only with one of my computers. I don't know why. Anyway, I hope this link will work:

http://www.mediafire.com/view/7fk1e7bnk ... 0games.pgn

Thanks for pointing out the problem again, Akram! :D

Regards,

Tom.
fauzi
Posts: 61
Joined: Wed Nov 20, 2013 10:42 am

Re: Testing Stockfish 11-03-13. 480 Games.

Post by fauzi »

Yes, this file can be downloaded, but it's only 120 games, while that one had 240 games :D (sorry to give you a headache)
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

Woooow! It is difficult for me to cope with Mediafire. :wink:

I have tried this one and it seems to be right.

http://www.mediafire.com/view/oz1ru1j6w ... 0games.pgn

Apologies for the inconvenience!

Tom.
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

TESTING STOCKFISH DEVELOPMENT 231213 = 480 GAMES.

Timestamp: 1387828530
Bench: 6835416

i7 980 3.33 GHz.
6 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 512
Relative Speed: 28.66
Knodes per second: 13.759

Time Control= 4+0

Stockfish 231213 64 SSE4.2x - Houdini 4 x64_st_X6_CT0 20.5 - 19.5 +6/=29/-5 51.25%
Stockfish 231213 64 SSE4.2x - Komodo 6 64-bitNOBx6 26.0 - 14.0 +19/=14/-7 65.00%
Stockfish 231213 64 SSE4.2x - Critter 1.6a 64-bitX6_NOB 26.0 - 14.0 +16/=20/-4 65.00%

Time Control= 2+2

Stockfish 231213 64 SSE4.2x - Houdini 4 x64_st_X6_CT0 20.5 - 19.5 +9/=23/-8 51.25%
Stockfish 231213 64 SSE4.2x - Komodo 6 64-bitNOBx6 26.0 - 14.0 +17/=18/-5 65.00%
Stockfish 231213 64 SSE4.2x - Critter 1.6a 64-bitX6_NOB 27.5 - 12.5 +20/=15/-5 68.75%

240 Games = http://www.mediafire.com/view/yt8ga4vo0 ... 0games.pgn
Score using 6 Cores= 146.5 – 93.5 = 61.04%


i7 975 3.33 GHz.
4 real cores
Ponder: Off.
GUI: Fritz 12
Book: Perfect 2012c
No tablebases. No RTB used.
Hash 256
Relative Speed: 20.62
Knodes per second: 9.899

Time Control = 4+0

Stockfish 231213 64 SSE4.2x - Houdini 4 x64xCT0 19.5 - 20.5 +10/=19/-11 48.75%
Stockfish 231213 64 SSE4.2x - Komodo 6 64-bitx4 25.5 - 14.5 +17/=17/-6 63.75%
Stockfish 231213 64 SSE4.2x - Critter 1.6a 64-bitnob 23.0 - 17.0 +11/=24/-5 57.50%

Time Control= 2+2

Stockfish 231213 64 SSE4.2x - Houdini 4 x64xCT0 19.5 - 20.5 +9/=21/-10 48.75%
Stockfish 231213 64 SSE4.2x - Komodo 6 64-bitx4_NOB 27.5 - 12.5 +18/=19/-3 68.75%
Stockfish 231213 64 SSE4.2x - Critter 1.6a 64-bitnob_4 26.0 - 14.0 +13/=26/-1 65.00%

240 Games = http://www.mediafire.com/view/sljjma2ns ... 0games.pgn
Score using 4 Cores= 141.0 – 99.0 = 58.75%

Segmenting by Time Control:

Fixed TC = 140.5 – 99.5 = 58.54%
Incremental TC = 147.0 – 93.0 = 61.25%

Global Score= 287.5 – 192.5 = 59.90%

Against : Houdini 4.0 St. Ct0 (3233) = 50.00% ; Komodo 6 (3162) = 65.62% ; Critter 1.6a (3093) = 64.06%
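
For anyone who wants to re-check the segmented totals, here is a minimal Python sketch that re-derives them from the twelve match lines. The points are copied from the post; each match is assumed to be 40 games (which is what the +/=/- counts add up to), and the names `MATCHES`, `tally` and `subset` are only illustrative.

```python
# (cores, time control, opponent, Stockfish points), copied from the result lines above.
# Assumption: every match is 40 games, as the +W/=D/-L counts indicate.
MATCHES = [
    (6, "4+0", "Houdini 4", 20.5),
    (6, "4+0", "Komodo 6", 26.0),
    (6, "4+0", "Critter 1.6a", 26.0),
    (6, "2+2", "Houdini 4", 20.5),
    (6, "2+2", "Komodo 6", 26.0),
    (6, "2+2", "Critter 1.6a", 27.5),
    (4, "4+0", "Houdini 4", 19.5),
    (4, "4+0", "Komodo 6", 25.5),
    (4, "4+0", "Critter 1.6a", 23.0),
    (4, "2+2", "Houdini 4", 19.5),
    (4, "2+2", "Komodo 6", 27.5),
    (4, "2+2", "Critter 1.6a", 26.0),
]
GAMES_PER_MATCH = 40


def tally(rows):
    """Return (points for, points against, score in percent) for a list of matches."""
    points = sum(row[3] for row in rows)
    games = GAMES_PER_MATCH * len(rows)
    return points, games - points, 100.0 * points / games


def subset(index, value):
    """Select the matches whose field at position `index` equals `value`."""
    return [row for row in MATCHES if row[index] == value]


if __name__ == "__main__":
    groups = [
        ("6 cores", subset(0, 6)),
        ("4 cores", subset(0, 4)),
        ("Fixed TC", subset(1, "4+0")),
        ("Incremental TC", subset(1, "2+2")),
        ("Global", MATCHES),
        ("vs Houdini 4", subset(2, "Houdini 4")),
        ("vs Komodo 6", subset(2, "Komodo 6")),
        ("vs Critter 1.6a", subset(2, "Critter 1.6a")),
    ]
    for label, rows in groups:
        won, lost, pct = tally(rows)
        print(f"{label:16s} {won:6.1f} - {lost:6.1f} = {pct:.2f}%")
```

The figures it prints agree with the totals given above.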

Average Estimated Elo Opponents = 3163
Estimated Elo Performance= 3232
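
The post does not say how the performance figure was computed. As a rough cross-check, the sketch below applies the standard logistic Elo formula to the global score and the plain average of the three opponent ratings. That choice of formula is an assumption, not necessarily the method used here, and `elo_diff` is just an illustrative helper name.

```python
import math

# Opponent ratings as quoted in the post.
OPPONENTS = {"Houdini 4.0": 3233, "Komodo 6": 3162, "Critter 1.6a": 3093}


def elo_diff(score_fraction):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Each opponent played the same number of games (160), so a plain average is used.
average_opponent = sum(OPPONENTS.values()) / len(OPPONENTS)

# Global score from the post: 287.5 points out of 480 games.
performance = average_opponent + elo_diff(287.5 / 480)

if __name__ == "__main__":
    print(f"Average opponent Elo: {average_opponent:.1f}")
    print(f"Estimated performance: {performance:.1f}")
```

Rounded to whole points, this gives the same 3163 and 3232 as above.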


This is the best Dev. version of Stockfish I have tested so far. Just as a reference, the improvement over 13 days (subject to statistical noise, of course) has been apparent:

Stockfish 101213 64 SSE4.2 = 3214
Stockfish 231213 64 SSE4.2 = 3232

+ 18 Estimated Elo Points. Well done, SF Team!!.
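
To put a number on the "statistical noise" caveat, here is a minimal sketch that estimates a 95% interval for the 480-game score from the win/draw/loss counts listed above. It assumes the games are independent and uses a normal approximation, which is a simplification; `score_interval` and `elo_diff` are illustrative names.

```python
import math

# Stockfish (wins, draws, losses) per match, copied from the twelve result lines above.
WDL = [
    (6, 29, 5), (19, 14, 7), (16, 20, 4),    # 6 cores, 4+0
    (9, 23, 8), (17, 18, 5), (20, 15, 5),    # 6 cores, 2+2
    (10, 19, 11), (17, 17, 6), (11, 24, 5),  # 4 cores, 4+0
    (9, 21, 10), (18, 19, 3), (13, 26, 1),   # 4 cores, 2+2
]


def score_interval(wins, draws, losses, z=1.96):
    """Mean score and an approximate confidence interval (normal approximation)."""
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    # Per-game variance of a result that is 1, 0.5 or 0.
    variance = (wins * 1.0 + draws * 0.25) / games - mean ** 2
    margin = z * math.sqrt(variance / games)
    return mean, mean - margin, mean + margin


def elo_diff(score_fraction):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


wins = sum(w for w, _, _ in WDL)
draws = sum(d for _, d, _ in WDL)
losses = sum(l for _, _, l in WDL)
mean, low, high = score_interval(wins, draws, losses)

if __name__ == "__main__":
    print(f"+{wins}/={draws}/-{losses}")
    print(f"Score {100 * mean:.2f}% (95% interval {100 * low:.2f}% to {100 * high:.2f}%)")
    print(f"Elo vs. opponents: {elo_diff(low):.0f} to {elo_diff(high):.0f}")
```

Under these assumptions the interval is roughly ±3 percentage points, or about ±20 Elo, so an 18-point difference between two runs of this size is within the noise.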

Regards,

Tom.

Btw, I hope the links to the games will work this time. (Fingers crossed!)
ouachita
Posts: 454
Joined: Tue Jan 15, 2013 4:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: Testing Stockfish 11-03-13. 480 Games.

Post by ouachita »

Tom,
Time Control = 4+0

Stockfish 231213 64 SSE4.2x - Houdini 4 x64xCT0= 48.75%

Time Control= 2+2

Stockfish 231213 64 SSE4.2x - Houdini 4 x64xCT0= 48.75%

It seems that the King of Blitz is not dead yet . . . in fact the rumors of his/her death are greatly exaggerated . . . LOL

Nice work.
SIM, PhD, MBA, PE
ouachita
Posts: 454
Joined: Tue Jan 15, 2013 4:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: Testing Stockfish 11-03-13. 480 Games.

Post by ouachita »

More test results from internet:

Games Completed = 3352 of 10000 (Avg game length = 19.883 sec)
Settings = RR/32MB/100ms per move/M 1000cp for 12 moves, D 150 moves/
Time = 34456 sec elapsed, 68336 sec remaining
1. Komodo TCECr 64-bit 1249.5/2235 882-618-735 (L: m=506 t=0 i=0 a=112) (D: r=408 i=203 f=40 s=15 a=69) (tpm=126.4 d=13.08 nps=1183476)
2. Houdini 4 Pro x64 v1x 1036.5/2235 662-824-749 (L: m=450 t=0 i=0 a=374) (D: r=530 i=126 f=43 s=11 a=39) (tpm=151.4 d=14.02 nps=1471894)
3. Stockfish 231213 64 SSE4.2 1066.0/2234 719-821-694 (L: m=405 t=0 i=0 a=416) (D: r=434 i=151 f=33 s=18 a=58) (tpm=122.5 d=16.87 nps=1302119)
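
A small sketch for turning the listing above into score percentages. It reads the three-number group as wins-losses-draws, which is an assumption, though it is consistent with the point totals (for example 882 + 735/2 = 1249.5). `ROWS` and `check_and_score` are illustrative names.

```python
# (engine, points, games, wins, losses, draws), copied from the listing above.
# Assumption: the "882-618-735" style triple is wins-losses-draws.
ROWS = [
    ("Komodo TCECr 64-bit", 1249.5, 2235, 882, 618, 735),
    ("Houdini 4 Pro x64 v1x", 1036.5, 2235, 662, 824, 749),
    ("Stockfish 231213 64 SSE4.2", 1066.0, 2234, 719, 821, 694),
]


def check_and_score(row):
    """Verify that a row is internally consistent and return (engine, score %)."""
    name, points, games, wins, losses, draws = row
    if wins + losses + draws != games:
        raise ValueError(f"{name}: W+L+D does not match the game count")
    if wins + 0.5 * draws != points:
        raise ValueError(f"{name}: points do not match W + D/2")
    return name, 100.0 * points / games


scores = sorted((check_and_score(row) for row in ROWS), key=lambda item: -item[1])

# Every game involves two engines, so the per-engine game counts sum to twice the total.
total_games = sum(row[2] for row in ROWS) // 2

if __name__ == "__main__":
    print(f"Games completed: {total_games}")
    for name, pct in scores:
        print(f"{name:28s} {pct:.2f}%")
```

Sorted by score percentage, Stockfish comes out slightly ahead of Houdini, so the order in the listing may reflect a different sort criterion or an earlier snapshot of the run.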
SIM, PhD, MBA, PE
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by duncan »

Tomcass wrote: Stockfish 101213 64 SSE4.2 = 3214
Stockfish 231213 64 SSE4.2 = 3232

+ 18 Estimated Elo Points. Well done, SF Team!!.
STOCKFISH 101213 Elo Performance= 3214
STOCKFISH 151213 Elo Performance= 3230

If you take the dates between 10th and 15th Dec, there is a gain of 16 points in 5 days, over 3 points a day. Do you know if there are quicker increases between two versions?
Tomcass
Posts: 786
Joined: Sun Apr 16, 2006 9:09 pm

Re: Testing Stockfish 11-03-13. 480 Games.

Post by Tomcass »

duncan wrote:
Tomcass wrote: Stockfish 101213 64 SSE4.2 = 3214
Stockfish 231213 64 SSE4.2 = 3232

+ 18 Estimated Elo Points. Well done, SF Team!!.
STOCKFISH 101213 Elo Performance= 3214
STOCKFISH 151213 Elo Performance= 3230

If you take the dates between 10th and 15th Dec, there is a gain of 16 points in 5 days, over 3 points a day. Do you know if there are quicker increases between two versions?
Good point, Duncan. I have not checked it thoroughly, but at first sight I think the increase you mention is the fastest one in my tests.

A good friend of mine pointed out a curiosity: the Elo increase I found between SF 101213 and SF 231213 (18 points) is exactly the same as the improvement expected by the SF Team in the test framework. Surprising but exciting, isn't it? :wink:

Regards,

Tom.
pohl4711
Posts: 2821
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Testing Stockfish 11-03-13. 480 Games.

Post by pohl4711 »

Tomcass wrote:The Elo increase I found between SF 101213 and SF 231213 (18 points) is exactly the same as the improvement expected by the SF Team in the test framework. Surprising but exciting, isn't it? :wink:
That's not correct. The +18 Elo regression test in the SF test framework is the result of Stockfish 231213 against Stockfish DD. All regression tests are run between the last official release (at the moment Stockfish DD) and the new development version.
And as you can see in my LS rating list, Stockfish 101213 (I write 131210...) is 7 Elo stronger than Stockfish DD.
The result of Stockfish 131223 in its LS test run will go online on Tuesday. At the moment (4000 games) it is only 3 Elo stronger than Stockfish 131210 and 10 Elo stronger than Stockfish DD. But there are still 6000 games to play.

Stefan
ouachita
Posts: 454
Joined: Tue Jan 15, 2013 4:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: Testing Stockfish 11-03-13. 480 Games.

Post by ouachita »

Tomcass wrote:The Elo increase I found between SF 101213 and SF 231213 (18 points) is exactly the same as the improvement expected by the SF Team in the test framework. Surprising but exciting, isn't it? :wink:
pohl4711 wrote:That's not correct.
Stefan,
There are numerous engine testers using differing test bases, so the results will differ. Isn't it reasonable to say that there's no right or wrong, or correct or incorrect, but that each test result stands on its own, and each reviewer has to draw his/her own conclusions on each test result and on the tests as a whole?
SIM, PhD, MBA, PE