Stockfish 1.6.3 JA update available

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Stockfish 1.6.3 JA update available

Post by mcostalba »

swami wrote: Stockfish 1.6.2 was tested when the STS suites had partial-credit moves, and Arena erroneously awarded points for certain moves. I wasn't aware of this bug until Wesley pointed it out in a thread later on.

Stockfish 1.6.3 was tested with the no-partial-scoring STS suites, which consist of only best moves.
Hi Swami,

I considered again this statement and I think there is something that I don't understand.

You said that Arena previously granted some extra points to SF 1.6.2, but you also say that SF 1.6.3 has better test results. That cannot be the explanation: 1.6.2 should then have scored better than 1.6.3, because it received some extra points as a gift from Arena. Considering that functionality is the same between 1.6.2 and 1.6.3, I still believe your testing procedure is not 100% reproducible and that different runs of the _same_ engine yield different results.

What do you think?
swami
Posts: 6659
Joined: Thu Mar 09, 2006 4:21 am

Re: Stockfish 1.6.3 JA update available

Post by swami »

mcostalba wrote:
swami wrote: Stockfish 1.6.2 was tested when the STS suites had partial-credit moves, and Arena erroneously awarded points for certain moves. I wasn't aware of this bug until Wesley pointed it out in a thread later on.

Stockfish 1.6.3 was tested with the no-partial-scoring STS suites, which consist of only best moves.
Hi Swami,

thanks for the explanation. It would be interesting, as a verification, if you could rerun SF 1.6.2 with the latest no-partial-scoring STS setup, to verify that your test gives no difference from 1.6.3.

Thanks
Marco
Hi Marco,

I ran the same test again with the same version and noticed a small difference. It may be due to the fact that I had Ivanhoe running on 2 cores in the background during the first two tests. Since Stockfish was only using 1 CPU, I thought running Ivanhoe to analyse games in another window wouldn't affect the results.

Now, I'm going to run 1.6.2 and 1.6.3 (again) without any other programs in the background. Will let you know about the results.
Stockfish 1.6.3 JA
by Marco Costalba, Tord Romstad, Joona Kiiski, Europe.

Strategic Test Suite Conditions:

Core2Quad 32 bits, Q6600, 2 GB RAM, 2.4 GHz
10 seconds per position
900 positions
Engine uses 156 MB Hash.
Single CPU
Arena GUI


Overall Performance:
  • Total Score: 695/900
  • Average: 77.22%
  • Grade: A
  • Total Rated Time: 41.37/150 minutes [2482 seconds/9000 seconds]
Subject-wise Scores:

STS (v1.0) - Undermining:
82/100, Grade: A+

STS (v2.1) - Open Files and Diagonals:
80/100, Grade: A+

STS (v3.0) - Knight Outposts/Centralization/Repositioning:
81/100, Grade: A+

STS (v4.1) - Square Vacancy:
83/100, Grade: S

STS (v5.0) - Bishop vs Knight:
79/100, Grade: A

STS (v6.0) - Re-Capturing:
78/100, Grade: A

STS (v7.0) - Offer of Simplification:
74/100, Grade: A-

STS (v8.1) - Advancement of f/g/h Pawns:
63/100, Grade: B

STS (v9.0) - Advancement of a/b/c Pawns:
75/100, Grade: A
Best Wishes,
Swami
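As a quick sanity check on the summary line, the totals can be recomputed from the nine subject scores posted above. This is only a minimal sketch: the variable names are made up for illustration, and the only inputs are the numbers from the post (nine suites of 100 positions, 10 seconds per position, 2482 rated seconds).

```python
# Subject scores as posted for Stockfish 1.6.3 JA (STS 1-9, 100 positions each).
scores_163 = {
    "Undermining": 82,
    "Open Files and Diagonals": 80,
    "Knight Outposts/Centralization/Repositioning": 81,
    "Square Vacancy": 83,
    "Bishop vs Knight": 79,
    "Re-Capturing": 78,
    "Offer of Simplification": 74,
    "Advancement of f/g/h Pawns": 63,
    "Advancement of a/b/c Pawns": 75,
}

SECONDS_PER_POSITION = 10   # from the posted test conditions
rated_seconds = 2482        # from the posted summary line

total = sum(scores_163.values())
positions = 100 * len(scores_163)
average = 100 * total / positions
rated_minutes = rated_seconds / 60
budget_minutes = positions * SECONDS_PER_POSITION / 60

print(f"Total Score: {total}/{positions}  Average: {average:.2f}%")
print(f"Total Rated Time: {rated_minutes:.2f}/{budget_minutes:.0f} minutes")
```

The recomputed total, average and rated time agree with the figures in the summary.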
swami
Posts: 6659
Joined: Thu Mar 09, 2006 4:21 am

Re: Stockfish 1.6.3 JA update available

Post by swami »

mcostalba wrote:
swami wrote: Stockfish 1.6.2 was tested when the STS suites had partial-credit moves, and Arena erroneously awarded points for certain moves. I wasn't aware of this bug until Wesley pointed it out in a thread later on.

Stockfish 1.6.3 was tested with the no-partial-scoring STS suites, which consist of only best moves.
Hi Swami,

I considered again this statement and I think there is something that I don't understand.

You said that Arena previously granted some extra points to SF 1.6.2, but you also say that SF 1.6.3 has better test results. That cannot be the explanation: 1.6.2 should then have scored better than 1.6.3, because it received some extra points as a gift from Arena. Considering that functionality is the same between 1.6.2 and 1.6.3, I still believe your testing procedure is not 100% reproducible and that different runs of the _same_ engine yield different results.

What do you think?
Yes, Arena did award marks for the newly introduced moves in some of the tests (usually within a 3-point range).

Not all tests were affected by Arena's erroneous scoring. It's usually STS 2 and 3.

I will have to try different GUIs and see if the result is constant. It would be a good experiment :-)

I will try the test with Stockfish as the baseline in GUIs such as

Chessbase
ChessGUI
Arena
Gradual Test

We will then see which one comes closest to giving the same results every time.
Dann Corbit
Posts: 12778
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Stockfish 1.6.3 JA update available

Post by Dann Corbit »

Since we have a gradualtest converter, maybe it is best just to use gradualtest to perform the analysis
noctiferus
Posts: 364
Joined: Sun Oct 04, 2009 1:27 pm
Location: Italy

Re: Stockfish 1.6.3 JA update available

Post by noctiferus »

As I said a few times, STS 8 seems to give counterintuitive results in my rough model.
I wonder why... With the new results I'll try a more exotic technique; let's see.
swami
Posts: 6659
Joined: Thu Mar 09, 2006 4:21 am

Re: Stockfish 1.6.3 JA update available

Post by swami »

Now with Stockfish 1.6.2

I didn't use Gradual Test for testing this as I didn't know how to limit the number of cores used.

Very little difference in both the scores and Total Rated time. So it nearly matches in every case.
Stockfish 1.6.2 JA
by Marco Costalba, Tord Romstad, Joona Kiiski, Europe.

Strategic Test Suite Conditions:

Core2Quad 32 bits, Q6600, 2 GB RAM, 2.4 GHz
10 seconds per position
900 positions
Engine uses 156 MB Hash.
Single CPU
Arena GUI


Overall Performance:
  • Total Score: 699/900
  • Average: 77.67%
  • Grade: A
  • Total Rated Time: 40.98/150 minutes [2459 seconds/9000 seconds]
Subject-wise Scores:

STS (v1.0) - Undermining:
82/100, Grade: A+

STS (v2.1) - Open Files and Diagonals:
80/100, Grade: A+

STS (v3.0) - Knight Outposts/Centralization/Repositioning:
81/100, Grade: A+

STS (v4.1) - Square Vacancy:
84/100, Grade: S

STS (v5.0) - Bishop vs Knight:
79/100, Grade: A

STS (v6.0) - Re-Capturing:
78/100, Grade: A

STS (v7.0) - Offer of Simplification:
76/100, Grade: A-

STS (v8.1) - Advancement of f/g/h Pawns:
64/100, Grade: B

STS (v9.0) - Advancement of a/b/c Pawns:
75/100, Grade: A
Best Wishes,
Swami
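To see where the two runs actually differ, the posted subject scores for 1.6.2 and 1.6.3 can be compared suite by suite. A minimal sketch, using only the numbers from the two result posts in this thread; the suite labels and variable names are shorthand chosen for illustration.

```python
# Suite labels are shorthand for the nine STS subjects, in posted order.
SUITES = [
    "STS 1 Undermining",
    "STS 2 Open Files and Diagonals",
    "STS 3 Knight Outposts",
    "STS 4 Square Vacancy",
    "STS 5 Bishop vs Knight",
    "STS 6 Re-Capturing",
    "STS 7 Offer of Simplification",
    "STS 8 f/g/h Pawns",
    "STS 9 a/b/c Pawns",
]

# Scores as posted in this thread (Arena GUI, single CPU, 10 s per position).
sf_162 = [82, 80, 81, 84, 79, 78, 76, 64, 75]
sf_163 = [82, 80, 81, 83, 79, 78, 74, 63, 75]

# Difference per suite: positive means 1.6.3 scored higher than 1.6.2.
diffs = {name: new - old for name, old, new in zip(SUITES, sf_162, sf_163)}
changed = {name: d for name, d in diffs.items() if d != 0}

for name, d in changed.items():
    print(f"{name}: {d:+d}")
print(f"Total: {sum(diffs.values()):+d} of {100 * len(SUITES)} points")
```

Only STS 4, 7 and 8 differ, by one or two points each, and the totals are 4 points apart out of 900. Whether that is ordinary run-to-run noise is exactly the open question here; two runs are not enough to say.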
maxchgr

Re: Stockfish 1.6.3 JA update available

Post by maxchgr »

What does the square vacancy test?
swami
Posts: 6659
Joined: Thu Mar 09, 2006 4:21 am

Re: Stockfish 1.6.3 JA update available

Post by swami »

maxchgr wrote:What does the square vacancy test?
Not sure what you just asked. Anyway, if you want to know what it means or what it does, please take a look here:

http://sites.google.com/site/strategict ... re-vacancy
David Dahlem
Posts: 900
Joined: Wed Mar 08, 2006 9:06 pm

Re: Stockfish 1.6.3 JA update available

Post by David Dahlem »

swami wrote:I didn't use Gradual Test for testing this as I didn't know how to limit the number of cores used.
Hi Swami

Using the GradualTest "/s" switch, I think this will work:

/s "setoption name Hash value 512\nsetoption name Ponder value false\nsetoption name Threads value 4"
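The string passed to "/s" is just a series of UCI setoption commands joined by a literal \n, so it can be generated for whatever settings a run needs. The sketch below is an illustration only: build_s_argument is a made-up helper, and the assumption that GradualTest expands the two-character sequence \n into separate commands comes solely from the example above. The second call assumes that setting Threads to 1 is what restricts the engine to a single core, with the 156 MB hash from the test conditions earlier in the thread.

```python
def build_s_argument(options):
    """Join UCI 'setoption' commands with a literal backslash-n separator.

    Hypothetical helper: it only reproduces the string format shown in the
    post above; it does not call GradualTest or the engine.
    """
    commands = [
        f"setoption name {name} value {value}"
        for name, value in options.items()
    ]
    # "\\n" is the two characters backslash + n, not a real newline.
    return "\\n".join(commands)


# Reproduces the string from the post above.
as_posted = build_s_argument({"Hash": 512, "Ponder": "false", "Threads": 4})

# Assumed single-core variant matching the earlier test conditions.
single_core = build_s_argument({"Hash": 156, "Ponder": "false", "Threads": 1})

print(f'/s "{as_posted}"')
print(f'/s "{single_core}"')
```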
Frank Quisinsky
Posts: 6936
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Stockfish 1.6.3 JA update available

Post by Frank Quisinsky »

Hi Swami,

I haven't tried Arena in the last 3 1/2 years, but for test suites you should try the Fritz GUI. I think it offers the best options for automated test suites.

The Arena versions I know :-) did nothing of their own with the engines, so the results should be the same.

That should be easy to test.

Just run the same test with Stockfish 1.6.3 again under Arena.

Best
Frank