SPCC: Testrun of Stockfish 170503 finished

Moderator: Ras

pohl4711
Posts: 2950
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

SPCC: Testrun of Stockfish 170503 finished

Post by pohl4711 »

Testrun of Stockfish 170503 finished.

Long thinking-time tournament updated.

http://spcc.beepworld.de

(You may need to clear your browser cache or reload the website.)
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: SPCC: Testrun of Stockfish 170503 finished

Post by Lyudmil Tsvetkov »

-12 Elo looks absurd here; there were some worthwhile patches in this period.
Maybe some settings went wrong, or it is the book, with some specific patch making SF particularly dislike this book.
pohl4711
Posts: 2950
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Stockfish 170503 finished

Post by pohl4711 »

The problem is that this Stockfish version has a measurably higher draw rate in my testrun, and because of this the number of wins is lower. In the framework, the opening test sets produce extremely high draw rates, so perhaps a more drawish Stockfish cannot be detected there.
But let's see how asmFish with the same patches scores. Then we will know whether we have a regression in the Stockfish code or a statistical "accident"...
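To make the draw-rate point concrete, here is a minimal Python sketch. It assumes the usual logistic Elo formula and a hypothetical, fixed win:loss ratio of 3:1; none of the numbers come from the actual testrun. It only shows how the measured Elo shrinks when the draw rate rises while the ratio of wins to losses stays the same.

```python
import math

def elo_from_score(score):
    """Convert an average score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_at_draw_rate(draw_rate, win_loss_ratio):
    """Elo implied by a given draw rate when wins:losses is held fixed.

    win_loss_ratio is a hypothetical parameter for illustration only.
    """
    decisive = 1.0 - draw_rate  # fraction of games that are not drawn
    win_rate = decisive * win_loss_ratio / (win_loss_ratio + 1.0)
    score = win_rate + 0.5 * draw_rate
    return elo_from_score(score)

# Same 3:1 win:loss ratio, increasingly drawish play.
for draw_rate in (0.5, 0.6, 0.7):
    print(f"draw rate {draw_rate:.0%}: {elo_at_draw_rate(draw_rate, 3.0):+.1f} Elo")
```

Under these assumptions, fewer decisive games means fewer wins in absolute terms, so the score moves toward 50% and the Elo figure drops even though the engine is not losing more often relative to its wins.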

Stefan
Guenther
Posts: 4718
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: SPCC: Testrun of Stockfish 170503 finished

Post by Guenther »

Lyudmil Tsvetkov wrote: -12 Elo looks absurd here; there were some worthwhile patches in this period.
Maybe some settings went wrong, or it is the book, with some specific patch making SF particularly dislike this book.
Well, even at 7000 games you can see that the error bars are ±7 for both entries.
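For anyone who wants to see how the number of games drives the error bar, here is a rough Python sketch. It assumes independent games, a normal approximation, and a single head-to-head match; the win/draw/loss counts are hypothetical and not taken from the SPCC list. A rating list that rates a whole pool of engines will report different margins (such as the ±7 mentioned above).

```python
import math

def elo_from_score(score):
    """Convert an average score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(wins, draws, losses, z=1.96):
    """Approximate half-width of the 95% confidence interval, in Elo."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    lo = elo_from_score(score - z * stderr)
    hi = elo_from_score(score + z * stderr)
    return (hi - lo) / 2.0

# Hypothetical matches with a 60% draw rate and an even score.
margin_7k = elo_margin(1400, 4200, 1400)     # 7000 games
margin_40k = elo_margin(8000, 24000, 8000)   # 40000 games
print(f"7000 games:  +/- {margin_7k:.1f} Elo")
print(f"40000 games: +/- {margin_40k:.1f} Elo")
```

In this model the margin shrinks roughly with the square root of the number of games, which is why a 40K-game run has a noticeably tighter error bar than a 7000-game run at the same draw rate.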
https://rwbc-chess.de

[Trolls n'existent pas...]
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: SPCC: Testrun of Stockfish 170503 finished

Post by JJJ »

The regression test shows a solid +20 Elo over 40K games, and I think it is more accurate than other tests. It also shows +17 Elo at 180 sec + 1.8 sec.
Eelco de Groot
Posts: 4724
Joined: Sun Mar 12, 2006 2:40 am
Full name: Eelco de Groot

Re: SPCC: Testrun of Stockfish 170503 finished

Post by Eelco de Groot »

JJJ wrote: The regression test shows a solid +20 Elo over 40K games, and I think it is more accurate than other tests. It also shows +17 Elo at 180 sec + 1.8 sec.
If you add 20 points to 3390, which is where Stockfish 8 is in Stefan's list, you get 3410. That is about the same point you land on when interpolating, for example, Stefan's last 5 data points: the first three are on this line, then one is above and the last is below the interpolated line. So that would put the Stockfish development version in Stefan's list now (virtually) at 3410, with no real regression but also no big jump with the last patch.
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: SPCC: Testrun of Stockfish 170503 finished

Post by JJJ »

Eelco de Groot wrote:
JJJ wrote: The regression test shows a solid +20 Elo over 40K games, and I think it is more accurate than other tests. It also shows +17 Elo at 180 sec + 1.8 sec.
If you add 20 points to 3390, which is where Stockfish 8 is in Stefan's list, you get 3410. That is about the same point you land on when interpolating, for example, Stefan's last 5 data points: the first three are on this line, then one is above and the last is below the interpolated line. So that would put the Stockfish development version in Stefan's list now (virtually) at 3410, with no real regression but also no big jump with the last patch.
Yes. And I place more trust in a regression test with 40K games and a smaller error bar.