What is wrong with Stockfish in TCEC

Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

What is wrong with Stockfish in TCEC

Post by Don »

I have heard that question a lot. My short answer is "probably nothing" and I have a prediction for the next round.

At the moment each program has played 11 or fewer games. The error margin for such a tiny sample of games is well over 100 Elo.

The error margin even allows for the possibility that Stockfish is the strongest program in the tournament and is just having a "bad day." This can EASILY happen in any string of 10 or 11 games.

So Stockfish has had a bad result, but not bad enough to prove (with even reasonable certainty) that something is "horribly" wrong.
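
To put a rough number on that error margin, here is a minimal sketch. It assumes two equally strong engines, a 50% draw rate (my assumption, not a TCEC figure), a normal approximation for the score, and the usual logistic Elo formula. The function names are made up for illustration.

```python
import math

def elo_from_score(score):
    """Convert a score fraction (0 < score < 1) to an Elo difference."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_margin(games, draw_ratio=0.5, z=1.96):
    """Approximate 95% Elo half-width after `games` games between equal engines.

    draw_ratio is an assumed value, not a measured one.
    """
    # Per-game score variance for equal engines: wins and losses are each
    # 0.5 away from the mean score of 0.5, draws are exactly on it.
    variance = (1.0 - draw_ratio) * 0.25
    std_err = math.sqrt(variance / games)
    # Upper end of the score interval, capped so the logarithm stays finite.
    upper = min(0.5 + z * std_err, 0.999)
    return elo_from_score(upper)

if __name__ == "__main__":
    for n in (11, 100, 1000):
        print(n, "games: +/-", round(elo_margin(n)), "Elo")
```

Under these assumptions the 11-game margin comes out well above 100 Elo, and it takes on the order of a thousand games to get it down to the range where a small regression would even be visible.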


If that doesn't convince you, then consider that the Stockfish team is very methodical about testing each version. If it's REALLY completely broken as some have implied, then it must be the case that the SF team is completely incompetent or just got really sloppy here. Do you really think they let a version of Stockfish pass through that was seriously weaker than anything else they have? To me that is beyond laughable.

Is it possible that it's a regression? Of course it is. But this stage's bad result in no way implies that it "must" be a regression. If it's a regression it's almost certainly a very minor one, and a minor regression does not explain their bad results or even come close. Do you believe they made a 50 or 100 Elo regression?

The most reasonable explanation for what we see here is that Stockfish is just having a bad tournament. If this had happened to Komodo I would not panic, because I know that the version I submitted was well tested and I also know that it can do well or do poorly.

By the way, as horrible as these results may seem to some, Stockfish is only 2 points behind Houdini and Komodo with a lot of games to go. Do you really believe that the evidence is overwhelming that something is wrong with Stockfish simply because it got "unlucky" in a couple of games?

My prediction is that Stockfish is clearly in the top 3, even this version, and will do well in the next stage whether they update it or not.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Damir
Posts: 2801
Joined: Mon Feb 11, 2008 3:53 pm
Location: Denmark
Full name: Damir Desevac

Re: What is wrong with Stockfish in TCEC

Post by Damir »

Is something wrong with Martin's website? I cannot access it right now. I get an error message.
Dirt
Posts: 2851
Joined: Wed Mar 08, 2006 10:01 pm
Location: Irvine, CA, USA

Re: What is wrong with Stockfish in TCEC

Post by Dirt »

Don wrote: If that doesn't convince you, then consider that the Stockfish team is very methodical about testing each version. If it's REALLY completely broken as some have implied, then it must be the case that the SF team is completely incompetent or just got really sloppy here. Do you really think they let a version of Stockfish pass through that was seriously weaker than anything else they have? To me that is beyond laughable.
As I understand it, they are now using a version of Stockfish that supports tablebases, but without the tablebases. This might not be so well tested. Still, you're probably right.
Modern Times
Posts: 3548
Joined: Thu Jun 07, 2012 11:02 pm

Re: What is wrong with Stockfish in TCEC

Post by Modern Times »

Dirt wrote: As I understand it, they are now using a version of Stockfish that supports tablebases, but without the tablebases. This might not be so well tested. Still, you're probably right.
How bizarre. If tablebases aren't being used, why not use the normal dev versions?
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: What is wrong with Stockfish in TCEC

Post by Milos »

Modern Times wrote:
Dirt wrote: As I understand it, they are now using a version of Stockfish that supports tablebases, but without the tablebases. This might not be so well tested. Still, you're probably right.
How bizarre. If tablebases aren't being used, why not use the normal dev versions?
I don't know where this comes from, but Syzygy are of course used, just not 6-men, but 5.
And it works nicely. You can simply download the same version of SF that is competing in TCEC and try it yourself.
Masta
Posts: 28
Joined: Fri Jul 26, 2013 6:24 am

Re: What is wrong with Stockfish in TCEC

Post by Masta »

Lucas compile.

It was not tested (the compile).
brianr
Posts: 536
Joined: Thu Mar 09, 2006 3:01 pm

Re: What is wrong with Stockfish in TCEC

Post by brianr »

There were also new versions of several other top engines for Stage 3.
Things do not stand still.
From https://www.facebook.com/tcec.chess

TCEC - Thoresen Chess Engines Competition
October 18
I know a lot of people have been waiting, so here are the engines that will be updated for Stage 3:

Houdini 3 -> Houdini 9601
Stockfish 160913 -> Stockfish 151013
Naum 4.5 -> Naum 4.6
Gull 2.2 -> Gull 2.3
Bouquet 1.8a -> Bouquet 1.8b
Komodo 1092 -> Komodo 1119 -> 1121.05
Modern Times
Posts: 3548
Joined: Thu Jun 07, 2012 11:02 pm

Re: What is wrong with Stockfish in TCEC

Post by Modern Times »

Milos wrote:I don't know where this comes from, but Syzygy are of course used, just not 6-men, but 5.
And it works nicely.
OK good. Yes I've played 3,000 games with one of the Stockfish Syzygy versions with 5-men on SSD. It does indeed work very well.

I think, as Don says, it is just having a bad or unlucky tournament. Or perhaps it isn't working well on 16 cores? Also, Martin has enough experience to have set it up properly, so no concerns there.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: What is wrong with Stockfish in TCEC

Post by Don »

Modern Times wrote:
Milos wrote:I don't know where this comes from, but Syzygy are of course used, just not 6-men, but 5.
And it works nicely.
OK good. Yes I've played 3,000 games with one of the Stockfish Syzygy versions with 5-men on SSD. It does indeed work very well.

I think, as Don says, it is just having a bad or unlucky tournament. Or perhaps it isn't working well on 16 cores? Also, Martin has enough experience to have set it up properly, so no concerns there.
It is possible that Syzygy works better at fast time controls, where you can run off thousands of games easily. Still, even if Syzygy hurts it, I cannot imagine it hurting enough to explain this bad result. So I personally don't think there is anything to explain - SF is just a victim of statistical noise - it's just silly to panic after 11 games.

I can tell you that this bad performance of SF does not change my fear of playing black against Stockfish later in this tournament. Of the remaining games, this is the one we are most likely to lose.
IGarcia
Posts: 543
Joined: Mon Jul 05, 2010 10:27 pm

Re: What is wrong with Stockfish in TCEC

Post by IGarcia »

Masta wrote:Lucas compile.

It was not tested (the compile).
It makes sense. If there is any problem, this could be the source.

Still, SF at TCEC looks fine, just some bad luck. It is like betting on red 3 times at roulette and having the ball land on black (or green) 3 times in a row.

Play millions of TCECs and you will often see winners like Houdini, Komodo and SF. A few times you will also see Junior, and why not, once in millions, Gaviota winning?
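
For the roulette comparison, the arithmetic can be checked in a few lines. This is only a sketch and assumes a single-zero (European) wheel with 18 red, 18 black and 1 green pocket. A double-zero wheel would make the losing streak slightly more likely.

```python
from fractions import Fraction

# Assumption: single-zero wheel (18 red, 18 black, 1 green).
RED, BLACK, GREEN = 18, 18, 1
POCKETS = RED + BLACK + GREEN

def prob_red_bet_loses(spins):
    """Probability that a bet on red loses on every one of `spins` spins."""
    lose_once = Fraction(BLACK + GREEN, POCKETS)
    return lose_once ** spins

if __name__ == "__main__":
    p = prob_red_bet_loses(3)
    print(p, float(p))
```

That works out to about 13.5%, so three misses in a row happen roughly one time in seven: unlucky, but nothing that calls for an explanation.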