No progress for Stockfish in last month

Jouni
Posts: 3286
Joined: Wed Mar 08, 2006 8:15 pm

No progress for Stockfish in last month

Post by Jouni »

https://nextchessmove.com/dev-builds

Weird, because there are many +5 ELO patches!
Jouni
Dann Corbit
Posts: 12540
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: No progress for Stockfish in last month

Post by Dann Corbit »

If you notice here:
http://abrok.eu/stockfish/

The STC and LTC ("short" and "long") time controls almost always have different Elo numbers.
That means (quite obviously) that the patches are highly dependent on the environment of the tests.

So the lack of progress at the nextchessmove site is probably correct for the conditions stated.
I guess that if they ran using the exact conditions of LTC or STC we would see different results.

But overall, the fishtest methodology is clearly good and produces strong results.
So I would not worry about it.

I would say this:
If you play with the exact same conditions as the tests at nextchessmove, you probably won't see improvements for the engine over the past month.
That is because with those exact conditions, there isn't any improvement. However, improvements with other settings are likely.
And there could be settings where the engines actually do worse over time.

Hence, testing bodies like CCRL and CEGT are quite valuable, as are those that test at slow time controls on ultra-powerful hardware, like TCEC.

With enough testing data, we can get a pretty good idea of how an engine will do on our hardware.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: No progress for Stockfish in last month

Post by corres »

The basic difference is that the Stockfish tests pit the current development build against the earlier "master" Stockfish, whereas nextchessmove.com plays the development build against Stockfish 7.
In my experience the results from nextchessmove.com are more practical and more usable for an engine user than Stockfish's self-tests.
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: No progress for Stockfish in last month

Post by Ovyron »

For what it's worth, for analysis of chess positions for correspondence games, I haven't seen an improvement since the Stockfish build from July 2019.
Your beliefs create your reality, so be careful what you wish for.
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: No progress for Stockfish in last month

Post by Uri Blass »

Jouni wrote: Tue Oct 01, 2019 2:18 pm https://nextchessmove.com/dev-builds

Weird, because there are many +5 ELO patches!
There are not many +5 Elo patches.
SPRT does not give a good estimate of the Elo improvement.

Stockfish would need to test with a fixed number of games, not with SPRT, to get a good estimate.
Unfortunately, the developers want to improve the engine as fast as possible and prefer not to compute an unbiased estimate of the exact Elo gain of every patch, which I would find more interesting.
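
To illustrate the point about SPRT and Elo estimates, here is a rough simulation sketch. Everything in it is an assumption made for illustration, not fishtest's actual code or settings: the normal approximation to the log-likelihood ratio, the 0 and 20 Elo bounds (far wider than real test bounds, to keep the run short), the 60% draw ratio, and the function names. It plays simulated games for a patch whose true gain is +10 Elo and records the Elo measured at the moment each passing run stops.

```python
import math
import random

def expected_score(elo_diff):
    """Expected score under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def score_to_elo(score):
    """Inverse of expected_score (requires 0 < score < 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def run_sprt(true_elo, elo0, elo1, draw_ratio, rng, alpha=0.05, beta=0.05):
    """Simulate games until the approximate LLR crosses a bound.

    Returns (passed, observed_elo, games). The LLR uses a normal
    approximation with the sample variance (an illustrative choice).
    """
    s0, s1 = expected_score(elo0), expected_score(elo1)
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    p_win = expected_score(true_elo) - draw_ratio / 2.0
    games, total, total_sq = 0, 0.0, 0.0
    while True:
        r = rng.random()
        result = 1.0 if r < p_win else (0.5 if r < p_win + draw_ratio else 0.0)
        games += 1
        total += result
        total_sq += result * result
        mean = total / games
        variance = total_sq / games - mean * mean
        if variance <= 1e-12:
            continue  # all results identical so far, LLR undefined
        llr = games * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * variance)
        if llr >= upper:
            return True, score_to_elo(mean), games
        if llr <= lower:
            return False, score_to_elo(mean), games

def simulate(trials=300, true_elo=10.0, elo0=0.0, elo1=20.0,
             draw_ratio=0.6, seed=1):
    """Return the observed Elo of every run that passed the SPRT."""
    rng = random.Random(seed)
    passed = []
    for _ in range(trials):
        ok, observed, _games = run_sprt(true_elo, elo0, elo1, draw_ratio, rng)
        if ok:
            passed.append(observed)
    return passed

TRUE_ELO = 10.0
passed_elos = simulate(true_elo=TRUE_ELO)
mean_passed = sum(passed_elos) / len(passed_elos)
print(f"true Elo {TRUE_ELO:+.1f}, passed runs: {len(passed_elos)}, "
      f"mean Elo measured at stopping: {mean_passed:+.1f}")
```

In this setup a run can only pass while its measured score is above the midpoint of the two hypotheses, so the Elo read off at the stopping point of a passing run overstates the true gain. A fixed number of games has no such stopping rule, which is the trade-off described above.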
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: No progress for Stockfish in last month

Post by Ozymandias »

Jouni wrote: Tue Oct 01, 2019 2:18 pm https://nextchessmove.com/dev-builds

Weird, because there are many +5 ELO patches!
I also noticed something, simply by looking at the graph for all builds, and here's what caught my eye:

Build          Games  Wins   Losses  Draws  Elo
20190912-0833  20000  12298  378     7324   +238.66 +/- 3.94
20190821-0711  20001  11890  435     7676   +226.38 +/- 3.84
20190814-2015  20000  11761  446     7793   +222.79 +/- 3.81
20190208-0919  20000  11724  389     7887   +223.30 +/- 3.78

There's no progress between February 8 and August 20, then there's a good streak from August 21 to September 12, and since then, we've not reached that level again. This means that (except for a short period of 23 days), there's been no progress for the last 9 months.
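
For anyone who wants to check these figures, here is a small sketch that recomputes them. It assumes the numbers after each build name are games, wins, losses and draws (the counts add up that way), the usual logistic Elo formula, and a 95% margin derived from the per-game variance. The margin method and the helper names are my assumptions, not necessarily what nextchessmove.com does.

```python
import math

def score_to_elo(score):
    """Convert a score fraction (0 < score < 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin) from game counts; z=1.96 is an assumed 95% level."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (win=1, draw=0.5, loss=0).
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    stderr = math.sqrt(variance / games)
    elo = score_to_elo(score)
    # Half-width of the interval after mapping both ends to Elo.
    margin = (score_to_elo(score + z * stderr)
              - score_to_elo(score - z * stderr)) / 2.0
    return elo, margin

def intervals_overlap(a, b):
    """Rough check: do two (elo, margin) intervals overlap?"""
    (elo_a, margin_a), (elo_b, margin_b) = a, b
    return abs(elo_a - elo_b) <= margin_a + margin_b

# Rows copied from the post: build -> (wins, losses, draws)
builds = {
    "20190912-0833": (12298, 378, 7324),
    "20190821-0711": (11890, 435, 7676),
    "20190814-2015": (11761, 446, 7793),
    "20190208-0919": (11724, 389, 7887),
}

results = {name: elo_with_margin(*wld) for name, wld in builds.items()}
for name, (elo, margin) in results.items():
    print(f"{name}  {elo:+.2f} +/- {margin:.2f}")
```

With these assumptions the February 8 and August 14 intervals overlap, while the August 21 and September 12 intervals do not. Overlapping intervals are only a rough indication, not a proper significance test.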
Dann Corbit
Posts: 12540
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: No progress for Stockfish in last month

Post by Dann Corbit »

Something went wrong in March 2019, just like June/July of 2017. But it still looks like clear progress to me, by looking at Pohl's data:
https://www.sp-cc.de/
jorose
Posts: 358
Joined: Thu Jan 22, 2015 3:21 pm
Location: Zurich, Switzerland
Full name: Jonathan Rosenthal

Re: No progress for Stockfish in last month

Post by jorose »

I honestly also think NCM should consider updating the benchmark engine after the SF11 release. The Elo model is not particularly good for engines outside a certain rating range, and at roughly +240 Elo I think we might have reached that point already. For a 240+ Elo performance you need to score about 80%.
-Jonathan
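
The 80% figure follows from the standard logistic Elo formula. A minimal sketch (the function name is mine) that also shows how little extra score another 10 Elo is worth at that level, which is one reason measurements get less informative that far from an even match:

```python
def expected_score(elo_diff):
    """Expected score for a player rated elo_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

score_at_240 = expected_score(240)

# Score gained from 10 additional Elo, near equality and at +240.
gain_at_0 = expected_score(10) - expected_score(0)
gain_at_240 = expected_score(250) - expected_score(240)

print(f"expected score at +240 Elo: {score_at_240:.3f}")
print(f"score gained by +10 Elo at 0:    {gain_at_0:.4f}")
print(f"score gained by +10 Elo at +240: {gain_at_240:.4f}")
```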
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: No progress for Stockfish in last month

Post by Ozymandias »

Image

This is what I noticed in my previous post:
  • The bar on the left, reaching all the way to the top, is the latest SF dev.
  • At the far right, that very short bar is SF 10.
  • The bar missing towards the middle is a statistical "aberration".
  • The red bars are for 20190912, 20190821 and 20190208.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: No progress for Stockfish in last month

Post by corres »

Jouni wrote: Tue Oct 01, 2019 2:18 pm https://nextchessmove.com/dev-builds

Weird, because there are many +5 ELO patches!
It is an illusion to expect linear or continuously growing progress from any development.
The "+5 Elo patches" are only self-test results. It is quite possible that in tests against different opponents the gain would be different.
Nowadays the engine developing the most is Leela, with stronger and stronger nets.
I think that as long as the development of Stockfish can keep pace with Leela, there is no real issue in the development of Stockfish.