No progress for Stockfish in last month
Moderators: hgm, Rebel, chrisw
-
- Posts: 3291
- Joined: Wed Mar 08, 2006 8:15 pm
https://nextchessmove.com/dev-builds
Weird, because there are many +5 ELO patches!
-
- Posts: 12541
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: No progress for Stockfish in last month
If you notice here:
http://abrok.eu/stockfish/
The STC and LTC ("short" and "long") time controls almost always have different Elo numbers.
That means (quite obviously) that the patches are highly dependent on the environment of the tests.
So the lack of progress at the nextchessmove site is probably correct for the conditions stated.
I guess that if they ran using the exact conditions of LTC or STC we would see different results.
But overall, the fishtest methodology is clearly good and produces strong results.
So I would not worry about it.
I would say this:
If you play with the exact same conditions as the tests at nextchessmove, you probably won't see improvements for the engine over the past month.
That is because with those exact conditions, there isn't any improvement. However, improvements with other settings are likely.
And there could be settings where the engines actually do worse over time.
Hence, testing bodies like CCRL and CEGT are quite valuable, as are those that test at slow time controls on ultra-powerful hardware, like TCEC.
With enough testing data, we can get a pretty good idea of how an engine will do on our hardware.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: No progress for Stockfish in last month
The basic difference is that the Stockfish tests use games between the earlier "Master" Stockfish and the actual StockfishDev, whereas nextchessmove.com uses games between StockfishDev and Stockfish 7.
In my experience, the results of nextchessmove.com are more practical and more usable for an engine user than the self-tests of Stockfish.
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: No progress for Stockfish in last month
For what it's worth, for analysis of chess positions for correspondence games, I haven't seen an improvement since the Stockfish build from July 2019.
Your beliefs create your reality, so be careful what you wish for.
-
- Posts: 10297
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: No progress for Stockfish in last month
Jouni wrote: Tue Oct 01, 2019 2:18 pm
https://nextchessmove.com/dev-builds
Weird, because there are many +5 ELO patches!
There are not many +5 Elo patches.
SPRT does not give a good estimate of the Elo improvement.
Stockfish needs to test with a fixed number of games, not with SPRT, if they want a good estimate.
Unfortunately, they want to make progress as fast as possible and prefer not to find an unbiased estimate of the exact Elo improvement of every patch, which I consider to be more interesting.
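To illustrate the point about SPRT estimates, here is a minimal simulation sketch in Python. It is not fishtest's actual setup: it assumes a simplified win/loss model with no draws, hypothetical SPRT bounds of 0 and 20 Elo, a patch whose true gain is 10 Elo, and illustrative function names. Under those assumptions, the Elo measured at the moment a test passes tends to sit above the true value, because runs that happen to score well are the ones that cross the acceptance bound.

```python
import math
import random

def expected_score(elo):
    """Expected score under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def elo_from_score(score):
    """Inverse of expected_score."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def run_sprt(true_elo, elo0, elo1, rng, alpha=0.05, beta=0.05):
    """Play simulated win/loss games until the SPRT stops.

    Returns (passed, estimated_elo). Draws are ignored (an assumption).
    """
    p, p0, p1 = (expected_score(e) for e in (true_elo, elo0, elo1))
    win_step = math.log(p1 / p0)               # LLR change after a win
    loss_step = math.log((1 - p1) / (1 - p0))  # LLR change after a loss
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    llr, wins, games = 0.0, 0, 0
    while lower < llr < upper:
        games += 1
        if rng.random() < p:
            wins += 1
            llr += win_step
        else:
            llr += loss_step
    return llr >= upper, elo_from_score(wins / games)

TRUE_ELO = 10.0
rng = random.Random(2019)  # fixed seed so the run is repeatable
results = [run_sprt(TRUE_ELO, 0.0, 20.0, rng) for _ in range(200)]
accepted = [elo for passed, elo in results if passed]
mean_accepted = sum(accepted) / len(accepted)
print(f"true Elo: {TRUE_ELO}, mean estimate of passed tests: {mean_accepted:.1f}")
```

A fixed-length match has no such stopping rule, so its score is not selected for being lucky, at the cost of playing more games per patch.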
-
- Posts: 1535
- Joined: Sun Oct 25, 2009 2:30 am
Re: No progress for Stockfish in last month
Jouni wrote: Tue Oct 01, 2019 2:18 pm
https://nextchessmove.com/dev-builds
Weird, because there are many +5 ELO patches!
I also noticed something, simply by looking at the graph for all builds, and here's what caught my eye:
Build          Games  Wins   Losses  Draws  Elo
20190912-0833  20000  12298  378     7324   +238.66 +/- 3.94
20190821-0711  20001  11890  435     7676   +226.38 +/- 3.84
20190814-2015  20000  11761  446     7793   +222.79 +/- 3.81
20190208-0919  20000  11724  389     7887   +223.30 +/- 3.78
There's no progress between February 8 and August 20, then there's a good streak from August 21 to September 12, and since then, we've not reached that level again. This means that (except for a short period of 23 days), there's been no progress for the last 9 months.
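The rows above are consistent with the columns being games, wins, losses, draws, Elo and a roughly 95% error margin. That reading is an assumption, since the listing carries no header. A minimal Python sketch under that assumption, using the logistic Elo formula and a normal approximation for the margin (the function name `elo_from_wld` is illustrative, not NCM's actual code):

```python
import math

def elo_from_wld(wins, losses, draws, z=1.96):
    """Return (elo, margin) from a win/loss/draw record.

    Assumes the logistic Elo model and a normal approximation
    (z=1.96 for a roughly 95% interval); not NCM's actual code.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    elo = -400.0 * math.log10(1.0 / score - 1.0)
    # Per-game variance of the result (1, 0.5 or 0 points).
    variance = (wins + 0.25 * draws) / games - score ** 2
    std_error = math.sqrt(variance / games)
    # Propagate the score error to Elo via the slope of the Elo curve.
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return elo, z * std_error * slope

elo, margin = elo_from_wld(12298, 378, 7324)  # build 20190912-0833
print(f"{elo:+.2f} +/- {margin:.2f}")
```

For the 20190912-0833 row this should come out close to the listed +238.66 +/- 3.94. With margins of about 4 Elo on each build, the gaps between the February and mid-August builds are within the noise, while the September build stands out.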
-
- Posts: 12541
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: No progress for Stockfish in last month
Something went wrong in March 2019, just like June/July of 2017. But it still looks like clear progress to me, by looking at Pohl's data:
https://www.sp-cc.de/
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 360
- Joined: Thu Jan 22, 2015 3:21 pm
- Location: Zurich, Switzerland
- Full name: Jonathan Rosenthal
Re: No progress for Stockfish in last month
I honestly also think NCM should consider updating the benchmark engine after the SF11 release. The Elo model is not particularly good for engines outside a certain rating range, and at roughly +240 Elo I think we might have reached that point already. For a 240+ Elo performance you need to score about 80%.
-Jonathan
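The 80% figure follows from the standard logistic Elo formula. A short Python sketch (the function name `expected_score` is illustrative):

```python
def expected_score(elo_diff):
    """Expected score (0..1) for a player rated elo_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# Score needed for a few rating gaps, including the ~240 Elo gap to the benchmark.
for diff in (0, 100, 240, 400):
    print(diff, round(expected_score(diff), 3))
```

At a 240 Elo gap the expected score is just under 0.80, so only about a fifth of the points are left to tell further improvements apart.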
-
- Posts: 1535
- Joined: Sun Oct 25, 2009 2:30 am
Re: No progress for Stockfish in last month
This is what I noticed in my previous post:
- The bar on the left, reaching all the way to the top, is the latest SF dev.
- At the far right, the very short bar is SF 10.
- The bar missing towards the middle is a statistical "aberration".
- The red bars are for 20190912, 20190821 and 20190208.
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: No progress for Stockfish in last month
Jouni wrote: Tue Oct 01, 2019 2:18 pm
https://nextchessmove.com/dev-builds
Weird, because there are many +5 ELO patches!
It is an illusion to expect linear or continuously growing progression from any development.
The "+5 Elo patches" are only the results of self-tests. It is quite possible that in tests against different opponents the improvement would be different.
Nowadays the most developed engine is Leela, with stronger and stronger nets.
I think that as long as the development of Stockfish can keep pace with Leela, there is no real issue in the development of Stockfish.