Why is SF6 so much stronger?

gladius · Post by **gladius** » Thu Feb 19, 2015 7:07 pm

clumma wrote:How did they make such a leap in one version? The code is available. Is the cause not understood? I looked here and on other forums, and didn't see discussion on this question.

-Carl

The key idea is to test every change that could affect the strength of SF. Each change is tested to see whether it is an improvement, with a high degree of precision (requiring tens of thousands of games per patch, and using a sequential probability ratio test, SPRT).

Then you have a high degree of certainty that when you accept a change, it is improving the strength of the engine. Each change is probably only a small improvement, 1-2 Elo or so, but when you add up many of these improvements, you get a big gain in the end!
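
As a rough illustration of the SPRT idea mentioned above, here is a minimal sketch. It is not the testing framework's actual implementation: it ignores draws, treats each decisive game as an independent win or loss under the standard logistic Elo model, and the hypothesis bounds (`elo0=0`, `elo1=5`) and error rates (`alpha`, `beta`) are assumed values chosen only for the example.

```python
import math


def expected_score(elo_diff):
    """Standard logistic Elo model: expected score at a given rating edge."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def sprt_llr(wins, losses, elo0, elo1):
    """Log-likelihood ratio of H1 (true gain is elo1) against H0 (elo0).

    Simplification (assumption): draws are ignored, so every counted
    game is a Bernoulli trial with win probability expected_score(elo).
    """
    p0 = expected_score(elo0)
    p1 = expected_score(elo1)
    return (wins * math.log(p1 / p0)
            + losses * math.log((1.0 - p1) / (1.0 - p0)))


def sprt_decision(wins, losses, elo0=0.0, elo1=5.0, alpha=0.05, beta=0.05):
    """Return 'accept', 'reject', or 'continue' using Wald's SPRT bounds."""
    lower = math.log(beta / (1.0 - alpha))   # crossing this rejects the patch
    upper = math.log((1.0 - beta) / alpha)   # crossing this accepts the patch
    llr = sprt_llr(wins, losses, elo0, elo1)
    if llr >= upper:
        return "accept"
    if llr <= lower:
        return "reject"
    return "continue"


# Hypothetical win/loss counts, for illustration only.
print(sprt_decision(100, 100))
print(sprt_decision(10000, 9000))
```

The point of the sequential form is that the test stops as soon as the evidence crosses either bound, so clearly good or clearly bad patches need fewer games than borderline ones.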

I did a rough count of the number of Elo-affecting changes between SF5 and SF6, and it was around 40. So each change adds roughly 1 Elo on average.

Joerg Oster · Post by **Joerg Oster** » Thu Feb 19, 2015 7:24 pm

clumma wrote:What is the distribution of Elo improvement over commits since version 5? All equal and small? If not, what did the main changes accomplish (SMP handling, fine tuning of parameters...)?

Why is this important?
But again, you can take a look at each and every commit on github and the corresponding tests done in the framework and also the regression tests, which are done from time to time, and then try to figure it out ...

clumma wrote: If answer is fine tuning of parameters I would be surprised, since I expect this to yield diminishing returns by now.

What happens if commits are applied in a different order?

The end result would still be the same. But not all commits are interchangeable, of course. Some/many depend on others.

clumma wrote: Etc.

Why is it not obvious what "understanding" means?

-Carl

Because you mean something other than "understanding"?
Rest assured, at code level all patches are well understood by the maintainers.

clumma · Post by **clumma** » Thu Feb 19, 2015 7:24 pm

BBauer wrote:The apple falls down from the tree. In some sense we know why, even though gravity is not fully understood.

I'm not asking for Einstein or Newton. I would be very happy with Galileo.

Probably lead devs do have a rough understanding of these things. They should publish something about it with each release.

More insight is forthcoming from the authors of the closed-source engine Komodo!

-Carl

clumma · Post by **clumma** » Thu Feb 19, 2015 7:35 pm

Joerg Oster wrote:
clumma wrote:What is the distribution of Elo improvement over commits since version 5? All equal and small? If not, what did the main changes accomplish (SMP handling, fine tuning of parameters...)?
Why is this important?

Because it would answer my question?

But again, you can take a look at each and every commit on github and the corresponding tests done in the framework and also the regression tests, which are done from time to time, and then try to figure it out

In other words, you have no idea. Why are you even replying to my question?

clumma wrote:What happens if commits are applied in a different order?
The end result would still be the same.

How do you know? I suspect it's mostly true, but it seems an obvious kind of experiment to perform.

-Carl

syzygy · Post by **syzygy** » Thu Feb 19, 2015 8:56 pm

clumma wrote:
But again, you can take a look at each and every commit on github and the corresponding tests done in the framework and also the regression tests, which are done from time to time, and then try to figure it out
In other words, you have no idea. Why are you even replying to my question?

It has been explained to you that you can answer all your questions yourself. Why don't you do it and report back when you're done?

Dann Corbit · Post by **Dann Corbit** » Thu Feb 19, 2015 9:06 pm

On the link I gave, you can see the jump in Elo for each change.
I guess that if you add them up, all the surprise will go away.
Consider:
50 changes at 3 Elo each is 150 Elo.

They carefully test each change and commit only when:
1. There is a proven improvement.
OR
2. There is no regression and the code becomes simpler.

THAT is a formula for excellence that cannot be beaten.
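
To put the arithmetic above in concrete terms, here is a small sketch that converts an Elo difference into an expected score with the standard logistic Elo formula. That the individual gains add up linearly is the premise of the example in the post, not something this code checks; the figures 50 and 3 are taken from the post.

```python
def expected_score(elo_diff):
    """Standard logistic Elo model: expected score at a given rating edge."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


changes = 50          # number of accepted patches in the example above
elo_per_change = 3    # Elo gained per patch in the example above
total_elo = changes * elo_per_change

# A single 3 Elo patch barely moves the expected score away from 0.5,
# which is why so many games are needed to measure it.
print(f"one patch:   {expected_score(elo_per_change):.4f}")
print(f"all patches: {expected_score(total_elo):.4f} at {total_elo} Elo")
```

Under this model a 3 Elo edge is an expected score only slightly above 50%, while 150 Elo is roughly 70%.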

clumma · Post by **clumma** » Thu Feb 19, 2015 9:17 pm

syzygy wrote:It has been explained to you that you can answer all your questions yourself. Why don't you do it and report back when you're done?

Because I expected the answer is already known by someone here. (This is generally why people ask things on forums.)

I haven't gathered the information myself because the answer isn't worth the few hours of effort it would take me. I expected someone here already knew it because it is a place where aspiring engine authors hang out, and it is manifestly worth the effort to them.

-Carl

mcostalba · Post by **mcostalba** » Thu Feb 19, 2015 9:27 pm

clumma wrote: I haven't gathered the information myself because the answer isn't worth the few hours of effort it would take me.

You are just an arrogant troll. Many people, including Gary, dedicated (aka wasted) some of their time to answer you, much more than what you deserve.

I have spent 15 seconds to write this post.....14 seconds too much.

syzygy · Post by **syzygy** » Thu Feb 19, 2015 9:52 pm

mcostalba wrote:
clumma wrote: I haven't gathered the information myself because the answer isn't worth the few hours of effort it would take me.
You are just an arrogant troll. Many people, including Gary, dedicated (aka wasted) some of their time to answer you, much more than what you deserve.

I have spent 15 seconds to write this post.....14 seconds too much.

But at least you have saved me that time! Thanks!

clumma · Post by **clumma** » Fri Feb 20, 2015 12:23 am

I probably made mistakes but I took a stab at this. I eyeballed the top graph on this page

http://tests.stockfishchess.org/regression

And picked the four biggest jumps, which account for about half the Elo improvement in SF6

new       old       elo
4758fd3   ffedfa3   9
ea9c424   cd065dd   7
c6d45c6   79fa72f   5
296534f   1588642   5

It looks like major testing is only done at checkpoints like these -- each commit just needs to prove it isn't a regression, or declare that it's not a functional change (nevertheless there is a regression in this graph...). So there's no immediate way to tell the Elo impact of a commit, but as a proxy I used the inverse of the number of games the SPRT apparently required before it quit. That yields (again, by eye) these significant commits for the above checkpoints, respectively:

Add bonuses for minors attacking enemy pieces 13
Tune trapped rook penalty (on master)
Double mg bonus and half eg bonus (on master, "outpost tuning")

King-pawn threat bonus for endgames 49
Evaluate king safety when no queen is present 51
Change history reduction in LMR to be a full ply 53
Remove use of half-ply reductions 55

Add bonuses for each threat instead of max threat value 94
Be more optimistic in aspiration window 105

Halve StormDanger bonus for blocked pawn on A/H file 153
Avoid searching TT twice for the same key/position... 151
Big King Safety tuning 184

The numbers here are the associated pull requests. Append one to this URI to read more and get a link to the diff:

https://github.com/official-stockfish/Stockfish/pull/
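
For anyone who wants to repeat this, the two mechanical steps above (ranking patches by the inverse of the SPRT game count, and building the pull request links) can be sketched as follows. The patch names and game counts are hypothetical placeholders, not figures read from the testing framework; only the base URI and the pull request numbers come from the post.

```python
PULL_BASE = "https://github.com/official-stockfish/Stockfish/pull/"


def pull_url(number):
    """Append a pull request number to the base URI given above."""
    return f"{PULL_BASE}{number}"


def rank_by_inverse_games(games_by_patch):
    """Rank patches by 1/games, largest proxy first.

    The proxy assumes that an SPRT which stops after fewer games saw a
    clearer signal. It is a heuristic, not a measured Elo gain.
    """
    proxy = {name: 1.0 / games
             for name, games in games_by_patch.items() if games > 0}
    return sorted(proxy, key=proxy.get, reverse=True)


# Hypothetical game counts, for illustration only.
hypothetical_games = {"patch_a": 12000, "patch_b": 45000, "patch_c": 8000}
ranking = rank_by_inverse_games(hypothetical_games)
print(ranking)

# Pull request numbers as listed in the post.
for number in (13, 49, 51, 53, 55, 94, 105, 153, 151, 184):
    print(pull_url(number))
```

Note that an SPRT also stops early on a clearly bad patch, so the proxy only makes sense when applied to patches that passed.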

Of course the non-functional changes lay the groundwork for future improvements... This doesn't even scratch the surface. It's just what I, a newcomer with only a passing interest in computer chess, could do in an hour. It does appear that you (Uri) are being modest, since you made some of the more interesting changes in this release.

-Carl
