Obligatory scaling


zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Obligatory scaling

Post by zullil »

Lyudmil Tsvetkov wrote:And something else, Arjun.

Shane's last scaling patch, for pawn span less than or equal to 2 with all pieces involved, adds 2 Elo, while the existing SF rule for rook scaling with the same pawn span does not add any Elo. Does that ring a bell, Arjun? It is more meaningful to scale more complex endgames than less complex ones, and there is proof of this for you to see: 2 Elo on the framework.

Another question is why SF still has the old rook rule, which does not add any strength and only uglifies the code, while it still does not implement the 2 Elo patch. Please tell me why. If the SF developers are rule-abiding, they could refuse to implement the Shane scaling patch until it is further improved, but they should also immediately throw out the old rook rule. Why not do that?
Perhaps posts having to do specifically with the implementation (or not) of evaluation ideas in Stockfish would be better-placed on a Stockfish-specific site?

And maybe posts that discuss evaluation ideas might be better in the Programming and Technical Discussions forum?

My answer is "yes" to both of these, but that's just me. :wink:
arjuntemurnikar
Posts: 204
Joined: Tue Oct 15, 2013 10:22 pm
Location: Singapore

Re: Obligatory scaling

Post by arjuntemurnikar »

Lyudmil Tsvetkov wrote:
arjuntemurnikar wrote:I did a quick google search and found this blog post claiming to have done statistical analysis on endgames.

This is by no means scientific proof, because the author of the analysis does not give any detailed accounts of which database he queried nor any information about his setup in general, so nobody can really verify his results independently.

But no matter, here is what he says:
"All rook endings are drawn", according to a common piece of chess folklore. We decided to distrust emotion and check the figures, comparing the percentages of draws in different types of endings, using a database of more than three million games. The results were very surprising. Bishop endings turned out to be the most drawish, with 47%. Second place went to queen endings on 43%. Even more surprising was the third place for knight endings, at 40%. And the notorious rook endings came only second-last at 38%, with pawn endings naturally turning out to be the least drawish at 27%.
The blog post is here: http://streathambrixtonchess.blogspot.c ... dgame.html

I also found this on the chessprogramming wiki:
In 2013, John Nunn applied the 7-piece Lomonosov Tablebases to R+2P vs. R+P positions from the famous book Rook Endings, 2nd edition, by Levenfish and Smyslov [7], which was assumed to contain the truth and, owing to Nunn, this is no longer so.
...but since I don't have access to the ICGA Journal, I could not read up on it. If anybody does have access and is willing to share John Nunn's findings, it would be helpful to this discussion. :)
This is complete BS.

You do not know what the criteria for choosing which endgames to count are.

In pawn endgames it is very simple: you either see a win very quickly or it is simply a draw, so you do not need stats for this. There is no WDL probability there, and the author should not have included them.
The first thing a primer teaches you is that knight endgames are absolutely the same as pawn endgames, so if you are up a pawn in a knight endgame with sufficient pawn span, this is simply a win; that is far from true for rook endgames, quite the opposite.
An extra pawn in queen endings is usually easier to convert than in rook endings, so that is another wrong statement.

Instead of citing unsubstantiated sources, it would be better to just try one and the other idea.
At least I made an attempt to find some statistical data on the subject. Do you have a better source? No, you don't.

Instead you just blabber on about whatever nonsense comes to mind. You provide no basis for the statements you make, and disgracefully dismiss any attempts by others to provide facts related to the discussion.

I already gave a caution before quoting the blog post and said it is not an ideal source of data, but it is data nonetheless. I highly doubt somebody just cooked up the figures. I am also very interested to know about John Nunn's findings. His analysis is always very intriguing.
Last edited by arjuntemurnikar on Fri Jun 27, 2014 6:49 pm, edited 3 times in total.
arjuntemurnikar
Posts: 204
Joined: Tue Oct 15, 2013 10:22 pm
Location: Singapore

Re: Obligatory scaling

Post by arjuntemurnikar »

Lyudmil Tsvetkov wrote:
Read more carefully, Arjun, you simply do not read carefully.

Tord says that SF had code for R+2 pawns vs R+1 pawn on one wing and it did not boost playing strength. He also says that he did not believe adding knowledge for more complex rook endgames would be useful, as the less complex rule did not work. But he did not try, and neither did anyone else on the SF team.

Do you know why scaling down R+2 pawns vs R+1 pawn does not work in SF? The main reason is that there is nothing to gain here: the endgame is simply too simple and the engine sees it all in the search. That is why there is no gain, very simple. That is why I suggested scaling 3 vs 2 in SF, and you will see the benefit. Such a scaling is meaningful, and SF often goes wrong with that, but I have not seen a game where it goes wrong with 2 vs 1 pawns.

About combinatorial problems with having many specific eval functions: fully true, and that is why I say scale what is important. If you are unable to scale most late endgames with equal material and pawn span less than or equal to 3, then scale at least single rook endgames with 3 vs 2 pawns and less. One of the two, but not both at the same time.

Who needs a rule for 2 vs 1 pawns when it is not working? So I would say drop it, and replace it with the meaningful rule for 3 vs 2 pawns, which already adds strength. Or even better, implement a rule for scaling all 1 and 2 piece endgames with equal material and pawn span less than or equal to 3. Shane was very close; actually 2 of his scaling patches succeeded quite nicely. So that scaling works, and you should not deny the facts: Tord's old minimalistic rules have already been superseded, but the replacement is still not implemented. Is this not proof, and very evident proof, that more complex scaling is good and works, and also adds nice strength?

Gary, Lucas and Joona might be right that the last Shane patch could be improved, and I think Shane or someone else should make a bit more effort so that the patch is finally implemented. Why not try what I suggest:

50% scaling when

- pawn span less than or equal to 3
- non pawn material equal or just B vs N
- pawns equal or just one pawn more for one of the sides
- 1 or 2 pieces each side?

I think this is a meaningful rule that could help the search in many situations and also improve strength.
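For readers who want the four conditions spelled out, here is a minimal sketch in Python (not Stockfish's C++, and not Shane's patch). `EndgameSummary`, `pawn_span` and `scale_percent` are hypothetical names made up for illustration. Two assumptions are baked in: "pawn span" is taken to mean the distance in files between the leftmost and rightmost pawn of either colour, and "50% scaling" is taken to mean the evaluation is multiplied by 50%.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EndgameSummary:
    """Hypothetical position summary; not a Stockfish data structure."""
    pawn_files: List[int]   # file index 0..7 (a..h) of every pawn, both sides
    white_pawns: int
    black_pawns: int
    white_pieces: str       # non-pawn, non-king pieces, e.g. "R" or "RB"
    black_pieces: str


def pawn_span(files: List[int]) -> int:
    """Assumed definition: file distance between leftmost and rightmost pawn."""
    return max(files) - min(files) if files else 0


def material_equal_or_b_vs_n(white: str, black: str) -> bool:
    """True if non-pawn material is identical, or differs by one B-for-N swap."""
    def as_minor(pieces: str) -> List[str]:
        # Treat bishops and knights as the same kind of minor piece.
        return sorted("M" if p in "BN" else p for p in pieces)

    if sorted(white) == sorted(black):
        return True
    one_swap = abs(white.count("B") - black.count("B")) <= 1
    return as_minor(white) == as_minor(black) and one_swap


def scale_percent(s: EndgameSummary) -> int:
    """Return 50 when all four proposed conditions hold, otherwise 100."""
    applies = (
        pawn_span(s.pawn_files) <= 3
        and material_equal_or_b_vs_n(s.white_pieces, s.black_pieces)
        and abs(s.white_pawns - s.black_pawns) <= 1
        and 1 <= len(s.white_pieces) <= 2
        and 1 <= len(s.black_pieces) <= 2
    )
    return 50 if applies else 100


# R + 3 pawns vs R + 2 pawns, all pawns on the f-, g- and h-files.
kingside = EndgameSummary([5, 6, 7, 5, 6], 3, 2, "R", "R")
print(scale_percent(kingside))  # 50
```

Whether a flat 50% is the right factor is exactly the kind of thing that would have to be measured on the testing framework; the sketch only pins down when the rule would fire.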
I thought I had already mentioned earlier in my post that I agree with the pawns-on-the-same-side rule:
Single rook endgames with pawns on the same side of the board, where one player has no more than a 1 pawn advantage, are known to be drawish. This is well established theory, so I will agree with you only here.
Please read carefully what I said before lecturing me about what I said wrong and what I didn't.

About Tord's post, it is by no means exactly what I think. You seem to have mistaken his words for my own. I do not agree with everything Tord said, but I do agree with most of it. Anyway, I only posted it here because I thought it was a very good read in general, and hopefully someone saner than you got something out of it.

And about Shane's efforts, have I ever said that it was a bad idea? Did I not say countless times before that I am all in favor of his patch? Why do you lecture me about what I already agree with?

Please think a bit more before you argue with and insult someone.
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Obligatory scaling

Post by Uri Blass »

Lyudmil Tsvetkov wrote:
arjuntemurnikar wrote:I did a quick google search and found this blog post claiming to have done statistical analysis on endgames.

This is by no means scientific proof, because the author of the analysis does not give any detailed accounts of which database he queried nor any information about his setup in general, so nobody can really verify his results independently.

But no matter, here is what he says:
"All rook endings are drawn", according to a common piece of chess folklore. We decided to distrust emotion and check the figures, comparing the percentages of draws in different types of endings, using a database of more than three million games. The results were very surprising. Bishop endings turned out to be the most drawish, with 47%. Second place went to queen endings on 43%. Even more surprising was the third place for knight endings, at 40%. And the notorious rook endings came only second-last at 38%, with pawn endings naturally turning out to be the least drawish at 27%.
The blog post is here: http://streathambrixtonchess.blogspot.c ... dgame.html

I also found this on the chessprogramming wiki:
In 2013, John Nunn applied the 7-piece Lomonosov Tablebases to R+2P vs. R+P positions from the famous book Rook Endings, 2nd edition, by Levenfish and Smyslov [7], which was assumed to contain the truth and, owing to Nunn, this is no longer so.
...but since I don't have access to the ICGA Journal, I could not read up on it. If anybody does have access and is willing to share John Nunn's findings, it would be helpful to this discussion. :)
This is complete BS.

You do not know what the criteria for choosing which endgames to count are.

In pawn endgames it is very simple: you either see a win very quickly or it is simply a draw, so you do not need stats for this. There is no WDL probability there, and the author should not have included them.
The first thing a primer teaches you is that knight endgames are absolutely the same as pawn endgames, so if you are up a pawn in a knight endgame with sufficient pawn span, this is simply a win; that is far from true for rook endgames, quite the opposite.
An extra pawn in queen endings is usually easier to convert than in rook endings, so that is another wrong statement.

Instead of citing unsubstantiated sources, it would be better to just try one and the other idea.
Unfortunately, my opponent in a correspondence game did not "understand" that rook endgames are always a draw, so I had to resign because I found that the rook endgame was winning for him.

Here is the game. (I did not start it: somebody else did and handed me a bad position. A draw and a loss were practically the same for my rating, but I did not want to lose this game.)

[pgn][Event "EU/TC9/sf3"]
[Site "ICCF"]
[Date "2011.07.15"]
[Round "-"]
[White "Blass, Uri"]
[Black "Savoca, Alfredo"]
[Result "0-1"]
[WhiteElo "2603"]
[BlackElo "2394"]
[Board "6"]
[WhiteTeam "Israel"]
[BlackTeam "Italy"]

1.Nf3 c5 2.c4 Nf6 3.Nc3 Nc6 4.d4 cxd4 5.Nxd4 e6
6.a3 Be7 7.e4 O-O 8.Nc2 b6 9.Be2 Bb7 10.O-O Qc7
11.Ne3 Ne5 12.f4 Ng6 13.e5 Ne4 14.Nb5 Qc6 15.Bf3 Nh4
16.Nd4 Qc8 17.Bxe4 Bxe4 18.Bd2 f6 19.Qg4 Bc5 20.Bc3 Nf5
21.Ndxf5 exf5 22.Qg3 fxe5 23.Bxe5 Bxe3+ 24.Qxe3 Qxc4 25.Rac1 Qf7
26.Rc7 Rfc8 27.Rfc1 Rxc7 28.Rxc7 Bc6 29.Qg3 h6 30.h3 Re8
31.h4 a5 32.Kf2 b5 33.h5 Kh7 34.Ra7 Re6 35.Ra6 g5
36.Rb6 Qxh5 37.Rb8 Qg4 38.Qxg4 fxg4 39.g3 Kg6 40.Rf8 Be4
41.Rg8+ Kh5 42.Rd8 Bc6 43.Ke3 Kg6 44.Kd3 Kf7 45.Bc7 gxf4
46.Bxf4 a4 47.Rb8 Kg7 48.Kd2 h5 49.Rd8 Re4 50.Bc7 Rd4+
51.Ke3 Rc4 52.Rb8 Re4+ 53.Kd2 Rd4+ 54.Ke2 Kf7 55.Rh8 Bf3+
56.Ke3 Re4+ 57.Kd3 Re2 58.Rxh5 Rxb2 59.Bd6 Be2+ 60.Ke3 Kg6
61.Re5 Bc4 62.Bb4 Rg2 63.Be1 d5 64.Kd4 b4 65.Bxb4 Rxg3
66.Re3 Rg1 67.Re1 Rg2 68.Ke5 Rf2 69.Re3 Rf5+ 70.Kd4 Bb3
71.Be1 Bd1 72.Bh4 Rh5 73.Bd8 Bf3 74.Re6+ Kf7 75.Re7+ Kf8
76.Re5 Rh1 77.Ke3 d4+ 78.Kf4 d3 79.Be7+ Kf7 80.Bb4 Rb1
81.Ke3 Rf1 82.Rg5 Be2 83.Bc3 Rg1 84.Ba5 Ra1 85.Bb4 Ke6
86.Kf4 Rf1+ 87.Ke4 Rb1 88.Ke3 Kf6 89.Rg8 Kf5 90.Bd6 d2
0-1[/pgn]

Note that Stockfish could not find d2 in a reasonable time, and reducing the score for all rook endgames might only make it harder for it to find d2.

[D]6R1/8/3B4/5k2/p5p1/P2pK3/4b3/1r6 b - - 23 90 bm d2

Maybe today's Stockfish is better than it was at the time of the game and can find d2 faster, but it still does not see d2, at least not after some minutes of search.
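To make it concrete what material is on the board in the diagram position, here is a minimal Python sketch that reads only the piece-placement field of the FEN. `summarize_fen` is a hypothetical helper written for this post, and "pawn span" is again assumed to mean the file distance between the leftmost and rightmost pawn of either colour. It tallies material and nothing more; it says nothing about whether d2 wins.

```python
def summarize_fen(fen: str) -> dict:
    """Tally non-king pieces, pawn counts and pawn span from a FEN string."""
    placement = fen.split()[0]          # first field: piece placement
    pieces = {"white": [], "black": []}
    pawns = {"white": 0, "black": 0}
    pawn_files = []

    for rank in placement.split("/"):
        file_idx = 0                    # 0 = a-file ... 7 = h-file
        for ch in rank:
            if ch.isdigit():
                file_idx += int(ch)     # run of empty squares
                continue
            side = "white" if ch.isupper() else "black"
            kind = ch.upper()
            if kind == "P":
                pawns[side] += 1
                pawn_files.append(file_idx)
            elif kind != "K":
                pieces[side].append(kind)
            file_idx += 1

    span = max(pawn_files) - min(pawn_files) if pawn_files else 0
    return {
        "white_pieces": "".join(sorted(pieces["white"])),
        "black_pieces": "".join(sorted(pieces["black"])),
        "white_pawns": pawns["white"],
        "black_pawns": pawns["black"],
        "pawn_span": span,
    }


info = summarize_fen("6R1/8/3B4/5k2/p5p1/P2pK3/4b3/1r6 b - - 23 90")
print(info["white_pieces"], info["black_pieces"])  # BR BR
print(info["white_pawns"], info["black_pawns"])    # 1 3
print(info["pawn_span"])                           # 6
```

So each side has a rook and a bishop, Black is two pawns up, and the pawns stretch from the a-file to the g-file.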