Further weaknesses

zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: What to do about this?

Post by zullil »

zullil wrote:
zullil wrote:
Lyudmil Tsvetkov wrote: [d]8/1Q6/3ppk1p/6r1/3p4/P3b1PK/1P5P/8 w - - 0 49
Both engines see 1 to 2 full pawns white advantage here for the next 20 moves, when suddenly, first SF, and then Houdini, understand that black is winning.

In any case, it seems black has a decisive advantage in terms of eval on the above diagram.
After a long search, the latest Stockfish sure isn't seeing that Black has an advantage:

info depth 48 seldepth 90 score cp 275 nodes 120428243868 nps 23691600 time 5083162 multipv 1 pv b7a6 g5d5 a6f1 d5f5 f1d3 f5c5 h3g2 d6d5 a3a4 h6h5 d3h7 c5c1 g2h3 c1a1 b2b3 f6e5 a4a5 a1a3 h7h5 e5d6 h5e8 a3b3 e8d8 d6c6 d8e7 c6b5 e7a7 b5c6 a5a6 b3b6 a7f7 b6a6 f7e6 c6b7 e6d5 b7c7 h3g4 a6d6 d5c5 c7d7 c5b5 d7e7 g4f3 e3h6 b5d3 h6g7 h2h4 d6e6 f3f4 e6f6 f4g4 f6e6 d3h7 e7f7 g4g5 e6d6

info depth 49 seldepth 90 score cp 313 nodes 278416677366 nps 24538113 time 11346295 multipv 1 pv b7e4 g5e5 e4d3 e5c5 h3g2 h6h5 a3a4 c5a5 b2b4 a5a4 d3b1 a4a8 b4b5 a8b8 b5b6 d6d5 b1f1 f6e5 f1a6 d4d3 a6d3 e3b6 d3f3 e5d6 f3f4 e6e5 f4f6 d6c5 f6e5 b8b7 e5h5 b6c7 h5e2 c7d6 e2c2 c5d4 h2h4 b7b8 c2f2 d4e5 f2e2 e5f6 g3g4 d6e5 g4g5 f6f5 e2c2 f5e6 c2g6 e6e7 g6d3

info depth 50 seldepth 102 score cp 275 nodes 652676948978 nps 25448132 time 25647342 multipv 1 pv b7e4 g5e5 e4c2 e5c5 c2e2 c5a5 e2d3 a5c5 d3e2
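
The engine output above is in the UCI `info` format. As a minimal sketch (the function name `parse_uci_info` is made up here, and only the fields that appear in these lines are handled), such a line can be split into its fields like this, which makes it easy to compare score, depth and search time across the three runs:

```python
def parse_uci_info(line):
    """Parse a UCI 'info' line into a dict; the pv is kept as a list of moves."""
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        raise ValueError("not a UCI info line")
    int_fields = {"depth", "seldepth", "nodes", "nps", "time", "multipv"}
    info = {}
    i = 1
    while i < len(tokens):
        key = tokens[i]
        if key == "pv":
            # Everything after 'pv' is the principal variation.
            info["pv"] = tokens[i + 1:]
            break
        if key == "score":
            # "score cp <centipawns>" or "score mate <moves>"
            info["score_type"] = tokens[i + 1]
            info["score"] = int(tokens[i + 2])
            i += 3
        elif key in int_fields:
            info[key] = int(tokens[i + 1])
            i += 2
        else:
            i += 1  # skip tokens this sketch does not handle


    return info


line = ("info depth 50 seldepth 102 score cp 275 nodes 652676948978 "
        "nps 25448132 time 25647342 multipv 1 "
        "pv b7e4 g5e5 e4c2 e5c5 c2e2 c5a5 e2d3 a5c5 d3e2")
info = parse_uci_info(line)
hours = info["time"] / 3_600_000  # UCI reports time in milliseconds
print(f"depth {info['depth']}: {info['score'] / 100:+.2f} pawns after {hours:.1f} h")
```

The score is in centipawns from the side to move's point of view, so `cp 275` here means White is evaluated as 2.75 pawns ahead.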
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: What to do about this?

Post by Lyudmil Tsvetkov »

Thanks Louis, but I do not believe SF.
In the actual game, some 30+ plies on, it still showed a very significant white advantage.
This is how search without good eval becomes meaningless: neither SF nor Houdini sees a forced winning line for black, since in the best of cases there is a long series of checks, but the win is there.

Something should be done about SF eval in the above position, but I am still not certain what exactly, apart from further raising the passer bonus for the side with more pieces.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Spacemask 2

Post by Lyudmil Tsvetkov »

Well, it is a pity no one is interested in this. I not only believe, but am very much certain, that it is a very good idea.
Why so? Because I saw similar SF behaviour in more than 10% of the games I have looked at, and I have looked at many more than a hundred games.

I am adamant, based on the games I have seen, that SF underestimates blocked enemy pawns on the 5th rank. Something seems to be going extremely wrong in STF at the moment: the queue is empty, and patches that have passed STC are not being tested at LTC for some reason, such as the Trapped Bishop patch (first try), the NoMobilityQueen patch, and the RNN vs R imbalance patch that passed STC. (Btw, regarding this last patch, a very meaningful thing would be to add R vs BBN to the imbalance, if it is not already added, with twice the chance of success.) Instead, people test patches that are almost obviously likely to fail, mostly ones that remove knowledge.

Believe it or not, even Joerg, definitely for me the most thoughtful, original and open-minded of the team, has started removing knowledge and using CLOP values as of late!! Joerg, you know removing knowledge almost never works, unless it is removing unessential things like queen on 7th. Concerning CLOP, I very much respect Remi Coulom, and The Crazy Bishop was one of my favourite sparring engines in the past, but CLOP simply does not work. Does anyone still think CLOP is working? At STF all CLOP-tuned patches failed, either easily or even much more convincingly. All of them. A human assessment is far superior to that of an automatic tuning system.

I very much hope someone tries this idea of adding new features to Spacemask. SF simply needs it, just as it needs more imbalance eval, more closed-position eval, and more pawn specifications. It needs this because SF games say so. You cannot go around blindly fixing some imaginary problem; you first need some evidence of the problem before you start fixing it. And the evidence is in the games lost by SF. Please, look more carefully at them.

The nasty thing is that SF needs at least some 17 more elo for a reasonable update. And if those 17 elo are not achieved within the next month at worst, this will be the first time STF fails in more than a year. SF really needs some new ideas, based on sound chess knowledge.

Joerg, you are not going to let down the sensible approach, are you? It is more than obvious to any relatively good chess player that removing the trapped rook condition is going to fail. It is vital in the eval; queen on 7th is not at all. You can remove only things that have no real chess value, not important ones. SF very much needs further imbalance testing. Why not do a single patch for the Q vs 3 pieces imbalance, specifying within it the 3 respective imbalances with their respective values (probably after some testing of how they behave on their own)? Such a patch will have a much bigger chance of passing the test convincingly. I think this is the reasonable thing to do about rare eval elements such as this.

Arjun, you liked space eval, why not try the above suggestion? Just assign a bonus of 5-10cps for any pawn on the 5th rank, blocked by an enemy pawn, on files c-f. Very simple.

Sorry guys, I do not want to say anything or interfere where no one asks me, but, you need to add another 17 elo in less than a month's time in order not to let end user expectations down, which are very high. Too many people, me included, are impatiently waiting for a new official SF 5 release. :D Please do something about it.
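
The Spacemask suggestion above (a small bonus for any pawn on the 5th rank, blocked by an enemy pawn, on files c-f) can be sketched roughly as follows. This is only an illustration, not Stockfish code: the helper names `parse_board` and `blocked_fifth_rank_bonus` and the value `BONUS_CP = 7` are assumptions, the latter picked from inside the suggested 5-10 cp range. The second test position is the one after 26...b6 from a game posted later in this thread.

```python
BONUS_CP = 7  # hypothetical value inside the suggested 5-10 cp range


def parse_board(fen):
    """Return {(file, rank): piece} with file 0-7 (a-h) and rank 1-8."""
    board = {}
    rows = fen.split()[0].split("/")
    for row_index, row in enumerate(rows):
        rank = 8 - row_index  # FEN lists rank 8 first
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # run of empty squares
            else:
                board[(file, rank)] = ch
                file += 1
    return board


def blocked_fifth_rank_bonus(fen):
    """Bonus in centipawns from White's point of view."""
    board = parse_board(fen)
    score = 0
    for file in range(2, 6):  # files c, d, e, f
        # White pawn on the 5th rank with a black pawn directly in front of it.
        if board.get((file, 5)) == "P" and board.get((file, 6)) == "p":
            score += BONUS_CP
        # Black pawn on its own 5th rank (rank 4) blocked by a white pawn.
        if board.get((file, 4)) == "p" and board.get((file, 3)) == "P":
            score -= BONUS_CP
    return score


# Made-up position: white pawn d5 blocked by a black pawn on d6.
print(blocked_fifth_rank_bonus("4k3/8/3p4/3P4/8/8/8/4K3 w - - 0 1"))
# Game position: black pawn d4 blocked by the white pawn on d3.
print(blocked_fifth_rank_bonus(
    "4rr1k/2p1q1p1/1p1p1p1p/1Pn5/pRPpPPP1/Pn1P1BB1/4Q2P/5RK1 w - - 0 27"))
```

In a real engine such a term would have to be tuned and tested like any other; the sketch only pins down which pawns the suggestion is counting.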
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

StormDanger

Post by Lyudmil Tsvetkov »

Well, as you see, I have SF eval term naming as my second language.

From what I have learned from trying to read SF code (if I read it at all correctly), SF has a storm danger term that applies penalties to the king shelter for enemy storming pawns on the horizon. From what I have understood (please correct me if this is not so), SF considers only enemy storming pawns on ranks 5-7. However, enemy storming pawns on rank 4 are a very important element of storming pawn code. An engine needs such understanding in a variety of cases to be able to start storming the enemy king earlier. When you are already on the 4th rank, search might help you push such a pawn to the 5th and 6th ranks, but the question is how to get to the 4th rank in the first place, especially in more complex situations, when you have to push pawns from your own shelter.

I think adding storming pawns on the 4th rank to StormDanger is absolutely essential for an improved understanding of storming capabilities. As storming pawns are a very important eval element, a leading engine simply cannot do without knowledge of storming pawns on the 4th rank. So my suggestion would be: please, SF, do add penalties in StormDanger for storming pawns on the 4th rank, like g4, f4, h4, etc. The penalty might reasonably be lower than that for the 5th rank, but it should still exist. This might significantly change SF behaviour in a variety of complicated positions.

I will post some lost SF games on the theme later (I have observed a multitude of them where this missing knowledge hurts the engine), but for the time being here is just my simple suggestion, worded clearly: add 4th rank enemy pawns on the horizon to StormDanger. That way Marco and co have the opportunity to annihilate me, with words or otherwise, with good reason.
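
A minimal sketch of the suggestion above, not Stockfish's actual code: it assumes, as the post does, that only ranks 5-7 are scored, and the penalty tables, their values and the function name `storm_danger` are all made up for illustration. It shows that a table starting at the 5th rank gives no weight to pawns still on the 4th, and how adding a smaller 4th-rank entry changes the total.

```python
# Hypothetical penalties in centipawns, keyed by the storming pawn's rank
# counted from the attacker's own side. The numbers are illustrative only
# and are not taken from Stockfish.
STORM_PENALTY = {5: 20, 6: 40, 7: 60}

# Same table extended with a smaller penalty for the 4th rank, as proposed.
STORM_PENALTY_WITH_4TH = {4: 10, **STORM_PENALTY}


def storm_danger(attacker_pawn_ranks, table):
    """Sum the penalties for all storming pawns; unlisted ranks count as 0."""
    return sum(table.get(rank, 0) for rank in attacker_pawn_ranks)


pawns = [4, 4, 5]  # attacker pawns on its 4th, 4th and 5th ranks
print(storm_danger(pawns, STORM_PENALTY))           # only the 5th-rank pawn counts
print(storm_danger(pawns, STORM_PENALTY_WITH_4TH))  # the two 4th-rank pawns now count too
```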
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: StormDanger

Post by Lyudmil Tsvetkov »

Here is one game where SF underestimates enemy storming pawns on the 4th rank.

[pgn][MLNrOfMoves "95"]
[MLFlags "000100"]
[Event "4 Minutes/Game"]
[Site "2 SF 4 min gauntlet, HP-PC"]
[Date "2014.04.10"]
[Round "69.1"]
[White "Stockfish140407IPxx64"]
[Black "Houdini4Prox64"]
[Result "0-1"]

1. c4 e5 2. Nc3 Nf6 3. Nf3 Nc6 4. e3 Bb4 5. Qc2 d6 6. d3 O-O 7. Bd2 Re8 8. Be2
a5 9. O-O {+0.01/23 6s} 9... Bf5 {+0.01/20 5s} 10. Nh4 {+0.01/23 2s} 10... Bg4
{-0.03/19 6s (Bxc3)} 11. Bxg4 {+0.01/22 2s (Nf3)} 11... Nxg4 {-0.07/19 4s} 12.
Nf3 {+0.03/23 2s} 12... Bxc3 {-0.11/19 4s (Nf6)} 13. Bxc3 {+0.04/23 2s} 13...
Nf6 {-0.09/20 3s (b6)} 14. Rfd1 {+0.03/22 3s (a3)} 14... h6 {-0.07/19 5s (b6)}
15. b3 {+0.08/21 3s (h3)} 15... Nb8 {-0.04/19 12s (Nb4)} 16. d4
{+0.14/24 4s (a3)} 16... e4 {-0.10/20 5s} 17. Nd2 {+0.11/23 3s} 17... d5
{-0.11/20 3s} 18. Rab1 {+0.11/23 18s (cxd5)} 18... Nc6 {-0.21/19 9s} 19. a3
{+0.11/22 2s} 19... Qd6 {-0.19/19 2s (Qd7)} 20. Bb2 {+0.08/23 2s (cxd5)} 20...
Ne7 {-0.16/18 2s (Ng4)} 21. h3 {+0.09/21 4s} 21... Qe6 {-0.13/18 3s} 22. Rdc1
{+0.08/22 2s (cxd5)} 22... Nf5 {-0.11/18 6s (Qf5)} 23. Qd1 {+0.07/21 7s (a4)}
23... Rad8 {-0.10/18 13s (Nh4)} 24. a4 {+0.09/22 2s (cxd5)} 24... Nh4
{-0.06/18 10s (b6)} 25. Ba3 {+0.12/21 5s} 25... c6 {-0.04/17 4s (Qf5)} 26. Bc5
{+0.18/21 7s (Nf1)} 26... Rb8 {+0.06/18 8s (Qf5)} 27. Bb6 {+0.14/21 3s} 27...
Ra8 {+0.12/17 1s} 28. b4 {+0.20/21 2s (Bc7)} 28... axb4 {-0.31/16 2s} 29. Rxb4
{+0.26/21 2s} 29... dxc4 {-0.31/17 1s (Reb8)} 30. Nxc4 {0.00/24 5s (Rcxc4)}
30... Nd5 {-0.27/17 1s} 31. Rb2 {0.00/25 2s} 31... Qg6 {-0.27/16 0s} 32. Qg4
{0.00/26 2s} 32... Qxg4 {-0.26/19 1s} 33. hxg4 {0.00/26 2s} 33... Rxa4
{-0.24/18 0s} 34. Nd6 {0.00/25 4s (Bc5)} 34... Re6 {-0.27/18 1s} 35. Nxb7
{0.00/27 2s} 35... Rg6 {-0.28/17 0s} 36. g3 {0.00/26 2s (Nc5)} 36... Rxg4
{-0.39/19 1s} 37. Nd6 {0.00/26 2s (Nc5)} 37... Nb4 {-0.54/17 2s} 38. Rbb1
{-0.13/26 11s (Rcb1)} 38... Nd3 {-0.51/18 2s} 39. Ra1 {-0.10/25 1s} 39... Rb4
{-0.58/18 1s} 40. Rcb1 {-0.14/22 3s} 40... Nf3+ {-0.51/18 2s (Rxb1+)} 41. Kg2
{-0.14/24 1s} 41... Rxb1 {-0.50/18 2s} 42. Rxb1 {-0.11/23 1s} 42... Nde1+
{-0.56/18 1s} 43. Kh3 {-0.12/25 1s} 43... h5 {-0.56/17 0s} 44. Bc7
{-0.18/24 14s (Rd1)} 44... Ng5+ {-0.58/17 2s} 45. Kh2 {-0.18/1 0s} 45... Kh7
{-0.66/19 3s (Nd3)} 46. f4 {0.00/25 2s (Kg1)} 46... Ngf3+ {-1.08/18 1s} 47. Kh3
{0.00/27 1s} 47... Ng1+ {-1.08/17 0s} 48. Kh2 {0.00/1 0s} 48... Nef3+
{-1.05/20 1s (Ngf3+)} 49. Kg2 {-0.43/24 2s} 49... h4 {-1.05/19 0s} 50. Nxe4
{-0.51/26 2s} 50... f5 {-1.29/20 1s} 51. Rxg1 {-0.55/26 1s} 51... fxe4
{-1.29/18 0s} 52. Rh1 {-0.53/29 1s} 52... Rxg3+ {-1.46/20 1s} 53. Kf1
{-0.57/31 2s (Kf2)} 53... Kh6 {-1.80/17 1s (Rg4)} 54. f5 {-0.71/27 7s (Kf2)}
54... Rg4 {-1.86/16 0s} 55. Bd8 {-0.90/26 4s} 55... g6 {-1.68/19 1s (Kh7)} 56.
f6 {-0.90/27 1s} 56... g5 {-1.34/21 1s (Kh7)} 57. Kf2 {-0.65/27 3s (Bc7)} 57...
Kg6 {-1.61/23 1s} 58. Bc7 {-0.74/30 1s} 58... Kxf6 {-1.74/22 1s} 59. Bd6
{-0.74/32 0s (Rc1)} 59... Ke6 {-1.81/21 1s} 60. Bb8 {-0.59/26 1s} 60... Kd7
{-2.26/20 1s (Nd2)} 61. Rc1 {-0.93/28 2s} 61... h3 {-2.26/18 0s} 62. Bg3
{-1.37/28 4s} 62... Nh4 {-2.11/21 0s} 63. Rc5 {-1.37/29 0s} 63... Ke8
{-2.14/20 1s (Ng6)} 64. Re5+ {-1.29/22 1s} 64... Kf8 {-2.58/18 1s (Kf7)} 65.
Ra5 {-1.37/24 1s} 65... Kg7 {-2.58/16 0s (Kf7)} 66. Rc5 {-1.25/21 1s (Bxh4)}
66... Kf7 {-2.54/19 1s} 67. d5 {-1.37/22 1s} 67... h2 {-2.65/18 1s} 68. Bxh2
{-1.31/23 0s} 68... Rg2+ {-2.66/18 0s} 69. Kf1 {-1.39/21 0s} 69... cxd5
{-2.66/17 0s} 70. Bg1 {-1.39/23 0s (Be5)} 70... Rd2 {-2.71/20 1s} 71. Bf2
{-1.39/22 0s} 71... Nf3 {-2.71/18 0s (Rd3)} 72. Bg3 {-1.43/21 1s (Rc7+)} 72...
Ke6 {-3.04/20 1s} 73. Rc8 {-1.75/23 4s} 73... Kf5 {-3.15/18 1s (Ra2)} 74. Re8
{-1.46/20 0s} 74... Rd1+ {-3.30/20 0s (Kg4)} 75. Ke2 {-1.52/22 0s (Kf2)} 75...
Rc1 {-3.86/18 0s (Rb1)} 76. Bb8 {-1.55/21 0s} 76... Kg4 {-3.86/16 0s} 77. Kf2
{-1.62/22 0s (Re7)} 77... Rc2+ {-4.97/19 1s (Rb1)} 78. Kf1 {-1.62/1 0s} 78...
Nh4 {-5.01/17 0s (Rb2)} 79. Ba7 {-1.92/20 0s} 79... Kg3 {-5.01/16 0s} 80. Ke1
{-2.16/22 0s} 80... Ra2 {-5.15/20 1s (Nf3+)} 81. Bc5 {-2.19/21 0s} 81... Nf3+
{-5.15/18 0s} 82. Kd1 {-2.44/23 1s} 82... g4 {-5.15/17 0s} 83. Rg8
{-2.40/25 0s} 83... Kh3 {-5.15/16 0s} 84. Rh8+ {-2.63/24 1s} 84... Kg2
{-5.15/15 0s} 85. Rg8 {-2.63/25 0s} 85... g3 {-5.15/14 0s} 86. Bd6 {-2.62/24 0s}
86... Kf2 {-5.15/13 0s} 87. Bxg3+ {-3.07/24 2s} 87... Kxe3 {-5.16/12 0s} 88. Kc1
{-3.07/25 0s} 88... d4 {-5.16/11 0s} 89. Kb1 {-3.07/24 0s} 89... Ra6
{-7.47/17 0s (Ra4)} 90. Rg4 {-2.96/17 0s (Kb2)} 90... d3 {-8.00/16 0s (Kd3)} 91.
Bf4+ {-4.17/18 0s} 91... Kd4 {-8.00/15 0s (Kf2)} 92. Kb2 {-3.96/16 0s} 92... d2
{-7.94/14 0s (Ne5)} 93. Bxd2 {-3.94/17 0s} 93... Nxd2 {-7.94/13 0s} 94. Rg8
{-22.05/21 1s (Rg7)} 94... e3 {-7.77/12 0s} 95. Rd8+ {-22.09/26 0s} 95... Kc4
{-7.77/11 0s} 0-1
[/pgn]

[d]6k1/5pp1/1BpN4/7p/3Pp1r1/4PnPK/5P2/1R2n3 w - - 0 44
SF thinks it is almost perfectly equal; Houdini sees a half-pawn edge for black. SF severely underestimates the black h5 storming pawn. I think storm danger should not be switched off, even in the endgame (if this is an endgame).

[d]8/2B2ppk/2pN4/7p/3PpPr1/4PnPK/8/1R2n3 b - - 0 47
Still seeing perfectly equal score.

[d]8/2B2ppk/2pN4/8/3PpPrp/4PnP1/6K1/1R4n1 w - - 0 50
Only when h5-h4 is played does SF see that white is starting to lose. SF does not take the black h5 pawn into account in its eval, but it does count the h4 pawn. Would it not be easier and more precise to also score storming pawns on the 4th rank? I think it would add strength.

[d]8/2B3pk/2p5/8/3PpP1p/4Pnr1/8/5K1R b - - 0 53
White is totally lost.

Well, you can correct a problem that exists, not an imaginary one.
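
To make the h5/h4 point concrete, here is a rough sketch that finds enemy pawns on the defending king's file and the two adjacent files and reports their rank counted from the attacker's side. The function names and the `min_rank` parameter are assumptions made for this example, it is not how Stockfish is implemented, and it does not check whether the pawn is actually in front of the king. On the move-44 diagram the h5 pawn is on Black's 4th rank, so a term that starts at the 5th rank ignores it; on the move-50 diagram the same pawn, now on h4, is on the 5th.

```python
def parse_board(fen):
    """Return {(file, rank): piece} with file 0-7 (a-h) and rank 1-8."""
    board = {}
    for row_index, row in enumerate(fen.split()[0].split("/")):
        rank = 8 - row_index  # FEN lists rank 8 first
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)
            else:
                board[(file, rank)] = ch
                file += 1
    return board


def storming_pawns(fen, defender="w", min_rank=4):
    """List (square, relative_rank) for enemy pawns near the defender's king."""
    board = parse_board(fen)
    king = "K" if defender == "w" else "k"
    enemy_pawn = "p" if defender == "w" else "P"
    king_file = next(f for (f, r), piece in board.items() if piece == king)
    found = []
    for (file, rank), piece in sorted(board.items()):
        if piece != enemy_pawn or abs(file - king_file) > 1:
            continue
        # Rank counted from the attacker's own side of the board.
        relative = 9 - rank if enemy_pawn == "p" else rank
        if relative >= min_rank:
            found.append(("abcdefgh"[file] + str(rank), relative))
    return found


move44 = "6k1/5pp1/1BpN4/7p/3Pp1r1/4PnPK/5P2/1R2n3 w - - 0 44"
move50 = "8/2B2ppk/2pN4/8/3PpPrp/4PnP1/6K1/1R4n1 w - - 0 50"
print(storming_pawns(move44))              # h5 shows up once the 4th rank is included
print(storming_pawns(move44, min_rank=5))  # nothing if scoring starts at the 5th rank
print(storming_pawns(move50))              # h4 is on Black's 5th rank
```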
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: StormDanger

Post by Lyudmil Tsvetkov »

A very instructive game between SF and Houdini.

[pgn][MLNrOfMoves "95"]
[MLFlags "000100"]
[Event "4 Minutes/Game"]
[Site "2 SF 4 min gauntlet, HP-PC"]
[Date "2014.04.09"]
[Round "41.2"]
[White "Stockfish1404061153x64"]
[Black "Houdini4ProC0x64"]
[Result "0-1"]

1. c4 e5 2. Nc3 Nf6 3. Nf3 Nc6 4. e4 Bb4 5. d3 d6 6. a3 Bc5 7. b4 Bb6 8. Na4 O-O
9. Be2 {+0.04/21 4s} 9... Qe7 {-0.04/20 13s (Bg4)} 10. Bd2 {+0.03/21 4s (Bb2)}
10... Bg4 {-0.09/19 5s (Bd7)} 11. O-O {+0.15/20 4s (Rb1)} 11... Bxf3
{-0.06/19 4s (Bd7)} 12. Bxf3 {+0.06/21 7s} 12... Bd4 {-0.13/21 2s (Nd4)} 13. Rb1
{+0.37/20 7s (Nc3)} 13... a5 {-0.07/21 5s (a6)} 14. b5 {+0.18/23 3s (Nc3)}
14... Nd8 {-0.16/21 3s} 15. g3 {+0.18/23 2s (Be3)} 15... Ne6 {-0.24/17 4s} 16.
Nc3 {+0.15/20 2s (Bg2)} 16... a4 {-0.22/18 5s (Bc5)} 17. Rb4
{+0.18/21 11s (Bg2)} 17... h6 {-0.23/18 5s (Nc5)} 18. Bg2 {+0.24/19 6s (Kg2)}
18... Nc5 {-0.32/19 4s (Bc5)} 19. Be1 {+0.32/20 3s} 19... Nfd7
{-0.32/19 2s (Ne6)} 20. Ne2 {+0.30/25 10s (Nd5)} 20... Nb3 {-0.75/20 3s} 21.
Nxd4 {+0.36/24 2s} 21... exd4 {-0.74/18 0s (Nxd4)} 22. f4 {+0.36/23 2s} 22...
Ndc5 {-0.73/19 1s (Rfe8)} 23. g4 {+0.36/24 11s (Bf2)} 23... Kh8
{-0.71/18 9s (Rfe8)} 24. Bg3 {+0.36/24 2s} 24... Rae8 {-0.65/18 3s (Kg8)} 25.
Bf3 {+0.36/22 2s} 25... f6 {-0.55/18 15s (Kg8)} 26. Qe2 {+0.41/26 2s} 26... b6
{-0.50/18 3s} 27. Rd1 {+0.41/27 2s} 27... g6 {-0.55/18 1s (Kh7)} 28. f5
{+0.41/28 2s} 28... Qg7 {-0.43/18 12s (g5)} 29. Qf2 {+0.41/25 7s (Kh1)} 29...
h5 {-0.55/18 2s (g5)} 30. Kh1 {+0.48/22 5s (fxg6)} 30... hxg4 {-0.49/17 3s} 31.
Bxg4 {+0.38/25 10s} 31... Rg8 {-0.51/18 2s (g5)} 32. fxg6 {+0.37/23 2s (h3)}
32... Qxg6 {-0.62/16 0s} 33. Qf4 {+0.33/26 2s} 33... Qg5 {-0.73/19 2s (Re7)} 34.
Bf5 {+0.27/27 3s (Qxg5)} 34... Qxf4 {-0.75/19 1s (Qh5)} 35. Bxf4 {+0.23/28 1s}
35... Kg7 {-0.64/21 2s} 36. Kg2 {+0.20/27 1s} 36... Rh8 {-0.58/21 3s} 37. Kf3
{+0.05/26 5s (Bg3)} 37... Kf7 {-0.63/20 1s (Ra8)} 38. Bg3 {0.00/26 2s} 38... Rh5
{-0.62/20 1s (Ke7)} 39. Bg4 {0.00/27 2s} 39... Rh7 {-0.58/21 1s} 40. h3
{0.00/29 1s (Bf5)} 40... Rd8 {-0.55/19 1s (Ke7)} 41. Be1 {0.00/29 4s (Ke2)}
41... Rg8 {-0.62/20 1s (Re8)} 42. Bf2 {0.00/28 1s (Bg3)} 42... Ke7
{-0.56/19 4s} 43. Bg3 {0.00/30 1s} 43... Rh6 {-0.55/19 3s (Re8)} 44. Kg2
{0.00/31 1s (Ke2)} 44... Rhg6 {-0.55/20 1s (Rh7)} 45. Bf2 {0.00/27 1s (Kf3)}
45... Rxg4+ {-0.62/20 6s (Rh6)} 46. hxg4 {0.00/31 3s} 46... Rxg4+ {-0.59/19 0s}
47. Kf1 {0.00/30 1s (Kh3)} 47... Ke6 {-0.59/20 1s (Rg8)} 48. Ke2 {0.00/30 1s}
48... Rg8 {-0.70/20 1s} 49. Kf3 {0.00/32 1s} 49... f5 {-0.65/19 0s (Ke7)} 50.
exf5+ {-0.37/22 3s} 50... Kxf5 {-0.68/20 1s} 51. Be1 {-0.38/24 2s (Bg3)} 51...
Kg6 {-0.69/20 2s (Rh8)} 52. Bh4 {-0.38/23 1s (Bf2)} 52... Re8 {-0.74/20 1s} 53.
Bf2 {-0.38/26 1s} 53... Kf5 {-0.80/20 1s (Rf8+)} 54. Bg1 {-0.44/25 1s} 54...
Rf8 {-0.83/20 1s (Rh8)} 55. Bf2 {-0.44/25 1s} 55... Ke5+ {-0.88/19 3s (Rh8)} 56.
Kg3 {-0.38/25 2s} 56... Ke6 {-0.81/20 1s} 57. Re1+ {-0.44/26 2s} 57... Kd7
{-0.83/21 1s (Kf5)} 58. Rd1 {-1.04/23 4s} 58... Rf7 {-0.86/21 0s (Rg8+)} 59.
Kg2 {-1.04/23 1s} 59... Re7 {-0.86/21 1s} 60. Kf1 {-1.10/24 2s (Kf3)} 60... Kc8
{-0.93/21 1s (Re8)} 61. Bg1 {-1.10/25 1s} 61... Kb7 {-0.91/21 0s (Re5)} 62. Bf2
{-1.10/26 1s} 62... Re8 {-0.95/21 0s (Re5)} 63. Bg1 {-1.10/27 1s} 63... Rh8
{-1.01/21 1s (Kc8)} 64. Ke2 {-1.12/22 1s (Kg2)} 64... Rh3 {-1.24/19 0s} 65.
Rxb3 {-1.09/25 1s} 65... axb3 {-1.78/17 0s} 66. Bxd4 {-1.12/27 1s} 66... Rh2+
{-1.70/17 0s (Ne6)} 67. Ke3 {-1.09/24 1s} 67... b2 {-1.86/19 1s} 68. Rb1
{-1.06/24 0s} 68... Na4 {-1.86/18 0s (Rh3+)} 69. Ke4 {-1.98/27 1s} 69... Rc2
{-1.92/21 1s} 70. Bf6 {-1.98/28 0s (Ke3)} 70... Rc1 {-1.94/20 1s} 71. Rxb2
{-2.04/32 1s} 71... Nxb2 {-2.13/20 2s} 72. Bxb2 {-2.04/34 0s} 72... Rc2
{-2.02/19 0s} 73. Bf6 {-2.04/31 0s (Bd4)} 73... Ra2 {-2.01/20 0s} 74. Bh4
{-2.10/31 1s (a4)} 74... Rxa3 {-2.01/19 0s (Kc8)} 75. Bg5 {-2.11/33 0s (d4)}
75... Kc8 {-2.02/19 0s} 76. d4 {-2.11/32 0s (Bf6)} 76... Kd7 {-2.10/20 0s} 77.
Kd5 {-2.11/32 0s} 77... Rh3 {-2.13/20 1s (Ra2)} 78. Bd2 {-2.11/31 0s} 78... Rh5+
{-2.17/19 0s (Rh4)} 79. Ke4 {-2.53/32 3s} 79... Ke6 {-2.16/20 0s} 80. d5+
{-2.53/34 0s} 80... Kf6 {-2.18/21 0s} 81. Bc3+ {-2.53/35 0s (Kd4)} 81... Kg5
{-2.26/22 1s (Kg6)} 82. Bg7 {-2.53/36 0s (Be1)} 82... Rh4+ {-2.29/20 0s (Rh7)}
83. Kd3 {-2.53/37 0s} 83... Kf5 {-2.35/20 0s} 84. Ba1 {-4.58/36 3s (Bb2)} 84...
Rh3+ {-2.65/20 0s} 85. Kc2 {-4.58/37 0s (Kd2)} 85... Ke4 {-2.58/21 0s (Rh1)}
86. Bg7 {-4.58/36 0s (Bf6)} 86... Rg3 {-2.80/20 1s (Rh7)} 87. Bf6
{-4.58/34 0s (Bh8)} 87... Rf3 {-2.80/22 0s (Rg2+)} 88. Bd8 {-4.58/37 0s (Bg7)}
88... Kd4 {-3.57/21 0s (Rf7)} 89. Bxc7 {-4.07/27 0s} 89... Kc5
{-3.57/19 0s (Kxc4)} 90. Bd8 {-4.17/28 0s} 90... Rf8 {-4.37/18 0s (Kxc4)} 91.
Be7 {-4.67/24 0s} 91... Re8 {-4.37/16 0s} 92. Bh4 {-4.67/29 0s} 92... Kxc4
{-4.37/15 0s} 93. Kc1 {-4.74/30 0s (Bg3)} 93... Re5 {-4.95/19 0s (Rf8)} 94. Bf2
{-5.32/23 0s (Bg3)} 94... Kc3 {-6.03/16 0s (Kxb5)} 95. Bh4 {-6.24/27 0s} 95...
Rxd5 {-6.55/15 0s} 0-1
[/pgn]

[d]4rr1k/2p1q1p1/1p1p1p1p/1Pn5/pRPpPPP1/Pn1P1BB1/4Q2P/5RK1 w - - 0 27
Look carefully at the above position: another zero-mobility/trapped rook, on b4. Who says the problem with SF trapped rooks in the center of the board does not exist? It is so frequent, and it is a very real one; you just have to find the right implementation. Again, you can correct an issue that really exists, not one that cannot be traced as a pattern of behaviour in games.

However, this position is important and critical not so much because of the trapped rook as because of the (bad) use of storming pawns. SF has an excellent position here: in spite of the trapped rook, it has excellent storming pawns on the 4th rank ready to advance further, and by advancing them white should win, I think, but SF fails to see the right continuation.

It now plays 27.Rd1? What is this rook doing on d1, apart from defending a pawn that is already defended? The right plan was to play 27.f5, followed by h4, g5, etc. As a general rule, the faster you advance your storming pawns, the better, but SF is clumsy here. Does SF consider f5 a storming pawn, when its specification is that only pawns on the king's file and the 2 adjacent files are taken into account? With the black king on h8, f5 should not be considered a storming pawn, as the f file is not adjacent. But I think it is a very strong storming pawn, so maybe it makes sense to try extending the storming pawn/storm danger code to the 2 nearest files on each side of the king. Thus, the f file would be included here.

Louis will certainly say that f5 is bad, because of 27...g6 28.fg6 Qg7, but now white can play 29.Bf4 and follow with transferring its rook to the h file with a strong attack, as h6 is very vulnerable. I do not think black could defend in this case. SF however does not see this continuation. It is surprising how little engines see when their eval lets them down. They are fully lost in the immeasurable depth of lines.

[d]4rr1k/2p1q3/1p1p1ppp/1Pn5/pRPpPPP1/Pn1P1BB1/4Q2P/3R2K1 w - - 0 28
Houdini plays g6 itself. Now f4-f5 is not that efficient.

[d]4r1rk/2p5/1p1p1pq1/1Pn5/pRPpP1B1/Pn1P2B1/5Q1P/3R3K w - - 0 33
Nothing left of the white advantage, no storming pawns, but SF still does not realise it is worse because its rook on b4 is trapped.

Again my suggestion: extend storm danger with values for storming pawns on the 4th rank and, possibly, try extending it, with lower values of course, beyond the 2 adjacent files to the 2 files further out. I.e., with Kg8 the e file would also be included, so the considered files would be h, g, f, e; with Kh8 they would be h, g and f; and with Ke8 they would be g, f, e, d, c. (I hope someone tries it. There were so many developers testing patches a while ago; where have they all gone? When I go to STF, I see only: Pending - 0 games, 0.0 hrs. It is kind of offending. :( )
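
The file window described above (the king's file plus up to 2 files on each side, cut off at the edge of the board) is easy to pin down in code. A minimal sketch; the function name `storm_files` and the `reach` parameter are made up for this example:

```python
FILES = "abcdefgh"


def storm_files(king_file, reach=2):
    """Files within `reach` of the king's file, clipped to the board."""
    k = FILES.index(king_file)
    return FILES[max(0, k - reach):min(7, k + reach) + 1]


for king_file in "ghe":
    print(king_file, "->", storm_files(king_file))
print("g with reach 1 ->", storm_files("g", reach=1))
```

With `reach=2` this reproduces the three cases listed: e-h for a king on the g file, f-h for the h file, and c-g for the e file. With `reach=1` it gives the king's file and its immediate neighbours only.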

I think that is very important. Sophisticated storm danger eval is probably one of the 4 or 5 weightiest terms, almost on a par with piece king attack. Does anyone know how much elo piece king attack is worth in SF, and how much storm danger? I think the storm danger elo contribution should not be very much lower; if it is, then the storm danger code is only very basic, not sophisticated, and should be improved. As a matter of fact, SF plays well enough with its attacking pieces, but shakily with its storming pawns. Maybe someone is going to try to extend storm danger, because here is a major problem that loses a lot of elo.

Please, look very carefully at the posted games again.
Uri Blass
Posts: 10410
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: What to do about this?

Post by Uri Blass »

Lyudmil Tsvetkov wrote:
Thanks Louis, but I do not believe SF.
In the actual game, some 30+ plies on, it still showed a very significant white advantage.
This is how search without good eval becomes meaningless: neither SF nor Houdini sees a forced winning line for black, since in the best of cases there is a long series of checks, but the win is there.

Something should be done about SF eval in the above position, but I am still not certain what exactly, apart from further raising the passer bonus for the side with more pieces.
I believe Stockfish, because the score went up; earlier it was lower.
I guess that Stockfish saw some tactics that it did not see in the game (in the game it did not see a 2.75 pawn advantage).
Uri Blass
Posts: 10410
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Spacemask 2

Post by Uri Blass »

I do not plan to try to submit new patches, because I learned that people who submit patches and have a history of some productive patches can later be blamed for wasting resources.

I think that doing so is a very bad idea for development, and it clearly discourages people from submitting patches.

My opinion is that Stockfish could earn more elo if there was a rule that you never blame people who submit patches for wasting resources.

My opinion is that you could get significantly more testers and significantly more people who submit patches with better rules that allow testing everything.

For more testers:
Looking at the almost empty queue, it seems that Stockfish does not need more testers, and it certainly does not encourage more people to give computer time.

For more people who submit patches:
I think that there is a psychological advantage in allowing people to submit patches for tests that are not useful, because it can encourage the same people to also submit patches that are useful.

There are 3 possible simple ways to treat people who also submit patches that are not useful (or patches at a very long time control, like the 5 minutes + 5 seconds time control that I did not try to test).

1) Allowing it at low priority, with a rule that forbids complaining about it (if testers do not like to test it, they can wait for tests with higher priority and not spend computer time on these tests).
2) Not allowing it in the first place, and not complaining that the people who submit these patches waste computer resources, because the patches are never tested in the framework.
3) Allowing it and later complaining that the people waste computer resources.

I think that 1 is the best and 2 is better than 3,
but from experience the Stockfish team chooses 3, which is the worst option.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: StormDanger

Post by Lyudmil Tsvetkov »

Another very instructive game against Houdini.

[pgn][MLNrOfMoves "59"]
[MLFlags "000100"]
[Event "4 Minutes/Game"]
[Site "2 SF 4 min gauntlet, HP-PC"]
[Date "2014.04.09"]
[Round "42.1"]
[White "Houdini4Prox64"]
[Black "Stockfish140407IPxx64"]
[Result "1-0"]

1. e4 c5 2. Nf3 Nc6 3. Bb5 e6 4. Nc3 Nge7 5. O-O a6 6. Bxc6 Nxc6 7. d4 cxd4 8.
Nxd4 Qc7 9. Nxc6 {+0.17/20 19s} 9... dxc6 {+0.07/21 6s (bxc6)} 10. Qh5
{+0.24/19 7s (Qg4)} 10... b5 {0.00/21 10s} 11. Rd1 {+0.35/19 6s} 11... Be7
{+0.13/23 4s (c5)} 12. e5 {+0.50/18 2s} 12... Bb7 {+0.10/23 2s} 13. Ne4
{+0.46/19 5s} 13... O-O {+0.11/23 6s} 14. Rd3 {+1.10/17 5s (Nd6)} 14... g6
{+0.15/22 7s} 15. Qh6 {+1.10/15 0s (Qf3)} 15... Qxe5 {+0.35/24 3s} 16. Bg5
{+0.60/18 2s} 16... Bxg5 {+0.28/26 2s} 17. Nxg5 {+0.60/16 0s} 17... Qh8
{+0.22/25 2s} 18. Rh3 {+0.61/18 2s (Rd7)} 18... Rfd8 {+0.15/25 12s} 19. Nxh7
{+0.61/18 2s (c3)} 19... Rd6 {+0.12/23 4s} 20. Rf1 {+0.35/18 5s (g4)} 20... Rad8
{+0.28/25 5s} 21. g4 {+0.46/19 6s} 21... Qg7 {+0.22/25 2s} 22. Qh4
{+0.28/18 6s} 22... g5 {+0.14/25 12s} 23. Nxg5 {+0.36/14 5s (Qxg5)} 23... c5
{+0.17/23 6s} 24. f3 {+0.39/18 9s} 24... c4 {+0.17/24 2s (Rd1)} 25. c3
{+0.31/17 6s} 25... Bc6 {+0.23/22 6s} 26. Ne4 {+0.24/17 1s (a3)} 26... Bxe4
{+0.21/23 4s} 27. fxe4 {+0.25/15 0s} 27... b4 {+0.12/23 7s (Rd1)} 28. Rf2
{+0.17/17 2s (Kh1)} 28... bxc3 {+0.19/23 5s} 29. bxc3 {+0.07/18 3s} 29... Rd1+
{+0.17/23 5s (Rd3)} 30. Kg2 {+0.21/17 1s} 30... R8d3 {+0.15/24 1s (R1d3)} 31.
Rb2 {+0.34/19 4s} 31... Rd8 {+0.19/23 3s} 32. a3 {+0.29/18 6s (a4)} 32... R1d6
{+0.19/25 14s (Rf8)} 33. a4 {+0.29/18 1s} 33... R6d7 {+0.19/25 1s (Rd1)} 34. Rb1
{+0.22/17 4s (Kg1)} 34... Rd6 {+0.19/22 3s (Rd2+)} 35. Kg1 {+0.24/18 2s} 35...
Rd2 {+0.19/24 2s} 36. Kh1 {+0.23/19 1s (Rb2)} 36... R2d7 {+0.23/22 2s} 37. a5
{+0.13/19 2s (g5)} 37... Rd2 {+0.18/23 1s} 38. Kg1 {+0.16/19 2s} 38... R2d7
{+0.17/22 3s} 39. Rb2 {+0.16/18 8s} 39... Re8 {+0.09/22 1s (Rf8)} 40. g5
{+0.19/18 6s (Rf3)} 40... Red8 {+0.27/22 3s (Rd3)} 41. Rf3 {+0.24/17 3s (g6)}
41... Rd1+ {+0.24/21 2s} 42. Kg2 {+0.29/19 1s (Kf2)} 42... R1d2+ {+0.12/22 4s}
43. Rxd2 {+0.31/18 1s} 43... Rxd2+ {+0.14/24 2s} 44. Kf1 {+0.29/19 1s} 44...
Qe5 {+0.14/25 1s} 45. g6 {+0.33/19 1s} 45... fxg6 {+0.14/24 1s} 46. Qf6
{+0.33/19 1s (Rg3)} 46... Qxf6 {+0.41/24 1s} 47. Rxf6 {+0.32/18 0s} 47... Rxh2
{+0.41/24 1s (Ra2)} 48. Rxe6 {+0.27/17 1s} 48... g5 {+0.44/26 1s (Kf7)} 49. Rxa6
{+0.56/19 2s} 49... Kg7 {+0.73/25 6s (Kf7)} 50. e5 {+1.08/17 2s (Ra7+)} 50...
Rh4 {+0.83/23 4s (Kf7)} 51. Kf2 {+1.59/17 1s} 51... Rh2+ {+0.96/25 2s} 52. Ke3
{+1.84/18 2s} 52... Rc2 {+0.90/27 1s (g4)} 53. Ra8 {+2.47/16 1s (Kd4)} 53...
Rxc3+ {+2.21/24 8s (g4)} 54. Kd4 {+2.94/15 0s (Ke4)} 54... Rd3+
{+2.46/24 3s (Ra3)} 55. Kxc4 {+2.90/14 0s (Ke4)} 55... Re3 {+2.75/23 1s} 56. Kd4
{+2.89/13 0s} 56... Rg3 {+3.45/24 3s (Re1)} 57. a6 {+3.78/12 0s} 57... Ra3
{+4.65/23 2s (Rg1)} 58. e6 {+3.96/11 0s (Kc4)} 58... Ra4+ {+6.72/17 1s (Kf6)}
59. Ke5 {+7.12/10 0s (Kd5)} 59... Ra5+ {+7.36/20 1s} 60. Kd6 {+7.60/9 0s} 1-0
[/pgn]

[d]r4rk1/1bq1bppp/p1p1p3/1p2P2Q/4N3/8/PPP2PPP/R1BR2K1 w - - 0 14
Houdini sees here a very big white advantage, SF thinks it is almost equal. Of course, Houdini is right. Interesting how search does not help when evaluation is wrong.

Is not the e5 pawn due a storm danger penalty? I think it is due, but SF will not consider it. Is not this the main reason why SF evaluates the position wrongly? I think it is. e5 plays an active role in attacking the black king, and should receive a bonus. So please, try extending storm danger one file further. Someone said that only pawns on adjacent files are storming pawns. Maybe 99% of engines implement it like that, but this is only very basic. And once something is implemented in 99% of engines' code, no one wants to reimplement it in a better way. They will use CLOP to tune values for existing terms, but never introduce a new reasonable term.

Someone says that SF has already achieved a level of play which is difficult to improve further. I say no, quite the opposite - SF is a very weak engine, and it looks strong only because other engines are even weaker. It is fully possible to add a further 500 Elo to SF's strength. And quite easily. But people stick to the routine, and that routine already has its limits.

Arjun, what do you say about leaving CLOP aside for a moment and trying to extend storm danger to the 4th rank and to pawns 2 files apart from the king? This is the promising patch that is going to work, but of course, no one will try this.
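
To show what the file extension changes in the first diagram above, here is a small sketch that lists the white pawns counted as storming the black king for a given file window. It is not Stockfish code: the function names, the `max_file_dist` parameter and the `min_rank` cutoff (only pawns on the 4th rank or further are counted) are assumptions made for illustration.

```python
FEN = "r4rk1/1bq1bppp/p1p1p3/1p2P2Q/4N3/8/PPP2PPP/R1BR2K1 w - - 0 14"


def parse_board(fen):
    """Return {square: piece letter} from the piece-placement field of a FEN."""
    board = {}
    placement = fen.split()[0]
    for row, rank_str in enumerate(placement.split("/")):
        rank = 8 - row  # FEN lists rank 8 first
        file_idx = 0
        for ch in rank_str:
            if ch.isdigit():
                file_idx += int(ch)  # a digit is a run of empty squares
            else:
                board["abcdefgh"[file_idx] + str(rank)] = ch
                file_idx += 1
    return board


def storming_pawns(fen, max_file_dist=1, min_rank=4):
    """White pawns within max_file_dist files of the black king.

    min_rank is an illustrative cutoff (an assumption, not an engine value):
    pawns still behind it are not treated as part of a storm.
    """
    board = parse_board(fen)
    king_sq = next(sq for sq, piece in board.items() if piece == "k")
    king_file = ord(king_sq[0]) - ord("a")
    found = []
    for sq, piece in board.items():
        if piece != "P" or int(sq[1]) < min_rank:
            continue
        if abs(ord(sq[0]) - ord("a") - king_file) <= max_file_dist:
            found.append(sq)
    return sorted(found)


print(storming_pawns(FEN, max_file_dist=1))  # adjacent files only: []
print(storming_pawns(FEN, max_file_dist=2))  # one file further: ['e5']
```

With the black king on g8, the adjacent-file window covers only the f, g and h files, so e5 is ignored; widening the window by one file picks it up.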

Or alternatively, I ask you: what should be done so that SF sees a nice white advantage in the position above?

[d]r4rk1/1bq1bp1p/p1p1p1pQ/1p2P3/4N3/3R4/PPP2PPP/R1B3K1 b - - 0 15
How can you say that e5 is not storming the black king? Nf6 is a threat, and the e5 pawn could realistically become an f6 pawn.

[d]r5kq/1b3p1N/p1prp1pQ/1p6/8/7R/PPP2PPP/R5K1 w - - 0 20
The storming pawn is gone, but leaves white with a nice attack. It is possible that Houdini missed some even stronger continuation.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: What to do about this?

Post by Lyudmil Tsvetkov »

Uri Blass wrote:
Lyudmil Tsvetkov wrote:
zullil wrote:
zullil wrote:
zullil wrote:
Lyudmil Tsvetkov wrote: [d]8/1Q6/3ppk1p/6r1/3p4/P3b1PK/1P5P/8 w - - 0 49
Both engines see 1 to 2 full pawns white advantage here for the next 20 moves, when suddenly, first SF, and then Houdini, understand that black is winning.

In any case, it seems black has a decisive advantage in terms of eval on the above diagram.
After a long search, the latest Stockfish sure isn't seeing that Black has an advantage:

info depth 48 seldepth 90 score cp 275 nodes 120428243868 nps 23691600 time 5083162 multipv 1 pv b7a6 g5d5 a6f1 d5f5 f1d3 f5c5 h3g2 d6d5 a3a4 h6h5 d3h7 c5c1 g2h3 c1a1 b2b3 f6e5 a4a5 a1a3 h7h5 e5d6 h5e8 a3b3 e8d8 d6c6 d8e7 c6b5 e7a7 b5c6 a5a6 b3b6 a7f7 b6a6 f7e6 c6b7 e6d5 b7c7 h3g4 a6d6 d5c5 c7d7 c5b5 d7e7 g4f3 e3h6 b5d3 h6g7 h2h4 d6e6 f3f4 e6f6 f4g4 f6e6 d3h7 e7f7 g4g5 e6d6

info depth 49 seldepth 90 score cp 313 nodes 278416677366 nps 24538113 time 11346295 multipv 1 pv b7e4 g5e5 e4d3 e5c5 h3g2 h6h5 a3a4 c5a5 b2b4 a5a4 d3b1 a4a8 b4b5 a8b8 b5b6 d6d5 b1f1 f6e5 f1a6 d4d3 a6d3 e3b6 d3f3 e5d6 f3f4 e6e5 f4f6 d6c5 f6e5 b8b7 e5h5 b6c7 h5e2 c7d6 e2c2 c5d4 h2h4 b7b8 c2f2 d4e5 f2e2 e5f6 g3g4 d6e5 g4g5 f6f5 e2c2 f5e6 c2g6 e6e7 g6d3

info depth 50 seldepth 102 score cp 275 nodes 652676948978 nps 25448132 time 25647342 multipv 1 pv b7e4 g5e5 e4c2 e5c5 c2e2 c5a5 e2d3 a5c5 d3e2
Thanks Louis, but I do not believe SF.
In the actual game, some 30+ plies on, it still showed a very significant white advantage.
This shows how search without a good eval is meaningless: neither SF nor Houdini sees a forced winning line for black, since in the best of cases there is a long series of checks, but the win is there.

Something should be done about SF eval in the above position, but I am still not certain what exactly, apart from further raising the passer bonus for the side with more pieces.
I believe Stockfish because the score went up; earlier it was lower.
I guess that Stockfish saw some tactics that it did not see in the game (in the game it did not see a 2.75-pawn advantage).
With those scores I cannot trust SF.
I am almost convinced black wins here, but someone has to analyse it.
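
For anyone who wants to compare the quoted engine output line by line, here is a rough sketch of a parser for UCI `info` lines. The function name `parse_info` and the returned dictionary layout are my own assumptions; the sample lines are the ones quoted above, with the pv cut to its first two moves to keep the example short. In UCI the `time` field is in milliseconds and `score cp` is in centipawns.

```python
INT_FIELDS = {"depth", "seldepth", "nodes", "nps", "time", "multipv"}


def parse_info(line):
    """Parse one UCI 'info' line into a dict (hypothetical helper, not engine code)."""
    tokens = line.split()
    out = {}
    i = 1 if tokens and tokens[0] == "info" else 0
    while i < len(tokens):
        key = tokens[i]
        if key == "pv":
            out["pv"] = tokens[i + 1:]  # everything after 'pv' is the move list
            break
        if key == "score":
            out["score_kind"] = tokens[i + 1]  # 'cp' or 'mate'
            out["score"] = int(tokens[i + 2])
            i += 3
        elif key in INT_FIELDS:
            out[key] = int(tokens[i + 1])
            i += 2
        else:
            i += 1  # skip tokens this sketch does not handle
    return out


# Lines from the quoted search, pv truncated to two moves.
LINES = [
    "info depth 48 seldepth 90 score cp 275 nodes 120428243868 nps 23691600 time 5083162 multipv 1 pv b7a6 g5d5",
    "info depth 49 seldepth 90 score cp 313 nodes 278416677366 nps 24538113 time 11346295 multipv 1 pv b7e4 g5e5",
    "info depth 50 seldepth 102 score cp 275 nodes 652676948978 nps 25448132 time 25647342 multipv 1 pv b7e4 g5e5",
]

for info in map(parse_info, LINES):
    hours = info["time"] / 3_600_000  # milliseconds to hours
    pawns = info["score"] / 100       # centipawns to pawns
    print(f"depth {info['depth']}: {pawns:+.2f} after {hours:.2f} h, first move {info['pv'][0]}")
```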