mar wrote:Thanks for reply
the PV prior to depth 14 is
d6d7 g8f7 d4f3 a8f3 a3a4 e8e7 d7d8 e7e8 d8d7, score cp 0
(no change since depth 9)
depth 14:
d6d7 e8f8 c7c8q f8c8 d4f3 c8c5 f3d4 c5c4 d7e6 a8b7 e6e5 c4c5 e5d6 c5a5 a3c3, score cp -255
1-meg version:
depth 9-11
d6d7 g8f7 d4f3 a8f3 a3a4 e8e7 d7d8 e7e8 d8d7, score cp 0
depth 12:
d6d7 e8f8 a3a4 f3f2 a4a1 a8g2 d4e6 g2h3 a1f1 h3f1 e6f8 f1b5 d7e7, score cp -240
I don't know if this reveals something (except score 0 due to repetition; I consider 1-fold repetition as a draw).
I get the same PVs as the 1-meg version (for depth 9-11) even if I disable TT stores (=> no TT).
I also think that your problem is TT and not search.
Regarding the PVs given above, there is not much unusual in my opinion:
- With low depth the engine does not see any better reply to 1.Kd7 than forcing repetition since the winning strategy for black is far beyond its horizon (advancing both g+h pawns and the king and finally winning with BPP vs RP). So the score of 0 seems to be ok for me.
- With higher depth the engine sees a refutation of 1.Kd7. (Q: Does it see that 1.e4 is better?) Since you wrote "at depth 14 it realises it's a losing move" I assume that the second PV of both engine versions is not the PV of the root node but the PV of 1.Kd7 being refuted. If this is the case then you don't get the *best* refutation but just one that is sufficient for 1.Kd7 failing low. So this "PV" will sometimes look surprising by containing some non-optimal moves of the opponent. I would not print a PV unless its score is above alpha, just to avoid confusion.
You may be right, though, when stating that the version with much more TT memory should not find the refutation two plies later than the other version.
The next step could be to confirm by testing that the TT implementation has a bug. I think your 40 games match is a hint into this direction but the number of games may be too small indeed. I would play a gauntlet of the version with most TT memory against different versions with less memory, including also a version with TT disabled, and using many games with very short time control (e.g. game/4 sec if this is possible for you). If the outcome is the opposite of the normal outcome, like this:
1. <version with TT disabled>
2. <1 MB version>
3. <4 MB version>
4. <16 MB version>
5. <64 MB version>
6. <256 MB version>
then I think the case is clear: more influence of TT means more influence of TT bugs.
If the outcome is much different from that then anything is possible, including that you might have no problem at all ...
Sven