syzygy wrote:Yes, and then it still makes the "unnatural" move. To avoid that, the engine will somehow have to decide that it does not want the TB win: "My position is so good that there must be a win that does not involve sacking my queen". That should be possible to implement, but of course it will then fail to win won games in certain circumstances.
I don't think it has to be as bad as that. You can still do it in a way that guarantees the win:
In the root, decide which moves preserve the forced win. This can be done by never allowing alpha in the root to rise above the score for a tablebase win. Moves that fail low compared to this threshold are removed from the move list. Then redo the search using only the moves that preserve the win, with probing of all 'inferior' tables disabled.
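To make the two passes concrete, here is a rough Python sketch. It is only an illustration under assumptions: `search(move, alpha, beta, probe_inferior)` is a hypothetical hook standing in for the engine's real search, `TB_WIN` is a made-up score for a tablebase win, and the null window just below `TB_WIN` is one simple way of keeping alpha from rising above the win score. Falling back to all moves when none preserves the win is also my assumption.

```python
from typing import Callable, List, Optional, Sequence

INF = 1_000_000
TB_WIN = 10_000  # hypothetical score the engine assigns to a tablebase win

# Hypothetical engine hook: search(move, alpha, beta, probe_inferior) returns the
# score of the position after `move`, from the root side's point of view.
SearchFn = Callable[[str, int, int, bool], int]


def winning_root_moves(moves: Sequence[str], search: SearchFn) -> List[str]:
    """Pass 1: keep the root moves that still score as a tablebase win."""
    keep = []
    for move in moves:
        # Window pinned just below TB_WIN, so alpha never rises above the win
        # score and every winning move is recognised, not only the first one.
        score = search(move, TB_WIN - 1, TB_WIN, True)
        if score >= TB_WIN:
            keep.append(move)
    return keep


def pick_natural_move(moves: Sequence[str], search: SearchFn) -> Optional[str]:
    """Pass 2: ordinary search over the surviving moves, inferior tables off."""
    # Assumption: if no move preserves the win, search all moves as usual.
    candidates = winning_root_moves(moves, search) or list(moves)
    best_move, alpha = None, -INF
    for move in candidates:
        score = search(move, alpha, INF, False)
        if score > alpha:
            best_move, alpha = move, score
    return best_move
```

Because the second pass only sees moves that were already proven to win, whatever it picks keeps the win, even though it no longer sees the inferior tables.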
The worst thing that can happen then is that it gets very reluctant to win in pathological cases (e.g. where the sacrifice is really the only way to win, because the superfluous piece was trapped anyway). It would then muddle on for a long time without sacrificing, until the 50-move barrier forces it to make the conversion. Arguably even this should count as 'natural play', as it is exactly what a human player would do if he were not bright enough to realize that the better material, against normal expectation, could not do the job.
It would still make the sacrifice early when the forced conversion was in danger of evaporating (i.e. none of the other moves can be proven to be a win). But in this case it could be seen as a brilliant sacrifice. (E.g. it could be that you sac a Queen, and some moves later gain it back, or get a new one through promotion, and only win the inferior end-game because of that.)
You could also do it the other way around: after you have established that the root contains a forced win, first search with probing of the inferior EGTs disabled. Each time a move beats alpha you repeat the search with probing now enabled. If the move returns a tablebase win, you are done with this iteration. If not, the move is not acceptable (at this depth), and you continue the iteration with the next move. So at ply level 1 any fail low would trigger a verification search with all probing on to prove it fails 'very low' (i.e. a tablebase loss), and if it doesn't, you return beta instead of the fail-low score.
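Seen from the root, this second scheme might look roughly like the Python sketch below. Again this is only an illustration: `search(move, alpha, beta, probe_inferior)` is a hypothetical stand-in for the real search, `TB_WIN` is a made-up tablebase-win score, and doing the verification with a null window just below `TB_WIN` is my assumption, not something the scheme requires.

```python
from typing import Callable, Optional, Sequence

INF = 1_000_000
TB_WIN = 10_000  # hypothetical score the engine assigns to a tablebase win

# Hypothetical engine hook: search(move, alpha, beta, probe_inferior) returns the
# score of the position after `move`, from the root side's point of view.
SearchFn = Callable[[str, int, int, bool], int]


def first_verified_win(moves: Sequence[str], search: SearchFn,
                       alpha: int = -INF) -> Optional[str]:
    """One root iteration: normal search first, verification only on demand."""
    for move in moves:
        # Normal search, with probing of the inferior tables switched off.
        score = search(move, alpha, INF, False)
        if score <= alpha:
            continue  # does not beat alpha, so no verification is needed
        # The move beats alpha: verify with all probing on that it still wins.
        if search(move, TB_WIN - 1, TB_WIN, True) >= TB_WIN:
            return move  # verified tablebase win, done with this iteration
        # Not a proven win at this depth: reject it and try the next move.
    return None  # no move could be verified at this depth
```

Note that the sacrifice is still found when it is the only move that verifies, because the more 'natural' moves are rejected one by one before it.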