gladius wrote:
Thanks for tracking this down. It is indeed a time pressure blunder. If SF has <= 5ms to play a move, it plays instantly, before even searching. So it just plays a random move.
It appears to be deterministic, not random. Meaning I invoked the same search conditions five times and only got f3.
Also, the behavior seems the same for btime = 7msec
Stockfish sig-4728533 64 SSE4.2 by Tord Romstad, Marco Costalba and Joona Kiiski
position fen 8/5p2/3k2p1/3P3p/3RPp1P/4r3/5K2/8 b - - 2 52
isready
readyok
go btime 7
info nodes 2 time 1
bestmove f4f3 ponder (none)
I think this, or a similar but unrelated bug, also affects how Stockfish 4 ponders. Sometimes SF4 ponders a nonsensical move: the position is equal one move earlier (say 0.22), then SF4 picks a ponder move and evaluates the position at -4.77. Of course the opponent plays a normal move and the evaluation goes back to normal (0.2x), but SF4 has wasted precious time pondering a nonsensical move. Whether this happens because SF4 did not have enough time on the previous move, or because the main line before pondering had only a move (unreliably) reconstructed from hash, I do not know. Unfortunately I did not save the evals and ponder moves, but I have seen it a couple of times, so it should not be long before it happens again.
zullil wrote:
gladius wrote:
Thanks for tracking this down. It is indeed a time pressure blunder. If SF has <= 5ms to play a move, it plays instantly, before even searching. So it just plays a random move.
It appears to be deterministic, not random. Meaning I invoked the same search conditions five times and only got f3.
Also, the behavior seems the same for btime = 7msec
Stockfish sig-4728533 64 SSE4.2 by Tord Romstad, Marco Costalba and Joona Kiiski
position fen 8/5p2/3k2p1/3P3p/3RPp1P/4r3/5K2/8 b - - 2 52
isready
readyok
go btime 7
info nodes 2 time 1
bestmove f4f3 ponder (none)
By random, I meant that it's based on the order the moves are generated in, it just picks the first one generated.
And yes, actually, looking at the code, up to 10ms will cause the issue.
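To make the "first move generated" point concrete, here is a rough sketch of that kind of emergency shortcut. It is not Stockfish's actual code: the function name `choose_move`, the constant `EMERGENCY_THRESHOLD_MS`, the `search` callback, and the move ordering in the usage lines are all made up for illustration. Only the 10 ms figure and the move f4f3 come from this thread.

```python
from typing import Callable, List, Optional

# Hypothetical cutoff taken from the "up to 10ms" remark in the thread;
# the real engine's constant and its exact comparison are not shown here.
EMERGENCY_THRESHOLD_MS = 10


def choose_move(
    legal_moves: List[str],
    time_left_ms: int,
    search: Callable[[List[str]], str],
) -> Optional[str]:
    """Return a move, skipping the search entirely when time is nearly gone."""
    if not legal_moves:
        return None  # no legal move (mate or stalemate)
    if time_left_ms <= EMERGENCY_THRESHOLD_MS:
        # No search at all: take whatever the move generator produced first.
        # This is deterministic for a given position, but not a "chosen" move.
        return legal_moves[0]
    return search(legal_moves)


# Illustrative move order only; the real generation order may differ.
moves = ["f4f3", "e3a3", "d6e5"]
pick_last = lambda ms: ms[-1]  # stand-in for a real search

results = {choose_move(moves, 7, pick_last) for _ in range(5)}
print(results)                              # same move on every run
print(choose_move(moves, 1000, pick_last))  # with enough time, search decides
```

This matches what zullil saw: the result is repeatable because the generation order is fixed for a given position, so "random" here really means "unsearched".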
gladius wrote:
Thanks for tracking this down. It is indeed a time pressure blunder. If SF has <= 5ms to play a move, it plays instantly, before even searching. So it just plays a random move.
It appears to be deterministic, not random. Meaning I invoked the same search conditions five times and only got f3.
Also, the behavior seems the same for btime = 7msec
Stockfish sig-4728533 64 SSE4.2 by Tord Romstad, Marco Costalba and Joona Kiiski
position fen 8/5p2/3k2p1/3P3p/3RPp1P/4r3/5K2/8 b - - 2 52
isready
readyok
go btime 7
info nodes 2 time 1
bestmove f4f3 ponder (none)
Another way to look at it: a large number of very fast games is a reasonable way to judge overall playing strength, but it does not make sense to judge the quality of play from such games. Maybe even an engine needs a bit more time.
I think many other engines would make similar mistakes under the same conditions, but Evgeni has somehow focused his attention too much on Stockfish.
Sorry all, I have to admit that some of my data about the strength of SF 2.3 with Mobility=200 is rotten, and some of the data about SF 2.3 Coward as well. It seems that at some point (maybe after I installed Arena-Blitzer a week ago) the settings I had altered were erased and reset to the defaults, but with Threads=2 (I don't remember whether Threads=2 is the default in SF 2.3). This pretty much explains the fantastic results of these "personalities". Sorry, guys. I am now erasing all the phony games from my database, and I hope to restart all the testing from the very beginning.