Leto wrote: Don't know if this is relevant, but in both screenshots Stockfish shows mate in 4.
If there is a quick mate, Stockfish does not need much time per iteration, so it reaches a high depth quickly. This is why I suspect that the crash occurs when Stockfish reaches the maximum iteration depth during pondering.
(I know that my private engine does not deal very well with this situation. Once it reaches max depth, it restarts the search from iteration 1 and completely spams the GUI until the opponent makes a move. It does not crash, but that is more a question of luck. Of course if it did crash, I would have had more reason to fix it...)
Aser Huerga wrote: Ponder ON = since engines have no psychological capabilities and no foresight, ponder on only introduces some degree of random bias. It makes no sense to emulate human conditions; engines aren't humans.
Ponder OFF = a more precise and reproducible way of testing engines.
That is my honest opinion.
This does not mean Ingo's work is invalid, but I think it would be more precise with ponder off, even with the same number of games (ponder off allows twice the number of games to be played in the same period of time, hence more accurate results).
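To put a rough number on "twice the games, hence more accurate": the sketch below assumes independent games and treats each game as a win/draw/loss trial, which is a simplification and not anyone's actual rating-list method. The function name `score_standard_error` and its parameters are made up for illustration. Under those assumptions the standard error of the measured score shrinks with the square root of the number of games, so doubling the games narrows the error by a factor of about 0.71, not by half.

```python
import math


def score_standard_error(n_games, score=0.5, draw_rate=0.0):
    """Standard error of the mean score over n_games independent games.

    score     : expected score per game (win = 1, draw = 0.5, loss = 0)
    draw_rate : fraction of games that end in a draw
    Per-game variance for a win/draw/loss trial is score*(1-score) - draw_rate/4.
    """
    variance = score * (1.0 - score) - draw_rate / 4.0
    return math.sqrt(variance / n_games)


# Ratio of the error with twice as many games to the error with the original count.
ratio = score_standard_error(2000) / score_standard_error(1000)
print(round(ratio, 3))
```

Whether that gain outweighs testing under different conditions than tournament play is exactly the disagreement in this thread; the arithmetic only covers the statistical side.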
I was hoping for some kind of insight into Ponder ON testing. What you say is basically what I thought, but maybe extensive testing has revealed that some engines benefit more than others because they guess the move played more often. It'd be interesting to see those stats.
The best idea is to simply test the way real games are played. Every tournament known to man has two clocks and pondering is a part of the game. You could make the same argument for max_threads = 1 that you do for ponder = off. But you will produce a skewed result compared to real tournament games...
syzygy wrote:PERHAPS Stockfish crashes if it reaches maximum depth while pondering (i.e. maximum number of iterations). I did not test this.
Don't know. It was 127 plies in the case below, but only 70 in the example before, though there were only 3 pieces there ... I don't know.
But it is not related to a low number of pieces on the board:
It could still be that the engine reached 127 and crashed before iteration 71 reached the GUI...
In any case there seems to be a relation with reaching very high depths.
Engines should be debugged for such problems by reducing the maximum depth, in a special debug mode, to a very low number (16, for instance) so that those situations are forced to occur very often. That applies to anything that happens rarely.
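A minimal sketch of that idea, written in Python and not taken from any engine's source: `MAX_PLY`, `DEBUG_MAX_PLY`, `search_iteration` and `iterate` are hypothetical names, and the search itself is a stub. It only shows an iterative-deepening loop whose depth cap can be lowered so that the "ran out of iterations while pondering" path is exercised on nearly every move, and one plausible way to handle it: wait for the GUI instead of restarting or falling off the end of the loop.

```python
import threading

MAX_PLY = 127        # assumed normal cap, matching the depth reported in this thread
DEBUG_MAX_PLY = 16   # hypothetical low cap for a special debug mode


def search_iteration(depth):
    """Stub standing in for one real search iteration (hypothetical)."""
    return f"info depth {depth}"


def iterate(stop_event, pondering, debug=False, wait_timeout=None):
    """Run iterative deepening up to the cap and return the lines produced.

    stop_event models the GUI sending 'stop' or 'ponderhit'.
    If the cap is reached while pondering, block on stop_event instead of
    restarting from iteration 1 or returning a move too early.
    """
    cap = DEBUG_MAX_PLY if debug else MAX_PLY
    output = []
    for depth in range(1, cap + 1):
        if stop_event.is_set():
            break
        output.append(search_iteration(depth))
    if pondering and not stop_event.is_set():
        # Cap reached during pondering: idle quietly until told to stop.
        stop_event.wait(wait_timeout)
    return output


# Debug run: the cap is hit after 16 iterations, then the loop waits briefly.
lines = iterate(threading.Event(), pondering=True, debug=True, wait_timeout=0.01)
print(len(lines), lines[-1])
```

With the low cap, a bug in the waiting branch shows up within seconds of pondering on any position, instead of only in quick-mate positions where depth 127 is reached.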
By far most games were adjudicated by the GUI.
Overnight I had 5 GUI crashes, which means that without restarting, a Ponder ON tourney like mine would get stuck sooner or later (at least if the GUI does not restart automatically).
Only 66 games were played out to mate, and only 36 succeeded. 26 crashed - that is a 40% crash rate ...
I never had any problems with Stockfish 6!
Bye
Ingo
PS: Unfortunately there is no easy channel (email) to report bugs to the SF team! So I have to do it here and hope for the best.
bob wrote:
The best idea is to simply test the way real games are played. Every tournament known to man has two clocks and pondering is a part of the game. You could make the same argument for max_threads = 1 that you do for ponder = off. But you will produce a skewed result compared to real tournament games...
Exactly my point. Ponder off was a compromise for engine games when you had limited resources years ago (which is no problem nowadays) but it is not the normal way to play chess. That is why I consider Ponder OFF games as a sub group of "real", "full" or "name it as you want" chess.
I know it would belong in the other section, but I don't want to start another thread:
You are very close to your goal for a new release. If the current SF were a final release, the TOP of my list (if someone cares) would look like this:
Aser Huerga wrote: Ponder ON = since engines have no psychological capabilities and no foresight, ponder on only introduces some degree of random bias. It makes no sense to emulate human conditions; engines aren't humans.
Ponder OFF = a more precise and reproducible way of testing engines.
That is my honest opinion.
This does not mean Ingo's work is invalid, but I think it would be more precise with ponder off, even with the same number of games (ponder off allows twice the number of games to be played in the same period of time, hence more accurate results).
I was hoping for some kind of insight into Ponder ON testing. What you say is basically what I thought, but maybe extensive testing has revealed that some engines benefit more than others because they guess the move played more often. It'd be interesting to see those stats.
The best idea is to simply test the way real games are played. Every tournament known to man has two clocks and pondering is a part of the game. You could make the same argument for max_threads = 1 that you do for ponder = off. But you will produce a skewed result compared to real tournament games…
I will ask you the same question, then: how would you label Ponder=Off games? Unreal? I'm really curious.
I don't see the point in insisting on a name for it, other than to distinguish it.
I don't mind the label/name, but it is not played the way chess is intended to be played! It is a subgroup of real/normal chess, as it was invented to cope with limited resources.
Again, if you like it - fine. It's just not my type of game.
Last edited by IWB on Mon Nov 02, 2015 10:36 am, edited 1 time in total.