- Making the engine play arbitrarily weaker without limiting the search depth, with the goal of making it a more interesting sparring partner for human opponents of diverse strength.
- Making the engine less deterministic during self-play so that a wider variety of positions is encountered. This could be helpful when using self-play to generate data for the tuner.
Code: Select all
//Scoring Root Moves with a random bonus: https://www.chessprogramming.org/Ronald_de_Man
int bonus = _rng.Next(_rootRandomness); //random number in [0, _rootRandomness), i.e. the upper bound is exclusive
int score = bonus - EvaluateTT(1, depth - 1, bonus - MAX_BETA, bonus - alpha, moveGen);
In the second implementation the bonus for each root move is taken from a table that is filled once, so it stays the same across all iterations of the search:
Code: Select all
//Scoring Root Moves with a random bonus: https://www.chessprogramming.org/Ronald_de_Man
int bonus = RootBonus[playState.PlayedMoves - 1];
int score = bonus - EvaluateTT(1, depth - 1, bonus - MAX_BETA, bonus - alpha, moveGen);
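In both variants the bonus is folded into the search window instead of being added after the search returns. A minimal sketch of why the child's window is shifted by the bonus, written in Java rather than the C# above; `childSearch` is a hypothetical stand-in for the recursive call that simply returns exact scores:

```java
public class BonusWindow {
    // Stand-in for the recursive call (EvaluateTT above): returns an exact score
    // for a child position regardless of the window. A real fail-hard search is
    // only exact inside its window, which is exactly why the window must be
    // shifted together with the score.
    static int childSearch(int node, int alpha, int beta) {
        return (node * 37) % 100 - 50; // arbitrary deterministic toy eval
    }

    // Ordinary negamax root score: negate the child's score.
    static int plainScore(int node, int alpha, int beta) {
        return -childSearch(node, -beta, -alpha);
    }

    // Root score with a bonus folded in: shifting the child's window by the
    // bonus keeps "bonus - childScore" consistent with the original
    // (alpha, beta) window at the root.
    static int biasedScore(int node, int alpha, int beta, int bonus) {
        return bonus - childSearch(node, bonus - beta, bonus - alpha);
    }

    public static void main(String[] args) {
        int alpha = -30, beta = 40;
        for (int node = 0; node < 5; node++) {
            int plain = plainScore(node, alpha, beta);
            int biased = biasedScore(node, alpha, beta, 17);
            System.out.println(node + ": plain=" + plain + " biased=" + biased);
        }
    }
}
```

Since the score returned to the root is `bonus - childScore`, passing `bonus - MAX_BETA` and `bonus - alpha` as the child's bounds is what keeps the shifted result inside the root's window.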
But surprisingly (to me), when you pit both implementations against each other with the same range of randomness, the first version plays much stronger! For example, with the random bonus set to [0..50] centipawns the first implementation played ~50 Elo stronger. I would have intuitively expected the opposite and at first assumed a bug or an error in my testing, but I found nothing.
How can changing the bonus of the moves each search iteration be beneficial to the quality of the final search result? Any ideas?
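To make the comparison concrete, the two schemes differ only in when the bonus is sampled. A Java sketch (class and method names are illustrative, not from the engine): the first scheme re-rolls a bonus every time a root move is scored, the second draws one bonus per root move up front and reuses it at every depth:

```java
public class BonusSchemes {
    // Scheme of the first snippet: a fresh bonus each time a root move is scored,
    // so the same move can receive a different bonus at every iteration.
    static int freshBonus(java.util.Random rng, int range) {
        return rng.nextInt(range); // in [0, range)
    }

    // Scheme of the second snippet: bonuses drawn once, then reused unchanged.
    static int[] fixedBonuses(java.util.Random rng, int range, int moves) {
        int[] bonus = new int[moves];
        for (int i = 0; i < moves; i++) bonus[i] = rng.nextInt(range);
        return bonus;
    }

    public static void main(String[] args) {
        final int RANGE = 50; // bonus range as in the post
        java.util.Random rng = new java.util.Random(123);
        int[] fixed = fixedBonuses(rng, RANGE, 3);
        for (int depth = 1; depth <= 3; depth++)
            for (int move = 0; move < 3; move++)
                System.out.printf("depth=%d move=%d fresh=%d fixed=%d%n",
                        depth, move, freshBonus(rng, RANGE), fixed[move]);
    }
}
```

With the first scheme a root move's effective score jitters between iterations, so the root move ordering can change from one iteration to the next; with the second scheme the ordering perturbation is frozen for the whole search.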