An ELO drop caused by futility pruning margin when switching to a different evaluation function

Post by nkg114mc » Thu Jun 25, 2020 9:16 pm

Hi all,

Recently I have been experimenting with the Senpai v1 engine, replacing its original evaluation function with Stockfish's evaluation function, just out of curiosity about how much ELO I could gain.

I use a time control of 40 moves in 5 minutes and play 400 games in each experiment. The original Senpai outperforms the baseline engine by around 80 ELO: if the baseline engine’s ELO is 0, then Senpai is at around 80. However, when I switched to the Stockfish evaluation, the engine strength surprisingly dropped to around 400 ELO below the baseline, which was a very confusing result.

After debugging for a while, I realized that the problem comes from the "delta pruning" that uses the "futility pruning" margin. The main reason is that the Stockfish evaluation uses a range of (-31000, 31000), while the original Senpai uses (-9000, 9000), which is considerably smaller. For futility pruning, Senpai uses a margin (val in the code):


val = eval(sl) + depth * 40 + 50;
Here sl stands for search_local; you can treat it as a wrapper around a board position. A move is pruned if it satisfies the following condition:


if (use_fp && 
    move::is_tactical(mv) && 
    !move::is_check(mv, bd) && 
    val + move::see_max(mv) <= alpha) { // delta pruning
As a result, since alpha becomes larger than it was in the original Senpai, the condition `val + see_max(mv) <= alpha` is satisfied much more easily with the Stockfish evaluation than with the Senpai evaluation. Delta pruning therefore fires far more aggressively than before and hurts playing strength. After turning delta pruning off, the engine recovers to around 160 ELO above the baseline engine.
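To make the effect concrete, here is a minimal sketch of the pruning test with an optional scale factor. Only the formula `eval + depth * 40 + 50` and the comparison `val + see_max <= alpha` come from the code above. The function names `futility_margin` and `delta_prune`, the `scale` parameter, the knight value of 325, and the factor of 2 between the two evaluation scales are all assumptions chosen for illustration. It also assumes the SEE values stay in Senpai's units after the evaluation is swapped. This is not Senpai's actual code.

```cpp
// Hypothetical helpers, not part of Senpai.

// Futility margin in the evaluator's units.
// scale = 1.0 reproduces the constants from the post (depth * 40 + 50).
inline int futility_margin(int depth, double scale = 1.0) {
    return static_cast<int>((depth * 40 + 50) * scale);
}

// Returns true when a tactical, non-checking move would be delta-pruned.
// static_eval and alpha are in the evaluator's units; see_max is assumed to
// be in the original (Senpai) units and is rescaled together with the margin.
inline bool delta_prune(int static_eval, int depth, int see_max, int alpha,
                        double scale = 1.0) {
    int val = static_eval + futility_margin(depth, scale);
    int gain = static_cast<int>(see_max * scale);
    return val + gain <= alpha;
}
```

For example, take a capture that can win a knight (325 in the assumed original units) with a static eval of 0 at depth 0. With alpha = 300, `delta_prune(0, 0, 325, 300)` is false, so the move is searched. If the same gap is expressed on a scale twice as large (alpha = 600) while the margin and SEE value stay unscaled, `delta_prune(0, 0, 325, 600)` is true and the move is wrongly pruned. Passing `scale = 2.0` makes it false again. A tuned margin might follow the same idea, but the right factor would have to be found by testing.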

In summary, with a time control of 40 moves / 300 secs:
Baseline engine: 0 ELO
Senpai_searcher + Senpai_eval: +80 ELO +/- 15
Senpai_searcher + Stock_eval + original_futility: -400 ELO +/- 15
Senpai_searcher + Stock_eval + no_futility: +160 ELO +/- 15
Senpai_searcher + Stock_eval + tuned_futility: +170 ELO +/- 15
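To read these differences as match scores, the standard logistic Elo formula can be used. The sketch below is only a convenience for interpreting the numbers, and the function name `expected_score` is made up for illustration.

```cpp
#include <cmath>

// Expected score (between 0 and 1) for an engine rated elo_diff points
// above its opponent, using the standard logistic Elo model.
inline double expected_score(double elo_diff) {
    return 1.0 / (1.0 + std::pow(10.0, -elo_diff / 400.0));
}
```

Under this model, -400 ELO corresponds to an expected score of 1/11, roughly 9% of the points against the baseline, while +160 ELO corresponds to roughly 71.5%.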

BTW, one more finding from this experiment: the g++/clang++ compiler option "-fomit-frame-pointer" gave me some mysterious runtime errors (the program even behaved differently under different OSes). It is probably better to avoid this optimization option. This also confused me for a long time :(
