Unit testing eval & search
Moderator: Ras
-
- Posts: 22
- Joined: Fri May 30, 2025 10:18 pm
- Full name: Ben Vining
Unit testing eval & search
I've got lots of unit tests for my board representation, move gen, and FEN/PGN handling, but I'm wondering if/how other engine authors handle testing evaluation & search. So far I've got some really basic tests for each: for eval I'm basically just checking that it scores mate & stalemate correctly, and for search I've got a few basic test positions. I know about SPRT, but I think the problem there is that it requires a baseline working engine to compare against, so especially as a first-time engine author, I'm wondering if anyone has any tips to help test my eval/search algorithms as I'm implementing them.
-
- Posts: 547
- Joined: Tue Feb 04, 2014 12:25 pm
- Location: Gower, Wales
- Full name: Colin Jenkins
Re: Unit testing eval & search
Your baseline is the code you have now. Make a single tweak on top of that and you can SPRT the before and after versions. Or maybe strip things back to Negamax and a simple eval and start again, using SPRT as you go; it's a quick process early on.
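For reference, the "stripped back" starting point mentioned above can be very small. Here is a minimal sketch of plain Negamax over a toy game tree, assuming nested lists stand in for positions and integer leaves stand in for a simple eval scored from the point of view of the side to move at that leaf. The tree encoding is an illustration only, not how a real engine represents positions.

```python
def negamax(node):
    """Return the best score for the side to move at `node`.

    `node` is either an int (a leaf, already scored from the point of
    view of the side to move there) or a list of child nodes.
    """
    if isinstance(node, int):
        return node
    # A child's score is from the opponent's point of view, so negate it.
    return max(-negamax(child) for child in node)


# Two-ply toy tree: the root side picks a branch, the opponent replies.
toy_tree = [[3, 5], [2, 9]]
best = negamax(toy_tree)
```

With no pruning and no move ordering there is very little to get wrong, which is what makes it a reasonable first baseline to SPRT against.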
For tweaks that should not functionally change eval/search - like most optimisations - you can use a bench command - it's extremely useful. Essentially, run a search over N positions to depth M (I use 12) and report the total number of nodes searched and the nodes per second (nps). If the bench nodes value is unchanged pre/post tweak and nps has not crashed, you're (very probably) good. I think I stole my positions from Ethereal. It's also very useful as a paranoia check when you make any change at all.
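A rough sketch of what such a bench command might look like, assuming the engine exposes a search function that takes a FEN and a depth and returns the number of nodes it visited. The names `run_bench` and `toy_search` are hypothetical, and `toy_search` is only a deterministic stand-in so the example runs on its own.

```python
import time
from typing import Callable, Iterable, Tuple


def run_bench(search: Callable[[str, int], int],
              positions: Iterable[str],
              depth: int = 12) -> Tuple[int, float]:
    """Search each position to a fixed depth; return (total_nodes, nps)."""
    total_nodes = 0
    start = time.perf_counter()
    for fen in positions:
        total_nodes += search(fen, depth)  # search returns nodes visited
    elapsed = time.perf_counter() - start
    nps = total_nodes / elapsed if elapsed > 0 else 0.0
    return total_nodes, nps


def toy_search(fen: str, depth: int) -> int:
    """Stand-in for a real search: node count of a uniform toy tree."""
    branching = 2 + len(fen) % 3
    return sum(branching ** d for d in range(depth + 1))


if __name__ == "__main__":
    positions = [
        "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    ]
    nodes, nps = run_bench(toy_search, positions, depth=4)
    print(f"{nodes} nodes {nps:.0f} nps")
```

The node total is the number to compare before and after a non-functional change; nps depends on the machine and load, so it is only a rough sanity check.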
Assuming you already have PERFT, SPRT and bench are all you really need, with SPSA as an optional, infrequent extra.
Edit: Ah sorry, I just reread your post and maybe you don't have a working engine. In which case make that top priority and then SPRT/bench from there, removing stuff if needed to get you going. You only need to implement a minimal set of UCI commands for SPRT.
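As a sketch of how small that UCI subset can be, the handler below covers `uci`, `isready`, `ucinewgame`, `position` and `go`, with `quit` handled by the read loop. It assumes a hypothetical `choose_move` callback that wraps the real search; the engine name, the `state` dict and the function names are placeholders, and a real engine would also parse the position and the time controls passed to `go`.

```python
import sys


def handle_command(line, state, choose_move):
    """Return the list of reply lines for one line of UCI input."""
    tokens = line.split()
    if not tokens:
        return []
    cmd = tokens[0]
    if cmd == "uci":
        return ["id name SketchEngine", "id author unknown", "uciok"]
    if cmd == "isready":
        return ["readyok"]
    if cmd == "ucinewgame":
        state["position"] = ["startpos"]
        return []
    if cmd == "position":
        # Stored unparsed here; a real engine would set up its board.
        state["position"] = tokens[1:]
        return []
    if cmd == "go":
        return ["bestmove " + choose_move(state["position"], tokens[1:])]
    return []  # unknown commands are ignored


def main(choose_move):
    """Read UCI commands from stdin until 'quit' or end of input."""
    state = {"position": ["startpos"]}
    for line in sys.stdin:
        if line.strip() == "quit":
            break
        for reply in handle_command(line, state, choose_move):
            print(reply, flush=True)
```

Calling `main(my_choose_move)` starts the loop. Keeping `handle_command` free of I/O also means the protocol handling can be unit tested without spawning a process.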
PS: Re your other question about Zobrists, if you calculate the keys at startup, make sure your random function always generates the same set of values (for example by using a fixed seed) so that bench is stable. You can then also SPRT the quality of your randoms.
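One way to get a stable key set is a private generator with a fixed seed, sketched below. The seed value, the function name and the table layout (12 piece types by 64 squares, one side-to-move key, 16 castling-rights keys, 8 en passant file keys) are assumptions for illustration; engines lay these tables out in different ways.

```python
import random


def make_zobrist_keys(seed: int = 0x5EED):
    """Build Zobrist tables from a fixed seed so every run is identical."""
    rng = random.Random(seed)  # private generator, unaffected by global state
    pieces = [[rng.getrandbits(64) for _ in range(64)] for _ in range(12)]
    side_to_move = rng.getrandbits(64)
    castling = [rng.getrandbits(64) for _ in range(16)]
    en_passant_file = [rng.getrandbits(64) for _ in range(8)]
    return pieces, side_to_move, castling, en_passant_file
```

Changing the seed gives a different key set, which is the knob you would vary if you wanted to compare key quality.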
Last edited by op12no2 on Wed Jun 11, 2025 2:32 pm, edited 4 times in total.
-
- Posts: 547
- Joined: Tue Feb 04, 2014 12:25 pm
- Location: Gower, Wales
- Full name: Colin Jenkins
Re: Unit testing eval & search
Ethereal's bench positions - https://github.com/AndyGrant/Ethereal/b ... /bench.csv
-
- Posts: 22
- Joined: Fri May 30, 2025 10:18 pm
- Full name: Ben Vining
Re: Unit testing eval & search
A bench command sounds like a great idea, thanks for the tip.
I've just taken my first stab at a transposition table, which has finally gotten me to a point where my search algorithm seems mostly correct & fast enough to actually be usable. My next step is probably iterative deepening. But maybe I'm ready to try SPRT now...