Unit testing eval & search

Post by **benvining** » Wed Jun 11, 2025 2:55 am

I've got lots of unit tests for my board representation, move generation, and FEN/PGN handling, but I'm wondering if and how other engine authors test evaluation and search. So far I only have some very basic tests for each: for eval I'm just checking that it scores mate and stalemate correctly, and for search I have a few basic test positions. I know about SPRT, but I think the problem there is that it requires a working baseline engine to compare against. As a first-time engine author, I'd appreciate any tips on testing my eval and search algorithms while I'm still implementing them.

Post by **op12no2** » Wed Jun 11, 2025 2:07 pm

Your baseline is the code you have now. Make a single tweak on top of that and you can SPRT the before and after versions. Or maybe strip things back to Negamax and a simple eval and start again, using SPRT as you go; it's a quick process early on.
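For readers who have not seen what an SPRT verdict is computed from, here is a minimal sketch of the log-likelihood ratio using a simple normal approximation over win/draw/loss counts. The function names, the Elo bounds, and the 0.05 error rates are illustrative assumptions; real match runners may use more refined models, so treat this as a way to understand the idea rather than a replacement for a proper tool.

```python
import math

def elo_to_score(elo: float) -> float:
    """Expected score for a given Elo difference (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def sprt_llr(wins: int, draws: int, losses: int, elo0: float, elo1: float) -> float:
    """Approximate log-likelihood ratio of H1 (elo1) versus H0 (elo0)."""
    n = wins + draws + losses
    if n == 0:
        return 0.0
    score = (wins + 0.5 * draws) / n              # mean result per game
    var = (wins + 0.25 * draws) / n - score ** 2  # per-game variance
    if var <= 0.0:
        return 0.0                                # not enough variety in results yet
    p0, p1 = elo_to_score(elo0), elo_to_score(elo1)
    return n * (p1 - p0) * (2.0 * score - p0 - p1) / (2.0 * var)

def sprt_bounds(alpha: float = 0.05, beta: float = 0.05) -> tuple:
    """Lower and upper stopping thresholds for the LLR."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)

def sprt_status(wins, draws, losses, elo0=0.0, elo1=5.0, alpha=0.05, beta=0.05) -> str:
    """Report whether the test can stop, and with which verdict."""
    lower, upper = sprt_bounds(alpha, beta)
    llr = sprt_llr(wins, draws, losses, elo0, elo1)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"

if __name__ == "__main__":
    # Hypothetical match result of the tweaked version against the baseline.
    print(sprt_status(wins=60, draws=10, losses=30))
```

The test keeps running games until the ratio crosses one of the two bounds, which is why it tends to finish quickly when a change is clearly good or clearly bad.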

For tweaks that should not functionally change eval/search, like most optimisations, you can use a bench command; it's extremely useful. Essentially, run a search over N positions to depth M (I use 12) and report the total number of nodes and the nps. If the bench node count is unchanged before and after the tweak and nps has not crashed, you're (very probably) good. I think I stole my positions from Ethereal. It's also very useful as a paranoia check when you make any change at all.
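A bench along these lines could be sketched as follows. This is only an illustration: `run_bench` and `toy_search` are hypothetical names, and `toy_search` is a stand-in that counts nodes in a small subtraction game so the example runs on its own. In a real engine it would be replaced by the engine's fixed-depth search returning its node count, and the positions would be FEN strings.

```python
import time
from typing import Callable, Iterable, Tuple

def run_bench(positions: Iterable, search: Callable, depth: int = 12) -> Tuple[int, int]:
    """Search every position to a fixed depth; return (total_nodes, nps)."""
    start = time.perf_counter()
    total_nodes = sum(search(pos, depth) for pos in positions)
    elapsed = time.perf_counter() - start
    nps = int(total_nodes / elapsed) if elapsed > 0 else 0
    return total_nodes, nps

def toy_search(stones: int, depth: int) -> int:
    """Stand-in for an engine search: counts nodes in a take-1-to-3 stones game."""
    nodes = 1  # count the current node
    if depth == 0 or stones == 0:
        return nodes
    for take in (1, 2, 3):
        if take <= stones:
            nodes += toy_search(stones - take, depth - 1)
    return nodes

if __name__ == "__main__":
    # Hypothetical "positions"; a real bench would iterate over a fixed FEN list.
    nodes, nps = run_bench([5, 7, 9], toy_search, depth=4)
    print(f"{nodes} nodes {nps} nps")
```

The node total is the signature to compare before and after a non-functional change. The nps figure varies from run to run, so it is only worth reacting to when it drops sharply.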

Assuming you already have perft, SPRT and bench are really all you need, with the occasional SPSA run as an option.

Edit: Ah sorry, I just reread your post and maybe you don't have a working engine. In which case make that top priority and then SPRT/bench from there, removing stuff if needed to get you going. You only need to implement a minimal set of UCI commands for SPRT.
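As a rough illustration of how small that set can be, the sketch below handles `uci`, `isready`, `ucinewgame`, `position`, `go` and `quit`, which are the commands a match runner typically sends. The `uci_loop` function, the `choose_move` callback and the engine name are hypothetical; a real engine would parse the position and time controls properly and read from stdin rather than a list.

```python
from typing import Callable, Iterable, List

def uci_loop(lines: Iterable[str], choose_move: Callable[[str, List[str]], str]) -> List[str]:
    """Process UCI input lines and return the engine's replies in order."""
    out: List[str] = []
    position = "startpos"  # raw text after the "position" keyword
    for raw in lines:
        tokens = raw.split()
        if not tokens:
            continue
        cmd = tokens[0]
        if cmd == "uci":
            out += ["id name SketchEngine", "id author unknown", "uciok"]
        elif cmd == "isready":
            out.append("readyok")
        elif cmd == "ucinewgame":
            position = "startpos"  # a real engine would also clear hash tables here
        elif cmd == "position":
            position = " ".join(tokens[1:])
        elif cmd == "go":
            # tokens[1:] carries depth / wtime / btime etc. for the search to interpret
            out.append("bestmove " + choose_move(position, tokens[1:]))
        elif cmd == "quit":
            break
        # Unknown commands are ignored.
    return out

if __name__ == "__main__":
    script = ["uci", "isready", "position startpos moves e2e4", "go depth 1", "quit"]
    # The lambda is a placeholder for the engine's real search.
    for reply in uci_loop(script, lambda pos, args: "e7e5"):
        print(reply)
```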

PS: Re your other question about Zobrists, if you calculate them on invocation make sure your random function always generates the same set of values so that bench is stable. You can then also SPRT the quality of your randoms.
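One way to get a stable set of keys is to generate them from a private, explicitly seeded generator instead of the global one. The sketch below assumes a layout of 12 piece types by 64 squares plus side-to-move, castling and en passant keys; the function name, the layout and the seed value are all illustrative choices, not anything prescribed.

```python
import random

def make_zobrist_keys(seed: int = 0x5EED) -> dict:
    """Build Zobrist keys from a fixed seed so every run produces the same table."""
    rng = random.Random(seed)  # private generator, unaffected by other random calls
    return {
        "piece_square": [[rng.getrandbits(64) for _ in range(64)] for _ in range(12)],
        "side_to_move": rng.getrandbits(64),
        "castling": [rng.getrandbits(64) for _ in range(16)],
        "en_passant_file": [rng.getrandbits(64) for _ in range(8)],
    }

if __name__ == "__main__":
    a = make_zobrist_keys()
    b = make_zobrist_keys()
    print(a == b)  # identical tables on every invocation
```

Trying a different seed then becomes a one-line change that can be tested like any other tweak.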

Post by **op12no2** » Wed Jun 11, 2025 2:25 pm

Ethereal's bench positions - https://github.com/AndyGrant/Ethereal/b ... /bench.csv

Post by **benvining** » Thu Jun 12, 2025 12:29 am

A bench command sounds like a great idea, thanks for the tip.

I've just taken my first stab at a transposition table, which has finally gotten me to a point where my search algorithm seems mostly correct and fast enough to actually be usable. My next step is probably iterative deepening. But maybe I'm ready to try SPRT now...
