Testing strength

eligolf
Posts: 114
Joined: Sat Nov 14, 2020 12:49 pm
Full name: Elias Nilsson

Testing strength

Post by eligolf »

Hi,

I find it really hard to test my engine's strength and to tell whether a change improves the engine or not. The main issue is that it takes a lot of time to play hundreds of games at longer time controls (3+2 or something like that). I assume strength varies with the time control: some engines are better at 1+0 and some are better at longer ones like 10+10.
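To put a rough number on why hundreds of games are needed, here is a minimal sketch that turns a win/draw/loss count into an Elo difference with an approximate 95% interval. It assumes the usual logistic Elo model and a normal approximation for the mean score; the function names and the game counts in the usage note are made up for illustration.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a mean score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) using a normal approximation of the mean score.

    z=1.96 corresponds to a ~95% interval. This is only an approximation and
    needs a mixed result (not all wins or all losses) to be meaningful.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    # Clamp so the logarithm stays defined at the extremes.
    low = elo_from_score(max(score - z * se, 1e-9))
    high = elo_from_score(min(score + z * se, 1.0 - 1e-9))
    return elo_from_score(score), low, high

if __name__ == "__main__":
    for w, d, l in [(40, 30, 30), (400, 300, 300)]:
        elo, low, high = elo_with_margin(w, d, l)
        print(f"{w + d + l} games: {elo:+.1f} Elo, interval [{low:+.1f}, {high:+.1f}]")
```

With these made-up counts the score is 55% in both cases, but the interval from 100 games still includes zero while the interval from 1000 games does not.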

Also, is playing games to a fixed depth a good way of testing just the evaluation function? I assume some cutoffs will make the engine weaker if I play to a given depth with the same evaluation. For example, I played a tournament with my engine against TSCP 1.81 at depth 5 per move and won about 90% of the games with just a piece value + PST evaluation.

Are there any standard time controls or levels for testing that are efficient? At a search rate of around 25-30k nodes/s it is a slow process :)

jdart
Posts: 4410
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Testing strength

Post by jdart »

Most engines use very fast blitz games for testing. Stockfish uses 0:10+0.1 (10 seconds plus a 0.1 second increment per move) as its standard "fast" time control and 1:00+0.6 as its long time control (LTC). That can still be compute-intensive: some Stockfish test runs go to 100k games before the SPRT reaches a decision.
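For anyone who has not met SPRT (sequential probability ratio test) before, here is a rough sketch of the idea. It uses a commonly quoted normal approximation of the log-likelihood ratio on win/draw/loss counts, and it is not claimed to be what Stockfish's testing framework computes. The hypotheses (0 Elo vs. 5 Elo), the error rates and all function names are assumptions chosen for illustration.

```python
import math

def expected_score(elo):
    """Expected score for a given Elo difference (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def sprt_llr(wins, draws, losses, elo0, elo1):
    """Approximate log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    n = wins + draws + losses
    if n == 0:
        return 0.0
    mean = (wins + 0.5 * draws) / n
    # Sample variance of the per-game result (1, 0.5 or 0).
    var = (wins + 0.25 * draws) / n - mean ** 2
    if var <= 0.0:
        return 0.0  # all results identical so far: no usable estimate yet
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)

def sprt_decision(wins, draws, losses, elo0=0.0, elo1=5.0, alpha=0.05, beta=0.05):
    """Return 'accept H1', 'accept H0' or 'continue' (Wald's stopping bounds)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    llr = sprt_llr(wins, draws, losses, elo0, elo1)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"

if __name__ == "__main__":
    # Made-up running totals, checked after each batch of games.
    print(sprt_decision(100, 100, 100))     # too few games to decide
    print(sprt_decision(6000, 8000, 5000))  # clear improvement
```

The point of the sequential test is that you check the running totals after every batch and stop as soon as a bound is crossed, so clear improvements or regressions finish early and only the near-zero changes need the very long runs.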