Hello everyone,
Does anyone have tips on how to properly check whether a change improved the playing strength?
I understand that the engine should play many games with different openings, etc. I was wondering about the details: how many games need to be played for the result to be reliable? Should I test after every single tweak, or only after bigger changes? Or is it good practice to prepare several versions with slightly different parameters and let them play each other in one tournament?
Testing engine's playing strength between versions
Moderator: Ras
-
ChickenMan4236
- Posts: 6
- Joined: Wed Oct 01, 2025 8:55 pm
- Location: Poland
- Full name: Jakub Szczerbiński
-
Aleks Peshkov
- Posts: 950
- Joined: Sun Nov 19, 2006 9:16 pm
- Location: Russia
- Full name: Aleks Peshkov
Re: Testing engine's playing strength between versions
https://www.chessprogramming.org/Sequen ... Ratio_Test
You need to download fastchess or cutechess-cli and an opening book,
for example 8moves_v3.pgn from https://github.com/official-stockfish/books
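To give a rough idea of what the SPRT does with the game results, here is a minimal Python sketch. It uses the common normal-approximation form of the log-likelihood ratio computed from win/draw/loss counts; it is not the exact code of fastchess or cutechess-cli. The function names (sprt_llr, sprt_decision) and the default values elo0=0, elo1=5, alpha=beta=0.05 are my own assumptions for illustration.

```python
import math


def expected_score(elo: float) -> float:
    """Expected score for a given Elo difference (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def sprt_llr(wins: int, draws: int, losses: int,
             elo0: float = 0.0, elo1: float = 5.0) -> float:
    """Approximate log-likelihood ratio of H1 (elo1) against H0 (elo0).

    Hypothetical helper: a normal approximation on the per-game score,
    not the exact formula any particular tool uses.
    """
    n = wins + draws + losses
    if n == 0:
        return 0.0
    w = wins / n
    d = draws / n
    score = w + d / 2.0              # mean score per game
    var = (w + d / 4.0) - score ** 2  # variance of a single game score
    if var <= 0.0:
        return 0.0                   # all games had the same result
    s0 = expected_score(elo0)
    s1 = expected_score(elo1)
    return (s1 - s0) * (2.0 * score - s0 - s1) / (2.0 * var / n)


def sprt_decision(wins: int, draws: int, losses: int,
                  elo0: float = 0.0, elo1: float = 5.0,
                  alpha: float = 0.05, beta: float = 0.05) -> str:
    """Return 'accept H1', 'accept H0' or 'continue' (hypothetical helper)."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    llr = sprt_llr(wins, draws, losses, elo0, elo1)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"


if __name__ == "__main__":
    # Example counts are made up, only to show how the functions are called.
    print(sprt_decision(wins=1100, draws=2000, losses=900))
```

The point is that you do not fix the number of games in advance: the match keeps running until the ratio crosses one of the two bounds, so a clearly good or clearly bad change stops early and a marginal one takes longer.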
-
ChickenMan4236
- Posts: 6
- Joined: Wed Oct 01, 2025 8:55 pm
- Location: Poland
- Full name: Jakub Szczerbiński
Re: Testing engine's playing strength between versions
Thanks, exactly what I was looking for!