Testing engines

EricLang

Post by EricLang »

I'm planning to test my engine by having it play against itself with different parameter settings (personalities).
I could set this up using OOP and all, but I want to keep things "dirty simple", with plain procedures and global variables.
Two questions:
1) Is it a good idea to start up two UCI engines from a third program? (I would still need to write everything concerning the UCI communication.)
2) How do you guys do things like this?
Dann Corbit
Posts: 12808
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Testing engines

Post by Dann Corbit »

EricLang wrote:
I'm planning to test my engine by having it play against itself with different parameter settings (personalities).
I could set this up using OOP and all, but I want to keep things "dirty simple", with plain procedures and global variables.
Two questions:
1) Is it a good idea to start up two UCI engines from a third program? (I would still need to write everything concerning the UCI communication.)

Usually people just use an existing tournament manager instead of writing their own tools for this.

EricLang wrote:
2) How do you guys do things like this?

Most people use a pre-existing tool. For instance, you can run a tournament using Arena or ChessGUI.

Dr. Hyatt has a (highly enviable) cluster on which he can get results very rapidly for a huge number of games.
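
For anyone who still wants to write the third program, the UCI side is fairly small. Below is a minimal sketch in the "plain procedures" spirit, not a complete match runner. It uses only standard UCI commands (uci/uciok, isready/readyok, setoption, position, go movetime, bestmove, quit). The class name UciEngine, the helper names, and the 100 ms default move time are made up for illustration, and the option names you pass to set_option depend entirely on your own engine. It does not detect mate, draws, or illegal moves; the third program would have to do that itself.

```python
import subprocess


def position_command(moves):
    """Build the UCI 'position' command for a game from the normal start position."""
    if moves:
        return "position startpos moves " + " ".join(moves)
    return "position startpos"


def parse_bestmove(line):
    """Return the move from a 'bestmove e2e4 ponder e7e5' line, or None otherwise."""
    parts = line.split()
    if len(parts) >= 2 and parts[0] == "bestmove":
        return parts[1]
    return None


class UciEngine:
    """Thin wrapper around one engine process (hypothetical helper, not a library API)."""

    def __init__(self, command):
        # 'command' is a list such as ["./myengine"]; the path is an assumption.
        self.proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,
        )

    def send(self, text):
        self.proc.stdin.write(text + "\n")
        self.proc.stdin.flush()

    def wait_for(self, token):
        """Read engine output until a line starting with 'token' arrives."""
        while True:
            line = self.proc.stdout.readline()
            if not line:
                raise RuntimeError("engine closed its output before sending " + token)
            if line.startswith(token):
                return line.strip()

    def start(self):
        self.send("uci")
        self.wait_for("uciok")
        self.send("isready")
        self.wait_for("readyok")

    def set_option(self, name, value):
        # Option names are engine-specific; this is where a "personality" would go.
        self.send("setoption name %s value %s" % (name, value))

    def best_move(self, moves, movetime_ms=100):
        self.send(position_command(moves))
        self.send("go movetime %d" % movetime_ms)
        return parse_bestmove(self.wait_for("bestmove"))

    def quit(self):
        self.send("quit")
        self.proc.wait()
```

A self-play loop would create two UciEngine objects, call start() on both, apply different set_option values to each, and then alternate best_move() calls while appending each returned move to a shared move list.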
hgm
Posts: 28429
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Testing engines

Post by hgm »

Having different versions of a program play each other is a notoriously ineffective way to tune an engine. Even gross errors like move-generator bugs remain unnoticed sometimes. The 'optimized' version often plays worse against other opponents.

Most people evaluate their engines (program changes as well as tuning) by playing a gauntlet against a wide variety of opponents that bracket the engine in strength, within a range of about 200-300 Elo on either side. For normal chess I usually play Nunn matches, i.e. games starting from the 10 Nunn positions, each played once with either color, so that you can play each opponent 20 times with the guarantee of no duplicate games. With 25 different opponents this gives you 500 games, which in turn gives an accuracy of about 25 Elo points (95% confidence).
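
The arithmetic behind that figure can be checked with a rough sketch. It assumes the usual logistic Elo model, independent games, an overall score near 50%, and a draw rate of 30%. The draw rate is purely an assumption for illustration, and the function names are made up. The margin is the half-width of the 95% interval on the score fraction, converted to Elo.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(games, draw_ratio=0.3, score=0.5, z=1.96):
    """Approximate half-width of the confidence interval in Elo.

    draw_ratio=0.3 is an assumed value; z=1.96 corresponds to 95% confidence.
    """
    win_ratio = score - draw_ratio / 2.0
    # Per-game result is 1, 0.5 or 0; variance = E[x^2] - E[x]^2.
    variance = win_ratio * 1.0 + draw_ratio * 0.25 - score * score
    std_err = math.sqrt(variance / games)
    upper = elo_from_score(score + z * std_err)
    lower = elo_from_score(score - z * std_err)
    return (upper - lower) / 2.0


if __name__ == "__main__":
    positions, colors, opponents = 10, 2, 25
    games = positions * colors * opponents  # 500 games in the gauntlet
    print(games, round(elo_margin(games), 1))
```

With these assumptions the margin for 500 games lands close to the 25 Elo quoted above; with fewer draws it is somewhat wider, since decisive games add more variance per game.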

I use PSWBTM + WinBoard for this. (UCI engines would then have to use Polyglot, which you can make WinBoard do automatically by installing the engine in PSWBTM with the engine-following WinBoard option "WBopt /%sIsUCI=true".)
EricLang

Re: Testing engines

Post by EricLang »

The term "Nunn matches" is new to me. What is that? And "the 10 Nunn positions"...
Dann Corbit

Re: Testing engines

Post by Dann Corbit »

EricLang wrote:The term "Nunn matches" is new to me. What is that? And "the 10 Nunn positions"...
Named after GM Nunn.

Here are the 10 Nunn positions compressed with bzip2:
http://cap.connx.com/EPD/nunn.epd.bz2
Another version with ECO classification:
http://cap.connx.com/EPD/NUNNTEST.EPD.bz2

Here are the 6 Nunn v2 positions:
http://cap.connx.com/EPD/nunn2.epd.bz2
EricLang

Re: Testing engines

Post by EricLang »

Ok, thanks
hgm

Re: Testing engines

Post by hgm »

Note that there are also PGN files around for the Nunn positions (I don't have the link now), which contain the sequence of opening moves leading to the position. I always use those, rather than EPD files, as they also work on engines that do not support setting up a position. (And especially in the strength range of a beginning engine, there are quite a few opponents that suffer from that.)