Testing engines

EricLang · Post by **EricLang** » Mon Dec 08, 2008 10:52 pm

I'm planning to test my engine by having it play against itself with different parameters (personalities).
Now I can set this up using OOP and all, but want to try to keep things "dirty simple" with plain procedures and global variables.
Now two questions:
1) Is it a good idea to start up 2 UCI engines with a third program? (I still need to write everything concerning the UCI communication.)
2) How do you guys do things like this?

Dann Corbit · Post by **Dann Corbit** » Mon Dec 08, 2008 11:05 pm

EricLang wrote: I'm planning to test my engine by having it play against itself with different parameters (personalities).
Now I can set this up using OOP and all, but want to try to keep things "dirty simple" with plain procedures and global variables.
Now two questions:
1) Is it a good idea to start up 2 UCI engines with a third program? (I still need to write everything concerning the UCI communication.)

Usually people just use an existing tournament organizer instead of writing their own tools for this.
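If you do decide to write the driver program yourself, the UCI message exchange itself is short. Below is a minimal sketch in Python, not a complete match runner: the helper names (`send`, `read_until`, `handshake`, `best_move`) are made up for illustration, and an in-memory `io.StringIO` stands in for a real engine so the example runs without one. Only the standard UCI commands `uci`/`uciok`, `isready`/`readyok`, `position`, `go movetime` and `bestmove` are used.

```python
import io

def send(to_engine, command):
    """Write one UCI command line and flush so the engine sees it at once."""
    to_engine.write(command + "\n")
    to_engine.flush()

def read_until(from_engine, keyword):
    """Read engine output line by line until a line starts with `keyword`."""
    for line in from_engine:
        line = line.strip()
        if line.startswith(keyword):
            return line
    raise RuntimeError("engine output ended before " + keyword)

def handshake(to_engine, from_engine):
    """Standard UCI start-up: 'uci' -> 'uciok', then 'isready' -> 'readyok'."""
    send(to_engine, "uci")
    read_until(from_engine, "uciok")
    send(to_engine, "isready")
    read_until(from_engine, "readyok")

def best_move(to_engine, from_engine, moves, movetime_ms=100):
    """Send the game so far and ask for a move; returns e.g. 'e7e5'."""
    position = "position startpos"
    if moves:
        position += " moves " + " ".join(moves)
    send(to_engine, position)
    send(to_engine, "go movetime %d" % movetime_ms)
    # The reply looks like: "bestmove e7e5 ponder g1f3"
    return read_until(from_engine, "bestmove").split()[1]

# Stand-in for a real engine (assumption: canned replies, no actual search).
from_engine = io.StringIO(
    "id name Fake\nuciok\nreadyok\ninfo depth 1\nbestmove e7e5 ponder g1f3\n"
)
to_engine = io.StringIO()

handshake(to_engine, from_engine)
reply = best_move(to_engine, from_engine, ["e2e4"])
print(reply)
```

With a real engine you would pass the `stdin` and `stdout` pipes of a `subprocess.Popen(..., stdin=PIPE, stdout=PIPE, text=True)` process in place of the two `StringIO` objects, one process per engine. The sketch only covers the message exchange; detecting mate, draws and time forfeits is left out, which is a large part of why existing tournament tools are the usual choice.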

2) How do you guys do things like this?

Most people use a pre-existing tool. For instance, you can organize a tournament using Arena or ChessGUI.

Dr. Hyatt has a {highly enviable} cluster on which he can achieve very rapid results for a huge number of games.

hgm · Post by **hgm** » Mon Dec 08, 2008 11:18 pm

Having different versions of a program play each other is a notoriously ineffective way to tune an engine. Even gross errors like move-generator bugs remain unnoticed sometimes. The 'optimized' version often plays worse against other opponents.

Most people evaluate their engines (program changes as well as tuning) by playing a gauntlet against a wide variety of opponents that bracket the engine in strength, within about 200-300 Elo on either side. For normal Chess I usually play Nunn matches, i.e. games starting from the 10 Nunn positions, each played with both colors, so that you can play each opponent 20 times with the guarantee of no duplicate games. With 25 different opponents this gives you 500 games, which results in an accuracy of about 25 Elo points (95% confidence).
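The arithmetic behind those numbers can be checked with a rough calculation. The sketch below is an approximation under stated assumptions, not the method used in the post: it assumes a score near 50%, a per-game standard deviation of about 0.4 points (a guess that depends on the draw rate; it would be 0.5 with no draws at all), and the usual logistic Elo curve. The function name `elo_margin` is made up for illustration.

```python
import math

def elo_margin(games, sigma_per_game=0.4, z=1.96):
    """Approximate error margin in Elo for a match score near 50%.

    sigma_per_game is an assumed per-game standard deviation of the
    result (1, 0.5 or 0); z = 1.96 corresponds to 95% confidence.
    """
    # Standard error of the average score after `games` games.
    se_score = sigma_per_game / math.sqrt(games)
    # Slope of the Elo curve D(p) = -400*log10(1/p - 1) at p = 0.5:
    # dD/dp = 400 / (ln(10) * p * (1 - p)), roughly 695 Elo per unit score.
    elo_per_score = 400.0 / (math.log(10) * 0.5 * 0.5)
    return z * se_score * elo_per_score

positions, colors, opponents = 10, 2, 25
games = positions * colors * opponents   # 500 games in the gauntlet
margin = elo_margin(games)
print(games, round(margin, 1))           # 500 games, margin of roughly 24 Elo
```

Because the margin shrinks only with the square root of the number of games, halving it to about 12 Elo would take roughly four times as many games.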

I use PSWBTM + WinBoard for this. (UCI engines would then have to use Polyglot, which you can make WinBoard do automatically by installing the engine in PSWBTM with the engine-following WinBoard option "WBopt /%sIsUCI=true".)

EricLang · Post by **EricLang** » Tue Dec 09, 2008 12:03 am

The term "Nunn matches" is new to me. What is that? And "the 10 Nunn positions"...

Dann Corbit · Post by **Dann Corbit** » Tue Dec 09, 2008 12:11 am

EricLang wrote: The term "Nunn matches" is new to me. What is that? And "the 10 Nunn positions"...

Named after GM Nunn.

Here are the 10 Nunn positions compressed with bzip2:
http://cap.connx.com/EPD/nunn.epd.bz2
Another version with ECO classification:
http://cap.connx.com/EPD/NUNNTEST.EPD.bz2

Here are the 6 Nunn v2 positions:
http://cap.connx.com/EPD/nunn2.epd.bz2

EricLang · Post by **EricLang** » Tue Dec 09, 2008 12:31 am

Ok, thanks

hgm · Post by **hgm** » Tue Dec 09, 2008 9:46 am

Note that there are also PGN files around for the Nunn positions (I don't have the link now), which contain the sequence of opening moves leading to the position. I always use those, rather than EPD files, as they also work on engines that do not support setting up a position. (And especially in the strength range of a beginning engine, there are quite a few opponents that suffer from that.)
