matching 2 or more engines

Robert Pope
Posts: 570
Joined: Sat Mar 25, 2006 8:27 pm
Location: USA
Full name: Robert Pope

Re: matching 2 or more engines

Post by Robert Pope »

And my engines.json file:

[
{
"command" : "EightBall.exe",
"name" : "8ball",
"protocol" : "xboard",
"workingDirectory" : "C:\\games\\Chess\\Eightball\\Release"
},
{
"command" : "Beaches232.exe",
"name" : "Beaches",
"protocol" : "xboard",
"workingDirectory" : "C:\\games\\chess\\Beaches232"
},
{
"command" : "damas7c.exe",
"name" : "Damas",
"protocol" : "xboard",
"workingDirectory" : "C:\\games\\chess\\Damas"
},
{
"command" : "elf.exe -r 700 -d 300 -b on",
"name" : "Elf",
"protocol" : "xboard",
"workingDirectory" : "C:\\games\\chess\\Damas"
},
{
"command" : "jsbam.exe",
"name" : "J.S.BAM",
"protocol" : "xboard",
"workingDirectory" : "C:\\games\\chess\\JSBAM"
},
{
"command" : "kanguruh93.exe",
"name" : "Kanguruh",
"protocol" : "xboard",
"workingDirectory" : "c:\\games\\chess\\kanguruh\\Kanga193"
}
]
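
A file like this can be sanity-checked before cutechess-cli is started. The following is a minimal Python sketch, assuming the file is a JSON array of objects with the four keys shown above. Treating all four keys as required is an assumption made for this sketch, and the name check_engines is made up for illustration.

```python
import json
import os

# Keys that appear in every entry of the file above (assumed required here).
REQUIRED_KEYS = ("command", "name", "protocol", "workingDirectory")

def check_engines(json_text, dir_exists=os.path.isdir):
    """Return a list of human-readable problems found in an engines.json text."""
    problems = []
    seen_names = set()
    for index, entry in enumerate(json.loads(json_text)):
        name = entry.get("name")
        label = name if name is not None else "entry #%d" % index
        # Report every key that is absent from this entry.
        for key in REQUIRED_KEYS:
            if key not in entry:
                problems.append("%s: missing key '%s'" % (label, key))
        # conf=<name> lookups need unique names, so flag repeats.
        if name is not None:
            if name in seen_names:
                problems.append("%s: duplicate engine name" % label)
            seen_names.add(name)
        # Check that the working directory is present on this machine.
        workdir = entry.get("workingDirectory")
        if workdir is not None and not dir_exists(workdir):
            problems.append("%s: working directory not found: %s" % (label, workdir))
    return problems
```

Calling check_engines(open("engines.json").read()) returns an empty list when nothing looks wrong. The dir_exists parameter only exists so the check can be tried without the real directories.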
Ferdy
Posts: 4853
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: matching 2 or more engines

Post by Ferdy »

Robert Pope wrote:
Ferdy wrote:
Robert Pope wrote:
mar wrote:cutechess-cli is what you're looking for
Is there anyone who regularly uses cutechess-cli on Windows (sorry about the hijack)? I was starting to look into it as a replacement for PSWBTM, but it keeps throwing errors (something like "option games has too many parameters").

I haven't been able to find useful documentation, beyond the man page for the command line options.
Post your sample command line.
C:\Games\chess\cutechess-cli>cutechess-cli -engine conf=8ball -engine conf=Beaches -engine conf=Damas -engine conf=Elf -engine conf=J.S.BAM -each tc=50/60 -repeat -openings file="C:\\games\\chess\\pswbtm\\pick21.pgn" -tournament gauntlet -debug -repeat -pgnout games.pgn -games 10
Warning: Invalid engine option: "ûrepeat"
Warning: Invalid or missing time control

C:\Games\chess\cutechess-cli>
C:\Games\chess\cutechess-cli>cutechess-cli -engine conf=8ball -engine conf=Beaches -engine conf=Damas -each tc=50/60 -games 8 -repeat -openings file="C:\\games\\chess\\pswbtm\\pick21.pgn" -tournament gauntlet -debug -repeat -concurrency 1 -pgnout games.pgn
Warning: Too many arguments for option "-games"

This was supposed to be an 8 or 10 game gauntlet tourney with a set of fixed openings, each opening played as white and black.
You have 2 "-repeat" there, use only 1.
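
A duplicated option like this is easy to miss in a long command line, so it can help to check the argument list before running it. Below is a minimal Python sketch. The name lint_args is hypothetical, and the assumption that -engine is the only option allowed to repeat is mine, not from the cutechess-cli documentation. The sketch also flags dash look-alikes, since a stray non-ASCII dash is one possible source of a garbled name such as "ûrepeat", although the thread does not confirm that this is what happened here.

```python
from collections import Counter

# Options that may legitimately appear more than once (assumption: only -engine).
REPEATABLE = {"-engine"}
# En dash, em dash and minus sign, which can replace "-" when text is copy-pasted.
DASH_LOOKALIKES = "\u2013\u2014\u2212"

def lint_args(args):
    """Return warnings for non-ASCII leading dashes and duplicated options."""
    warnings = []
    # First pass: arguments that start with something that only looks like "-".
    for arg in args:
        if arg and arg[0] in DASH_LOOKALIKES:
            warnings.append("non-ASCII dash in %r" % arg)
    # Second pass: count every option and report the ones given more than once.
    counts = Counter(a for a in args if a.startswith("-"))
    for option, count in counts.items():
        if count > 1 and option not in REPEATABLE:
            warnings.append("%s given %d times" % (option, count))
    return warnings
```

For the second command line above, split on spaces, this reports the doubled -repeat and nothing else.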
Robert Pope
Posts: 570
Joined: Sat Mar 25, 2006 8:27 pm
Location: USA
Full name: Robert Pope

Re: matching 2 or more engines

Post by Robert Pope »

Thanks, that was the problem.
Robert Pope
Posts: 570
Joined: Sat Mar 25, 2006 8:27 pm
Location: USA
Full name: Robert Pope

Re: matching 2 or more engines

Post by Robert Pope »

Okay, next issue. I am running a gauntlet against multiple engines with a fixed set of openings. The command below uses the openings from games 1-3 for the first opponent, but the second opponent's games are then played with the openings from games 4-6, then 7-9, etc. I want all matches to use the same openings 1-3. Is that possible?

cutechess-cli -engine conf=8ball -engine conf=Beaches -engine conf=Damas -engine conf=Elf -engine conf=J.S.BAM -each tc=50/20 -openings file="C:\\games\\chess\\pswbtm\\pick21.pgn" -tournament gauntlet -debug -repeat -pgnout gamest4.pgn -games 6
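
One possible workaround, which is not confirmed anywhere in this thread, is to run a separate two-engine match per opponent instead of a single gauntlet. The Python sketch below only builds the argument lists and uses only options that already appear in the command above. It assumes that each new cutechess-cli invocation starts reading the opening file from its first game. The function name and the output file naming are made up for illustration.

```python
def per_opponent_commands(first, opponents, openings, tc="50/20", games=6):
    """Build one two-engine cutechess-cli argument list per opponent."""
    commands = []
    for opponent in opponents:
        commands.append([
            "cutechess-cli",
            "-engine", "conf=" + first,
            "-engine", "conf=" + opponent,
            "-each", "tc=" + tc,
            "-openings", "file=" + openings,
            "-repeat",
            "-games", str(games),
            # Separate PGN per pairing (hypothetical naming scheme).
            "-pgnout", "games_%s_vs_%s.pgn" % (first, opponent),
        ])
    return commands
```

Each list can then be passed to subprocess.run(cmd, check=True) in turn, so that every pairing is played as its own short match.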
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: matching 2 or more engines

Post by lucasart »

flok wrote:The natural choice would be something like xboard or arena
You start with the wrong assumption. The natural choice is cutechess-cli.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: matching 2 or more engines

Post by lucasart »

lucasart wrote:
flok wrote:The natural choice would be something like xboard or arena
You start with the wrong assumption. The natural choice is cutechess-cli.
In fact, cutechess-cli is the only right choice for automated testing:

* If it has a single bug, even an extremely rare one that only crashes once every 1000 games, you can throw it away. I'm pretty sure Arena does not pass this test. Reason: you need to play tens of thousands of games with zero crashes, to avoid manual intervention.

* If it has an internal overhead above 10ms, you can throw it away. That basically takes care of all GUIs, especially Arena. I regularly play hundreds of thousands of games in 1.2"+0.01" and experience zero crashes or time losses: this is possible with DiscoCheck+cutechess-cli, but very few engines+UI are capable of that.

* If the boot time is more than 10ms, you can throw it away too. Reason: CLOP needs to play games one by one via an external script. The overhead of starting the UI and booting the engine processes is added, so it must be as small as possible.

* And of course, if it's not command line, you can throw it away too. Reason: you will need to use it in scripts for automating certain things.

The only UI I know that satisfies these requirements is cutechess-cli. So it is what you should use, unless you are ready to write the UI yourself (I started, but gave up, realizing that it would never be anywhere near cutechess-cli in terms of reliability and features).
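
The boot-time requirement in the list above can be checked roughly by timing how long a program takes to start and exit. The following is a minimal Python sketch with a hypothetical function name. It assumes the command exits on its own when its standard input is closed, and the figure it reports includes Python's own cost of spawning a process, so it should be read as an upper bound.

```python
import subprocess
import sys
import time

def median_startup_ms(command, runs=5):
    """Middle sample of the wall-clock time in ms to start `command` and wait for exit."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # stdin is closed at once so a program waiting for input sees end-of-file.
        subprocess.run(command, stdin=subprocess.DEVNULL,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return samples[len(samples) // 2]

if __name__ == "__main__":
    # Stand-in command; replace it with the program whose startup is of interest.
    print("%.1f ms" % median_startup_ms([sys.executable, "-c", "pass"]))
```

A program that ignores end-of-file on stdin would make this hang, so such a program needs a command line that makes it quit by itself.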
hgm
Posts: 28480
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: matching 2 or more engines

Post by hgm »

Well, XBoard of course meets most of those specs easily. Only the CLOP-dictated startup time could be a problem. You seem to quote an unnecessarily tight requirement here anyway: surely with 2 x 1.2 sec per game there will be no significant difference between 10 msec and 100 msec startup.

A much worse problem is that many engines also have a sizable startup time (initializing their hash table or end-game cache), and there is little the interface can do about that. The natural solution (which I used before XBoard had an intrinsic tournament manager, and was still using PSWBTM, which also started a new XBoard instance for every game) is to play the games in batches of 10 or 100 per GUI/engine startup. I added the extra option -sameColorGames to XBoard for that purpose, with the aid of which every game requested by PSWBTM would in fact result in a 10-game match. If CLOP cannot handle that, it seems a CLOP problem to me.
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: matching 2 or more engines

Post by lucasart »

hgm wrote:Well, XBoard of course meets most of those specs easily. Only the CLOP-dictated startup time could be a problem. You seem to quote an unnecessarily tight requirement here anyway: surely with 2 x 1.2 sec per game there will be no significant difference between 10 msec and 100 msec startup.
I mentioned 1.2"+0.01", but it can be even lower, or use a depth limit instead of time, for example. Anyway, the bottom line is that the overhead needs to be minimal.
hgm wrote:A much worse problem is that many engines also have a sizable startup time (initializing their hash table or end-game cache), and there is little the interface can do about that.
Indeed. Some engines have a huge startup time. DiscoCheck is not one of those and has a minimal startup time. At least that is something in the hands of the engine programmer, so those who need to use CLOP should think of reducing their overhead to begin with.
hgm wrote:The natural solution (which I used before XBoard had an intrinsic tournament manager, and was still using PSWBTM, which also started a new XBoard instance for every game) is to play the games in batches of 10 or 100 per GUI/engine startup. I added the extra option -sameColorGames to XBoard for that purpose, with the aid of which every game requested by PSWBTM would in fact result in a 10-game match. If CLOP cannot handle that, it seems a CLOP problem to me.
Indeed. This is a CLOP limitation. The way CLOP works is that it runs a script that the user specifies (Python, shell, a directly executable file, or whatever you want) that plays ONE game and outputs the result as a single character on stdout ('W', 'L', or 'D'). A useful enhancement to CLOP would be to accept that the script plays a batch of games and sends a string like "WWLLDLDWLLD". That way the overhead of starting cutechess-cli (or xboard) is only incurred once every N games instead of every game. I don't know if that fits well with the CLOP algorithm though.
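
The result-reporting half of such a script can be sketched in Python. This is a minimal sketch under assumptions: the games were written to a PGN file (for example with -pgnout), each game carries the standard White and Result tags, and the tested engine played in every game. The function name is made up, and nothing here covers how CLOP passes parameters to the script.

```python
import re

# Matches a PGN tag pair such as [Result "1-0"].
TAG = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

def outcomes(pgn_text, engine):
    """Return one letter ('W', 'L' or 'D') per finished game, as seen by `engine`."""
    letters = []
    white = None
    for line in pgn_text.splitlines():
        match = TAG.match(line.strip())
        if not match:
            continue
        name, value = match.groups()
        if name == "White":
            # Remember who has White; the Result tag comes later in the header.
            white = value
        elif name == "Result" and value in ("1-0", "0-1", "1/2-1/2"):
            if value == "1/2-1/2":
                letters.append("D")
            else:
                # The engine won if White won and it was White, or Black won and it was Black.
                engine_won = (value == "1-0") == (white == engine)
                letters.append("W" if engine_won else "L")
    return "".join(letters)
```

With a one-game PGN this yields the single character described above, and with several games it yields a batch string of the "WWLLDLDWLLD" kind. Unfinished games (Result "*") are skipped.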