Olithink 5.1.8 released because of better ChessDM vs Crafty

OliverBr
Posts: 846
Joined: Tue Dec 18, 2007 9:38 pm
Location: Munich, Germany
Full name: Dr. Oliver Brausch

Olithink 5.1.8 released because of better ChessDM vs Crafty

Post by OliverBr »

Download here:

http://home.arcor.de/dreamlike/chess/

This is a funny case, because the only difference from 5.1.7 is the removal of two lines of code, and this yielded a much better result in the Death Match against Crafty:

olithink517 - crafty20.14 : 285.5/1000 214-143-643 28%
olithink518 - crafty20.14 : 315.0/1000 243-144-613 31% 
Question to all of you: Do you think 1000 games are enough to see the improvement? (I don't see any risk of "inbreeding" here, because crafty is genetically very different to olithink.)
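
One rough way to approach this question is to compare the gap between the two scores with its statistical uncertainty. The sketch below is not from the original post. It assumes the result lines are wins-draws-losses (which matches the point totals shown), treats the games as independent, and uses a plain normal approximation. The function names are made up for illustration.

```python
import math

def score_stats(wins, draws, losses):
    """Return (mean score per game, standard error of that mean)."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Each game scores 1, 0.5 or 0, so E[x^2] = (wins + 0.25 * draws) / n.
    variance = (wins + 0.25 * draws) / n - mean ** 2
    return mean, math.sqrt(variance / n)

def z_score(result_a, result_b):
    """How many standard errors separate two independent match results."""
    mean_a, se_a = score_stats(*result_a)
    mean_b, se_b = score_stats(*result_b)
    return (mean_b - mean_a) / math.sqrt(se_a ** 2 + se_b ** 2)

if __name__ == "__main__":
    v517 = (214, 143, 643)  # olithink517 vs crafty20.14, assumed W-D-L
    v518 = (243, 144, 613)  # olithink518 vs crafty20.14, assumed W-D-L
    print(f"z = {z_score(v517, v518):.2f}")
```

Under these assumptions the gap comes out at roughly 1.6 standard errors, short of the usual 1.96 threshold for 95% confidence, so 1000 games hint at an improvement without firmly establishing it.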
OliverBr
Posts: 846
Joined: Tue Dec 18, 2007 9:38 pm
Location: Munich, Germany
Full name: Dr. Oliver Brausch

Re: Olithink 5.1.8 released because of better ChessDM vs Cra

Post by OliverBr »

Hi Gabor,
thank you for your reply. Can you give me some advice on how I can easily calculate the Elo from the results of the 1000 games against Crafty?
BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 5:18 am

Re: Olithink 5.1.8 released because of better ChessDM vs Cra

Post by BubbaTough »

OliverBr wrote:I don't see any risk of "inbreeding" here, because crafty is genetically very different to olithink
In my opinion, the risk of testing against just Crafty is almost as large as that of self-testing. You are tuning your engine to beat Crafty (from the start positions you are using). Thus, you risk creating an engine particularly suited to beating Crafty but perhaps worse against other engines. If Crafty is bad against pawn storms, for example, you may end up building an engine that does all sorts of overly risky pawn storms that would have been punished by other engines (just an example; I have no idea how Crafty is with pawn storms). If you are going to play 1000 games, I would recommend 200 games against each of 5 engines instead, for more reliable results, even though it reduces your number of starting positions.

Regarding "how many" games are needed to detect an improvement, I leave that to the experts. I will say that I consider it dependent on what you are using the testing for. If it is for judging a feature or deciding what to use in ChessWars, perfect testing is not critical (1000 games sounds great to me). When sending an engine to a testing group who is going to spend a lot of time on your engine, I would be more conservative in my criteria as a sign of respect for their time investment.

-Sam
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Olithink 5.1.8 released because of better ChessDM vs Cra

Post by Dann Corbit »

OliverBr wrote:Hi Gabor,
thank you for your reply. Can you give me some advice on how I can easily calculate the Elo from the results of the 1000 games against Crafty?
Here is a batch file (which can just as easily become a shell script) and a command script that I use to have BayesElo create Elo rating lists.

C:\tmp\prob>type ..\belostat.bat
copy %1.pgn test.pgn
bayeselo < belo.txt

C:\tmp\prob>type ..\belo.txt
readpgn test.pgn
elo
mm
offset 2500
exactdist
ratings >rating.txt
x
x

I really like BayesElo because I can give it millions of games and it never dies a horrible death.
Ovyron
Posts: 4562
Joined: Tue Jul 03, 2007 4:30 am

Re: Olithink 5.1.8 released because of better ChessDM vs Cra

Post by Ovyron »

BubbaTough wrote:Thus, you risk creating an engine particularly suited to beating crafty but perhaps worse against other engines.
The exception is creating an engine that is particularly suited to getting better results against Rybka. I claim that the new generation of chess engines, which made a big jump in strength but drew complaints about a worse playing style, were tweaked for Rybka 1-2; Rybka 3 then had an easier time beating them than she should have, because the tweaks didn't work anymore. Still, these engines got stronger.
krazyken

Re: Olithink 5.1.8 released because of better ChessDM vs Cra

Post by krazyken »

OliverBr wrote:Hi Gabor,
thank you for your reply. Can you give me some advice on how I can easily calculate the Elo from the results of the 1000 games against Crafty?
The best way to do it is to download BayesElo, and it will happily calculate everything for you.
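
For a single opponent, a quick back-of-the-envelope figure can also be had from the standard logistic Elo formula, where a score fraction p corresponds to a rating difference of 400 * log10(p / (1 - p)). The sketch below is only an illustration and not what BayesElo computes (BayesElo fits a fuller model); the function name is hypothetical, and the point totals are taken from the results posted at the top of the thread.

```python
import math

def elo_diff(score_fraction):
    """Elo difference implied by a score fraction under the logistic model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

if __name__ == "__main__":
    games = 1000
    # Points out of 1000 games against crafty20.14, as posted in the thread.
    for name, points in (("olithink517", 285.5), ("olithink518", 315.0)):
        diff = elo_diff(points / games)
        print(f"{name}: {diff:+.0f} Elo relative to crafty20.14")
```

By this estimate the two versions differ by roughly 24 Elo, but the figure carries the same statistical uncertainty as the raw scores it is derived from.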
Ovyron
Posts: 4562
Joined: Tue Jul 03, 2007 4:30 am

Re: Olithink 5.1.8 released because of better ChessDM vs Cra

Post by Ovyron »

krazyken wrote:the best way to do it is to download BayesELO
Here are the instructions on how to use it. As it's a command-line program, just type the following commands, pressing Enter after each one:

readpgn x.pgn
elo
mm
exactdist
offset 2500
prior 0.1
ratings>allgames.txt

Replace x with the name of your PGN file and 2500 with the starting rating. The prior line is optional; use it when the PGN contains engines of very different strength (without it, strong engines will be underrated and weak engines overrated if they don't have enough games).
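
If you run this often, the command list can be generated rather than typed. The following is a minimal sketch, assuming only the commands quoted in this thread; the function name, its parameters, and the default output file name are hypothetical. The two trailing x lines are copied from Dann Corbit's command script earlier in the thread, where they end the session.

```python
def bayeselo_script(pgn_name, offset=2500, prior=None, out_file="allgames.txt"):
    """Build the text of a BayesElo command script from the commands above."""
    lines = [
        f"readpgn {pgn_name}",
        "elo",
        "mm",
        "exactdist",
        f"offset {offset}",
    ]
    if prior is not None:
        # Optional, for PGNs mixing engines of very different strength.
        lines.append(f"prior {prior}")
    lines.append(f"ratings >{out_file}")
    lines += ["x", "x"]  # as in the belo.txt script shown earlier
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    with open("belo.txt", "w") as handle:
        handle.write(bayeselo_script("x.pgn", prior=0.1))
```

The resulting belo.txt can then be fed to the program as in the batch file shown earlier: bayeselo < belo.txt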