What is an ELO change of N?

gflohr
Posts: 57
Joined: Fri Jul 23, 2021 5:24 pm
Location: Elin Pelin
Full name: Guido Flohr

What is an ELO change of N?

Post by gflohr »

I read a lot of posts here like "implementing feature XY gave my engine an ELO difference of N". Is there a common notion of how to measure these ELO changes, or is that just gossip?

I am using cutechess-cli in SPRT mode, but I have observed that the reported ELO difference depends heavily on the time control and the opening book used, and probably on other factors as well.
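
For reference, the number usually quoted is the rating difference implied by the match score under the standard logistic Elo model. The sketch below shows that conversion. The function name and the win/draw/loss counts are made up for illustration and do not come from any real test run.

```python
import math


def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match result (logistic Elo model)."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    # Fraction of the available points scored by the engine under test.
    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    # Invert the expected-score formula: score = 1 / (1 + 10 ** (-diff / 400)).
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical result: 300 wins, 450 draws, 250 losses.
print(round(elo_diff(300, 450, 250), 1))
```

Since the result depends only on the score fraction, anything that shifts that fraction (time control, opening book, opponent) also shifts the reported number. That is why figures are only comparable when the test conditions match.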

By the way, this is how I currently "measure":

cutechess-cli -engine conf=Last-Stable-Version -engine conf=Dev-Version -each tc=inf/30+0.4 book=Elo2400.bin -games 2 -rounds 2500 -repeat 2 -sprt elo0=0 elo1=20 alpha=0.03 beta=0.03 -concurrency 16 -ratinginterval 10 -pgnout sprt.pgn
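
As a rough illustration of what alpha and beta do in the command above: in Wald's sequential probability ratio test, the run stops once the log-likelihood ratio (LLR) crosses one of two thresholds derived from them. The sketch below computes only those textbook thresholds. It assumes cutechess-cli applies the same bounds, and it deliberately does not show how the LLR is computed from the game results. The function name is hypothetical.

```python
import math
from typing import Tuple


def sprt_bounds(alpha: float, beta: float) -> Tuple[float, float]:
    """Wald's SPRT stopping thresholds for the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))   # at or below: accept H0 (elo0)
    upper = math.log((1.0 - beta) / alpha)   # at or above: accept H1 (elo1)
    return lower, upper


# alpha and beta taken from the command line above.
lower, upper = sprt_bounds(alpha=0.03, beta=0.03)
print(f"accept H0 if LLR <= {lower:.2f}, accept H1 if LLR >= {upper:.2f}")
```

An SPRT run answers "is the change closer to elo0 or to elo1?" with bounded error rates. The ELO difference printed when it stops is a by-product and can carry a wide error margin if the test ended after few games.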
MartinBryant
Posts: 87
Joined: Thu Nov 21, 2013 12:37 am
Location: Manchester, UK
Full name: Martin Bryant

Re: What is an ELO change of N?

Post by MartinBryant »

I also use cutechess-cli for automated testing but haven't tried the sprt facility.

Some thoughts...

I do runs of 20,000 games, which on rare occasions may be terminated early if the change gives a value consistently well outside the error window.

I used to use a normal openings book like you but changed it a couple of months ago as I noticed that with so many games I was getting repeated openings. I now use the openings-8ply-10k.pgn file which gives you 10,000 unique openings (and playing white and black gives you 20,000 unique games).

These days sadly it is rare to find an improvement that is outside the error margins (which is still +/-3.6 after 20,000 games). But I usually adopt the change if it is +ve even though it is small. Also I have often seen changes looking promising after 'only' 5,000 games but sadly it dissipates to nothing after 20,000 :(
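
To show where an error margin of that size can come from, here is a minimal sketch using a normal approximation of the score and the slope of the Elo curve (delta method). This is one common way to estimate the margin and not necessarily what cutechess-cli computes. The win/draw/loss counts are hypothetical, chosen with a 44% draw rate so that the result lands near the +/-3.6 mentioned above.

```python
import math


def elo_error_margin(wins: int, draws: int, losses: int, z: float = 1.96) -> float:
    """Approximate Elo error margin (z = 1.96 is roughly 95% confidence)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    std_error = math.sqrt(variance / games)
    # Slope of the Elo curve at this score: d(elo)/d(score).
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * std_error * slope


# Hypothetical, evenly matched runs with 44% draws.
print(round(elo_error_margin(5600, 8800, 5600), 1))  # 20,000 games
print(round(elo_error_margin(1400, 2200, 1400), 1))  # 5,000 games
```

The margin shrinks with the square root of the number of games, so a run of 5,000 games has about twice the margin of a run of 20,000. A small gain seen early can therefore easily be noise.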

I also set the concurrency to one fewer than the number of cores to give cutechess-cli and the OS room to breathe :)
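
A small sketch of how that value could be derived instead of counted by hand. The helper name is hypothetical. Note that os.cpu_count() reports logical CPUs, which on machines with hyper-threading is more than the number of physical cores.

```python
import os


def suggested_concurrency(reserve: int = 1) -> int:
    """Logical CPU count minus a reserve for cutechess-cli and the OS."""
    logical = os.cpu_count() or 1  # cpu_count() can return None
    return max(1, logical - reserve)


print(f"-concurrency {suggested_concurrency()}")
```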
gflohr
Posts: 57
Joined: Fri Jul 23, 2021 5:24 pm
Location: Elin Pelin
Full name: Guido Flohr

Re: What is an ELO change of N?

Post by gflohr »

MartinBryant wrote: Tue Oct 19, 2021 11:33 am I used to use a normal openings book like you but changed it a couple of months ago as I noticed that with so many games I was getting repeated openings. I now use the openings-8ply-10k.pgn file which gives you 10,000 unique openings (and playing white and black gives you 20,000 unique games).
To me "Elo2400.bin" looks sufficient:

$ polyglot info-book -bin Elo2400.bin
PolyGlot 1.4.70b by Fabien Letouzey.
Lines for white                :    46005
Lines for black                :    48602
Positions on lines for white   :    51081
Positions on lines for black   :    55459
Isolated positions             :     2511
MartinBryant wrote: Tue Oct 19, 2021 11:33 am These days sadly it is rare to find an improvement that is outside the error margins (which is still +/-3.6 after 20,000 games). But I usually adopt the change if it is +ve even though it is small. Also I have often seen changes looking promising after 'only' 5,000 games but sadly it dissipates to nothing after 20,000 :(
The things I currently implement are pretty basic. The problem is rather that the engine is too slow to search deep enough for some features to show an effect.
MartinBryant wrote: Tue Oct 19, 2021 11:33 am I also set the concurrency to one fewer than the number of cores to give cutechess-cli and the OS room to breathe :)
I think I have 17 logical CPUs, but I had better count again. :)