What is an ELO change of N?

gflohr · Post by **gflohr** » Tue Oct 19, 2021 8:41 am

I read a lot of posts here like "implementing feature XY gave my engine an ELO difference of N". Is there a common notion of how to measure these ELO changes, or is that just gossip?

I am using cutechess-cli in SPRT mode, but I have observed that the reported ELO difference depends heavily on the time control and the opening book used, and there are probably more factors.

By the way, this is how I currently "measure":


cutechess-cli -engine conf=Last-Stable-Version -engine conf=Dev-Version -each tc=inf/30+0.4 book=Elo2400.bin -games 2 -rounds 2500 -repeat 2 -sprt elo0=0 elo1=20 alpha=0.03 beta=0.03 -concurrency 16 -ratinginterval 10 -pgnout sprt.pgn
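
For readers unfamiliar with the SPRT options in that command: under Wald's sequential probability ratio test, alpha and beta fix two thresholds on the log-likelihood ratio, and the run stops as soon as one of them is crossed. Here is a minimal Python sketch of that arithmetic, assuming the standard Wald bounds and the usual logistic Elo model. The function names are my own, and nothing here is taken from the cutechess-cli source.

```python
import math

def sprt_bounds(alpha: float, beta: float) -> tuple[float, float]:
    """Wald's approximate stopping thresholds for the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))    # at or below: accept H0 (elo0)
    upper = math.log((1.0 - beta) / alpha)    # at or above: accept H1 (elo1)
    return lower, upper

def expected_score(elo_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5) under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

lower, upper = sprt_bounds(alpha=0.03, beta=0.03)
print(f"LLR bounds: {lower:.3f} .. {upper:.3f}")
print(f"elo0=0  -> expected score {expected_score(0):.4f}")
print(f"elo1=20 -> expected score {expected_score(20):.4f}")
```

With alpha = beta = 0.03 the bounds come out at roughly ±3.48, and elo1=20 corresponds to an expected score of about 52.9%, so the test is deciding between "scores 50%" and "scores about 52.9%".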

MartinBryant · Post by **MartinBryant** » Tue Oct 19, 2021 11:33 am

I also use cutechess-cli for automated testing but haven't tried the sprt facility.

Some thoughts...

I do runs of 20,000 games, which I occasionally terminate early if the change gives a value consistently well outside the error window.

I used to use a normal openings book like you but changed it a couple of months ago as I noticed that with so many games I was getting repeated openings. I now use the openings-8ply-10k.pgn file which gives you 10,000 unique openings (and playing white and black gives you 20,000 unique games).

These days, sadly, it is rare to find an improvement that is outside the error margins (which are still +/-3.6 after 20,000 games). But I usually adopt the change if it is +ve, even though it is small. Also, I have often seen changes looking promising after 'only' 5,000 games, but sadly they dissipate to nothing after 20,000.
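
To give a feel for where an error margin of that size comes from, here is a rough Python sketch that turns win/draw/loss counts into an Elo estimate with an approximate 95% interval. It assumes the logistic Elo model, independent games, and a normal approximation. The W/D/L counts are invented for illustration and are not results from this thread.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by an average score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) using a normal approximation of the mean score."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, with outcomes 1, 0.5 and 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - half_width),
            elo_from_score(score + half_width))

# Hypothetical 20,000-game result: 5,700 wins, 8,800 draws, 5,500 losses.
elo, low, high = elo_with_margin(5700, 8800, 5500)
print(f"Elo {elo:+.1f} (95% interval {low:+.1f} .. {high:+.1f})")
```

With a draw rate around 44%, 20,000 games give a half-width of roughly 3.6 Elo. Because the margin shrinks only with the square root of the game count, 5,000 games leave it about twice as wide, which fits with early leads that later fade.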

I also set the concurrency to one fewer than the number of cores to give cutechess-cli and the OS room to breathe.

gflohr · Post by **gflohr** » Tue Oct 19, 2021 2:24 pm

MartinBryant wrote: ↑Tue Oct 19, 2021 11:33 am I used to use a normal openings book like you but changed it a couple of months ago as I noticed that with so many games I was getting repeated openings. I now use the openings-8ply-10k.pgn file which gives you 10,000 unique openings (and playing white and black gives you 20,000 unique games).

To me "Elo2400.bin" looks sufficient:


$ polyglot info-book -bin Elo2400.bin
PolyGlot 1.4.70b by Fabien Letouzey.
Lines for white                :    46005
Lines for black                :    48602
Positions on lines for white   :    51081
Positions on lines for black   :    55459
Isolated positions             :     2511

MartinBryant wrote: ↑Tue Oct 19, 2021 11:33 am These days, sadly, it is rare to find an improvement that is outside the error margins (which are still +/-3.6 after 20,000 games). But I usually adopt the change if it is +ve, even though it is small. Also, I have often seen changes looking promising after 'only' 5,000 games, but sadly they dissipate to nothing after 20,000.

The things I currently implement are pretty basic. The problem is rather that the engine is too slow to search deep enough for some features to show an effect.

MartinBryant wrote: ↑Tue Oct 19, 2021 11:33 am I also set the concurrency to one fewer than the number of cores to give cutechess-cli and the OS room to breathe

I think I have 17 logical CPUs, but I had better count again.
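
The counting can be delegated to the machine. Here is a tiny Python sketch of the "one fewer than the number of cores" rule, assuming `os.cpu_count()` reports the logical CPUs you actually want to use. It can return `None`, and it does not account for CPU affinity or container limits. The helper name `suggested_concurrency` is hypothetical.

```python
import os

def suggested_concurrency(reserve: int = 1) -> int:
    """Logical CPU count minus `reserve`, never less than 1."""
    logical = os.cpu_count() or 1   # cpu_count() may return None
    return max(1, logical - reserve)

print(f"-concurrency {suggested_concurrency()}")
```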
