Nunemaker Chess Rating List (NCRL)

can00336
Posts: 24
Joined: Sat May 16, 2015 8:07 am
Location: PA

Nunemaker Chess Rating List (NCRL)

Post by can00336 »

I have started my own rating list that focuses on the time scaling of engines. I don't have much CPU power, but I do what I can with what I've got.

http://chess.nunemaker.net/

Let me know what you think!
Graham Banks
Posts: 41415
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: Nunemaker Chess Rating List (NCRL)

Post by Graham Banks »

can00336 wrote:I have started my own rating list that focuses on the time scaling of engines. I don't have much CPU power, but I do what I can with what I've got.

http://chess.nunemaker.net/

Let me know what you think!
Nice little project. :)
gbanksnz at gmail.com
cetormenter
Posts: 170
Joined: Sun Oct 28, 2012 9:46 pm

Re: Nunemaker Chess Rating List (NCRL)

Post by cetormenter »

Very interesting. Unfortunately you will need a lot of games to be able to come to any conclusions. Something that I would add is an additional graph with a sort of "scaling" factor where the number of points at each time control is normalized to the number of points at 15+0.125. Right now it is too hard to compare one engine to another without doing that math yourself.
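
A minimal sketch of the normalization suggested above, assuming each engine's score is available as points per time control. The numbers in `example` are made up for illustration and are not NCRL results; `scaling_factors` is a hypothetical helper name.

```python
# Baseline time control that every other score is divided by.
BASELINE = "15+0.125"

def scaling_factors(points_by_tc, baseline=BASELINE):
    """Return each time control's points divided by the baseline's points."""
    base = points_by_tc[baseline]
    if base == 0:
        raise ValueError("baseline score is zero; cannot normalize")
    return {tc: pts / base for tc, pts in points_by_tc.items()}

# Hypothetical scores for one engine (NOT taken from the rating list).
example = {"15+0.125": 50.0, "30+0.25": 55.0, "60+0.5": 60.0}
factors = scaling_factors(example)

for tc, factor in factors.items():
    print(f"{tc:>9}: {factor:.2f}")
```

A factor above 1.00 would mean the engine scores more points at that time control than at 15+0.125, which makes engines directly comparable on one graph.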
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Nunemaker Chess Rating List (NCRL)

Post by lkaufman »

can00336 wrote:I have started my own rating list that focuses on the time scaling of engines. I don't have much CPU power, but I do what I can with what I've got.

http://chess.nunemaker.net/

Let me know what you think!
Very informative! I think you made many wise decisions in setting up your tests. The 120 to 1 ratio of time to increment is very reasonable, the use of Ordo for the rating calculations is wise since it ensures that higher scores mean higher ratings, and your engine selection seems to avoid including two engines by the same author in the same run, also wise. It's interesting to see how the draw percentage and White advantage climb with each doubling of time. It's also interesting that the spread of ratings from bottom to top doesn't vary so much with time; logically the longer times with more draws should have narrower ranges, but it seems that the stronger engines scale better than the weaker ones to compensate for this.

My only suggestion (other than playing more games!) would be regarding hash size. 64 MB is adequate for the fastest two or three levels, but not enough for the longer ones. Maybe 128 would be a better compromise if you don't want to bother with different hash sizes for different levels. But it's not a big deal.
Komodo rules!
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Nunemaker Chess Rating List (NCRL)

Post by cdani »

can00336 wrote:I have started my own rating list that focuses on the time scaling of engines. I don't have much CPU power, but I do what I can with what I've got.

http://chess.nunemaker.net/

Let me know what you think!

Thanks!! It is really very interesting.

The behavior of Andscacs is not a surprise. I have always tried to make sure that it is better at long time controls than at short ones. Anyway, it is very nice to see it reflected; one never knows :-)
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Nunemaker Chess Rating List (NCRL)

Post by Frank Quisinsky »

Hello Chad,

nice to have one more list.
I added the link in my "Comparable" selection.
http://www.amateurschach.de/main/_comparable.htm

I will have a look from time to time.

THANKS!

Best
Frank
can00336
Posts: 24
Joined: Sat May 16, 2015 8:07 am
Location: PA

Re: Nunemaker Chess Rating List (NCRL)

Post by can00336 »

Graham Banks wrote:Nice little project. :)
Thanks! CCRL was one of my inspirations for doing this project.
can00336
Posts: 24
Joined: Sat May 16, 2015 8:07 am
Location: PA

Re: Nunemaker Chess Rating List (NCRL)

Post by can00336 »

cetormenter wrote:Very interesting. Unfortunately you will need a lot of games to be able to come to any conclusions. Something that I would add is an additional graph with a sort of "scaling" factor where the number of points at each time control is normalized to the number of points at 15+0.125. Right now it is too hard to compare one engine to another without doing that math yourself.
I have been slowly trying to complete another rating list that uses the top 100 4-ply chess openings instead of the top 10 2-ply. This should help with the "not enough games" problem.
It just takes forever on a single quad-core. I'm waiting for Broadwell-E 10 core CPUs to come out in the spring to upgrade.

I agree with the "scaling" factor idea, so I have added a table and graph to make this more readable. Thanks!
can00336
Posts: 24
Joined: Sat May 16, 2015 8:07 am
Location: PA

Re: Nunemaker Chess Rating List (NCRL)

Post by can00336 »

lkaufman wrote:Very informative! I think you made many wise decisions in setting up your tests. The 120 to 1 ratio of time to increment is very reasonable, the use of Ordo for the rating calculations is wise since it ensures that higher scores mean higher ratings, and your engine selection seems to avoid including two engines by the same author in the same run, also wise. It's interesting to see how the draw percentage and White advantage climb with each doubling of time. It's also interesting that the spread of ratings from bottom to top doesn't vary so much with time; logically the longer times with more draws should have narrower ranges, but it seems that the stronger engines scale better than the weaker ones to compensate for this.

My only suggestion (other than playing more games!) would be regarding hash size. 64 MB is adequate for the fastest two or three levels, but not enough for the longer ones. Maybe 128 would be a better compromise if you don't want to bother with different hash sizes for different levels. But it's not a big deal.
In setting up my tests, I drew inspiration from many of the currently successful rating lists. I've added them to my "Thanks" page now.

I chose the 120 to 1 ratio since my rating list started as just 60+0.5, then I thought, "Hey, why not do some shorter/longer games too?".
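
For reference, a small sketch of how such a ladder of time controls can be generated, assuming each level doubles the base time and keeps the increment at 1/120 of it. The function name and the choice of five levels (15+0.125 up to 240+2) are assumptions made for illustration.

```python
def time_control_ladder(start_base=15.0, levels=5, ratio=120.0):
    """Return (base_seconds, increment_seconds) pairs, doubling each level."""
    ladder = []
    base = start_base
    for _ in range(levels):
        # The increment is always base / ratio, e.g. 60 / 120 = 0.5.
        ladder.append((base, base / ratio))
        base *= 2
    return ladder

ladder = time_control_ladder()
for base, inc in ladder:
    print(f"{base:g}+{inc:g}")
```

Starting from 15 seconds, this reproduces the 15+0.125, 30+0.25, and 60+0.5 levels mentioned in this thread and continues doubling from there.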

Ever since Ordo went 1.0, I haven't used anything else. It's so powerful and easy to use!

In selecting engines, I tried to avoid engines that did not provide a benefit to correspondence chess players who want a plethora of unique, strong moves suggested by their engines. That's why I'm using Don's similarity tester, not as a clone detector, but as a true move similarity tester. Finding strong, but off-beat, moves using a unique engine helps keep chess exciting!

I expected the draw rate to rise with the time control, but the white advantage increasing seems odd, though I doubt that it is statistically significant. I haven't looked at the spread of ratings, but your explanation seems reasonable.

Since my list started at 60+0.5, a 64MB hash seemed sufficient in my testing. Then I added 15+0.125 and 30+0.25, and it was still fine. It was only later that I added the longer time controls. I wanted all the time controls to use the same hash size to minimize the factors at play when comparing them, but in retrospect 256MB would have been a better choice. Whenever I next upgrade my hardware, I will remake my list with a larger hash size, probably 256MB.
can00336
Posts: 24
Joined: Sat May 16, 2015 8:07 am
Location: PA

Re: Nunemaker Chess Rating List (NCRL)

Post by can00336 »

cdani wrote:Thanks!! It is really very interesting.

The behavior of Andscacs is not a surprise. I have always tried to make sure that it is better at long time controls than at short ones. Anyway, it is very nice to see it reflected; one never knows :-)
I've been testing Andscacs since version 0.80, but I don't remember if earlier versions exhibited similar behavior or not.

240+2 All Games
-----------------------
PLAYER         : RATING
Andscacs 0.84  :   2990
Andscacs 0.82  :   2922
Andscacs 0.81  :   2904
Andscacs 0.80  :   2874
I somehow missed testing Andscacs 0.83, but you have been making amazing progress in my rating list! Keep up the awesome work.
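
To put those numbers in perspective, here is a rough sketch that converts the rating gap between the oldest and newest versions into an expected score. It assumes the standard logistic Elo curve with a 400-point scale; the exact model behind the list's ratings may differ, and `expected_score` is a hypothetical helper name.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Ratings copied from the 240+2 table above.
ratings = {
    "Andscacs 0.84": 2990,
    "Andscacs 0.82": 2922,
    "Andscacs 0.81": 2904,
    "Andscacs 0.80": 2874,
}

newest = ratings["Andscacs 0.84"]
oldest = ratings["Andscacs 0.80"]
gain = newest - oldest
score = expected_score(newest, oldest)

print(f"gain: {gain} Elo, expected score vs 0.80: {score:.3f}")
```

Under that assumption, the 116-point gain from 0.80 to 0.84 corresponds to an expected score of roughly 66% for the newer version in a head-to-head match.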