CCRL 40/15, 40/2 and FRC lists updated (17th October 2020)

Graham Banks
Posts: 41412
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Post by Graham Banks »

lkaufman wrote: Sun Oct 18, 2020 8:38 pm
Jouni wrote: Sun Oct 18, 2020 4:20 pm Nice to have SF12 multicore data finally!
So the 40/15 list shows the elo gain from SF11 to SF12 (with NNUE) to be 40 elo on 4 threads or 44 elo on 1 thread. These are very small numbers compared to all other estimates; for example, CEGT 40/20 shows 95 elo. Does anyone have any explanation of why the CCRL gains are so small for SF12? I know that using Bayeselo contracts the gains, but even without that difference the gains would be only around 50. Is this a sign of very bad scaling, or perhaps a function of the choice of opponents, or opening books, or ...?
I've run all of the SF12 1CPU and 4CPU games for 40/15 so far.
The opponents are the other engines in the top twelve crosstables, plus a couple that are just outside.
The 1CPU games are using the Top10SuperGM2020 opening book with 8 move depth, while the 4CPU games are using the TWIC2018 book with 8 move depth.
gbanksnz at gmail.com
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Post by lkaufman »

Graham Banks wrote: Mon Oct 19, 2020 12:27 am
lkaufman wrote: Sun Oct 18, 2020 8:38 pm
Jouni wrote: Sun Oct 18, 2020 4:20 pm Nice to have SF12 multicore data finally!
So the 40/15 list shows the elo gain from SF11 to SF12 (with NNUE) to be 40 elo on 4 threads or 44 elo on 1 thread. These are very small numbers compared to all other estimates; for example, CEGT 40/20 shows 95 elo. Does anyone have any explanation of why the CCRL gains are so small for SF12? I know that using Bayeselo contracts the gains, but even without that difference the gains would be only around 50. Is this a sign of very bad scaling, or perhaps a function of the choice of opponents, or opening books, or ...?
I've run all of the SF12 1CPU and 4CPU games for 40/15 so far.
The opponents are the other engines in the top twelve crosstables, plus a couple that are just outside.
The 1CPU games are using the Top10SuperGM2020 opening book with 8 move depth, while the 4CPU games are using the TWIC2018 book with 8 move depth.
Do you know whether the opponents and opening books for the SF11 tests were similar (I don't expect identical)? If so, and observing the 110 elo gap on the CCRL blitz list on 1 cpu (no comparable ratings on more than 1 cpu), then it sure looks like scaling, although I don't think anyone else has reported a scaling problem other than the normal one of getting smaller differences due to more draws at longer time controls. But this seems way beyond that.
Komodo rules!
Graham Banks
Posts: 41412
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Post by Graham Banks »

lkaufman wrote: Mon Oct 19, 2020 4:53 am
Graham Banks wrote: Mon Oct 19, 2020 12:27 am
lkaufman wrote: Sun Oct 18, 2020 8:38 pm
Jouni wrote: Sun Oct 18, 2020 4:20 pm Nice to have SF12 multicore data finally!
So the 40/15 list shows the elo gain from SF11 to SF12 (with NNUE) to be 40 elo on 4 threads or 44 elo on 1 thread. These are very small numbers compared to all other estimates; for example, CEGT 40/20 shows 95 elo. Does anyone have any explanation of why the CCRL gains are so small for SF12? I know that using Bayeselo contracts the gains, but even without that difference the gains would be only around 50. Is this a sign of very bad scaling, or perhaps a function of the choice of opponents, or opening books, or ...?
I've run all of the SF12 1CPU and 4CPU games for 40/15 so far.
The opponents are the other engines in the top twelve crosstables, plus a couple that are just outside.
The 1CPU games are using the Top10SuperGM2020 opening book with 8 move depth, while the 4CPU games are using the TWIC2018 book with 8 move depth.
Do you know whether the opponents and opening books for the SF11 tests were similar (I don't expect identical)? If so, and observing the 110 elo gap on the CCRL blitz list on 1 cpu (no comparable ratings on more than 1 cpu), then it sure looks like scaling, although I don't think anyone else has reported a scaling problem other than the normal one of getting smaller differences due to more draws at longer time controls. But this seems way beyond that.
The games on the blitz list would have used different opening books.

I guess you're aware of the following thread?
http://talkchess.com/forum3/viewtopic.php?f=2&t=75313
gbanksnz at gmail.com
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Post by lkaufman »

Graham Banks wrote: Mon Oct 19, 2020 5:14 am
lkaufman wrote: Mon Oct 19, 2020 4:53 am
Graham Banks wrote: Mon Oct 19, 2020 12:27 am
lkaufman wrote: Sun Oct 18, 2020 8:38 pm
Jouni wrote: Sun Oct 18, 2020 4:20 pm Nice to have SF12 multicore data finally!
So the 40/15 list shows the elo gain from SF11 to SF12 (with NNUE) to be 40 elo on 4 threads or 44 elo on 1 thread. These are very small numbers compared to all other estimates; for example, CEGT 40/20 shows 95 elo. Does anyone have any explanation of why the CCRL gains are so small for SF12? I know that using Bayeselo contracts the gains, but even without that difference the gains would be only around 50. Is this a sign of very bad scaling, or perhaps a function of the choice of opponents, or opening books, or ...?
I've run all of the SF12 1CPU and 4CPU games for 40/15 so far.
The opponents are the other engines in the top twelve crosstables, plus a couple that are just outside.
The 1CPU games are using the Top10SuperGM2020 opening book with 8 move depth, while the 4CPU games are using the TWIC2018 book with 8 move depth.
Do you know whether the opponents and opening books for the SF11 tests were similar (I don't expect identical)? If so, and observing the 110 elo gap on the CCRL blitz list on 1 cpu (no comparable ratings on more than 1 cpu), then it sure looks like scaling, although I don't think anyone else has reported a scaling problem other than the normal one of getting smaller differences due to more draws at longer time controls. But this seems way beyond that.
The games on the blitz list would have used different opening books.

I guess you're aware of the following thread?
http://talkchess.com/forum3/viewtopic.php?f=2&t=75313
I was asking whether the SF11 tests on the 40/15 list used similar books and had similar opponents to SF12, not about the books used for blitz. Yes, I recall that post comparing 8 to 1 cpu ratings, but with more games the ratings now make much more sense there.
Komodo rules!
Graham Banks
Posts: 41412
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Post by Graham Banks »

lkaufman wrote: Mon Oct 19, 2020 5:22 am
Graham Banks wrote: Mon Oct 19, 2020 5:14 am
lkaufman wrote: Mon Oct 19, 2020 4:53 am
Graham Banks wrote: Mon Oct 19, 2020 12:27 am
lkaufman wrote: Sun Oct 18, 2020 8:38 pm
Jouni wrote: Sun Oct 18, 2020 4:20 pm Nice to have SF12 multicore data finally!
So the 40/15 list shows the elo gain from SF11 to SF12 (with NNUE) to be 40 elo on 4 threads or 44 elo on 1 thread. These are very small numbers compared to all other estimates; for example, CEGT 40/20 shows 95 elo. Does anyone have any explanation of why the CCRL gains are so small for SF12? I know that using Bayeselo contracts the gains, but even without that difference the gains would be only around 50. Is this a sign of very bad scaling, or perhaps a function of the choice of opponents, or opening books, or ...?
I've run all of the SF12 1CPU and 4CPU games for 40/15 so far.
The opponents are the other engines in the top twelve crosstables, plus a couple that are just outside.
The 1CPU games are using the Top10SuperGM2020 opening book with 8 move depth, while the 4CPU games are using the TWIC2018 book with 8 move depth.
Do you know whether the opponents and opening books for the SF11 tests were similar (I don't expect identical)? If so, and observing the 110 elo gap on the CCRL blitz list on 1 cpu (no comparable ratings on more than 1 cpu), then it sure looks like scaling, although I don't think anyone else has reported a scaling problem other than the normal one of getting smaller differences due to more draws at longer time controls. But this seems way beyond that.
The games on the blitz list would have used different opening books.

I guess you're aware of the following thread?
http://talkchess.com/forum3/viewtopic.php?f=2&t=75313
I was asking whether the SF11 tests on the 40/15 list used similar books and had similar opponents to SF12, not about the books used for blitz. Yes, I recall that post comparing 8 to 1 cpu ratings, but with more games the ratings now make much more sense there.
SF11 would have been tested against the top twelve opponents plus a few others that were just outside that, at the time.
If you click on an engine name in our list, you will see a list of opponents and results.

I use a small range of opening books (ones that I trust) and a maximum 8 move depth for all my recent testing.
Whether or not it was exactly the same book, I don't know offhand.
gbanksnz at gmail.com
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Post by lkaufman »

Graham Banks wrote: Mon Oct 19, 2020 5:35 am
lkaufman wrote: Mon Oct 19, 2020 5:22 am
Graham Banks wrote: Mon Oct 19, 2020 5:14 am
lkaufman wrote: Mon Oct 19, 2020 4:53 am
Graham Banks wrote: Mon Oct 19, 2020 12:27 am
lkaufman wrote: Sun Oct 18, 2020 8:38 pm
Jouni wrote: Sun Oct 18, 2020 4:20 pm Nice to have SF12 multicore data finally!
So the 40/15 list shows the elo gain from SF11 to SF12 (with NNUE) to be 40 elo on 4 threads or 44 elo on 1 thread. These are very small numbers compared to all other estimates; for example, CEGT 40/20 shows 95 elo. Does anyone have any explanation of why the CCRL gains are so small for SF12? I know that using Bayeselo contracts the gains, but even without that difference the gains would be only around 50. Is this a sign of very bad scaling, or perhaps a function of the choice of opponents, or opening books, or ...?
I've run all of the SF12 1CPU and 4CPU games for 40/15 so far.
The opponents are the other engines in the top twelve crosstables, plus a couple that are just outside.
The 1CPU games are using the Top10SuperGM2020 opening book with 8 move depth, while the 4CPU games are using the TWIC2018 book with 8 move depth.
Do you know whether the opponents and opening books for the SF11 tests were similar (I don't expect identical)? If so, and observing the 110 elo gap on the CCRL blitz list on 1 cpu (no comparable ratings on more than 1 cpu), then it sure looks like scaling, although I don't think anyone else has reported a scaling problem other than the normal one of getting smaller differences due to more draws at longer time controls. But this seems way beyond that.
The games on the blitz list would have used different opening books.

I guess you're aware of the following thread?
http://talkchess.com/forum3/viewtopic.php?f=2&t=75313
I was asking whether the SF11 tests on the 40/15 list used similar books and had similar opponents to SF12, not about the books used for blitz. Yes, I recall that post comparing 8 to 1 cpu ratings, but with more games the ratings now make much more sense there.
SF11 would have been tested against the top twelve opponents plus a few others that were just outside that, at the time.
If you click on an engine name in our list, you will see a list of opponents and results.

I use a small range of opening books (ones that I trust) and a maximum 8 move depth for all my recent testing.
Whether or not it was exactly the same book, I don't know offhand.
Okay, it sounds like there were no huge differences in how they were tested, so it does indeed look like a scaling issue. We'll need more data from multiple sources to confirm this; perhaps I'll run some tests myself to try to understand what this means.
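
For anyone wanting to run such a test, here is a rough sketch of how a match result could be turned into an elo difference with an approximate error margin. It assumes the plain logistic elo model and a normal approximation for the error bars; it is not Bayeselo and not necessarily what CCRL or CEGT use. The function names and the win/draw/loss counts are made up for illustration only.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_elo(wins, draws, losses):
    """Return (elo, low, high): point estimate and a rough 95% interval."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    # Clamp so the logistic formula stays defined for lopsided results.
    eps = 1e-6
    low = min(max(score - 1.96 * se, eps), 1.0 - eps)
    high = min(max(score + 1.96 * se, eps), 1.0 - eps)
    return elo_diff(score), elo_diff(low), elo_diff(high)

# Hypothetical results: same 3:1 ratio of wins to losses, different draw rates.
few_draws = match_elo(45, 40, 15)
many_draws = match_elo(15, 80, 5)
print("40%% draws: %+.0f elo (%+.0f to %+.0f)" % few_draws)
print("80%% draws: %+.0f elo (%+.0f to %+.0f)" % many_draws)
```

With the same ratio of wins to losses, the result with more draws gives a smaller elo difference, which is the normal contraction at longer time controls mentioned above. Anything beyond that would need enough games for the error bars to separate.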
Komodo rules!
Jouni
Posts: 3278
Joined: Wed Mar 08, 2006 8:15 pm

Post by Jouni »

TCEC now has a scaling test running with 172 cores. Currently +90 to classic.
Jouni