Re: CCRL 40/15, 40/2 and FRC lists updated (17th October 2020)

Posted: Sun Oct 18, 2020 10:27 pm
by Graham Banks
lkaufman wrote:
Sun Oct 18, 2020 6:38 pm
Jouni wrote:
Sun Oct 18, 2020 2:20 pm
Nice to have SF12 multicore data finally!
So the 40/15 list shows the elo gain from SF11 to SF12 (with NNUE) to be 40 elo on 4 threads or 44 elo on 1 thread. These are very small numbers compared to all other estimates; for example, CEGT 40/20 shows 95 elo. Does anyone have any explanation of why the CCRL gains are so small for SF12? I know that using Bayeselo contracts the gains, but even without that difference the gains would be only around 50. Is this a sign of very bad scaling, or perhaps a function of the choice of opponents, or opening books, or ...?
I've run all of the SF12 1CPU and 4CPU games for 40/15 so far.
The opponents are the other engines in the top twelve crosstables, plus a couple that are just outside.
The 1CPU games are using the Top10SuperGM2020 opening book with 8 move depth, while the 4CPU games are using the TWIC2018 book with 8 move depth.

Re: CCRL 40/15, 40/2 and FRC lists updated (17th October 2020)

Posted: Mon Oct 19, 2020 2:53 am
by lkaufman
Do you know whether the opponents and opening books for the SF11 tests were similar (I don't expect identical)? If so, and observing the 110 elo gap on the CCRL blitz list on 1 cpu (no comparable ratings on more than 1 cpu), then it sure looks like scaling, although I don't think anyone else has reported a scaling problem other than the normal one of getting smaller differences due to more draws at longer time controls. But this seems way beyond that.
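
For a rough sense of scale, the rating gaps quoted in this thread (40, 44, 95 and 110) can be converted into expected match scores. The sketch below assumes the plain logistic Elo model; Bayeselo uses its own model with a draw parameter, so this is only an illustration and not a reproduction of how the CCRL or CEGT lists are computed. The function names are made up for this example.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score of the higher-rated side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score: float) -> float:
    """Inverse mapping: Elo gap implied by a match score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # Gaps mentioned in the thread; the model itself is an assumption.
    for gap in (40, 44, 95, 110):
        print(f"{gap:>4} elo -> expected score {expected_score(gap):.3f}")
```

Under this model a 40 elo gap corresponds to a score of a little under 56%, so the difference between the lists is a matter of a few percentage points of score per game.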

Re: CCRL 40/15, 40/2 and FRC lists updated (17th October 2020)

Posted: Mon Oct 19, 2020 3:14 am
by Graham Banks
The games on the blitz list would have used different opening books.

I guess you're aware of the following thread?
http://talkchess.com/forum3/viewtopic.php?f=2&t=75313

Re: CCRL 40/15, 40/2 and FRC lists updated (17th October 2020)

Posted: Mon Oct 19, 2020 3:22 am
by lkaufman
I was asking whether the SF11 tests on the 40/15 list used similar books and had similar opponents to SF12, not about the books used for blitz. Yes, I recall that post comparing 8 to 1 cpu ratings, but with more games the ratings now make much more sense there.

Re: CCRL 40/15, 40/2 and FRC lists updated (17th October 2020)

Posted: Mon Oct 19, 2020 3:35 am
by Graham Banks
SF11 would have been tested against the top twelve opponents plus a few others that were just outside that, at the time.
If you click on an engine name in our list, you will see a list of opponents and results.

I use a small range of opening books (ones that I trust) and a maximum 8 move depth for all my recent testing.
Whether or not it was exactly the same book, I don't know offhand.

Re: CCRL 40/15, 40/2 and FRC lists updated (17th October 2020)

Posted: Mon Oct 19, 2020 3:49 am
by lkaufman
Okay, it sounds like there were no huge differences in how they were tested, so it does indeed look like a scaling issue. We'll need more data from multiple sources to confirm this; perhaps I'll run some tests myself to try to understand what this means.

Re: CCRL 40/15, 40/2 and FRC lists updated (17th October 2020)

Posted: Mon Oct 19, 2020 10:24 am
by Jouni
TCEC now has a scaling test running with 172 cores. Currently +90 compared to classic.