Berserk 11 released. Number 2 free engine!?
Moderator: Ras
-
jhonnold
- Posts: 122
- Joined: Wed Feb 17, 2021 3:16 pm
- Full name: Jay Honnold
Re: Berserk 11 released. Number 2 free engine!?
This was a sanity check to make sure I didn't severely hurt incremental time controls. I'm sure this would have finished at 0.
-
Uri Blass
- Posts: 11108
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Berserk 11 released. Number 2 free engine!?
I do not see a gain based on the result, only no regression.
I expect in most cases more than 50% in an SPRT test with these bounds.
1.78 +- 5.79 is simply wrong.
These bounds may be correct with a fixed number of games, and I do not understand why people give these bounds with SPRT tests.
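For context, an "Elo +- error" figure of this kind is usually computed as if the number of games had been fixed in advance. Below is a minimal sketch of that fixed-sample calculation, assuming the standard logistic Elo model and a normal approximation. The function names and the win/draw/loss counts are hypothetical and are not taken from the test discussed here. Because an SPRT stops at a data-dependent number of games, an interval computed this way does not keep its nominal coverage, which is the objection above.

```python
import math


def elo_from_score(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def naive_elo_interval(wins, draws, losses, z=1.96):
    """Fixed-sample Elo estimate with a normal-approximation interval.

    Only meaningful when the number of games was chosen before the test started.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    var = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    sem = math.sqrt(var / n)  # standard error of the mean score
    return (
        elo_from_score(score),
        elo_from_score(score - z * sem),
        elo_from_score(score + z * sem),
    )


# Hypothetical counts, for illustration only.
elo, low, high = naive_elo_interval(wins=1000, draws=2000, losses=1000)
print(f"Elo {elo:+.2f}, naive 95% interval [{low:+.2f}, {high:+.2f}]")
```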
-
lkaufman
- Posts: 6279
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Berserk 11 released. Number 2 free engine!?
Wolfgang wrote: ↑Tue Feb 21, 2023 2:23 pm
Berserk 11.1 with a major bug-fix for anyone testing without an increment or in cyclical TCs.
https://github.com/jhonnold/berserk/releases/tag/11.1
Fortunately we currently test with 4'+2" instead of 40/3 or 40/4 repeated as we did in the past...

Are you saying that CEGT has switched from 40/3 (or 4) to 4' +2" for all blitz testing? I don't see any mention of this on the website. If so, have you also switched to increment testing to replace 40/20? I can't think of any reason to use increment for blitz but not for Rapid, although I believe CCRL does this (no idea why). Also, doesn't this make the 3' + 1" and 5' + 3" lists rather redundant? I know that they use ponder, but that does not seem to make any meaningful difference when comparing engines, as long as they all have a ponder option.
Komodo rules!
-
Werner
- Posts: 3009
- Joined: Wed Mar 08, 2006 10:09 pm
- Location: Germany
- Full name: Werner Schüle
Re: Berserk 11 released. Number 2 free engine!?
At the moment I run all the games for our 40/20 list, and I have not changed to increment testing.
And I have not started testing Berserk 11, so no problems here.
Last edited by Werner on Tue Feb 21, 2023 6:24 pm, edited 2 times in total.
Werner
-
Graham Banks
- Posts: 45043
- Joined: Sun Feb 26, 2006 10:52 am
- Location: Auckland, NZ
Re: Berserk 11 released. Number 2 free engine!?
lkaufman wrote: ↑Tue Feb 21, 2023 5:08 pm
Are you saying that CEGT has switched from 40/3 (or 4) to 4' +2" for all blitz testing? I don't see any mention of this on the website. If so, have you also switched to increment testing to replace 40/20? I can't think of any reason to use increment for blitz but not for Rapid, although I believe CCRL does this (no idea why). Also, doesn't this make the 3' + 1" and 5' + 3" lists rather redundant? I know that they use ponder, but that does not seem to make any meaningful difference when comparing engines, as long as they all have a ponder option.

I test with repeating time controls because many older engines don't understand increments, and in order to maintain a meaningful rating list, there must be correlation between newer and older engines.
Also, I prefer repeating time controls for consistency of quality through the opening, middlegame and endgame phases.
gbanksnz at gmail.com
-
lkaufman
- Posts: 6279
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Berserk 11 released. Number 2 free engine!?
Graham Banks wrote: ↑Tue Feb 21, 2023 6:23 pm
I test with repeating time controls because many older engines don't understand increments, and in order to maintain a meaningful rating list, there must be correlation between newer and older engines.
Also, I prefer repeating time controls for consistency of quality through the opening, middlegame and endgame phases.

I thought that CCRL had totally switched blitz testing from 40/x to increment, is that wrong? Perhaps you just don't do any blitz testing. The reasons you mention don't seem to have anything to do with blitz vs Rapid, so I don't see any logical reason for running blitz with increment but Rapid with 40/x. Am I missing something?
Regarding your last point, in general the engines search much deeper in the endgame in the same amount of time, so by using increment you are actually making the search depths more consistent in the endgame than with 40/x. Human GMs know that it usually is best to take much more time per move in the middle game than in the endgame (if the TC allows for this), and engine tests show the same thing. It is very clear that you get a much higher average quality of search with increment than with 40/x given an equal average total game time.
But anyway, this is all the same for blitz and rapid, my main question is why they follow different policies? Or is it just up to the individual tester?
Komodo rules!
-
Uri Blass
- Posts: 11108
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Berserk 11 released. Number 2 free engine!?
CCRL use increment only for blitz
Graham Banks does not test at blitz based on my understanding.
From the CCRL page
1)2'+1" is our fast "blitz" time control as from January 2020, with previous games played at 40/2 repeating time control.
CCRL have the right to do what they want, but it makes the list less reliable, because as I understand it the same engine can have games at different time controls.
An engine's rating may differ between types of time control, because some engines may have bad time management for one of the types.
-
Uri Blass
- Posts: 11108
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Berserk 11 released. Number 2 free engine!?
I also believe that you get a better quality of search with increment than with 40/x, but I think it would be clearer if somebody made a list with different types of time control, giving both the average time per game and the rating for every engine, so people can see that you get a higher rating with a similar average time per game with increment (it may be interesting to see the rating difference).
There are interfaces that allow testing with a different type of time control for different engines.
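To make "similar average time per game" concrete, here is a small sketch comparing the clock time available to one side under an increment control and under a repeating control. The helper names and the 60-move game length are assumptions chosen for illustration. The figures are time budgets, not time actually used, and the increment is assumed to be credited once per move played.

```python
import math


def increment_budget(base_s, inc_s, moves):
    """Seconds available to one side for `moves` moves at base + increment."""
    return base_s + inc_s * moves


def repeating_budget(period_moves, period_s, moves):
    """Seconds available to one side at a repeating control such as 40/4."""
    periods = math.ceil(moves / period_moves)  # periods started so far
    return periods * period_s


moves = 60  # hypothetical game length
print("4'+2\":", increment_budget(4 * 60, 2, moves), "s")
print("40/4 :", repeating_budget(40, 4 * 60, moves), "s")
print("40/3 :", repeating_budget(40, 3 * 60, moves), "s")
```

Running this for a range of game lengths shows where two controls give a comparable budget, which is the precondition for the rating comparison suggested above.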
-
lkaufman
- Posts: 6279
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Berserk 11 released. Number 2 free engine!?
Uri Blass wrote: ↑Tue Feb 21, 2023 8:27 pm
I also believe that you get a better quality of search with increment than with 40/x, but I think it would be clearer if somebody made a list with different types of time control, giving both the average time per game and the rating for every engine, so people can see that you get a higher rating with a similar average time per game with increment (it may be interesting to see the rating difference).
There are interfaces that allow testing with a different type of time control for different engines.

We did such tests with Komodo a couple years ago, and it was very clear that increment beat repeating TC with equal average time per game. I don't have the results anymore.
Komodo rules!
-
lkaufman
- Posts: 6279
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Berserk 11 released. Number 2 free engine!?
Uri Blass wrote: ↑Tue Feb 21, 2023 8:15 pm
CCRL use increment only for blitz
Graham Banks does not test at blitz based on my understanding.
From the CCRL page
1)2'+1" is our fast "blitz" time control as from January 2020, with previous games played at 40/2 repeating time control.
CCRL have the right to do what they want, but it makes the list less reliable, because as I understand it the same engine can have games at different time controls.
An engine's rating may differ between types of time control, because some engines may have bad time management for one of the types.

I know they do this, but I have not seen any explanation or justification for still using repeating TC for Rapid 3 years after making the switch for blitz. I know the arguments for both methods, but they seem unrelated to Rapid vs blitz.
Komodo rules!