From the CCRL 40/15 rating list
https://computerchess.org.uk/ccrl/4040/ ... t_all.html
Movei 00.8.438 (10 10 10) 2661 +9 −9 49.4% +0.1 36.5% 3082
Colossus 2008b 2638 +11 −11 49.2% +3.5 38.6% 2188
2661-9=2652>2638+11=2649, so the conclusion is that Movei is stronger than Colossus 2008b at 40/15.
From the CCRL 2+1 blitz rating list
https://computerchess.org.uk/ccrl/404/r ... t_all.html
Colossus 2008b 2632 +27 −27 50.2% −2.9 27.6% 428
Movei 00.8.438 (10 10 10) 2567 +30 −30 44.8% +41.4 26.3% 335
2632-27=2605>2567+30=2597, so the conclusion is that Colossus 2008b is stronger than Movei at 2+1.
I wonder how many pairs you can find and if somebody can find all the pairs.
Unfortunately, CCRL does not even test the same engines at blitz and at long time control. Looking at the top, I can see 8 CPU versions leading the blitz list while 4 CPU versions lead the long time control list, so finding the common engines manually is not an easy task.
Can somebody write a program that lists only the engines that appear in both the CCRL blitz list and the CCRL long time control list, and then finds all the pairs of engines where A is stronger than B at long time control and weaker than B at blitz?
I wonder how many pairs you can find.
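A minimal sketch of such a program, assuming the two lists have already been read into dictionaries that map an engine name to its rating and error margins. Downloading and parsing the CCRL pages is left out because the page format is not described here. The names `Entry`, `clearly_stronger` and `reversed_pairs` are made up for this illustration, and "stronger" is taken to mean that the lower bound of one engine is above the upper bound of the other, as in the Movei and Colossus comparison above.

```python
from itertools import combinations
from typing import NamedTuple


class Entry(NamedTuple):
    rating: float
    plus: float   # upper error margin as printed in the list
    minus: float  # lower error margin as printed in the list


def clearly_stronger(a: Entry, b: Entry) -> bool:
    """True when a's lower bound is above b's upper bound."""
    return a.rating - a.minus > b.rating + b.plus


def reversed_pairs(ltc: dict[str, Entry], blitz: dict[str, Entry]) -> list[tuple[str, str]]:
    """Return (A, B) where A clearly beats B at LTC and B clearly beats A at blitz."""
    # Only engines whose names appear in both lists can be compared.
    common = sorted(ltc.keys() & blitz.keys())
    pairs = []
    for x, y in combinations(common, 2):
        # Try both orientations of every unordered pair.
        for a, b in ((x, y), (y, x)):
            if clearly_stronger(ltc[a], ltc[b]) and clearly_stronger(blitz[b], blitz[a]):
                pairs.append((a, b))
    return pairs


# Rows copied by hand from the two lists quoted above.
ltc = {
    "Movei 00.8.438 (10 10 10)": Entry(2661, 9, 9),
    "Colossus 2008b": Entry(2638, 11, 11),
}
blitz = {
    "Colossus 2008b": Entry(2632, 27, 27),
    "Movei 00.8.438 (10 10 10)": Entry(2567, 30, 30),
}
print(reversed_pairs(ltc, blitz))
```

With these four rows the only pair reported is Movei ahead of Colossus 2008b at long time control and behind it at blitz. Matching engines by exact name is the simplest choice. Versions listed under slightly different names in the two lists would need extra normalisation.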
Note that Movei (my own engine) was not designed to do better at long time control when I stopped its development many years ago, so I was surprised by the results.
I tested Movei against significantly stronger engines in the past and found that it needs a bigger time handicap to get 50% at long time control, so I am sure that a weak engine built on a strong engine that was made 100 times slower would do relatively better at long time control, but weak engines usually do not work that way.
Examples for engines that are relatively better at LTC
Re: Examples for engines that are relatively better at LTC
Colossus 2024a 64-bit has many more games running, so we'll see if that has any effect. It may not, but we'll find out soon enough.
The 40/15 and Blitz ratings are calculated separately, not from the same database.
gbanksnz at gmail.com
Re: Examples for engines that are relatively better at LTC
Graham Banks wrote: Thu Jan 09, 2025 11:04 am
Colossus 2024a 64-bit has many more games running, so we'll see if that has any effect. It may not, but we'll find out soon enough.
The 40/15 and Blitz ratings are calculated separately, not from the same database.
I compared in the past with an old version of Colossus.
Note that I did not compare ratings across the different lists, only the order of engines within each list. There are not many cases where A is stronger than B in blitz with a high level of confidence while the opposite holds at long time control with a high level of confidence. By a high level of confidence I mean that a-b>c+d, where a is the higher rating, b is the lower rating, and c and d are the error margins given in the list.
For Colossus, of course a new version is better than the old one I compared, but it is not clear if there is an improvement from 2021b to 2024a.
I see that in blitz 2021b seems stronger:
Colossus 2021b 64-bit 2801 +14 −14 48.7% +9.6 30.0% 1609
Colossus 2024a 64-bit 2767 +19 −19 54.7% −39.5 20.0% 893
2801-14=2787>2767+19=2786
At long time control there are not enough games yet, and we even have 2771-30=2741<2759, so it is not clear if 2024a is stronger.
Colossus 2024a 64-bit 2771 +30 −30 45.4% +37.5 28.7% 296
Colossus 2021b 64-bit 2759 +16 −16 49.3% +6.6 33.7% 1012
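The same check can be written out as a few lines of Python, using only the four Colossus rows quoted in this post. This is a sketch of the a-b>c+d rule and nothing more. The function name `clearly_stronger` and the tuple layout are made up for the illustration.

```python
def clearly_stronger(a, b):
    """a and b are (rating, plus, minus); True if a's lower bound exceeds b's upper bound."""
    return a[0] - a[2] > b[0] + b[1]


blitz_2021b = (2801, 14, 14)
blitz_2024a = (2767, 19, 19)
ltc_2024a = (2771, 30, 30)
ltc_2021b = (2759, 16, 16)

print(clearly_stronger(blitz_2021b, blitz_2024a))  # 2787 > 2786 -> True
print(clearly_stronger(ltc_2024a, ltc_2021b))      # 2741 > 2775 -> False
print(clearly_stronger(ltc_2021b, ltc_2024a))      # 2743 > 2801 -> False
```

So the blitz order passes the test by a single point, and at long time control neither version passes it against the other.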
Re: Examples for engines that are relatively better at LTC
Colossus 2021b is clearly stronger than Bright 0.5c or Bright 0.4a at blitz
Colossus 2021b 64-bit 2801 +14 −14 48.7% +9.6 30.0% 1609
Bright 0.5c 2749 +17 −17 45.6% +34.8 26.1% 1158
Bright 0.4a 2742 +17 −17 49.5% +3.6 26.3% 1145
Colossus 2021b 64-bit>=2801-14=2787
Bright 0.5c<=2749+17=2766
If I look at long time control then it seems to be the opposite
Bright 0.5c 2786 +10 −10 46.7% +22.8 40.9% 2225
Bright 0.4a 2783 +8 −8 46.6% +25.3 38.4% 3668
Colossus 2021b 64-bit 2759 +16 −16 49.3% +6.6 33.7% 1012
Bright 0.5c>=2786-10=2776
Colossus 2021b 64-bit<=2759+16=2775
Re: Examples for engines that are relatively better at LTC
Another example that is maybe better
blitz
Colossus 2021b 64-bit 2801 +14 −14 48.7% +9.6 30.0% 1609
Coiled 1.2 (no NNUE) 64-bit 2728 +15 −15 50.8% −7.1 23.1% 1464
Coiled 1.1 (no NNUE) 64-bit 2727 +14 −14 53.6% −26.9 26.7% 1690
Colossus 2021b 64-bit>=2787>2741>=Coiled 1.1 (no NNUE) 64-bit
long time control
Coiled 1.1 (no NNUE) 64-bit 2796 +14 −14 49.8% +1.9 33.7% 1303
Coiled 1.2 (no NNUE) 64-bit 2783 +14 −14 48.3% +15.1 32.0% 1321
Colossus 2021b 64-bit 2759 +16 −16 49.3% +6.6 33.7% 1012
Coiled 1.1 (no NNUE) 64-bit>=2782>2775>=Colossus 2021b 64-bit
Re: Examples for engines that are relatively better at LTC
I'd suggest you run some tightly controlled tests to verify what you are seeing. I'd be wary of drawing conclusions from those lists directly: as Graham says, they are different rating pools, and I'd add that there is also a lot of noise in the lists, probably with different books used, etc.
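For such a head-to-head test, a rough way to turn a match result into an Elo difference with an error margin is sketched below. It assumes the usual logistic Elo formula and a normal approximation for the score, which is not necessarily how the rating lists compute their margins. The function names `to_elo` and `elo_interval` and the sample game counts are made up for the illustration.

```python
import math


def to_elo(score: float) -> float:
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    score = min(max(score, 1e-9), 1 - 1e-9)  # keep the logarithm finite
    return -400 * math.log10(1 / score - 1)


def elo_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) for one engine's result against another."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result around the mean score.
    var = (wins * (1 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0 - score) ** 2) / n
    margin = z * math.sqrt(var / n)
    return to_elo(score), to_elo(score - margin), to_elo(score + margin)


# Hypothetical result, the same at both time controls, just to show the call.
print(elo_interval(200, 100, 100))
```

Running the same pair of engines with the same book and hardware at both time controls, and comparing the two intervals, would show whether the reversal survives outside the rating lists.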