The Stockfish ELO problem

Modern Times
Posts: 3781
Joined: Thu Jun 07, 2012 11:02 pm

Re: The Stockfish ELO problem

Post by Modern Times »

Rebel wrote: Sun Aug 07, 2022 2:29 pm Not or, but and.
Yes, some of both.
Sopel
Posts: 391
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: The Stockfish ELO problem

Post by Sopel »

Rebel wrote: Sat Aug 06, 2022 10:04 pm Some remarks
1. Komodo scales extremely well (+56,+61,+59).
2. SF15 went down from +82 to +26 (last SF run, equals 40m/26m about CCRL 40/15 | CEGT 40/20 at 2CPU).
3. SF13 went up from -100 to -19.
4. Draw rate last SF run 91.7% but SF15 never lost a game.
That's very good news. An oracle engine doesn't scale, so Stockfish is closer to a perfect engine than Komodo is.
dangi12012 wrote:No one wants to touch anything you have posted. That proves you now have negative reputations since everyone knows already you are a forum troll.

Maybe you copied your stockfish commits from someone else too?
I will look into that.
smatovic
Posts: 3471
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: The Stockfish ELO problem

Post by smatovic »

Modern Times wrote: Sun Aug 07, 2022 12:44 pm Yes, there is another way of looking at it - either

- Stockfish scales badly, or
- Stockfish is incredibly good on low core counts and short time controls
Sopel wrote: Sun Aug 07, 2022 2:49 pm That's very good news. An oracle engine doesn't scale, so Stockfish is closer to a perfect engine compared to komodo.
+1

--
Srdja
Werewolf
Posts: 2058
Joined: Thu Sep 18, 2008 10:24 pm

Re: The Stockfish ELO problem

Post by Werewolf »

Could this work as a test method?

Compile a list of very hard, but solvable, test positions.

Compare the time to solve for different engines on 1, 2, 4, 8, 16, 32, and 64 cores.
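
A minimal sketch of how such a comparison could be scored, assuming each engine's time to solve has already been recorded per position at every core count. The function name `speedups` and all timings below are hypothetical placeholders, not measurements. Multi-threaded search is generally not deterministic, so in practice each timing would probably need to be averaged over several runs.

```python
from statistics import geometric_mean


def speedups(solve_times):
    """Map core count -> mean speedup relative to the 1-core run.

    solve_times: {cores: [seconds per test position]}, with the same
    positions in the same order for every core count.
    """
    base = solve_times[1]
    result = {}
    for cores, times in sorted(solve_times.items()):
        # Per-position ratio of 1-core time to N-core time.
        ratios = [b / t for b, t in zip(base, times)]
        # Geometric mean keeps one very slow position from dominating.
        result[cores] = geometric_mean(ratios)
    return result


# Hypothetical timings (seconds) for three positions; not real data.
engine_a = {
    1: [40.0, 90.0, 10.0],
    2: [20.0, 45.0, 5.0],
    4: [10.0, 30.0, 2.5],
}

for cores, s in speedups(engine_a).items():
    print(f"{cores} cores: speedup {s:.2f}, efficiency {s / cores:.2f}")
```

Running the same function on a second engine's timings would give two speedup curves to compare. That measures scaling on these positions only, not playing strength.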
MonteCarlo
Posts: 188
Joined: Sun Dec 25, 2016 4:59 pm

Re: The Stockfish ELO problem

Post by MonteCarlo »

Yeah, everything I've seen so far is consistent with the hypothesis that SF just starts out stronger rather than that something is amiss (as has been mentioned already, an imaginary perfect engine wouldn't scale at all).

If K really did just scale that much better than SF, then one would expect that there's some level with sufficiently high core count and slow TC at which K starts outscoring SF (rather than just approaching it as the draw rate increases), and I've not seen this yet.

That's not to say that SF couldn't improve in this regard; I just haven't seen compelling evidence that this is an SF problem rather than a "nature of very high level chess" problem.

Cheers!
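
To illustrate the point about the gap shrinking as the draw rate increases, here is a small sketch using the standard logistic Elo formula. The helper name `elo_diff` and the 1000-game match sizes are assumptions for illustration only. The 91.7% draw rate is taken from Rebel's remark 4, and the other rates are made up.

```python
import math


def elo_diff(wins, draws, losses):
    """Elo difference implied by a match score under the logistic model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical 1000-game matches in which the stronger side never loses.
for draws in (500, 800, 917, 990):
    wins = 1000 - draws
    print(f"draw rate {draws / 10:.1f}%: {elo_diff(wins, draws, 0):+.1f} Elo")
```

Under this model an engine that never loses still shows a shrinking Elo lead as more games end in draws. A falling rating gap alone therefore does not show which engine scales better.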
Rebel
Posts: 7430
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: The Stockfish ELO problem

Post by Rebel »

Sopel wrote: Sun Aug 07, 2022 2:49 pm
Rebel wrote: Sat Aug 06, 2022 10:04 pm Some remarks
1. Komodo scales extremely well (+56,+61,+59).
2. SF15 went down from +82 to +26 (last SF run, equals 40m/26m about CCRL 40/15 | CEGT 40/20 at 2CPU).
3. SF13 went up from -100 to -19.
4. Draw rate last SF run 91.7% but SF15 never lost a game.
That's very good news. An oracle engine doesn't scale, so Stockfish is closer to a perfect engine compared to komodo.
:D

Meaning at increasing time control and more threads Komodo can catch up and overtake you? Oh wait, it already happened :wink:
90% of coding is debugging, the other 10% is writing bugs.
Sopel
Posts: 391
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: The Stockfish ELO problem

Post by Sopel »

Rebel wrote: Sun Aug 07, 2022 8:21 pm
Sopel wrote: Sun Aug 07, 2022 2:49 pm
Rebel wrote: Sat Aug 06, 2022 10:04 pm Some remarks
1. Komodo scales extremely well (+56,+61,+59).
2. SF15 went down from +82 to +26 (last SF run, equals 40m/26m about CCRL 40/15 | CEGT 40/20 at 2CPU).
3. SF13 went up from -100 to -19.
4. Draw rate last SF run 91.7% but SF15 never lost a game.
That's very good news. An oracle engine doesn't scale, so Stockfish is closer to a perfect engine compared to komodo.
:D

Meaning at increasing time control and more threads Komodo can catch up and overtake you? Oh wait, it already happened :wink:
You can come up with any result with flawed enough methodology. This has the same issues as CCRL.
Rebel
Posts: 7430
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: The Stockfish ELO problem

Post by Rebel »

Sopel wrote: Sun Aug 07, 2022 9:43 pm
Rebel wrote: Sun Aug 07, 2022 8:21 pm
Sopel wrote: Sun Aug 07, 2022 2:49 pm
Rebel wrote: Sat Aug 06, 2022 10:04 pm Some remarks
1. Komodo scales extremely well (+56,+61,+59).
2. SF15 went down from +82 to +26 (last SF run, equals 40m/26m about CCRL 40/15 | CEGT 40/20 at 2CPU).
3. SF13 went up from -100 to -19.
4. Draw rate last SF run 91.7% but SF15 never lost a game.
That's very good news. An oracle engine doesn't scale, so Stockfish is closer to a perfect engine compared to komodo.
:D

Meaning at increasing time control and more threads Komodo can catch up and overtake you? Oh wait, it already happened :wink:
You can come up at any result with flawed enough methodology. This has the same issues as CCRL.
Calling something "flawed" without describing what is flawed is empty rhetoric. Instead (I think) it would be wise to put some energy into testing at long time controls with many cores. You have the hardware for it.
RubiChess
Posts: 650
Joined: Fri Mar 30, 2018 7:20 am
Full name: Andreas Matthies

Re: The Stockfish ELO problem

Post by RubiChess »

Sopel wrote: Sun Aug 07, 2022 9:43 pm
Rebel wrote: Sun Aug 07, 2022 8:21 pm
Sopel wrote: Sun Aug 07, 2022 2:49 pm
Rebel wrote: Sat Aug 06, 2022 10:04 pm Some remarks
1. Komodo scales extremely well (+56,+61,+59).
2. SF15 went down from +82 to +26 (last SF run, equals 40m/26m about CCRL 40/15 | CEGT 40/20 at 2CPU).
3. SF13 went up from -100 to -19.
4. Draw rate last SF run 91.7% but SF15 never lost a game.
That's very good news. An oracle engine doesn't scale, so Stockfish is closer to a perfect engine compared to komodo.
:D

Meaning at increasing time control and more threads Komodo can catch up and overtake you? Oh wait, it already happened :wink:
You can come up at any result with flawed enough methodology. This has the same issues as CCRL.
The main issue with this rating list seems to be that SF15 with 4 threads wasn't tested, only 14.1. At least SF15/4CPU is not mentioned in http://www.cegt.net/40_40%20Rating%20Li ... liste.html

But as this list also uses moves/time control, I want to mention this https://github.com/official-stockfish/S ... ssues/4000 again.

Regards, Andreas
Wolfgang
Posts: 989
Joined: Sat May 13, 2006 1:08 am

Re: The Stockfish ELO problem

Post by Wolfgang »

RubiChess wrote: Sun Aug 07, 2022 10:02 pm ...
The main issue in this rating list seems that SF15/4threads wasn't tested, only 14.1. At least SF15/4CPU is not mentioned in http://www.cegt.net/40_40%20Rating%20Li ... liste.html
...
The reason is that our main "40/20-4CPU" tester stopped testing.
It will be done, but it will take some time.
Best
Wolfgang
CEGT-Team
www.cegt.net
www.cegt.forumieren.com