Looking at the CCRL and CEGT rating lists I started to wonder why there is so little progress from version 13 to 15 and decided to do a scaling comparison between Stockfish and Komodo.
Some remarks
1. Komodo scales extremely well (+56,+61,+59).
2. SF15 went down from +82 to +26 (last SF run, equals 40m/26m about CCRL 40/15 | CEGT 40/20 at 2CPU).
3. SF13 went up from -100 to -19.
4. The draw rate in the last SF run was 91.7%, but SF15 never lost a game.
Advice for the SF team: work on scaling.
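As a rough sanity check on remark 4, here is a minimal sketch that converts a win/draw/loss split into an Elo difference using the standard logistic Elo formula. It assumes every non-drawn game in that run was an SF15 win (which is what "never lost" implies); the function names are only illustrative.

```python
import math


def score_from_wdl(wins: float, draws: float, losses: float) -> float:
    """Score fraction from win/draw/loss counts (or rates)."""
    total = wins + draws + losses
    return (wins + 0.5 * draws) / total


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Assumption: 91.7% draws, no losses, so the remaining 8.3% are wins.
draw_rate = 0.917
score = score_from_wdl(wins=1.0 - draw_rate, draws=draw_rate, losses=0.0)
print(f"score = {score:.4f}, implied Elo difference = {elo_diff(score):+.1f}")
```

With these inputs the implied difference lands in the high twenties, which is in the same range as the +26 reported for the last run.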
90% of coding is debugging, the other 10% is writing bugs.
If I'm not mistaken, the list here: https://tcec-chess.com/bayeselo.txt
contains various 32-thread, node-limited SF entries.
SF seems to scale well with node limits (and thus with TC).
I think this analysis is bunk because I don't trust the samples from CCRL and CEGT, due to repeat engines like Fat Fritz, as well as other "repeat" engines that people argue are not "repeat" engines because they are clueless. If you want to compare scaling against a pool of opponents, do exactly that. Get the same opponents. Run the same games, same openings, same machines.
Talkchess is dead without moderation. If you want my attention, contact me via andrew@grantnet.us
AndrewGrant wrote: ↑Sun Aug 07, 2022 12:43 am
I think this analysis is bunk because I don't trust the samples from CCRL and CEGT, due to repeat engines like Fat Fritz, as well as other "repeat" engines that people argue are not "repeat" engines because they are clueless. If you want to compare scaling against a pool of opponents, do exactly that. Get the same opponents. Run the same games, same openings, same machines.
I think you are making an argument that the engines are “repeat” on ethical grounds, not statistical grounds. Maybe Ed can help you out with similarity measures?
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
I'm playing Stockfish 060122 against the engines in the top 12 crosstable that it hasn't played yet.
Hopefully, it will then fall behind Stockfish 15 on the 40/15 list.
All of the games on the top 12 crosstable have been run by me on my 5950x, except for a tiny handful that were run in my Amateur series on my i7-4770k.
I do mean statistical. If you test SF/Komodo against the pool of { Stockfish, Komodo, Houdini, Sugar, Shashchess, Fat Fritz II, Fire, Ethereal, Leela, Berserk, Koivisto }, that pool is heavily skewed towards Stockfish-based engines. That means any result could come from a particular ability or inability to play against Stockfish, which is a competing hypothesis for the results seen.
I don't mean ethical, which is why I posted a list of engines above and a reader can make their own determination how much of the pool is Stockfish.
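A minimal sketch of that determination, for anyone who wants to put a number on it. The True/False tags below are an assumption for illustration only (Sugar, Shashchess and Fat Fritz II are marked as Stockfish-based, everything else as independent); flip any entry you disagree with.

```python
# Pool from the post above. The boolean marks "Stockfish-based".
# These tags are an illustrative assumption, not an official classification.
POOL = {
    "Stockfish": True,
    "Komodo": False,
    "Houdini": False,  # flip to True if you count it as Stockfish-based
    "Sugar": True,
    "Shashchess": True,
    "Fat Fritz II": True,
    "Fire": False,
    "Ethereal": False,
    "Leela": False,
    "Berserk": False,
    "Koivisto": False,
}


def stockfish_share(pool: dict[str, bool]) -> float:
    """Fraction of the opponent pool tagged as Stockfish-based."""
    return sum(pool.values()) / len(pool)


print(f"{sum(POOL.values())} of {len(POOL)} engines "
      f"({stockfish_share(POOL):.0%}) are tagged as Stockfish-based")
```

Under these tags roughly a third of the pool is Stockfish-based, so a rating against this pool is weighted heavily by how an engine does against Stockfish specifically.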
AndrewGrant wrote: ↑Sun Aug 07, 2022 2:33 am
I do mean statistical. If you test SF/Komodo against the pool of { Stockfish, Komodo, Houdini, Sugar, Shashchess, Fat Fritz II, Fire, Ethereal, Leela, Berserk, Koivisto }, that pool is heavily skewed towards a Stockfish engine. Which means any result could be a result of a particular ability or inability to play against Stockfish. Competing hypothesis for the results seen.
I don't mean ethical, which is why I posted a list of engines above and a reader can make their own determination how much of the pool is Stockfish.
I’d be curious to see your tests on the similarity of SF and FF2. Any PGNs you could share?
It's much easier to derive similarity from knowing it's the same code than from looking at PGNs. That can be left as an exercise for the reader as well.
You don’t have any evidence? By that logic, Ethereal would never have improved as its nets improved, since the engine code remained mostly the same.