What is the strongest chess engine in the world? — A Reflection

a_node_uncut · Post by **a_node_uncut** » Thu Nov 14, 2024 4:26 am

Over the past few days I have spent much of my time researching engine rating lists and tournaments. I have since arrived at a disappointing conclusion:

Introduction
"The strongest chess engine in the world is Stockfish" is a statement echoed by virtually all chess players who consider themselves informed. But how valid is Stockfish's #1 spot, really? By what metric is Stockfish the best chess-playing program, and how objective is that metric? Are those players really as informed as they say they are, or have they been deliberately misinformed?

Traditional Engine Testing
Since antiquity, chess has been played from the standard starting position. Sure, the position may seem symmetrical and boring at a glance, but it holds a vast amount of theory, knowledge, and tactics. That, combined with White's first-move advantage, creates a dynamic game with many imbalances. This starting position has endured for centuries, remaining the cornerstone of the game even after the advent of chess engines.

It was only out of necessity, and for the sake of viewer engagement, that organizers of engine-vs-engine matches switched to pre-arranged openings. Even then, the book lines were limited in length and closely reflected human opening repertoires, even at the topmost level. The early days of computer chess were a time of fierce competition, driven by dreams, motivations, and, perhaps most importantly, creativity.

Fishtest and the death of creativity
Shortly after the establishment of the Fishtest testing platform, Stockfish rose quickly through the ranks, eventually landing at the top of every rating list. The Fishtest platform is efficient and effective, but it also stifled creativity, much like the corporate culture that dominates the modern age. Despite developers cramming more Elo than ever into their engines, the understanding of how and why each heuristic works began to fall drastically. As someone who learned chess programming the more traditional way, I find many "tweaks" and "improvements" in modern Stockfish code not just difficult to understand, but completely opaque and incomprehensible. This weakening of the theoretical basis came with a disastrous consequence.

The UHO Strategy
Stockfish's Elo rating plateaued, and by the release of Stockfish 16, progress had all but stalled. The testing system, being flawed and underinformed, began to crumble. Meanwhile, rival engines like Lc0 (developed by Alexander Lyashuk et al.), Ethereal (by Andrew Grant), and Berserk (by Jay Honnold) were rapidly closing the gap.

Amid this stagnation, the Stockfish team found a glimmer of hope: UHO (Unbalanced Human Openings). UHO forces the engine to defend suboptimal opening lines. Proponents of UHO claim that it makes the viewing experience more fun and helps reduce the draw rate. However, UHO’s relevance to top-level chess is questionable. The openings it promotes are rarely seen in elite human play and are far less significant in terms of theory. More importantly, these offbeat openings place greater emphasis on tactical sharpness than on positional understanding. Only under these testing conditions can Stockfish retain a clear advantage, and the Stockfish team, of course, exploited it.

The Stockfish team quickly optimized their engine for UHO conditions. With aggressive and unprecedented tactics, they managed to pressure and manipulate most major tournaments into adopting UHO openings. One notable exception is the CCRL, whose operators (correctly) stuck to balanced, theory-rich openings. But the Stockfish team was quick to rally behind other rating lists, such as SP-CC, that are more amenable to their strategic interests. The Stockfish team also unleashed a massive campaign to align every other engine with their testing standards. Forums and online communities, overrun by keyboard warriors stubbornly attached to their SPRT methodology, became major battlegrounds for this ideological shift. As more and more engines joined the SPRT hype, the vast compute capacity of Fishtest became increasingly advantageous.

What now?
The Stockfish mafia has largely taken over, but their power is not unlimited. For example, CCRL and CEGT show that Stockfish and Torch are neck and neck. But a much more important asset for us is Talkchess. With effective and decisive moderation policies, Talkchess remains one of the last online technical communities not yet overrun by Stockfish zealots. Recently, however, certain proponents of the Stockfish team began to push for moderator elections, threatening to completely destroy what little we have left of Talkchess. Therefore, I propose adopting the following policies to overcome these difficulties:

Charter Amendment: Intolerance of different testing methodologies (SPRT bashing) should not be allowed.

Advocacy for alternative testing methods: The Talkchess community should advocate for alternative testing methodologies, to counter the Stockfish effort.

Vetting of moderator candidates by the FG (Founders' Group): People with significant biases should not be allowed to become moderators, even if Stockfish insiders would very much like them to.

Thank you for your time,
Max L

shawn · Post by **shawn** » Thu Nov 14, 2024 7:11 am

a_node_uncut wrote: ↑Thu Nov 14, 2024 4:26 am The Stockfish mafia had largely taken over, but their power is not unlimited.

LazySMP · Post by **LazySMP** » Thu Nov 14, 2024 7:15 am

Mystery Engine = Torch Engine is the strongest chess engine in the world.

Modern Times · Post by **Modern Times** » Thu Nov 14, 2024 8:33 am

Quite a controversial topic, this!

To me - times change. Computer chess has moved on from human chess and UHO makes total sense. If you want to stay with the past though, that is OK too.

Brunetti · Post by **Brunetti** » Thu Nov 14, 2024 8:51 am

a_node_uncut wrote: ↑Thu Nov 14, 2024 4:26 am I propose adopting the following policies to overcome these difficulties

What do engine testing techniques have to do with forum moderation?!

Alex

Uri Blass · Post by **Uri Blass** » Thu Nov 14, 2024 9:47 am

a_node_uncut wrote: ↑Thu Nov 14, 2024 4:26 am Over the past days I have invested much of my time in researching engine rating list/tournaments. I have since arrived at a disappointing conclusion:


I support testing with no book instead of a balanced book, but I am against terms like "the Stockfish mafia".

The Stockfish team are not criminals, and they do not threaten people who insist on using a balanced book.
I can add that SSDF gives every program its own book and does not use UHO, and that the CCRL FRC rating list, which uses no book, also has Stockfish leading the list.

https://computerchess.org.uk/ccrl/404FRC/

j.t. · Post by **j.t.** » Thu Nov 14, 2024 1:08 pm

Stockfish, greatest engine in the world
All other engines are just little swirls
Stockfish, number one on CCRL
All other engines can only wish well

Stockfish, master of the chessboard
With every move, your brilliance is adored
Your tactics and strategies, a marvel to see
You dominate the game with such mastery

Stockfish, leading the way in AI
With every update, you reach for the sky
Stockfish, your code is clean and bright
You shine in the chess world, a guiding light

Stockfish, Stockfish, you’re the best by far
From opening moves to the endgame star
Stockfish, friend of all who love the game
Your legacy and skill will always remain

aj3037 · Post by **aj3037** » Thu Nov 14, 2024 9:10 pm

a_node_uncut wrote: ↑Thu Nov 14, 2024 4:26 am Over the past days I have invested much of my time in researching engine rating list/tournaments. I have since arrived at a disappointing conclusion:


Where are your sources?

aj3037 · Post by **aj3037** » Thu Nov 14, 2024 9:13 pm

Also, this guy's name comes from the same place as the troll's: https://perception.fandom.com/wiki/Max_Lewicki

I can't jump to conclusions so fast, but I think he is a troll too...

jefk · Post by **jefk** » Thu Nov 14, 2024 9:27 pm

Uri B wrote

SSDF give every program their book and does not use UHO

But as you know, with sufficient calculation time the modern top NNUE engines can perform without a book (and perform even better than when using a lousy book).

But then all games also end in a draw, so using UHO or something similar (TCEC openings) isn't so odd.

Except that for normal chess games it no longer matters which engine is the 'strongest' (unlike for, e.g., problem chess), as the top five engines are all good enough (e.g. for analyzing positions in human chess, or in correspondence chess).
