Multi-Threading at Root.

Werewolf · Post by **Werewolf** » Sun Aug 17, 2025 8:37 pm

This silly example below has a serious point. I am trying to write a chess engine and I'm grappling with the issue of multi-threading. If you are knowledgeable please could you answer, but don't ask an AI - I already did that. Thank you.

Imagine two different chess setups, designed for correspondence play.
Mr. Enthusiastic and Mr. Filthy-Rich are both busy men. Each loves chess but has little time for it. Mr. Enthusiastic buys a computer just for chess. Let's say, for the sake of argument, it's a modern machine with 16 cores but he runs Stockfish on one single core. He lets it think for 1 minute and whatever move it chooses is what he plays in his correspondence games.

Mr. Filthy-Rich hears about this and decides to challenge Mr. Enthusiastic to a 100 game match. He's also a little clueless but exceedingly wealthy. He decides to buy 100 computers which are identical to Mr. Enthusiastic's single computer. Like Mr. Enthusiastic he also runs Stockfish on one single core for one minute, but with a crucial difference. He makes each computer consider a different legal move in response to Mr. Enthusiastic's move. So, for example, if Mr. Enthusiastic opened the game with 1.e4 then Mr. Filthy-Rich would use 20 of his machines to look at the 20 legal replies for black - each machine only thinking for 1 minute and looking at a different reply each.

You may question the logic of Mr. Filthy-Rich. But do remember this billionaire cares nothing for money, only for beating his fellow man. He reasons that if he copies Mr. Enthusiastic's setup but employs more machines he will certainly win the match.

Now for the question: how much stronger is Mr. Filthy-Rich's setup to Mr. Enthusiastic's?

Robert Pope · Post by **Robert Pope** » Mon Aug 18, 2025 1:29 am

Here's my shot. I probably messed up some of the math:

1. Filthy Rich is essentially searching one ply deeper than his opponent.

2. The effective branching factor of Stockfish is something around 1.8 (someone could probably give a more accurate number here), so an additional ply is equivalent to searching 1/(1.8-1) -1 = 25% faster than double-speed.

3. A doubling in speed is typically worth 50-80 elo, probably on the low end, so say 50.

4. Scaling up for the deeper search is then roughly 50*1.25 = 62 elo. (Linear projection is fine here, there error bar in the doubling estimate exceeds the error in linear interpolation.)

5. Add in some error bars, and Filthy Rich is probably 50-70 elo higher with his setup.

Joerg Oster · Post by **Joerg Oster** » Mon Aug 18, 2025 8:44 am

Werewolf wrote: ↑Sun Aug 17, 2025 8:37 pm This silly example below has a serious point. I am trying to write a chess engine and I'm grappling with the issue of multi-threading. If you are knowledgeable please could you answer, but don't ask an AI - I already did that. Thank you.

Imagine two different chess setups, designed for correspondence play.
Mr. Enthusiastic and Mr. Filthy-Rich are both busy men. Each loves chess but has little time for it. Mr. Enthusiastic buys a computer just for chess. Let's say, for the sake of argument, it's a modern machine with 16 cores but he runs Stockfish on one single core. He lets it think for 1 minute and whatever move it chooses is what he plays in his correspondence games.

Mr. Filthy-Rich hears about this and decides to challenge Mr. Enthusiastic to a 100 game match. He's also a little clueless but exceedingly wealthy. He decides to buy 100 computers which are identical to Mr. Enthusiastic's single computer. Like Mr. Enthusiastic he also runs Stockfish on one single core for one minute, but with a crucial difference. He makes each computer consider a different legal move in response to Mr. Enthusiastic's move. So, for example, if Mr. Enthusiastic opened the game with 1.e4 then Mr. Filthy-Rich would use 20 of his machines to look at the 20 legal replies for black - each machine only thinking for 1 minute and looking at a different reply each.

You may question the logic of Mr. Filthy-Rich. But do remember this billionaire cares nothing for money, only for beating his fellow man. He reasons that if he copies Mr. Enthusiastic's setup but employs more machines he will certainly win the match.

Now for the question: how much stronger is Mr. Filthy-Rich's setup to Mr. Enthusiastic's?

Just fyi, I'm doing exactly this in Matefish. See https://github.com/joergoster/Stockfish ... #L279-L293
It's a pure speedup patch.

Werewolf · Post by **Werewolf** » Mon Aug 18, 2025 5:51 pm

Robert Pope wrote: ↑Mon Aug 18, 2025 1:29 am Here's my shot. I probably messed up some of the math:

1. Filthy Rich is essentially searching one ply deeper than his opponent.

2. The effective branching factor of Stockfish is something around 1.8 (someone could probably give a more accurate number here), so an additional ply is equivalent to searching 1/(1.8-1) -1 = 25% faster than double-speed.

3. A doubling in speed is typically worth 50-80 elo, probably on the low end, so say 50.

4. Scaling up for the deeper search is then roughly 50*1.25 = 62 elo. (Linear projection is fine here, there error bar in the doubling estimate exceeds the error in linear interpolation.)

5. Add in some error bars, and Filthy Rich is probably 50-70 elo higher with his setup.

Thanks. This is close to what Bob H estimated several years ago (he actually said 50 Elo Max IIRC).

However, I had this alarming exchange with Gemini 2.5 Pro: "Counter-intuitively, Mr. Filthy-Rich's setup is catastrophically weaker than Mr. Enthusiastic's. Mr. Enthusiastic would win the 100-game match by an overwhelming score.

Here’s why:

The core of a chess engine's strength isn't just raw calculation; it's the efficiency of its search. Engines use sophisticated algorithms, most famously alpha-beta pruning, to intelligently manage their time.

Mr. Enthusiastic (The Smart Searcher): His single core of Stockfish looks at all legal moves. Within milliseconds, it determines that most of them are bad (e.g., hanging a piece for no reason). It "prunes" these bad lines from its search tree and spends virtually all of its 1-minute thinking time analyzing the 2-4 most promising "candidate moves" to a very deep level.

Mr. Filthy-Rich (The Inefficient Delegator): His method completely destroys this efficiency. For a given position with 30 legal moves:

One computer spends a full minute deeply analyzing the best move, 1...e5.

Another computer spends a full minute deeply analyzing the worst possible blunder, like 1...g5??.

28 other computers spend a full minute deeply analyzing all the other moves, from good to mediocre to terrible.

The problem is that the computer analyzing the actual best move only gets 1 minute of thought, whereas Mr. Enthusiastic's single machine dedicates its entire 1-minute cycle to comparing the best few options. This means Mr. Enthusiastic's engine achieves a much greater search depth on the lines that actually matter. It will see tactical combinations and long-term positional plans that Mr. Filthy-Rich's setup is completely blind to.

In short, Mr. Filthy-Rich is paying for 59 of his computers to do useless work, while Mr. Enthusiastic's single computer is working with maximum efficiency.

The result is a huge difference in playing strength, likely well over 1000 Elo points. Mr. Enthusiastic is fielding a super-human Grandmaster, while Mr. Filthy-Rich has created a system that is tactically short-sighted and positionally clueless."

I replied: "Are you sure? Stockfish playing black in my example above would prune as normal, so if 1...e5 is best it would look at that move alone for the whole 1 minute. At the end Mr. Filthy-Rich would pick whatever move for black scored best from all the available moves from all the available computers. I don't see how black could possibly be disadvantaged. It's not like he has altered Stockfish's search."

Then came its reply:
"Yes, I am absolutely sure. You've hit on the subtle but critical reason why this setup is so weak.

You are correct that each of Mr. Filthy-Rich's computers runs its search normally within its assigned task. The machine told to analyze only 1...e5 will use its full minute and all its pruning logic to explore the consequences of 1...e5.

The disadvantage isn't in how Stockfish searches a given line; it's in the complete lack of comparison during the search.

A chess engine's power comes from dynamically allocating its time. Think of it this way:

Mr. Enthusiastic's Single Computer
It sees 30 legal moves.

In the first second, it realizes 25 of them are mediocre or bad. It essentially stops thinking about them.

It identifies 5 promising "candidate" moves.

For the next 59 seconds, it focuses exclusively on these 5 good moves, constantly comparing them against each other. It might spend 25 seconds on move A, realize move B is slightly better, and then shift its focus to spend 30 seconds on move B to confirm its superiority.

The key is that the most promising lines get the vast majority of the processing time, leading to a very deep search on the lines that actually matter."

Eventually it backed down and admitted it was wrong.

Multi-Threading at Root.

Multi-Threading at Root.

Re: Multi-Threading at Root.

Re: Multi-Threading at Root.

Re: Multi-Threading at Root.