This is fascinating... the data I'm seeing experimenting with different training parameters for baking the NNUEs is producing initial results that look exactly like what you are talking about. As I bake the NNUEs more and more, the engine's tactical vision crystallizes, becomes sharper, but the lines become less interpretable by an LLM in terms of classical chess strategic themes. Elo goes up, intelligibility goes down. And the relationship is quite asymmetric: a few Elo lost buys a huge gain in intelligibility.

hgm wrote: ↑Mon Jan 05, 2026 9:56 am
Training a network to exactly reproduce win probability in principle destroys the engine's ability to convert a non-trivial win. E.g. a KBNK mate typically takes some 60 ply but, as long as no B or N is blundered away, is always a certain win. There is no way a search would reach 60 ply if it has no guidance for what to prune; without pruning you must be happy to reach 8 ply. And there is no guidance, neither for pruning nor for striving for intermediate goals (like driving the bare King into the corner), because all non-sacrificial end leaves will have an identical 100% score. This reduces the engine to a random mover biased against giving away any material. No way its random walk through state space will ever bring a checkmate within the horizon. At some point it will stumble into the 50-move barrier, and there is nothing it can do to avoid it.
A corollary is that when an engine is trained to reproduce the true win rate of the games played by its latest version, it can never achieve the exact win rate. While still ignorant, it will discover that it has a higher chance of winning in positions where the mate is close, simply because it is more likely to stumble on a position that has the mate within the search horizon. Those positions then get a higher score, which the search can see from further away, so the win rate there improves too. But it has to keep a score gradient large enough to guide play towards the mate from far away, and there is some minimum gradient for which this still works, because there will always be evaluation noise. So you get an equilibrium, where positions far from any mate only get, say, a 90% win rate. And that will indeed be their win rate, because the gradient from 90% to 100% is spread over so many moves that it becomes too small to follow in 10% of the cases.
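Both of hgm's points can be sketched with a toy Monte Carlo simulation (nothing here is from an actual engine; the 1-D "distance to mate" abstraction, the noise level, and the gradient values are all invented for illustration). A greedy player starts a fixed number of ply from mate under the 50-move barrier; its evaluation keeps a per-ply score gradient plus Gaussian noise. With zero gradient (a flat "certain win" score everywhere) play degenerates into a random walk; with a gradient well above the noise every win is converted; in between, an equilibrium win rate below 100% appears.

```python
import random

random.seed(0)

NOISE = 0.01        # assumed per-position evaluation noise (illustrative value)
DISTANCE = 40       # starting distance to mate, in ply (illustrative)
MOVE_LIMIT = 50     # the 50-move barrier

def win_rate(gradient, games=2000):
    """Fraction of games in which noisy greedy play converts the 'won' position.

    The position is abstracted to its distance-to-mate d; the two legal
    moves change d by -1 (toward mate) or +1 (away). Each candidate gets
    eval = -gradient * d + noise, and the player greedily picks the higher one.
    """
    wins = 0
    for _ in range(games):
        d = DISTANCE
        for _ in range(MOVE_LIMIT):
            if d == 0:
                break
            toward = -gradient * (d - 1) + random.gauss(0, NOISE)
            away = -gradient * (d + 1) + random.gauss(0, NOISE)
            d = d - 1 if toward > away else d + 1
        wins += (d == 0)
    return wins / games

for g in (0.0, 0.001, 0.01, 0.1):
    print(f"gradient {g:>5}: win rate {win_rate(g):.2f}")
```

With a flat evaluation the player is exactly the "random mover" described above: it would need 40 net correct steps out of 50, which essentially never happens. When the per-ply gradient is comparable to the noise, the simulated win rate settles at some equilibrium below 100%, and only a gradient well above the noise converts every game.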
It could be that the "eidos" of chess just isn't reducible to finite mathematical systems; at best you get fractured representations. If that's true, I wouldn't expect a maximally powerful chess engine to necessarily be easily intelligible, which is a difficulty pedagogically. It also suggests Stockfish might not be the optimal engine for evaluation; other, simpler or weaker engines might actually be superior.