Henk wrote: Actually this was my best post in years.

Indeed, it points out a very useful insight. Tuning a buggy engine is a waste of time; the tuning is unlikely to remain good after you fix an important bug.
To judge from the quality of play whether you have bugs, the tuning must not be completely crazy. But an evaluation that is little more than piece values (even if only the classical 1:3:3:5:9), plus a weak form of centralization for the minors and a hefty bonus for Pawns on the 6th and especially the 7th rank (e.g. through PST), should be good enough to let you spot serious errors.
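A bare-bones evaluation of that kind could be sketched as follows. This is only an illustration, not code from any actual engine: the board representation (a dict from (file, rank) to a piece letter), the names PIECE_VALUE, PAWN_RANK_BONUS, centralization and evaluate, and all bonus sizes are assumptions made up for the example.

```python
# Hypothetical minimal evaluation: classical piece values, a weak
# centralization bonus for the minors, and a hefty bonus for Pawns on
# the 6th and 7th rank. All numbers are illustrative guesses.
PIECE_VALUE = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900, "K": 0}

# Indexed by rank as seen from the Pawn's own side (0 = 1st rank, 6 = 7th rank).
PAWN_RANK_BONUS = [0, 0, 0, 5, 15, 60, 150, 0]


def centralization(file, rank):
    """Weak bonus (0..12) that grows toward the four center squares."""
    file_dist = min(file, 7 - file)  # 0 on the edge, 3 in the center
    rank_dist = min(rank, 7 - rank)
    return 2 * (file_dist + rank_dist)


def evaluate(board):
    """Score in centipawns from White's point of view.

    board maps (file, rank), each 0..7, to a piece letter;
    uppercase is White, lowercase is Black.
    """
    score = 0
    for (file, rank), piece in board.items():
        kind = piece.upper()
        white = piece.isupper()
        rel_rank = rank if white else 7 - rank
        value = PIECE_VALUE[kind]
        if kind == "P":
            value += PAWN_RANK_BONUS[rel_rank]
        elif kind in ("N", "B"):
            value += centralization(file, rank)
        score += value if white else -value
    return score
```

With these made-up numbers a white Pawn on e7 counts for 250, so the engine should be reluctant to let an opposing Pawn get that far; whether the exact values are any good does not matter at this stage.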
When I build a new engine from scratch I just watch a number of games against an engine of not-too-different strength (so the score will be between 25% and 75%), to decide why it loses the games it loses. If you play a little bit of Chess yourself, that is usually easy. So you can see if it just blunders away pieces in the middle-game, naively allows the opponent to advance passers to the 6th or 7th rank before acting against them, destroys its King fortress and exposes its King, allows the opponent to pile up material against its King fortress for a devastating mate attack, etc.
Blundering away material points to search bugs; they can be stamped out by the painful process of taking a position from the games where this happened, and tracing the search tree for that position to diagnose which branch was under- or over-evaluated, and how exactly it got the faulty score. Good tests for the search are whether it is able to efficiently checkmate a bare King with Queen and Rook, and, for a slightly more advanced search with a transposition table, whether it is able to see the promotion in KPK with wKe1, wPe2, bKe8 and the win in Fine #70.
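For convenience, such test positions could be kept as FEN strings in a small table. The sketch below is an assumption-laden illustration: the names parse_fen_placement and SEARCH_TESTS are invented here, the bare-King position is an arbitrary one picked for the example, and the Fine #70 FEN is the commonly quoted one (Ka1 and Pawns a4, d4, d5, f4 against Ka7 and Pawns a5, d6, f5). The parser only reads the piece-placement field, so you can check that a FEN really describes the position you intended before feeding it to the engine.

```python
def parse_fen_placement(fen):
    """Return {square_name: piece_letter} from the placement field of a FEN."""
    ranks = fen.split()[0].split("/")
    if len(ranks) != 8:
        raise ValueError("FEN placement must have 8 ranks")
    board = {}
    for row, rank_text in enumerate(ranks):
        rank_number = 8 - row  # FEN lists the 8th rank first
        file_index = 0
        for ch in rank_text:
            if ch.isdigit():
                file_index += int(ch)  # run of empty squares
            else:
                board["abcdefgh"[file_index] + str(rank_number)] = ch
                file_index += 1
        if file_index != 8:
            raise ValueError("rank %d does not have 8 squares" % rank_number)
    return board


# (name, FEN, what a healthy search should find); descriptions are informal.
SEARCH_TESTS = [
    ("bare King vs Queen and Rook (arbitrary example position)",
     "8/8/4k3/8/8/8/8/Q3K2R w - - 0 1",
     "a forced mate found quickly"),
    ("KPK with wKe1, wPe2, bKe8",
     "4k3/8/8/8/8/8/4P3/4K3 w - - 0 1",
     "the promotion becomes visible"),
    ("Fine #70",
     "8/k7/3p4/p2P1p2/P2P1P2/8/8/K7 w - - 0 1",
     "the win is found"),
]

for name, fen, expectation in SEARCH_TESTS:
    pieces = parse_fen_placement(fen)
    print(name, "->", len(pieces), "pieces; expect:", expectation)
```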
The other things can only be cured by adding evaluation terms that address them. Giving a bonus of up to ~1 Pawn for maintaining a Pawn shield around the King, and discouraging the King from walking towards the center, both in the early game phases (or at least as long as the opponent has a Queen), already goes a long way towards solving early-checkmate problems against engines that do not grossly outsearch it. It does not require very precise tuning to see if this will ameliorate the observed problem. It should be obvious from watching a dozen fast test games whether it keeps losing by the same mistakes (e.g. because you added the term with the wrong sign).
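A crude King-safety term along those lines might look like the sketch below. Everything in it is an assumption for illustration: the board is a dict from (file, rank), each 0..7, to a piece letter (uppercase White, lowercase Black), the function name king_safety is invented, and the weights (30 per Pawn directly in front of the King, 15 per Pawn one rank further, 10 per step towards the center) are untuned guesses chosen so that the shield bonus tops out at roughly one Pawn.

```python
def king_safety(board, white):
    """Centipawn shelter score for one side's King.

    Returns 0 when the opponent has no Queen, so the term only acts
    while early mating attacks are a realistic danger.
    """
    king = "K" if white else "k"
    pawn = "P" if white else "p"
    enemy_queen = "q" if white else "Q"
    if enemy_queen not in board.values():
        return 0

    kf, kr = next(sq for sq, piece in board.items() if piece == king)
    forward = 1 if white else -1

    # Pawn shield: look at the three files around the King.
    shield = 0
    for f in (kf - 1, kf, kf + 1):
        if board.get((f, kr + forward)) == pawn:
            shield += 30  # Pawn directly in front of the King
        elif board.get((f, kr + 2 * forward)) == pawn:
            shield += 15  # Pawn advanced one step: weaker cover

    # Discourage walking to the center: 0 in a corner, 6 on a center square.
    steps_from_edge = min(kf, 7 - kf) + min(kr, 7 - kr)
    return shield - 10 * steps_from_edge
```

The full evaluation would add king_safety(board, True) and subtract king_safety(board, False). If a dozen fast games still end in the same early mates, checking the sign of exactly that addition is the first thing to do.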
When developing micro-Max I discovered that an engine is much like a boat: as long as it has holes below the waterline, it will sink. It doesn't matter much how many such holes there are; having fewer holes just makes it take longer to sink, but the result stays the same. Only when you plug the last hole can it stay afloat. Before that, plugging one hole (e.g. preventing it from exposing its King early) just means that it will lose those games by another mistake (e.g. allowing the opponent's passers to advance to the 7th rank with impunity).

