Speed of winning

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

hgm
Posts: 27701
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Speed of winning

Post by hgm »

This is just anecdotal 'evidence':

I am currently tuning my mini-Shogi engine by playing it against its previous version and 3 other opponents, from a set of starting positions. The other opponents are all a lot weaker, but because many starting positions are highly biased as well, they all still score better than 25%. (They are the only available programs capable of automated play.) The biasing in the initial positions makes the test less sensitive to improvements in my engine, though. (I still have to clean up the set by weeding out all guaranteed 1-0 positions, as playing those is just a waste of time, but I can only do that after having played each position enough times to identify them. And even then they are sometimes lost, because engines crash or forfeit by illegal moves.)
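For what it's worth, the weeding could look something like this (a minimal sketch; the data layout, function name, and the `min_games` threshold are all made up for illustration):

```python
# Hypothetical sketch: weed out starting positions where every game so far
# ended 1-0, since replaying those adds no information.
# 'results' maps a position id to the list of outcomes from that position
# (1.0 = win for the engine under test, 0.5 = draw, 0.0 = loss).

def positions_worth_keeping(results, min_games=10):
    """Keep positions that are not (yet) proven one-sided."""
    keep = []
    for pos, outcomes in results.items():
        # Only discard once enough games have been played to be confident.
        if len(outcomes) >= min_games and all(r == 1.0 for r in outcomes):
            continue  # guaranteed 1-0 so far: drop it from the set
        keep.append(pos)
    return keep
```

With enough games per position, positions that were won every single time get filtered out, while positions with too few games stay in the set until they have been tried often enough.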

But I did notice one thing: if I make an improvement that shows up as a modest improvement against the old version, which then hardly shows up as an improvement in the tests against the others, it does seem to beat them significantly faster than the previous version did. So in situations where you win most of the games, it seems there is some information in how fast you win. If the weaker version already won that position, there is nothing the new version can improve on score-wise, but it can still improve on the number of moves it needed to checkmate the opponent.

I have not quantitatively investigated this, but I could imagine that the average game length of a win could be an important indicator of the strength of an engine. But perhaps this is only true in mini-Shogi. Has anyone investigated this in Chess?
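The statistic itself is trivial to compute from match results. A minimal sketch (the tuple layout and the side labels "new"/"old" are invented for the example):

```python
# Hypothetical sketch: compare two versions by the average length of
# their won games. 'games' is a list of (winner, num_moves) tuples,
# where winner is "new", "old", or "draw".

def avg_win_length(games, side):
    """Average number of moves in the games won by 'side'."""
    lengths = [n for winner, n in games if winner == side]
    return sum(lengths) / len(lengths) if lengths else None

games = [("new", 42), ("new", 38), ("old", 55), ("new", 40), ("draw", 90)]
print(avg_win_length(games, "new"))  # 40.0
print(avg_win_length(games, "old"))  # 55.0
```

If both versions score near 100% against a weak opponent, a drop in this average would be the extra signal the post describes.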
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Speed of winning

Post by Evert »

I think the topic (or something similar) has been discussed here in the past, but I don't know exactly when or who participated in the discussion. I also don't remember whether there was a solid conclusion.

The idea makes intuitive sense, though, and it would be a great help with testing, because you can extract much more information from the games that way. (The other way is to analyse the games and see where the weaknesses are, but that is practically undoable; perhaps it is feasible just for games where there is an abrupt change in score, meaning the engine blundered.)
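Filtering for those abrupt score changes is easy to automate. A minimal sketch, assuming you have the engine's own evaluation after each of its moves (the centipawn threshold is an arbitrary choice for illustration):

```python
# Hypothetical sketch: flag moves after which the evaluation dropped
# sharply, which usually marks a blunder worth analysing by hand.
# 'evals' is the engine's score in centipawns, from its own point of
# view, after each of its moves.

def blunder_moves(evals, threshold=200):
    """Return indices of moves where the score dropped by >= threshold."""
    return [i for i in range(1, len(evals))
            if evals[i - 1] - evals[i] >= threshold]

evals = [20, 35, 30, -250, -240]
print(blunder_moves(evals))  # [3]
```

Only the flagged games would then need human analysis, instead of the whole match.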

Something I have noticed that doesn't directly help here is that testing across multiple variants tends to be very beneficial. Whereas a particular change in normal chess may be worth only 5 Elo, it could be 10-20 Elo in Spartan Chess (or vice versa, presumably, though I haven't actually ever seen that), so it's much easier to show that it is an improvement there. Of course that only saves you time if you're willing to believe that an improvement for one variant is also an improvement (or neutral) in other variants, which will almost certainly not always be true.