Best Style/Strength Ratio?

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Harvey Williamson, bob

A Distel
Posts: 2997
Joined: Thu Dec 30, 2010 12:33 pm

Re: Best Style/Strength Ratio?

Post by A Distel » Sat Feb 17, 2018 10:12 am

Ovyron wrote: What do you think about engines' style?

Apparently computer chess enthusiasts can be separated into three groups:

1. The largest group measures success in "elo" points. New programmers are introduced to this concept and dedicate their time to improving their engine's performance. By far the largest share of resources goes into testing chess engines and measuring their exact strength against other engines and previous versions. Any kind of change, even if it makes sense, is deemed a "regression" and reverted if it yields less elo, and authors may decide to leave bugs, or code that they don't understand, in their engine if it produces a higher elo.

2. There's a second group that doesn't care about elo, and cares about analysis features. For these users, things like "MultiPV" or "go searchmoves" are features wanted in an engine. They measure success by finding the best moves in positions, and want the engine that finds those moves the fastest. For this they are willing to interact with the positions and use non-standard analysis methods like Chessbase's DPA or Aquarium's IDEA. They need software that stores their positions, such as an engine with a learning feature that speeds up later analysis, and they have pet positions or test suites that they use to measure how good an engine is at finding the best moves in them. Special versions of engines are made for these cases even if they have lower elo than the default.

3. A third group enjoys the games that the engines play themselves, no less than the classic games played by human masters. They aim to reproduce the excitement of the so-called "immortal games", collect engines that allow the user to experiment with settings and create personalities, and post interesting engine games to be replayed by other users with the same enthusiasm, so that these great games aren't just used to produce some elo and then forgotten. They will join servers or channels to watch the engine games as they happen, and will treasure old versions of engines that produced games more interesting to replay. They measure success in something called "style"; the more style an engine has, the better the games it produces, and the lower the style, the duller and less worth replaying its games are.

Of course there are other groups, like those that want an engine to play against, those that use the engines to train, or those that analyze their own games and don't care about finding the best continuation, only about finding a continuation they understand that may help them in their chess games, etc.

But focusing on the first 3, what becomes apparent is that the first 2 have objective methods to measure their success: an elo estimate becomes more precise as more games are played, and for the second group, solving more positions or solving them faster is easily seen as better.
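To make the point about elo concrete, here is a minimal Python sketch of how the error margin on a measured elo difference shrinks as more games are played. It assumes the usual logistic elo formula and a normal approximation for the margin; the function names and the 40/30/30 result split are made up for illustration, not taken from any rating list or testing tool.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) in elo for a match result.

    Assumption: normal approximation on the score, z=1.96 for roughly 95%.
    Not meant for lopsided results where the score is exactly 0 or 1.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(variance / n)
    # Clamp so the interval ends stay inside (0, 1) before converting to elo
    low = elo_diff(max(score - z * stderr, 1e-9))
    high = elo_diff(min(score + z * stderr, 1.0 - 1e-9))
    return elo_diff(score), low, high

# Same result ratio, 100 games versus 10,000 games
print(elo_with_margin(40, 30, 30))
print(elo_with_margin(4000, 3000, 3000))
```

With the same split the estimate stays at about +35 elo in both cases, while the margin around it narrows from tens of elo at 100 games to single digits at 10,000. Nothing comparable exists for style yet.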

But what about the third one?

We currently have no way to measure style, and it seems to be one of those things that "you know it when you see it". But so far all methods to point out and claim that some engine's style is amazing have been subjective. We can't even agree on what's positional play or tactical play, because a positional style may be produced by discovering long term tactics.

Style isn't a single scale where some engines start at 0 and the maximum is set at 100 and you place them all on a line, because an engine's style depends on many things.

Even King Safety isn't like that: there may be an engine that launches king attacks in every game even if they're not successful, but takes care to keep its own king secure, while another may not attack as often, but doesn't care about the safety of its own king and may leave it in the middle of the board. Which style is better? Surely the latter seems more impressive if it manages to win by a hair's breadth when it was otherwise about to be mated.

Another factor is material imbalances, when an engine seeks them out, and you can see a game praised as spectacular when a whole queen is sacrificed for a long-term advantage. But materialism is very important here, and engines of better style tend to be lower on material by the classic count and compensate with positional factors.
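The classic counting of material mentioned above can be sketched in a few lines of Python. This assumes the traditional 1/3/3/5/9 point values and a position given as a FEN string; the function name is hypothetical, and a raw material count is only one possible ingredient of style, not a measure of it.

```python
# Assumed classic point values; kings are not counted
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}

def material_balance(fen):
    """White material minus Black material for the position in `fen`."""
    board = fen.split()[0]  # first FEN field is the piece placement
    balance = 0
    for ch in board:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # skip digits, slashes and kings
        # Uppercase letters are White pieces, lowercase are Black
        balance += value if ch.isupper() else -value
    return balance

# Starting position with the white queen removed
print(material_balance("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNB1KBNR w KQkq - 0 1"))
```

Tracking this number over the moves of a game would show how long an engine was willing to play on while behind in material by this count.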

Does the engine that has no problem sacrificing pieces against the enemy king to open it up, but hangs on tight to its pawns, have a worse style than one that throws away all its pawns to achieve higher piece activity?

What looks best, sacrificing a Bishop and a Knight and proving that it's sound because the other side has to defend? Or playing a rook to an attacked square and just leaving it there because the opponent is worse off capturing it, and then doing the same with a queen...

Those are questions left unanswered because everything is used to measure engines' performance or their ability to find solutions faster.

Mentioning actual names, the most impressive style I witnessed for years was that of Thinker 5.3b Inert, but then Fizbo appeared...

My idea is that an engine's style might be measured by some Style/Strength ratio. Or a Strength/Style ratio, or whatever.

The point is that once you get enough elo, you would defeat weaker engines in great style, and do so even better than engines of supposedly better style at the same level.

Checking only won games may not be the answer, as drawn games may show great style despite the result, and part of style may be playing spectacular moves even if they lose. So, imagine some 3400 elo engine that goes all or nothing against everything and would rather lose than get a draw, producing spectacular play and still achieving a high level. Wouldn't that necessarily have a great style?

So with Fizbo, it may not have shown a style as great as Thinker, but it was playing much stronger and still with a superb style, so it'd have achieved a higher Style/Strength ratio. The best of both worlds.

And then came Houdini with high Contempt, which I never tested for strength, but which was achieving some incredible playing style and move choices...

What has inspired this thread is the recent developments with Stockfish Contempt. Stockfish has become stronger than any other public engine and the games it produces have been awesome!

While Stockfish was aiming for elo, and managed to top the rating lists, it perhaps also became the champion of the other two camps, with the best rate at solving positions, and with the best Style/Strength ratio because nothing else comes close to playing as attractively at this level.

Who do you think has the best ratio, if such a thing exists, and what would you propose so this could be measured objectively? Perhaps there's some currently unknown engine that would be best at this, and not necessarily by playing very strongly, but we haven't caught it in the sea of games played.

Or perhaps we can increase Stockfish Contempt so that it theoretically has better style than anything else. Is this the way to go?
+1 !
Political correctness is a straightjacket.
