What exactly does "weakly" and "strongly" solved games mean

Discussion of chess software programming and technical issues.

Moderator: Ras

klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

What exactly does "weakly" and "strongly" solved games mean

Post by klx »

Hi, I was reading a bit about solving games, and am really confused around the levels a game can be solved. From Wikipedia we have:

Ultra-weak
Prove whether the first player will win, lose or draw from the initial position, given perfect play on both sides.

Weak
Provide an algorithm that secures a win for one player, or a draw for either, against any possible moves by the opponent, from the beginning of the game. That is, produce at least one complete ideal game (all moves start to end) with proof that each move is optimal for the player making it.

Strong
Provide an algorithm that can produce perfect moves from any position, even if mistakes have already been made on one or both sides.


First: What is weakly solved?

Several of the weakly solved games, such as Checkers, seem to have used an endgame database only consisting of win/draw/loss, together with a forward search. In this case, as far as I can tell, one can not figure out the optimal moves from the database, so wouldn't this make the game "ultra-weakly solved" (i.e. we have proven who won, not the optimal moves).

Moreover, what does "ideal" and "optimal" mean? E.g. if the game is won, would one have to find the shortest win? When combining forward search and endgame database, I think we often don't have this guarantee.

I'm also wondering what an "ideal game" looks like when the game is a draw? Even if we have a full distance-to-mate endgame database, there is no way to make progress for drawn states?

Second: What is strongly solved?

Let's say that I can solve a game by retrograde analysis. The program is already written, but it would take 1 year of computing time to run, and at the end of this it would produce a 400 terabyte compressed database, describing the distance to mate for every possible state. What would I have to do in order to claim that I've strongly solved the game?

Is it enough to just write the program that solves it together with proof that it would solve the game in a "reasonable" amount of time and resources (1 year; as opposed to say Chess). Or would I need to actually produce the database? Would the entire database need to be persisted at one time? How would I prove that I've solved the game, would I need to make these 400 terabyte available online?

Is there even consensus around what strongly solved means, or is it a up for individual interpretation? In the "Checkers Is Solved" paper, it is claimed that the game is strongly solved for <= 10 pieces through an endgame database, which as far as I can tell only includes win/draw/loss data from which we can't extract perfect moves.
[Moderation warning] This signature violated the rule against commercial exhortations.
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: What exactly does "weakly" and "strongly" solved games mean

Post by hgm »

I suppose the given definitions are only meaningful when they are combined with a tie restraint. It requires only a very small program to 'strongly solve' chess through an alpha-beta search; problem is that it requires a time per move longer than the projected age of the universe to actually use it for playing a game.

Having an EGT containing all positions would count as strongly solved, as playing from the EGT would only require probing the positions reachable from the current position to get their 'distance to win'. This can presumably be done at any time control, if the EGT is on a modern storage device.

If you only have EGT for <= 10 pieces, and the start position of the game contains more pieces, you would have to do an alpha-beta search for finding a winning move from any position with more than 10 pieces. But you only have to search to the point where you enter the EGT, as you can use the DTW there as evaluation. If the required search is very deep, it can require significant tme. So it is possible that you have solved 'Correspondence Checkers', but not 'Blitz Checkers'.

Note that EGT typically contain a progress measure (distance to mate, distance to conversion, distance to zeroing), not just win/draw/loss info. Although the latter type does exist as well. (Usually known as 'bitabases').
klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: What exactly does "weakly" and "strongly" solved games mean

Post by klx »

hgm wrote: Fri Jun 18, 2021 5:46 pmHaving an EGT containing all positions would count as strongly solved, as playing from the EGT would only require probing the positions reachable from the current position to get their 'distance to win'. This can presumably be done at any time control, if the EGT is on a modern storage device.
I buy this, but if the EGT only stores win, draw or loss for each state (as was the case with Checkers), I don't think I can get the "optimal" next move? In this case, how can they call it strongly solved. It's not even weakly solved, as I understand it.
hgm wrote: Fri Jun 18, 2021 5:46 pm Having an EGT containing all positions would count as strongly solved, as playing from the EGT would only require probing the positions reachable from the current position to get their 'distance to win'. This can presumably be done at any time control, if the EGT is on a modern storage device.

If you only have EGT for <= 10 pieces, and the start position of the game contains more pieces, you would have to do an alpha-beta search for finding a winning move from any position with more than 10 pieces.
Well, consider this:

I search alpha-beta, and at depth 15 I hit the EGT, which encodes a win in another 25 moves. This propagates up and so in total we have a guaranteed win in 40 moves.

BUT! If I had continued searching alpha-beta, at depth 30 there might have been either a non-EGT win, or an EGT win in 5 moves. So there was in fact a faster win. In this case, even though we found a guaranteed win, it was not optimal.

[Moderation] I might have accidentally damaged this posting; I tried to reconstruct it as much as possible.
[Moderation warning] This signature violated the rule against commercial exhortations.
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: What exactly does "weakly" and "strongly" solved games mean

Post by hgm »

klx wrote: Fri Jun 18, 2021 6:27 pmI buy this, but if the EGT only stores win, draw or loss for each state (as was the case with Checkers), I don't think I can get the "optimal" next move? In this case, how can they call it strongly solved. It's not even weakly solved, as I understand it.
Indeed, an EGT that only contains WDL wouldn't count as a strong solution. Fortunately most EGT contain distance-to-win info.
Well, consider this:

I search alpha-beta, and at depth 15 I hit the EGT, which encodes a win in another 25 moves. This propagates up and so in total we have a guaranteed win in 40 moves.

BUT! If I had continued searching alpha-beta, at depth 30 there might have been either a non-EGT win, or an EGT win in 5 moves. So there was in fact a faster win.
This can never happen when you can enter the EGT with less than 10 pieces only through those with 10 pieces. In Chess that would be true. I am not sure if this is true in Checkers (of which I don't really know the rules). There might be multiple capture there. But perhaps this can be solved by a trick, where you consider multiple capture not as a single move, but as a sequence of separate moves, (some of which do not change the side to move), and extending the game state (by which the EGT is indexed) with the notion of an 'active piece'.
In this case, even though we found a guaranteed win, it was not optimal.
I don't think that 'optimal' in the sense of minimum number of moves is a requirement. It would be strange if it is, because it is a subjective measure. I.e., you would need a metric not based on the game rules to compare wins. E.g. when you would make a double capture, would it count as two moves or one? When I move Ra1-a8, would it count as 7 moves or as one? In blitz Chess we used to move the King as close to the clock as possible in dead-drawn situations, to gain time. Should a move Ka1-a2 count 8 times more than Kh1-h8, if the clock is next to the h-file?

[Edit] On second thought, I think even when you can enter the EGT at a point where there are fewer pieces than that maximum, it would not be a problem. As long as the EGT is really complete. The forward search will not be resolved until all branches will have hit the EGT (or terminated through achieving the win condition). So there never is anything deeper to search. If a search from a position that is in the EGT would give another DTW than specified in the EGT, it means your EGT is faulty.
klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: What exactly does "weakly" and "strongly" solved games mean

Post by klx »

hgm wrote: Fri Jun 18, 2021 7:08 pm Indeed, an EGT that only contains WDL wouldn't count as a strong solution. Fortunately most EGT contain distance-to-win info.
Ok si this is part of my confusion since Jonathan Schaeffer refers to that as a strong solution in his paper. So my take is that what people refer to as "weak" or "strong" is perhaps a bit arbitrary.

Unless someone can explain how, my take is that Checkers (and many other games) are not weakly solved in the sense that they produced "all optimal moves from start to end".
hgm wrote: Fri Jun 18, 2021 7:08 pm This can never happen when you can enter the EGT with less than 10 pieces only through those with 10 pieces.
So here's what I think can happen.

We have a complete 8-men EGT.

There are currently 9 pieces on the board. White to move.

We are doing iterative deepening search. For the sake of simplicity, we are currently at depth 1.

White can do moves A or B. Move A is a capture move. The resulting state is in the EGT as a white win-in-10. So at this point we have a guaranteed win in 1 + 10 = 11 moves. Move B is not a capture, and gets some heuristic evaluation.

But! If we continue the search to depth 3, move B is followed by black C and then white capture move D, which is in the EGT as a white win-in-4, for a total of 3 + 4 = 7 moves. So, by searching deeper, we find a faster win.

So we really need to do the nominal search to the full depth of the current best solution to ensure that we have the fastest win.
hgm wrote: Fri Jun 18, 2021 7:08 pm But perhaps this can be solved by a trick, where you consider multiple capture not as a single move, but as a sequence of separate moves, (some of which do not change the side to move), and extending the game state (by which the EGT is indexed) with the notion of an 'active piece'.
nice :)
[Moderation warning] This signature violated the rule against commercial exhortations.
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: What exactly does "weakly" and "strongly" solved games mean

Post by hgm »

I think we agree on this. Heuristic evaluation should not be allowed. Or, in other words, positions not in the EGT should be considered non-quiet.

Checkers is probably reasonably benign, (comapred to Chess), because most moves are irreversible (perhaps all moves?). When you don't have to deal with infinitely long branches, you can safely do a depth-first search that only considers a node a leaf when it hits the EGT. Or falls victim to the equivalent of mate-distance pruning.

As to whether Checkers is really solved: I suppose this depends on what you mean by 'optimal'. I would say that a move is optimal when it doesn't bring you to a position of lower game-theoretical value. In a game without reversible moves any sequence of such optimal moves would eventually result in a win. Even when you have no clue about the DTW, and only have WDL info.
Pio
Posts: 338
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: What exactly does "weakly" and "strongly" solved games mean

Post by Pio »

hgm wrote: Fri Jun 18, 2021 10:14 pm I think we agree on this. Heuristic evaluation should not be allowed. Or, in other words, positions not in the EGT should be considered non-quiet.

Checkers is probably reasonably benign, (comapred to Chess), because most moves are irreversible (perhaps all moves?). When you don't have to deal with infinitely long branches, you can safely do a depth-first search that only considers a node a leaf when it hits the EGT. Or falls victim to the equivalent of mate-distance pruning.

As to whether Checkers is really solved: I suppose this depends on what you mean by 'optimal'. I would say that a move is optimal when it doesn't bring you to a position of lower game-theoretical value. In a game without reversible moves any sequence of such optimal moves would eventually result in a win. Even when you have no clue about the DTW, and only have WDL info.
I consider an EGTB to be optimal if it for each won position saves one of the moves that minimises the size of the proof tree. The reason why I think it is optimal is because doing so will make it easier for others to prove/confirm the EGTB is correct. It is the same reason I think a simple proof in mathematics is much better than a complicated one since the simple proof is much more likely to be correct and also is much easier to verify.
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: What exactly does "weakly" and "strongly" solved games mean

Post by hgm »

EGTs normally do not contain moves. Encoding a move takes usually more space than encoding a DTM or DTZ.
Pio
Posts: 338
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: What exactly does "weakly" and "strongly" solved games mean

Post by Pio »

hgm wrote: Fri Jun 18, 2021 11:12 pm EGTs normally do not contain moves. Encoding a move takes usually more space than encoding a DTM or DTZ.
I know that it will take up approximately three times more space than win/draw/loss, but it will not be much bigger than DTM or DTZ, at least not for the size of fewer pieces EGTB.

The nice thing with such a format will be that it will lead to very human like play and will also be a great oracle to train NN engines on.

It will be tricky to program for chess because of the 50 move rule however.
jkominek
Posts: 72
Joined: Tue Sep 04, 2018 5:33 am
Full name: John Kominek

Re: What exactly does "weakly" and "strongly" solved games mean

Post by jkominek »

To kix: I can understand why you're confused. On first reading the definitions are clear. Then on second thought it's "Hmm, does that mean... no wait. What?". At least that was my experience. The definitions carry a certain oral tradition, as it were. Let me try my hand at clarifying.

To being I'll steer away from Wikipedia and instead quote syzygy quoting Victor Allis from Allis's PhD thesis of 1994. See http://fragrieu.free.fr/SearchingForSolutions.pdf, page 8. These are definitions that are in the literature. They are likely what a reviewer would have in mind were you to, say, submit a paper to the Journal of the ICGA, or if it's really notable news, to Science (as Jonathan Schaeffer et al. did with his announcement that 8x8 checkers is weakly solved).
According to Victor Allis:
ultra-weakly solved
For the initial position(s), the game-theoretic value has been determined.

weakly solved
For the initial position(s), a strategy has been determined to obtain at least the game-theoretic value of the game, for both players, under reasonable resources.

strongly solved
For all legal positions, a strategy has been determined to obtain the game-theoretic value of the position, for both players, under reasonable resources.
As a minor point of wording, the Allis definitions use the term strategy instead of algorithm, and the preferable phrasing "a strategy has been determined" (past tense) rather than the active "provide an algorithm that can produce". This isn't a homework assignment for crying out loud.

The glaring and discussion-inducing aspect of the above definitions are the wiggle words "under reasonable resources." Those words do not belong in mathematics for the obvious reason that they cannot be formalized. Not to mention that reasonable computing resources of 2021 were unreasonable (nearly unobtainable) in 1994. Proofs are supposed to be timeless. Nonetheless, the reader knows what he's getting at and why it's in there. The presence of those words is quite interesting; I'll offer my viewpoint towards the end of this post.

Victor Allis included a remark on this point, sufficient to let him to get on with the thesis.
We remark that the reasonable resources mentioned may be a subject of
discussion. The size of the resources is meant only to give an approximate
indication of the time and computing equipment allowed for reproducing a
solving strategy. Without these restrictions, it could be argued that, for
instance, chess could be weakly solved. As a strategy to solve chess, an
alpha-beta search through the full game tree suffices. The reasonable resources
mentioned should typically allow the use of a state-of-the-art computer and
several minutes of computation time per move.
Zeroth: What is ultra-weakly solved? (I'll take the liberty of pretending you asked this too.)
The concept of ultra-weakly is included because mathematics contains a large swath of results by non-constructive proof. Some property of the problem is exploited to determine the result, but the exact path from here to there is unavailable. The most relied-upon tool in the mathematicians' arsenal for doing so is proof by contradiction.

Allis provides the example of the game of hex as being ultra-weakly solved. For sake of argument let me sketch out, i.e. make up, what a proof for chess could look like. Assume the starting position is a win for white. Using the properties of chess and several chalkboards of derivations later - this is the fantastical part - it is proven that a win requires at least knight odds. The starting position does not have knight odds, and so cannot be a win for white. Repeat proof by contradiction for black, showing that chess is not a win for black. Therefore, the game theoretic value of chess is drawn.

"Second: What is strongly solved?"
(Jumping ahead as I think this is easier to answer.)
The usual implementation are Distance To Metric endgame tablebases paired with minimizing one-ply search. It is not the only possible form, but is the familiar one.

A first point of clarification is that even though tablebases are computed to some minimizing/maximizing metric (DTM, DTC, DTZ, etc.) a satisfactory strategy need not be so constrained. A strategy could shuffle around for a good while in won positions before going in for the kill. Wasting time does not exclude it from qualifying as strongly solved. This is where the Wikipedia article's phrases of "produce perfect moves" can be misleading. It is easy to get hung up on that. With regards to strongly solving a game, perfect does not mean optimal against a given game-play metric. (As HGM points out, DTM for example is not a part of the rules of chess.)

Perfect means two things:
  • a) no move by a perfect player changes the game theoretic value of win or draw (the value changes only downward through a player's mistake; a player cannot get out of a loss on its own actions alone), and
  • b) progress is guaranteed to eventually reach a game-terminating node of the same theoretic value. This is why the Allis definition says "obtain the game-theoretic value of the position". Looping around endlessly in a circuit of potentially-but-not-won positions does not count as a winning strategy. This is the hazard of WDL bitbases. (Technically part b can be folded into a. In chess infinite delay runs afoul of the 50-move/75-move rule. Running afoul of the rule by making a bad move changes the game theoretical value, which is prevented by a.)
A second point is that strongly solving a game does not preclude search, including search deeper than one ply. RdM's syzygy code+databases pursues capture sequences to determine WDL/DTZ(50) values, and that is okay. The definition does not take a stand on how much search is acceptable or what algorithms are allowed. It merely indicates intent with a vague "reasonable resources" stipulation.

Thus your natural follow-up question: Is it enough to just write the program that solves it together with proof that it would solve the game in a "reasonable" amount of time and resources (1 year; as opposed to say Chess). Or would I need to actually produce the database?

For any game that is non-trivial you need to produce the database. This is for two reasons. First, the definition requires that a strategy has been determined. That is to say, it has already been computed. The surest way to prove that it has been computed is to store the data that the associated Strategy requires. It is this past-tense stipulation that closes the loophole of thinking one can write an alpha-beta prover and proclaim Tada! I have fully solved chess in all of its glory. You haven't. It is by calculating the answer in actuality that brings it from the realm of the theoretical into the realized.

The second reason is that mathematics and computer science are human disciplines, and humans are fallible. To build confidence you want to be able provide results that can be verified and replicated. Think of solving two-player perfect information games as the big brother of proving the four color theorem. The original proof was an exhaustive enumeration of many mini-proofs that the referees had to cross-examine for publication.

The sheer size of chess tablebases is problematic from the goal of verification, a point Guy Haworth likes to make. (He would not enjoy retracting faulty Endgame News.) In the early years Ken Thompson found discrepancies between his results and the published work of the Russian pioneers Arlazarov and Futer, which caused consternation. So it is a very good idea to publish your source code. By this standard the Lomonosov team failed the protocols of science. They published neither their software nor released the full 7-man data. But no one has found mistakes presented by their website or mobile App, and the pedigree of the authors is solid. So they get cut some slack.

To reword your question from another angle, can there be such a thing as a pure-search proof (strategy), i.e. no stored data at all? Yes. You can have a pure-search proof for tic-tac-toe and other small games. The point is that they have been computed, and re-computed, on a time scale that fits well inside normal human activity. More germane to chess, 5x5 mini-chess has been weakly solved using a pure-search strategy. The attacking forces are so locked in immediate battle that retrograde analysis did not have to be resorted to.

"First: What is weakly solved?"
(Now to the middle case.)
First off, yes, the University of Alberta Chinook team have weakly solved 8x8 checkers. As an explanatory mechanism I really like Figure 2 from their Checkers is Solved Science paper. You've read the paper so I won't attempt to encapsulate it. I'll just note that they use what is accepted as the right way to go about solving, a method called Proof Number Search, with exact value endgame tablebases used to "raise the water level" of where game play lines terminate. After extensive use of a search/solver to back up game theoretic values (it does not engage in speculative pruning), the Chinook team saved the core of the proof tree.

In the case of 5x5 mini-chess, the proof tree can be downloaded as a pgn file. http://membres-lig.imag.fr/prost/MiniChessResolution/

This aspect is what I suspect you missed. With the game result known to be a draw, the stored proof tree allows black to preserve the draw in reply to any response by white, and vice versa. In their own words: "The proof tree shows the perfect lines of play needed to achieve a draw. If one side makes a losing mistake, the proof tree may not necessarily show how to win. This additional information is not necessary for proving the draw result." (pg. 1521, col. 2) To answer your question as to what ideal and optimal means, it means that the perfect player never makes a move that drops the half-point, i.e. turns that game theoretic value from drawn to loss. It would be desirable that if the opponent made a mistake that it always be converted to a win, but as they write it is not required. That's the territory of a strong solution.

This is a good place to address another of your questions. Do the Chinook WDL endgame tablebases qualify as strongly solved for sub-games up to N men? I'm on the fence on this one. They provide the information needed to never change the game theoretical value when moving, but in the case of won positions can game termination be guaranteed. Checkers has a must-capture rule. Conceivably a combination of a search algorithm and the WDL data would be enough to guarantee progress, satisfying the requirements of a Strategy. If not, you've caught them!

Precisely stated the 2 to 10-man tablebases provide perfect and exhaustive WDL information generated through retrograde analysis. The combination of exhaustive plus retrograde is sometimes treated as shorthand for strongly solved. In the Chinook team's subsequent paper Partial Information Endgame Database they touch on this at the top of section 2.1
Endgame databases traditionally contain three possible, win (W ), loss (L) or
tie/draw (T ). A new database is computed by repeatedly iterating through all
positions in the database, filling in values for positions that are known, and
repeating until all values are known. Note that we do not store the number of
moves needed to resolve each position, which may be needed to successfully play
out some positions.

"Extra: Is there even consensus around what strongly solved means, or is it a up for individual interpretation?"

I'd say there is consensus among those who read the literature, which usually means those who have been weaned through the research university system. But when you venture off-plantation into say reddit or chess.com discussions, boy is there ever an abundance of not-understanding. The discussions veer off into whether the latest Stockfish has solved chess, or whatnot.

To Pio, I agree that minimizing the size of a proof tree is a worthwhile goal, but you are switching categories. Optimal in this context refers to game play quality, not to ease of verifying a proof (or to a sense of mathematical aesthetics).

Finally, an unsolicited opinion to offer.

Here is what I think is "really" going on when you solve a game, in whole or in part. You are computing perfect game result information while also converting high time complexity into high space complexity. Chess has a certain amount of inherent information that emerges from applying the rules to the board/starting position. Applying the rules generates a gigantic finite state graph that cannot be squeezed below some limit. To find this information you pay for it either with a small little alpha-beta program that know the rules, to which you grant a king's ransom worth of time. Or you pay for it by a king's ransom worth of storage space, to which is paired a fast (near constant-time) lookup algorithm. Or somewhere in between. The definitions for solving two-player perfect information games invites you to convert time into space. It suggests the magnitude of time-into-space conversion by stipulation of "reasonable resources" to obtain, but does not draw a hard line anywhere on the spectrum.

I find this viewpoint interesting because it ties into information theory and Kolmogorov complexity, deep concepts and disciplines of computer science.

This is quite a long post. Sorry about that!