Is there any project coming to solve chess?

jkominek · Post by **jkominek** » Mon Nov 20, 2023 12:19 am

petero2 wrote: ↑Fri Nov 17, 2023 7:40 pm The quote from Nowakowski (which in turn comes from Allis, "Searching for Solutions in Games and Artificial Intelligence") also contains the clause "assuming reasonable computing resources". It seems difficult to precisely define "weakly/strongly solved" in a meaningful way that excludes a provably correct minimax algorithm that takes 1e1000 years to run on current hardware.

However, I thought it would be more interesting to discuss my proposed informal definition of "weakly solved in practice". It seems to me the minimum requirement to make such a claim is to provide a chess playing entity that no one is able to beat from the starting position no matter how many times they try. I thought it would be interesting if someone that claims that chess has been "weakly solved in practice" would explain what this alleged chess playing entity would be.

I happen to find your topic more interesting than the one that is consuming the majority of the discussion on this thread, so if no one else will, I'll continue the line with a couple thoughts, before the whole thread drops off the front page.

To start, what I'd like to implore is for the phrase "weakly solved in practice" to be dropped from usage. It adds confusion by mixing categories that do not go together. Chess is a mathematical entity. To say that the game has been Strongly, Weakly, or Ultra-weakly solved is to establish a property of the mathematical entity through (usually computationally expensive) deductive proof -- with the proviso of "reasonable resources" complicating things, as has been discussed elsewhere on this forum. For chess, the word solved should be allowed to associate exclusively to the realm of mathematical proofs, not letting it get tangled up with empirical measurements, however impressive and compelling the entire body of chess analysis may be. Solving is proving.

The phrase I prefer to use is "unbeatable in practice." And, I think it is not a hard claim to make. Assemble a high-end computer with today's strongest engine, front-end with an expertly constructed opening book, back-end with the full suite of endgame tablebases, put it onto a playing server at a specific time control, then dare any and all comers to defeat it from the opening position. The precise claim would then be: system V on play server W has not lost in X games over Y days, implying an estimated error rate of not larger than 10 to the minus Z mistakes per move. What is being established is an empirical lower bound against an open group of adversarial players. In the case of multiple claimants, ranking is not by Elo, but length of the undefeated streak.
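
The "estimated error rate" in such a claim can be given a rough number. The following is a minimal sketch, not a definitive method: it assumes games are independent trials with a fixed loss probability (a strong simplification for adversarial play), and it assumes a hypothetical average game length to convert a per-game bound into a per-move bound. The function names are illustrative.

```python
def loss_rate_upper_bound(games_without_loss: int, confidence: float = 0.95) -> float:
    """Upper bound on the per-game loss probability after an undefeated streak.

    Solves (1 - p)^n = 1 - confidence for p. For confidence 0.95 this is
    close to 3/n, the well-known "rule of three".
    """
    if games_without_loss < 1:
        raise ValueError("need at least one game")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / games_without_loss)


def mistake_rate_upper_bound(games_without_loss: int, moves_per_game: int,
                             confidence: float = 0.95) -> float:
    """Per-move bound, assuming (hypothetically) that any single mistake
    loses the game, so n undefeated games are n * moves_per_game clean moves."""
    return loss_rate_upper_bound(games_without_loss * moves_per_game, confidence)


if __name__ == "__main__":
    for games in (100, 1_000, 10_000):
        per_game = loss_rate_upper_bound(games)
        per_move = mistake_rate_upper_bound(games, moves_per_game=60)
        print(f"{games:>6} games: per-game <= {per_game:.2e}, per-move <= {per_move:.2e}")
```

The point of the sketch is only that the streak length X directly sets the exponent Z in the claim; the opponents being an open, adaptive group is exactly what the independence assumption ignores.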

To make it interesting, as they say in pool, money is put on the table. Say a prize fund is set up, and the entry fee to challenge is 1000 ducats. Funds are disbursed to undefeated players on a time schedule. If you defeat an undefeated player you win that player's money. You then become a target. I don't gamble myself and do not know how to devise a successful incentive structure, but I figure something like that could work. Back in the day perhaps ICC or FICS would be the server of choice for such a King of the Hill challenge between engines. What would be the best choice today? Playchess or Lichess? I'm not sure.

Larry Kaufman has gone on the record stating that he does not think Komodo could be defeated under conditions like those I have in mind. If they were willing to do that before the team and program retired, then the arguments would transfer to how much is good enough. Would five years undefeated count as "unbeatable in practice"? Making a declaration on a lower bound estimate is a judgement call, but at least the arguments would be transferred away from a make-up-your-own definition of weakly solving chess.

A second area of continuing discussion, I expect, would center around the system configuration. For the first claimants I'm in favor of unconstrained system construction, like in the early days of ICCA competitions when teams supplied their own hardware, search engine, opening book, and tablebases (and operator). In contrast you propose engines-only running deterministically. I'll quote you here.

I propose the following definition. You provide a publicly available chess playing entity that behaves deterministically. For example you might specify:

* Using Stockfish 16 with specified UCI parameter settings
* Use a publicly available opening book, possibly empty
* Search limits, for example 1e9 nodes per move using a single search thread

You then claim that this chess playing entity is unbeatable when playing a game from the starting position, regardless of which color it plays. If
no one is able to prove this claim to be wrong in a long time (i.e. no one can ever beat the entity), we might consider this entity to have "weakly
solved chess in practice".

Is anyone willing to make such a claim for a deterministic publicly available chess playing entity?

Actually the opening book move selection does not have to be deterministic, but using a book that has more than one move for a position just makes it
harder to obtain an unbeatable entity, because an opponent just needs to find a refutation for one of the book moves to prove that the entity is
beatable.

For this definition to have practical value, it must be possible to run the entity on existing hardware in a reasonable amount of time, just as the
mathematical "weakly solved" definition specifies that the algorithm must be possible to run on existing hardware in reasonable amount of time.
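
The kind of specification proposed in the quote can be written down as a fixed UCI command sequence. This is a sketch under assumptions: the commands (`setoption`, `position startpos moves`, `go nodes`) are standard UCI, and `Threads` and `Hash` are Stockfish option names, but the hash size used here is a hypothetical choice and the helper name is illustrative. Whether a given build is fully repeatable under these settings is something one would have to verify, not something this snippet guarantees.

```python
from typing import List, Sequence


def deterministic_uci_script(moves: Sequence[str],
                             nodes: int = 10**9,
                             hash_mb: int = 512) -> List[str]:
    """Build the UCI commands for one move of a single-threaded,
    fixed-node-budget engine, starting from the standard position.

    `moves` are the moves played so far in UCI notation (e.g. "e2e4").
    """
    commands = [
        "uci",
        "setoption name Threads value 1",          # single search thread
        f"setoption name Hash value {hash_mb}",    # assumed hash size in MB
        "isready",
        "ucinewgame",                              # start from a clean state
    ]
    position = "position startpos"
    if moves:
        position += " moves " + " ".join(moves)
    commands.append(position)
    commands.append(f"go nodes {nodes}")           # search limit in nodes, not time
    return commands


if __name__ == "__main__":
    print("\n".join(deterministic_uci_script(["e2e4", "e7e5"])))
```

Limiting by nodes rather than by time is what removes the dependence on hardware speed, which is why the proposal states the budget that way.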

I think your proposal is fine as a follow-on effort. It's not so much attempting to demonstrate that a practically unbeatable chess playing entity exists today, which I consider to be pretty easy, but rather in attempting to locate that dividing line between beatable and unbeatable.

I have some measurements that suggest where one could start looking for the minimally strong unbeatable entity (standard starting position only). This is from a power-of-2 scaling experiment I've been running using the Chess324 positions as the opening book. If I subset to the standard opening position I have some tentative dividing lines for various versions of Stockfish.

 18) Stockfish 16 Hash 2048 Threads 1 Nodes 134217728   3646.8 :     10 (+0,=10,-0)  50.0%
 21) Stockfish 16 Hash 1024 Threads 1 Nodes 67108864    3637.3 :     18 (+0,=17,-1)  47.2%  ****

 27) Stockfish 15.1 Hash 512 Threads 1 Nodes 8388608    3628.2 :    122 (+33,=89,-0)  63.5%
 36) Stockfish 15.1 Hash 512 Threads 1 Nodes 4194304    3609.6 :    208 (+110,=96,-2)  76.0% ****

  9) Stockfish 15 NET Hash 512 Threads 1 Nodes 33554432 3662.0 :    162 (+75,=87,-0)  73.1%
 24) Stockfish 15 NET Hash 512 Threads 1 Nodes 16777216 3631.7 :    188 (+84,=103,-1)  72.1% ****

 17) Stockfish 14 Hash 1024 Threads 1 Nodes 67108864    3647.2 :    122 (+32,=90,-0)  63.1%
 20) Stockfish 14 Hash 512 Threads 1 Nodes 33554432     3640.3 :    128 (+38,=86,-4)  63.3%  ****

 13) Stockfish 13 Hash 4096 Threads 1 Nodes 268435456   3650.8 :     38 (+9,=29,-0)  61.8%
  8) Stockfish 13 Hash 2048 Threads 1 Nodes 134217728   3667.9 :     72 (+26,=45,-1)  67.4%  ****

 40) Stockfish 12 Hash 2048 Threads 1 Nodes 134217728   3603.4 :     66 (+11,=55,-0)  58.3%
 31) Stockfish 12 Hash 1024 Threads 1 Nodes 67108864    3616.7 :    122 (+42,=72,-8)  63.9%  ****

 71) Stockfish 11 Hash 8192 Threads 1 Nodes 536870912   3460.9 :     18 (+0,=11,-7)  30.6%   ****

 63) Stockfish 10 Hash 8192 Threads 1 Nodes 536870912   3505.5 :     16 (+0,=14,-2)  43.8%   ****

I'm not deliberately running an adversarial Beat the Champ competition. The above is just a side result from an ongoing round robin competition.
(Engine only; no endgame tablebases are configured.) But as it stands, single threaded Stockfish 16 with a fixed budget of 128M nodes per move is the target configuration to prove defeatable. In case you wonder, Stockfish 16 at 64 M nodes lost to Stockfish 15 at 256 M nodes. Stockfish 12 through 15.1 simply have not yet met a tough enough opponent to induce a mistake.
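
For readers who want to check the table, the percentage column follows the usual scoring of one point per win and half a point per draw. A small sketch that parses a record such as `(+33,=89,-0)` and recomputes the score; the function names are illustrative and not part of any rating tool.

```python
import re
from typing import Tuple

RECORD = re.compile(r"\(\+(\d+),=(\d+),-(\d+)\)")


def parse_record(record: str) -> Tuple[int, int, int]:
    """Return (wins, draws, losses) from a string like '(+33,=89,-0)'."""
    match = RECORD.fullmatch(record.strip())
    if match is None:
        raise ValueError(f"unrecognised record: {record!r}")
    wins, draws, losses = (int(g) for g in match.groups())
    return wins, draws, losses


def score_percent(record: str) -> float:
    """Score as a percentage: wins count 1, draws count 1/2."""
    wins, draws, losses = parse_record(record)
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return 100.0 * (wins + 0.5 * draws) / games


def undefeated(record: str) -> bool:
    """True if the record contains no losses."""
    return parse_record(record)[2] == 0


if __name__ == "__main__":
    for rec in ("(+33,=89,-0)", "(+110,=96,-2)", "(+0,=17,-1)"):
        print(rec, f"{score_percent(rec):.1f}%", "undefeated" if undefeated(rec) else "beaten")
```

Note that for the question in this thread the relevant column is the loss count, not the score: a configuration with a lower percentage but zero losses is the better candidate.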

It is the contention of some here that even when granted an infinite node budget, a concerted attacker can induce the latest Stockfish to make a mistake (when run deterministically) - presumably due to the aggressive selective pruning of Stockfish. I'm not convinced of that, for the claim depends both on the properties of the engine and of the position. For the on-the-edge opening lines used in TCEC competitions, the claim is entirely believable. For, say, a mate in 5 position of king and queen versus bare king, no adversarial opponent will be able to prevent an infinite node budget Stockfish from finding the win. The standard opening position lies somewhere along this spectrum. My matches find it to be among the most drawish of all the Chess324 openings.

Not that my opinion changes the truth, but I do not find it inconceivable that a sufficiently well budgeted finite, deterministic Stockfish can hold the opening position against an exploit-seeking algorithm, from now until the end of life on Earth. Why, by writing the previous sentence I have conceived of it, literally! But before putting that to the test, I'm interested to see how well an unconstrained system would do. Perhaps if enough antagonism builds up here in the barroom, the fight will be "taken outside" to settle the matter.

OneTrickPony · Post by **OneTrickPony** » Mon Nov 20, 2023 12:49 am

Are you saying "it is OK if the 'weak solution in practice' loses a game every now and then because nobody is perfect and people should stop complaining so much"?

What I am saying is that current SF with 30 minutes per move on a good CPU and a very basic opening book will not lose a game of chess against any entity. My claim is exactly the claim you described as:

My answer to this question I have already given: for me it is inconceivable that SF or any other existing engine can draw against ANY play.

Your intuition is that it's inconceivable. My intuition is that it's very likely. My understanding of chess is that the margin is so wide that it's very very unlikely that current Stockfish gets even close to losing from a starting position, especially if we provide an opening book to guide into really rock solid territory (like Marshall vs 1.e4).

, but SF's search and evaluation have holes and I find it not realistic to believe that these holes could never cost SF a game.

It has many holes, it's true. On the other hand we know those holes exist because we have run other engines or SF with much more computing power to prove them. As it is right now no one is even remotely close to showing a shadow of an edge, let alone a win against SF from the starting position. My intuition is that if those holes existed we would have stumbled upon at least one. Chess might not have a nice mathematical structure but it is a game people have developed pretty good intuition for. I find it very unlikely that a combination of people looking for holes with very powerful hardware wouldn't discover at least one hole somewhere.

It runs contradictory to your intuition of a possible narrow path. I think for the narrow path to be feasible the margins would need to be slimmer than they are.

Not a weak solution according to the definition.

It's not a proof but it might be a weak solution, no? Unless you mean that it's not deterministic but I see it as a detail which can be worked around.

To start, what I'd like to implore is for the phrase "weakly solved in practice" to be dropped from usage. It adds confusion by mixing categories that do not go together. Chess is a mathematical entity. To say that the game has been Strongly, Weakly, or Ultra-weakly solved is to establish a property of the mathematical entity through (usually computationally expensive) deductive proof -- with the proviso of "reasonable resources" complicating things, as has been discussed elsewhere on this forum. For chess, the word solved should be allowed to associate exclusively to the realm of mathematical proofs, not letting it get tangled up with empirical measurements, however impressive and compelling the entire body of chess analysis may be. Solving is proving.

I think you should appreciate our usage of "weakly solved in practice". Instead you could be facing "essentially solved" as authors of Cepheus have used in their publication in Science (they haven't really solved it but got close enough for practical purposes).

Uri Blass · Post by **Uri Blass** » Mon Nov 20, 2023 8:01 am

There are basically 2 questions:
1) How much memory, in terms of chess positions and a good move in each of them, do you need to prove chess is a draw? (Note that you only need to memorize part of the legal positions, because some of them can never be reached against a given solving strategy, and you do not need the best move in every position but only moves that are good enough to draw.)

2) How many positions and moves are we going to be able to memorize in the future thanks to hardware improvement, and what is the size of the tree that we can expect to be able to memorize?

For today's abilities I see at least 236 petabytes, which is more than 10^17 bytes, in one supercomputer (I am not sure what is the best we have today, and I would like to see a graph of the number of bytes in memory of the best supercomputer as a function of time to see how much progress we have made).

https://en.wikipedia.org/wiki/NDMC_Supercomputer

OneTrickPony · Post by **OneTrickPony** » Mon Nov 20, 2023 10:32 am

Uri, I think your estimations about the number of positions might be right or maybe we need even less.
As you pointed out we need an engine that provides "candidate drawing strategy" (CDS). We can make it so:

1) It goes for a draw any chance it gets, which means detecting a 3-fold repetition, perpetual check or known TB draw whenever one is available (your point)
2) Once the eval is high enough for our side that we are confident it will not lose, instead of suggesting the best move it goes for a decent one which is a capture or pawn move, or which makes one of those more likely in the near future

This will cut the search tree very heavily. I think the main bottleneck is going to be computing power so we can get an engine producing CDS reasonably fast as well as speed of storage so we can recall already calculated positions reasonably fast. If we can get it to output say 1M moves per second on one computer then to get 10^17 we just need 1000 computers and 3-4 years - entirely feasible!
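
The "3-4 years" figure can be checked with a few lines of arithmetic. A minimal sketch using only the numbers stated above (10^17 positions, 1M moves per second per machine, 1000 machines); it assumes perfect parallel scaling and no storage overhead, which a real effort would not get.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_needed(positions: float, moves_per_second_per_machine: float,
                 machines: int) -> float:
    """Wall-clock years to produce `positions` moves at the given rate,
    assuming the work splits evenly across all machines."""
    total_rate = moves_per_second_per_machine * machines
    return positions / total_rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = years_needed(1e17, 1e6, 1000)
    print(f"{years:.2f} years")
```

With these inputs the total is 10^8 seconds, a little over three years, so the estimate in the post is consistent with its own assumptions.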

I imagine the process will start trying to prove some very good opening positions are at least a draw for one side and take it from there to more difficult challenges. Once we see a headline "black at least draws after 1.g4" we can start counting years until someone runs the whole thing.
I am optimistic about this happening during my lifetime (I am 41 years old).

Uri Blass · Post by **Uri Blass** » Mon Nov 20, 2023 5:02 pm

OneTrickPony wrote: ↑Mon Nov 20, 2023 10:32 am Uri, I think your estimations about the number of positions might be right or maybe we need even less.
As you pointed out we need an engine that provides "candidate drawing strategy" (CDS). We can make it so:

1) It goes for a draw any chance it gets, which means detecting a 3-fold repetition, perpetual check or known TB draw whenever one is available (your point)
2) Once the eval is high enough for our side that we are confident it will not lose, instead of suggesting the best move it goes for a decent one which is a capture or pawn move, or which makes one of those more likely in the near future

This will cut the search tree very heavily. I think the main bottleneck is going to be computing power so we can get an engine producing CDS reasonably fast as well as speed of storage so we can recall already calculated positions reasonably fast. If we can get it to output say 1M moves per second on one computer then to get 10^17 we just need 1000 computers and 3-4 years - entirely feasible!

I imagine the process will start trying to prove some very good opening positions are at least a draw for one side and take it from there to more difficult challenges. Once we see a headline "black at least draws after 1.g4" we can start counting years until someone runs the whole thing.
I am optimistic about this happening during my lifetime (I am 41 years old).

I agree except the target of the engine should be to end the game as fast as possible without losing.
There is no point in searching to force a draw when there is a faster way to force mate against some bad line.

syzygy · Post by **syzygy** » Wed Nov 22, 2023 1:33 am

Uri Blass wrote: ↑Mon Nov 20, 2023 12:13 am
syzygy wrote: ↑Sun Nov 19, 2023 9:52 pm
Uri Blass wrote: ↑Sun Nov 19, 2023 3:31 pm Stockfish with one core is deterministic and I would like to see if somebody can produce a win against it at 1,000,000,000 nodes per move that is clearly less than one hour per move.
It is just a matter of finding a single position that can be reached from the initial position and where Stockfish's search and evaluation mess up.

It can be highly difficult to find such a position in a reasonable time, but given enough time you can systematically probe how deterministic SF responds to your moves. If you don't give SF a book, known opening theory will probably already allow you to get SF into a position where it has to play carefully.

Maybe it is possible but I am not sure if it is possible and I believe that only using stockfish at 100,000,000,000 nodes per move is not enough.
This statement suggests that you do not appreciate what it would mean for a deterministic SF to draw against ANY play.

Finding a win against a deterministic SF is not a matter of "if SF uses N nodes per move, then it can be beaten by SF using 1000*N nodes per move". It is about finding a single line where SF happens to screw up. You are allowed to investigate a trillion lines to find a single one that the particular version of deterministic SF you picked cannot handle.
I clearly understand that you need only to find a single line that you can beat stockfish at 1,000,000,000 nodes per move.
I understood that finding a single line is good enough.
I did not claim that it is impossible to find a single line but only that probably stockfish with more nodes is not going to be enough.

1 node is enough. Just systematically test all of deterministic SF's lines until you find a crack. Of course it will take impractically long if you are just trying out random lines, but even with random moves you are going to find a winning line (unless the deterministic SF you picked is a weak solution to chess, which is extremely unlikely).

If you did not claim that it is impossible, then apparently you did not claim that SF at 1,000,000,000 nodes per move is a weak solution/unbeatable.

I believe that it is not going to be easy to find it, and even if it is possible, in a few years we are going to get some engine that makes it impossible.

If you give deterministic SF a lot of time per move, then it will take a long time to test each line against it. But I think any reasonable engine with a few seconds per move will be able to find a win if you give it enough time and you do enough backtracking, and you ignore the time that SF needs to calculate its moves. You do not need to find a strong move, you just need to find a line where the particular deterministic version of Stockfish you have picked happens to screw up.

So again, this is not about finding strong moves. This is not about trying to fight SF with 1000x or 1,000,000x the nodes per move. That is all entirely unnecessary. It is only about finding one single line that SF happens to mishandle.
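
The backtracking idea can be illustrated on a toy game, since running it against a real engine is far out of reach of a forum snippet. This is a sketch under loud assumptions: the game is a take-1-to-3-stones game (whoever takes the last stone wins), not chess, and `flawed_engine` is a made-up deterministic player with one deliberate hole. The attacker does no evaluation at all; it simply enumerates its own moves and backtracks until the engine's fixed replies lead to a loss.

```python
from typing import Callable, List, Optional

Policy = Callable[[int], int]  # pile size -> number of stones the engine takes


def perfect_engine(pile: int) -> int:
    """Textbook strategy: always leave a multiple of 4 when possible."""
    return pile % 4 or 1


def flawed_engine(pile: int) -> int:
    """Same strategy, but with one hypothetical hole at pile == 7."""
    if pile == 7:
        return 1            # blunder: leaves 6 instead of 4
    return perfect_engine(pile)


def find_refutation(pile: int, engine: Policy) -> Optional[List[int]]:
    """Depth-first search over attacker moves, engine to move at `pile`.

    Returns the moves of a game the attacker wins, alternating
    engine, attacker, engine, ... or None if the engine cannot be beaten.
    """
    take = engine(pile)
    if take >= pile:                    # engine takes the last stone and wins
        return None
    rest = pile - take
    for reply in (1, 2, 3):             # attacker tries every legal move
        if reply > rest:
            break
        if reply == rest:               # attacker takes the last stone
            return [take, reply]
        tail = find_refutation(rest - reply, engine)
        if tail is not None:
            return [take, reply] + tail
    return None                         # every attacker move fails: backtrack


if __name__ == "__main__":
    print("flawed :", find_refutation(21, flawed_engine))
    print("perfect:", find_refutation(21, perfect_engine))
```

From a pile of 21 with the engine to move, the perfect player cannot be beaten, while the flawed one is beaten by a line that steers it to the one pile size it mishandles. The cost of the search is in the number of lines tried and in the engine's thinking time per reply, not in the strength of the attacker's moves.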

syzygy · Post by **syzygy** » Wed Nov 22, 2023 1:49 am

OneTrickPony wrote: ↑Mon Nov 20, 2023 12:49 amIt runs contradictory to your intuition of a possible narrow path. I think for the narrow path to be feasible the margins would need to be slimmer than they are.

Nothing to do with the narrow path.

The narrow winning path from the starting position "probably" does not exist. But it is not possible to prove that it does not exist (without a gigantic computational effort for which we do not have the resources). Therefore we cannot solve chess. That is the argument against "we will solve chess in a few years because SF's eval says this or that".

This has nothing to do with my claim that SF is beatable from the starting position.
With the latter I just mean that there is at least one line (and in reality billions of lines) in which SF makes an entirely unforced blunder that moves the game from theoretically drawn to theoretically lost for SF. This line is not a "narrow line". The line has nothing to do with perfect play. It is just a line that gets SF into a drawn position where SF screws up because of the holes it has in its evaluation and search, and perhaps some unlucky hash table collisions.

I think you should appreciate our usage of "weakly solved in practice". Instead you could be facing "essentially solved" as authors of Cepheus have used in their publication in Science (they haven't really solved it but got close enough for practical purposes).

If words don't matter, then why not call checkers chess? Then chess has been weakly solved by Jonathan Schaeffer.

Uri Blass · Post by **Uri Blass** » Wed Nov 22, 2023 6:58 am

syzygy wrote: ↑Wed Nov 22, 2023 1:49 am
OneTrickPony wrote: ↑Mon Nov 20, 2023 12:49 amIt runs contradictory to your intuition of a possible narrow path. I think for the narrow path to be feasible the margins would need to be slimmer than they are.
Nothing to do with the narrow path.

The narrow winning path from the starting position "probably" does not exist. But it is not possible to prove that it does not exist (without a gigantic computational effort for which we do not have the resources). Therefore we cannot solve chess. That is the argument against "we will solve chess in a few years because SF's eval says this or that".

This has nothing to do with my claim that SF is beatable from the starting position.
With the latter I just mean that there is at least one line (and in reality billions of lines) in which SF makes an entirely unforced blunder that moves the game from theoretically drawn to theoretically lost for SF. This line is not a "narrow line". The line has nothing to do with perfect play. It is just a line that gets SF into a drawn position where SF screws up because of the holes it has in its evaluation and search, and perhaps some unlucky hash table collisions.

I think you should appreciate our usage of "weakly solved in practice". Instead you could be facing "essentially solved" as authors of Cepheus have used in their publication in Science (they haven't really solved it but got close enough for practical purposes).
If words don't matter, then why not call checkers chess? Then chess has been weakly solved by Jonathan Schaeffer.

I guess you mean:
"It is just a line that gets SF into a lost position where SF screws up because of the holes it has in its evaluation and search, and perhaps some unlucky hash table collisions."

Note that SF is not designed to be unbeatable and it is easy to change it by changing the code.

Hash table collisions are easy to fix by storing the full position in each hash entry, and if the target is to avoid losing then the evaluation can prefer a forced draw over something that the engine evaluates as an advantage but where it is not sure that it is not losing.

If it is hard to beat SF even when it is not designed to go for a draw then I guess that it is going to be harder if it is designed to go for a draw.

It may be interesting to know for checkers:
1)Number of possible positions
2)Number of possible positions that you can reach with some good non losing strategy after one ply,2 plies,3 plies,....

I suspect that the number is smaller when you use a good non losing strategy that does not try to win when you can force a tablebase draw faster.

jefk · Post by **jefk** » Wed Nov 22, 2023 11:06 am

Back to this thread because the other thread (chess is a draw) is becoming even worse;
NB: it was already known 150 years ago that it's a draw (B/W equilibrium) by former world champ Steinitz,
who also had studied math for two years, btw. Chesskobra mentioned some GMs who are also mathematicians;
they of course would all confirm White can only win if Black makes a mistake; this is a known fact. Go ask
them, ChessKobra, instead of making such silly claims (they all know, but won't like to make statements
such as 'chess is a draw' because they think that's not good for the game of chess) (*).

If the first move by White is a strong opening move such as e4, d4, or Nf3, there are still on average two
to three possible solid replies by Black, as Uri Blass noted, and this pattern continues (for less strong moves
such as 1.h3 or 1.a3 there are many more solid moves for Black, btw). Now continue this strategy (trying to find
a 'winning strategy' as in 4-in-a-row): after 1.e4 e5, let's say we take 2.Nf3 as the strongest move; then again there
are approximately 3 solid moves for Black (Nc6, Nf6, and d6!?). And so on; the pattern is widening(!), not narrowing
(as it does in e.g. 4-in-a-row), so there is no winning strategy for White.

But according to Zermelo's theorem, if a game such as chess is a win for White, there is a winning strategy
(as in 4-in-a-row, confirmed by MCTS search). In chess no such strategy shows up, which is also confirmed by MCTS search
(ShashChess 6-thread MCTS starts at about 0.3 eval, let's say 55 pct, but then this estimate gradually goes down
to 50 pct); you can do it with Lc0 as well (better maybe, but I don't have a fast GPU).

The above reasoning is the 'ingenious argument' syzygy talked about; imo
it's the ultraweak proof that chess is a draw (just like years before, Nash found
an ultraweak proof for Hex). syzygy will disagree, fine with me, but I haven't got
any scientific references yet about the definition of 'ultraweak solving'. But ok,
it's now 'essentially solved' I would say.

As for 'weakly solving' (a futile exercise for math purists), I leave it to the number crunchers,
like Schaeffer with his checkers; it may be an interesting computer exercise but imo completely
futile, although it can be done. Because chess is a finite game, after this draw tree has widened for
about 2.5^110 times or so, it's still a draw, but you're probably within the endgame tablebases' reach, and
will easily see it's a draw. Can't be done?
Nope; although I'm not an expert in computer science, imo it most likely *can* be done, because there
are lots of transpositions, and in many situations you hit threefold repetition, or the EGTB, earlier than
move 110. For the rest it would be a completely futile exercise, a bit like trying to show pi remains
irrational after one trillion decimals or so (and yeah it will, syzygy, although the first proof wasn't solid,
it only showed that it's a transcendental number; hey, I can do some math too you know, lol).
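
To put the 2.5^110 figure next to the 10^17 storage figure mentioned earlier in the thread, here is a small sketch of the order-of-magnitude arithmetic. It counts a plain tree with a constant branching factor and no transpositions, repetitions or tablebase cutoffs, so it says nothing about how much those reductions would actually save.

```python
import math


def log10_tree_size(branching: float, plies: int) -> float:
    """Base-10 logarithm of branching**plies, computed without overflow."""
    return plies * math.log10(branching)


if __name__ == "__main__":
    exponent = log10_tree_size(2.5, 110)
    print(f"2.5^110 is about 10^{exponent:.1f}")
    print(f"orders of magnitude above 10^17: {exponent - 17:.1f}")
```

Under these assumptions the raw tree is many orders of magnitude larger than 10^17, so the feasibility argument rests entirely on how much the transpositions and early draws shrink it.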

(*) Well, I do make statements such as 'chess is a draw with perfect play' because it's useful for
ICCF to get out of the rabbit hole which they are now in at the highest levels.

syzygy · Post by **syzygy** » Fri Nov 24, 2023 1:19 pm

Uri Blass wrote: ↑Wed Nov 22, 2023 6:58 am
syzygy wrote: ↑Wed Nov 22, 2023 1:49 am
OneTrickPony wrote: ↑Mon Nov 20, 2023 12:49 amIt runs contradictory to your intuition of a possible narrow path. I think for the narrow path to be feasible the margins would need to be slimmer than they are.
Nothing to do with the narrow path.

The narrow winning path from the starting position "probably" does not exist. But it is not possible to prove that it does not exist (without a gigantic computational effort for which we do not have the resources). Therefore we cannot solve chess. That is the argument against "we will solve chess in a few years because SF's eval says this or that".

This has nothing to do with my claim that SF is beatable from the starting position.
With the latter I just mean that there is at least one line (and in reality billions of lines) in which SF makes an entirely unforced blunder that moves the game from theoretically drawn to theoretically lost for SF. This line is not a "narrow line". The line has nothing to do with perfect play. It is just a line that gets SF into a drawn position where SF screws up because of the holes it has in its evaluation and search, and perhaps some unlucky hash table collisions.

I think you should appreciate our usage of "weakly solved in practice". Instead you could be facing "essentially solved" as authors of Cepheus have used in their publication in Science (they haven't really solved it but got close enough for practical purposes).
If words don't matter, then why not call checkers chess? Then chess has been weakly solved by Jonathan Schaeffer.
I guess you mean:
"It is just a line that gets SF into a lost position where SF screws up because of the holes it has in its evaluation and search, and perhaps some unlucky hash table collisions."

No, I meant what I wrote. This should not be difficult? You cannot screw up in a lost position, since you are already lost in a lost position. To lose a draw, you need to blunder in a drawn position.

Note that SF is not designed to be unbeatable and it is easy to change it by changing the code.

Duh?
