Out of the Kai 450 positions, 99 not found by SF

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Out of the Kai 450 positions, 99 not found by SF

Post by Laskos »

Uri Blass wrote: Tue Aug 13, 2019 5:46 am I think that if we claim that Stockfish does not find the right move then we need some evidence that the move we claim to be better is really the better move (for example some tree such that, if we go forward and backward in the tree, Stockfish can learn to play the right move).
It's almost like saying that if Stockfish doesn't find the right moves to solve chess, we go forward and backward with Stockfish to fill its hash with useful info and solve it. The positions were chosen by statistics of _outcomes_, that is, the final results of the chess games. Yes, often flimsy statistics and ad hoc picks of what seemed to me reasonable to select with some confidence. I guess the Stockfish of today has very little idea about the outcome (the finality) of a quiet, balanced position in the opening or midgame. I am against picking the solutions based on engine analysis in these positional test suites. I checked when exactly I built Openings200: almost 3 years ago. It could easily have been proven junk by Lc0 when it arrived, and that would probably have happened if I had built the positional suite by engine analysis (which in fact happened to the STS suite). To my surprise, my approach was somehow validated by Lc0: it comes out in Openings200 (revised or not) far ahead of regular-eval AB engines.
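[Editor's sketch] The outcome-statistics selection Laskos describes can be illustrated with a small toy script: tally per-move results from game records and filter out thin samples. The data and the `outcome_stats` helper are hypothetical, not his actual tooling.

```python
# Hypothetical sketch (not Laskos's actual tooling): rank candidate moves
# in a position by the statistics of game *outcomes* that followed them.
from collections import defaultdict

# Assumed toy records: (move_played, result), result from the
# side-to-move's perspective (1.0 win, 0.5 draw, 0.0 loss).
games = [
    ("Rac1", 1.0), ("Rac1", 0.5), ("Rac1", 1.0), ("Rac1", 0.5),
    ("Qb2", 0.5), ("Qb2", 0.0), ("Qb2", 0.5),
    ("h4", 0.0),
]

def outcome_stats(records, min_games=3):
    """Return (move, average_score, n) tuples, dropping thin samples."""
    totals = defaultdict(lambda: [0.0, 0])
    for move, result in records:
        totals[move][0] += result
        totals[move][1] += 1
    return sorted(
        ((m, s / n, n) for m, (s, n) in totals.items() if n >= min_games),
        key=lambda t: t[1], reverse=True,
    )

for move, score, n in outcome_stats(games):
    print(f"{move}: {score:.2f} over {n} games")
```

With this toy data, Rac1 scores 0.75 over 4 games and comes out on top, while the single h4 game is discarded as statistically meaningless — the "flimsy statistics" problem he concedes is exactly the `min_games` threshold.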
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Out of the Kai 450 positions, 99 not found by SF

Post by Dann Corbit »

One caution with using outcomes is that book lines and memorized openings can have errors along the way. For obvious reasons, error is more probable following a path without deep analysis. If your position is a book exit point and the opponents are high-end engines or modern correspondence players at a decent time control, then this problem is minimized. Personally, I like a combination of statistics and computer analysis. Naturally, no system is perfect, and we are never sure until we have chased the analysis all the way to forced mate.

Now, I know that you know these things. But I state them here for others.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Out of the Kai 450 positions, 99 not found by SF

Post by jp »

Laskos wrote: Tue Aug 13, 2019 8:16 am
Uri Blass wrote: Tue Aug 13, 2019 5:46 am I think that if we claim that stockfish does not find the right move then we need some evidence that the move we claim to be better is really the better move
I checked to see when exactly I built Openings200, it was built almost 3 years ago. It could have been easily proven junk by Lc0 when it arrived, and that would have probably happened if I had built the positional suite by engine analysis (that happened to STS suite, in fact). To my surprise, my approach was somehow validated by Lc0, it comes in Openings200 (revised or not) far ahead of regular eval AB engines.
I know that you (and others) believe there are other reasons to think each of them is valid, but it's a problem to have two uncertain things 'validating' each other simply by correlating highly, even if the end conclusion may turn out to be true.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Out of the Kai 450 positions, 99 not found by SF

Post by corres »

If we disregard the obviously weak continuations, there are relatively few strategic opening positions with only one solution. If there is a position in which the best continuation differs between Stockfish and Leela, we can decide in an established manner by playing some(?) games between Stockfish and Leela starting from that position with the "best" moves.
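[Editor's sketch] The catch with deciding by "some(?) games" is sample size. A rough normal-approximation interval on the match score (an assumed statistical treatment, not anything corres specifies; `score_interval` is a hypothetical helper) shows how wide the uncertainty stays after a short match:

```python
# Sketch: how many Stockfish-vs-Leela games from a test position are
# needed before the match score distinguishes two candidate moves?
# Normal approximation to the binomial interval (illustrative only).
import math

def score_interval(points, n, z=1.96):
    """~95% interval for the per-game score, given total points over n games."""
    p = points / n
    half = z * math.sqrt(max(p * (1 - p), 1e-12) / n)
    return p - half, p + half

print(score_interval(11.5, 20))    # e.g. 11.5/20 for one candidate move
print(score_interval(575, 1000))   # same 0.575 score rate over 1000 games
```

At 11.5/20 the interval still straddles 0.5, so twenty games prove nothing; the same scoring rate over a thousand games would. This is why "some games" from a single position rarely settles which move is best.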
User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Out of the Kai 450 positions, 99 not found by SF

Post by Laskos »

jp wrote: Tue Aug 13, 2019 10:12 am
<snipped>
I know that you (and others) believe there are other reasons to think each of them is valid, but it's a problem having two uncertain things 'validating' each other by correlating highly, even if it's possible that the end conclusion turns out to be true.
While I had great doubts myself about my positional test suites, about the positional superiority of Lc0 on a reasonable GPU with a strong net I have few doubts. Maybe I don't understand what "positional" means. What is obvious from games and test suites is that Lc0 is clearly weaker tactically compared not only to Stockfish, but even to much weaker modern AB engines with a regular eval. It is again obvious to me that, to be the strongest engine on my PC from regular openings, Lc0 compensates with its very strong "positional" play, maybe in my wrong understanding of the notion of "positional" as some sort of conjugate of "tactical". I am curious: do you have some confidence that a strong Lc0 is superior "positionally" to a strong regular-eval AB engine?

That's why I was pretty happy that Lc0 comes out far atop in my dubious positional suites. And after all, is it common sense to say: databases of human games are wrong, Lc0 is not that strong positionally so a validation by it means nothing, and the only way to check for correctness is to go back and forth from each position with Stockfish for hours on end? My common sense tells me that this is the wrong methodology for what I understand to be a positional test suite.
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Out of the Kai 450 positions, 99 not found by SF

Post by zullil »

Laskos wrote: Tue Aug 13, 2019 8:16 am
Uri Blass wrote: Tue Aug 13, 2019 5:46 am I think that if we claim that stockfish does not find the right move then we need some evidence that the move we claim to be better is really the better move
<snipped>
What does "best move" or "right move" even mean here? Surely there are many moves in most of these positions that are equal with correct play---likely all leading to draws.

And forgive me, but in my opinion relying on statistics from completed games, especially human games, is like relying on noise created by cosmic radiation. My guess is that 95% of games that contain one of these positions are decided by significant blunders made subsequent to the positions.

This said, Cfish-dev still thinks Rac1 should be played in the first position in Dann's original post:

info depth 54 seldepth 76 multipv 1 score cp 95 nodes 1843684510413 nps 34496397 hashfull 1000 tbhits 0 time 53445711 pv a1c1 g4h5 d2c4 h5g6 f1g2 f7f6 c2b2 b8a7 a3a4 g6f7 c5d3 c8d7 c4a5 d7c7 d3c5 d8b8 a5c4 b8d8 b2d2 d5e7 c4a5 d8b8 c1b1 g8g7 e1c1 e8c8 a5c4 c8d8 d2e2 h7h5 b4b5 a6b5 a4b5 a7c5 d4c5 f7c4 c1c4 c7e5 h2h4 g7g6 h4g5 f6g5 f2f4 g5f4 g3f4 e5e6 b5c6 e7c6 g2c6 b7c6 b1b8 d8b8 c4d4 g6g7 d4d6 b8b1 g1h2

The move Qb2 remains as "second best": info depth 53 currmove c2b2 currmovenumber 2

Will now let Lc0 spend some hours on the same position. If it can run that long without crashing... :wink:
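[Editor's sketch] For readers following the raw output above: those lines follow the UCI protocol (`depth`, `score cp`, `pv`, etc.). A minimal parsing sketch — the shortened sample line and the `parse_uci_info` helper are illustrative, not from any engine's code:

```python
# Minimal sketch of parsing a UCI "info" line like the Cfish output above;
# field names follow the UCI protocol, the sample line is shortened here.
def parse_uci_info(line):
    """Split a UCI info line into {field: value}; the pv becomes a move list."""
    tokens = line.split()
    assert tokens[0] == "info"
    numeric = {"depth", "seldepth", "multipv", "nodes", "nps",
               "hashfull", "tbhits", "time"}
    fields, i = {}, 1
    while i < len(tokens):
        key = tokens[i]
        if key == "pv":                      # everything after "pv" is the line
            fields["pv"] = tokens[i + 1:]
            break
        if key == "score":                   # e.g. "score cp 95"
            fields["score"] = (tokens[i + 1], int(tokens[i + 2]))
            i += 3
        elif key in numeric:
            fields[key] = int(tokens[i + 1])
            i += 2
        else:
            i += 2                           # skip unknown key/value pairs
    return fields

info = parse_uci_info("info depth 54 seldepth 76 multipv 1 score cp 95 "
                      "nodes 1843684510413 nps 34496397 time 53445711 "
                      "pv a1c1 g4h5 d2c4")
print(info["score"], info["pv"][0])
```

So the quoted search reached depth 54, evaluates the position at +0.95 pawns for the side to move, and its principal variation starts with a1c1 (Rac1) — the move under discussion.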
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Out of the Kai 450 positions, 99 not found by SF

Post by zullil »

zullil wrote: Tue Aug 13, 2019 12:39 pm
<snipped>
Lc0 switches fairly quickly from Qb2 to Rac1.

info depth 21 seldepth 66 time 854320 nodes 29572406 score cp 87 hashfull 139 nps 34615 tbhits 0 pv a1c1 b8d6 d2c4 d6f8 f1g2 f8g7 c2b3 f5f4 e3f4 e8e1 c1e1 g5f4 c4e5 g4h5 b3c4 d5b6 c4c3 f4g3 h2g3 h5g6 g2f3 b6d5 c3c1 d5f6 c1d2 f6d5 g1g2 d8e8 f3g4 c8a8 f2f4 d5f6 g4f3 a8c8 d2d1 e8e7 g3g4 h7h5 e5g6 e7e1 d1e1 f7g6 g4h5 f6h5 e1e4 g8h8 f4f5 c8f5 e4f5
bestmove a1c1 ponder b8d6

But maybe a longer search is needed ...
User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Out of the Kai 450 positions, 99 not found by SF

Post by Laskos »

zullil wrote: Tue Aug 13, 2019 12:39 pm
<snipped>
What does "best move" or "right move" even mean here? Surely there are many moves in most of these positions that are equal with correct play---likely all leading to draws.
First, this is from the "perfect player" viewpoint: all positions are w/d/l. In a world significantly below the "perfect player" level, like today's chess engine world, the best we have are estimators of w/d/l, whose evals are continuous from w to l and, as much as possible, monotonic (the monotonicity in fact gives the strength of the engine: the more of it there is, the better it plays). And then, how do you know the w/d/l shape of the space of possible positions, or of positions occurring in regular games, from the "perfect player" point of view? Are there many bottlenecks with one single best move? Or is the usual situation in regular games many equivalent best moves? Maybe the oddities occurring in databases do mean something close to bottlenecks? We do have 6- and even 7-men tablebases, and although I haven't sat down to analyze them, both of these cases occur, and interestingly enough, an engine like Stockfish by itself (no TBs), with much hardcoded endgame knowledge, often finds the solutions (bm) even in these bottlenecks, often showing itself to be a good estimator.
And forgive me, but in my opinion relying on statistics from completed games, especially human games, is like relying on noise created by cosmic radiation. My guess is that 95% of games that contain one of these positions are decided by significant blunders made subsequent to the positions.
What is so bad about openings and humans? Most opening theory to this day was built by humans; the engines' contribution is small and mostly came as "tricks", often in messy positions. I think you just imply that we can throw away all opening theory and replace it with Stockfish analysis. That is, with the Cerebellum book. Good luck with that!
<snipped>
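[Editor's sketch] Laskos's view of evals as continuous, monotonic w/d/l estimators can be illustrated with the Elo-style logistic mapping from centipawns to expected score. The 400-per-unit scaling is a rating-model convention used here for illustration, not Stockfish's internal model:

```python
# Illustrative mapping from a centipawn eval to an expected score in [0, 1].
import math

def expected_score(cp):
    """Elo-style logistic curve: cp = 0 gives 0.5, larger cp approaches 1.
    The /400 scaling is an assumed convention, not engine-calibrated."""
    return 1.0 / (1.0 + 10.0 ** (-cp / 400.0))

for cp in (0, 95, 300, -300):
    print(cp, round(expected_score(cp), 3))
```

The curve is strictly monotonic in the eval, which is the property he is pointing at: a stronger engine is one whose eval ordering of positions tracks the true w/d/l ordering more faithfully, not one whose numbers are "correct" in any absolute sense.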
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Out of the Kai 450 positions, 99 not found by SF

Post by zullil »

Laskos wrote: Tue Aug 13, 2019 2:32 pm
<snipped>
info depth 25 seldepth 70 time 1671642 nodes 54668541 score cp 94 hashfull 282 nps 32703 tbhits 0 pv a1c1 b8d6 d2c4 d6f8 f1g2 f8g7 c2b3 f5f4 e3f4 e8e1 c1e1 g5f4 c4e5 g4h5 a3a4 d5e7 b3c3 h5g6 a4a5 f4g3 h2g3 c8c7 g2e4 e7c8 c3f3 c8d6 e4d3 c7e7 e1e3 e7f6 f3e2 f6g5 g1g2 d8e8 e2d1 g6d3 d1d3 e8e7 d3d1 g5f5 e3f3 f5h5 d1c1 h7h6 c1d2 d6b5 f3f4 b5d4 f4d4 h5e5 d4d8 g8h7 d2d3 f7f5 d8d6 e7f7 d6d7 f7d7
bestmove a1c1 ponder b8d6
quit
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Out of the Kai 450 positions, 99 not found by SF

Post by Uri Blass »

<snipped>
Laskos wrote: Tue Aug 13, 2019 2:32 pm What is so bad about openings and humans? Most of opening theory to our days is built by humans, engines' contribution is small and mostly came with "tricks", and often in messy positions. I think you just imply that we can throw away all opening theory and replace it with Stockfish analysis. That is, with the Cerebellum book. Good luck with that!
How do you measure that most opening theory was built by humans?

It is possible that humans played many moves based on analysis by engines, so the fact that they were the first to play a move does not prove that engines did not practically build it.

You can only say about old theory that it was built by humans; it is not clear for novelties from the last 20 years.

I think it may be interesting if somebody threw away all opening theory, replaced it with Stockfish analysis using a lot of computer time, and saw whether other books can beat this book.