LC0 bad search and poor tactics

Werewolf · Post by **Werewolf** » Sat Jan 26, 2019 11:01 pm

yanquis1972 wrote: ↑Sat Jan 26, 2019 6:52 pm yeah, what i found when looking at test10 & test40 in this position isn't that they avoid f6 (although it may be that t10 (11248) was able to see it was a mistake eventually; i don't recall) but that, unlike t30, avoid getting themselves into the position in the first place. which is another reason to avoid 8+ move opening positions in testing NNs, imo.

Test 40 is the one I'm really interested in. But test 30 is still about 200 elo better than 40. It has been so long I can't remember how long test 30 has been trained for, but I think test 40 will need a few more months until it is mature.

corres · Post by **corres** » Sat Jan 26, 2019 11:22 pm

brianr wrote: ↑Sat Jan 26, 2019 12:52 pm Yes, NN engines have a distinctly different style of play, however collectively software improvements have contributed more strength than one might think. Yes, hardware gets faster and certainly helps quite a lot, but software refinements in the aggregate are a major factor. For example: null move, search pruning, eval sophistication and tuning, hash tables, killer moves, move ordering, rigorous testing, tablebases, etc.
NN engines are still in their infancy; counting from Giraffe (not the first, but arguably the first reasonably strong NN engine publicly available), they are playing like toddlers (less than 4 years old). Leela is just learning to walk.

I am afraid that if Leela is only a "toddler" today, she will never become an adult.
You are too optimistic about the future of Leela.
But I hope you are right.

Dann Corbit · Post by **Dann Corbit** » Sun Jan 27, 2019 3:02 am

lkaufman wrote: ↑Sat Jan 26, 2019 6:01 pm 4...dxc4 is not a mistake at all, it is the famous Vienna variation, which has been my favorite defense to the Queen's Gambit for years and is played by top GMs. The gambit line 6.Bc4 was only found a few years ago, and it is not clear that it gives White any more advantage than any normal White opening does, as long as Black doesn't try to win the second pawn. The mistake 10...f6? is only refuted by a spectacular piece sacrifice that is very hard to discover. So I don't think this is a very good example to make your point.

I am dying of curiosity to know what the piece sacrifice is (or do you mean the knight?)

M ANSARI · Post by **M ANSARI** » Sun Jan 27, 2019 2:05 pm

I think Lc0 is indeed very weak in tactics when compared to modern AB engines. I was looking at some of the games from a tourney I ran against the latest dev SF, and it lost many points by overlooking simple one-move tactics. But even with this handicap it still trounced SF by around 90 Elo points. The way it plays the opening and middle game is very, very impressive. I will see if I can post a few of these games with some analysis. I think Lc0 can gain a lot of easy Elo points by simply using some AB search to keep a sanity check on things and to avoid overlooking simple tactics that can ruin a winning position in one move. In the tourney I looked at 3 of Lc0's losses, and all were from totally winning positions where SF had itself down by 3 or more in evaluation. Then came just one bad tactical oversight, and things were reversed from totally winning to totally losing. I also remember another game where it didn't see that it was walking into a mating net in a totally drawn rook and queen endgame, where SF immediately saw the mate after the Lc0 blunder. I think this is very normal considering that Lc0 is still in its infancy, but it does mean that once the program matures it will be a formidable engine!

yanquis1972 · Post by **yanquis1972** » Sun Jan 27, 2019 6:14 pm

M ANSARI wrote: ↑Sun Jan 27, 2019 2:05 pm I think Lc0 is indeed very weak in tactics when compared to modern AB engines. I was looking at some of the games from a tourney I ran against the latest dev SF, and it lost many points by overlooking simple one-move tactics. But even with this handicap it still trounced SF by around 90 Elo points. The way it plays the opening and middle game is very, very impressive. I will see if I can post a few of these games with some analysis. I think Lc0 can gain a lot of easy Elo points by simply using some AB search to keep a sanity check on things and to avoid overlooking simple tactics that can ruin a winning position in one move. In the tourney I looked at 3 of Lc0's losses, and all were from totally winning positions where SF had itself down by 3 or more in evaluation. Then came just one bad tactical oversight, and things were reversed from totally winning to totally losing. I also remember another game where it didn't see that it was walking into a mating net in a totally drawn rook and queen endgame, where SF immediately saw the mate after the Lc0 blunder. I think this is very normal considering that Lc0 is still in its infancy, but it does mean that once the program matures it will be a formidable engine!

I’m still waiting both for AB search to be implemented, simply to check DeepMind’s work, and particularly for even a single-core AB search running alongside for blunder checking/win finding. One of the obvious advantages of a GPU-dedicated engine is that it leaves so much of the CPU free; I think it’s only a matter of time before someone exploits that. Houdart, Lefler, and the SF + Leela teams must all at least be curious. Hopefully test40 is a massive success and we can move on to wide experimentation.

A0 was a proof of concept and chess was a tiny tangent, I think once we get past the awe there’s a lot to be applied to the basic framework.

mbabigian · Post by **mbabigian** » Sun Jan 27, 2019 7:40 pm

One interesting note on the search after reading more about it. If I understand correctly what I was reading on the discord, they implemented a dynamic CPUCT where the value changes to prune more heavily as search deepens. This is similar to AB engines which also prune more heavily at deeper depths with LMR etc.

The critical difference is that AB engines start the search from scratch when the root position is moved (aside from having a populated hash) and LC0 does not. This will cause bad moves later in the PV to have large visits that often can't be overcome in reasonable time.

For example, say we search 100 million nodes at the root and we have this line: 1 e4 e6 2 d4 d6 3 c4 c6 4 Nf3 g6. Say the white side is being played by an AB engine, and both white and black agree this line is best, except that g6 is a blunder. LC0 might have 95 million visits on g6 at move 1, and they are retained as you move the root closer to g6. Now LC0 is analyzing its move 4, is no longer using the less exploratory CPUCT, and starts to realize g6 is terrible. It starts plowing visits into, say, b6, which is the best move, but due to the narrower CPUCT at move 1, b6 only has 10,000 visits to start with. LC0 never has the time to crank the visit count for b6 up over 95 million, and it blunders. I used a completely fictitious move setup and numbers so that we don't troll off into what GPU you have and what network, and end up not discussing search. I look forward to seeing how this example will be derailed, however.

If my understanding of the dynamic CPUCT is correct, the longer both programs stay with the same PV before a blunder, especially if the blunder is half a dozen ply in, the less likely LC0 will search its way out of it when presented with the position at root. This may help explain some of its later middle game/endgame issues.
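To put rough numbers on the argument above, here is a minimal Python sketch. It is not Lc0's actual code. It assumes an AlphaZero-style PUCT score (Q plus an exploration term scaled by CPUCT) and assumes the move finally played is the one with the most visits. The function names, Q values, priors and the scaled-down visit counts are all invented for illustration.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, cpuct):
    """AlphaZero-style PUCT: Q plus an exploration bonus scaled by cpuct."""
    return q + cpuct * prior * math.sqrt(parent_visits) / (1 + child_visits)


def visits_to_overtake(leader_visits, challenger_visits):
    """Fewest extra visits the challenger needs to end up with strictly
    more visits than the leader, even if it receives every new visit."""
    return max(0, leader_visits - challenger_visits + 1)


def simulate_catch_up(leader, challenger, cpuct=1.5):
    """Run PUCT selection between two root moves until the challenger has
    more visits than the leader; return the number of visits spent.
    Q values stay fixed here, which is a simplification."""
    spent = 0
    while challenger["n"] <= leader["n"]:
        parent = leader["n"] + challenger["n"]
        lead = puct_score(leader["q"], leader["p"], parent, leader["n"], cpuct)
        chal = puct_score(challenger["q"], challenger["p"], parent,
                          challenger["n"], cpuct)
        picked = challenger if chal >= lead else leader
        picked["n"] += 1
        spent += 1
    return spent


# Hypothetical, scaled-down version of the g6/b6 example (all values invented).
g6 = {"n": 9500, "q": -0.5, "p": 0.90}  # inherited visits, now known to be bad
b6 = {"n": 1, "q": 0.2, "p": 0.05}      # good move that starts almost unvisited
spent = simulate_catch_up(g6, b6)

# The same deficit with the full-scale numbers from the example.
deficit_full_scale = visits_to_overtake(95_000_000, 10_000)
```

Even in this best case, where every new visit goes to b6, the search has to spend about as many visits as the inherited deficit before the most-visited move changes. Whether a dynamic CPUCT makes this better or worse in practice depends on how it is actually scheduled, which this sketch does not model.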

One of the reasons I ended up examining these issues to begin with was that I was curious how often LC0 changes its best move after 1 million nodes. I tried many positions to find ones where several moves were equally good. I was not surprised to find that LC0 rarely changes its mind after 1 million nodes, making the tens of millions of nodes searched afterwards mostly in vain. As I moved forward through an opening, I watched this heavy pile-up of visits on the original PV and noticed that, as I moved through a game, it got harder and harder for LC0 to "change its mind" if the game followed the PV for a while.
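For anyone who wants to repeat this kind of measurement, a small Python sketch of the bookkeeping might look like the following. It assumes you have already collected (node count, best move) snapshots from the engine output yourself; the function name and the sample data are hypothetical, not real LC0 results.

```python
def last_best_move_change(snapshots):
    """Return the node count at which the reported best move last changed.

    snapshots: list of (nodes, best_move) tuples, ordered by node count.
    Returns None if the best move never changed.
    """
    last_change = None
    for (_, previous_move), (nodes, move) in zip(snapshots, snapshots[1:]):
        if move != previous_move:
            last_change = nodes
    return last_change


# Invented sample: the best move settles early and never moves again.
sample = [
    (1_000, "e4"),
    (100_000, "d4"),
    (1_000_000, "d4"),
    (50_000_000, "d4"),
]
changed_at = last_best_move_change(sample)
```

Running this over many positions and counting how often the result is above 1 million nodes would give a rough figure for how often the extra search changes anything.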

For those that will put time into rethinking the current search design, I hope this made sense and that my understanding of a variable CPUCT is not wrong.

Food for thought.

Werewolf · Post by **Werewolf** » Sun Jan 27, 2019 11:50 pm

mbabigian wrote: ↑Sun Jan 27, 2019 7:40 pm One interesting note on the search after reading more about it. If I understand correctly what I was reading on the discord, they implemented a dynamic CPUCT where the value changes to prune more heavily as search deepens. This is similar to AB engines which also prune more heavily at deeper depths with LMR etc.

The critical difference is that AB engines start the search from scratch when the root position is moved (aside from having a populated hash) and LC0 does not. This will cause bad moves later in the PV to have large visits that often can't be overcome in reasonable time.

For example, say we search 100 million nodes at the root and we have this line: 1 e4 e6 2 d4 d6 3 c4 c6 4 Nf3 g6. Say the white side is being played by an AB engine, and both white and black agree this line is best, except that g6 is a blunder. LC0 might have 95 million visits on g6 at move 1, and they are retained as you move the root closer to g6. Now LC0 is analyzing its move 4, is no longer using the less exploratory CPUCT, and starts to realize g6 is terrible. It starts plowing visits into, say, b6, which is the best move, but due to the narrower CPUCT at move 1, b6 only has 10,000 visits to start with. LC0 never has the time to crank the visit count for b6 up over 95 million, and it blunders. I used a completely fictitious move setup and numbers so that we don't troll off into what GPU you have and what network, and end up not discussing search. I look forward to seeing how this example will be derailed, however.

If my understanding of the dynamic CPUCT is correct, the longer both programs stay with the same PV before a blunder, especially if the blunder is half a dozen ply in, the less likely LC0 will search its way out of it when presented with the position at root. This may help explain some of its later middle game/endgame issues.

One of the reasons I ended up examining these issues to begin with was that I was curious how often LC0 changes its best move after 1 million nodes. I tried many positions to find ones where several moves were equally good. I was not surprised to find that LC0 rarely changes its mind after 1 million nodes, making the tens of millions of nodes searched afterwards mostly in vain. As I moved forward through an opening, I watched this heavy pile-up of visits on the original PV and noticed that, as I moved through a game, it got harder and harder for LC0 to "change its mind" if the game followed the PV for a while.

For those that will put time into rethinking the current search design, I hope this made sense and that my understanding of a variable CPUCT is not wrong.

Food for thought.

Very interesting. If what you're saying is correct it explains a lot.

Ovyron · Post by **Ovyron** » Mon Jan 28, 2019 6:54 am

lkaufman wrote: ↑Sat Jan 26, 2019 6:01 pm 4...dxc4 is not a mistake at all

Right.

I recommend that people who say "this is a mistake" or "that is a mistake" play some correspondence chess against strong opponents. You'll find it amazing how many moves aren't mistakes at all, even ones that clearly look like blunders. Like, there are positions where S10 gives +1.50 scores, but if you follow the variations, it just shuffles pieces around and eventually realizes the position is no higher than +0.80 (though a more realistic score would be 0.15 or something).

Nobody is going to win a prize playing moves that score 1.20 pawns worse than what the engine suggests, but objectively those moves aren't mistakes. You might need days of analysis to find that out, and it can't be seen in mickey mouse time control games (as my old friend Paul used to call them), much less for moves like 4...dxc4 that might turn out to actually be the strongest ones.

M ANSARI · Post by **M ANSARI** » Mon Jan 28, 2019 7:49 am

I just did a 50-game match with the latest dev SF against Lc0, and although Lc0 won that match easily (+90 Elo), I can easily point to many one-move howlers. I mean cases where a position would go from +5 for Lc0 (totally winning, agreed by both sides) to -5 in one move. Now of course you could say that 3 min plus a 2 sec increment is a very fast time control and those one-move tactical misses are to be expected ... BUT ... I took those positions, put Lc0 on them and let it sit for a long time. Even after 10 minutes in one position, it could not see much difference between 2 moves: one that is totally 0.00 and another that SF and Komodo immediately score at around +4 and losing. It took SF and Komodo less than 1 second to see the tactic. In another position where Lc0 was also up +5, it missed a simple tactic with which it would finally have broken through and won. It went from +5 to losing the entire game. I am putting together a post with these things I saw, and I must say I was incredibly impressed. I have been around computer chess long enough to realize that the small things that cause the engine to play some bad moves will be fixed soon enough, so that part doesn't really concern me. What got my attention more was that in basically every game Lc0 got out of the opening with a nice advantage. Sure, it didn't win many of those games and even lost some winning positions, but that seems like it will be easy to patch up.

I think there is something that can be improved in the way it searches. I don't know much about how the engine works, but if it is following AI then it should mimic how a strong GM plays chess. I wouldn't change how it plays the opening or middle game, but there should be a trigger where it starts to look for tactics in a position ... sort of like the old Rybka Winfinder. If you look at how humans play chess, they will play positionally and then they will feel that there is a tactical shot in the position ... they might not be able to calculate it in time, but they know it is there and they will start looking for the working tactic. If Lc0 could use the unused CPU power it has at its disposal to identify and look for these tactics, it would be an even more formidable chess engine. Not only would it be stronger, but it would also be a much more useful chess engine for analysis.
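As a rough illustration of what such a sanity check could look like, here is a minimal Python sketch. It is only an assumption about one possible design, not something Lc0 or SF implements: the NN engine proposes moves in its order of preference, a shallow AB search on the spare CPU supplies centipawn scores, and the NN's choice is vetoed if the AB search thinks it throws away too much. The function name, the 200-centipawn threshold and the sample scores are all hypothetical.

```python
def sanity_checked_move(nn_ranked_moves, ab_scores_cp, max_drop_cp=200):
    """Pick the NN's most preferred move that the AB check does not veto.

    nn_ranked_moves: moves ordered from most to least preferred by the NN.
    ab_scores_cp: dict of move -> centipawn score from a shallow AB search,
                  from the point of view of the side to move (non-empty).
    max_drop_cp: a move is vetoed if it scores this much worse than the
                 best AB move (hypothetical threshold).
    """
    best_ab_score = max(ab_scores_cp.values())
    for move in nn_ranked_moves:
        score = ab_scores_cp.get(move)
        if score is not None and best_ab_score - score <= max_drop_cp:
            return move
    # Every NN candidate was vetoed or unscored: fall back to the AB best move.
    return max(ab_scores_cp, key=ab_scores_cp.get)


# Invented example: the NN likes Qh5, but the AB check sees it loses material.
choice = sanity_checked_move(["Qh5", "Rd1"], {"Qh5": -400, "Rd1": 0})
```

The NN stays in charge of the positional choice here; the AB search only acts as a veto on moves it considers clear tactical blunders.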

By the way, if you want to see something incredible ... watch this video of Magnus Carlsen playing in a Lichess 1-minute bullet tourney just a few weeks back. Amazing positional acumen with incredible tactical awareness ... what a potent combination. If you enjoy the game of chess you will really enjoy this video.

https://www.youtube.com/watch?v=8RO1YrQMBQ8

corres · Post by **corres** » Mon Jan 28, 2019 9:57 am

As I see it, there are a lot of people who have been misled by the fast development of Leela and by the obvious issues in her play.
I would like to point out some facts:
1. The NN technique is about 40 years old; only its use in chess engines is relatively new. It is mainly in the last few years that the technique has come into wide use. Maybe this gave the idea of using NNs in chess engines. But Leela is not the original realization of this idea. As her developers have stated, they want to reproduce the work of the Google team that made the AlphaZero chess machine. DeepMind's sparse publications make the developers' work hard, but that information also speeds it up.
2. The development of AB-type engines happened together with the development of PC hardware and of game theory. DeepMind and the Leela team used these results too. Moreover, by the time the Leela team began to work on Leela, the appropriate hardware was available: NVIDIA GPUs. If we could run Leela only on CPUs and ATI-type GPUs, she could not reach the Elo of the top AB engines.
3. Every technique has its own benefits and drawbacks. These are tightly bound to the technique, so eliminating them completely is impossible. Because of this, in the future too an AB-type engine will be a tactical monster with poorer strategic knowledge, and an NN-type engine will be a strategic monster with poorer tactical knowledge. We hope that if the Leela team makes a bigger NN, it will have more knowledge and Leela will make fewer mistakes.
4. Because of the different evaluation of chess positions and the different playing styles, it is hard to build a combined engine from an AB engine and an NN engine. For now we have to use our own chess knowledge to decide the truth.
