a crying shame (re: self-learning engines)

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

User avatar
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: a crying shame (re: self-learning engines)

Post by Ovyron »

Yeah, I don't think dkappe understands the kind of learning we're talking about. The user goes and plays moves in the position, in some variation they're interested in, beyond Leela's horizon (so Leela is unable to see them from the root), and makes Leela give a score to the position there. Then the user does some "reverse analysis", where the previous nodes are visited. Leela uses what it has learned from the future position to score the line more accurately, and either keeps showing this learned score for the previous moves (without "searching" it, so it's done very fast), or switches to a move that is better for one of the sides (because it now scores better), in which case the user goes forward into that move and repeats the process.

Eventually, Leela will stop switching moves and will show this mainline at the root, AND THE PV AND SCORE WILL RESEMBLE SOMETHING THAT LEELA WOULD SHOW AT DEPTH 50 (this is something I do with Learning Stockfish to get high relative depth on a 10-year-old CPU) IN AN INCREDIBLY SHORTER TIME FRAME. Whew. So this learning is about getting much faster results by interactively analyzing the positions and letting the engine learn line refutations (by, say, showing it lines where Stockfish knows the best moves) in a specific line, a very different process from NN learning.
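To make the workflow concrete, here is a minimal sketch of that reverse-analysis loop, using the python-chess library with any UCI engine. The node limit, the plain dict standing in for a learn table, and brute-forcing every legal reply at each step are all simplifications; a real session would only revisit the handful of candidate moves the user cares about.

Code: Select all

import chess
import chess.engine

LIMIT = chess.engine.Limit(nodes=11000)
learned = {}  # FEN -> centipawn score from the side to move's point of view

def score(engine, board):
    """Use a previously learned score when we have one; search otherwise."""
    fen = board.fen()
    if fen in learned:
        return learned[fen]                 # instant, no search needed
    info = engine.analyse(board, LIMIT)
    return info["score"].pov(board.turn).score(mate_score=100000)

def reverse_analyze(engine, board, line):
    """Play a user-chosen line past the engine's horizon, then walk it
    backward, minimaxing learned scores toward the root."""
    for move in line:
        board.push(move)
    learned[board.fen()] = score(engine, board)   # score the deep position once
    for _ in line:
        board.pop()                               # step back one ply...
        best = None
        for move in board.legal_moves:            # ...and re-pick the best move,
            board.push(move)                      # letting learned scores override
            value = -score(engine, board)         # the engine's shallow view
            board.pop()
            best = value if best is None else max(best, value)
        learned[board.fen()] = best
    return learned[board.fen()]                   # root now reflects the deep find

If the backed-up score changes which move is best at some earlier node, you go forward down the new move and repeat, exactly as described above.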
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: a crying shame (re: self-learning engines)

Post by Michael Sherwin »

carldaman wrote: Sat Jan 18, 2020 9:46 pm The more I think about it, the more I'm convinced it's quite a travesty that these new self-learning programs do not support live learning!

I've recently run a match between the freeware Critter 1.6a, with its session-file learning switched on, and the much-vaunted Fat Fritz, a commercial program I had to pay for, and it took less than 10 games for Critter to turn the match score around in its favor, due to its active learning, and trounce its GPU-based 'stronger' opponent. Of course, no real surprise here, but I get the feeling most people don't get this basic fact straight: learning can steadily and significantly increase playing (and analysis) strength.

Yet, I haven't seen much discussion, let alone requests, about letting Leela (or its derivatives) continue to learn as it plays. Is it that big a deal to change the program so this is made possible? It is puzzling and deflating to see Fat Fritz unable to continue to learn while a free program can.

OK, as some might say, why should they bother, since they (the Leela team) do this work for free? Well, in that case, why won't ChessBase? They surely have the resources to add such a useful feature! In either case, I think it's a poor argument to say it would be difficult: a lot more work has gone into prior coding, and such a learning feature would be a guaranteed instant winner, worth adding even with serious effort.

The same thing goes for Komodo, or any other commercial endeavor playing 'catch-up' with stronger engines. Learning automatically and quickly adds strength. It's a simple fact. Can anyone argue with a straight face that they don't care about engine strength? This is what Komodo's programmers imply when they claim that there is just "not enough interest" in this feature. Not enough interest in far greater engine strength?! Cue laugh track, please... :P

Anyway, this post is not really directed at Komodo's programmers, but rather at Leela and Fat Fritz, much-hyped engines that are based on self-learning, yet can't learn a lick after you've installed them. If you still aren't convinced that learning works exceedingly well, check out Richard Vida's Critter or even Michael Sherwin's RomiChess, both freeware, and the latter even open source.
Hi carldaman, I'm glad RomiChess still has a small fanclub. :) Anyway, just for informational purposes, I'm quoting a reply I made to another of your posts that mentions Romi.
Thanks, carldaman, for being accurate about RomiChess's learning. I regret putting both types of learning into Romi, because all people seem to mention is the Monkey See Monkey Do, and they totally neglect the reinforcement learning loaded into the hash table before each search. Both are based on real-world animal learning.

Monkey See Monkey Do
In 2005 I read a story about scientists on a Japanese island digging up wild potatoes and washing them in a stream. The monkeys on the island saw what the scientists were doing, and then they started to wash their potatoes. Since it just mimics what works, for as long as it continues to work, it can be used to make moves instantly. People call this a book approach, and maybe in a way it is book learning, but primarily it is reinforcement learning applied instantly, requiring no search.
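As a toy illustration (this is not RomiChess's actual code), MSMD only needs a table of previously played positions with a win/loss tally per move: a stored move is replayed instantly for as long as its record stays positive, and otherwise the engine falls back to a normal search.

Code: Select all

games = {}  # FEN -> {move: [wins, losses]} accumulated from finished games

def record_game(moves_played, side_that_won):
    """moves_played: (fen, move, side_to_move) triples from one finished game."""
    for fen, move, side in moves_played:
        tally = games.setdefault(fen, {}).setdefault(move, [0, 0])
        tally[0 if side == side_that_won else 1] += 1

def msmd_move(fen):
    """Return a stored move to play instantly, or None to fall back to search."""
    stats = games.get(fen)
    if not stats:
        return None
    move, (wins, losses) = max(stats.items(), key=lambda kv: kv[1][0] - kv[1][1])
    return move if wins > losses else None    # mimic only what still works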

Pavlov's Dog Experiments
Also in 2005 I read about Pavlov's dog experiments and how the dogs learned by getting a reward or a punishment. I simply adapted that to computer chess by giving the winning side's moves a small reward and the losing side's moves a small punishment. These were saved in an ever-growing tree structure. If MSMD did not produce an instant move, and if there was a subtree of previously played games stored, then the scores of all the positions, with their reward/punishment values, are loaded into the hash table prior to the search. All this does is sway the search toward better moves, or simply away from punished moves.

Critics insist that there are many flaws in this approach. One major criticism is that once Romi is out of the stored tree it is of no further value. This is not true, for two reasons. Reason one: if Romi plays d4 openings better than e4 openings and the PDE causes Romi to play d4 instead of e4, then Romi has benefited. This applies to any move choice that has enough instances to provide reliable learning. Reason two: the hash table retains the influence of the learning long after the learning data has run out. Let's say a bishop on g5 is pinning a knight on f6, the opponent plays h6, and the PDE prefers Bh4 over Bxf6 or Bf4; then the Bh4 influence will be in the hash table even after the PDE data has run out. Therefore, if the opponent plays h6, say, six moves later, the influence that Bh4 is better is still in the hash table.
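A hedged sketch of that mechanism (names illustrative, not Romi's actual code): after each game every position is nudged a few centipawns toward the winner, and before the next search the accumulated nudges are preloaded into the transposition table as depth-0 entries, shallow enough never to cut off the real search, but enough to sway close move choices.

Code: Select all

REWARD_CP = 4    # a few centipawns is enough to tip otherwise-equal choices

learn_tree = {}  # FEN -> accumulated reward/punishment, in centipawns

def update_after_game(positions):
    """positions: (fen, winner_was_to_move) pairs from one finished game."""
    for fen, winner_to_move in positions:
        delta = REWARD_CP if winner_to_move else -REWARD_CP
        learn_tree[fen] = learn_tree.get(fen, 0) + delta

def preload_hash(tt):
    """Seed the hash table before searching. Depth-0 'learned' entries bias
    the search but are too shallow to produce cutoffs on their own."""
    for fen, bias in learn_tree.items():
        tt[fen] = {"score": bias, "depth": 0, "flag": "learned"}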

An experiment against (iirc) 6 top engines, using Bob's humongous book for those engines while Romi used only its learn.dat file, showed a linear increase of 50 Elo for Romi every 5000 games. An experiment that I conducted using the 10 Nunn positions, playing both sides against Glaurung, repeated 20 times for a total of 400 games, showed a linear gain from 5% for Romi in the first set to 95% for Romi in set 20.
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: a crying shame (re: self-learning engines)

Post by corres »

carldaman wrote: Sat Jan 18, 2020 9:46 pm I've recently run a match between the freeware Critter 1.6a, with its session-file learning switched on, and the much-vaunted Fat Fritz, a commercial program I had to pay for, and it took less than 10 games for Critter to turn the match score around in its favor, due to its active learning, and trounce its GPU-based 'stronger' opponent. Of course, no real surprise here, but I get the feeling most people don't get this basic fact straight: learning can steadily and significantly increase playing (and analysis) strength.
...
I think you ran that match without an opening book, and because of this the two engines used a very, very narrow "opening book" that is useless in practical tournaments.
If you used even a very narrow but useful opening repertoire, you would have to run many, many games between the engines to gain any Elo from self-learning.
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: a crying shame (re: self-learning engines)

Post by carldaman »

Thanks, Michael, I had replied in the other thread without checking this one first. :)

Hi Michael, it is great to see you posting here.

I'm glad that Romi's learning is the way it is. Easily one of my favorite engines to spar against. I just don't know what the next game may bring. :D

The reinforcement learning in Romi predated AlphaZero's self-learning by many years. This certainly deserves more recognition. One can only wonder where Romi would be, if it could train on millions upon millions of games...

Best wishes,
Carl
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: a crying shame (re: self-learning engines)

Post by carldaman »

corres wrote: Sun Feb 09, 2020 11:10 pm
carldaman wrote: Sat Jan 18, 2020 9:46 pm I've recently run a match between the freeware Critter 1.6a, with its session-file learning switched on, and the much-vaunted Fat Fritz, a commercial program I had to pay for, and it took less than 10 games for Critter to turn the match score around in its favor, due to its active learning, and trounce its GPU-based 'stronger' opponent. Of course, no real surprise here, but I get the feeling most people don't get this basic fact straight: learning can steadily and significantly increase playing (and analysis) strength.
...
I think you ran that match without an opening book, and because of this the two engines used a very, very narrow "opening book" that is useless in practical tournaments.
If you used even a very narrow but useful opening repertoire, you would have to run many, many games between the engines to gain any Elo from self-learning.
Yes, I tend to agree. My version of FF is far from the strongest, since my GPU is slow. Critter was able to quickly take advantage of tactical shortcomings in a bookless match and steer toward favorable lines, though no games were repeated.

The more I think about it, the more it feels like a ripoff, in this day and age, that a commercial engine can come out without some kind of continuous-learning ability.
User avatar
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: a crying shame (re: self-learning engines)

Post by Ovyron »

carldaman wrote: Sun Feb 09, 2020 11:41 pm The more I think about it, the more it feels like a ripoff, in this day and age, that a commercial engine can come out without some kind of continuous-learning ability.
Engine developers would disagree, as they continue to remove learning from their engines that used to have it...

I could also blame Marco Costalba: he never got learning implemented into Stockfish master even though fully working learning code was known, just like he removed Aggressiveness, Cowardice, and the other ways to tweak the engine on a whim, for no good reason other than... no measurable Elo gain from the learning code or from the tweaking parameters. So he just kept those things out of Stockfish, and only the users got to suffer.
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: a crying shame (re: self-learning engines)

Post by carldaman »

I'm just as frustrated with the SF developers, but they work for free and answer to no one if they don't wish to (one of the reasons they don't mind providing a free product). It's different with commercial entities who receive money from their customers.
User avatar
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: a crying shame (re: self-learning engines)

Post by Ovyron »

carldaman wrote: Mon Feb 10, 2020 3:23 am but it's different with commercial entities who receive money from their customers.
Not really; people vote with their wallets. If commercial chess software is sent out there with poor quality and still produces a good profit, the commercial developers have no reason to improve. This is the fault of the customers, and the only difference is the customers' expectations ("if I'm paying money, then I expect this level of quality").

Let's talk specifically about Fritz 17. The strongest thing in it, if you have the hardware to run it, is Fat Fritz. But this is just lc0 with a different weight file, and everything I've read suggests that Sergio's weight files, or others, are better than Fat Fritz's. People are being advised to drop Fat Fritz's weight file and swap in an lc0 one, so then all you're buying is sand in the desert!

What about the Ginkgo engine? I like its style, but even if Stockfish and Leela didn't exist, Ginkgo is behind Laser, Defenchess, rofChade, Fire, Xiphos, and Ethereal in strength, all of them free! Four of them are even open source! I haven't had the time to check, but just by the sheer numbers it's really hard for Ginkgo to have the best playing style once you calibrate for strength. So this is like buying snow in Antarctica.

The GUI is next, and I haven't seen anything in it that would make it worth getting if you already have Fritz 12 or later (and the only reason Fritz 12 is the minimum is that they dropped Playchess support for earlier versions!)

Unless it's your first Fritz GUI, it's not worth the asking price, but I bet it sold well.

Back to topic: something that would have made Fat Fritz A MUST HAVE would have been adding continuous learning to Leela! When I was using her in my game against mmt to attack Leela's white side, there were positions where, after 11k nodes, she scored the position at -0.90 while Stockfish was showing -4.00. If you inserted Stockfish's move into Leela, she would see the truth, her score would go down to -2.00, and it would proceed to fall faster.

However, after going back to the previous position, Leela would continue showing her -0.90 score, which made her useless for finding a better white defense, because at the root she could never find this line. With continuous learning she could have stored that this position was -2.00 in some learn file, and then, before checking a move against the weights file, she'd have checked the learn file first. And then from the root she'd have seen that this line loses and suggested something else. INSTANTLY!
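The mechanics of such a learn file are simple. Here's a rough sketch (the file name and format are made up for illustration): persist corrected scores and consult them before the network is ever asked, so a refutation seen once is seen instantly from the root thereafter.

Code: Select all

import json
import os

LEARN_FILE = "leela.learn.json"   # hypothetical persistent learn file

def load_learn():
    if os.path.exists(LEARN_FILE):
        with open(LEARN_FILE) as f:
            return json.load(f)
    return {}

learn = load_learn()

def evaluate(fen, net_eval):
    """Check the learn file before the weights file."""
    if fen in learn:          # e.g. the -2.00 the user forced the engine to see
        return learn[fen]
    return net_eval(fen)      # otherwise fall back to the normal network eval

def store(fen, corrected_score):
    """Persist a corrected score so the root sees it instantly next time."""
    learn[fen] = corrected_score
    with open(LEARN_FILE, "w") as f:
        json.dump(learn, f)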

Then a user would just need to have Leela visit these lines until one stood out as neutralizing Stockfish's attacks (well, except that in the example 1.g4 might simply lose, so you'd never see that, but still, you'd eventually find the toughest defense as agreed on by Stockfish+Leela).

The sad truth in all this is that if Fat Fritz had been Leela plus continuous learning, it'd have sold mostly the same, just like Rybka's sales didn't increase when she had learning (so it was removed in R4.1), Houdini's sales didn't increase when it had learning (so it was removed in version 4), and Shredder's sales didn't increase when it had learning (so it was removed in version 13).

Heck, not even Stockfish with continuous learning got downloaded, or cared about by the community, much.

People have voted with their wallets, and they voted against continuous learning.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: a crying shame (re: self-learning engines)

Post by corres »

Ovyron wrote: Mon Feb 10, 2020 4:23 am ...
People have voted with their wallets, and they voted against continuous learning.
The basic issue with continuous learning is that it enhances the engine's knowledge only in certain special positions, while in general it lowers the Elo, because continuous learning makes the search slower.
User avatar
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: a crying shame (re: self-learning engines)

Post by Ovyron »

Yeah, but when you're analyzing a game, you want an engine that knows what's going on in the current position, the position where you want to make a move or find the best possible continuation. And it's here that continuous learning eventually catches up to better search or evaluation.

Who cares about the Elo of the engine in the rest of chess? The only thing that matters is the position you're analyzing and its mainline (the best moves the sides can play from here), if you can find it. Elo is an average of what you're expected to get in the current position, and indeed, Stockfish is best in most. But if continuous learning gets you there faster, sometimes much faster, and sometimes it's the only way to get there at all, then you'd rather use an engine with this feature than one with a higher Elo that lacks it.