Dylan Sharp Vs. Harvey Williamson (G4)

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Rebel, chrisw

Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by Uri Blass »

Ovyron wrote: Sat Jan 18, 2020 11:00 am
Zenmastur wrote: Thu Jan 16, 2020 2:10 pm I have used a GUI that does basically the same thing.
Hopefully you're not talking about Aquarium's IDEA, which is an abomination: you're basically wasting 90% of your resources when you use it. The effects of learning can't really be emulated by a GUI, which has to rely on Exclude Moves in lines where it thinks "exploration" is necessary, or on checking the nodes in Multi-PV and going back to a previous position when the mainline's score falls below some previous line's score.

The engine needs to see the analysis, and when it sees that this doesn't work and that other thing doesn't work, it'll automatically find the best line and go deeper into it without you having to play the moves. The magic happens because the engine will automatically tell you whether another line is worth considering (as you revisit a previous node it'll switch to it) or not (it'll just repeat the same move with an updated score).

With IDEA, if the opponent has some plan that works against all your variations, it'll take a looong while to play all of them out until the tree is filled and the score is finally useful. With Learning, the engine only needs to see it once; then it'll see that the other tries transpose and will show the useful score on your second visit! So that's how I was managing to "deem positions as lost" after only visiting a single node...
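Roughly, the mechanism looks like this (a toy sketch using the python-chess library, not how any actual engine stores its learning; the line and the score are made up):

```python
import chess
import chess.polyglot

learned = {}  # position key -> evaluation the engine settled on earlier

def remember(board, score):
    """Store what deep analysis concluded about this exact position."""
    learned[chess.polyglot.zobrist_hash(board)] = score

def recall(board):
    """Return a previously learned score for this position, if any."""
    return learned.get(chess.polyglot.zobrist_hash(board))

# First visit: reach a position via one move order and "learn" it is bad.
a = chess.Board()
for san in ["e4", "e6", "d4", "d5"]:
    a.push_san(san)
remember(a, -1.5)  # pretend deep analysis showed this plan fails

# Later visit: a different move order transposes into the same position,
# so the learned score shows up immediately, without redoing the analysis.
b = chess.Board()
for san in ["d4", "d5", "e4", "e6"]:
    b.push_san(san)
print(recall(b))  # -1.5
```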

Sadly, over the years very few users have been able to get the concept, so they don't use it, and it has been removed from engines (Rybka 3 had it removed, Houdini 4 had it removed, Shredder 13 had it removed, etc.). Then the name "Learning" was reused for entirely different and unrelated features that fudge scores depending on game results, or that just bring a TT back as if you hadn't unloaded the engine (but the engine forgets as positions are overwritten).

So one has to rely on private software, if one is lucky...
Zenmastur wrote: Thu Jan 16, 2020 2:10 pm Until then I know my method works and is relatively fast. It does, however, require a lot of memory, but doesn't need any special software. Memory can be bought almost anywhere; those specialty programs can't.
I get it, though apparently the private programs remain so because of Stockfish's licence. The programmers making them would freely share them, but they don't want their sources to be known; if Stockfish allowed people to create closed derivatives, who knows how many Learning Stockfishes we would have. But I guess this is a different subject entirely.

But, hey, Jeremy Bernstein's open implementation of learning for Stockfish from 2014 is still here:

https://open-chess.org/download/file.ph ... 6bac1adc60

I still don't get why nobody has implemented it for the latest Stockfish and given an up-to-date engine public learning. But if my opponents had access to Learning, all my advantage against them would vanish, so the current situation is actually the best for me (the point is having a learning engine yourself, not that it's public) and I should shut up about it.
I doubt we can trust what engines say, because engines have bugs and can bring back wrong evaluations from the TT.

It may be interesting to test Stockfish on some cursed wins with DTZ=101, without letting Stockfish use tablebases, to see in how many cases it finds a wrong winning score after a long search.

Here is one example (I did not try it with Stockfish, but it may be interesting if somebody tests Stockfish with only 5-piece Syzygy tablebases to see whether it shows a winning score or the correct draw score).

https://syzygy-tables.info/?fen=8/2k5/3 ... _w_-_-_0_1
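For anyone who wants to run that test, a rough sketch of how it could be scripted (this assumes the python-chess library, a local Stockfish binary on the PATH, and downloaded 5-piece Syzygy files; the FEN from the link above is deliberately not filled in here):

```python
import chess
import chess.engine
import chess.syzygy

def check_cursed_win(fen, syzygy_dir="./syzygy", engine_path="stockfish", depth=50):
    board = chess.Board(fen)

    # 1) Confirm with the tablebase that this really is a cursed win:
    #    WDL = +1 (win only if the 50-move rule is ignored), DTZ > 100.
    with chess.syzygy.open_tablebase(syzygy_dir) as tb:
        print("WDL:", tb.probe_wdl(board), "DTZ:", tb.probe_dtz(board))

    # 2) Ask the engine what it thinks. No SyzygyPath is configured here,
    #    so the search itself runs without tablebases; the question is
    #    whether a long search converges to a win score or to 0.00.
    with chess.engine.SimpleEngine.popen_uci(engine_path) as engine:
        info = engine.analyse(board, chess.engine.Limit(depth=depth))
        print("Search score:", info["score"].pov(board.turn))
```

It would be called with the FEN pasted from the link above; the depth (or a time limit instead) is left to taste.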
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by carldaman »

Ovyron wrote: Sat Jan 18, 2020 11:00 am
But, hey, Jeremy Bernstein's open implementation of learning for Stockfish from 2014 is still here:

https://open-chess.org/download/file.ph ... 6bac1adc60

I still don't get why nobody has implemented it for the latest Stockfish and given an up-to-date engine public learning. But if my opponents had access to Learning, all my advantage against them would vanish, so the current situation is actually the best for me (the point is having a learning engine yourself, not that it's public) and I should shut up about it.
I wonder about the same thing. See my thread about the self-learning engines lacking learning, lol, where I didn't even get to mention JB's old SF with proven, functional learning. Everyone is obsessed with Elo, yet hardly anyone understands how important engine learning is in this regard. :shock:
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by Zenmastur »

Uri Blass wrote: Sat Jan 18, 2020 9:54 pm
I doubt we can trust what engines say, because engines have bugs and can bring back wrong evaluations from the TT.

It may be interesting to test Stockfish on some cursed wins with DTZ=101, without letting Stockfish use tablebases, to see in how many cases it finds a wrong winning score after a long search.

Here is one example (I did not try it with Stockfish, but it may be interesting if somebody tests Stockfish with only 5-piece Syzygy tablebases to see whether it shows a winning score or the correct draw score).

https://syzygy-tables.info/?fen=8/2k5/3 ... _w_-_-_0_1
Why would you want to do this without tablebases?

Under ICCF rules you can claim wins based on 7-man TBs for games starting this year, regardless of the win length. If the game started before this year, you can claim a win based on 6-man TBs regardless of win length. So I don't see the point of doing any analysis with less than 7-man tablebases unless you don't have the hardware to support them.

I did a very quick analysis of this position. By ply 10 it finds Rc1+ in 0.02 seconds on a slow machine (i.e. ~12 Mnps). It holds this move to at least depth 62. I checked the first 7 plies and they were all best moves. This is with no TBs, on the Jan 7, 2020 version of SF, and without any deep reverse analysis: just start the engine and let it run. I then started a reverse analysis and it immediately began finding mates and reducing the mate length. I didn't bother trying to back the mates up to the root, because this machine doesn't have enough memory to do a mate in 50 unless you want to spend a few days on it, and I don't want to spend that much time on this example. But I think this is enough to prove the point.
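For anyone unfamiliar with the term, a rough sketch of what reverse analysis amounts to (assuming python-chess and a Stockfish binary on the PATH; the names, limits, and the assumption that the hash survives between calls are placeholders, not the exact setup used above):

```python
import chess
import chess.engine

def reverse_analyse(start_fen, line_uci, engine_path="stockfish", seconds=10.0):
    # Build the list of positions along the line, root first.
    board = chess.Board(start_fen)
    positions = [board.copy()]
    for uci in line_uci:
        board.push_uci(uci)
        positions.append(board.copy())

    # Walk the line backwards with a single engine process, so earlier
    # positions can reuse whatever (mate) scores the deeper searches
    # left behind in the still-warm hash table.
    with chess.engine.SimpleEngine.popen_uci(engine_path) as engine:
        for pos in reversed(positions):
            info = engine.analyse(pos, chess.engine.Limit(time=seconds))
            print(pos.fen(), "->", info["score"].pov(pos.turn))
```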

Regards,

Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by Uri Blass »

The reason I want to do it without tablebases is that for analysis we do not have tablebases for every position, and I want to know about the ability of chess engines to analyze chess positions and reach correct conclusions. I can learn about that ability for positions with more than 7 pieces by analyzing tablebase positions without tablebases (or with tablebases covering one piece fewer than the number of pieces on the board).

You are right that under ICCF rules you can claim wins based on 7-man TBs, but those are not the rules of chess, and in any case I have already stopped playing in ICCF because I expect that even under these rules ICCF will have more than 90% draws.

Computers are supposed to help humans analyze OTB games, and humans may be interested in correct analysis where a cursed win is simply a draw.
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by Ovyron »

carldaman wrote: Sun Jan 19, 2020 12:40 am Everyone is obsessed with Elo, yet hardly anyone understands how important engine learning is in this regard. :shock:
I suspect this is THE reason Learning died. The problem with it is that an engine with Learning will have a lower Elo in games. What seems to happen is that the engine faces a position with 2 playable moves that are close, say one with a 0.20 score and the other with a 0.18 score. This is actually accurate move ordering. But the game is a draw, so the engine learns that the first move is actually worth some 0.16. The next time this line is played it'll play the 0.18 move, but since the score difference was accurate, that game is also drawn and the move goes down to some 0.14.

Eventually the score of the main move will go so low that the engine will try a third one, and eventually the score can get so bad that it might play some poor move and lose! In analysis the player is expected to analyze all these variations thoroughly and see that the 0.20 move is better than the others. In games the engine without learning plays it every time and performs better than the learning one that keeps exploring others, which gives the learning engine a worse performance and makes learning useless for the majority of users (who want more Elo, so they have no use for a feature that decreases it).
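A toy simulation of that decay (the numbers and the decay rule are made up, just to show how the engine ends up cycling down to a genuinely poor move):

```python
base = {"move_A": 0.20, "move_B": 0.18, "move_C": -0.30}  # honest search scores
penalty = {m: 0.0 for m in base}                          # accumulated "learning"
DRAW_ADJUST = 0.10                                        # assumed decay per drawn game

for game in range(1, 13):
    adjusted = {m: base[m] - penalty[m] for m in base}
    choice = max(adjusted, key=adjusted.get)   # play the best *adjusted* score
    print(f"game {game:2d}: plays {choice} at {adjusted[choice]:+.2f}")
    penalty[choice] += DRAW_ADJUST             # the game is drawn again, score decays
```

After a dozen drawn games the two good moves have "learned" their way below the bad one, which is exactly the Elo-losing behaviour described above.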
Uri Blass
Posts: 10267
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by Uri Blass »

Ovyron wrote: Sun Jan 19, 2020 9:52 am
carldaman wrote: Sun Jan 19, 2020 12:40 am Everyone is obsessed with Elo, yet hardly anyone understands how important engine learning is in this regard. :shock:
I suspect this is THE reason Learning died. The problem with it is that an engine with Learning will have a lower Elo in games. What seems to happen is that the engine faces a position with 2 playable moves that are close, say one with a 0.20 score and the other with a 0.18 score. This is actually accurate move ordering. But the game is a draw, so the engine learns that the first move is actually worth some 0.16. The next time this line is played it'll play the 0.18 move, but since the score difference was accurate, that game is also drawn and the move goes down to some 0.14.

Eventually the score of the main move will go so low that the engine will try a third one, and eventually the score can get so bad that it might play some poor move and lose! In analysis the player is expected to analyze all these variations thoroughly and see that the 0.20 move is better than the others. In games the engine without learning plays it every time and performs better than the learning one that keeps exploring others, which gives the learning engine a worse performance and makes learning useless for the majority of users (who want more Elo, so they have no use for a feature that decreases it).
Good learning can earn Elo, and if this is the result of learning then it is not good learning.

It is possible to learn without memorizing scores.

I remember that RomiChess's learning was done by copying the opponent's moves (in case the opponent won).
The idea was that if RomiChess loses a game with Black, then it learns that the opponent's moves are correct and simply plays the opponent's moves with White, as long as the opponent plays RomiChess's moves with Black.

In this way you cause a stronger opponent (with no learning) simply to play against itself, and after enough games the stronger opponent cannot win against you.

Assuming not all of the opponent's games against itself are draws, you can even learn some lines that win against the stronger opponent, so if you have a match with enough games you win against the stronger opponent.
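A bare-bones sketch of that copy-the-winner idea (assuming python-chess; this only illustrates the mechanism, it is not RomiChess's actual code):

```python
import chess
import chess.polyglot

copy_book = {}  # position key -> the winner's reply, as a UCI string

def learn_from_loss(moves_uci, winner_is_white):
    """After losing, memorise the winner's move in every position it faced."""
    board = chess.Board()
    winner = chess.WHITE if winner_is_white else chess.BLACK
    for uci in moves_uci:
        if board.turn == winner:
            copy_book[chess.polyglot.zobrist_hash(board)] = uci
        board.push_uci(uci)

def instant_move(board):
    """Play the memorised winning move instantly, if this position was seen."""
    return copy_book.get(chess.polyglot.zobrist_hash(board))
```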
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by Ovyron »

Uri Blass wrote: Sun Jan 19, 2020 10:09 am Good learning can earn Elo, and if this is the result of learning then it is not good learning.

It is possible to learn without memorizing scores.
Yes, what you mention has been implemented in what people call an "experience file", and it does work, and it does bring Elo. People have even performed well without a book by using the "experience file" as a kind of book learning.

The advantage is that a book can only change the weights of existing moves but can't add new ones, while an experience file can try new ones (moves with bad scores that are actually good will eventually be tried, improving Elo relative to a book that never plays them).

The disadvantage is that the engine spends time on the opening, losing clock time against opponents that play book moves instantly, so the quality of the better moves is offset by having less time than the opponent once the opening is over.

Some people have settled on a hybrid approach, with book learning (sacrificing opening exploration) followed by an experience file used only when out of book, but I have found success by just finding holes in the openings played by specific opponents offline and adding them to the book (though this is sitting on top of great private efforts already done by others).
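For illustration, one possible experience-file layout (the fields, file format, and selection rule here are assumptions, not the format of any existing engine):

```python
import json
from collections import defaultdict

# position key -> {move: {"games": n, "points": p}}, points counted 1 / 0.5 / 0
experience = defaultdict(dict)

def update(position_key, move, result):
    """Record one game result (from the mover's point of view) for this move."""
    stats = experience[position_key].setdefault(move, {"games": 0, "points": 0.0})
    stats["games"] += 1
    stats["points"] += result

def best_move(position_key, min_games=3):
    """Prefer the move with the best practical score, once tried often enough."""
    tried = [(move, s["points"] / s["games"])
             for move, s in experience.get(position_key, {}).items()
             if s["games"] >= min_games]
    return max(tried, key=lambda t: t[1])[0] if tried else None

def save(path="experience.json"):
    with open(path, "w") as f:
        json.dump(experience, f)
```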

What RomiChess does is nothing special; to emulate it you just need to lose games (oh dear), add them to your book, and mark the winning moves green. Then the book will have learned the moves and the engine will play the winning moves instantly.

Unfortunately, nothing about this is useful for analysis, where it's much better to do this reverse learning analysis, which finds the best move much faster than an experience file would after many games.
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by carldaman »

Ovyron wrote: Sun Jan 19, 2020 9:52 am
carldaman wrote: Sun Jan 19, 2020 12:40 am Everyone is obsessed with Elo, yet hardly anyone understands how important engine learning is in this regard. :shock:
I suspect this is THE reason Learning died. The problem with it is that an engine with Learning will have a lower Elo in games. What seems to happen is that the engine faces a position with 2 playable moves that are close, say one with a 0.20 score and the other with a 0.18 score. This is actually accurate move ordering. But the game is a draw, so the engine learns that the first move is actually worth some 0.16. The next time this line is played it'll play the 0.18 move, but since the score difference was accurate, that game is also drawn and the move goes down to some 0.14.

Eventually the score of the main move will go so low that the engine will try a third one, and eventually the score can get so bad that it might play some poor move and lose! In analysis the player is expected to analyze all these variations thoroughly and see that the 0.20 move is better than the others. In games the engine without learning plays it every time and performs better than the learning one that keeps exploring others, which gives the learning engine a worse performance and makes learning useless for the majority of users (who want more Elo, so they have no use for a feature that decreases it).
Correct, this method of learning, as you describe it, would not guarantee improvement (real learning), since it could also learn wrong things.

Re: RomiChess learning, I was under the impression that it combines two methods: copying the opponent's winning moves until those fail (aka 'monkey-see, monkey-do'), but also using some sort of session file loaded into hash.
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by Michael Sherwin »

carldaman wrote: Mon Jan 20, 2020 12:23 am
Ovyron wrote: Sun Jan 19, 2020 9:52 am
carldaman wrote: Sun Jan 19, 2020 12:40 am Everyone is obsessed with Elo, yet hardly anyone understands how important engine learning is in this regard. :shock:
I suspect this is THE reason Learning died. The problem with it is that an engine with Learning will have a lower Elo in games. What seems to happen is that the engine faces a position with 2 playable moves that are close, say one with a 0.20 score and the other with a 0.18 score. This is actually accurate move ordering. But the game is a draw, so the engine learns that the first move is actually worth some 0.16. The next time this line is played it'll play the 0.18 move, but since the score difference was accurate, that game is also drawn and the move goes down to some 0.14.

Eventually the score of the main move will go so low that the engine will try a third one, and eventually the score can get so bad that it might play some poor move and lose! In analysis the player is expected to analyze all these variations thoroughly and see that the 0.20 move is better than the others. In games the engine without learning plays it every time and performs better than the learning one that keeps exploring others, which gives the learning engine a worse performance and makes learning useless for the majority of users (who want more Elo, so they have no use for a feature that decreases it).
Correct, this method of learning, as you describe it, would not guarantee improvement (real learning), since it could also learn wrong things.

Re: RomiChess learning, I was under the impression that it combines two methods: copying the opponent's winning moves until those fail (aka 'monkey-see, monkey-do'), but also using some sort of session file loaded into hash.
Thanks, carldaman, for being accurate about RomiChess's learning. I regret putting both types of learning into Romi, because all people seem to mention is the Monkey See Monkey Do, and they totally neglect the Reinforcement Learning loaded into the hash table before each search. Both are based on real-world animal learning.

Monkey See Monkey Do
In 2005 I read a story about scientists on a Japanese island digging up wild potatoes and washing them in a stream. The monkeys on the island saw what the scientists were doing and then started washing their potatoes. Since it just mimics what works, for as long as it keeps working, it can be used to make moves instantly. People call this a book approach, and maybe in a way it is book learning, but primarily it is reinforcement learning applied instantly, requiring no search.

Pavlov's Dog Experiments
Also in 2005 I read about Pavlov's dog experiments and how the dogs learned by getting a reward or a punishment. I simply adapted that to computer chess by giving the winning side's moves a small reward and the losing side's moves a small punishment. These were saved in an ever-growing tree structure. If MSMD did not produce an instant move, and if there was a subtree of previously played games stored, then the scores of all the positions with their reward/punishment values are loaded into the hash table prior to the search. All this does is sway the search to play better moves, or simply to avoid punished moves. Critics insist that there are many flaws to this approach. One major criticism is that once Romi is out of the stored tree it is of no further value. This is not true, for two reasons. Reason one is that if Romi plays d4 openings better than e4 openings, and the PDE causes Romi to play d4 instead of e4, then Romi has benefited. This applies to any move choice that has enough instances to provide reliable learning. Reason two is that the hash tables retain the influence of the learning long after the learning data has run out. Let's say a bishop on g5 is pinning a knight on f6, the opponent plays h6, and the PDE prefers Bh4 over Bxf6 or Bf4; then the Bh4 influence will be in the hash table even after the PDE data has run out. Therefore, if the opponent plays h6 say six moves later, the influence that Bh4 is better is still in the hash table.
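In simplified form the idea looks something like this (a sketch assuming python-chess; the values are arbitrary, and looking the bonus up as an evaluation bias stands in for actually preloading the hash table; it is not Romi's actual code):

```python
import chess
import chess.polyglot

REWARD = 4    # small bonus (centipawns) for positions the winner steered into
PENALTY = 4   # small malus for positions the loser steered into

learn_tree = {}  # position key -> accumulated bonus, grown after every game

def train_on_game(moves_uci, white_won):
    """Walk one decisive game and nudge every position that was reached."""
    board = chess.Board()
    for uci in moves_uci:
        mover_won = (board.turn == chess.WHITE) == white_won
        board.push_uci(uci)
        key = chess.polyglot.zobrist_hash(board)
        learn_tree[key] = learn_tree.get(key, 0) + (REWARD if mover_won else -PENALTY)

def bias_for(board):
    """What the search would add to its evaluation of this position."""
    return learn_tree.get(chess.polyglot.zobrist_hash(board), 0)
```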

An experiment against IIRC 6 top engines, using Bob's humongous book for those engines while Romi used only its learn.dat file, showed a linear increase of 50 Elo for Romi every 5000 games. An experiment that I conducted using the 10 Nunn positions, playing both sides against Glaurung, repeated 20 times for a total of 400 games, resulted in a linear gain from 5% for Romi after the first set to 95% for Romi in set 20.
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
carldaman
Posts: 2283
Joined: Sat Jun 02, 2012 2:13 am

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by carldaman »

Hi Michael, it is great to see you posting here.

I'm glad that Romi's learning is the way it is. Easily one of my favorite engines to spar against. I just don't know what the next game may bring. :D

The reinforcement learning in Romi predated AlphaZero's self-learning by many years. This certainly deserves more recognition. One can only wonder where Romi would be, if it could train on millions upon millions of games...

Best wishes,
Carl