Leela Nets for Human Training and "Style"

Ferdy · Post by **Ferdy** » Sat Jun 29, 2019 4:04 pm

supersharp77 wrote: ↑Sat Jun 29, 2019 9:00 am
Ferdy wrote: ↑Sat Jun 29, 2019 3:43 am
supersharp77 wrote: ↑Sat Jun 29, 2019 3:19 am
Ferdy wrote: ↑Sat Jun 29, 2019 2:19 am Preliminary draft on how to extract positions for personality test suite.

I plan to start with Tal.
Code: Select all
Generate Tal test suite

1. Get Tal's games

2. Read each position where Tal is to move

3. Analyze each position with Stockfish at 30s, on multipv 2

4. Save the position for the test suite only if all of the conditions below are satisfied.
    a. Score of top 1 move >= -400cp and score of top 1 move <= 400cp.
        We exclude positions that already have a decisive advantage or disadvantage.
    b. The difference between the score of the top 1 move and the score of the top 2 move
        must not be more than 200cp.
        Positions where the alternative move is clearly bad will be excluded.

5. Positions saved in EPD format contain a bm that is the actual move played by Tal in the game.
Suggestions are welcome.
This could take some time... I would start with a basket of opening choices (Gambit vs. Tactical vs. Solid vs. Positional, etc.)
Let's say the Sicilian opening for a start.

supersharp77 wrote: ↑Sat Jun 29, 2019 3:19 am Then based on the move order choices
Move order choices?
Well, yes, you've got to know the chess openings. For example, in the Sicilian Defense, sharp would be the Najdorf line 6.Bg5, quiet would be 6.Be2, the Poisoned Pawn variation would be 'Gambit', and 6.g3 would be a quiet line. The Lasker-Pelikan would be sharp, the Dragon variation would be tactical/sharp, the Classical QGD would be 'quiet', the Ruy Lopez Marshall Attack would be 'Gambit', the Classical French would be positional, the French Winawer 'Tactical or Gambit', etc.

So it is something like having a position classification added to the EPD.
Example:
rnbqkb1r/1p2pppp/p2p1n2/8/3NP3/2N5/PPP2PPP/R1BQKB1R w KQkq - c0 "Bg5=Sharp, Be2=quiet";

So if we test an engine and it chooses Bg5 we will increment its Sharp style counter.

So we need to lay out the styles beforehand.
1. Sharp style or Tactical style
2. Quiet style or Positional style
3. Ender style (usually simplifies positions to reach an ending)
4. others.

After testing we can have a range of style weights.

Code: Select all

engine: Stockfish, tactical: 75%, positional: 15%, ender: 2%, others: 8%
engine: Lc0, tactical: ...

Some work will have to be spent classifying these position/bm/style combos manually. If we can create an algorithm to detect styles automatically, it would be easier to generate such test suites.
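The counting scheme described in this post could be sketched roughly as follows. This is only a minimal illustration: the function names are made up, the engine moves are invented sample data, and it assumes the style labels live in the c0 comment exactly in the "Move=Style" form shown in the example EPD.

```python
from collections import Counter

def parse_style_labels(epd_line):
    """Return {move: style} from a comment like: c0 "Bg5=Sharp, Be2=quiet";"""
    start = epd_line.find('c0 "')
    if start == -1:
        return {}
    start += len('c0 "')
    end = epd_line.find('"', start)
    if end == -1:
        return {}
    labels = {}
    for item in epd_line[start:end].split(","):
        move, sep, style = item.strip().partition("=")
        if sep:
            labels[move.strip()] = style.strip().lower()
    return labels

def style_profile(epd_lines, engine_moves):
    """Tally the style of each engine move as a percentage.

    Moves that carry no label in the EPD are counted as 'others'.
    """
    counts = Counter()
    for epd_line, move in zip(epd_lines, engine_moves):
        labels = parse_style_labels(epd_line)
        counts[labels.get(move, "others")] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {style: 100.0 * n / total for style, n in counts.items()}

# The Najdorf example from the post, repeated four times as a stand-in suite.
suite = [
    'rnbqkb1r/1p2pppp/p2p1n2/8/3NP3/2N5/PPP2PPP/R1BQKB1R w KQkq - '
    'c0 "Bg5=Sharp, Be2=quiet";',
] * 4

# Hypothetical engine choices, one per suite entry (not real engine output).
engine_moves = ["Bg5", "Bg5", "Be2", "Be3"]

profile = style_profile(suite, engine_moves)
print(profile)
```

A real suite would of course have one distinct position per line; the point here is only how the per-style counters turn into the percentage line shown above.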

Got some links:
http://chess.geniusprophecy.com/chess-styles.html
https://thechessworld.com/articles/gene ... y-against/
https://www.pathtochessmastery.com/2012 ... ucted.html
https://chess.stackexchange.com/questio ... -are-there

supersharp77 · Post by **supersharp77** » Sat Jun 29, 2019 9:49 pm

There you go! That's the idea. Now, using this basic Najdorf position after ...a6, White has 6.Bg5 (sharp), 6.a4 (positional), 6.Be2 (solid), 6.Be3 (varied), 6.g3 (positional), 6.h3 (tactical/positional), 6.f3 (attacking), etc. Take the Poisoned Pawn variation: the engines love to "decline the gambit" (Black does not take on b2), so further pruning would be needed to take those move choices into account.

Albert Silver · Post by **Albert Silver** » Sun Jun 30, 2019 12:50 am

Ferdy wrote: ↑Fri Jun 28, 2019 9:22 am
Albert Silver wrote: ↑Fri Jun 28, 2019 3:53 am
Ferdy wrote: ↑Fri Jun 28, 2019 2:38 am One simple and automatic way is thru personality. Example Fischer, collect positions (with some criteria) where Fischer is to move, make it as a test suite where the bm in the epd is Fischer's move in the actual game, then measure the engine's performance by counting how many moves match the bm in the epd.
It is a great idea, but I would not use any criteria. Best moves (meaning famous ones) will almost always be tactical in nature, which means they will be beyond style, as they are the single best move period. Style is precisely when there are multiple viable options, and one is chosen. Those are the ones that distinguish style. One way might be simply to get all the games of top players with distinctive styles, though not when they were too young or old, run the engine, and see whose moves match the most. You then may have a measure of the style or player(s) said engine matches the closest.
Given the position after 1.e4 e5 2.Nf3 Nc6 3.Bb5 a6 4.Ba4 Nf6 5.O-O Be7 6.Re1 b5.
If Fischer is White, it is not necessary to include that position in the Fischer personality test suite with Bb3 as bm. This is what I mean by collecting positions with some criteria. This example is only from the opening; there can be similar cases in the middlegame and ending.

While I can understand not repeating the position in a test suite, I would still include it. The reason is that the openings would be precisely where a player like Fischer would make the most thought-out decisions. His predilection for the Ruy, for 1.e4, and for the Najdorf are all a part of his style and key to obtaining the positions he liked best.
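The match-counting measure discussed in the quoted exchange (count how often the engine's move equals the bm in a player's EPD suite, then see which player it matches most) might look roughly like this. It is a sketch under assumptions: the helper names are invented, the engine moves are made-up sample data, and the Najdorf entry with bm Bg5 is a placeholder rather than a position taken from an actual Tal game.

```python
def bm_moves(epd_line):
    """Return the set of SAN moves listed after the bm opcode of an EPD line."""
    for operation in epd_line.split(";"):
        tokens = operation.split()
        if "bm" in tokens:
            return set(tokens[tokens.index("bm") + 1:])
    return set()

def match_rate(epd_lines, engine_moves):
    """Fraction of positions where the engine move equals a bm move."""
    if not epd_lines:
        return 0.0
    hits = sum(
        1 for line, move in zip(epd_lines, engine_moves)
        if move in bm_moves(line)
    )
    return hits / len(epd_lines)

# One illustrative position per player; real suites would hold many more.
suites = {
    "Fischer": [
        # Ruy Lopez after 6...b5, as in the example from the thread.
        "r1bqk2r/2ppbppp/p1n2n2/1p2p3/B3P3/5N2/PPPP1PPP/RNBQR1K1 w kq - bm Bb3;",
    ],
    "Tal": [
        # Placeholder entry, not from a real game.
        "rnbqkb1r/1p2pppp/p2p1n2/8/3NP3/2N5/PPP2PPP/R1BQKB1R w KQkq - bm Bg5;",
    ],
}

# Hypothetical engine answers for each suite (not real engine output).
engine_moves = {"Fischer": ["Bb3"], "Tal": ["Be2"]}

rates = {p: match_rate(suites[p], engine_moves[p]) for p in suites}
closest = max(rates, key=rates.get)
print(rates, closest)
```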

dkappe · Post by **dkappe** » Sun Jun 30, 2019 1:33 am

While this is all very interesting, training nets to play like a particular player isn't really what my approach can do. It's more like "speculative", "conservative," etc.

Ferdy · Post by **Ferdy** » Sun Jun 30, 2019 1:41 am

This is my initial plan for generating a player's test positions.

Code: Select all

Generate Tal test suite

1. Get Tal's games

2. Read each position where Tal is to move

3. Analyze each position with Stockfish at 30s, on multipv 2

4. Save the position for the test suite only if all of the conditions below are satisfied.
    a. Score of top 1 move >= -400cp and score of top 1 move <= 400cp.
        We exclude positions that already have a decisive advantage or disadvantage.
    b. The difference between the score of the top 1 move and the score of the top 2 move
        must not be more than 200cp.
        Positions where the alternative move is clearly bad will be excluded.

5. Positions saved in EPD format contain a bm that is the actual move played by Tal in the game.

The conditions for saving a position are covered in item 4.
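Items 4 and 5 of the plan can be sketched as a small filter. This is only an illustration: it assumes the Stockfish multipv 2 analysis has already been run and that both scores are in centipawns from the side to move's point of view. The function names, the record layout and the sample scores are all hypothetical.

```python
def keep_position(top1_cp, top2_cp, max_abs_cp=400, max_gap_cp=200):
    """Item 4: keep balanced positions that have a viable second move."""
    balanced = -max_abs_cp <= top1_cp <= max_abs_cp          # condition 4a
    close_alternative = (top1_cp - top2_cp) <= max_gap_cp    # condition 4b
    return balanced and close_alternative

def to_epd(fen, played_san):
    """Item 5: keep the first four FEN fields and add the game move as bm."""
    fields = fen.split()
    return " ".join(fields[:4]) + f" bm {played_san};"

NAJDORF = "rnbqkb1r/1p2pppp/p2p1n2/8/3NP3/2N5/PPP2PPP/R1BQKB1R w KQkq - 0 6"

# Made-up records: (FEN, move played in the game, top-1 score, top-2 score).
records = [
    (NAJDORF, "Bg5", 35, 30),      # balanced, two close moves: kept
    (NAJDORF, "Bg5", 650, 600),    # already decisive: rejected by 4a
    (NAJDORF, "Bg5", 120, -150),   # second move 270cp worse: rejected by 4b
]

suite = [
    to_epd(fen, san)
    for fen, san, top1, top2 in records
    if keep_position(top1, top2)
]
print(suite)
```

Mate scores are not handled here; a real script would need to decide how to map them onto the centipawn thresholds.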

Ferdy · Post by **Ferdy** » Sun Jun 30, 2019 1:51 am

dkappe wrote: ↑Sun Jun 30, 2019 1:33 am While this is all very interesting, training nets to play like a particular player isn't really what my approach can do. It's more like "speculative", "conservative," etc.

The more varied nets you can create, the better. Let's see how they would perform against a GM's test suites.

It would also be fun to see the results of alpha/beta engines in the range of 2400 to 2800 CCRL 40/4 rating points.

Albert Silver · Post by **Albert Silver** » Sun Jun 30, 2019 3:47 am

While I completely sympathize with the reasonable idea of rejecting tactical shots, which are independent of style, another problem makes that impossible. Especially for a player like Tal. He would regularly sacrifice material, wrongly, rightly, just to confuse the issue. That was his style. SF and engines would likely claim he was just a tremendously lucky idiot, who somehow survived or won games he had 'botched' by dropping a piece. But of course that was a stylistic choice.

I never said this would be easy...

Initially, I'd just accept that this will be part of the noise that needs to be worked with, and if this proves to be completely bad, come back to the drawing board and see what tweaks can be made.

Ovyron · Post by **Ovyron** » Sun Jun 30, 2019 5:27 am

dkappe wrote: ↑Fri Jun 28, 2019 2:04 amI've hit upon a good approach, but am struggling with how to "measure" the style. Surely this problem must have been addressed in these forums before.

It has been pointed out many times in the past. We've had people like me talking about the style of engines (usually the ones I have the most experience with, like Thinker Inert, Zappa Mexico Dissident Aggressor, Toga Chekov, Hiarcs Paderborn 2007, The King (ChessMaster), Pro Deo, Komodo KingHunter, Houdini 6 Contempt 10, Naum 3, and more recently Fizbo and Andscacs). Recently I talked at length about how Rybka's playing style changed from version to version and how it compared with her contemporaries. I proposed the Seven Muses Project, which would have allowed an engine with an interesting playing style to remain competitive against the best, and such a method was automated in the chess tool Aiquiri.

But this is the first time I see serious attempts at discussing how to measure style.

One huge problem is that style is not always about which moves are played in a position, but about leading the game into positions where piece sacrifices or king attacks are possible. Sure, once they're possible, 90% of engines will play them, but how many of those would have aimed for this position in the first place? The craziest games by Thinker Inert were those where the engine would reach those positions at all costs. And remember, a good measure of style would also take into account engines that go into positions that make it favorable for the opponent to play sacrifices against them.

Because, if an engine plays always into positions where opponent always exchanges a piece for three pawns against them, it's not the opponent who should be given the style bonuses... Rybka 3 Dynamic was an engine that was aiming towards material imbalances (rook v piece and 2 pawns and such), that's also a clear style.

What I was proposing was to play a few games with an engine that is known already, subjectively, to display some playing style. Say, Fritz 10.1 was an engine known for its famous king attacks. Once you have those games, you give bonuses to engines that play the moves that lead to the positions where the king attack is possible, and so on.

Unfortunately, one thing apparent is that, the stronger an engine gets, the less it displays its "style." Which means style is nothing but suboptimal moves, and what we see in those games is ever worse suboptimal moves, and it's only then that the style triumphs. Match that style against the likes of Stockfish 10 and you'll see how playing without style is stronger. This is the reason Houdini 6 with Contempt 10 is the strongest thing you'll see that displays a very attractive playing style, and yet, Contempt 2 is the default, because the other 8 points just make the engine weaker and weaker.

chrisw · Post by **chrisw** » Sun Jun 30, 2019 11:35 am

Different approach, but involves a bit of chess programming....

Build an evaluation feature list: mobility for each piece, pieces hanging, pieces attacked, attacks around the king, something about pawns, castling status, development of each piece, checks, and so on; use your imagination.
Perform linear regression using EPDs for a large pool of players, and for the individual players you are interested in, to get optimal weights.
Deviations from the large-pool norm may indicate a play style.
I'd be inclined to junk positions beyond move 30 or 40 and early opening positions. I'd also use the game result to measure cases where a game was won while down material.
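One possible reading of this proposal, sketched in code. The post does not say which variable is regressed on which, so this is an assumption: each row holds feature values for a position, the target is the score of the game from the mover's side, and a player's "style" is the difference between his fitted weights and the pool's. The feature names follow the list above, the fitting routine is plain least squares written out by hand, and the numbers are synthetic toy data chosen so the fit is exact, not real game statistics.

```python
def fit_weights(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Raises ZeroDivisionError if the features are linearly dependent.
    """
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(n):
            if k != col:
                factor = a[k][col] / a[col][col]
                a[k] = [x - factor * y for x, y in zip(a[k], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

FEATURES = ["king_attacks", "mobility", "hanging_pieces"]

# Synthetic feature rows, shared by both data sets for simplicity.
rows = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1]]

def synthetic_targets(weights):
    """Toy targets generated from known weights (illustration only)."""
    return [sum(w * x for w, x in zip(weights, r)) for r in rows]

pool_w = fit_weights(rows, synthetic_targets([0.10, 0.05, 0.20]))
player_w = fit_weights(rows, synthetic_targets([0.30, 0.05, 0.10]))

# Positive deviation: the player leans on that feature more than the pool.
deviation = {f: p - q for f, p, q in zip(FEATURES, player_w, pool_w)}
print(deviation)
```

With real data the rows would come from a feature extractor run over the filtered EPDs, and the deviations would need some significance check before being read as style.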

Ferdy · Post by **Ferdy** » Sun Jun 30, 2019 2:26 pm

chrisw wrote: ↑Sun Jun 30, 2019 11:35 am Different approach, but involves a bit of chess programming....

Build an evaluation feature list: mobility for each piece, pieces hanging, pieces attacked, attacks around the king, something about pawns, castling status, development of each piece, checks, and so on; use your imagination.

Sure we can do that.

chrisw wrote: ↑Sun Jun 30, 2019 11:35 am Perform linear regression using EPDs for a large pool of players, and for the individual players you are interested in, to get optimal weights.

Which are your independent and dependent variables? Which weights are to be optimized? Can you give an example?
