Piece handicap elo diff, an idea for Kai Laskos for testing

lkaufman · Post by **lkaufman** » Tue Jun 28, 2016 9:52 pm

Laskos wrote:
lkaufman wrote:
How did you achieve variety at fixed depth? My guess is you used two threads, since a single thread run at fixed depth should give you identical games and hence all draws or all wins. Or did you do something else?
I tested both Stockfish and Komodo a few days ago at a time control of 2s+0.02s. For opening variety, I took the handicap opening position as the start and played 4 plies with the random mover (several thousand very fast games), building a handicap opening book (PGN) for each handicap.

Here are the results at 2+0.02:
Stockfish dev.

tc=2+0.02
Bishop c1
 +33  =36 -931   5.1%
Bishop f1
 +17  =40 -943   3.7%

Komodo 10

tc=2+0.02
Bishop c1
 +47  =27 -926   6.0%
Bishop f1
 +29  =37 -934   4.7%
Again, for some reason the f1 bishop seems more valuable.

I played out the two handicap positions using the Monte Carlo option on Fritz 15 at 7 ply, a thousand games each. Results were the opposite of yours; with f1 gone White scored 3%, but with c1 gone just 1.5%! I think your method was invalid because a random mover would often forfeit castling as White with f1 gone but rarely with c1 gone. I recall you didn't like the MC feature, but I don't think you said why. Is there some reason you think my test would be invalid or biased due to some aspect of MC as implemented by ChessBase (for Rybka and for Fritz 15)?

Laskos · Post by **Laskos** » Tue Jun 28, 2016 10:40 pm

You mean the 4-ply random mover book will miss kingside castling when it is available within 2 moves without the f1 bishop? This effect must be small, a few Elo points, say 3-5 at most. I can probably build a book with all castling rights disabled for both sides from the start, if that issue seems important to you.

I haven't used the Fritz GUI and its MC for a while. If I remember well, MC there does not adapt as the number of playouts grows: if it picks the wrong "most promising" lines early on when building the tree, it is stuck with them, which biases the outcome. I forget what other issues there were and in what circumstances.

lkaufman · Post by **lkaufman** » Tue Jun 28, 2016 11:07 pm

No, I mean that with f1 missing White will often play early Kf1, thus forfeiting the right to castle. This is much more serious than a couple elo, I think. Your solution of not allowing castling would fix the problem, though it makes the games less like normal chess. Better (if you can) would be to disallow king moves during the random phase unless forced.

Laskos · Post by **Laskos** » Wed Jun 29, 2016 4:02 am

lkaufman wrote:
No, I mean that with f1 missing White will often play early Kf1, thus forfeiting the right to castle. This is much more serious than a couple elo, I think. Your solution of not allowing castling would fix the problem, though it makes the games less like normal chess. Better (if you can) would be to disallow king moves during the random phase unless forced.

I managed to weed out all openings involving Kf1 with Pgn-extract. There were quite a substantial number of them, about 10% of all random openings. I also kept only unique openings. I am testing now with the new, cleaned opening books and will post later.
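The cleaning step described above can be sketched in a few lines. This is not how Pgn-extract works internally; it is a hypothetical stand-in that assumes each opening is available as one line of SAN movetext with space-separated move numbers. The function name `clean_book` and the sample lines are made up for illustration.

```python
import re

def clean_book(openings, banned=("Kf1",)):
    """Drop openings containing a banned SAN move and keep only unique lines.

    `openings` is an iterable of movetext strings such as "1. e3 a6 2. Kf1 h5".
    This is an illustrative sketch, not Pgn-extract's actual algorithm.
    """
    seen = set()
    kept = []
    for line in openings:
        # Discard move-number tokens like "1." or "2..." to get bare SAN moves.
        moves = tuple(tok for tok in line.split()
                      if not re.fullmatch(r"\d+\.+", tok))
        # Compare without check/mate suffixes so "Kf1+" would also match "Kf1".
        if any(m.rstrip("+#") in banned for m in moves):
            continue
        if moves in seen:  # duplicate opening, skip it
            continue
        seen.add(moves)
        kept.append(" ".join(moves))
    return kept

# Hypothetical 4-ply random openings from the position without the f1 bishop.
sample = [
    "1. e3 a6 2. Kf1 h5",
    "1. Nf3 d5 2. g3 c6",
    "1. Nf3 d5 2. g3 c6",
    "1. Kf1 e5 2. Ke1 Nc6",
]
cleaned = clean_book(sample)
print(cleaned)
```

Passing `banned=("Kd1",)` would apply the same filter to a book built from the position without the c1 bishop.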

Laskos · Post by **Laskos** » Wed Jun 29, 2016 5:15 am

You seem to be right. Now the results are within error margins:

Komodo 10 

tc=2+0.02 

Bishop c1 
 +14  =18 -968   2.3% 
Bishop f1 
 +10  =28 -962   2.4%

I will also check with a larger number of very fast games at fixed depth, for accuracy.

Laskos · Post by **Laskos** » Wed Jun 29, 2016 6:04 am

The results are within error margins at fixed depth=7 too, with more games. I also removed Kd1 moves from the c1 book (there were not many anyway), since I had removed all Kf1 moves from the f1 book.

Komodo 10 

depth=7

Bishop c1 
 +135  =169 -3696   5.4% 
Bishop f1 
 +128  =163 -3709   5.2%
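As a rough check on "within error margins", the score, its standard error, and the implied Elo difference can be computed from the win/draw/loss counts above. This is a minimal sketch that assumes the games are independent and uses the usual logistic Elo formula; the helper names `score_stats` and `elo_from_score` are made up for illustration.

```python
import math

def score_stats(wins, draws, losses):
    """Return (score, standard error) for a W/D/L record, assuming independent games."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, which takes the values 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    return score, math.sqrt(var / n)

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Komodo 10, depth=7 results quoted above (handicap side's wins, draws, losses).
c1_score, c1_se = score_stats(135, 169, 3696)
f1_score, f1_se = score_stats(128, 163, 3709)

diff = c1_score - f1_score
se_diff = math.hypot(c1_se, f1_se)  # standard error of the difference of two independent scores

print(f"Bishop c1: {100 * c1_score:.2f}% +/- {100 * c1_se:.2f}  ({elo_from_score(c1_score):.0f} Elo)")
print(f"Bishop f1: {100 * f1_score:.2f}% +/- {100 * f1_se:.2f}  ({elo_from_score(f1_score):.0f} Elo)")
print(f"difference: {100 * diff:.2f}% with standard error {100 * se_diff:.2f}%")
```

With these counts the difference between the two handicaps is smaller than one standard error of that difference, which is consistent with the "within error margins" reading.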

lkaufman · Post by **lkaufman** » Wed Jun 29, 2016 6:53 am

Thanks. Your last result does suggest (not too strongly) that the f1 bishop is worth slightly more, as Komodo already believes, as theory says, and as makes sense because only the f1 bishop can check the enemy king on its home square or either of its castled squares. But it's a very small difference, as your result shows. So I guess there's nothing for us to improve here. If you find anything else surprising by this sort of testing, please let us know.

Evert · Post by **Evert** » Wed Jun 29, 2016 8:01 am

lkaufman wrote: Thanks. Your last result does suggest (not too strongly) that the f1 bishop is worth slightly more, as Komodo already believes, as theory says, and as makes sense because only the f1 bishop can check the enemy king on its home square or either of its castled squares.

I dislike evaluation terms that are based on such transient features.
Or do you mean that the bishop that is currently on a square of the same colour as the enemy king gets a small bonus?

lkaufman · Post by **lkaufman** » Wed Jun 29, 2016 8:26 am

Rybka did it the crude way, based on the bishop's initial status, while Komodo does it something like the way you suggest. A third way would be to define "king's bishop" as the one that started on the same color as the current wing of the opponent's king, on the grounds that it will tend to migrate to b1, b8, g1, or g8, as these are the highest-valued squares in middlegame piece-square tables. But the effect is pretty small and barely worth the time needed to make these decisions.

Laskos · Post by **Laskos** » Wed Jun 29, 2016 9:27 pm

lkaufman wrote:
Thanks. Your last result does suggest (not too strongly) that the f1 bishop is worth slightly more, as Komodo already believes, as theory says, and as makes sense because only the f1 bishop can check the enemy king on its home square or either of its castled squares. But it's a very small difference, as your result shows. So I guess there's nothing for us to improve here. If you find anything else surprising by this sort of testing, please let us know.

After I played with Pgn-extract this morning, I tried one of its features and got an interesting result, unrelated to the topic. It seems that Komodo and Stockfish have different castling patterns. While Komodo is more focused on kingside castling, Stockfish is a bit more varied: it castles queenside more often, castles on opposite sides more often, and more often does not castle at all. Be warned that these were depth=7 games.

Komodo 10
8,000 games depth=7
Castling

Kingside : 5547
Queenside:  791
No castl : 1662 
Opposite :  228

Stockfish dev
8,000 games depth=7
Castling

Kingside : 4718
Queenside: 1060
No castl : 2222 
Opposite :  278
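To make the two castling tables easier to compare, the counts can be turned into shares. This sketch assumes that the kingside, queenside and no-castling rows partition the 8,000 games (they do sum to 8,000 for both engines) and that the "Opposite" row is a separate count, so it is left out. The dictionary names are made up for illustration.

```python
# Castling counts from the two tables above (depth=7 games).
komodo = {"kingside": 5547, "queenside": 791, "none": 1662}
stockfish = {"kingside": 4718, "queenside": 1060, "none": 2222}

def shares(counts):
    """Convert raw counts into fractions of the total."""
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}

komodo_shares = shares(komodo)
stockfish_shares = shares(stockfish)

for kind in komodo:
    print(f"{kind:9s}  Komodo {100 * komodo_shares[kind]:5.1f}%   "
          f"Stockfish {100 * stockfish_shares[kind]:5.1f}%")
```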
