Some handicap results and conclusions.

Discussion of computer chess matches and engine tournaments.

Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

lkaufman wrote:
Lyudmil Tsvetkov wrote:
Laskos wrote:
lkaufman wrote:
Lyudmil Tsvetkov wrote:[d]r1bqkb1r/pp1p1ppp/2n2n2/4p3/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

another one

SF says initially 20-30cps black edge.

actually, white has substantial advantage and might be even winning that.

I don't know what Komodo says on this, but white is much better.
What is your basis for saying that White is much better here and also in the Scandinavian gambit position? To me they are both quite equal; I would have trouble choosing which side I wanted to play. Just saying you think so might count for something if you were Carlsen, but as it is you need some evidence. I could try MC playouts but you would say they are not valid due to fixed depth or to weakness of Fritz 15.
I did play at 10''+0.1'' Komodo self games on Scandinavian, building book and all that. The performance is 50.6% for White, very balanced, below average White opening. So, you are right that score should be around 0.00, as it is both in SF and Komodo.
when did I claim the Scandinavian position is won for white?

I said white has a small but clear advantage, which was confirmed by your results, btw.

white is still winning, isn't it?

so, if Komodo and SF would assess it as slightly advantageous for black, while white wins in actual case, I am very much more correct, of course.

what I claimed is that on the position featured above, the non-Scandinavian position, white has significant advantage, and that will also translate in score, if you are so kind to run some tests with the second position I posted.
I can't run an arbitrary position in our tester without involvement by Mark, and he is on vacation, but I did run it overnight on the Fritz 15 MC tester at 16 ply, which is more or less bullet chess (Fritz 16 ply is stronger than Komodo or Stockfish 16 ply, though much slower). This is your own position, not the Scandinavian one. After 742 games White scored 48.7% for minus 9 elo; Komodo rated it at zero after half a minute or so. I also ran it at ten ply to see if the increased depth at 16 ply helped White, but I got exactly 50-50 at ten ply. So I see no evidence that White is better. As for Kai's result, a +4 elo result for the Scandinavian gambit is so tiny as to be meaningless and does not suggest that the eval should be more than perhaps a couple of centipawns above zero even if it were confirmed with a zillion games. Maybe Stockfish does overweight mobility a bit in your examples if it reports negative scores, but Komodo seems to be right on the money here.
whatever.

not the first time that a winning position has been declared draw, or a right person wrong.

why couldn't you simply run some matches with Komodo on multiple cores? that will provide randomness.

sorry to repeat it again, but fixed depth tests are simply no good.

I am completely certain the second position is either won for white, or very close to that.

I tried a couple of self-tests, and the score rises to above 70-80cps in the next 10 moves or so almost invariably, and I am using a much longer TC, at least a couple of seconds per move, so I simply don't buy the theory that black could be near drawing or even having the advantage.

I hope Kai can run some TC matches with Komodo/SF on this position.
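
For reference, the score-to-Elo figures quoted in this exchange (48.7% over 742 games as minus 9 elo, 50.6% as roughly +4 elo) are consistent with the standard logistic Elo formula. The sketch below is only an illustration: the function names are made up for this example, and the error margin is a worst-case bound that assumes no draws, since the draw rate of the test was not given.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_margin_95(score: float, games: int) -> float:
    """Worst-case 95% margin on the score fraction.

    Assumption: no draws. Draws only reduce the per-game variance,
    so the real margin is at most this wide.
    """
    return 1.96 * math.sqrt(score * (1.0 - score) / games)


if __name__ == "__main__":
    # Figures quoted in the thread.
    print(round(elo_from_score(0.487)))  # composed position, 742 games
    print(round(elo_from_score(0.506)))  # Scandinavian self-play result

    margin = score_margin_95(0.487, 742)
    low = elo_from_score(0.487 - margin)
    high = elo_from_score(0.487 + margin)
    print(f"worst-case 95% interval: {low:.0f} to {high:.0f} elo")
```

Under that worst-case assumption the interval for the 742-game run spans zero, so on its own the run supports neither "White is winning" nor "Black is better"; a longer match would be needed to separate the two.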
David Xu
Posts: 47
Joined: Mon Oct 31, 2016 9:45 pm

Re: Some handicap results and conclusions.

Post by David Xu »

Until you provide either evidence for your claims or a grandmaster title, your "certainty" counts for less than nothing. Saying that "White is winning" means nothing without reasoning behind it. Watch, I can do that kind of thing too:

Black is completely winning in the position you provide. The development and mobility advantage Black has completely outweighs White's extra pawn, and White will quickly find himself on the receiving end of a mating attack.

Doesn't make it true just because I said it, though. And just because you said something doesn't make it true, either.
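
The material side of the argument ("White's extra pawn") can at least be checked mechanically from the FEN posted in this thread. This is a minimal sketch using plain string handling; the helper names are hypothetical, and it says nothing about development or mobility, which is the part actually in dispute.

```python
from collections import Counter

# The composed position under discussion, copied from the thread.
FEN = "r1bqkb1r/pp1p1ppp/2n2n2/4p3/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def piece_counts(fen: str) -> Counter:
    """Count piece letters in the board field of a FEN (uppercase = White)."""
    board = fen.split()[0]
    return Counter(ch for ch in board if ch.isalpha())


def material_summary(fen: str):
    """Return (white, black) dicts of piece counts, kings excluded."""
    counts = piece_counts(fen)
    white = {p: counts[p] for p in "PNBRQ"}
    black = {p: counts[p.lower()] for p in "PNBRQ"}
    return white, black


if __name__ == "__main__":
    white, black = material_summary(FEN)
    for piece in "PNBRQ":
        print(piece, white[piece], black[piece])
    print("pawn difference (White minus Black):", white["P"] - black["P"])
```

The counts show equal pieces and one more pawn for White, so the disagreement is entirely about how much Black's two developed knights and central pawn are worth.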
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

don't talk BS.

I have checked this thousand times, that is why I have the courage to say it.

I don't hold an official GM title, but I am stronger than at least half of the GMs around, I would say stronger than even most of them.

wait for a conclusive test, and then apologise.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

2 test games, giving SF a couple of seconds per move (this is longer than 1 min. bullet games, for example, not to mention 16-ply depth or shorter TC).

[pgn][Event "Blitz 1m"]
[Site "Microsoft"]
[Date "2017.08.11"]
[Round "?"]
[White "Stockfish 8 64 POPCNT"]
[Black "SF 8, owner"]
[Result "*"]
[Annotator "owner"]
[SetUp "1"]
[FEN "r1bqkb1r/pp1p1ppp/2n2n2/4p3/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"]
[PlyCount "36"]
[TimeControl "60"]

{512MB, OWNER-PC}
1. e3 {-0.20/15 1} d5 {-0.17/15 1}
2. d4 {0.22/16 1} e4 {0.23/17 2}
3. c4 {0.37/17 1} a6 {0.31/15 1}
4. cxd5 {0.49/16 1} Nb4 {0.45/18 1}
5. Nc3 {0.44/18 1} Nbxd5 {0.44/17 0}
6. Nge2 {0.54/17 1} Bg4 {0.46/19 1}
7. a3 {0.49/19 1} Bd6 {0.34/19 1}
8. Nxd5 {0.48/19 1} Nxd5 {0.44/20 0}
9. Qc2 {0.38/20 1} Nf6 {0.44/20 1}
10. Nc3 {0.44/20 0} Rc8 {0.56/19 1}
11. Be2 {0.53/19 0} Bxe2 {0.58/20 1}
12. Qxe2 {0.59/19 1} O-O {0.48/21 1}
13. O-O {0.50/20 1} Re8 {0.52/21 1}
14. Bd2 {0.46/20 1} Bc7 {0.57/17 0}
15. Rac1 {0.52/17 0} Qd6 {0.51/18 1}
16. g3 {0.50/17 0} Qd7 {0.50/20 1}
17. f3 {0.39/19 1} exf3 {0.41/17 0}
18. Qxf3 {0.53/17 0} b5 {0.44/19 0} *

[Event "Blitz 1m"]
[Site "Microsoft"]
[Date "2017.08.11"]
[Round "?"]
[White "Stockfish 8 64 POPCNT"]
[Black "SF 8, owner"]
[Result "*"]
[Annotator "owner"]
[SetUp "1"]
[FEN "r1bqkb1r/pp1p1ppp/2n2n2/4p3/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"]
[PlyCount "25"]
[TimeControl "60"]

{512MB, OWNER-PC}
1. e3 {-0.24/15 1} d5 {-0.14/14 1}
2. d4 {0.13/16 1} exd4 {0.00/16 1}
3. exd4 {0.00/20 1} Bd6 {0.07/20 1}
4. Be2 {0.08/17 1} Bc7 {0.09/19 1}
5. Nf3 {0.23/17 1} Bg4 {0.14/18 2}
6. h3 {0.21/18 1} Bh5 {0.29/19 2}
7. Nc3 {0.60/19 1} Ne4 {0.33/19 0}
8. Nxe4 {0.56/19 0} dxe4 {0.55/21 1}
9. Ng1 {0.34/20 1} Bg6 {0.36/19 1}
10. c3 {0.61/20 0} O-O {0.54/20 0}
11. h4 {0.49/20 1} h6 {0.37/20 1}
12. h5 {0.46/20 0} Bh7 {0.39/18 0}
13. Nh3 {0.59/17 0} *

[/pgn]

as is easily seen, both games start with a negative score for white, but some 15 moves (30 plies) later the score gets to some 60cps or so of white advantage.

this is quite a nice edge, don't you agree?

and I have seen that tens and tens of times, same story.

so, you think I don't have the right to claim what I have repeatedly witnessed and know for certain?
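
The eval drift described above can be pulled out of such game records with a few lines of script instead of being read off by eye. The sketch below is a rough illustration only: it assumes the `{score/depth time}` comment layout seen in these two games, assumes the score is reported from White's point of view and the last number is seconds spent, and uses a short excerpt of the first game as sample input. The regular expression tolerates line breaks that an export may insert inside a comment.

```python
import re

# Excerpt of the first test game, including a comment broken across lines.
MOVETEXT = """{512MB, OWNER-PC} 1. e3 {-0.20/15 1} d5 {-0.17/15 1} 2. d4 {0.22/16 1} e4 {0.
23/17 2} 3. c4 {0.37/17 1} a6 {0.31/15 1}"""

# Matches "{score/depth time}", allowing whitespace where a line wrap may fall.
COMMENT = re.compile(r"\{\s*([+-]?\d+\.\s*\d+)\s*/\s*(\d+)\s+(\d+)\s*\}")


def parse_evals(movetext: str):
    """Return a list of (score_in_pawns, depth, seconds), one per annotated ply."""
    rows = []
    for match in COMMENT.finditer(movetext):
        score = float(re.sub(r"\s+", "", match.group(1)))
        rows.append((score, int(match.group(2)), int(match.group(3))))
    return rows


if __name__ == "__main__":
    rows = parse_evals(MOVETEXT)
    for ply, (score, depth, seconds) in enumerate(rows, start=1):
        print(f"ply {ply:2d}: {score:+.2f} at depth {depth}")
    swing = rows[-1][0] - rows[0][0]
    print(f"swing over {len(rows)} plies: {swing:+.2f} pawns")
```

Run over many games rather than two, the same extraction would show whether the score drifts upward consistently, though a rising self-play eval is still not the same thing as a match result.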
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

[d]r1bqkb1r/pp1p1ppp/2n2n2/4p3/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

caveat: in the position above, as black has a large dynamic advantage, white should play very carefully, so there are just 3 reasonable white first moves that not only hold, but also give white a substantial, probably winning advantage:

- e3
- d3
- c4

testing with just these 3 moves, or leaving the engine with multiple cores to randomise itself should score very favourably for white; using a wider book might very well provide distorted results.
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: Some handicap results and conclusions.

Post by JJJ »

[d]r1bqkb1r/pp1p1ppp/2n2n2/4p3/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

At depth 27 my Brainfish scores the starting position +0.31 for White and wants to play e3.

I also had it play against itself from the first move at 5 seconds per move, and the advantage quickly rose.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Some handicap results and conclusions.

Post by lkaufman »

When I stopped my MC test, White was still under 49% after 1208 games at 16 ply. I tried a self-play game with Komodo at 4 min + 2 sec; White opened 1.c4, but after 1...d5 Black obviously has an improved version of the White side of the Morra Gambit, which is considered quite equal. Black kept a favorable score for a while, then gradually White pulled it to zero and drew. But one game means nothing, and I don't have time to play the hundred or more needed to determine anything. But I would choose Black if I had to play this position for a lot of money against another player of my level.
Komodo rules!
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Some handicap results and conclusions.

Post by Lyudmil Tsvetkov »

The Morra is actually quite bad for white.

no full equality there, maybe white could still hold, but no full equality.

as Jean rightly says, score only increases.
I don't know why you would continue to claim full equality, when top engines consistently show a white edge in almost all games.

maybe at some point you will be eager to run a long, long test, when you understand that this is a real deficiency of top engines.

I don't quite have the time; otherwise I would have posted some 100+ similar positions, with even more convincing evaluation failures.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Some handicap results and conclusions.

Post by lkaufman »

Since there is plenty of data on the Morra gambit, let's talk about that (after 4.Nxc3) rather than your composed position. In the Hiarcs powerbook, mostly strong engine games, White's performance rating is one elo above the opponents' average rating. In my own database of GM games plus correspondence games since Rybka came out, White's performance was six elo below the opponents' average. Each sample is above a thousand games. So maybe it's fair to say that if forced to choose a side, you should choose Black, but a proper eval should be very close to zero, maybe something like -.03 or so, based on these results. I suspect that you would like it to be evaluated -.20 or so, but the data doesn't support this. Your composed position is obviously worse for White than the Black side of the Morra, so it seems clear to me that White would score below 50% in either GM or engine games.
I am interested in this not to argue, but because if you can actually convince me that Black is substantially better in the Morra I might try to modify Komodo accordingly. I don't mind being proven wrong if I can learn from it, but I need hard evidence, not just a couple of games.
Komodo rules!
leavenfish
Posts: 282
Joined: Mon Sep 02, 2013 8:23 am

Re: Some handicap results and conclusions.

Post by leavenfish »

Conclusion: Current Stockfish is still about 30-40 points stronger than the current Komodo....

tick...tick...tick...