Stockfish Rollercoaster Effect

yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Stockfish Rollercoaster Effect

Post by yanquis1972 »

carldaman wrote: Thu Dec 20, 2018 3:25 am
Robert Pope wrote: Wed Dec 19, 2018 3:58 pm
Ovyron wrote: Tue Dec 18, 2018 12:00 pm
corres wrote: Mon Dec 03, 2018 10:11 am 1, Every type of contempt weakens the engines.
No, if this were the case it'd have lower Elo on the rating lists. This contempt is just a draw-avoider. For analysis, the user is expected to take care of the cases where avoiding a draw is worse than the alternatives; for the rest of the scenarios, Contempt is superior.
Contempt does weaken the engine, relative to perfect play. By avoiding draw when draw is the best outcome, contempt opens you up to blunders that lose the game. The only reason it doesn't lower the elo is that it is being used against even weaker engines. Avoiding draws extends the games and exposes Stockfish to the risk of blunders, but it happens that the other engines will blunder more often and it is a net gain against those engines. But from a perfect-player perspective, it is playing sub-optimal moves and will score worse against a better player.

Contempt, if small, introduces the risk of making (slightly) inaccurate moves, not outright blunders. It's a risk I'm willing to live with, because the moves chosen will be of better use - from my perspective. Very often the contempt-induced moves will also be of better quality, leading to favorable situations that the stronger engine can exploit. Quite useful for analysis as well.
my reasoning is that chess is a draw with perfect play, but the only way to score is to win, so generally speaking moves which demonstrate higher win potential should be given a bonus over moves that don't. i have no idea how that's implemented (w/out context, it's how chess machines inherently work) & i really hadn't considered this before ovyron's post, but as an ideal it makes sense to me. if SF gets results with it even in self-play (vs zero contempt), that's hugely significant, if not outright proof of concept.
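
A rough way to see both sides of the quoted argument is to treat contempt as converting some share of draws into decisive games. The sketch below is a toy model, not how Stockfish implements contempt: `converted` (the fraction of draws that become decisive) and `win_share` (how many of those the contempt-playing side wins) are assumed parameters, and all numbers are purely illustrative.

```python
def expected_score(win, draw, loss):
    """Expected game score: a win counts 1, a draw 0.5, a loss 0."""
    assert abs(win + draw + loss - 1.0) < 1e-9
    return win + 0.5 * draw


def with_draw_avoidance(win, draw, loss, converted, win_share):
    """Toy model: move a fraction of the draws into wins and losses."""
    moved = draw * converted
    return (win + moved * win_share,
            draw - moved,
            loss + moved * (1.0 - win_share))


# Assumed baseline win/draw/loss split (illustrative only).
base = (0.20, 0.60, 0.20)
before = expected_score(*base)

# Against a weaker opponent we win most of the games we make decisive...
vs_weaker = expected_score(
    *with_draw_avoidance(*base, converted=0.25, win_share=0.7))
# ...against a stronger one we lose most of them.
vs_stronger = expected_score(
    *with_draw_avoidance(*base, converted=0.25, win_share=0.3))
```

Under this model the change in expected score is `draw * converted * (win_share - 0.5)`, so avoiding draws only pays when the side doing it wins more than half of the games it makes decisive.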
felistime
Posts: 1
Joined: Sun Nov 24, 2019 6:24 pm
Full name: Enrico Felis

Re: Stockfish Rollercoaster Effect

Post by felistime »

Ovyron wrote: Mon Dec 03, 2018 1:45 am
I actually have a private Stockfish where this isn't necessary because it has different hashes for each side, but more on that later.


The Double Hash

Finally, as I've said, a double hash could be implemented, so that this paradigm can function without the user needing to fire up an instance for each side. Stockfish would just keep one hash file for white and another for black and use them accordingly so one can analyze without polluting the hash table.
Could you please share the source code of Stockfish with the double hash implemented?
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: Stockfish Rollercoaster Effect

Post by Ovyron »

I don't have the source code.

Though apparently I was wrong: the engine still has a single hash, but it uses separate learning files for each side, so the hash is rewritten from the corresponding file depending on who is to move. It doesn't make sense to use a hash for each side if one is enough and you just need 2 learning files, I guess.
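
Since the source is not available, the following is only a guess at what "one hash, two learning files" could look like. Every name here (`SideLearning`, `hash_table`, `remember`, `prepare`) is hypothetical, the position key and scores are made up, and plain dictionaries stand in for the hash table and the learning files.

```python
class SideLearning:
    """Toy model: one shared hash table, one learning store per side."""

    def __init__(self):
        self.hash_table = {}  # position key -> score
        # Stand-ins for the two learning files (hypothetical layout).
        self.learning = {"white": {}, "black": {}}

    def remember(self, side, position_key, score):
        """Record a score in the learning store of the given side."""
        self.learning[side][position_key] = score

    def prepare(self, side):
        """Overwrite hash entries with what was learned for the given side."""
        self.hash_table.update(self.learning[side])
        return self.hash_table


engine = SideLearning()
engine.remember("white", "pos-after-4.Bxc6", 0.30)
engine.remember("black", "pos-after-4.Bxc6", -0.30)

# Snapshot the single hash after preparing it for each side in turn.
white_view = dict(engine.prepare("white"))
black_view = dict(engine.prepare("black"))
```

The point of the sketch is only that a single table suffices if it is refreshed from the right store before each side is analyzed.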
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Stockfish Rollercoaster Effect

Post by corres »

Robert Pope wrote: Wed Dec 19, 2018 3:58 pm ...
Contempt does weaken the engine, relative to perfect play. By avoiding draw when draw is the best outcome, contempt opens you up to blunders that lose the game. The only reason it doesn't lower the elo is that it is being used against even weaker engines. Avoiding draws extends the games and exposes Stockfish to the risk of blunders, but it happens that the other engines will blunder more often and it is a net gain against those engines. But from a perfect-player perspective, it is playing sub-optimal moves and will score worse against a better player.
I totally agree with you.
In practice there are games in which an engine is helped by contempt to get better results, and there are games in which contempt causes worse results.
Contempt is a tool for the stronger player to extend the game and get more opportunities to win.
Dynamic contempt has another effect too: it tightens the search window and enhances the selectivity of the search.
This effect may help in finding a mate, a material gain, or a big positional advantage, and it may hinder spotting concealed disadvantages.
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: Stockfish Rollercoaster Effect

Post by MikeB »

To me, observing from a little distance, the sole reason for contempt was to gain more Elo rating points against weaker engines. It had nothing to do with making Stockfish stronger, just gaining more Elo against weaker opponents. In fact, the way it was implemented, it was the highest possible contempt value which does not lose Elo when playing against itself (but how about when they play against stronger engines, now the NN ones? That's not tested). They have wasted a lot of resources on testing related to this function, imho.
Branko Radovanovic
Posts: 89
Joined: Sat Sep 13, 2014 4:12 pm
Location: Zagreb, Croatia
Full name: Branko Radovanović

Re: Stockfish Rollercoaster Effect

Post by Branko Radovanovic »

Ovyron wrote: Mon Dec 03, 2018 1:45 am In the new paradigm, the 30% move increases the chances of white winning, and increases the chance of black winning, so it makes sense that white would show a bigger score for itself, and black would show a bigger score for itself, causing the roller coaster effect. The problem is the user wanting to use white's score for black, and black's score for white.

The new paradigm is right, and that's why it leads to better move choices and more elo.
There's an error in your reasoning. "It makes sense that white would show a bigger score for itself" - assuming players of equal strength, no it doesn't, because in an otherwise equal position, whatever increases the chances of white winning, also equally increases the chances of white losing.

Of course, the premise of positive contempt is that you are stronger than your opponent, in which case it is indeed plausible that, by reducing the drawing chances, you increase the chances of winning and your expected score.

Therefore, the new paradigm simply cannot be right in the analysis context, because it's logically impossible for white to be stronger than black and for black to be stronger than white, all at the same time. Making such contradictory assumptions will inevitably lead to contradictory scores.
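
The contradiction can be made concrete with a toy scoring rule. This is a sketch under an assumption, namely that contempt acts as a fixed bonus for whichever side the engine is currently analyzing for; `displayed_score` and its parameters are illustrative names, not Stockfish code.

```python
def displayed_score(base_eval, contempt, analysing_for):
    """Score from White's point of view, in pawns.

    base_eval: contempt-free evaluation from White's point of view.
    contempt: bonus the engine gives to the side it is analysing for.
    """
    sign = 1 if analysing_for == "white" else -1
    return base_eval + sign * contempt


# A dead-equal position, stepped through move by move.
scores = [displayed_score(0.0, 0.30, side)
          for side in ("white", "black", "white", "black")]
```

With `contempt=0.0` the score is the same whichever side is analyzed, so the swing in `scores` comes entirely from assuming that each side in turn is the stronger one.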
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: Stockfish Rollercoaster Effect

Post by Ovyron »

Branko Radovanovic wrote: Mon Jan 06, 2020 1:06 pm whatever increases the chances of white winning, also equally increases the chances of white losing.
But not necessarily by the same amount. A move might increase white's chances of winning by 10% but only increase its losing chances by 5%. Contempt 0 might see that white losing chances increase by 5% with some move and discard it, even though a weaker opponent wouldn't have played into that 5% anyway (so it's irrelevant, you can ignore those increasing losing chances against them.)
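
The arithmetic behind this can be checked directly. The starting win/draw/loss split below is an assumed example, and the 10% and 5% figures are read as percentage points taken out of the drawing chances; only those two figures come from the post.

```python
def expected_score(win, draw, loss):
    """Expected game score: win = 1, draw = 0.5, loss = 0."""
    return win + 0.5 * draw


# Assumed starting split for White (illustrative only).
win, draw, loss = 0.20, 0.60, 0.20
before = expected_score(win, draw, loss)

# The move adds 10 points of winning chances and 5 of losing chances;
# both come out of the drawing chances.
win2, loss2 = win + 0.10, loss + 0.05
draw2 = 1.0 - win2 - loss2
after = expected_score(win2, draw2, loss2)
```

Here the expected score rises from 0.5 to 0.525: the move is worth playing even at Contempt 0 as long as the win gain really exceeds the loss gain, and the question in the thread is how the engine estimates those two numbers.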
Branko Radovanovic wrote: Mon Jan 06, 2020 1:06 pm Therefore, the new paradigm simply cannot be right in the analysis context, because it's logically impossible for white to be stronger than black and for black to be stronger than white, all at the same time. Making such contradictory assumptions will inevitably lead to contradictory scores.
It seems that you picture chess players as having some real fixed ELO before the game starts, and that they will play with this strength the whole game. After playing hundreds of games we could have seen what their ELO was and see who was really the stronger player. In this case the paradigm used by the weaker of the players was wrong, because they were picking their moves using wrong assumptions (that their opponent was weaker.)

What happens in real life, in real games, is very different: the ratings of the moves displayed vary, sometimes by a lot. A player can play moves with some 3400 ELO 80% of the game, some 3000 10%, 2800 5%, 2600 4% and 2400 1%. Or something. In such a case what you want is for that 1% weak move that they'll play to be significant, because if in that position a 3400 ELO player and a 2400 ELO player are going to play the same move, then the fluctuation in strength will be useless for you. Let's take a look at the Exchange Spanish:

[d]r1bqkbnr/1ppp1ppp/p1B5/4p3/4P3/5N2/PPPP1PPP/RNBQK2R b KQkq -
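
For readers without a diagram viewer, the FEN can be expanded with a few lines of code. This is a minimal sketch that only reads the piece-placement and side-to-move fields; `expand_fen` is an illustrative helper, not part of any chess library.

```python
def expand_fen(fen):
    """Return (board, side_to_move); board maps squares like 'c6' to pieces."""
    placement, side = fen.split()[:2]
    board = {}
    for rank_index, row in enumerate(placement.split("/")):
        rank = 8 - rank_index          # FEN lists rank 8 first
        file_index = 0
        for ch in row:
            if ch.isdigit():
                file_index += int(ch)  # a digit is a run of empty squares
            else:
                board["abcdefgh"[file_index] + str(rank)] = ch
                file_index += 1
    return board, side


fen = "r1bqkbnr/1ppp1ppp/p1B5/4p3/4P3/5N2/PPPP1PPP/RNBQK2R b KQkq -"
board, side = expand_fen(fen)
```

`board["c6"]` is the white bishop (`"B"`) and `side` is `"b"`, so this is the position right after the capture on c6, with Black to choose how to recapture.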

For the sake of argument let's assume that here bxc6 is really bad. We should be able then to assign Ratings to the moves. Let's say that bxc6 is a 2700 elo move, some players with ELO 2701 and above will not play it. If the very next move your opponent's ELO fluctuates to 2400, then they might play bxc6.

Contempt would now say that if the fluctuation happens, are you happy with the position after bxc6, or would you rather have some position after 4.Ba4 where black plays a 2400 ELO move?

White playing weaker than black and black playing weaker than white is something that happens from move to move of the game. Because it's imperfect opponents we're talking about, which means they're fallible and can lose the game, which means they're going to blunder and play a low rated move, you just don't know when.
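
The distribution of move strengths described above can be turned into numbers. In this sketch the percentages are the ones from the post; the 40-move game length and the idea that each move's strength is drawn independently are assumptions added for illustration.

```python
# Share of moves played at each strength level (from the post).
strength_mix = {3400: 0.80, 3000: 0.10, 2800: 0.05, 2600: 0.04, 2400: 0.01}

# Average strength across all moves.
average = sum(rating * share for rating, share in strength_mix.items())


def chance_of_weak_move(share, moves):
    """Probability that at least one of `moves` independent moves is weak."""
    return 1.0 - (1.0 - share) ** moves


# Assumed game length of 40 moves.
at_least_one_2400 = chance_of_weak_move(strength_mix[2400], 40)
```

Under these assumptions a player who averages about 3288 still has roughly a one-in-three chance of producing at least one 2400-level move somewhere in the game.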

The new paradigm picks moves that take the most advantage of the blunders that the opponent will play, while the old paradigm assumes that the opponent can see your analysis (because if you find a refutation for a line you automatically assume that because it exists the opponent sees it), which is wrong by default.
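
The difference between the two paradigms can be sketched as two selection rules over the same candidates. Everything here is hypothetical: the candidate names, the scores, and the blunder probability are made up for illustration, and neither rule is Stockfish's actual search.

```python
# For each candidate: our score if the opponent finds the best reply,
# and our score if the opponent blunders (in pawns, from our side).
candidates = {
    "solid": {"best_reply": 0.00, "after_blunder": 0.05},
    "active": {"best_reply": -0.10, "after_blunder": 1.50},
}


def old_paradigm(moves):
    """Assume the opponent always finds the best reply."""
    return max(moves, key=lambda m: moves[m]["best_reply"])


def new_paradigm(moves, blunder_prob):
    """Weigh each outcome by how likely the opponent is to blunder."""
    def expected(m):
        s = moves[m]
        return ((1 - blunder_prob) * s["best_reply"]
                + blunder_prob * s["after_blunder"])
    return max(moves, key=expected)
```

With these made-up numbers `old_paradigm(candidates)` returns `"solid"`, while `new_paradigm(candidates, 0.2)` returns `"active"`; with `blunder_prob=0.0` both rules agree, which is the case of an opponent who sees everything.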

The other day I posted a game against an ICCF-rated 2400 player where I won because he moved the Knight to the wrong square, so that was the blunder that decided the game, and if I hadn't used contempt (because my rating was 250 ELO lower) he might have moved the Knight to a wrong square in a position where doing so wouldn't have mattered.

If white has a score of 0.30 and black has a score of -0.30, it just means both sides believe the other would be in deeper trouble if they played the blunder on this position, than if it was just 0.20/-0.20 or 0.10/-0.10. The old paradigm would show 0.00 and we wouldn't know anything about this. So it's not about the new one being right or wrong, but about it being more useful.

But you have to start with the assumption that the opponent will blunder, lest the best moves from both sides will be Draw Offer followed by Accept Offer.
Branko Radovanovic
Posts: 89
Joined: Sat Sep 13, 2014 4:12 pm
Location: Zagreb, Croatia
Full name: Branko Radovanović

Re: Stockfish Rollercoaster Effect

Post by Branko Radovanovic »

Ovyron wrote: Mon Jan 06, 2020 3:20 pm If white has a score of 0.30 and black has a score of -0.30, it just means both sides believe the other would be in deeper trouble if they played the blunder on this position, than if it was just 0.20/-0.20 or 0.10/-0.10. The old paradigm would show 0.00 and we wouldn't know anything about this. So it's not about the new one being right or wrong, but about it being more useful.
But here is the trouble: if white has a score of 0.30 and black has a score of -0.30, this means that both sides believe they have the advantage, i.e. that their expected game score is > 0.5. It is OK for both white and black to believe this (and, consequently, choose an appropriate strategy), but both beliefs cannot be true: whatever the relative playing strengths of black and white are, the sum of their expected game scores at any given moment can't be other than 1.
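
One way to check the sum-to-one constraint is to convert scores into expected game scores. The logistic mapping below, with its scale of 400 centipawns, is a commonly used approximation assumed here for illustration; it is not the engine's own model.

```python
def expected_from_centipawns(cp, scale=400.0):
    """Approximate expected game score for a side that is `cp` centipawns ahead."""
    return 1.0 / (1.0 + 10.0 ** (-cp / scale))


# Consistent view: White is +30 cp, so Black is -30 cp.
white = expected_from_centipawns(30)
black = expected_from_centipawns(-30)
consistent_sum = white + black

# Contempt view: each side believes it is +30 cp ahead.
optimistic_sum = expected_from_centipawns(30) + expected_from_centipawns(30)
```

The consistent view sums to 1, while the two optimistic beliefs add up to about 1.09, which no pair of actual results can deliver.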

With respect to contempt, the difference between a match setting and an analysis setting is that in the match there are two sides with possibly incompatible prior beliefs, whereas in analysis there can be only one, consistent prior belief about the relative strength of the players, and whatever that belief is, it should result - at least theoretically - in symmetric (zero-sum) scoring.

Analyzing using positive contempt for both sides is useful only if one wishes to simulate a hypothetical match situation. If both black and white use positive contempt, at least one of them operates with an incorrect assumption, which may be interesting in itself.
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: Stockfish Rollercoaster Effect

Post by Ovyron »

Branko Radovanovic wrote: Tue Jan 07, 2020 12:45 am But here is the trouble: if white has a score of 0.30 and black has a score of -0.30, this means that both sides believe they have the advantage, i.e. that their expected game score is > 0.5.
It's not about the game score, it's from one position to the next. It doesn't make sense to apply a 0.30/-0.30 to the entire game; you just apply this to the current position. The next can be 0.40/-0.40 or 0.35/-0.25 or 0.00/-0.50 or something else, you don't know, and you don't care, because this looks at the position in a vacuum. If the opponent is going to blunder on this position then you're right and thanks to Contempt you have predicted that you had the advantage (but you didn't know because you thought maybe your opponent would have played the move where they didn't blunder.) With Contempt 0 the same thing happens (the opponent is going to blunder regardless of what you play), the difference is that with Contempt you maximize the advantage that you get when they blunder (this is known as "active play"; with "passive play" the opponent might blunder and lose only 0.05 pawns, which is barely noticeable.)

And, anyway, even if this whole paradigm was wrong, it works in practice, so if by following it you're going to play the wrong moves for the wrong reasons and win games you'd have previously drawn, then it's a good paradigm to follow.
With respect to contempt, the difference between a match setting and an analysis setting is that in the match there are two sides with possibly incompatible prior beliefs, whereas in analysis there can be only one
Wait, analysis for the sake of analysis? No paradigm works for that, because you're going to blunder, but you can't know where, so your tree is going to have a false assumption somewhere, but it doesn't matter because you never have to stop the clock and commit to a move. To know the truth about a position you need to be playing it against someone, hopefully someone that will punish you when the blunder happens (so you know where your false assumption was.)

For all the chess positions that are analyzed I can state the following objective truth: White and black have 0% chance of winning, 0% chance of losing and 0% chance of a draw, because no game is being played, and the end of the game will never be reached.