Opposite Color Bishop Endgames

Stephen Ham
Posts: 2409
Joined: Wed Mar 08, 2006 8:40 pm
Location: Eden Prairie, Minnesota
Full name: Stephen Ham

Re: Opposite Color Bishop Endgames

Post by Stephen Ham » Sat May 11, 2019 1:02 am

That is why I posted as I did.

In your link, I recall Majid Ansari's post of over five years ago when he wrote:

I actually watched SF win a very nice opposite bishop endgame against Houdini. The struggle in the endgame was to avoid a drawn opposite bishop endgame and I thought SF handled it extremely well. Houdini seems to also realize that opposite bishop endgames are drawn but it is not as accurate as SF and is way too optimistic about drawing chances. SF seems much more adept at this and with its extremely deep endgame search ability it is just too strong for Houdini in such positions.

So, that's why I assumed that modern engines now have proper coding for opposite colored bishop endgames. Nonetheless, I understand that such coding is very difficult. For example, although such endgames are highly drawish, opposite colored bishop middlegames tend to be dynamic. So, when does one tell the engine to go from one extreme to the other?

Today, with 6-man endgame TBs, perhaps programmers consider this a minimal issue. Alas, I only have a 5-man TB. But, I will be upgrading my hardware this summer and so will purchase a 6-man TB stick.

All the best,
Steve

hgm
Posts: 23205
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Opposite Color Bishop Endgames

Post by hgm » Sat May 11, 2019 6:26 am

abulmo2 wrote:
Fri May 10, 2019 10:31 pm
I have some trouble understanding you: to be good at chess IS to have a high Elo.
Not at all. Having a high Elo can be achieved by only doing good moves in a sub-set of all chess positions, namely the positions that can be reached by the moves you select. You can arbitrarily suck in positions that you will never reach because you avoid them. And I would not call an entity that heavily sucks in a large sub-set of all positions "good at chess". I would say it just knows a trick to win.

LC0 is a good example: it sucks in tactical positions, as is easily demonstrated from its performance in tactical test suites. But in games it just avoids tactically complex positions, so it still has a pretty high Elo for a poor Chess program.

It all boils down to the difference between average and worst case. Elo is just an average, and an average that can be skewed at that, as the player has a large say in the set of positions it is measured on, or their weighting. Reliability of a product is measured from its worst-case behavior, though; average is not very important. If there was a car that used on average somewhat less fuel than its competitors, but would not run at all when it is raining, virtually no one would even consider buying it.
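
The average-versus-worst-case distinction can be sketched numerically. The position classes, scores, and weights below are made-up illustrative numbers, not measurements of any engine; `weighted_average` and `worst_case` are hypothetical helper names.

```python
def weighted_average(score_by_class, weight_by_class):
    """Average score, weighted by how often each position class is reached."""
    total = sum(weight_by_class.values())
    return sum(score_by_class[c] * w for c, w in weight_by_class.items()) / total


def worst_case(score_by_class):
    """Score in the weakest position class, regardless of how often it occurs."""
    return min(score_by_class.values())


# Hypothetical numbers: the player steers 90% of its games into quiet positions.
scores = {"quiet": 0.60, "tactical": 0.20}
weights = {"quiet": 0.90, "tactical": 0.10}

print(weighted_average(scores, weights))  # dominated by the class the player prefers
print(worst_case(scores))                 # unaffected by the weighting
```

Changing `weights` moves the first number but never the second, which is the sense in which an average can be skewed by the player's own choice of positions.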

Also see Dann's response.

abulmo2
Posts: 161
Joined: Fri Dec 16, 2016 10:04 am

Re: Opposite Color Bishop Endgames

Post by abulmo2 » Sat May 11, 2019 10:12 am

Stephen Ham wrote:
Fri May 10, 2019 4:48 am

The following 8-piece endgame just arose in a Raubfisch X40a3-Cfish match:
[d]4b3/8/6k1/6p1/p6p/K3B1P1/8/8 w - - 0 1

Virtually all humans immediately see that this is a draw.
A simple variation of the above position:
[d]4b3/8/2p3k1/2K3p1/7p/4B1P1/8/8 w - - 0 1
Here it is a black win. I just moved a pawn (and the opposing king, though that matters less). The opposite-colored bishops are still there, but the position is no longer a dead draw. I conclude that in your position above the problem lies more in the pawn structure (edge pawns are difficult to promote) than in the opposite-colored bishops. The problem, when testing opposite-colored bishop code, is that the test will probably encounter more positions like mine than like yours.
Richard Delorme
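
The point about the two diagrams can be checked mechanically. The following is a minimal sketch, not code from any engine: it reads only the piece-placement field of a FEN string, and the function names are hypothetical.

```python
def bishop_square_colors(fen):
    """Return {'w': [...], 'b': [...]} with 'light'/'dark' for each bishop."""
    colors = {"w": [], "b": []}
    placement = fen.split()[0]
    for row, rank_str in enumerate(placement.split("/")):  # row 0 is rank 8
        file_idx = 0
        for ch in rank_str:
            if ch.isdigit():
                file_idx += int(ch)  # run of empty squares
                continue
            if ch in "Bb":
                # a8 is (file 0, row 0) and is a light square
                shade = "light" if (file_idx + row) % 2 == 0 else "dark"
                colors["w" if ch == "B" else "b"].append(shade)
            file_idx += 1
    return colors


def has_opposite_colored_bishops(fen):
    """True if each side has exactly one bishop and they stand on different colors."""
    c = bishop_square_colors(fen)
    return len(c["w"]) == 1 and len(c["b"]) == 1 and c["w"][0] != c["b"][0]


drawn = "4b3/8/6k1/6p1/p6p/K3B1P1/8/8 w - - 0 1"    # Stephen Ham's position
won = "4b3/8/2p3k1/2K3p1/7p/4B1P1/8/8 w - - 0 1"    # the variation above
print(has_opposite_colored_bishops(drawn), has_opposite_colored_bishops(won))
```

Both calls return `True`, so an opposite-colored-bishop flag on its own cannot separate the drawn position from the won one; the pawn structure has to enter the evaluation as well.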

hgm
Posts: 23205
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Opposite Color Bishop Endgames

Post by hgm » Sat May 11, 2019 10:27 am

abulmo2 wrote:
Sat May 11, 2019 10:12 am
The problem, when testing opposite-colored bishop code, is that the test will probably encounter more positions like mine than like yours.
Sure. But that is no reason not to distinguish those.

And indeed 'unlike Bishops' plays no role in the given position, as the drawing plan sacs a Bishop.

chrisw
Posts: 1478
Joined: Tue Apr 03, 2012 2:28 pm

Re: Opposite Color Bishop Endgames

Post by chrisw » Sat May 11, 2019 11:34 am

hgm wrote:
Sat May 11, 2019 6:26 am
abulmo2 wrote:
Fri May 10, 2019 10:31 pm
I have some trouble understanding you: to be good at chess IS to have a high Elo.
Not at all. Having a high Elo can be achieved by only doing good moves in a sub-set of all chess positions, namely the positions that can be reached by the moves you select. You can arbitrarily suck in positions that you will never reach because you avoid them. And I would not call an entity that heavily sucks in a large sub-set of all positions "good at chess". I would say it just knows a trick to win.
Living to adult age and reproducing can only be achieved by eating non-poisonous, nutritious things from the subset of all chewable bitesized things, namely the chewable things you select.
You can arbitrarily die from poisoning from poisonous things you never eat because you avoid them. And I would not call an entity that arbitrarily dies in a world with a large subset of poisonous chewable things "good at surviving and reproducing". I would say it just knows a trick to avoid poison.

LC0 is a good example: it sucks in tactical positions, as is easily demonstrated from its performance in tactical test suites. But in games it just avoids tactically complex positions, so it still has a pretty high Elo for a poor Chess program.
Person A is a dismal failure at IQ tests, but very successful in life.
Person B is brilliant at IQ tests, but unable to tie his shoelaces.

Normally here, we would criticise the IQ test for actually being not very good at measuring "General Intelligence", but I can see you regard "the test" as the real measure and "real life performance" as a test that failed to measure up to the IQ test.

It all boils down to the difference between average and worst case. Elo is just an average, and an average that can be skewed at that, as the player has a large say in the set of positions it is measured on, or their weighting.
In the case of zero-sum games, ELO is a life-performance measurement. All players start from exactly the same position. If there is a general mismatch between performance on "test-suites" and "life-performance" then it is the "test-suites" that are inadequate. If we were to introduce tigers into a flock of sheep, it is no use claiming after all the sheep are dead that tigers fail the quality of wool test hitherto deemed as the ultimate measure of quality in the real world.

Reliability of a product is measured from its worst-case behavior, though; average is not very important. If there was a car that used on average somewhat less fuel than its competitors, but would not run at all when it is raining, virtually no one would even consider buying it.

Also see Dann's response.

Ferdy
Posts: 3860
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: Opposite Color Bishop Endgames

Post by Ferdy » Sat May 11, 2019 12:52 pm

Stephen Ham wrote:
Fri May 10, 2019 4:48 am
Hello All,

The following 8-piece endgame just arose in a Raubfisch X40a3-Cfish match:
[d]4b3/8/6k1/6p1/p6p/K3B1P1/8/8 w - - 0 1

Virtually all humans immediately see that this is a draw. White will exchange pawns and then sac his Bishop for Black's dark-squared pawn, leading to a King versus King, Bishop, and pawn endgame draw. However, at this position, Raubfisch scored it -1.79 while Cfish scored it -2.30! I'm playing with 5-man Nalimov TBs.

Since opposite color bishop endgames with more than 6-7 pieces occur frequently, shouldn't programmers code their engines to know when such positions are draws? Then, the materially weaker side could learn to draw games they'd otherwise lose, while materially stronger sides could score wins by avoiding these draws.

Both of the above engines are Stockfish derivatives, suggesting that the strongest A-B engine also lacks this coding. So, are programmers counting on users to have 6-man TBs for engine probing?

Many moves were played since the above position. At move 130, after endless shuffling of bishops, Raubfisch scored it as a small white disadvantage while Cfish still scored it -2.30 until going over 40 plies, when the evaluation dropped to 0.00. They then drew a couple moves later.

All the best,
Steve
I think most programmers have implemented the famous KvKBP (rook-pawn, wrong bishop color) drawn ending.
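
A rough sketch of what such a KvKBP rule might look like follows. It is an illustration under simplifying assumptions, not the code of Stockfish or any other engine: it ignores the side to move, and it uses a crude proxy for "the defending king reaches the corner" (the king is strictly closer to the promotion corner than every pawn). All function names are hypothetical.

```python
def parse_placement(fen):
    """Map (file, rank) -> piece letter, both 0..7, with rank 0 = first rank."""
    board = {}
    for row, rank_str in enumerate(fen.split()[0].split("/")):  # row 0 is rank 8
        file_idx = 0
        for ch in rank_str:
            if ch.isdigit():
                file_idx += int(ch)
            else:
                board[(file_idx, 7 - row)] = ch
                file_idx += 1
    return board


def is_light(sq):
    return (sq[0] + sq[1]) % 2 == 1  # a1 = (0, 0) is a dark square


def chebyshev(a, b):
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))  # king-move distance


def looks_like_wrong_bishop_draw(fen):
    """Heuristic: K+B+rook-file pawn(s) vs bare K, bishop not covering the corner."""
    board = parse_placement(fen)
    for strong_is_white in (True, False):
        own = {sq: p for sq, p in board.items() if p.isupper() == strong_is_white}
        other = {sq: p for sq, p in board.items() if p.isupper() != strong_is_white}
        if [p.upper() for p in other.values()] != ["K"]:
            continue  # the defender must have a bare king
        bishops = [sq for sq, p in own.items() if p.upper() == "B"]
        pawns = [sq for sq, p in own.items() if p.upper() == "P"]
        if len(bishops) != 1 or not pawns or len(own) != 2 + len(pawns):
            continue  # the attacker must have exactly K + B + pawns
        files = {sq[0] for sq in pawns}
        if len(files) != 1 or files.isdisjoint({0, 7}):
            continue  # all pawns must share a single rook file
        corner = (files.pop(), 7 if strong_is_white else 0)
        if is_light(corner) == is_light(bishops[0]):
            continue  # the bishop controls the promotion square
        defender_king = next(iter(other))
        # Crude proxy (assumption): defender is nearer the corner than any pawn.
        return chebyshev(defender_king, corner) < min(chebyshev(p, corner) for p in pawns)
    return False


# Stephen Ham's position after the kingside pawns and White's bishop have gone:
print(looks_like_wrong_bishop_draw("4b3/8/6k1/8/p7/K7/8/8 w - - 0 1"))
```

The rule only fires once the material has already been reduced to that pattern, so an engine still has to search (or probe tablebases) to see that the 8-piece position can be steered into it.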

With the availability of EGT most programmers concentrate more on other areas to improve their engine.

From your posted position on my computer Stockfish 10 (no egt) evaluates the position at around -1.26 for white. I believe this is a good score already.

Stephen Ham wrote:
Fri May 10, 2019 4:48 am
Virtually all humans immediately see that this is a draw.
I doubt that.


BTW do Raubfisch and Cfish use Nalimov EGT? Can you post the whole game in PGN format?

syzygy
Posts: 4408
Joined: Tue Feb 28, 2012 10:56 pm

Re: Opposite Color Bishop Endgames

Post by syzygy » Sat May 11, 2019 1:42 pm

hgm wrote:
Sat May 11, 2019 6:26 am
Not at all. Having a high Elo can be achieved by only doing good moves in a sub-set of all chess positions, namely the positions that can be reached by the moves you select. You can arbitrarily suck in positions that you will never reach because you avoid them. And I would not call an entity that heavily sucks in a large sub-set of all positions "good at chess". I would say it just knows a trick to win.
Yet that is exactly how human GMs approach the game of chess.

hgm
Posts: 23205
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Opposite Color Bishop Endgames

Post by hgm » Sun May 12, 2019 7:42 am

chrisw wrote:
Sat May 11, 2019 11:34 am
Living to adult age and reproducing can only be achieved by eating non-poisonous, nutritious things from the subset of all chewable bitesized things, namely the chewable things you select.
You can arbitrarily die from poisoning from poisonous things you never eat because you avoid them. And I would not call an entity that arbitrarily dies in a world with a large subset of poisonous chewable things "good at surviving and reproducing". I would say it just knows a trick to avoid poison.
It wouldn't automatically qualify as a "culinary expert", that is for sure.
Person A is a dismal failure at IQ tests, but very successful in life.
Person B is brilliant at IQ tests, but unable to tie his shoelaces.

Normally here, we would criticise the IQ test for actually being not very good at measuring "General Intelligence", but I can see you regard "the test" as the real measure and "real life performance" as a test that failed to measure up to the IQ test.
You got it completely the wrong way around. It is playing games that exposes you to such a small fraction of all positions in the "real world" of Chess that it is similar to an IQ test (of the special kind, where the subject under test can pick the questions he likes most). And it is seeing the draw in the position of the OP that is as trivial as tying shoe laces. You are the one trying to elevate the IQ test to the real measure.
In the case of zero-sum games, ELO is a life-performance measurement. All players start from exactly the same position. If there is a general mismatch between performance on "test-suites" and "life-performance" then it is the "test-suites" that are inadequate. If we were to introduce tigers into a flock of sheep, it is no use claiming after all the sheep are dead that tigers fail the quality of wool test hitherto deemed as the ultimate measure of quality in the real world.
'Life-performance' in an IQ test...

chrisw
Posts: 1478
Joined: Tue Apr 03, 2012 2:28 pm

Re: Opposite Color Bishop Endgames

Post by chrisw » Sun May 12, 2019 9:36 am

hgm wrote:
Sun May 12, 2019 7:42 am
chrisw wrote:
Sat May 11, 2019 11:34 am
Living to adult age and reproducing can only be achieved by eating non-poisonous, nutritious things from the subset of all chewable bitesized things, namely the chewable things you select.
You can arbitrarily die from poisoning from poisonous things you never eat because you avoid them. And I would not call an entity that arbitrarily dies in a world with a large subset of poisonous chewable things "good at surviving and reproducing". I would say it just knows a trick to avoid poison.
It wouldn't automatically qualify as a "culinary expert", that is for sure.
It would however be a necessary condition to qualify each and every ancestor going far back into time as a successful survivor and reproducer of one H.G.Muller.
Person A is a dismal failure at IQ tests, but very successful in life.
Person B is brilliant at IQ tests, but unable to tie his shoelaces.

Normally here, we would criticise the IQ test for actually being not very good at measuring "General Intelligence", but I can see you regard "the test" as the real measure and "real life performance" as a test that failed to measure up to the IQ test.
You got it completely the wrong way around.
haha! that’s your problem, as already shown.
It is playing games that exposes you to such a small fraction of all positions in the "real world" of Chess
Your “world of chess” is a wholly arbitrary and imaginary construct which means anything you want it to, whether you add “real” to the front of it or not. Chess is a board game, played according to rules. If there is any “world” to it, it comes from the groups of agents playing the game.
The positions that arise are a small subset. Currently the only machine entity that can claim any sort of coherent strategy for the entire set of chess positions is an AB or minimax material counter. All positional heuristics are added in on an assumed but unstated programming rule that the position being considered is one of the normal subset, and as heuristics are refined they more and more demand the position be of the normal subset. Ask yourself what would be the use, other than as random noise and a waste of computing time, of the bishop pair heuristic for one of your all-world-of-chess positions where one side had four bishops? SF itself, the strongest AB program, is developed entirely on the basis of what works statistically, i.e. what works with your “small fraction” of chess positions met, in the proportion in which they are met.

that it is similar to an IQ test (of the special kind, where the subject under test can pick the questions he likes most).
No. Now you really are in full on upside down mode.
Your chess IQ test is one in which the test setter picks the questions he likes most. The dominant paradigm in computer chess was, until disproven by DeepMind researchers, that “chess is tactics”. I seem to recollect being in a minority of one, years and years ago, arguing that this was nonsense, and that one day a program with positional knowledge would come along and blow away this material/tactical paradigm.
What you are doing, from the old way of thinking, surprise, surprise, is elevating the “chess is tactics” paradigm with tactical-suite IQ testing, as if these types of positions were more representative of “chess” than the ability to play positionally.
You select the test based on the old paradigm, the test fails to perform against the new paradigm, so you blame the new paradigm. Very typical of the old academic invested in an old field coming up against change. Kuhn, The Structure of Scientific Revolutions.

And it is seeing the draw in the position of the OP that is as trivial as tying shoe laces.
Expressing it in heuristics and logic is very easy, but you should know by now that actually programming even simple special-case heuristics is not easy at all. There are almost invariably special cases of the special cases, only discovered later by stupidly losing.
You are the one trying to elevate the IQ test to the real measure.
Silly comment when I am clearly arguing the direct opposite.
In the case of zero-sum games, ELO is a life-performance measurement. All players start from exactly the same position. If there is a general mismatch between performance on "test-suites" and "life-performance" then it is the "test-suites" that are inadequate. If we were to introduce tigers into a flock of sheep, it is no use claiming after all the sheep are dead that tigers fail the quality of wool test hitherto deemed as the ultimate measure of quality in the real world.
'Life-performance' in an IQ test...
That is often said, but it is not true at all. You may as well claim your journey from Amsterdam to Berlin is the car speedometer.
IQ tests are predictive of life-performance, supposedly, that’s the purpose of such tests and why they are composed. But IQ test is not life performance. Life performance is not IQ test, any more than anything is the measuring device that measures it.

hgm
Posts: 23205
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Opposite Color Bishop Endgames

Post by hgm » Sun May 12, 2019 9:42 am

Well, it isn't rocket science. It is the difference between measuring an average on a small, unrepresentative and manipulable sub-set of problems and measuring the worst-case behavior on the total of all problems.

If you want to advertise that you cannot understand this, you are welcome. I won't waste any more time on it.

BTW, it seems you read 'in' as 'is'...
