Opposite Color Bishop Endgames

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Opposite Color Bishop Endgames

Post by Laskos »

hgm wrote: Mon May 13, 2019 9:52 pm There shouldn't even have been a story.

LC0 is poor at tactics. So it is a poor chess program. PERIOD.

The initial statement was never contradicted. To the point would have been if you had responded "Not at all, LC0 is brilliant at tactics. It is on par with the best engines in solving tactical positions". Instead the insane bla-bla started... But you know what they say: "if you cannot dazzle them with brilliance, baffle them with bullshit". You sure displayed a good attempt at that, here.
That Lc0 is "poor at tactics" is relative and maybe not that relevant. One can say that traditional engines are insanely strong at tactics compared to humans, can't one? Lc0 on reasonable hardware still seems stronger than humans at tactics, but it is also stronger positionally, so, all in all, it is a balanced chess-playing entity from an unbiased human view. Traditional engines are a bit extremist: insanely strong tactically, and positionally often below even a regular human.

There are other issues, to me more important, like many endgames and especially playing "chess variants" or "arbitrary legal chess positions". As Crem pointed out, endgames can be corrected by some non-zero approach, and are already most often very good. Some variants can be learned by training specifically on them, and that would be interesting, as strong traditional engines are stronger on them than the "Chess Lc0", but I guess a "Variant Lc0" can be stronger than the traditional engines. Nevertheless, that seems a flaw of at least the Zero-NN approach, if not of NN machine learning generally: the result seems to lack "guiding principles", "abstraction", generalization. The approach is general, but the result is a "very narrow specialist", with Lc0 being a specialist in Chess from the standard opening position, or "Chess as defined".

Still, you have to appreciate it. I think in time Lc0 or similar will correct the whole of opening theory, a thing traditional chess engines could have done only narrowly, on some lines leading to complications. To me, as a "consumer" who has followed chess engine development from time to time over the last 20 years, the appearance of Lc0 is probably the greatest thing to happen in it.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Opposite Color Bishop Endgames

Post by mar »

abulmo2 wrote: Sat May 11, 2019 12:31 am In the case of opposite color bishops, it costs Elo because, quite often, avoiding a drawn game leads to a lost one.
Quite the contrary, most opposite color bishop endgames are drawn rather than won/lost.
Even the dumbest form of scaling gives a measurable Elo gain, so better fix your implementation.
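
For illustration, a minimal sketch of the kind of "dumb" scaling meant here, written in Python rather than engine code. The `Material` record, the `is_pure_ocb` test and the 0.5 factor are all assumptions made up for this example, not taken from any particular engine; the idea is simply to shrink the evaluation toward a draw when each side has only a bishop and pawns and the bishops stand on opposite colours.

```python
from dataclasses import dataclass


@dataclass
class Material:
    """Hypothetical, minimal material summary for one side."""
    pawns: int = 0
    knights: int = 0
    bishops: int = 0
    rooks: int = 0
    queens: int = 0
    bishop_on_light: bool = False  # square colour of the lone bishop, if any


def is_pure_ocb(white: Material, black: Material) -> bool:
    """True when each side has exactly one bishop and no other pieces,
    and the two bishops stand on opposite-coloured squares."""
    for side in (white, black):
        if side.bishops != 1 or side.knights or side.rooks or side.queens:
            return False
    return white.bishop_on_light != black.bishop_on_light


def scaled_eval(raw_cp: int, white: Material, black: Material,
                ocb_scale: float = 0.5) -> int:
    """Scale a centipawn score toward zero in pure opposite-colour bishop endings.

    ocb_scale is an assumed placeholder; a real engine would tune it.
    """
    if is_pure_ocb(white, black):
        return int(raw_cp * ocb_scale)
    return raw_cp
```

With these assumptions, a +100 centipawn score in a bishop-and-pawns ending with opposite-coloured bishops is reported as +50, while the same score with same-coloured bishops or with rooks still on the board is left unchanged.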
Martin Sedlak
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Opposite Color Bishop Endgames

Post by chrisw »

hgm wrote: Tue May 14, 2019 7:34 am That just means it has a high Elo. Like I pointed out high Elo and 'good at chess' are entirely different things.
LC0 and Stockfish are recognised as the best chess playing entities in the world.

What part of comparative and superlative grammar do you not understand?
good - better - best
If they are the best chess playing entities, they are, by definition, good at playing chess.

Elo is measured on a tiny and possibly completely unrepresentative subset of all chess positions that are likely to occur in games,
That's just not true. Lack of thought again.
Elo "is measured on" the entire set of recorded chess games within the pool. It's a statistic which takes into account every game every entity in the pool has played against all the other entities. When A plays B and scores a win, A's Elo rating is adjusted by a function which adjusts for the results of all of B's games against all other entities, which in turn are adjusted by the results of C, D, E, and so on.
Hence we call it a "Rating List": it rates all the entities in the pool relative to each other, in a fair and objective manner, based on actual results. It takes all the games it can, namely those that are available, and uses them all.
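
As a rough sketch of the mechanism, here is the textbook incremental Elo update in Python. The K-factor of 20 and the function names are assumptions for illustration; engine rating lists typically fit all games at once with a statistical tool rather than updating game by game, but the dependence on the opponent's rating (and so, indirectly, on all of the opponent's results) is the same idea.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float,
           k: float = 20.0) -> tuple:
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is an assumed K-factor; real lists choose their own value or method.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta
```

A win against an equally rated opponent moves both ratings by k/2 in opposite directions; a win against a much higher-rated B moves them by more, precisely because B's rating already encodes B's results against everyone else.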

Correcting the above ...
Elo is measured on (a tiny) all and every chess game recorded in the entire pool of chess games and (possibly completely unrepresentative subset of all chess positions that are likely to occur in games) the entire set of all games that actually have occurred within the pool,

Since we can't produce the entire set of possible games, we choose to make a rating list based on the results of play of all entities in the entire set of recorded games. The entire set of recorded games is as close as we are going to get to a representative sample.

like a human IQ test is in general not representative for all problems one might encounter in daily life.
Just wrong, wrong and more wrong in an endless stream.
Elo is not equivalent to an IQ test. It's not "like an IQ test".
Elo is a statistic based on actual real performance against other entities taking into account all recorded games of chess. All recorded games of chess are, by definition, not even a sample, they are the set of all games ever (seriously) played. And therefore not just representative of the problems a chess entity might encounter in real play, they are the set of problems that all chess entities have encountered in real play.
An IQ test, by contrast, is a particular, biased pen-and-paper set of problems probably quite rarely met in real life. It's a predictive, biased, manufactured test. As a test, an IQ test is similar to using a manufactured list of chess problems as a metric. Tactical test suites are neither a representative subset of all possible chess positions nor a representative subset of positions likely to be encountered in chess games. They are positions where (allegedly) only one continuation will do. Most game positions usually have several possible continuations. What better example of a non-representative set could there be?!

And then there also remains the problem of it being just an average.
Yes, if you want to construct one metric out of a large set of data there usually is some form of averaging involved. It's called statistics.
hgm
Posts: 27789
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Opposite Color Bishop Endgames

Post by hgm »

bla bla...