Knight equals 48 pawns?

Robert Pope · Post by **Robert Pope** » Fri Feb 22, 2019 3:55 am

jp wrote: ↑Thu Feb 21, 2019 3:45 am
Robert Pope wrote: ↑Tue Feb 19, 2019 4:11 pm
jp wrote: ↑Tue Feb 19, 2019 5:39 am
jhellis3 wrote: ↑Sat Feb 16, 2019 6:15 pm Criticizing LC0 for having a superior eval (more quickly recognizing known wins) seems a bit silly...
What makes you believe Lc0 has a superior eval? Its evals have always been too optimistic.
A superior eval is identified by how properly it orders a set of positions by their desirability, not by how well the underlying scores of that function line up with some preconceived notion of value.

But that hasn't been identified, has it?

What do you think is the preconceived notion of value? The knight value Larry mentioned?

The pre-conceived notion of value is this:

The UCI standard requires scores to be either reported as centiPawns or distance to mate.

The whole notion that the value of a position can somehow be consistently described as a multiple of the value of a pawn is simply a relic of the historic evolution of computer chess, which was driven by materialistic bean-counting with some positional tweaks. It's a handy heuristic when you have nothing better, but it is really meaningless from a theoretical value standpoint. The closer you get to positions that are won for one side, the clearer this becomes.

hgm · Post by **hgm** » Fri Feb 22, 2019 9:54 am

Be that as it may, it doesn't seem to apply here. There is nothing special about Knight odds that would justify special treatment; it is just an advantage of a Knight, similar in winning prospects to the advantage of 3 or 4 extra Pawns, or an extra Rook versus 2 or 3 Pawns in an opening-type position. An advantage that heuristically is known to be an almost certain win with good quality play. As any engine can and should know, you don't need a NN for that. LC0 does not have a forced checkmate within its horizon, so there is no mate score, and no reason or excuse at all to switch to a non-standard scale to express heuristic advantages.

Robert Pope · Post by **Robert Pope** » Fri Feb 22, 2019 3:58 pm

hgm wrote: ↑Fri Feb 22, 2019 9:54 am Be that as it may, it doesn't seem to apply here. There is nothing special about Knight odds that would justify special treatment; it is just an advantage of a Knight, similar in winning prospects to the advantage of 3 or 4 extra Pawns, or an extra Rook versus 2 or 3 Pawns in an opening-type position. An advantage that heuristically is known to be an almost certain win with good quality play. As any engine can and should know, you don't need a NN for that. LC0 does not have a forced checkmate within its horizon, so there is no mate score, and no reason or excuse at all to switch to a non-standard scale to express heuristic advantages.

Heuristically, yes, maybe a knight advantage is similar to a 3 or 4 pawn advantage. But Lc0 doesn't evaluate heuristically, in the same sense that traditional programs do. That's why it can struggle so much in uncommon positions, like not knowing what to do with 3 queens, so it tosses one away. It doesn't have the training experience to evaluate how effective a four pawn advantage is in the opening, so we shouldn't be surprised if it doesn't evaluate that the same as a single knight advantage on an otherwise full board.

I'm all for creating better transformations of Lc0 evaluations into centipawn-type scores. But that doesn't necessarily mean something is broken in Lc0 when its score doesn't line up with what someone thinks it "ought" to be. The nature of how it is trained and the existence of unexplored parts of the search space will lead to blindspots for Lc0, as well as areas of exceptional insight. Where those occur is just going to show up in different areas than a more traditional evaluation.

hgm · Post by **hgm** » Fri Feb 22, 2019 5:25 pm

Most of that seems plain wrong. LC0 does evaluate heuristically; the only way it differs in this from other programs is that it warps the scale to expected score rather than an additive (and thus linear) advantage. Although this is partly necessary, because the output of NN cells only has a finite range, the way the score was mapped to a value in the available range was a design choice. The NN was trained to throw away excess material that would only speed up the win, because the training software intentionally withholds from it the information about how long it would take to convert the win. So yes, you could say that is broken by design.

That it has never seen a 3-Pawn advantage early in the opening is not really relevant, because the NN has generalizing capability, and if it is trained to recognize that a 3-Pawn advantage in the late middle-game is winning, it would almost certainly think this is always the case, unless it was fed explicit training examples showing the contrary. Besides, it is a suspect claim anyway, as early in its training history it must have seen many training examples where it bungled away three Pawns (or a Knight). This is how it first learned a Knight is worth about 3 Pawns in almost any position, knowledge you could not do without if you want to play chess of a reasonable quality.

Robert Pope · Post by **Robert Pope** » Fri Feb 22, 2019 7:07 pm

hgm wrote: ↑Fri Feb 22, 2019 5:25 pm That it has never seen a 3-Pawn advantage early in the opening is not really relevant, because the NN has generalizing capability, and if it is trained to recognize that a 3-Pawn advantage in the late middle-game is winning, it would almost certainly think this is always the case, unless it was fed explicit training examples showing the contrary.

Maybe someone else will chime in and tell me I'm totally off-base, but I don't think that it generalizes in the same way that Stockfish generalizes.

Stockfish thinks a knight is worth 300 (or whatever), so if it is up a knight, that's +300, up two knights is +600 and up five knights is +1500. For Leela, being up five knights is outside of the scope of what it has been trained on, so what its eval returns is rather unpredictable. It's like doing linear regression where your inputs are almost all between -1 and 1, and then seeing what it predicts when you put in a 4. That's why it requires an order of magnitude more data: it doesn't generalize the same way, and the parts of the space that it hasn't been trained on can end up anywhere at all. And positions that seem similar to us or to Stockfish can totally baffle neural nets (and on the other hand, this is part of what makes them able to spot positional advantages that Stockfish is clueless about).
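To make the "additive" generalization concrete, here is a minimal sketch of a purely material evaluation. It is not Stockfish's actual evaluation; the piece values and the `material_score` helper are assumptions chosen only to match the "300 (or whatever)" figure above.

```python
# Hypothetical piece values in centipawns (assumed for illustration only;
# real engines tune these and add many positional terms).
PIECE_VALUE_CP = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900}


def material_score(extra_pieces):
    """Sum piece values linearly.

    `extra_pieces` maps a piece letter to how many more of that piece
    one side has than the other (negative means fewer).
    """
    return sum(PIECE_VALUE_CP[piece] * count
               for piece, count in extra_pieces.items())


if __name__ == "__main__":
    for knights in (1, 2, 5):
        print(knights, "knight(s) up:", material_score({"N": knights}))
```

Because the score is a plain sum, it extrapolates to material balances that never occur in practice without any training data, which is exactly the property a learned evaluation does not have by construction.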

There are image recognition nets that can identify pictures of cats, but if you change a handful of pixels in an image, it starts calling them chairs, even though the two pictures look absolutely identical to a human.

Anyway, I don't have a horse in this race, so I'm going to try to stop arguing for the sake of arguing. I do think trying to get a good score conversion is a bit of trying to fit a square peg into a round hole.

lkaufman · Post by **lkaufman** » Fri Feb 22, 2019 7:45 pm

It seems to me that you are both making valid points in theory but not talking about the actual data. It is clear to me that the win percentages reported by Lc0 in the opening are quite reasonable, both absolutely and relative to one another, up until the win probability gets to about 99%, roughly knight odds. So Lc0 does know that in the opening a knight is worth about four pawns, but it has no clue that two knights are worth eight pawns. So if win percentages were converted to centipawn scores, by the logistic formula for example, and scaled to match Komodo or Houdini on average, it would show quite reasonable evals for knight odds, roughly the same as for four pawns and roughly four times pawn odds, but it would undervalue larger handicaps. Two knights up might show up as around +5 or 6 instead of 7 or 8, for example. As long as the reported evals are reasonable in positions where an amateur human player would not resign, I think that is good enough, and getting it right up through at least knight odds qualifies.
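A rough sketch of the logistic conversion mentioned above, for anyone who wants to play with the numbers. This is not Lc0's or Komodo's actual formula. The function names are made up, and the scale is an assumption: it is anchored so that a 99% win probability maps to +4.00, following the "99%, roughly knight odds, about four pawns" observation.

```python
import math

# Assumed anchor point (taken from the observation in the post, not from
# any engine source): 99% win probability corresponds to +400 centipawns.
ANCHOR_PROB = 0.99
ANCHOR_CP = 400.0
SCALE_CP = ANCHOR_CP / math.log(ANCHOR_PROB / (1.0 - ANCHOR_PROB))


def win_prob_to_cp(p, scale_cp=SCALE_CP):
    """Invert the logistic curve p = 1 / (1 + exp(-cp / scale_cp))."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return scale_cp * math.log(p / (1.0 - p))


def cp_to_win_prob(cp, scale_cp=SCALE_CP):
    """Logistic curve mapping a centipawn score to a win probability."""
    return 1.0 / (1.0 + math.exp(-cp / scale_cp))


if __name__ == "__main__":
    for p in (0.60, 0.90, 0.99, 0.999):
        print(f"p = {p:.3f} -> {win_prob_to_cp(p) / 100.0:+.2f} pawns")
```

Note that the inverse gets very steep near 0% and 100%, so tiny differences in reported win probability beyond 99% turn into large centipawn swings. Any such conversion is only as good as the network's resolution in that range.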

By the way, I note that it is not feasible to play against Lc0 at knight odds, because it soon starts to blunder pieces almost randomly. At smaller handicaps like pawn f7 and move or two this doesn't happen, it plays very well. It's almost as if all win probabilities beyond 99% are the same.

hgm · Post by **hgm** » Fri Feb 22, 2019 8:16 pm

This blundering away of pieces is a consequence of the winning probability saturating; by the time it is >99% it just drowns in the noise. It was not trained to recognize that two Knights up is better than 1 Knight up; won is won. This makes it extremely sluggish in converting won positions. Better training would not clip the score as hard as a logistic, but would indeed make the NN response as a function of centi-Pawn advantage look more like an arctan. Then you could take the tan to convert it to cP, and would still have reasonable resolution in the range of certain wins. What it does now is like playing from a WDL EGT without search...
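The difference between the two squashing functions can be sketched numerically. This is only an illustration of the argument above, not how any network is actually trained; the 100 cP scale and the function names are assumptions.

```python
import math

SCALE_CP = 100.0  # assumed scale: one pawn per unit of the squashing input


def logistic_value(cp):
    """Logistic squashing of a centipawn advantage into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-cp / SCALE_CP))


def arctan_value(cp):
    """Arctan squashing of a centipawn advantage into (0, 1)."""
    return 0.5 + math.atan(cp / SCALE_CP) / math.pi


def arctan_value_to_cp(value):
    """Invert arctan_value by taking the tan, as suggested in the post."""
    return SCALE_CP * math.tan(math.pi * (value - 0.5))


if __name__ == "__main__":
    for lo, hi in ((400, 800), (800, 1600)):
        print(f"+{lo} vs +{hi} cP:",
              f"logistic gap = {logistic_value(hi) - logistic_value(lo):.5f},",
              f"arctan gap = {arctan_value(hi) - arctan_value(lo):.5f}")
```

The logistic tails fall off exponentially while the arctan tails fall off only like 1/x, so the arctan output keeps a usable gap between "won" and "even more won" where the logistic output is already lost in the noise.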

jp · Post by **jp** » Sat Feb 23, 2019 3:13 am

lkaufman wrote: ↑Fri Feb 22, 2019 7:45 pm By the way, I note that it is not feasible to play against Lc0 at knight odds, because it soon starts to blunder pieces almost randomly. At smaller handicaps like pawn f7 and move or two this doesn't happen, it plays very well. It's almost as if all win probabilities beyond 99% are the same.

Does it do this blundering from both sides (knight up and knight down) or only from knight up?

lkaufman · Post by **lkaufman** » Sat Feb 23, 2019 5:52 pm

jp wrote: ↑Sat Feb 23, 2019 3:13 am
lkaufman wrote: ↑Fri Feb 22, 2019 7:45 pm By the way, I note that it is not feasible to play against Lc0 at knight odds, because it soon starts to blunder pieces almost randomly. At smaller handicaps like pawn f7 and move or two this doesn't happen, it plays very well. It's almost as if all win probabilities beyond 99% are the same.
Does it do this blundering from both sides (knight up and knight down) or only from knight up?

Both sides, but more noticeably when down a piece or more. Basically, if its winning prob. is below 2% or so, it just "resigns" by making random blunders. Even taking more time doesn't generally cure this. Although Lc0 can probably beat top GMs giving them pawn or maybe pawn and move handicap just like Komodo does, I don't think that Lc0 can even beat a 1400 rated human at knight odds (Komodo can give knight odds in blitz to GMs, but in slow games only to players around 2000). Even with somewhat higher win probs., say around 5% or so, Lc0 still blunders, though not as often. I've seen the same behavior from a couple of different networks, so I don't think the specific network is the problem. Perhaps it's a problem with rarely occurring material balances, such as being down two pieces or more.

Uri Blass · Post by **Uri Blass** » Sat Feb 23, 2019 7:03 pm

I think that the solution to the problem should simply be not to consider all wins the same, so that Lc0 cares about the number of moves to mate.

You could decide that games continue until mate and that the score for the winner is max(0.9, 1 - 0.0001 * number of moves) when people train Lc0 based on this information.
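The proposed scoring rule is simple enough to write down directly. This is just a sketch of the formula as stated; `winner_target` is a made-up name, and how "number of moves" would be counted (full moves or plies, from the start of the game or from the current position) is not specified in the post, so that is left as an assumption.

```python
def winner_target(num_moves):
    """Training target for the winning side: max(0.9, 1 - 0.0001 * num_moves).

    `num_moves` is the move count until mate (assumed here to be a
    non-negative number; the exact counting convention is not specified).
    """
    if num_moves < 0:
        raise ValueError("num_moves must be non-negative")
    return max(0.9, 1.0 - 0.0001 * num_moves)


if __name__ == "__main__":
    for n in (20, 60, 200, 2000):
        print(n, "moves ->", winner_target(n))
```

With these constants the 0.9 floor only kicks in at 1000 moves, so for games of normal length the target decreases linearly and a faster mate always scores slightly higher than a slower one.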

I wonder whether this type of training could help Lc0 become stronger.
