Linear vs. Nonlinear Evalulation

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Linear vs. Nonlinear Evalulation

Post by Gerd Isenberg »

I have a question about the definition of linear versus nonlinear evaluation.
Linear eval is the dot-product of features and weights. Assuming a passed pawn piece-square table contains for instance the square (x*x) of baserank distance (x), is this still linear evaluation?

Is nonlinear eval necessarily connected with piece interactions? Or what is its exact definition?

Thanks,
Gerd
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Linear vs. Nonlinear Evalulation

Post by hgm »

This is a bit arbitrary. But I guess he dependence on individual piece positions is not taken into account in this classification. Piece-square tables are in general filled with irregular functions that are far from linear, but no one would consider that non-linear eval. A Bishop pair term could be called non-linear, but I don't believe anyone would do that either.
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Linear vs. Nonlinear Evalulation

Post by Gerd Isenberg »

hgm wrote:This is a bit arbitrary. But I guess he dependence on individual piece positions is not taken into account in this classification. Piece-square tables are in general filled with irregular functions that are far from linear, but no one would consider that non-linear eval. A Bishop pair term could be called non-linear, but I don't believe anyone would do that either.
Yes, one can interpret piece square table as weight vector, which are multiplied with a vector of boolean (absence or existence) features. Thus a "linear" combination or dot-product, even if the features are only zero or one.

Code: Select all

for all i sum += Feature[i] * weight[i];
The terms linear versus nonlinear evalulation were mentioned in several papers, but I wonder what their definition is. May be its arbitrary or outdated.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Linear vs. Nonlinear Evalulation

Post by bob »

Gerd Isenberg wrote:
hgm wrote:This is a bit arbitrary. But I guess he dependence on individual piece positions is not taken into account in this classification. Piece-square tables are in general filled with irregular functions that are far from linear, but no one would consider that non-linear eval. A Bishop pair term could be called non-linear, but I don't believe anyone would do that either.
Yes, one can interpret piece square table as weight vector, which are multiplied with a vector of boolean (absence or existence) features. Thus a "linear" combination or dot-product, even if the features are only zero or one.

Code: Select all

for all i sum += Feature[i] * weight[i];
The terms linear versus nonlinear evalulation were mentioned in several papers, but I wonder what their definition is. May be its arbitrary or outdated.
I believe the general idea is that if you sum weights, the function is linear. As in y = aX + b; If you product things, say king position * pieces attacking squares close to king * pawn shelter around king, then it is considered beyond linear or as it is sometimes called, beyond first-order.
Pradu
Posts: 287
Joined: Sat Mar 11, 2006 3:19 am
Location: Atlanta, GA

Re: Linear vs. Nonlinear Evalulation

Post by Pradu »

Gerd Isenberg wrote:I have a question about the definition of linear versus nonlinear evaluation.
Linear eval is the dot-product of features and weights. Assuming a passed pawn piece-square table contains for instance the square (x*x) of baserank distance (x), is this still linear evaluation?

Is nonlinear eval necessarily connected with piece interactions? Or what is its exact definition?

Thanks,
Gerd
I guess the usual definition of a linear function is one where the output of a sum of the inputs is equal to the the output of each individual input. A function f is linear iff:

f(a+b) = f(a) + f(b).
f(c*a) = c*f(a)

So I'd guess an evaluation function is linear if

f(feature1_a + feature1_b,feature2_a + feature2_b,feature3_a + feature3_a...) = f(feature1_a,feature2_a,feature3_a) + f(feature1_b,feature2_b,feature3_b) = f(feature1_a)+ f(feature2_a) + f(feature3_a)+f(feature1_b)+f(feature2_b)+f(feature3_b) ...

PSTs would be nonlinear because PST[sq1+sq2] != PST[sq1] + PST[sq2]

Simple material eval and mobility eval could be linear:
materialEval(onepawn + onepawn) = materialEval(onepawn) + materialEval(onepawn)
Pradu
Posts: 287
Joined: Sat Mar 11, 2006 3:19 am
Location: Atlanta, GA

Re: Linear vs. Nonlinear Evalulation

Post by Pradu »

Pradu wrote:PSTs would be nonlinear because PST[sq1+sq2] != PST[sq1] + PST[sq2]
I guess PSTs are linear if you define your function differently (with 64 discrete inputs instead of one). For example take a knight position PST:
PSTeval(ispieceonsq1 , ispieceonsq2, ispieceonsq3... ispieceonsq64)
where ispieceonsq has values of
0 - no piece
1 - my piece
-1 - opponent piece

and you can define the function as
PSTeval(x0,x1,x2...) = PST[0]*x0 + PST[1]*x1 + PST[2]*x2

ignoring everything except square 1:
PSTeval(0+1) = PSTeval(0)+PSTeval(1) OK
PSTeval(0+0) = PSTeval(0) + PSTeval(0) OK
PSTeval(1+0) = PSTeval(1) + PSTeval(0) OK
PSTeval(1+1) = PSTeval(1) + PSTeval(1) OK (but will never occur on a real chessboard)

For non symmetric pawn PSTs you could take in 128 values (64 for each side) instead of 64 and still define the PST function the same way and be good to go.
Pradu
Posts: 287
Joined: Sat Mar 11, 2006 3:19 am
Location: Atlanta, GA

Re: Linear vs. Nonlinear Evalulation

Post by Pradu »

Gerd Isenberg wrote:I have a question about the definition of linear versus nonlinear evaluation.
Is there any advantage to having a linear vs non-linear evaluation function? Would it make analyzing an evaluation function significantly easier if it was linear?
Tord Romstad
Posts: 1808
Joined: Wed Mar 08, 2006 9:19 pm
Location: Oslo, Norway

Re: Linear vs. Nonlinear Evalulation

Post by Tord Romstad »

I guess the usual definition of a linear function is one where the output of a sum of the inputs is equal to the the output of each individual input.
Not quite. The above is the definition of an additive function. However, you proceed to give the correct definition:
A function f is linear iff:

f(a+b) = f(a) + f(b).
f(c*a) = c*f(a)
The first of these equations says that the function is additive, and the second that it is homogeneous of degree 1. Together, they say that the function is linear.

At least to me, it is fairly obvious that every additive real number function is linear. Nevertheless, it is wrong, if you accept the axiom of choice. The proof is simple and beautiful:

Regard R (the set of real numbers) as a vector space over Q (the rational numbers). Pick two nonzero real numbers, one rational and one irrational. The numbers 1 and pi will do, and I'll use those for the rest of the discussion. Because 1 and pi are linearly independent over Q, we can construct a basis B for R over Q with 1 and pi as two of the basis elements (this is where we need the axiom of choice; the axiom of choice is equivalent to the existence of a basis for all vectorspaces).

Now construct a Q-linear map f: R -> R by f(1) = 0, and f(b) = 1 for all other basis vectors b. Because f is Q-linear, it is additive. However, it is not R-linear: We see that get f(pi) = 1, but pi * f(1) = 0.

Like all proofs using the axiom of choice, this proof is non-constructive: It doesn't help us find a particular real number function which is additive but non-linear. Indeed, I am fairly sure that noone will ever find such a function. This is a good example of why the axiom of choice is somewhat controversial: While the AC by itself looks intuitive and obvious, it can be used to give non-constructive proofs for a lot of very counterintuitive results.

Of course, in computer chess, most functions are not real number functions, but integer functions (modulo 2^32 or 2^64), and for such functions additivity and linearity are of course the same thing.
So I'd guess an evaluation function is linear if

f(feature1_a + feature1_b,feature2_a + feature2_b,feature3_a + feature3_a...) = f(feature1_a,feature2_a,feature3_a) + f(feature1_b,feature2_b,feature3_b) = f(feature1_a)+ f(feature2_a) + f(feature3_a)+f(feature1_b)+f(feature2_b)+f(feature3_b) ...

PSTs would be nonlinear because PST[sq1+sq2] != PST[sq1] + PST[sq2]

Simple material eval and mobility eval could be linear:
materialEval(onepawn + onepawn) = materialEval(onepawn) + materialEval(onepawn)
You can't define what it means that en evaluation function is linear without first defining what a "feature" means. For a program using a material + piece square table eval, it would be most natural to let the features be the set of all (piece, square) pairs on the board. Write this set as {(p_1, s_1), ..., (p_n, s_n)}. Then the evaluation function would look like this:

Code: Select all

eval = MaterialValue[p_1] + ... + MaterialValue[p_n] + PST[p_1][s_1] + ... + PST[p_n][s_n]
This function is clearly linear. However, as soon as you add something like a penalty for doubled pawns, the evaluation function is no longer linear, because eval({(WhitePawn, c3), (WhitePawn, c4)}) does not equal eval({(WhitePawn, c3)}) + eval({(WhitePawn, c4)}). Of course, you could make it linear again simply by extending your set of features, and regarding each doubled pawn on the board as an individual feature.

In other words, all evaluation functions can be regarded as linear, or as non-linear. It always depends on how you define the features. For the complex evaluation functions found in strong chess programs, I don't think it makes much sense to talk about linearity of the evaluation function as a whole, but discussing the linearity of some of the components of the evaluation function could still be interesting.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Linear vs. Nonlinear Evalulation

Post by hgm »

Gerd Isenberg wrote: Yes, one can interpret piece square table as weight vector, which are multiplied with a vector of boolean (absence or existence) features. Thus a "linear" combination or dot-product, even if the features are only zero or one.

Code: Select all

for all i sum += Feature[i] * weight[i];
The terms linear versus nonlinear evalulation were mentioned in several papers, but I wonder what their definition is. May be its arbitrary or outdated.
The point is that you can make anything linear that way. Just put al non-linearity in the Feature, and set al the weights to 100% (and make them user-configurable as UCI option. :wink: )

Suppose I have a term that is proportional to the square of the number of Grasshoppers in my evaluation function. I can define a Feature that is 1 if there are exactly i Grasshopers on the board, and 0 otherwise. And then I set weight to i*i. And, magical trick, a purely quadratic function has suddenly become linear...

So the statement that something is linear always has to be accompanied by the information "as a function of what?" to acquire meaning. That something is a linear function of the weights is not all that significant. (In fact it is merely the definition of "weights".) Non-linear as a function of piece position (e.g. passer advance) through a PST is something quite different from non-linear in piece values (e.g. through material tables).

In itself the claim a program uses "non-linear evaluation" is at par with detergent manufacturers claiming their product "launders whiter".
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Linear vs. Nonlinear Evalulation

Post by Gerd Isenberg »

Pradu wrote:
Gerd Isenberg wrote:I have a question about the definition of linear versus nonlinear evaluation.
Is there any advantage to having a linear vs non-linear evaluation function? Would it make analyzing an evaluation function significantly easier if it was linear?
I am only interested in the definition, related for instance to Levy's statement about:

Good, I.J. (1968). A Five-Year Plan for Automatic Chess Machine Intelligence II pp. 110-115

David Levy (1988). Computer Chess Compendium, 3 Position Evaluation pp. 112:
Perhaps non-linear evaluation functions will become popular at some future date, in which case some of Good's ideas will come into their own.
I wonder whether the "future" already began more than 20 years after ;-)