tuning via maximizing likelihood

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: maximum a posteriori

Post by AlvaroBegue »

Daniel Shawul wrote:
AlvaroBegue wrote:
Daniel Shawul wrote:A third point :

Since most evaluation functions are linear[...]
Mine certainly isn't. I am not sure how you express in a linear function that endgames with opposite-colored bishops are drawish, or that you need to coordinate several pieces to mount an effective attack on the king. Those things are naturally non-linear.
Note that we are talking about linearity with regard to the parameters. In other words, no two parameters should multiply or divide each other (they can add and subtract), nor be used as a test condition to apply another evaluation term. By this definition, my evaluation function looks totally linear to me. For example, for king safety I use the number of attacks from pieces, calculate the phase etc. and multiply by KING_SAFETY_WEIGHT. Since the phase is determined by the number of pieces (not the summed value of pieces), and the number of attacks by the presence of a piece (not its value), it is still linear.

The opposite-color bishops case should also be a property of the position and be linear in the parameters. I am calculating the gradient of the eval with respect to the parameters for each position, so if you don't change the position the gradient should remain the same. With stochastic sampling, you are sampling different positions, so the mean of the gradients will differ from iteration to iteration. But if you consider all positions, you only need to compute it once. Computing and storing the Hessian is probably overkill, especially with a large number of parameters.

Can you give an eval term that is non-linear by this definition ?
We are using the same definition of non-linearity. Of course KING_SAFETY_WEIGHT can be tuned and its contribution to the score is linear, but presumably there are parameters that are implied in the definition of king safety. For instance, I count an attack by a knight as 2, by a bishop as 2, by a rook as 3 and an attack by a queen as 5. I add these contributions for one side, compute the square and that is the term that is scaled by something like your KING_SAFETY_WEIGHT. But I would like to be able to adjust the numbers involved in the definition of the term (2, 2, 3 and 5) as well.

The drawishness of opposite-color bishops is implemented as a final step in the evaluation function, after the linear combination of terms has already been computed, and it consists of a scaling factor that is applied if certain conditions apply. If I wanted to tune this scaling factor at the same time as any of the other parameters, I wouldn't be looking at a linear function.
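To make the distinction concrete, here is a minimal sketch of the kind of king-safety term described above (hypothetical function and weight names, not actual RuyDos code). The term is linear in the outer KING_SAFETY_WEIGHT, but non-linear in the per-piece attack weights (the 2/2/3/5 counts), because those are summed and then squared:

```cpp
#include <cassert>

// Hypothetical king-safety term: per-piece attack counts are weighted,
// summed, and the sum is squared before scaling by the outer weight.
// Linear in king_safety_weight; quadratic in the inner weights.
static int king_attack_term(int knights, int bishops, int rooks, int queens,
                            int knight_w, int bishop_w, int rook_w, int queen_w,
                            int king_safety_weight) {
    int attack = knights * knight_w + bishops * bishop_w
               + rooks * rook_w + queens * queen_w;
    // The squaring is what makes the score non-linear in the inner
    // weights: doubling knight_w does not double this term.
    return king_safety_weight * attack * attack;
}
```

With one knight attack (weight 2) the term is 4; doubling the knight weight to 4 quadruples it to 16, which is exactly why these inner weights cannot be tuned by a purely linear method.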
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: maximum a posteriori

Post by Daniel Shawul »

AlvaroBegue wrote:
We are using the same definition of non-linearity. Of course KING_SAFETY_WEIGHT can be tuned and its contribution to the score is linear, but presumably there are parameters that are implied in the definition of king safety. For instance, I count an attack by a knight as 2, by a bishop as 2, by a rook as 3 and an attack by a queen as 5. I add these contributions for one side, compute the square and that is the term that is scaled by something like your KING_SAFETY_WEIGHT. But I would like to be able to adjust the numbers involved in the definition of the term (2, 2, 3 and 5) as well.
The key phrase is "would like to": you probably have nothing in your eval that is non-linear at the moment :) I keep the piece attack counts constant and tune only KING_SAFETY_WEIGHT.

Code: Select all

int score = ((9 * ATTACK_WEIGHT * attack + 1 * TROPISM_WEIGHT * tropism) >> 4);
The same goes for my mobility evaluation -- of course I could easily make it non-linear if I wanted to...
The drawishness of opposite-color bishops is implemented as a final step in the evaluation function, after the linear combination of terms has already been computed, and it consists of a scaling factor that is applied if certain conditions apply. If I wanted to tune this scaling factor at the same time as any of the other parameters, I wouldn't be looking at a linear function.
No one uses the actual piece values to determine drawishness. The number of pieces, or a constant 1-3-3-5-9 count, is enough for this purpose. For example, I have the following, which is absolutely linear.

Code: Select all

	/*
	opposite colored bishop endings
	*/
	if(w_piece_value_c + b_piece_value_c <= 16
		&& w_piece_value_c == b_piece_value_c
		&& w_bishop_c == 1
		&& b_bishop_c == 1
		&& is_light(plist[wbishop]->sq) != is_light(plist[bbishop]->sq)
		&& ABS(w_pawn_c - b_pawn_c) <= 2
		) {
		if(w_piece_value_c + b_piece_value_c == 6) {
			if(w_pawn_c >= b_pawn_c)
				w_win_chance /= 2;
			if(b_pawn_c >= w_pawn_c)
				b_win_chance /= 2;
		} else {
			if(w_pawn_c >= b_pawn_c)
				w_win_chance = (3 * w_win_chance) / 4;
			if(b_pawn_c >= w_pawn_c)
				b_win_chance = (3 * b_win_chance) / 4;
		}
	}

Re: maximum a posteriori

Post by AlvaroBegue »

I am confused. You insist that I don't have non-linear parameters in my evaluation function, even though I just gave you two examples.

In the code you posted for opposite-color bishops, you do things like `w_win_chance /= 2;`. I imagine there are parameters involved in the computation of w_win_chance. If you want to also tune the 2 in that formula, your evaluation function is not linear.

You seem to be saying "make those parameters untunable constants and then the function is still linear". Well... yeah. But I want to tune all the parameters.

Re: maximum a posteriori

Post by Daniel Shawul »

AlvaroBegue wrote:I am confused. You insist that I don't have non-linear parameters in my evaluation function, even though I just gave you two examples.

In the code you posted for opposite-color bishops, you do things like `w_win_chance /= 2;`. I imagine there are parameters involved in the computation of w_win_chance. If you want to also tune the 2 in that formula, your evaluation function is not linear.

You seem to be saying "make those parameters untunable constants and then the function is still linear". Well... yeah. But I want to tune all the parameters.
Alvaro, you are so thick; it seems you would do anything to hang on to a position you accidentally took in a conversation -- just like when you blundered with the [-1,1] range in my OP. It seems whatever you said first is your final position... You know you can be wrong, overlook something, etc. -- it is not a sin to change your mind when something new shows up.

Here is your evaluation function -- which you tuned in a totally linear manner. Those factors like 0.825662 etc. certainly must come from your tuning; you have a 0.1 factor for king safety, bishop drawishness has an untuned factor of 3, etc. If you had actually realized your evaluation was linear, would you have done the tuning with the autodiff stuff? Like I already said, you seem to argue for the sake of it... Best to ignore you and move on with what I am doing, really.

Code: Select all

  // All together now!
  int score = static_cast<int>(material_score * 0.825662
			       + knight_outpost_score * 0.46
			       + ((attack[White] > 0 ? attack[White] * attack[White] : 0)
				  - (attack[Black] > 0 ? attack[Black] * attack[Black] : 0)) * 0.1
			       + passed_pawns_score * 0.723808
			       + helpless_pawns_score
			       + doubled_pawns_score * 1.03766
			       + unsupporting_pawn_score
			       + king_safety_score * 0.984506
			       + mobility_score * 1.22597
			       + rooks_score
			       + bishops_score * 1.033962
			       + (b.active == White ? 1 : -1) * 10.0
			       + random_score);

  // Pawns and opposite-color bishops only makes for a drawish endgame
  if (whites == (b.bb[WhiteKing]|b.bb[WhiteBishop]|b.bb[WhitePawn]) &&
      blacks == (b.bb[BlackKing]|b.bb[BlackBishop]|b.bb[BlackPawn]) &&
      b.bb[WhiteBishop] && b.bb[BlackBishop] &&
      !(b.bb[WhiteBishop] & (b.bb[WhiteBishop] - 1)) &&
      !(b.bb[BlackBishop] & (b.bb[BlackBishop] - 1)) &&
      (!(b.bb[WhiteBishop] & LightSquares)) != (!(b.bb[BlackBishop] & LightSquares)))
    score /= 3;

Re: maximum a posteriori

Post by AlvaroBegue »

Yes, we better ignore each other.

Re: maximum a posteriori

Post by Daniel Shawul »

This seems to work very well. Computing the Jacobian on 1 million positions takes about 20 seconds; after that there is no need to call eval() again. The CG iterations are super-fast now!

It is basically replacing the evaluation function of each position by a linear function

Code: Select all

   eval_i = C0_i + p1 * C1_i + p2 * C2_i + ... + pn * Cn_i
where the coefficients C1..Cn are pre-computed for each position and stored in the Jacobian. Given m positions and n parameters, the Jacobian is m×n (actually I also store the result in there, so it is m×(n+1))

Code: Select all

J = [C0_1   C1_1 .....  Cn_1, R1
     C0_2   C1_2 .....  Cn_2, R2
                 .....
     C0_m   C1_m .....  Cn_m, Rm]
The way I implemented it now just uses the coefficients to compute an evaluation given parameters P1, P2, ..., Pn. So it still uses the non-linear CG optimizer that computes gradients at every step -- only now each eval call is much faster in the case of a linear eval. I think the idea may also be extended to some non-linear evaluation functions, for example an eval that is non-linear in P2, like

Code: Select all

   eval_i = C0_i + p1 * C1_i + p2 * p2 * C2_i + ... + pn * Cn_i
The Jacobian is then no longer a real Jacobian, but who cares; we just need the eval calls to be faster, by extracting a function out of each position and storing it.

Daniel
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: maximum a posteriori

Post by Evert »

Daniel Shawul wrote: Since most evaluation functions are linear, we can compute and store the Jacobian for all the fens in the database once at start up. The jacobian is a constant matrix of (number_of_positions)X(number_of_parameters). Then we will never have to call the engines eval() again for computing gradients. I haven't tried it but this should be very efficient compute wise and also avoid the need for automatic differentiation while getting the benefit of less eval calls.
:shock:
That's very perceptive. I never noticed that (I did figure that since the evaluation was linear, there was no point in being careful about numerical differentiation, and treated it as a black box), but this should considerably improve the calculation speed.
By the way, do you intend to summarise what you ended up doing? There's a lot of really useful information in this thread, but at the moment it's a bit scattered.
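For an eval that really is linear in its parameters, the coefficient row can even be recovered from a black-box eval() with finite differences that are exact rather than approximate: n+1 calls per position, no autodiff needed. A sketch under that assumption (the callback and names are hypothetical):

```cpp
#include <cassert>
#include <functional>
#include <vector>

// For an eval that is exactly linear in its parameters,
// Cj = eval(p + e_j) - eval(p) with a unit step has no truncation
// error, and C0 is obtained by evaluating at p = 0.
static std::vector<double> extract_row(
        const std::function<double(const std::vector<double>&)>& eval_fn,
        size_t n) {
    std::vector<double> p(n, 0.0), row(n + 1);
    row[0] = eval_fn(p);                  // C0: all parameters zero
    for (size_t j = 0; j < n; ++j) {
        p[j] = 1.0;
        row[j + 1] = eval_fn(p) - row[0]; // Cj: exact for a linear eval
        p[j] = 0.0;
    }
    return row;
}
```

This is essentially why careful numerical differentiation is unnecessary in the linear case: the "derivative" is a constant that can be read off exactly.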

Re: maximum a posteriori

Post by Daniel Shawul »

Evert wrote:
Daniel Shawul wrote: Since most evaluation functions are linear, we can compute and store the Jacobian for all the fens in the database once at start up. The jacobian is a constant matrix of (number_of_positions)X(number_of_parameters). Then we will never have to call the engines eval() again for computing gradients. I haven't tried it but this should be very efficient compute wise and also avoid the need for automatic differentiation while getting the benefit of less eval calls.
:shock:
That's very perceptive. I never noticed that (I did figure that since the evaluation was linear, there was no point in being careful about numerical differentiation, and treated it as a black box), but this should considerably improve the calculation speed.
By the way, do you intend to summarise what you ended up doing? There's a lot of really useful information in this thread, but at the moment it's a bit scattered.
Thanks!
There are still some things to try, as I am not able to get any improvement out of my tuning yet!
At the moment, I can't say which method gives better results because of this. Once I am done I may write a blog post about it.

Daniel