Stockfish's tuning method

ethanara
Posts: 134
Joined: Mon May 16, 2011 4:58 pm
Location: Denmark

Re: Stockfish's tuning method

Post by ethanara » Sat Oct 08, 2011 3:21 pm

Actually, I'm 13, not that it makes much difference.
It's a deal, Marco :) Here are the values I want to know about:

Code:

const Score MobilityBonus
const Value OutpostBonus
const Score ThreatBonus
const Score ThreatenedByPawnPenalty
const Score RookOn7thBonus
const Score QueenOn7thBonus
const Score RookOpenFileBonus
const Score RookHalfOpenFileBonus
const Value TrappedRookPenalty
const Score TrappedBishopA1H1Penalty
const int InitKingDanger
const Value PawnValueMidgame
const Value PawnValueEndgame
const Value KnightValueMidgame
const Value KnightValueEndgame
const Value BishopValueMidgame
const Value BishopValueEndgame
const Value RookValueMidgame
const Value RookValueEndgame
const Value QueenValueMidgame
const Value QueenValueEndgame
const int MgPST
const int EgPST
Hope I won the "game".
Any other games?

mcostalba
Posts: 2679
Joined: Sat Jun 14, 2008 7:17 pm

Re: Stockfish's tuning method

Post by mcostalba » Sat Oct 08, 2011 4:28 pm

Already tuned

Code:

const Score MobilityBonus
const Value OutpostBonus
const Score ThreatBonus
const Score ThreatenedByPawnPenalty
const Score RookOn7thBonus
const Score QueenOn7thBonus
const Score RookOpenFileBonus
const Score RookHalfOpenFileBonus
const Value PawnValueMidgame
const Value PawnValueEndgame
const Value KnightValueMidgame
const Value KnightValueEndgame
const Value BishopValueMidgame
const Value BishopValueEndgame
const Value RookValueMidgame
const Value RookValueEndgame
const Value QueenValueMidgame
const Value QueenValueEndgame
const int MgPST
const int EgPST
Not tuned

Code:

const Value TrappedRookPenalty
const Score TrappedBishopA1H1Penalty
const int InitKingDanger

UncombedCoconut
Posts: 319
Joined: Fri Dec 18, 2009 10:40 am
Location: Naperville, IL

Re: Stockfish's tuning method

Post by UncombedCoconut » Sat Oct 08, 2011 9:12 pm

Heh, makes sense that TrappedBishopA1H1Penalty isn't tuned. That one is only relevant at all in FRC games.

mcostalba
Posts: 2679
Joined: Sat Jun 14, 2008 7:17 pm

Re: Stockfish's tuning method

Post by mcostalba » Sat Oct 08, 2011 9:42 pm

UncombedCoconut wrote:Heh, makes sense that TrappedBishopA1H1Penalty isn't tuned. That one is only relevant at all in FRC games.
Yes :-). Also, TrappedRookPenalty doesn't make a lot of sense to tune: you just need a high enough value to tell the engine "don't do that".

The InitKingDanger table is another story. Tuning it would be worthwhile, but king safety is very sensitive to time control, so we would have to tune at long time control and it would take forever. I have tried in the past to tune it manually, changing the table values a bit, but never came up with something better than what we have now.

A very interesting project would be to use Rémi's CLOP to re-tune the parameters. Rather than writing a glue script that drives SF directly, being lazy, I was even thinking of using cutechess-cli to start the engines and manage the game, so that the glue script would be even smaller and simpler: cutechess-cli already takes care of things like passing parameters to engines, returning the result, and handling timeouts, crashes and so on.

But it is just an idea right now, and I guess it will remain one unless someone volunteers and steps up.
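
As a rough illustration of the glue idea above, here is a minimal sketch of the part that turns a set of parameter values into a cutechess-cli command line for a single game. It assumes the tuned parameters are exposed as UCI options under the names shown (they are not in the thread), that the engine binary is called "stockfish", and that the time control is "40/60"; all of these, and the function name build_match_command, are placeholders. Reporting the game result back to CLOP is left out.

```cpp
#include <map>
#include <sstream>
#include <string>

// Builds a cutechess-cli command line for one game between a "candidate"
// engine (with the parameter values under test) and an unmodified "baseline".
// Assumptions (not from the thread): the engine binary name, the time
// control, and that each tuned parameter is available as a UCI option.
std::string build_match_command(const std::map<std::string, int>& params,
                                const std::string& engine_cmd = "stockfish",
                                const std::string& time_control = "40/60") {
    std::ostringstream cmd;
    cmd << "cutechess-cli -engine cmd=" << engine_cmd << " name=candidate";
    // Each parameter is passed as option.NAME=VALUE to the candidate only.
    for (const auto& kv : params)
        cmd << " option." << kv.first << '=' << kv.second;
    cmd << " -engine cmd=" << engine_cmd << " name=baseline";
    // Settings shared by both engines, then play a single game.
    cmd << " -each proto=uci tc=" << time_control << " -games 1";
    return cmd.str();
}
```

A glue script built around this would run the returned command, read the game result from cutechess-cli's output, and hand it back to the optimizer.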

Eelco de Groot
Posts: 3885
Joined: Sun Mar 12, 2006 1:40 am
Location: Groningen

Re: Stockfish's tuning method

Post by Eelco de Groot » Sun Oct 09, 2011 1:44 am

mcostalba wrote:
UncombedCoconut wrote:Heh, makes sense that TrappedBishopA1H1Penalty isn't tuned. That one is only relevant at all in FRC games.
Yes :-). Also, TrappedRookPenalty doesn't make a lot of sense to tune: you just need a high enough value to tell the engine "don't do that".
Larry Kaufman made a remark about that when he was still testing Rybka values with very fast games, something along the lines that it was much too high. But I don't know whether this was an actual test of Stockfish or of the equivalent in Rybka. Both would be interesting, I think 8-)
The InitKingDanger table is another story. Tuning it would be worthwhile, but king safety is very sensitive to time control, so we would have to tune at long time control and it would take forever. I have tried in the past to tune it manually, changing the table values a bit, but never came up with something better than what we have now.
I think that if you do it right, it would mainly change the playing style but not necessarily add a lot of "knowledge". Something more aggressive might work against weaker engines but not against stronger ones, so the pool of opponent engines would also play a role.

I am not expecting much from it and am just making some casual observations, but right now there is an experiment in which not only the attacks and attackers are counted: if the opponent has enough attacking material, in principle all the defenders are counted too, in much the same way as the attackers.

The idea is that sometimes there are enough pieces, especially heavy pieces, defending the king position that the opponent will have a hard time breaking through. But the only way this is taken into account in Stockfish's eval right now is the "undefended" bitboard, which is a pretty small term. So I do think there might be something missing there.

So I now have

Code:

namespace {

  // Struct EvalInfo contains various information computed and collected
  // by the evaluation functions.
  struct EvalInfo {

    // Pointer to pawn hash table entry
    PawnInfo* pi;

    // attackedBy[color][piece type] is a bitboard representing all squares
    // attacked by a given color and piece type, attackedBy[color][0] contains
    // all squares attacked by the given color.
    Bitboard attackedBy[2][8];

    // kingZone[color] is the zone around the enemy king which is considered
    // by the king safety evaluation. This consists of the squares directly
    // adjacent to the king, and the three (or two, for a king on an edge file)
    // squares two ranks in front of the king. For instance, if black's king
    // is on g8, kingZone[WHITE] is a bitboard containing the squares f8, h8,
    // f7, g7, h7, f6, g6 and h6.
    Bitboard kingZone[2];

    // kingAttackersCount[color] is the number of pieces of the given color
    // which attack a square in the kingZone of the enemy king.
    int kingAttackersCount[2];
	
    // kingDefendersCount[color] is the number of pieces of the given color
    // which attack a square in the kingZone of the own king.
    int kingDefendersCount[2];

    // kingAttackersWeight[color] is the sum of the "weight" of the pieces of the
    // given color which attack a square in the kingZone of the enemy king. The
    // weights of the individual piece types are given by the variables
    // QueenAttackWeight, RookAttackWeight, BishopAttackWeight and
    // KnightAttackWeight in evaluate.cpp
    int kingAttackersWeight[2];
	
    // kingDefendersWeight[color] is the sum of the "weight" of the pieces of the
    // given color which attack a square in the kingZone of the own king. The
    // weights of the individual piece types are given by the variables
    // QueenAttackWeight, RookAttackWeight, BishopAttackWeight and
    // KnightAttackWeight in evaluate.cpp
    int kingDefendersWeight[2];

    // kingAdjacentZoneAttacksCount[color] is the number of attacks to squares
    // directly adjacent to the king of the given color. Pieces which attack
    // more than one square are counted multiple times. For instance, if black's
    // king is on g8 and there's a white knight on g5, this knight adds
    // 2 to kingAdjacentZoneAttacksCount[BLACK].
    int kingAdjacentZoneAttacksCount[2];
	
    // kingAdjacentZoneDefenceCount[color] is the number of defending moves to
    // squares directly adjacent to the king of the given color. Pieces which defend
    // more than one square are counted multiple times. For instance, if black's
    // king is on g8 and there's a black knight on g5, this knight adds
    // 2 to kingAdjacentZoneDefenceCount[BLACK].
    int kingAdjacentZoneDefenceCount[2];
  };
And things like

Code:

  template<Color Us, bool HasPopCnt>
  void init_eval_info(const Position& pos, EvalInfo& ei) {

    const BitCountType Max15 = HasPopCnt ? CNT_POPCNT : CpuIs64Bit ? CNT64_MAX15 : CNT32_MAX15;
    const Color Them = (Us == WHITE ? BLACK : WHITE);

    int count;
    Bitboard b = ei.attackedBy[Them][KING] = pos.attacks_from<KING>(pos.king_square(Them));
    ei.attackedBy[Us][PAWN] = ei.pi->pawn_attacks(Us);

    // Init king safety tables only if we are going to use them
    if (pos.non_pawn_material(Us) >= QueenValueMidgame + RookValueMidgame)
    {
      ei.kingZone[Us] = (b | (Us == WHITE ? b >> 8 : b << 8));
      b &= ei.attackedBy[Us][PAWN];
      if (b)
      {
        count = count_1s<Max15>(b);
        ei.kingAttackersCount[Us] = count;
        ei.kingAdjacentZoneAttacksCount[Us] = count_1s<Max15>(ei.attackedBy[Them][KING] & ei.attackedBy[Us][PAWN]);
        ei.kingAttackersWeight[Us] = ei.kingAttackersCount[Us] * KingAttackWeights[PAWN];
      }
      else
        ei.kingAdjacentZoneAttacksCount[Us] = ei.kingAttackersWeight[Us] = ei.kingAttackersCount[Us] = 0;
      ei.kingAdjacentZoneDefenceCount[Them] = ei.kingDefendersWeight[Them] = ei.kingDefendersCount[Them] = 0;
    } else
        ei.kingZone[Us] = ei.kingAttackersCount[Us] = 0;
  }
Sorry the indents are a mess: whenever I post something modified in the MSVC editor here, I have to change everything to get rid of some invisible tab stops. But I hope the idea is straightforward enough. If anybody wants to test it in Stockfish as well, it is easy enough to code because the basic structure is already there.
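
For readers who want to check the examples given in the EvalInfo comments, the following is a small standalone sketch of the king-zone and adjacent-square counting. It does not use Stockfish's types or helpers: the square numbering (a1 = 0, h8 = 63), the helper names, and the attack generators are all assumptions made for this illustration.

```cpp
#include <bitset>
#include <cstdint>

using Bitboard = std::uint64_t;

// Assumed layout for this sketch: a1 = bit 0, b1 = bit 1, ..., h8 = bit 63.
constexpr Bitboard FileA = 0x0101010101010101ULL;
constexpr Bitboard FileB = FileA << 1;
constexpr Bitboard FileG = FileA << 6;
constexpr Bitboard FileH = FileA << 7;

constexpr int square(char file, int rank) { return (file - 'a') + 8 * (rank - 1); }

int popcount(Bitboard b) { return static_cast<int>(std::bitset<64>(b).count()); }

// Squares directly adjacent to a king on sq.
Bitboard king_attacks(int sq) {
    const Bitboard k = 1ULL << sq;
    const Bitboard row = k | ((k & ~FileH) << 1) | ((k & ~FileA) >> 1);
    return (row | (row << 8) | (row >> 8)) & ~k;
}

// Squares attacked by a knight on sq.
Bitboard knight_attacks(int sq) {
    const Bitboard n = 1ULL << sq;
    const Bitboard one = ((n & ~FileA) >> 1) | ((n & ~FileH) << 1);                     // one file away
    const Bitboard two = ((n & ~(FileA | FileB)) >> 2) | ((n & ~(FileG | FileH)) << 2); // two files away
    return (one << 16) | (one >> 16) | (two << 8) | (two >> 8);
}

// Zone around a black king as seen by White: the adjacent squares plus
// the same squares shifted one rank towards White (b | b >> 8).
Bitboard king_zone_vs_black_king(int black_king_sq) {
    const Bitboard b = king_attacks(black_king_sq);
    return b | (b >> 8);
}
```

With a black king on g8, popcount(king_zone_vs_black_king(square('g', 8))) gives 8 (f8, h8, f7, g7, h7, f6, g6, h6), and popcount(knight_attacks(square('g', 5)) & king_attacks(square('g', 8))) gives 2 (f7 and h7), matching the comments. The same intersection applied to a piece of the king's own colour yields the defence count.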

All I can say now (the code was added to the Rainbow Serpent eval only today) is that it will lower the evals a lot, because there will be fewer attackUnits whenever there are defenders and you value them about the same as the attackers. Even in Rainbow Serpent, which in principle has a much higher weight for king safety, I think I can already see that the evals are much tamer. But I don't know if this adds much "knowledge". I don't think it is fundamentally wrong, though.
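
To make the effect on attackUnits concrete, here is a hypothetical sketch of how a defender term valued the same way as the attacker term would reduce the total. The struct, the function name, and the constants (the cap of 25, the division by 2, the factor 3) are placeholders for illustration, not the actual formula from Stockfish or Rainbow Serpent.

```cpp
#include <algorithm>

// Hypothetical, simplified inputs for the king safety of one side.
struct KingPressure {
    int attackersCount;    // enemy pieces attacking the king zone
    int attackersWeight;   // summed attack weights of those pieces
    int adjacentAttacks;   // attacks on squares next to the king
    int defendersCount;    // own pieces covering the king zone
    int defendersWeight;   // summed weights of those pieces
    int adjacentDefences;  // defences of squares next to the king
};

// Illustrative only: the defender term is computed exactly like the
// attacker term and subtracted, so equal attack and defence cancel out.
// All constants are placeholders chosen for this sketch.
int attack_units(const KingPressure& k) {
    const int attack  = std::min(25, k.attackersCount * k.attackersWeight / 2)
                      + 3 * k.adjacentAttacks;
    const int defence = std::min(25, k.defendersCount * k.defendersWeight / 2)
                      + 3 * k.adjacentDefences;
    return std::max(0, attack - defence);
}
```

Under these placeholder numbers, attack_units({2, 5, 3, 0, 0, 0}) is 14, while the same attack met by identical defence, attack_units({2, 5, 3, 2, 5, 3}), drops to 0, which is the "tamer evals" effect described above.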

Just an idea, it is not an official proposal for a patch or anything :) I have no results for it either. If anyone wants to test, as a patch for Stockfish, the actual eval.cpp that is in use in Rainbow Serpent, I could post it.

Eelco
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan

mcostalba
Posts: 2679
Joined: Sat Jun 14, 2008 7:17 pm

Re: Stockfish's tuning method

Post by mcostalba » Sun Oct 09, 2011 5:24 am

Eelco de Groot wrote: I am not expecting much from it and am just making some casual observations, but right now there is an experiment in which not only the attacks and attackers are counted: if the opponent has enough attacking material, in principle all the defenders are counted too, in much the same way as the attackers.

The idea is that sometimes there are enough pieces, especially heavy pieces, defending the king position that the opponent will have a hard time breaking through. But the only way this is taken into account in Stockfish's eval right now is the "undefended" bitboard, which is a pretty small term. So I do think there might be something missing there.
Yes, I agree with your analysis. Actually, this is how it is implemented in Critter. I was thinking about this just a few days ago when I read that part of Critter's Pascal sources, and I thought it was a more natural and logical approach, perhaps at the cost of a small slowdown because you have to add some extra logic in the very hot path evaluate_pieces().

I agree with you that this idea deserves some testing.

mcostalba
Posts: 2679
Joined: Sat Jun 14, 2008 7:17 pm

Re: Stockfish's tuning method

Post by mcostalba » Sun Oct 09, 2011 7:09 am

Eelco de Groot wrote: Just an idea, it is not an official proposal for a patch or anything :)
No problem, I have done that. When I like an idea I am quite fast at coding it ;-)

I have pushed a new branch called king_defenders to GitHub; its last 2 patches are the result of this discussion.

https://github.com/mcostalba/Stockfish/ ... _defenders

The first just introduces the infrastructure without changing the functionality; the second enables it. Note that the parameters of the second patch are very possibly far from optimal, and tuning is really needed.

edwardyu
Posts: 34
Joined: Mon Nov 17, 2008 5:58 am

Re: Stockfish's tuning method

Post by edwardyu » Sun Oct 09, 2011 8:34 am

Hi Remi,

Stockfish's tuning method starts from base values that are already good. Is this equivalent to using a narrow MIN/MAX range around the starting value in CLOP? Or do you prefer a wider range in order to explore more possibilities?

Rémi Coulom
Posts: 404
Joined: Mon Apr 24, 2006 6:06 pm

Re: Stockfish's tuning method

Post by Rémi Coulom » Sun Oct 09, 2011 9:04 am

edwardyu wrote:Hi Remi,

Stockfish's tuning method starts from base values that are already good. Is this equivalent to using a narrow MIN/MAX range around the starting value in CLOP? Or do you prefer a wider range in order to explore more possibilities?
Using a wide range and letting CLOP focus on the right interval by itself is the most efficient. As I wrote, this is one advantage of CLOP over Stockfish's tuning method: the user of the algorithm does not have to guess good values for such parameters of the optimization algorithm. CLOP will figure out good values by itself.

Rémi

zamar
Posts: 613
Joined: Sun Jan 18, 2009 6:03 am

Re: Stockfish's tuning method

Post by zamar » Sun Oct 09, 2011 9:32 am

Rémi Coulom wrote:

Using a wide range and letting CLOP focus on the right interval by itself is the most efficient. As I wrote, this is one advantage of CLOP over Stockfish's tuning method: the user of the algorithm does not have to guess good values for such parameters of the optimization algorithm. CLOP will figure out good values by itself.

Rémi
I think that answering the following question would be of great practical importance: how many iterations does CLOP need to outperform the SPSA method with a good starting value? (You've already shown that this happens asymptotically.)
Joona Kiiski
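
For context on the comparison being asked about, below is a minimal sketch of one generic SPSA update on a toy objective. It is not Stockfish's actual tuner: the function names, the step size a, the perturbation size c, and the noise-free quadratic objective are all assumptions for illustration. In engine tuning the objective would be a noisy match result between the two perturbed parameter sets.

```cpp
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// One generic SPSA step (maximisation). Every parameter is perturbed at
// once by +c or -c, the objective is evaluated at both perturbed points,
// and the difference gives a gradient estimate for all parameters.
//   theta : current parameter vector, updated in place
//   f     : objective to maximise
//   a     : step size, c : perturbation size (both chosen by the user)
void spsa_step(std::vector<double>& theta,
               const std::function<double(const std::vector<double>&)>& f,
               double a, double c, std::mt19937& rng) {
    std::bernoulli_distribution coin(0.5);
    std::vector<int> delta(theta.size());
    std::vector<double> plus(theta), minus(theta);
    for (std::size_t i = 0; i < theta.size(); ++i) {
        delta[i] = coin(rng) ? 1 : -1;   // random direction per parameter
        plus[i]  += c * delta[i];
        minus[i] -= c * delta[i];
    }
    const double diff = f(plus) - f(minus);
    for (std::size_t i = 0; i < theta.size(); ++i)
        theta[i] += a * diff / (2.0 * c * delta[i]);
}

// Toy objective with a known optimum at (100, 300).
double toy_objective(const std::vector<double>& p) {
    const double dx = p[0] - 100.0, dy = p[1] - 300.0;
    return -(dx * dx + dy * dy);
}
```

The user has to pick a, c and the starting theta; these are the guesses that, according to Rémi's reply, CLOP does not require.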
