The Inferno Thread, part II

hgm · Post by **hgm** » Sat Mar 07, 2020 5:24 pm

Three years ago I ran a kind of blog here on the creation of an engine for Tenjiku Shogi ( http://talkchess.com/forum3/viewtopic.php?f=7&t=63356 ). This never progressed much beyond the search and move generation; because I was a bit in a hurry (I had entered it in a correspondence tournament before I had even started designing it, and my clock was running), I started running it with a makeshift material-only evaluation (using piece values obtained from the internet) as soon as I could. Even with this evaluation, and half its time already elapsed, it still managed to finish second in the tourney. The year after it, it even won, because it had white in the key game against the title defender. (And having white is a pretty significant advantage,in Tenjiku Shogi.) Last year, however, it played black again, and dismally lost, despite coming out of the opening with a little advantage.

The reason was that the material-only evaluation utterly sucked; it did not encourage pushing its Pawns (which in the opening phase can be very dangerous, as they would allow enemy Fire Demons to sneak into your camp). And it did not even have piece-square tables, just a centralization bonus for all pieces. Which made all its strong sliders 'float up' against its mostly unbroken Pawn wall like air bubbles against an ice sheet, after which it was just waiting to be slaughtered by a well-planned breakthough of the opponent in a selected place. Even the piece values sucked, making it take many of the unfavorable trades the opponent smartly offered it. (This mostly because of the lack to take promotability into account correctly.)

The 4th championship in which it participates is just starting, so it is high time to shape up its evaluation a bit. I want to report about that project here. Rather than reviving the old thread, I started this new one.

I would like to fix the following things that proved to be obvious and embarrasing shortcomings last year:

* Pawn structure: scattering the Pawns too much (creating too many holes for diagonal movers) should still be discouraged. But not by absolute discouragement of any Pawn move, but by detecting and penalizing actual holes. So that holes will be closed after they served their use. Advancing all Pawns one rank would keep the Pawn wall as closed as it can be, and should be encouraged. It creates more space for the pieces behind it to reposition themselves, and (very important!) is allows the original Pawn rank (which is the foremost rank of the promotion zone) to be protected by sideway sliders.

* Mobility: especially for slider development it is very important to optimize actual mobility, rather than mindlessly have them march towards the center as far as they can. This also provides a drive for pushing Pawns.

* Stepper clusters: there are many pieces that move like a King with some moves missing. On the 16x16 board it takes very long to get them into play, even for defensive purposes (as they start on the back rank it takes 4 moves to even use one to protect Pawns on their starting rank!). Initially they seem almost useless. But in last year's game they were decisive. For an isolated stepper adventurously attacking is rather suicidal. But clusters of steppers mutually protecting each other can become very dangerous when they approach the promotion zone (somewhat like connected passed super-Pawns). As they all do promote, with a quite significant gain in value.

* Piece values should be rationalize w.r.t. promotability, through some algorithm that can estimate how much of the (latent) promotion value you will be able to harvest, considering the current actual material balance of the base values. You will not be able to promote anything before you have worn down the opponent's defense of the promotion zone, and doing the latter will unavoidably force you to trade much of your material for his. And you won't get any promotion gain from the pieces you trade. In last year's game the engine was left with sliders that gained almost nothing on promotion bythe time the zones became undefensible against slider intrusion, while the opponent had smartly kept a few sliders with strong promotion, amongst which the one that promoted to the virtually invincible Fire Demon. (Which we now think is worth any number of enemy pieces.)

hgm · Post by **hgm** » Sat Mar 07, 2020 6:01 pm

Piece-Square Tables

The incremental evaluation will have to move away from simple-minded centralization to true Piece-Square Tables. The makeshift evaluation was using centerFactor[pieceType]*centerBonus[square], making all pieces center in the same way (albeit with different priority). This often led to very stupid moves.

For one, centralizing in a direction in which a piece can slide is in lowest-order quite pointless. If pieces like Rooks get a centralizing PST, you would get the effect (in orthodox Chess terms) that after Nc3 they cannot weight to harvest a few PST points by Rb1, and to follow it up after the Bishop removes itself from c1 with Rc1, etc. A huge loss of tempos for no purpose, as the Rook is not necessarily better on d1/e1 as it is on a1, if there were no open files there. The moves have no preparation value, and moving the Rook should only be encouraged if it goes to an open file (i.e. mobility driven). Rook-like pieces (and Tenjiku Shogi has many piece types the moves of which are sub-sets of those of the Rook) should have a completely neutral PST. Even a piece that only slides in one orthogonal dimension (and steps in the other) doesn't really benefit from centralizing it in the dimension where it steps; the best location in that dimension would still be the one where its sliding move is on an open file (or rank, as the case may be).

For pieces with diagonal moves this is a bit different; close to the center both their forward moves can hit the enemy lines, while in the wings only a single move does. But that needs a more clever PST than just penalizing distance to the center, as (again in Chess terms) a Bishop on e1 is not necessarily much better than one on b1. You have to be more advanced, and not blocked by your own Pawns before you start hitting the enemy lines in two places. So all (sub-)Queens should get a carefully designed PST, one that encourages a location where it can be more active, while not encouraging pointless shuffling behind its own Pawn line.

The steppers (sub-Kings) need subtle treatment. These mostly start on the back rank. But they are sorely needed near the Pawns (which start on 5th rank), to close breaches in that rank through which the opponent's light-weight pieces like Bishops 'poke you in the eye' (skewering many nearly-Queens), with pieces of low value to shelter behind. But it takes 3 or 4 moves to get them there, and in the opening tempos are costly. One should count himself lucky if it is possible to get even a single stepper at the Pawn line. So the behavior that should be encouraged is to 'race' a single stepper forward as fast as it can, rather than first advancing a number of steppers to second rank. This behavior can be encouraged by by awarding an increasingly large difference between the bonus of the ranks (as in Chess we give to passed Pawns). So that it is much more profitable (PST-wise) to step from 3rd to 4th rank than from 1st to 2nd. Beyond 5th rank there is no need to encourage furthe advance of single steppers, though; in fact it is ill advised to have single steppers advance onto the opponent's half. So from 7th-rank on the PST values should again decrease.

hgm · Post by **hgm** » Sat Mar 07, 2020 8:14 pm

Slider Mobility

Unlike PST, which are trivial to implement, and the only problem is to tune them well, mobility is a potentially time-consuming calculation. So a lot can be gained by a clever implementation.

The main reason for wanting mobility is to build in a tendency for good slider development. On the 16x16 board having a slider 'look out' through an opening in the Pawn chain typically gives an enormous increase of the number of moves. The bonus for this should drive the creation of such escape holes in the Pawn chain. Low-valued sliders should get priority over high-valued sliders; if the 'exit' from your own camp is in use by a high-valued piece, the opponent can conquer the ray by positioning a (protected) lower-valued piece there, possibly skewering your occupant of the ray with something valuable behind it, forcing you to close the ray with some low-valued piece (if you are so lucky that you can...).

Also, for moving the sliders out into the open, the priority must be for low-valued sliders. These typically have fewer directions to move in, so that the mobility bonus per direction should more than compensate that. Perhaps the different directions should also have different weights for the same piece; forward moves are typically more profitable than sideway or backward moves. Awarding forward mobility more than backward mobility of course creates a tendency to move sliders backwards; the PST should compensate that tendency with a forward gradient.

For the moment I am not worrying too much on optimizing attacks on enemy vs friendly pieces; the latter are not legal moves, but they protect the blocker, and this can be useful too. If we make no difference between attacking an empty square, enemy or friend, leapers will have a mobility only dependent on their location, which can be handled by PST. The leapers in Tenjiku Shogi are almost all sub-Kings, and the most useful task for those that I have perceived so far is to protect their own Pawns by marching up to just behind the Pawn wall. Optimizing their number of legal moves would just discourage that. Unlike in orthodox Chess, you don't have to worry much about them being chased away by an even lower-valued piece, so retreat paths are not really a plus.

So we only worry about slider mobility. As the engine works mostly incrementally, we would want to calculate the mobility bonus incrementally too. With 30 pieces per side that have sliding moves in one or more directions, calculating the mobility in each node from scratch would be very expensive. Sliding moves change when you move a slider: the mobility from the old location disappears, and the mobility in the new position has to be calculated. If we keep the mobility bonus of each piece in the piece list, we can simply deduct that from the total mobility bonus. (And then have to properly save and restore it in MakeMove()/UnMake(), so we can replace it by the new mobility bonus.) We can also take care of the disappearence of the mobility bonus of a captured piece that way. (And we don't even have to save/restore that, as deeper in the tree that piece-list entry will not be used.) So basically the required effort is equivalent to calculating the mobility of a single piece (in its new location) from scratch. A good improvement over calculating it for 2x30 pieces!

The Inferno engine has a routine to generate all moves of a single piece ('AddMoves()'), and update its capture targets in the attack map. It is actually used for adding as well as removing pieces, depending on a 'sign' parameter that controls whether the attacks are added to or subtracted from the attack map. For this reason it is probably not a good idea to have it calculate the new mobility as well, as we only need to do that in MakeMove() for the to-square (the mobilty on the from-square already known from the piece-list), while in UnMake(), this doesn't have to be done for either the from- or -to-square (as we just put the saved value back). So we will add a new routine MobilityCalc(), to be called from MakeMove() after the move is made. For each of the 8 primary directions the piece slides in (which can be indicated in a byte in the piece list, from which we extract the set bits to loop through the directions), we then consult the view distance in that direction, multiply it by the per-piece, per-direction mobility weight, and add the lot. The total is stored in the piece list, and added to the totalMobility (or, even easier, the incremental Evaluation).

Some real code:

Code: Select all

unsigned char slideMask[NPIECES];     // mask indicating sliding directions (one bit per direction)
unsigned char mobWeight[NPIECES][8];  // mobility bonus per squere, for each direction
unsigned char pieceMobility[NPIECES]; // per-piece mobility bonus (0 for empty square)

int
MobilityCalc (int sqr)
{
  int mob = 0;
  int piece = board[sqr];                 // piece number
  int todo = slideMask[piece];            // set of infinite-range sliding directions
  while(todo) { // for all sliding directions
    int d = lowBit[todo];                 // extract next direction
    int dist = viewDist[sqr].d[d];        // distance to nearest obstacle in that direction
    int vec = step[d];                    // board step in that direction
    dist -= (board[sqr + dist*vec] == 1); // subtract 1 if obstacle is board edge
    mob += dist*mobWeight[piece][d];      // weight per direction
    todo -= 1 << d;
  }
  return mob; // returns mobility bonus for this piece
}

void
MakeMove (UndoInfo *u, int move)
{
  ...

  u->savedMob = pieceMobility[u->piece];
  u->gain -= pieceMobility[u->piece];               // lose mobility bonis in old location
  u->gain += pieceMobility[u->victim];              // gain mobility bonus of victim
  pieceMobility[u->piece] = MobilityCalc(u->toSqr); // remember mobility on to-square
  u->gain -= pieceMobility[u->piece];               // gain mobility bonus in new location
}

void
UnMake (UndoInfo *u)
{
  ...

  pieceMobility[u->piece] = u->savedMob; // restore mobility bonus of moved piece
}

This version of MobilityCalc() tries to be accurate, by distinguishing the case where the slider runs into the board edge (boundary guard encoded as 1) from one where it runs into a friendly or enemy piece (board[...] > 1). Because we do count the attack / protection, but there is never any use for stepping off board! I doubt whether this refinement will really buy us much; it could be traded for speed by removing the calculation of 'vec' and the line below it, which then would treat the edge and occupied-square cases equal.

Of course one could even try to distinguish the cases where the occupant is a friend or a foe. But this opens a can of worms, as making such a distinction would also require us to adapt the mobility of the piece (and the total) when the occupant 'flips sign' on a capture. The victim could have been under attack and protection of many others, and although the attack map can be used to look up which pieces exactly, this will be a major hassle. So for now attacks and protections are valued equally.

Another thing I am wondering about is whether it is really necessary to tabulate the weight of all directions separately. Note that directions with weight 0 can simply be omitted from the slideMask[]. The only diversification I had in mind is to make forward slides count twice as heavy as sideway slides. This seems to depend only on direction, so that the 2d array could be factored into the product of a weight per piece and a weight per direction. But onfortunately 'forward' is a concept that is player dependent, so that white and black need different direction tables. I guess this could be solved by tabulating the 8 direction weights cyclicly in a table of 12 entries, and access it as directionWeight[d + offset], where offset = 4*stm (since stm is 0 or 1).

Code: Select all

int
MobilityCalc (int sqr, int side)
{
  static int directionWeight[12] = { 4, 4, 2, 1, 1, 1, 2, 4, 4, 4, 2, 1 };
  int mob = 0;
  int piece = board[sqr];                 // piece number
  int todo = slideMask[piece];            // set of infinite-range sliding directions
  while(todo) { // for all sliding directions
    int d = lowBit[todo];                 // extract next direction
    int dist = viewDist[sqr].d[d];        // distance to nearest obstacle in that direction
    int vec = step[d];                    // board step in that direction
    dist -= (board[sqr + dist*vec] == 1); // subtract 1 if obstacle is board edge
    mob += dist*directionWeight[d+side];  // weight per direction (side = 0 or 4)
    todo -= 1 << d;
  }
  return mob*pieceWeight[piece]; // returns mobility bonus for this piece
}

(The indicated values in directionWeight[] are of course tunable, and assumes all sliding moves have non-zero mobility bonus, the backward directions just somewhat lower (namely 1). The directions are counted clockwise, d=0 is north. Of course even with this table setting some directions could be forced to have weight 0, by not includingthem in the slideMask[]. If we intend to give all backward directions weight 0, we can dispense with the color-dependence of the direction table, and give backward as well as forward deirections all twice the weight of sideway directions, the slideMask[] (which is piece-dependent and thus color-aware) causing skipping of the backward directions.)

hgm · Post by **hgm** » Sun Mar 08, 2020 10:39 am

Interposition and Discovery

Now moving a slider is of course not the only reason that its mobility can change; it often also changes as a side effect of moving some other pieces onto or out of its path. Inferno already contains several routines to deal with discovered or blocked slider moves for the purpose of updating the attack map. It would be natural to integrate the mobility update in these. In particular, there is a routine LiftPiece(), which applies the effects of evacuating a given square. This is called from MakeMove() for the from-square, and occasionally for removing victims of side-effect captures (such as e.p. in orthodox Chess). While extending the discovered slider moves (which formerly were attacking the square), it is automatically aware of the length of the extended path, and can correct the mobility of the responsible slider. It can then return the total change of the mobility eval to the caller (which is responsible for the update of the incermental evaluation).

Code: Select all

int
LiftPiece (int sqr)
{
  int total = 0;
  int piece = board[sqr]; // piece that is lifted
  ...
  for(side=0; side<2; side++) { // for both colors
    int color = side*ASIZE;
    int csqr = sqr + color;                  // element for 'sqr' in attack map for applicable color
    unsigned int att = attacks[csqr];        // attacks by this color
    int todo = att >> 24;                    // ordinary sliding attacks
    while(todo) {
      int d = lowBit[todo];                  // direction of next incoming attack
      int dist = viewDist[sqr].d[d^4];       // up-stream view
      int vec = step[d];
      int attacker = board[sqr - dist*vec];  // attacker must be up-stream obstacle
      int left = ranges[attacker][d] - dist; // how much of its range is left beyond us
      if(left > 0) {                         // to waste less time on the ubiquitous stepper attacks
        int dnDist = viewDist[sqr].d[d];     // attack reaches beyond us, so might now hit something
        if(left >= dnDist) {                 // the discovered move reaches the target behind us
          int s = csqr + dnDist*vec;         // new down-stream target
          attacks[s] |= A_SLIDE << d;
          /****** Update mobility ******/
          if(slideMask[attacker] & 1<<d) {          // discovered move has mobility bonus
            dnDist -= (board[sqr+dnDist*vec] == 1); // discard off-board step
            dnDist *= directionWeight[d+4*side];    // correct direction for player POV!
            dnDist *= pieceWeight[attacker];        // calculate bonus
            pieceMobility[attacker] += dnDist;      // and award it to the slider
            total += dnDist;                        // and to the total mobility balance
          }
          /*** end of mobility calc ***/
        }
      }
      todo -= (1<<d);
    }

    ...

    total = -total; // causes white POV for the balance, by negating total of black pass
  }

  return total; // return change of mobility balance (white POV) to caller
}

The additional code is put in a code section that is only executed when an attack on the evacuated square has enough extra range to hit a downstream target (which can be the board edge). Stepper attacks would never do that. Tenjiku Shogi has some pieces with finite-range sliding moves, but then the range is always 2. Discovering such a move by evacuating the adjacent square only extends the move one step, usually not enough to hit anything. In case it does, the test on the slideMask[] will rule it out, as such moves are not supposed to contribute to slider mobility.

Inferno has two routines for doing the opposite, occupying a square: PutBack() for undoing an earlier call of LiftPiece(), for the purpose of UnMake(), and PlacePiece(), for performing non-captures in MakeMove(). The difference is that PutBack() can assume the data in the attack map for the square is still preserved, so it can immediately know which slider moves passed over it (and block those). A non-capture will land on a square for which the attack map still holds the data from when it was last evacuated, which in general will not be up to date anymore. So PlacePiece() will have to gather the attack info for the square all by itself, from scratch.

PutBack() is therefore somewhat similar to LiftPiece(), extracting directions of overpassing slider moves from the attack map for the square. A difference, though, is that it has no need to find the range of the reponsible sliders, and therefore identify those sliders themselves: if there is an attack on a downstream target the range was apparently enough, and the attack will be blocked no matter what. For the mobility update we would have to look for the source of the attack, to know to which piece we should assign the change in mobility bonus (and its magnitude).

So I am a bit in doubt here what is the better approach. All this work (calculating the bonus change, and figuring out the responsible piece) was already done in LiftPiece(). We could have remembered the piece, and its old mobility, so that PutBack() can simply restore it. This is a bit tricky, though, as evacuating a square can potentially discover 8 slider moves, and the old mobility of all the involved sliders would have to be saves and restored. In addition, Tenjiku Shogi features multiple captures: a Fire Demon can roast up to 8 pieces in a single move, and each of the corresponding square could discover slider moves (in addition to those discovered at the from-square of the Demon). The same slider could have passed over multiple occupied squares (compare a Rook moving along a rank over both squares evacuated by an e.p. capture in Chess), and it would be essential to put back the oldest mobility value saved for it, to properly UnMake the move.

If we take care to PutBack() the pieces in UnMake() in the reverse order as MakeMove() removed them through calling LiftPiece(), there can never be any problem. And the existing code already does that. So if we pass a pointer to an area in the UndoInfo to LiftPiece(), where there is a stack for on which it can push pieces that saw their mobility change, and the old mobility value. UnMake() can then simply run through this stack backwards for restoring everything.

That leaves the case of PlacePiece(), which in UnMake() will be undone through a call to LiftPiece(). As LiftPiece() was already set up to update the mobility for discovered sliders from scratch, there is no need for a separate save and restore to do this. We just have to let PlacePiece() apply the mobility changes, calculated from scratch. PlacePiece() works by identifying pairs of 'anti-podes' in each of the 4 ray orientations, and then attributes any attacks on one of those coming from the direction of the given square to the other (just copying the corresponding attack bit, without caring whether it was set or cleared). For mobility we would have to test at that point if there indeed was an attack that got blocked, (i.e. the copied bit was set), and whether that corresponds to a direction for which a mobility bonus is awarded. If so, we can decrease the mobility bonus of the piece in the opposite direction by the length of the blocked stretch, with the applicable weight.

hgm · Post by **hgm** » Sun Mar 08, 2020 2:17 pm

Ugly PlacePiece() - The routine PlacePiece() needs to examine board and attack maps along a ray through the given square on both sides of the latter, meaning that there is duplication of the effort of calculating their address and retrieving them if we loop over all 8 directions. So it is more efficient to just loop over 4 orientations. But this leads to duplication of most of the code in the loop, with exchanged roles of the up-stream and down-stream targets, which is ugly code-wise. For now the latter solution is chosen:

Code: Select all

  for(d=0; d<8; d++) { // look in all 8 directions for attacks to block
    int64 slide = A_SLIDE << d;                  // bit indicating normal slide in this direction
    int vec = step[d];
    int dnDist = viewDist[sqr].d[d];
    int s = sqr + dnDist*vec;                    // down-stream obstacle
    int upDist = viewDist[sqr].d[d^4];
    int t = sqr - upDist*vec;                    // upstream obstacle
    if(d<4) { // only four direction pairs
      int side;
      int64 antiSlide = slide << 4;              // bit indicating normal slide in opposite direction
      for(side=0; side<2; side++) {              // for attacks of both colors
        int color = side*ASIZE;
        int64 dnAttacks = attacks[s+color];      // attacks on down-stream obstruction
        int64 upAttacks = attacks[t+color];      // attacks on up-stream obstruction
        int64 dnSlide = dnAttacks & slide;       // slider attack coming from our direction on down-stream target
        int64 upSlide = upAttacks & antiSlide;   // and in opposite direction from other side
        attacks[s+color] = dnAttacks - dnSlide;  // both these are now blocked, remove them
        attacks[t+color] = upAttacks - upSlide;
        att[side] |= dnSlide | upSlide;          // and both now attack us instead
        upAttacks &= jumpType[d & 1];            // jump-slide attacks in current direction set (orth or diag)
        att[side] |= upAttacks & dnAttacks;      // those that attack targets on both sides must also attack us
        /*********** mobility update ***********/
        if(dnSlide) {
          int attacker = board[t];               // identify slider responsible for the attack
          if(slideMask[attacker] & 1<<d) {       // this piece gets mobility bonus in this direction
            int bonus = dnDist;                  // length of blocked stretch
            bonus -= (board[s] == 1);            // discard steping off-board
            bonus *= directionWeight[d+4*side];  // direction is corrected for player POV
            bonus *= pieceWeight[attacker];      // calculate bonus for the blocked stretch
            pieceMobility[attacker] -= bonus;    // responsible slider loses it
            total -= bonus;                      // also account total
          }
        }
        if(upSlide) {
          int attacker = board[s];               // identify slider responsible for the attack
          if(slideMask[attacker] & 1<<d+4) {     // this piece gets mobility bonus in this direction
            int bonus = upDist;                  // length of blocked stretch
            bonus -= (board[t] == 1);            // discard steping off-board
            bonus *= directionWeight[d+4-4*side];// direction is corrected for player POV
            bonus *= pieceWeight[attacker];      // calculate bonus for the blocked stretch
            pieceMobility[attacker] -= bonus;    // responsible slider loses it
            total -= bonus;                      // also account total
          }
        }
        total = -total;
        /***** end of mobility calculation *****/
      }
    }
    ...
  }

hgm · Post by **hgm** » Sun Mar 08, 2020 5:54 pm

Stepper Clusters

To encourage the engine to storm the enemy camp with a cluster of steppers, and to act against it in a timely fashion when the opponent tries this, we could use the following evaluation terms: the Piece-Square Table should penalize advance of steppers through the enemy board half, and this should be more than compensated by a bonus that only a pair of adjacent steppers can get. The latter could be achieved by awarding a (location-dependent) bonus for each stepper that is protected by another stepper. Such a bonus could go up rapidly with the size of the cluster. E.g. four steppers in a 2x2 formation could provide 6 mutual protections, while two separate pairs only have two mutiual protections (one in each pair).

So there is a large cooperative effect. Perhaps too large; the steppers only have a sub-set of the King moves, so they cannot always keep each other protected in every relative location. Also, the optimally compact 2x2 constellation must be (temporarily) abandoned for making it possible to advance the cluster, reducing the number of interactions, and if this lowers the score too much it will just freeze the cluster where it is, rather than making it push on to promotion.

So it would perhaps be better to award a large bonus only for the first protector, and a more modest bonus for each additional protector. The increase of the protection bonus for a single general with advance should then outweigh the breaking of one or two extra protections for making the advance possible.

Of course we want to calculate this evaluation term incrementally. Moving a stepper (or capturing one) can easily go accompanied by examination of the squares it attacks, to see if these are occupied by friendly steppers, so that the cluster bonus is affected. How much the creation or destruction of such a protection is worth, depends on how many other protectors the target already had, though. The easiest way to get that information is to store the number of stepper proectors of a stepper in the piece list, just as for sliders we store their mobility bonus. Removal of the stepper can then decrement this number for all the stepper neighbors it protected, while appearence similarly increments it. How much total score such an increment/decrement is worth is then dependent on the remaining number of protections, and the location of the protected piece, possibly as the product of two tabulated values. This makes the update quite cheap, especially if we arrange it such that the location-dependent factor is 0 everywhere on the own board half, and don't even bother with checking for neighbors when the removed / appearing stepper is not advanced far enough so that its targets could qualify for a protection bonus. Then the advancement test would be the only thing that is done for steppers other than the very few that ever make it into the opponent half.

Code culd look somewhat like this:

Code: Select all

  static unsigned char weight[9] = { 0, 4, 1, 1, 0, 0, 0, 0, 0 };
                                    // W   B
  static signed char clusterTab[] = { -1, -1, // 1st rank
                                      -1, -1,
                                      -1, -1,
                                      -1, -1,
                                      -1,  0, // white Pawn rank
                                      -1,  5,        
                                       0,  3,
                                       1,  2,
                                       2,  1,
                                       3,  0,
                                       5, -1,
                                       0, -1, // black Pawn rank
                                      -1, -1,
                                      -1, -1,
                                      -1, -1,
                                      -1, -1  // 16th rank
                                    };

if(piece >= STEPPERS) {              // mover is stepper
  int total = 0;
  int rank = sqr >> 4;               // this is actually 2*rank
  int color = piece & 1;
  int bonus = clusterTab[rank+color]; 
  if(bonus >= 0) { // on rank where neighbor could have bonus
    todo = dirMask[piece];           // directions we move in
    while(todo) { // loop through those
      int d = lowBit[todo];          // a direction into which we move
      int to = sqr + step[d];        // to-square of that move
      int target = board[sqr+vec];   // occupant of square we hit
      if(target > STEPPER &&         // it is a stepper too
         !(piece - target & 1)) {    // and a friend
        int prot = protection[target]--;
        bonus = clusterTab[(to>>4)+color];
        total -= bonus*weight[prot];
      }
      todo -= 1 << d;
    }
  }
}

hgm · Post by **hgm** » Mon Mar 09, 2020 8:44 am

The code given above is wrong, or at least incomplete: it takes into account the effect of an appearing / disappearing stepper on neighboring steppers, but it does not account for the protection of the stepper itself. And the latter doesn't necessarily come from directions in which it moves. So the whole idea of following its moves to see what is there is a bust. We must examine all neighbors, because there can be a protector there.

Fortunately there still is a way to do this efficiently: Inferno keeps track of square occupation through 'bit-strips', which are basically white and black bitboards for (overlapping) 16x3 zones centered on a rank. In the bitstrip for the rank of the stepper we can thus mask out a 9x9 area (also killing the bit for the stepper itself), and then use bit-extraction to loop over the friendly neighbors. And then examine whether these are steppers, and whether we protect them, or they protect us.

Once on the enemy half a piece will probably not have many friendly neighbors, so that this will not very expensive, unless there indeed is an advanced stepper cluster. (And when not on the enemy half, or for non-steppers, we don't have to do it at all.)

I still don't like it very much that we basically have to do this 4 times per move: both for the from-square and the to-square, and then both in MakeMove() and UnMake(). It seems something could be gained here by saving and restoring old values.

Unfortunately the info on a 3x3 area in the bitstrips does not have the same layout as that of a summary of the move directions kept in dirMask[piece]. It cannot have, as it also contains info on the central square, and is thus spread over 9 bits, while the dirMask[] only has 8 bits. E.g. the bitstrip for 3rd rank would contain info on the squares

msb - ... h4 h3 h2 g4 g3 g2 f4 f3 f2 e4 e3 e2 d4 d3 d2 c4 c3 c2 b4 b3 b2 a4 a3 a2 0 0 0 - lsb,

where the neighborhood of e3 is indicated in red. While the dirMask[] would indicate the targets of a stepper on e3 as

msb - e4 f4 f3 f2 e2 d2 d3 d4 - lsb.

To use the dirMask as an indication of whether a stepper attacks a given neighbor square, or is attacked by a stepper there, we have to translate the bit order in the bitstrip to that in the dirMask. We might as well do this bit by bit, as we will have to test each friendly neighbor for being a stepper (and protecting us back) separately anyway.

To extract friendly neighbors from the bitstrip we shift the relevant 3x3 area to the 9 lowest bits, and mask away the others. The number 0-8 of the extracted bit can than be used to index tables that indicate the move (in dirMask layout) we need to reach that square, or the reverse one. (Looking up the latter separately is probably faster than calculating it by nibble-swapping; the required table only measures 9 bytes. By coincidence it is the table required for the reverse moves is just the table for the moves itself read backwards!) This can be ANDed with the dirMask of the central piece and of the neighbor, respectively (after verifying the neighbor as a stepper), to check for protective interaction. If there is one we have to update the protection count of the neigbor, and add up all protections of the central piece.

The routine AddMoves(), which is called by LiftPiece(), PutBack() and PlacePiece(), already accesses the relevant bitstrip, so this would probably be the best place to put the cluster-eval code. The following code could be put there:

Code: Select all

int burnStep[] = { V5, V6, V7, V4, 0, V0, V3, V2, V1 };              // existing table of steps to neighbor squares
unsigned char burnDir[] = { 0x20, 0x40, 0x80, 0x10, 0, 1, 8, 4, 2 }; // direction sets for steps above
signed char deltaRank[] = { 2, 2, 0, -2, -2, -2, 0, 2 };             // (2x) rank change of the above steps
unsigned char protWeight[] = { 0, 4, 5, 6, 6, 6, 6, 6, 6 };          // relative bonus for given nr of protectors
signed char deltaProt[] =      { 4, 1, 1, 0, 0, 0, 0, 0 };           // how the values above change on in/decrease of that number
                           

  int bonus = 0;                        // total cluster-bonus change
  if(piece >= STEPPER &&                // piece is a stepper
     1 << 16*stm+r & 0xFF8001FF) {      // on rank where it could have/cause cluster bonus
    int protectors = 0;                 // counter for number of protectors
    int i = 2*r + stm;                  // for indexing pairBonus, which alternates white and black values
    int ourMoves = dirMask[piece];      // bitmap of directions we move in
    int todo = burnMask[r][stm];        // relevant bitstrip (rank r and file f already calculated)
    todo >>= 3*f+3;                     // collect bits for 3x3 area in lowest 9
    todo &= 0x1EF;                      // mask away others and central square
    while(todo) { // loop over friendly neighbors
      int s = sqr + neighbor[n];          // neighbor square occupied by friend
      int friend = board[s];              // get it
      if(friend >= STEPPER) {             // we found an adjacent stepper
        if(ourMoves & burnDir[n]) {       // we protect neighbor
          int rank = i + deltaRank[n];    // (2x) rank of neighbor (on which bonus depends)
          int p = protection[friend];     // old protection count of neighbor
          protection[friend] = p + sgn;   // update it
          bonus += pairBonus[rank] *      // calculate associated bonus change, based on position, ...
                   deltaProt[p+(sgn>>1)]; // ... and protector count (sgn>>1 = 0 or -1)
        }
        if(dirMask[friend] & burnDir[8-n]) // neighbor protects us
          protectors++;                    // count it
      }
      todo -= 1<<n;
    }
    protection[piece] += sgn*protectors;                // should start or end as 0
    bonus += protWeight[protectors]*pairBonus[2*r+stm]; // bonus change for piece itself
    bonus *= sgn;
  }

Most of this is straightforward; the most tricky thing is that the AddMoves routine gets passed a sgn parameter +1 or -1 depending on whether it has to make the piece appear or disappear, but that the value associated with an increment or decrement of the protector number has to be taken from deltaProt[] at the lowest of the two. If there had been separate code for incrementing and decrementing this would just be done bey calculating the bonus before or after the count change, respectively. But now the index of deltaProt has therefore to be explicitly corrected for the case at hand.

There still is an issue of how to communicate the calculated bonus change to where it can be used to adjust the incremental evaluation, as there are several levels of function calls between MakeMove() (where it could be added to the 'gain' field of the UndoInfo struct) and AddMoves().

hgm · Post by **hgm** » Mon Mar 09, 2020 5:29 pm

Some more thoughts on this:

Perhaps it would be better to use separate code for sgn == 1 (placing a piece) and sgn == -1 (removing it). When removing it there would be no need to count the protection of the removed piece: it is already known, stored in protections[piece]. And we just have to subtract the bonus for that, and then set it to 0. No need to look up the moves of the neighbors (through their dirMask[]) to see if they protected us.

So we could actually just look at the squares the removed piece attacked, to test if there was a friendly stepper there, through code like I first gave. Only when we place a piece (sgn > 0) we would have to count how many protectors it has. Of course we would have to adjust the protection[] counts of the neighbors in all cases, but the code for that would simplify a bit when we know whether we are incrementing or decrementing it.

A way to speed up the removal part further would be to first intersect the set of squares attacked by the removed piece with the set of friendly neighbors. To do this with an AND operation the format of one of the sets would have to be transformed first, however. We can translate the neighborhood from the bit strip by a lookup in a table of 512 bytes. Or themove set (dirMask[piece]) by a table lookup in a table of 256 9-bit items (so likely this would have to be ints to make it efficient). Both uncomfortably large. The move sets are all left-right symmetric, though. So some of their bits are redundant, and it torns out that bits 7, 6 and 5 (corresponding to directions NW, W and SW) are just copies of bits 1, 2 and 3 (NE, E and SE). So we can mask with 31 to keep only the lowest 5 bits, and use the result as index in a table of 32 int entries that represent the move set in bitstrip format. So with an extra

todo &= moves2squares[ourMoves & 31];

we would have limited the examination of neighbors to friendly pieces we protect. The moves2squares[] table needs only be initialized for those move combinations that actually appear in the game. The table is still 32 x 4 = 128 bytes. Not sure if it would pay (w.r.t. cache pressure) to shrink it further at the price of some post-processing: the 9-bits bitstrip section for the 3x3 area we need will always have a 0 in bit in the center. We could squeeze that out in the table to fit it into a byte, and process the retrieves value as x + (x & 0xF0) to slip it back in.

Raphexon · Post by **Raphexon** » Wed Oct 14, 2020 3:29 pm

Sorry for invading your thread, but is there a download available for Inferno?

hgm · Post by **hgm** » Wed Oct 14, 2020 5:28 pm

Alas, no. Inferno is not public yet. I do plan to release it one day. But I do not consider it finished yet, and I only work on it off and on. I did implement the mobility evaluation, and used it in the yearly correspondence championship. But it became clear to me that the evaluation is no good. It attaches way to much weight to sideway and backward moves. And this sort of creates an impenetrable barrier for the stepper pieces; these are trapped on its own back rank out of fear for obstructing its own sliders. While the only way to make progress is to launch a Pawn + stepper assault backed up by sliders.

What is needed is an anisotropic evaluation of the mobility. But such a thing has to be compensated by a piece-square gradient in the same direction. Otherwise, when you award forward mobility higher than backward, it will encourage it to pull back all its sliders as far as possible. A mobility-independent bonus for advancing should compensate that.

I have also not yet decided upon the rules I will finally implement. Points of doubt are in particular whether promotion of Water Buffalo to Fire Demon should immediately incinerate all surrounding enemies or not. And whether jump-capture of the King by generals should be allowed. My original intention was to use rules that did not allow that, because I considered the rules that would allow it to make the game unplayable. I might have to revise that opinion, however. Extensive analysis of opening lines has not really revealed any lines that would produce a huge advantage.

The Inferno Thread, part II

The Inferno Thread, part II

Re: The Inferno Thread, part II

Re: The Inferno Thread, part II

Re: The Inferno Thread, part II

Re: The Inferno Thread, part II

Re: The Inferno Thread, part II

Re: The Inferno Thread, part II

Re: The Inferno Thread, part II

Re: The Inferno Thread, part II

Re: The Inferno Thread, part II