The Inferno thread

hgm · Post by **hgm** » Sat Apr 01, 2017 3:09 pm

Futility pruning

In first instance I will not be doing any futility pruning in d >= 2 nodes. This because there isn't really a useful limit to the damage Fire Demons can do in a single move, so that a non-capture at d=2 that creates a burn threat could easily gain 6 Queens in the first ply of QS. Moves that do not top alpha at d <= 1, however, will be met with stand pat, no matter how much they threaten.

QS vs 1-ply

There is an interesting iteraction between futility pruning and the effective depth of a search. Searching any move is by definition at least a d=1 search on that move, as you have to play the move itself in order to earch it. So QS is really doing d=1 searches. But what distinguishes it from a 1-ply search is that it doesn't do them on all moves. It prunes the captures, and substitutes a stand-pat score for them. But 1-ply searches also prune moves, if they are futile. And non-captures are the most futile of all. So a QS search under conditions where non-captures would have been futile (because currentEval < alpha - MARGIN) is the same as a 1-ply search for that window, as the stand-pat score (currentEval) will have been ignored. When a capture in QS produces a beta cutoff, it also does not matter whether you intended to search the non-captures or not.

This is relevant for our IID scheme, to distinguish the QS from the 1-ply iteration. If the minimum requested search depth is >= 1 ply, then we don't have to bother with QS, and can start IID with a 1-ply search. If the maximum requested depth is QS, we never have to search non-captures. But if the miimum depth is QS, and we might have to deepen when this fails low, we somehow need to make a conditional transition from QS to 1-ply after we are done with the captures.

This will be implemented as follows:

Before any iteration starts we will calculate the futility threshold, (alpha - currentEval - MARGIN) translated to units used in the MVV/LVA scoring. This will be passed to the staged move generator MoreMoves(), so that this can stop its victim-by-victim move generation when it gets down to the futile victims. In this case it will return a negative "out-of-moves code" early. (Normally it returns the number of generated captures.) The QS / 1-ply iteration would be terminated by this; when the iteration depth reaches 2 ply the futility threshold would be reset to 0, so that future calls to MoreMoves() when the current (partial) move list runs out will continue generating captures on low-valued victims, and non-captures.

A negative value for the futility threshold will be used as a kludge to tell MoreMoves() that it should refrain from generating non-captures even when these are not futile. This will cause an early end to the QS / 1-ply iteration, making it a QS iteration. To handle the case where conditional deepening was requested based on whether QS failed low, we have to recognize this situation (from the futility threshold value), and if alpha <= startAlpha (fail low) and maxDepth > QSDEPTH reset the futility threshold to 0, and call MoreMoves() again (which now will generate non-captures), to finish the iteration we already did the captures for as a 1-ply iteration.

Code: Select all

// before IID loop
  int futile = SCALED&#40;alpha - currentEval - MARGIN&#41;;
  if&#40;futile < 0&#41; futile = -&#40;minDepth <= QSDEPTH&#41;; // when non-captures not futile, QS must be requested explicitly


// at top of IID loop
    if&#40;iterDepth > QSDEPTH+1&#41; futile = 0; // switch off futility pruning after 1-ply search


      // pick next move to search
      if&#40;currentMove >= msp&#41; &#123; // out of generated moves, generate next batch
        sort = MoreMoves&#40;currentMove, &stash, areaMaps, futile&#41;;  // generates moves and returns how many need sorting
        if&#40;sort < 0&#41; &#123;                                            // all &#40;non-futile&#41; moves exhausted
          if&#40;futile < 0&#41; &#123;                                        // kludge to indicate we are doing QS
            if&#40;maxDepth <= QSDEPTH&#41; &#123;                             // QS was all that is requested
              if&#40;resultDeph > QSDEPTH&#41; resultDepth = QSDEPTH;     // so we won't search captures, making this a QS result
              break;
            &#125; else  if&#40;alpha > startAlpha&#41; break;                 // deepening requests are only serviced on fail lows
            sort = MoreMoves&#40;currentMove, &stash, areaMaps, 0&#41;;   // generate non-captures after all
            futile = 0;                                           // and flag we are now at 1 ply
            if&#40;sort < 0&#41; break;                                   // still no moves &#40;can this happen?)
          &#125; else break;                                           // termiates non-QS iteration
        &#125;
      &#125;

Pruning of bad captures

A remaining problem is that QS will not search all captures, but prune the bad ones. In Tenjiku Shogi, which is very prone to QS search explosion, I will expand the concept of 'bad capture' to anything that is not obviously gaining material. Only captures of higher or unprotected victims will be considered good. So when we finish a search of captures at QS level, and then continue searching non-captures, the 'bad captures' will still not have been searched. There seems little harm in pruning bad captures at 1-ply as well as in QS, however, and only start searching them from 2-ply on. (This might want to make me reconsider searching equal captures in QS, though, as it does seem bad to leave those out at 1 ply.)

Bad captures will have to be generated even in IID iterations that would prune them, however, because later iterations will want to search them. This will be implemeted by generating bad captures with the lowest possible sort key (which for good captures would hold the MVV/LVA value). After MoreMoves() has placed the next batch of captures (all on the next 'victim group', i.e. a set of victims with the same exchange value) on the move stack, the move loop will extract the one with the lowest sort key from it to search. As all moves have the same victim, this is the one with the LVA. But if it gets to key 1, it will just skip the remaining moves of the batch if the depth is 1-ply or less.

hgm · Post by **hgm** » Sun Apr 02, 2017 12:16 pm

The previously posted Search() code contained some accuracies, as at the time it was completely untested. (E.g. I forgot to pop the move list off the move stack on returning, and the validity depth ignored moves that failed low.) RAther than posting it again I edited the old post to correct the errors (one of the perks of being a moderator...). This now contains a tested routine, which was able to find a mate in 4 in KRK. That is, after I swiched off null move, as with null move KRK is pretty hopeless, especially if you have no piece-square tables yet. it just randomly picks one of the possible lines as PV, which of course will lead nowhere, and any alternative that could force mate if the bare King only where forced to move will then be handsomely refuted by null moves.

I still have a dilemma for how to handle Fire-Demon burns. Moves with the Fire Demon, even just no-captures that reposition them, are amongst the most dangerous in Tenjiku Shogi, because the FD has enormous forward forking power, and there typically are may targets where it could burn several Queens. So I want a search that pus emphasis on such moves, at the expense of normal captures.

But I cannot search only Demon moves + captures in the last two ply before QS. Because that would leave the opponent defenseless. Very often the Demon attacks have to be countered with a non-capture, to limit the damage (move one of the attacked Queens away), or block the attack by interposing. So two ply before QS Demon non-captures are important, so they ca be cashed in QS, but 1 ply before QS defensive moves should be search to not overestimate the value of the Demon attacks.

hgm · Post by **hgm** » Thu Apr 06, 2017 11:39 am

Incredible! Everything seems to work (after a few days of debugging to iron out the bugs that caused crashing). So it seems I now have a working Tenjiku-Shogi engine. The first tests are very encouraging; it reaches depth 10 in about 3 min during the opening phase, with a QS limited to 8 ply. And it does find all the moves from the little opening theory that exists, that way.

There is still lots to be done, though:
* Piece-Square Tables: currently the eval is material only. (The goal in the opening phase of Tenjiku Shogi is to gobble up as much material as possible with the Fire Demons, though, so the engine is already usable there.)
* It doesn't keep track of game phase. (But most Tenjiku games are decided in the opening.)
* Killer heuristic not yet implemented
* History heuristic not yet implemented
* Pondering is not implemented. (But for now I only need it for analysis.)
* Promoted pieces are wrongly valued. Piece values will suck anyway; I obtaied them from a table on the tenjiku-shogi website, and even if there would be some truth in that, they presomably are for the primordial (promotable) pieces. Promoted pieces that move the same will be worth less (sometimes much less), because they cannot promote a second time. E.g. a Water Buffalo promotes to the all-powerful Fire Demon, and a large part of its listed value (11, vs Q=8) is due to that. But a Water Buffalo obtained from promoting a 'Side Soldier' is just a Queen with motion along files restricted to 2 squares. Currently I don't take that difference into account in the MVV/LVA move ordering, and only through a hack in the evaluation (making all unpromotable pieces 175 cP less valuable than the promotable version). In the opening phase promoted pieces usually do not occur, however.

Transposition Table

Currently the engine uses a simple two-way hash table, with a 'shallowest of two' replacement scheme. There is no aging yet. So during a game the depth-preferred half of the table will fill up with useless positions.

Aging doesn't work very well with analysis, however, unless you consider an etire analysis session a single search. In which case you again have the problem that the depth-preferred part of the table will at some point saturate. So I am thinking of using a mixed scheme: each hash bucked should hang on to the largest depth in it, subject to aging, where moving and thaking back moves during analysis would not alter the search number. In addition there should be at least two other entries in the bucket: an always-replace entry, and one that hangs on to larger depth somewhat longer, but not indefinitely.

Currently the hash entries are 16 byte: 8 bytes hash key, 4 bytes move, 2 bytes score, 1 byte depth, and 1 byte flags. So 4 entries could fit into a 64-byte bucket that fills a single cache line. Only 3 bits of the flags byte are used (2 for bound type and in-check), so the remaining 5 bits could be used as age counter. Some of the key bytes are redundant, however, because the index was derived from them. It seems reasonable to assume the hash table will at least have 2^24 = 16M entries (256MB hash), so the signature really only has to be 5 bytes. The hash move is currently not vetted, and with all the incremental updating faulty moves could have disastrous consequences, so I do want to have as long a key as possible.

The move could fit in 3 bytes. (In the move list the highst byte is used as a sort key, but that doesn't have to go into the hash table.) Packing the 5-byte signature and 3-byte move in a single int64 would reduce the entry size to 12 bytes, so that 5 entries would fit in a cache line, with 4 bytes to spare. I can imagine that in the future I want to store more in-check information, though, like the square of the checker. But i that case it might be possible to do away with the flags byte, and hide the bound-type flags in the move. (The move encoding uses 0x88-type square numbers, and there are some unused bits in those.)

So it seems reasonably 'future proof' to take 5 entries of 12 bytes in a bucket (5 signature, 3 move, 2 score, 1 depth, 1 flags), plus 4 age counters. One of the entries would be 'always replace', and would not need an age counter. A bucket index and a 'slot number' N = 0-3 would be derived from the hash key, to map it to one of the four depth-preferred entries. These entries would be used in two pairs, {0,1} and {2,3}. The position can only be stored in the pair to which it maps, or in the always-replace slot of the bucket. So in slot N, N^1 or 4.

If it is in neither of those 3, it will replace one of those. The entry the key primarily maps to of {N, N^1} will be replaced if its age counter declares it obsolete. (And this will never happen during analysis.) As the next choice, the entry with the lowest depth of the two will be replaced if the new depth is at least as high. If the new depth is exactly one lower, it will replace it with a certain probability, e.g. on one out of 4 occasions, controlled by a global counter ('undercut replacement'). In other cases it ges to slot 4.

This scheme keeps the highest-depth non-obsolete entry ever seen in each pair. The other 3 slots in the bucket will be used for 'temporary storage'. The undercut replacement will prevent that the second member of each pair, in absence of aging, would accumulate ever larger depth until it also clogs the table, expelling the currently active search tree to just the always-replace slots. They would tend to protect deeper entries, but occasionally (with a frequency deminishing with their depth) an entry would arrive that undercuts them, so that the depth stored in that slot can also go down in absence of aging.

hgm · Post by **hgm** » Sat Apr 15, 2017 11:52 am

I have been silent for some days, because everything now works so well that I have been very busy actually using the engine to develop opening theory. There is much that still has to be done (e.g. the evaluation now is material only, and then I mean just piece values, not even piece-square tables). But it is already playing in the correspondence tourney I had entered it in before I started writing it.

Note that I revised the Search() code I posted earlier again, because during use a few more bugs and inconveniences became apparent. The worst being that to determine if the score to go into the hash was a upper bound by comparing to the alpha after the iteration, rather than the original alpha of the node. This of course would lead to the upper-bound flag never be set.

. Surprisingly enough this only degraded performance, and I noticed it because during backward analysis it never seemed to be able to get exact hits on just analyzed positions.

Root Search

I am still in doubt about one aspect of the current routine, which acts both as root search and in deeper nodes, for the root-specific code. (Put into "if(ply == ROOT)" clauses in various places.) In particular storing the hash entry only at the end of Search() is inconvenient in the root, because during analysis you never get there: the search will always be aborted by arrival of input, and this of course suppresses hash storing.

So for the root it would be better to store the hash at the end of every iteration, so that when you take back a move during analysis, the root position will have been stored, and available for hash hits. The iteration end is skipped when a beta cutoff occurs, however. This cannot happen in the root (where beta = +INF), but in internal nodes it is the most essential case where you need to store the hash (for the cut move). And the structure of Search() now is

Code: Select all

Search&#40;depth&#41;
&#123;
  for&#40;iterDepth=1; iterDepth<=depth; iterDepth++) &#123;
    for&#40;ALL_MOVES&#41; &#123;
      ...
      if&#40;score >= beta&#41; goto cutoff;
    &#125;
    // fail low &#40;or in window&#41;
    if&#40;ply == ROOT&#41; HashStore&#40;B&#41;;
    if&#40;resultDepth > iterDepth&#41; iterDepth = resultDepth;
  &#125;
 cutoff&#58;
  HashStore&#40;A&#41;;
&#125;

So hash storing is needed both in location A and B, which is a bit inelegant. Perhaps it would be better to just break out of the loop over moves in case of a beta cutoff: the depth bootstrapping in combination with the self-deepening of cut moves would always increase the IID depth to the maximum required to break out of the IID loop as well. Control would then also pass the code directly after the move loop in case of a fail high, though. So that code would have to be beefed up a bit, to not do undesired things in that case. (I guess that setting reduction = redRequest = 0 in the "if(score >= beta)" clause would be sufficient for that.)

An Example

A nice test case was the opening position:

Here a Fire Demon (on i6, worth 28) has stepped into the aim of a Flying Queen (on i13, worth 13), so it seems black has a very profitable capture. The Demon is poisoned, however, and taking it will get black checkmated in 6. The reason is that the Flying Queen is still essential in defending the King here: it is the only piece not transparent to the Flying Queen of white, and after it is gone the latter can checkmate the suffocated black King. It is not possible for black to create an 'air hole' for his King in time.

Code: Select all

 11	-9.51	761.9M	37&#58;42.03	<done>
 11	-9.51	684.7M	33&#58;59.00	g12g11 i6c12! d14a11 c12a11! f13k8 h3j5 k8g4? j5p11 i13i1 h2i1,i1h2 g13g3+
 11	-15.46	173.5M	8&#58;43.42	j13j4+ i3j4 i12i11 i6c12! h13n7 i4j3 n7j3 j4j3 d14a11 c12e14! g14e14! g4g13+ e14g14!
 11	-159.94	54.0M	2&#58;40.37	i13i6 i5i6 j13j4+ h4j4 j15l15 i4p11 m13p10 j4i4 i16j15 i4o10 p10o10 p11o10 g13g4?
 10	+2.30	51.9M	2&#58;35.67	<done>
 10	+2.30	10.0M	0&#58;30.51	i13i6 i5i6 j13i13 i4p11 p12p11 h4i4 i13i4+ g4i4 h13i13 g3h6 g14h13
  9	+8.55	8.9M	0&#58;27.45	<done>
  9	+8.55	1.6M	0&#58;04.36	i13i6 i5i6 j13j4+ h4j4 j15l15 i4o10 p11o10,o10p11 g4i4 h13i13 g3h6 g14h13 j4j14
  8	+9.53	363189	0&#58;01.09	i13i6 i4i6 j15l15 h4i4 i16j15 i4j3 j15i16 j3j14 i15j14,j14i15 g4g13?
  7	+9.53	205645	0&#58;00.67	i13i6 i4i6 j15l15
  6	+9.53	24235	0&#58;00.17	i13i6 i5i6 j15l15 h4i5 i16j15 i5e9 g13g4? e9j14 i15j14,j14i15
  5	+24.50	11673	0&#58;00.06	i13i6 i5i6 j13j4+ i3j4,j4i3 g13g4?
  4	+24.50	1610	0&#58;00.01	i13i6 i5i6 j13j4+ i3j4,j4i3
  3	+24.50	664	0&#58;00.00	i13i6 i5i6 g13g4?
  2	+24.50	8	0&#58;00.00	i13i6 i5i6
  1	+24.50	5	0&#58;00.00	i13i6 i5i6

We see that the engine needs 11 ply, reached within 3 min on my 2.4GHz i3 (where it does about 330knps) to see this. And the alternative moves get a quite egative score as well, meaning that the Demon pseudo-sac isn't just an opening trap, but actually a very strong move. The reason is the Fire Demon has a fork attack on c12 (which is unprotected after all adjacent pieces are burned away, and thus gains a Falcon, Rook and Climber, plus 3 Pawns), and i12 (which burns the three most-valuable flying sliders, worth 13+12+10=35, plus 3 Pawns in exchange for the Fire Demon).

Rumor has it that the strongest competing AI ('mshogi', by John Weathers) already has problems recognizig mate-in-2 threats (in 10 min), so this is pretty encouraging!

hgm · Post by **hgm** » Sat Apr 15, 2017 7:01 pm

Now that everything seems to work, the next step is to get a more elaborate evaluation. Things like piece-square tables, game stage, phase-dependent piece values, etc.

Evaluation of Promotability

Promotability of pieces is a very important strategic asset in games like Tenjiku and Chu Shogi. At some point the board population thins to below the limit where the players can effectively defend their own promotion zone against intrusions, and all promotable sliders will then promote for sure. If that happens when, say, 80% of the pieces are gone, it means that each individual piece in the opening phase has a 20% chance of eventually promoting. So if the gain on promotion is substatial, the anticipated promotion prospects can contribute significantly to the piece value eve in the opening phase.

E.g. a Water Buffalo is just slightly inferior to Queen (its sideway slide is limited to a range of two squares), but it promotes to an all-powerful Fire Demon worth more than 3 Queens. So the value goes up from ~8 (with Q=9) to ~28, a gain of 20. Even with 20% probability, that makes the expected gain 4, and the opening value thus 8+4 = 12. And this will go up as the game advances: by the time half the pieces are gone, so the remaining pieces will have 40% probability to belong to the 20% that will surely promote, its value would be 16. An unpromotable Water Buffalo (i.e. a promoted 'Side Soldier', which moves as Water Buffalo) would of course always be worth 8, irrespective of game phase.

It is an interesting problem how to quantify the promotion prospects. It will not only be dependent on how much the board population has thinned, or even how much the strength of the opponent has thinned, but also on whether you are ahead or behind. With equal opponent material, it will be easier for you to force promotions when you are ahead than when you are behind. And the interesting thing is that ahead or behind here means in terms of actual playing strength, not in terms of anticipated promotion gain.

So it seems necessary to keep track of one game-phase variable for each player, reprensenting his tactical power available for defending the promotion zone. And the total promotion gain you army would reap if every slider in it promoted. How much of the latter then has to be added to your material value would then increase with decreasing tactical power of the opponent, and increase with the difference in tactical power in your advantage. The latter should have a pretty high weight: if you are behind in actual material, the anticipated promotion gain isn't worth very much, as it becomes unlikely you will be able to force any promotions before he can. After which your tactical-power disadvantage becomes even larger, making it even more impossible for you to promote anything.

Managing your promotion prospects is a very important aspect of the middle game.

The Inferno thread

The Inferno thread: Futility

Re: The Inferno thread: Futility

The Inferno thread: Transposition table

The Inferno thread: First use

The Inferno thread: Evaluation