iterative deepeniing and branching factor

Michael Sherwin · Post by **Michael Sherwin** » Sun Jul 15, 2007 11:02 am

Michael Sherwin wrote:
How big the side to move bonus should be is a difficult question, however. Should it be a constant, or should it depend on other components of the evaluation? I currently use a constant value of 0.2 in the middle game and 0.08 in the endgame, but these numbers are completely untuned (the only thing I have verified is that these numbers perform better than no side to move bonus at all).
RomiChess also gives a 0.2 move bonus. I did test other values and 0.2 was best.

A small correction.

I do not award the side to move bonus in the eval function.

It is applied in the qsearch which in RomiChess is called the CaptSearch.

It is added to the standpat score returned from the eval to see if that will cause a cutoff. If it does then beta is returned. If not then the score remains unchanged.

Strange, I know!

Code: Select all

s32 CaptSearch(s32 alpha, s32 beta) {
  s32 score;
  s32 Ply;
  trees *node;

  qnodes++;
  Ply = ply - base;
  score = Eval();
  if(InCheck(wtm)) score -= 80;
  if(score + 20 >= beta) return beta;
  if(CaptGen()) return MATE - Ply;
  if(t == h->t) return score;
  if(score > alpha) alpha = score;
  for(node = h->t; node < (h+1)->t; node++) {
    Sort(node, (h+1)->t);
    MakeMove((moves *)&node->m);
    node->score = -CaptSearch(-beta, -alpha);
    TakeBack();
    if(node->score > alpha) {
      if(node->score >= beta) return beta;
      alpha = node->score;
    }
  }
  return alpha;
}

Don · Post by **Don** » Thu Mar 19, 2009 12:45 am

bob wrote:
...

It isn't a new idea. "BeBe" went by twosies in the middle 80's as one example. I have encountered others that did so. One reason was to eliminate the odd/even ply flip-flopping of the best move when on odd ply searches the program gets one extra move compared to the opponent, while in the even-ply searches both get the same number of moves.

...

I know this is an old post, but I just run across it ...

Tony regretted this, as he once told me, but he claimed it would not be simple to fix. In my opinion it was a pretty big mistake that probably prevented him from winning tournaments, as his machine was pretty impressive and seemed to always finish near, but not quite IN first place. In order to compensate he added a feature he called "verify stand pat" in order to make his program take more time! He was in a situation where he tended to finish a 7 ply search really quickly, but couldn't quite make it to the 9 ply search (he always did odd ply.)

I've played around with it and I think you throw away too much information. It does work in checkers though ...

Don · Post by **Don** » Thu Mar 19, 2009 12:48 am

Tord Romstad wrote: I have personally never tried anything else than exactly one ply per iteration, but I think it might be worth some experiments.
Tord

Presumably you could go 1/2 ply if your program is set up that way. It's probably more useful if you have a few kinds of partial extensions and not just check extensions.

mjlef · Post by **mjlef** » Thu Mar 19, 2009 3:43 am

Don wrote:
bob wrote:
...

It isn't a new idea. "BeBe" went by twosies in the middle 80's as one example. I have encountered others that did so. One reason was to eliminate the odd/even ply flip-flopping of the best move when on odd ply searches the program gets one extra move compared to the opponent, while in the even-ply searches both get the same number of moves.

...

I know this is an old post, but I just run across it ...

Tony regretted this, as he once told me, but he claimed it would not be simple to fix. In my opinion it was a pretty big mistake that probably prevented him from winning tournaments, as his machine was pretty impressive and seemed to always finish near, but not quite IN first place. In order to compensate he added a feature he called "verify stand pat" in order to make his program take more time! He was in a situation where he tended to finish a 7 ply search really quickly, but couldn't quite make it to the 9 ply search (he always did odd ply.)

I've played around with it and I think you throw away too much information. It does work in checkers though ...

Back them, pre-LMR days, branching factors were about 6 per additional ply. Now with branching factors around 2, much like checkers, it might pay off. It seems with LMR, each additional ply of depth takes about double the previous search time. So if a 10 ply search takes 1 second, an 11 ply takes about 2 seconds more, and the 12 ply 4 more seconds. By skipping the 11 ply search (for example), you would save about 2 out of 7 seconds. So it might pay off if you were pretty darn sure the 12 ply search would search the first move. And if the move ordering is not bad since it misses one of the searches.

If anyone tests this, I would love to hear the results.

Don · Post by **Don** » Thu Mar 19, 2009 4:03 am

mjlef wrote:
Don wrote:
bob wrote:
...

It isn't a new idea. "BeBe" went by twosies in the middle 80's as one example. I have encountered others that did so. One reason was to eliminate the odd/even ply flip-flopping of the best move when on odd ply searches the program gets one extra move compared to the opponent, while in the even-ply searches both get the same number of moves.

...

I know this is an old post, but I just run across it ...

Tony regretted this, as he once told me, but he claimed it would not be simple to fix. In my opinion it was a pretty big mistake that probably prevented him from winning tournaments, as his machine was pretty impressive and seemed to always finish near, but not quite IN first place. In order to compensate he added a feature he called "verify stand pat" in order to make his program take more time! He was in a situation where he tended to finish a 7 ply search really quickly, but couldn't quite make it to the 9 ply search (he always did odd ply.)

I've played around with it and I think you throw away too much information. It does work in checkers though ...
Back them, pre-LMR days, branching factors were about 6 per additional ply. Now with branching factors around 2, much like checkers, it might pay off. It seems with LMR, each additional ply of depth takes about double the previous search time. So if a 10 ply search takes 1 second, an 11 ply takes about 2 seconds more, and the 12 ply 4 more seconds. By skipping the 11 ply search (for example), you would save about 2 out of 7 seconds. So it might pay off if you were pretty darn sure the 12 ply search would search the first move. And if the move ordering is not bad since it misses one of the searches.

If anyone tests this, I would love to hear the results.

I'm pretty sure this is more about information loss than branching factor, although it's tempting to think that with a BF of 2 this might work after all. I think it's clear that your search will not be as efficient, but the question is really if you will save more time avoiding the intermediate depths than you will with less efficient searches.

We don't have to guess, it would be easy to do this experiment. Maybe I will try and then report back to the forum.

Of course taking an existing program and simply doing this test may not be quite fair, because if you were determined to make this work it's likely that you might do some things differently. Tony probably tuned his program to work as well as possible with the skipping strategy.

Cardoso · Post by **Cardoso** » Tue Mar 24, 2009 12:08 am

Don wrote:
mjlef wrote:
Don wrote:
bob wrote:
...

It isn't a new idea. "BeBe" went by twosies in the middle 80's as one example. I have encountered others that did so. One reason was to eliminate the odd/even ply flip-flopping of the best move when on odd ply searches the program gets one extra move compared to the opponent, while in the even-ply searches both get the same number of moves.

...

I know this is an old post, but I just run across it ...

Tony regretted this, as he once told me, but he claimed it would not be simple to fix. In my opinion it was a pretty big mistake that probably prevented him from winning tournaments, as his machine was pretty impressive and seemed to always finish near, but not quite IN first place. In order to compensate he added a feature he called "verify stand pat" in order to make his program take more time! He was in a situation where he tended to finish a 7 ply search really quickly, but couldn't quite make it to the 9 ply search (he always did odd ply.)

I've played around with it and I think you throw away too much information. It does work in checkers though ...
Back them, pre-LMR days, branching factors were about 6 per additional ply. Now with branching factors around 2, much like checkers, it might pay off. It seems with LMR, each additional ply of depth takes about double the previous search time. So if a 10 ply search takes 1 second, an 11 ply takes about 2 seconds more, and the 12 ply 4 more seconds. By skipping the 11 ply search (for example), you would save about 2 out of 7 seconds. So it might pay off if you were pretty darn sure the 12 ply search would search the first move. And if the move ordering is not bad since it misses one of the searches.

If anyone tests this, I would love to hear the results.
I'm pretty sure this is more about information loss than branching factor, although it's tempting to think that with a BF of 2 this might work after all. I think it's clear that your search will not be as efficient, but the question is really if you will save more time avoiding the intermediate depths than you will with less efficient searches.

We don't have to guess, it would be easy to do this experiment. Maybe I will try and then report back to the forum.

Of course taking an existing program and simply doing this test may not be quite fair, because if you were determined to make this work it's likely that you might do some things differently. Tony probably tuned his program to work as well as possible with the skipping strategy.

It works in my checkers program (portuguese version of checkers).
However when the game reaches promotions to queens the benefit of iterating by two plies disapears as the branching factor explodes because in the portuguese variation of checkers queens can move like bishops in chess.

Alvaro

bob · Post by **bob** » Tue Mar 24, 2009 12:54 am

Cardoso wrote:
Don wrote:
mjlef wrote:
Don wrote:
bob wrote:
...

It isn't a new idea. "BeBe" went by twosies in the middle 80's as one example. I have encountered others that did so. One reason was to eliminate the odd/even ply flip-flopping of the best move when on odd ply searches the program gets one extra move compared to the opponent, while in the even-ply searches both get the same number of moves.

...

I know this is an old post, but I just run across it ...

Tony regretted this, as he once told me, but he claimed it would not be simple to fix. In my opinion it was a pretty big mistake that probably prevented him from winning tournaments, as his machine was pretty impressive and seemed to always finish near, but not quite IN first place. In order to compensate he added a feature he called "verify stand pat" in order to make his program take more time! He was in a situation where he tended to finish a 7 ply search really quickly, but couldn't quite make it to the 9 ply search (he always did odd ply.)

I've played around with it and I think you throw away too much information. It does work in checkers though ...
Back them, pre-LMR days, branching factors were about 6 per additional ply. Now with branching factors around 2, much like checkers, it might pay off. It seems with LMR, each additional ply of depth takes about double the previous search time. So if a 10 ply search takes 1 second, an 11 ply takes about 2 seconds more, and the 12 ply 4 more seconds. By skipping the 11 ply search (for example), you would save about 2 out of 7 seconds. So it might pay off if you were pretty darn sure the 12 ply search would search the first move. And if the move ordering is not bad since it misses one of the searches.

If anyone tests this, I would love to hear the results.
I'm pretty sure this is more about information loss than branching factor, although it's tempting to think that with a BF of 2 this might work after all. I think it's clear that your search will not be as efficient, but the question is really if you will save more time avoiding the intermediate depths than you will with less efficient searches.

We don't have to guess, it would be easy to do this experiment. Maybe I will try and then report back to the forum.

Of course taking an existing program and simply doing this test may not be quite fair, because if you were determined to make this work it's likely that you might do some things differently. Tony probably tuned his program to work as well as possible with the skipping strategy.

It works in my checkers program (portuguese version of checkers).
However when the game reaches promotions to queens the benefit of iterating by two plies disapears as the branching factor explodes because in the portuguese variation of checkers queens can move like bishops in chess.

Alvaro

As a note, I tried this a few years ago as well. One of the things I hoped to avoid was the sometimes large jump in time when the best move is going to fail high or fail low, because when you relax one of the windows you can cause the search to blow up badly (unless you relax the window in stages as I do now).

I tried fractional iterations, but in every case it always averaged out to be worse in real games, even though it looked better in some test positions. You could iterate by 1/2 ply for example, so that each new search will find lots of hash hits with useful draft, where checks ore extensions are the main thing that make them useless in 1/2 ply iterations. Each iteration goes by faster, but doing twice as many hurt overall. I tried this because of Amir's approach to counting plies differently than us, which is the same sort of idea, but implemented differently. I tried 1/2 and 3/4 ply increments on the small side, and then tried 1.5 but found that to be much worse overall.

Dave Gomboc · Post by **Dave Gomboc** » Tue Mar 24, 2009 5:31 am

hgm wrote:In a perfectly ordered tree the size of sub-trees sprouting from alpha or beta nodes grwos very unevenly with depth: If the old tree had all-nodes as leaves, the number of leaves is multiplied by ~40 in the next ply, but when they are cut-nodes it stays the same. By extending in steps of 2 you always skip the 'useless' deepenings.

I don't believe those deepenings are useless. Section 7.1 of
http://people.ict.usc.edu/~gomboc/publi ... rdance.pdf
provides some evidence that both "odd ply to even ply" and "even ply to odd ply" are effective, i.e. larger search time is rewarded with larger relative improvement in decision quality.

Dave

diep · Post by **diep** » Tue Mar 24, 2009 1:11 pm

hgm wrote:This is a general property of ID, also when you use it in internal nodes: if there is no PV change (meaning in the root that you predicted the opponent's move correctly), only the last iteration has to be done, as if you did not do ID at all.

This is fully automatic, and doesn't have to be programmed explicitly: the early iterations simple are satisfied by hash hits, without search. (It makes time management tricky, though, especially in combination with pondering, as you can get the first N-1 iterations in near-zero time, optimistically starting the Nth iteration, which then takes forever...)

If the branching ratio is too low, the conventional solution is deepening in larger steps, e.g. two ply at the time.

I'd argue it is a property of his dubious pruning, probably reductions of some type. What he describes is a total independant problem other than ID.

diep · Post by **diep** » Tue Mar 24, 2009 1:17 pm

Don wrote:
mjlef wrote:
Don wrote:
bob wrote:
...

It isn't a new idea. "BeBe" went by twosies in the middle 80's as one example. I have encountered others that did so. One reason was to eliminate the odd/even ply flip-flopping of the best move when on odd ply searches the program gets one extra move compared to the opponent, while in the even-ply searches both get the same number of moves.

...

I know this is an old post, but I just run across it ...

Tony regretted this, as he once told me, but he claimed it would not be simple to fix. In my opinion it was a pretty big mistake that probably prevented him from winning tournaments, as his machine was pretty impressive and seemed to always finish near, but not quite IN first place. In order to compensate he added a feature he called "verify stand pat" in order to make his program take more time! He was in a situation where he tended to finish a 7 ply search really quickly, but couldn't quite make it to the 9 ply search (he always did odd ply.)

I've played around with it and I think you throw away too much information. It does work in checkers though ...
Back them, pre-LMR days, branching factors were about 6 per additional ply. Now with branching factors around 2, much like checkers, it might pay off. It seems with LMR, each additional ply of depth takes about double the previous search time. So if a 10 ply search takes 1 second, an 11 ply takes about 2 seconds more, and the 12 ply 4 more seconds. By skipping the 11 ply search (for example), you would save about 2 out of 7 seconds. So it might pay off if you were pretty darn sure the 12 ply search would search the first move. And if the move ordering is not bad since it misses one of the searches.

If anyone tests this, I would love to hear the results.
I'm pretty sure this is more about information loss than branching factor, although it's tempting to think that with a BF of 2 this might work after all. I think it's clear that your search will not be as efficient, but the question is really if you will save more time avoiding the intermediate depths than you will with less efficient searches.

We don't have to guess, it would be easy to do this experiment. Maybe I will try and then report back to the forum.

Of course taking an existing program and simply doing this test may not be quite fair, because if you were determined to make this work it's likely that you might do some things differently. Tony probably tuned his program to work as well as possible with the skipping strategy.

It has to do with the reductions he's doing and dubious pruning.

when i was experimenting with this in diep around 1999 for first time,
and ran with diep in that manner for a year (also during tournaments like the world champs 1999) then the worst case of ID step 2 was a lot worse
from step 1.

In positions where step 2 is better, you already get such a huge search depth, that it doesn't matter when that eats more nodes.

I see junior do it by the way sometimes that they skip first 12 ply rather
quickly. Not sure based upon what decision they let the engine decide when to take which stepsize.

The thing is, if you search THAT selective, is iteration depth 20 in fact a 10 ply search depth with "odds that you find tactics up to 20 ply" or is it really 20 ply?

Vincent

iterative deepeniing and branching factor

Re: iterative deepeniing and branching factor

Re: iterative deepeniing and branching factor

Re: iterative deepeniing and branching factor

Re: iterative deepeniing and branching factor

Re: iterative deepeniing and branching factor

Re: iterative deepeniing and branching factor

Re: iterative deepeniing and branching factor

Re: iterative deepening and branching factor

Re: iterative deepeniing and branching factor

Re: iterative deepeniing and branching factor