Alternatives to History Heuristics

bob · Post by **bob** » Thu Sep 03, 2009 3:05 am

Jan Brouwer wrote:
bob wrote:The history numbers begin to become almost purely random after a couple of minutes of very deep searching at speeds of 20M nodes per second.
I think this can be avoided by half-ing all entries in the table after any entry exceeds some lowish number (I use 10000 as threshold, and depth^3 for updates).
This gives a decaying effect of older updates and should reduce the noise. Using a history[from][to] table this way saves about 10% of nodes in my program.

Simple question. What do you get if you take a group of 32 bit random numbers, and collapse them into the range 0-65535 using a uniform scheme to collapse them? Are the numbers _still_ random, or has the randomness somehow been eliminated???

FrancoisK · Post by **FrancoisK** » Thu Sep 03, 2009 10:04 am

It works for BugChess2.
Whereas LMR seemed to hurt in PV nodes before, when i added root-like ordering in PV nodes with a dedicated hash table, it was a clear gain (ie within error bounds). The whole thing earned something like 20 elos if i recall well (not so bad in my current elo crisis background

).
Tested at very fast games (40 moves/ 10 seconds), the gain seems to be confirmed at long time controls (CCRL 40/40).

hgm · Post by **hgm** » Thu Sep 03, 2009 11:43 am

bob wrote:Simple question. What do you get if you take a group of 32 bit random numbers, and collapse them into the range 0-65535 using a uniform scheme to collapse them? Are the numbers _still_ random, or has the randomness somehow been eliminated???

The question is not relevant, as this is not what he does. By rescaling _during_ accumulation, in stead of _after_ it, he builds in a bias towards more-recently-found-to-be-good moves. Which eliminates most of the randomness, as the later was caused mostly by no-longer-relevant-moves from a distant past.

bob · Post by **bob** » Thu Sep 03, 2009 8:13 pm

hgm wrote:
bob wrote:Simple question. What do you get if you take a group of 32 bit random numbers, and collapse them into the range 0-65535 using a uniform scheme to collapse them? Are the numbers _still_ random, or has the randomness somehow been eliminated???
The question is not relevant, as this is not what he does. By rescaling _during_ accumulation, in stead of _after_ it, he builds in a bias towards more-recently-found-to-be-good moves. Which eliminates most of the randomness, as the later was caused mostly by no-longer-relevant-moves from a distant past.

No it doesn't. Already tried it. The most recent values are _also_ random because they come from drastically different depths. The history values are not very reliable, period.

hgm · Post by **hgm** » Thu Sep 03, 2009 9:10 pm

Then you must weight them wrong with regard to depth. If history works at low depth, it should be posible to make it work at high depth by binning the history counters in some small-enough depth ranges.

rjgibert · Post by **rjgibert** » Thu Sep 03, 2009 10:39 pm

hgm wrote:Then you must weight them wrong with regard to depth. If history works at low depth, it should be posible to make it work at high depth by binning the history counters in some small-enough depth ranges.

I never understood why people started abandoning the history heuristic. If it worked before and the claim that greater depth was randomizing the history counters, then it should be fixable.

My first try would be simply 2 sets of history counters, one for shallow depths and one for deep. Or perhaps 3, one each for shallow, medium and deep. If that doesn't work, then the "explanation" would turn out to be BS.

An alternate explanation might be that the other move ordering heuristics were simply making history counters superfluous with increasing depth. In that case a failure to fix history counters would then make sense.

bob · Post by **bob** » Thu Sep 03, 2009 10:41 pm

hgm wrote:Then you must weight them wrong with regard to depth. If history works at low depth, it should be posible to make it work at high depth by binning the history counters in some small-enough depth ranges.

You are not thinking this through. A good move at ply=3 may well be a good move at ply=5 or even ply=7. But what does it have to do with a good move at ply=25 or 30? Zero. The two positions are likely to be extremely different with that many intervening moves.

Same idea can be tested with killers. Killers from ply-2 (or even ply-4) can be tried at ply and they may well work reasonably well. We did this in Cray Blitz, and others have done it as well. But killers from ply 4 are not going to be useful at ply 24, and vice-versa.

And that is exactly what is wrong with history counters. They might do fine at shallow depths where the ply range is 1-10 or so, but when you can do a 20-24 ply middlegame search, with extensions so that you reach 30-40 plies easily, moves from early in the tree have nothing to do with the positions farther out.

If you can make it work, fine. I've tried. At length. And I have even removed the history code from existing programs that "think" it works effectively. And I have reported the results. I've not found one single case where it helps, and have found quite a few where it hurts.

hgm · Post by **hgm** » Thu Sep 03, 2009 10:51 pm

Well, that was what I mean by "binning". If moves at ply 1 have no relevance for what happens on ply 25, then keep separate sets of history counters for ply 0-5, 6-10, 11-15, 16-20, 21-25. Becase presumably what happens at ply 20-25 is relevant for ply 25. And 1-5, which would be noise for ply 25, are useful info for ply 5.

bob · Post by **bob** » Thu Sep 03, 2009 11:49 pm

hgm wrote:Well, that was what I mean by "binning". If moves at ply 1 have no relevance for what happens on ply 25, then keep separate sets of history counters for ply 0-5, 6-10, 11-15, 16-20, 21-25. Becase presumably what happens at ply 20-25 is relevant for ply 25. And 1-5, which would be noise for ply 25, are useful info for ply 5.

I've tried that, all the way to history[pc/from/to][ply] to history[pc/from/to][1]. The first is _very_ cache unfriendly. The latter is what everyone is currently using.

The problem is that history is something that is local in nature, but the artificial binning approach breaks it. Why would you be willing, at ply=6, to try a history value from moves made at ply=2 or 4, but not be willing to try a value from a move made at ply 8 or 10? Overlapping bins will be way too expensive. Discrete bins have those ugly disjoint edge conditions that hinder performance.

bob · Post by **bob** » Thu Sep 03, 2009 11:56 pm

rjgibert wrote:
hgm wrote:Then you must weight them wrong with regard to depth. If history works at low depth, it should be posible to make it work at high depth by binning the history counters in some small-enough depth ranges.
I never understood why people started abandoning the history heuristic. If it worked before and the claim that greater depth was randomizing the history counters, then it should be fixable.

My first try would be simply 2 sets of history counters, one for shallow depths and one for deep. Or perhaps 3, one each for shallow, medium and deep. If that doesn't work, then the "explanation" would turn out to be BS.

An alternate explanation might be that the other move ordering heuristics were simply making history counters superfluous with increasing depth. In that case a failure to fix history counters would then make sense.

That does't work, from testing, and here's why.

Say you choose to "bin" plies 1-6, 7-12, and 13-18, and 19-24. Clean sizes of 6 plies. Now at ply 6, you will use history counters from plies 2-4, but not 8 or 10. Why? The counters are usable when they are "close" to the current ply. And the discrete bin approach doesn't address this. You could have an overlapping bin approach, where you have a set of counters for each ply, but they are updated by searches within X plies on either side. That is beyond cache unfriendly, and is also too expensive in terms of memory accesses and just simple computational requirement.

For years Crafty used the history heuristic. Unfortunately, when I added that after Schaeffer originally explained the idea, all we did for testing was to use tactical test positions. And it had a positive effect. Not very large, but an effect. But when I started testing on the cluster, and doing the version 22 complete rewrite to eliminate all the black/white stuff, I decided to use the cluster testing to see if it was worth keeping before rewriting it to fit the new color-independent approach. Turns out it was worthless, and actually cost a couple of Elo due to the computational cost, which is not much, but also not zero. I removed it for that reason. I have since tested Fruit, Glaurung and Toga by removing history stuff, and discovered the same for them. Using it with LMR is pointless and can hurt performance somewhat. Using it to order moves is somewhere between break-even and very minor loss of performance.

Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics