Alternatives to History Heuristics

hgm · Post by **hgm** » Fri Sep 04, 2009 7:50 am

bob wrote:I've tried that, all the way to history[pc/from/to][ply] to history[pc/from/to][1]. The first is _very_ cache unfriendly. The latter is what everyone is currently using.

...

The discontinuity problem can be alleviated to a large extent by overlapping the bins and making them more fine-grained. As reading of the history tables occurs far more frequently than writing, the overhead involved in this can be put in the writing without too much impact on overall performance. I.e. use one bin per two ply, but store a cutoff at ply n not only in the [n,n+1] bin but also in the [n-2,n-1] and [n+2,n+3] bin. Then you do span 6 ply.

The cache friendlyness can perhaps be improved by having history[ply][pc/from/to].

mcostalba · Post by **mcostalba** » Fri Sep 04, 2009 8:56 am

hgm wrote:Well, that was what I mean by "binning". If moves at ply 1 have no relevance for what happens on ply 25, then keep separate sets of history counters for ply 0-5, 6-10, 11-15, 16-20, 21-25. Becase presumably what happens at ply 20-25 is relevant for ply 25. And 1-5, which would be noise for ply 25, are useful info for ply 5.

IMHO these scheme is broken.

History table for ply 0-5 has a meaning but for ply 21-25 has not because it will be all random noise.

Tree expansion is not linear with plies, IOW a table for ply 0-5 will be based on similar positions, and this is good, but a table that stores history for ply 21-25 will be based on totally unrelated positions, positions they had 25+25 = 50 plies to diverge.

Another problem is the quantity. If fom 0-5 you have x positions, from 6-10 you have much more according to a geometrical progression and so the number of moves with ply within 21-25 are enourmously bigger then ones from 0-5

So your scheme of dividing tables is, IMHO, absolutely not balanced, it is _almost_ the same that having a single table (or a couple but no more) for 0-5 and discard the remaining.

I think that if duplicating the tables is your way to go then has much more meaning to use subtrees to be associated with tables, not ply ranges, so that positions area actually much more similar among them.

Edsel Apostol · Post by **Edsel Apostol** » Fri Sep 04, 2009 10:58 am

mcostalba wrote:
hgm wrote:Well, that was what I mean by "binning". If moves at ply 1 have no relevance for what happens on ply 25, then keep separate sets of history counters for ply 0-5, 6-10, 11-15, 16-20, 21-25. Becase presumably what happens at ply 20-25 is relevant for ply 25. And 1-5, which would be noise for ply 25, are useful info for ply 5.
IMHO these scheme is broken.

History table for ply 0-5 has a meaning but for ply 21-25 has not because it will be all random noise.

Tree expansion is not linear with plies, IOW a table for ply 0-5 will be based on similar positions, and this is good, but a table that stores history for ply 21-25 will be based on totally unrelated positions, positions they had 25+25 = 50 plies to diverge.

Another problem is the quantity. If fom 0-5 you have x positions, from 6-10 you have much more according to a geometrical progression and so the number of moves with ply within 21-25 are enourmously bigger then ones from 0-5

So your scheme of dividing tables is, IMHO, absolutely not balanced, it is _almost_ the same that having a single table (or a couple but no more) for 0-5 and discard the remaining.

I think that if duplicating the tables is your way to go then has much more meaning to use subtrees to be associated with tables, not ply ranges, so that positions area actually much more similar among them.

I have this idea to use a new history table every fifth ply (subject to tuning) and to initialize it at every 5th first grandchild node of each root move and the succeeding 5th first grandchild from that 5th first grandchild node. This way a history table will only be used for 5 plies and it will only be used by that certain branch of the tree.

Uri Blass · Post by **Uri Blass** » Fri Sep 04, 2009 12:14 pm

mcostalba wrote:
hgm wrote:Well, that was what I mean by "binning". If moves at ply 1 have no relevance for what happens on ply 25, then keep separate sets of history counters for ply 0-5, 6-10, 11-15, 16-20, 21-25. Becase presumably what happens at ply 20-25 is relevant for ply 25. And 1-5, which would be noise for ply 25, are useful info for ply 5.
IMHO these scheme is broken.

History table for ply 0-5 has a meaning but for ply 21-25 has not because it will be all random noise.

Tree expansion is not linear with plies, IOW a table for ply 0-5 will be based on similar positions, and this is good, but a table that stores history for ply 21-25 will be based on totally unrelated positions, positions they had 25+25 = 50 plies to diverge.

You do not get 25+25 plies difference without taking back moves

I think that if you simply divide the history table by 2 everytime that you take back a move and the remaining depth is more than 5 then the position with long distance has a very small influence on the values in the history table because in order to get it you need to take back positions with remaining depth 6,7,8,9,10,...
so you divide all the history tables by 2 a lot of times.

Maybe tests show that this idea does not work but the reason cannot be the reason that you give.

Note that I do something slightly different in movei and I simply divide the all the values in the history table by 2 in case that I discover that one of them that I increase by 1 is too high(again I expect search that happened a long time ago to have a little influence).

Here is the relevant code from movei

Code: Select all

void dividetries&#40;)
&#123;
	int i,j;
	for &#40;i=0;i<64;i++)
		for &#40;j=0;j<16;j++)
		&#123;
			trycount&#91;i&#93;&#91;j&#93;=&#40;trycount&#91;i&#93;&#91;j&#93;>>1&#41;;
			goodcount&#91;i&#93;&#91;j&#93;=&#40;goodcount&#91;i&#93;&#91;j&#93;>>1&#41;;
		&#125;

&#125;
void increasetries&#40;int move&#41;
&#123;
	if capturepromotion&#40;move&#41;
		return;
	int j=info&#91;from&#40;move&#41;&#93;;
	int i=to&#40;move&#41;;
	trycount&#91;i&#93;&#91;j&#93;+=1;
	if &#40;trycount&#91;i&#93;&#91;j&#93;==&#40;1<<22&#41;)
		dividetries&#40;);
	

&#125;

hgm · Post by **hgm** » Fri Sep 04, 2009 12:15 pm

mcostalba wrote:Tree expansion is not linear with plies, IOW a table for ply 0-5 will be based on similar positions, and this is good, but a table that stores history for ply 21-25 will be based on totally unrelated positions, positions they had 25+25 = 50 plies to diverge.

The assumption was that the history values would slowly decay by renormalization, so that in practice they only refer to the last 10,000 nodes or so. Because of the depth-first tree walk you would then not have this problem. A logical way to implement this would be to rescale the deeper counters each time you return to a less deep node. (That is a bit less drastic than completely clearing the tables. The rescaling can be tuned to any factor between 0 and 1, and can be optimized.)

mcostalba · Post by **mcostalba** » Fri Sep 04, 2009 1:45 pm

hgm wrote: The assumption was that the history values would slowly decay by renormalization, so that in practice they only refer to the last 10,000 nodes or so.

In this case I don't understand where is the difference from using just one table with rescaling.

mcostalba · Post by **mcostalba** » Fri Sep 04, 2009 1:47 pm

Edsel Apostol wrote: I have this idea to use a new history table every fifth ply (subject to tuning) and to initialize it at every 5th first grandchild node of each root move and the succeeding 5th first grandchild from that 5th first grandchild node. This way a history table will only be used for 5 plies and it will only be used by that certain branch of the tree.

Yes, I had a _similar_ idea

but still no time to test it.

Actually my idea is slightly different but at the same time _very_different in possible results.

If tests will be good I will post results.

mcostalba · Post by **mcostalba** » Fri Sep 04, 2009 1:58 pm

mcostalba wrote:
Edsel Apostol wrote: I have this idea to use a new history table every fifth ply (subject to tuning) and to initialize it at every 5th first grandchild node of each root move and the succeeding 5th first grandchild from that 5th first grandchild node. This way a history table will only be used for 5 plies and it will only be used by that certain branch of the tree.
Yes, I had a _similar_ idea but still no time to test it.

Actually my idea is slightly different but at the same time _very_different in possible results.

If tests will be good I will post results.

Ok, you posted yours so it is correct I post mine:

It is similar to yours but instead of nodes "of each root move" I only consider PV nodes and of this node I only consider the first move, i.e. the PV move.

IOW I setup an history table for first move of root move list.

All the following moves will use this table.

When fifth ply is reached and we are in PV, then I setup another table for the _first_ move of the PV node: all the moves _below_ that move will use this new table.

In my scheme on a search of say 20 plies I have 5 tables only, on your scheme you have much more and I have doubts it can work

Note that in Stockfish (I don't know in your Twisted logic) history is weighted by depth, also if first main table will end up storing data of side branches at very high ply, this should not compromise history quality because their relative weight is very small.

My scheme also, handles nicely

the asymmetry between the size of the PV move sub-tree and of the remaining ones (that are much smaller).

I think this scheme is what more is near to the concept of one history table per subtree and I have expectations it will work.

hgm · Post by **hgm** » Fri Sep 04, 2009 2:58 pm

mcostalba wrote:In this case I don't understand where is the difference from using just one table with rescaling.

The difference is that the data from positons very far away (in te metric of plies) would not pollute the data for a given depth, thus reducing the noise level.

rjgibert · Post by **rjgibert** » Fri Sep 04, 2009 6:26 pm

hgm wrote:
bob wrote:I've tried that, all the way to history[pc/from/to][ply] to history[pc/from/to][1]. The first is _very_ cache unfriendly. The latter is what everyone is currently using.

...
The discontinuity problem can be alleviated to a large extent by overlapping the bins and making them more fine-grained. As reading of the history tables occurs far more frequently than writing, the overhead involved in this can be put in the writing without too much impact on overall performance. I.e. use one bin per two ply, but store a cutoff at ply n not only in the [n,n+1] bin but also in the [n-2,n-1] and [n+2,n+3] bin. Then you do span 6 ply.

The cache friendlyness can perhaps be improved by having history[ply][pc/from/to].

I don't understand why you want to "overlap the bins." You are going to have some duplication of information between bins anyways, so I don't see the need to enforce this by overlapping.

Just let each bin populate itself naturally. I think that the "problem of borderline cases" is an illusion. If move x produces fail highs for both bin A and bin B, this will be reflected very quickly in both bins regardless. If the idea of multiple bins fails, I don't think it will be because of this issue. I just don't see it.

Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics

Re: Alternatives to History Heuristics