Alternatives to History Heuristics

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Alternatives to History Heuristics

Post by hgm »

bob wrote:I've tried that, all the way to history[pc/from/to][ply] to history[pc/from/to][1]. The first is _very_ cache unfriendly. The latter is what everyone is currently using.

...
The discontinuity problem can be alleviated to a large extent by overlapping the bins and making them more fine-grained. As reading of the history tables occurs far more frequently than writing, the overhead involved in this can be put in the writing without too much impact on overall performance. I.e. use one bin per two ply, but store a cutoff at ply n not only in the [n,n+1] bin but also in the [n-2,n-1] and [n+2,n+3] bin. Then you do span 6 ply.

The cache friendlyness can perhaps be improved by having history[ply][pc/from/to].
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Alternatives to History Heuristics

Post by mcostalba »

hgm wrote:Well, that was what I mean by "binning". If moves at ply 1 have no relevance for what happens on ply 25, then keep separate sets of history counters for ply 0-5, 6-10, 11-15, 16-20, 21-25. Becase presumably what happens at ply 20-25 is relevant for ply 25. And 1-5, which would be noise for ply 25, are useful info for ply 5.
IMHO these scheme is broken.

History table for ply 0-5 has a meaning but for ply 21-25 has not because it will be all random noise.

Tree expansion is not linear with plies, IOW a table for ply 0-5 will be based on similar positions, and this is good, but a table that stores history for ply 21-25 will be based on totally unrelated positions, positions they had 25+25 = 50 plies to diverge.

Another problem is the quantity. If fom 0-5 you have x positions, from 6-10 you have much more according to a geometrical progression and so the number of moves with ply within 21-25 are enourmously bigger then ones from 0-5

So your scheme of dividing tables is, IMHO, absolutely not balanced, it is _almost_ the same that having a single table (or a couple but no more) for 0-5 and discard the remaining.

I think that if duplicating the tables is your way to go then has much more meaning to use subtrees to be associated with tables, not ply ranges, so that positions area actually much more similar among them.
Edsel Apostol
Posts: 803
Joined: Mon Jul 17, 2006 5:53 am
Full name: Edsel Apostol

Re: Alternatives to History Heuristics

Post by Edsel Apostol »

mcostalba wrote:
hgm wrote:Well, that was what I mean by "binning". If moves at ply 1 have no relevance for what happens on ply 25, then keep separate sets of history counters for ply 0-5, 6-10, 11-15, 16-20, 21-25. Becase presumably what happens at ply 20-25 is relevant for ply 25. And 1-5, which would be noise for ply 25, are useful info for ply 5.
IMHO these scheme is broken.

History table for ply 0-5 has a meaning but for ply 21-25 has not because it will be all random noise.

Tree expansion is not linear with plies, IOW a table for ply 0-5 will be based on similar positions, and this is good, but a table that stores history for ply 21-25 will be based on totally unrelated positions, positions they had 25+25 = 50 plies to diverge.

Another problem is the quantity. If fom 0-5 you have x positions, from 6-10 you have much more according to a geometrical progression and so the number of moves with ply within 21-25 are enourmously bigger then ones from 0-5

So your scheme of dividing tables is, IMHO, absolutely not balanced, it is _almost_ the same that having a single table (or a couple but no more) for 0-5 and discard the remaining.

I think that if duplicating the tables is your way to go then has much more meaning to use subtrees to be associated with tables, not ply ranges, so that positions area actually much more similar among them.
I have this idea to use a new history table every fifth ply (subject to tuning) and to initialize it at every 5th first grandchild node of each root move and the succeeding 5th first grandchild from that 5th first grandchild node. This way a history table will only be used for 5 plies and it will only be used by that certain branch of the tree.
Uri Blass
Posts: 10298
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Alternatives to History Heuristics

Post by Uri Blass »

mcostalba wrote:
hgm wrote:Well, that was what I mean by "binning". If moves at ply 1 have no relevance for what happens on ply 25, then keep separate sets of history counters for ply 0-5, 6-10, 11-15, 16-20, 21-25. Becase presumably what happens at ply 20-25 is relevant for ply 25. And 1-5, which would be noise for ply 25, are useful info for ply 5.
IMHO these scheme is broken.

History table for ply 0-5 has a meaning but for ply 21-25 has not because it will be all random noise.

Tree expansion is not linear with plies, IOW a table for ply 0-5 will be based on similar positions, and this is good, but a table that stores history for ply 21-25 will be based on totally unrelated positions, positions they had 25+25 = 50 plies to diverge.
You do not get 25+25 plies difference without taking back moves

I think that if you simply divide the history table by 2 everytime that you take back a move and the remaining depth is more than 5 then the position with long distance has a very small influence on the values in the history table because in order to get it you need to take back positions with remaining depth 6,7,8,9,10,...
so you divide all the history tables by 2 a lot of times.

Maybe tests show that this idea does not work but the reason cannot be the reason that you give.

Note that I do something slightly different in movei and I simply divide the all the values in the history table by 2 in case that I discover that one of them that I increase by 1 is too high(again I expect search that happened a long time ago to have a little influence).

Here is the relevant code from movei

Code: Select all

void dividetries()
{
	int i,j;
	for &#40;i=0;i<64;i++)
		for &#40;j=0;j<16;j++)
		&#123;
			trycount&#91;i&#93;&#91;j&#93;=&#40;trycount&#91;i&#93;&#91;j&#93;>>1&#41;;
			goodcount&#91;i&#93;&#91;j&#93;=&#40;goodcount&#91;i&#93;&#91;j&#93;>>1&#41;;
		&#125;

&#125;
void increasetries&#40;int move&#41;
&#123;
	if capturepromotion&#40;move&#41;
		return;
	int j=info&#91;from&#40;move&#41;&#93;;
	int i=to&#40;move&#41;;
	trycount&#91;i&#93;&#91;j&#93;+=1;
	if &#40;trycount&#91;i&#93;&#91;j&#93;==&#40;1<<22&#41;)
		dividetries&#40;);
	

&#125;
User avatar
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Alternatives to History Heuristics

Post by hgm »

mcostalba wrote:Tree expansion is not linear with plies, IOW a table for ply 0-5 will be based on similar positions, and this is good, but a table that stores history for ply 21-25 will be based on totally unrelated positions, positions they had 25+25 = 50 plies to diverge.
The assumption was that the history values would slowly decay by renormalization, so that in practice they only refer to the last 10,000 nodes or so. Because of the depth-first tree walk you would then not have this problem. A logical way to implement this would be to rescale the deeper counters each time you return to a less deep node. (That is a bit less drastic than completely clearing the tables. The rescaling can be tuned to any factor between 0 and 1, and can be optimized.)
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Alternatives to History Heuristics

Post by mcostalba »

hgm wrote: The assumption was that the history values would slowly decay by renormalization, so that in practice they only refer to the last 10,000 nodes or so.
In this case I don't understand where is the difference from using just one table with rescaling.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Alternatives to History Heuristics

Post by mcostalba »

Edsel Apostol wrote: I have this idea to use a new history table every fifth ply (subject to tuning) and to initialize it at every 5th first grandchild node of each root move and the succeeding 5th first grandchild from that 5th first grandchild node. This way a history table will only be used for 5 plies and it will only be used by that certain branch of the tree.
Yes, I had a _similar_ idea :-) but still no time to test it.

Actually my idea is slightly different but at the same time _very_different in possible results.

If tests will be good I will post results.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Alternatives to History Heuristics

Post by mcostalba »

mcostalba wrote:
Edsel Apostol wrote: I have this idea to use a new history table every fifth ply (subject to tuning) and to initialize it at every 5th first grandchild node of each root move and the succeeding 5th first grandchild from that 5th first grandchild node. This way a history table will only be used for 5 plies and it will only be used by that certain branch of the tree.
Yes, I had a _similar_ idea :-) but still no time to test it.

Actually my idea is slightly different but at the same time _very_different in possible results.

If tests will be good I will post results.
Ok, you posted yours so it is correct I post mine:

It is similar to yours but instead of nodes "of each root move" I only consider PV nodes and of this node I only consider the first move, i.e. the PV move.

IOW I setup an history table for first move of root move list.

All the following moves will use this table.

When fifth ply is reached and we are in PV, then I setup another table for the _first_ move of the PV node: all the moves _below_ that move will use this new table.

In my scheme on a search of say 20 plies I have 5 tables only, on your scheme you have much more and I have doubts it can work ;-)

Note that in Stockfish (I don't know in your Twisted logic) history is weighted by depth, also if first main table will end up storing data of side branches at very high ply, this should not compromise history quality because their relative weight is very small.

My scheme also, handles nicely ;-) the asymmetry between the size of the PV move sub-tree and of the remaining ones (that are much smaller).

I think this scheme is what more is near to the concept of one history table per subtree and I have expectations it will work.
User avatar
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Alternatives to History Heuristics

Post by hgm »

mcostalba wrote:In this case I don't understand where is the difference from using just one table with rescaling.
The difference is that the data from positons very far away (in te metric of plies) would not pollute the data for a given depth, thus reducing the noise level.
rjgibert
Posts: 317
Joined: Mon Jun 26, 2006 9:44 am

Re: Alternatives to History Heuristics

Post by rjgibert »

hgm wrote:
bob wrote:I've tried that, all the way to history[pc/from/to][ply] to history[pc/from/to][1]. The first is _very_ cache unfriendly. The latter is what everyone is currently using.

...
The discontinuity problem can be alleviated to a large extent by overlapping the bins and making them more fine-grained. As reading of the history tables occurs far more frequently than writing, the overhead involved in this can be put in the writing without too much impact on overall performance. I.e. use one bin per two ply, but store a cutoff at ply n not only in the [n,n+1] bin but also in the [n-2,n-1] and [n+2,n+3] bin. Then you do span 6 ply.

The cache friendlyness can perhaps be improved by having history[ply][pc/from/to].
I don't understand why you want to "overlap the bins." You are going to have some duplication of information between bins anyways, so I don't see the need to enforce this by overlapping.

Just let each bin populate itself naturally. I think that the "problem of borderline cases" is an illusion. If move x produces fail highs for both bin A and bin B, this will be reflected very quickly in both bins regardless. If the idea of multiple bins fails, I don't think it will be because of this issue. I just don't see it.