Beginning hashtable question.

hgm · Post by **hgm** » Tue Feb 09, 2010 9:29 am

Michel wrote:
Of course, this is the very nature of hashing: once you hash your position to the zobrist number then there is no way to get the position back.
I am interested if this statement is really true. A chess position contains a lot of redundancy so it seems not implausible that one might be able to revert a Zobrist hash.

That depends on the key length. There are many more than 2^64 chess positions. (More like 2^164, from Hufman coding the board. That might reduce bit by the requirement of reachability from the opening position, but not nearly by a factor 2^100.) So the mapping cannot be invertible.

And even if it was: inverting the Zobrist key is related to the knapsack problem, (for additive keys it would be exactly that), and the latter is supposed to be so hard, that it is used as the basis for many cryptography algorithms.

hgm · Post by **hgm** » Tue Feb 09, 2010 9:47 am

Fguy64 wrote:Anyways, the bolded remark suggests to me the following, which is useful for understanding if true: Given any key A, if you XOR key B twice you get back key A. So repeatedly XORing the same key into a position key is akin to repeatedly moving a piece off and back onto a square?

Note that this is an implementation detail, and not really of fundamental importance. Hashing schemes similar to the conventional Zobrist, but using addition in stead of XOR work just as well. In that case adding a key would correspond to putting the piece on the square, and subtracting the key would correspond to removing it. You would then not have the peculiar property that putting down a second (identical) piece on the same square is the same as removing it. Which actually can be an advantage in hashing, say, crazyhouse holdings, where you can 'stack' pieces in the holdings, and having 2 pawns in hand is not at all the same as holding no pawns.

The Zobrist scheme with XOR only works by virtue of the fact that in chess you cannot stack pieces. The additive scheme does not suffer from that limitation.

Michel · Post by **Michel** » Tue Feb 09, 2010 1:13 pm

That depends on the key length. There are many more than 2^64 chess positions. (More like 2^164, from Hufman coding the board. That might reduce bit by the requirement of reachability from the opening position, but not nearly by a factor 2^100.) So the mapping cannot be invertible.

Of course this is true if you count all positions. I was referring to positions that are somewhat likely to occur in a game (I realize that this is hard to define).

And even if it was: inverting the Zobrist key is related to the knapsack problem,

Well the XOR version is also linear algebra over the field of two elements. So there might be some tricks which do not apply to the additive version.

Fguy64 · Post by **Fguy64** » Tue Feb 09, 2010 2:04 pm

hgm wrote:
Fguy64 wrote:Anyways, the bolded remark suggests to me the following, which is useful for understanding if true: Given any key A, if you XOR key B twice you get back key A. So repeatedly XORing the same key into a position key is akin to repeatedly moving a piece off and back onto a square?
Note that this is an implementation detail, and not really of fundamental importance.
...

I suppose, but it's a very useful little tidbit nonetheless. For me at least, this sort of detail is very useful from a learning perspective, things tend to fall into place once you can convince yourself that a fact like this is true. It's like that with computer programming in general. One good implementation detail is worth a thousand words of theory, call it the bottom up theory of learning.

Fguy64 · Post by **Fguy64** » Tue Feb 09, 2010 7:46 pm

heavy sigh. OK, I am experimenting with different hashing schemes, and something isn't right.

Right now I have set up two arrays. A one dimensional array of 64-bit integers, hkey[], that contains the key, and a two dimensional array of 32 bit integers hVal[][], that contains ply (or depth if you prefer) and eval.

ok in between the doMove() and undoMove() in alphaBeta I have the foillowing code...

Code: Select all

if( zKey == hKey[hIndex] ) {

	if( ply == hVal[0][hIndex] ) eval = hVal[1][hIndex];
	else {
		eval = -alphaBeta( ply+1, -beta, -alpha );
		hVal[0][hIndex] = ply;
		hVal[1][hIndex] = eval;
	}
}
else {
	eval = -alphaBeta( ply+1, -beta, -alpha );
	hKey[hIndex] = zKey;
	hVal[0][hIndex] = ply;
	hVal[1][hIndex] = eval;
}

to me this says if the keys match, then check if the ply matches. If the ply matches then take the stored eval. If the keys match but the ply does not, do a search and replace the stored ply and eval, keeping the stored key. If the key does not match, do a search and replace, for that index, the key, the ply, and the eval.

But this scheme gives incorrect results. However, if the above code is restricted to a certain ply, it works, for example the following gives correct results...

Code: Select all

if( ply == 3 ) {
	if( zKey == hKey[hIndex] ) {

		if( ply == hVal[0][hIndex] ) eval = hVal[1][hIndex];
		else {
			eval = -alphaBeta( ply+1, -beta, -alpha );
			hVal[0][hIndex] = ply;
			hVal[1][hIndex] = eval;
		}
	}
	else {
		eval = -alphaBeta( ply+1, -beta, -alpha );
		hKey[hIndex] = zKey;
		hVal[0][hIndex] = ply;
		hVal[1][hIndex] = eval;
	}
}
else eval = -alphaBeta( ply+1, -beta, -alpha );

Putting aside whether or not these are good strategies, and assuming for the moment that my calculations of the zKey and index is correct, is there something wrong with the first block that I can't see? It looks ok to me, and if it is then I suppose there is nothing else but the calculation of the key to examine. Am I clear on this question?

hgm · Post by **hgm** » Tue Feb 09, 2010 8:57 pm

Indeed, there is something wrong. You do not take into account that the scores returned by an alpha-beta search are bounds, (lower or upper bounds), rather than exact scores. So it can happen that you do get a matching key and ply, but that the score is in fact useless to you, because it is the wrong kind of bound.

E.g. alpha = -10, beta = +20, and the score is an upper bound of +30. So you still have no idea if the real score is below alpha, above beta, or in between. Your code uses the value +30 as if it was the exact score, while in fact the exact score might be -900.

Fguy64 · Post by **Fguy64** » Tue Feb 09, 2010 9:16 pm

OK Thanks, I can work with that. I guess there is a reason why the issues you raised don't seem to be a concern when I restrict the hashing to a specific ply?

Sven · Post by **Sven** » Tue Feb 09, 2010 10:21 pm

Fguy64 wrote:heavy sigh. OK, I am experimenting with different hashing schemes, and something isn't right.

Right now I have set up two arrays. A one dimensional array of 64-bit integers, hkey[], that contains the key, and a two dimensional array of 32 bit integers hVal[][], that contains ply (or depth if you prefer) and eval.

ok in between the doMove() and undoMove() in alphaBeta I have the foillowing code...
Code: Select all
if( zKey == hKey[hIndex] ) {

	if( ply == hVal[0][hIndex] ) eval = hVal[1][hIndex];
	else {
		eval = -alphaBeta( ply+1, -beta, -alpha );
		hVal[0][hIndex] = ply;
		hVal[1][hIndex] = eval;
	}
}
else {
	eval = -alphaBeta( ply+1, -beta, -alpha );
	hKey[hIndex] = zKey;
	hVal[0][hIndex] = ply;
	hVal[1][hIndex] = eval;
}
to me this says if the keys match, then check if the ply matches. If the ply matches then take the stored eval. If the keys match but the ply does not, do a search and replace the stored ply and eval, keeping the stored key. If the key does not match, do a search and replace, for that index, the key, the ply, and the eval.

But this scheme gives incorrect results. However, if the above code is restricted to a certain ply, it works, for example the following gives correct results...
Code: Select all
if( ply == 3 ) {
	if( zKey == hKey[hIndex] ) {

		if( ply == hVal[0][hIndex] ) eval = hVal[1][hIndex];
		else {
			eval = -alphaBeta( ply+1, -beta, -alpha );
			hVal[0][hIndex] = ply;
			hVal[1][hIndex] = eval;
		}
	}
	else {
		eval = -alphaBeta( ply+1, -beta, -alpha );
		hKey[hIndex] = zKey;
		hVal[0][hIndex] = ply;
		hVal[1][hIndex] = eval;
	}
}
else eval = -alphaBeta( ply+1, -beta, -alpha );
Putting aside whether or not these are good strategies, and assuming for the moment that my calculations of the zKey and index is correct, is there something wrong with the first block that I can't see? It looks ok to me, and if it is then I suppose there is nothing else but the calculation of the key to examine. Am I clear on this question?

Apart from the comments of HGM which I fully support of course, I would like to add that you should put all data that you store in the hash table for a given 'hIndex' in a "struct HashEntry", instead of using several one- or two-dimensional arrays. One HashEntry would consist, in your example, of the key, the value, the "draft" (I come back to it further down), and some information describing whether the stored value is an exact value, a lower bound, an upper bound, or invalid - you could call this the "value type". Later on you will probably also want to store the best move in the hash table, improving move ordering drastically.

Example:

Code: Select all

struct HashEntry {
    uint64 key;
    int16 value;
    uint16 draft;
    uint16 type; // although 2 bits would be sufficient
    uint16 bestMove; // only from, to and promotion piece are needed
};

This is far from being optimized but may be sufficient for the beginning.

Regarding the "draft": in your example you store the "ply", which in your case is the distance to the root node. This is not the information you should store in a hash entry, but instead you need to store how deep a search has gone from that node. Suppose you are at a node somewhere in the middle of the tree, the remaining depth ("draft") is 5, and the search returns from the subtree with a value of +100 which you store in the hash table. Later on, in the next iteration, you get to the same node but now the draft is 6. In the hash table you find +100 with draft 5, so would you skip the 6-ply search and replace it with a value from a previous shallower search? You wouldn't, for obvious reasons (but the best move from the hash table, if you store it, could be used for move ordering).

Therefore also one more correction is necessary to your current code: do not compare "if (ply == hVal[0][hIndex])", or the transformed "if (draft == hVal[index].draft)" after applying my comments above, but

Code: Select all

if (draft <= hVal[index].draft)

to express that the draft of the stored position must be at least the same as the draft for the current search to serve as a possible replacement for searching. Your "==", after changing "ply" into "draft", would not be wrong but too restrictive and thus would not help too much. Cases where the stored draft is really >= the current draft for the same position are usually transpositions.

Sven

Fguy64 · Post by **Fguy64** » Tue Feb 09, 2010 10:41 pm

Thanks Sven.

Duly noted on the struct HashEntry recommendation. I suppose your example is c++. I guess the java analogy to struct HashEntry is to have a class definition for the hash entry, where each element is an instance variable in the class.

Also, I understand the reason for storing depth searched from that point on. In my program, as it stands now, I think it's really six of one, half dozen of the other, because of the fixed depth nature of my search. I suppose at such time as I implement iterative deepening, I'll have to do it differently.

Sven · Post by **Sven** » Tue Feb 09, 2010 11:30 pm

Fguy64 wrote:Duly noted on the struct HashEntry recommendation. I suppose your example is c++. I guess the java analogy to struct HashEntry is to have a class definition for the hash entry, where each element is an instance variable in the class.

Correct.

And, please, since you're doing Java, just make the hash table a class, too, internally implemented with an array of HashEntry instances, and providing an interface with at least a store() and a retrieve() method (maybe also replace()). You will see how well this improves readability of your code without introducing substantial runtime penalties. Creation of hash entries will be done once at program initialization time only, so it won't affect the search performance. The retrieve() method would take the key and current draft as input and return a bool to indicate success, and as out parameters the best move and (in case of success) the value would be returned. Or similar. Handling of lower/upper bound conditions as well as special cases like mate scores could be encapsulated completely within retrieve().

Fguy64 wrote:Also, I understand the reason for storing depth searched from that point on. In my program, as it stands now, I think it's really six of one, half dozen of the other, because of the fixed depth nature of my search. I suppose at such time as I implement iterative deepening, I'll have to do it differently.

The main benefit of the hash table will be the improved move ordering which you get if you also store the best move and do iterative deepening. As long as you don't do both, the *only* benefit of the hash table can come from saving nodes by detecting transpositions. And for that to work properly you need to take the draft instead of the distance to root, otherwise you exclude many transpositions because the "<=" comparison that I mentioned does not work, and you keep your newly created hash table simply non-functional.

Sven

Beginning hashtable question.

Re: Beginning hashtable question.

Re: Beginning hashtable question.

Re: Beginning hashtable question.

Re: Beginning hashtable question.

Re: Beginning hashtable question.

Re: Beginning hashtable question.

Re: Beginning hashtable question.

Re: Beginning hashtable question.

Re: Beginning hashtable question.

Re: Beginning hashtable question.