Well hash = ( any_number * FNV ) ^ sq only changes the lowest bits since it uses the _same_ number to initialize all squares. It is similar to a sequentially executed LCG, but poorer, because in the latter case any_number is replaced by the previously generated random number. So what this method does is basically use sq as the hashkey. Note that the hash function is called every time the key for the piece on a square is required, and it has to return the same key every time. So unless you store different initialization numbers (seeds) for each square (which again means storing them in a table), this can't work.
I'm not explicitly mixing here but this will be pretty fast and avoid a lot of source code clutter with tables of random numbers.
The table could be generated at run time from a given seed, so there is no need to store the numbers in the source code. If the same random numbers are required, as in the case where the keys are used for the book too, then the seed can be fixed. Also, mixing will add a lot of code that is executed twice.
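Something like this is what I have in mind - just a sketch, with made-up names, and a simple xorshift step standing in for whatever generator one prefers:

#include <stdint.h>

#define COLOR_PIECES 12 /* 6 piece types x 2 colors */
#define SQUARES 64

uint64_t zobrist[COLOR_PIECES][SQUARES];

static uint64_t next64(uint64_t *s) /* one xorshift64 step: each output seeds the next */
{
    uint64_t x = *s;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *s = x;
}

void init_zobrist(void)
{
    uint64_t seed = 0x9E3779B97F4A7C15ull; /* fixed seed, so book keys stay reproducible */
    for (int cp = 0; cp < COLOR_PIECES; cp++)
        for (int sq = 0; sq < SQUARES; sq++)
            zobrist[cp][sq] = next64(&seed);
}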
Well hash = ( any_number * FNV ) ^ sq only changes the lowest bits
Yes, this is how it works. In fact it's supposed to just be 8 bits that change. Multiplying it by the cleverly chosen large constant does the mixing.
since it uses the _same_ number to initialize all squares.
The same number? Maybe that is why it's called a constant? I don't understand your point, but this is the famous Fowler/Noll/Vo hash, which has been found to work very well. Essentially we are applying the FNV hash to the tuple (color, type, square). There are actually two constants: the first is called the offset basis and is claimed to not be important, just to provide a consistent way to start things off. The second is called the FNV prime. It is a magic constant that has certain properties that are considered important for this particular hash.
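For reference, the textbook 64-bit FNV-1a over a byte string looks like this (these two constants are the published standard ones; whether the scheme is adequate for piece/square keys is exactly what is under discussion here):

#include <stddef.h>
#include <stdint.h>

#define FNV_OFFSET_BASIS 14695981039346656037ull /* standard 64-bit offset basis */
#define FNV_PRIME 1099511628211ull /* standard 64-bit FNV prime */

uint64_t fnv1a_64(const unsigned char *data, size_t len)
{
    uint64_t h = FNV_OFFSET_BASIS;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];  /* xor in one byte... */
        h *= FNV_PRIME; /* ...then multiply: the FNV-1a order */
    }
    return h;
}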
It is similar to a sequentially executed LCG, but poorer, because in the latter case any_number is replaced by the previously generated random number. So what this method does is basically use sq as the hashkey. Note that the hash function is called every time the key for the piece on a square is required, and it has to return the same key every time. So unless you store different initialization numbers (seeds) for each square (which again means storing them in a table), this can't work.
This is a sound hash - you are probably misunderstanding something. Did you notice that first the color/piece is xored, then multiplied, then this is combined with the square number? There is nothing improper about that. I suspect that you did not understand "cp" in my little code example - but what this does is create a unique key for any color/piece/square combination. If you want I will compute this for chess and prove it to you. Or I will make up a test with 1000 piece types and 1000 squares and check to see if any match. They won't. Of course, to be really sure one would need to run a test of random numbers on the keys generated. I don't know if it would pass all the tests a true random generator would, but that does not necessarily mean it won't be good for this. It's more important that keys are nicely distributed in the domain in which they are used, not that they pass random number tests.
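For what it's worth, such a test is only a few lines - a sketch (the exact constants don't matter for the check itself; sorting makes the duplicate count cheap):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1000
static uint64_t keys[N * N];

static uint64_t key(unsigned cp, unsigned sq) /* xor first, then multiply, applied twice */
{
    uint64_t h = 19282138887198ull;
    h = (h ^ cp) * 1099511628211ull;
    h = (h ^ sq) * 1099511628211ull;
    return h;
}

static int cmp(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    for (unsigned cp = 0; cp < N; cp++)
        for (unsigned sq = 0; sq < N; sq++)
            keys[cp * N + sq] = key(cp, sq);
    qsort(keys, N * N, sizeof keys[0], cmp); /* sort, then count equal neighbours */
    long dups = 0;
    for (size_t i = 1; i < (size_t)N * N; i++)
        dups += (keys[i] == keys[i - 1]);
    printf("%ld duplicates among %d keys\n", dups, N * N);
    return 0;
}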
I'm not explicitly mixing here but this will be pretty fast and avoid a lot of source code clutter with tables of random numbers.
The table could be generated at run time from a given seed, so there is no need to store the numbers in the source code. If the same random numbers are required, as in the case where the keys are used for the book too, then the seed can be fixed. Also, mixing will add a lot of code that is executed twice.
Well hash = ( any_number * FNV ) ^ sq only changes the lowest bits since it uses the _same_ number to initialize all squares.
I'm reading your post again and you are focused on just one step. The ^ sq only changes a few lower bits - but that's just ONE STEP. Even the best cryptographic hashes are complete crap without all the rounds and numerous mixing steps required. So I don't see why you are focused on that.
It is similar to a sequentially executed LCG, but poorer, because in the latter case any_number is replaced by the previously generated random number. So what this method does is basically use sq as the hashkey. Note that the hash function is called every time the key for the piece on a square is required, and it has to return the same key every time. So unless you store different initialization numbers (seeds) for each square (which again means storing them in a table), this can't work.
I'm not explicitly mixing here but this will be pretty fast and avoid a lot of source code clutter with tables of random numbers.
The table could be generated at run time from a given seed, so there is no need to store the numbers in the source code. If the same random numbers are required, as in the case where the keys are used for the book too, then the seed can be fixed. Also, mixing will add a lot of code that is executed twice.
So that is the hashkey for the square, isn't it?
sq=0 gets hashkey = magic_number ^ 0
sq=1 gets hashkey = magic_number ^ 1
....
So unless you make the magic_number of FNV different for each square, this will clearly be equivalent to hashing with the sq. And if you make any_number different for each square, then you will have to store it in a table, which is what I said.
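To spell that out, a throwaway check (magic_number here is an arbitrary fixed constant):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t magic_number = 19282138887198ull * 1099511628211ull; /* any fixed product */
    for (unsigned sq = 0; sq < 64; sq++) /* all 64 keys share the same upper 58 bits */
        printf("sq %2u -> %016llx\n", sq, (unsigned long long)(magic_number ^ sq));
    return 0;
}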
The same number? Maybe that is why it's called a constant? I don't understand your point, but this is the famous Fowler/Noll/Vo hash, which has been found to work very well. Essentially we are applying the FNV hash to the tuple (color, type, square). There are actually two constants: the first is called the offset basis and is claimed to not be important, just to provide a consistent way to start things off. The second is called the FNV prime. It is a magic constant that has certain properties that are considered important for this particular hash.
No, I clearly understand what you are trying to do. There are the cp (color/piece) and (square) indexes. You had the value xored first and then multiplied with the FNV prime, but this is not how the FNV hash works. It should be reversed, which I thought was a typo. But I am not sure anymore, since you say now it should be xor first. Anyway, you are using the FNV hash twice to get a single key. The fact that the two steps are mixed makes it a bit different, but basically it will suffer from the problems I mentioned.
The same number? Maybe that is why it's called a constant? I don't understand your point, but this is the famous Fowler/Noll/Vo hash, which has been found to work very well. Essentially we are applying the FNV hash to the tuple (color, type, square). There are actually two constants: the first is called the offset basis and is claimed to not be important, just to provide a consistent way to start things off. The second is called the FNV prime. It is a magic constant that has certain properties that are considered important for this particular hash.
No, I clearly understand what you are trying to do. There are the cp (color/piece) and (square) indexes. You had the value xored first and then multiplied with the FNV prime, but this is not how the FNV hash works. It should be reversed, which I thought was a typo. But I am not sure anymore, since you say now it should be xor first. Anyway, you are using the FNV hash twice to get a single key. The fact that the two steps are mixed makes it a bit different, but basically it will suffer from the problems I mentioned.
The issue with the 2 constants would be optimized away by any compiler, and it was just a sloppy translation on my part. There are, however, 2 constants, but one is just used for initialization and is not particularly important.
There are at least two versions of this hash, and the one considered superior does the xor first; that one is called FNV-1a, if I remember correctly, and it is the one that should be used for this.
Here is my suggested usage:
h = 19282138887198ull; /* offset basis: an arbitrary starting constant */
h = h ^ color_piece; /* xor first... */
h = h * FNV_CONSTANT; /* ...then multiply: the FNV-1a order */
h = h ^ square_number;
h = h * FNV_CONSTANT;
This will come out differently for every piece on every square. The only way this varies from proper FNV-1a is that normally you look at 8 bits of key at a time - that is what is xor'd - so to do this the proper FNV way HG might wish to do an extra xor and multiply (if his key is as much as 24 bits). But I believe this would work just fine.
There is a chance it wouldn't, so it would need to be checked out, but the hash itself is considered to be reasonably sound for general hashing.
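Spelled out as a function, including the extra xor-and-multiply step for a wider input (FNV_CONSTANT is the 64-bit FNV prime; packing cp and square into one small integer is just one way to do it):

#include <stdint.h>

#define FNV_CONSTANT 1099511628211ull /* 64-bit FNV prime */

uint64_t piece_square_key(unsigned color_piece, unsigned square_number)
{
    uint64_t h = 19282138887198ull; /* the offset basis from the snippet above */
    unsigned in = (color_piece << 6) | square_number; /* pack into one small integer */
    h = (h ^ (in & 0xff)) * FNV_CONSTANT; /* low byte: xor first, then multiply */
    h = (h ^ (in >> 8)) * FNV_CONSTANT;   /* remaining bits as a second "character" */
    return h;
}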
And since 'cp' and 'square' take only a few bits, they fit into a character, avoiding the above loop. Anyway, FNV is a string hashing algorithm, so its main strength comes from hashing multiple characters. We have only two here. For instance, if you just needed a hash key for the square alone, then it would be a disaster, as I showed in my previous post. The fact that two characters (cp & square) are merged makes it workable, but I have serious doubts about the number of collisions with this method... When you use an LCG to generate hash keys, there is no correlation between (sq,cp) and the resulting hashkey. If we have doubts about a hashkey generated using rand() through that process, then the FNV one is very questionable in that regard.
Edit: I just did a _quick_ test with both FNV-1 and FNV-1a, and I was right about the possible number of collisions. FNV-1 is completely horrendous, while FNV-1a is a little better, but you can imagine how many collisions you can get Zobrist hashing with this. First column FNV-1 and second one FNV-1a. Here is the result.
Same piece placed at 64 squares. This is absolutely terrible. An xor would nullify most bits. Note that using a different piece here produces a similar result.
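For anyone who wants to reproduce it, the test was presumably something along these lines (FNV-1 multiplies before the xor, FNV-1a after; the starting constant is the one suggested earlier):

#include <stdint.h>
#include <stdio.h>

#define BASIS 19282138887198ull
#define PRIME 1099511628211ull

static uint64_t fnv1_key(unsigned cp, unsigned sq)
{
    uint64_t h = BASIS;
    h = (h * PRIME) ^ cp; /* FNV-1: multiply, then xor */
    h = (h * PRIME) ^ sq; /* the final xor only touches the low 6 bits */
    return h;
}

static uint64_t fnv1a_key(unsigned cp, unsigned sq)
{
    uint64_t h = BASIS;
    h = (h ^ cp) * PRIME; /* FNV-1a: xor, then multiply */
    h = (h ^ sq) * PRIME;
    return h;
}

int main(void)
{
    for (unsigned sq = 0; sq < 64; sq++) /* same piece, all 64 squares */
        printf("%016llx %016llx\n",
               (unsigned long long)fnv1_key(1, sq),
               (unsigned long long)fnv1a_key(1, sq));
    return 0;
}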
That is FNV-1, the version that is considered inferior because it does not mix short strings well, and it would be particularly bad for this application.
And since 'cp' and 'square' take only a few bits, they fit into a character, avoiding the above loop. Anyway, FNV is a string hashing algorithm, so its main strength comes from hashing multiple characters. We have only two here. For instance, if you just needed a hash key for the square alone, then it would be a disaster, as I showed in my previous post. The fact that two characters (cp & square) are merged makes it workable, but I have serious doubts about the number of collisions with this method... When you use an LCG to generate hash keys, there is no correlation between (sq,cp) and the resulting hashkey. If we have doubts about a hashkey generated using rand() through that process, then the FNV one is very questionable in that regard.
Edit: I just did a _quick_ test with both FNV-1 and FNV-1a, and I was right about the possible number of collisions. FNV-1 is completely horrendous, while FNV-1a is a little better, but you can imagine how many collisions you can get Zobrist hashing with this. First column FNV-1 and second one FNV-1a. Here is the result.
It does not look like a good function for HG based on this evidence, but I never said it would be. There are about 50 hashes out there, and this one looked really simple; I thought that it might possibly be adequate, but I don't think so now.
Same piece placed at 64 squares. This is absolutely terrible. An xor would nullify most bits. Note that using a different piece here produces a similar result.
It does not look like a good function for HG based on this evidence, but I never said it would be. There are about 50 hashes out there, and this one looked really simple; I thought that it might possibly be adequate, but I don't think so now.
Yes, it doesn't seem to be a good hash function. But it is not necessarily this hash function's fault. The problem might be the missing random quantity in there. It all seems a bit "calculated". When using even the simplest of LCGs, the keys all look random because a different initializer is used for each execution of the LCG. For example, after calculating the hashkey for square 0, the next hashkey for square 1 will use that as the initializer instead of the same constant! That gives a seemingly different key for square 1 that bears no relation to square 0's. Even if we used another function that mixes the bits well, we would probably still suffer from this problem. This is a conundrum, because if you make any_number different for each square, why not store the real values of the hash keys instead? I think that using pre-calculated tables is the best solution to avoid this kind of problem.
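In code the difference is just that the state is carried along - a sketch using Knuth's 64-bit LCG constants (the table layout here is hypothetical):

#include <stdint.h>

static uint64_t state = 12345; /* any fixed seed */

static uint64_t lcg_next(void)
{
    /* each generated key becomes the initializer for the next one */
    state = state * 6364136223846793005ull + 1442695040888963407ull;
    return state;
}

void fill_keys(uint64_t keys[12][64])
{
    for (int cp = 0; cp < 12; cp++)
        for (int sq = 0; sq < 64; sq++)
            keys[cp][sq] = lcg_next(); /* square 1's key depends on square 0's, and so on */
}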
hgm wrote: I have bad experience with shifting keys by only 1 bit. I tried it in a simulator, and it drove up the number of key collisions enormously. When rotating keys by [????] bits I could not detect any difference in collision statistics with truly random keys.
I have no explanation for this; it was just an observation.
Hi hgm,
Your post says "rotating by 1 bit is bad, but rotating by N bits is good" but the number N is missing.
Could you please reply with the missing number, to sate my curiosity?
Thanks in advance!
RanRot does a few rotations, and the number of bits rotated is a small prime, of course. That way you can prove how long it'll take to get back to what you started with.
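For what it's worth, a key rotation is a one-liner; the rotation count below is just an example of a small prime, since the actual number hgm used is the missing piece of information:

#include <stdint.h>

static uint64_t rotl64(uint64_t x, unsigned r) /* rotate left by r bits, 0 < r < 64 */
{
    return (x << r) | (x >> (64 - r));
}

/* e.g. derive a related key from an existing one: rotl64(key, 29) */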