Comparing results

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Comparing results

Post by Edmund »

That would be fine in normal circumstances. But for a hashtable with millions of entries, isn't that wasting millions of bytes? Wouldn't it be worth the extra effort in return for a larger table?
Maybe you want to try what hgm is suggesting, ie having several entries per index. Or you add more information per entry (eg lower bound value/upper bound value, age, etc.)
Hmmm, ok I did not know that. When a local variable is created in a method, does it take just as long to access that method every time? Because I have never made such considerations for memory access while trying to optimize the search.

I use a bunch of rather unnecessary counters/stacks for more verbose thinking, never thinking those could be a bottleneck. I mean compared to all that move generation, evaluation, recursion, etc. these counters should hardly matter..... right?
Recently used variables are cached by the processor.
Have a look at: http://en.wikipedia.org/wiki/CPU_cache
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Comparing results

Post by jwes »

hgm wrote:With a 4-byt hash, you will have a collision that would have been avoided using the full key every 2^32 probes. At 1Mnps, that means every 4 million seconds, or about once a month.
Isn't that every 4000 seconds, or about once an hour.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Comparing results

Post by hgm »

You are correct. Anyway, negligible. Every 5 seconds, with a 3-byte signature, would be more taxing.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Comparing results

Post by bob »

hgm wrote:With a 4-byt hash, you will have a collision that would have been avoided using the full key every 2^32 probes. At 1Mnps, that means every 4 million seconds, or about once a month.

Unless a collission can crash your engine, even having many thousands of collissions in a single search does not measurbly affect performance. (Perhaps Bob could give us more exact numbers here.)

In Joker I use 9-byte entries, including a 4-byte signature. I never tried to shrink it to 8 bytes. With a 3 bytes signature, I would have a collission every 5 seconds (at 1 Mnps), as Joker probes 3 entries per access. This might still be bearable.

But there is very little gain to be expected from this. Currently Joker has 7 entries per 64-byte cache line. That could be increased to 8 entries. That is 14% more entries, which would lead to about 1% faster search. (The search speed scales as the twelfth root of hash sizes.)
First, on entry size.

I use 16 byte entries. I store the entire 64 bit Zobrist signature even though this is redundant since I use the rightmost N (N = log2(table size)) bits to address an entry. In Cray Blitz we stored the leftmost 32 bits, because we used the rightmost N (N was always > 24) which was almost using the entire 64 bits. Why store the whole thing? Simplicity and laziness. :) We used 12 byte entries in Cray Blitz, 64 bits for the best move, value/bound, draft/depth, flags, age, etc, the other 32 bits from the upper half of the hash key. That gives you 33% more entries than just using the entire 16 bytes.

Personal choice.

As far as collision errors go, Cozzie and I found that one collision every 1000 nodes did not appreciably affect the final search result. Anything rarer than that could not even be measured as to its effect on the final best move and score.

The only significant issue is the one you mentioned, that is your engine has to be resistant to bogus data, so that it won't crash. If an invalid best move will make you castle after you have already castled, you can have a problem with 2 kings of the same side on the board. If you can prevent bogus data from crashing your program, you can pretty well ignore any collision effect for reasonable hashing approaches.

Our trees are so incredibly big, it is sometimes astounding what kinds of errors the search can hide. I've been playing with some aggressive pruning changes and many of them that I really think should be bad seem to have very small effects on overall performance.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Comparing results

Post by hgm »

So actually it would be theoretically possible to get away witj only a 2-byte signature, even with hash buckets of 4: that would cause a collission every 2^14 = 16K probes, 16 times less frequent than the 1-in-1,000 which you report to have o measurable effect.
User avatar
Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: Comparing results

Post by Zach Wegner »

bob wrote:As far as collision errors go, Cozzie and I found that one collision every 1000 nodes did not appreciably affect the final search result. Anything rarer than that could not even be measured as to its effect on the final best move and score.
Of course I remember this study, but I think it might be about time to gather new data by comparing X bits of hashkey on hashprobes, then running zillions of cluster games. I would bet that 1 in 1000 has some sort of effect.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Comparing results

Post by bob »

hgm wrote:So actually it would be theoretically possible to get away witj only a 2-byte signature, even with hash buckets of 4: that would cause a collission every 2^14 = 16K probes, 16 times less frequent than the 1-in-1,000 which you report to have o measurable effect.
remember how we tested. We took a bunch of positions, and ran each with the "normal" programs, and then with the "collision-prone" programs (we could set a parameter to control how often we caused a wrong match). We discovered that it took a _lot_ of collisions to change the score or best move, and a lot more than that to make a "game-outcome-changing" switch to a worse move.

I'd bet 32 bits will work perfectly acceptably for complete games. But for our test positions, I think things look more favorable than they might for a full game, since the errors will add up thru a complete game increasing the probability of something going wrong...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Comparing results

Post by bob »

Zach Wegner wrote:
bob wrote:As far as collision errors go, Cozzie and I found that one collision every 1000 nodes did not appreciably affect the final search result. Anything rarer than that could not even be measured as to its effect on the final best move and score.
Of course I remember this study, but I think it might be about time to gather new data by comparing X bits of hashkey on hashprobes, then running zillions of cluster games. I would bet that 1 in 1000 has some sort of effect.
I could certainly test that and it would be interesting. I'll dig up the old version where I introduced the errors and try this test, varying the number of errors until it begins to make the results worse over 32K game matches...

not a bad idea...
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Comparing results

Post by hgm »

I suggest you induce the collissions by simply only using the N (=10, 11, 12, ...) least-significant bits of the sgnature when comparing for a match.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Comparing results

Post by bob »

hgm wrote:I suggest you induce the collissions by simply only using the N (=10, 11, 12, ...) least-significant bits of the sgnature when comparing for a match.
Didn't work very well. Since you are probing at the right address, most of the time you get a real match anyway. We originally tried that but later had to fudge better and when we wanted an error, we used a different set of bits from the signature to give a probe address, then assumed a match no matter what...