Zobrist Number Statistics and WHat to Look For

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Zobrist Number Statistics and WHat to Look For

Post by diep »

syzygy wrote:
diep wrote:If we have entries like:

{R(x),x},
{R(R(R(x))),R(R(x))}
etc.

then there always will be a multilineair connection.
That results in reducing number of safetybits you have.
I have no idea what you mean by { R(x), x }.
If you are not capable of understanding how one would construct a 64 bits random number out of 2 numbers of 32 bits then things stop right here indeed.

I get this impression in a lot of what you post. Just enough buzz to keep the postings going.
We are talking about taking a 32-bit random generator and using it to construct a 64-bit random generator by generating 2 32-bit numbers and concatenating the bits to produce a 64-bit number.

If this construction results in a BAD 64-bit random generator, then the 32-bit random generator you started with is simply already BAD. Period.

If you start with a GOOD 32-bit random generator, the constructed 64-bit generator will also be GOOD.
syzygy
Posts: 5554
Joined: Tue Feb 28, 2012 11:56 pm

Re: Zobrist Number Statistics and WHat to Look For

Post by syzygy »

diep wrote:
syzygy wrote:
diep wrote:If we have entries like:

{R(x),x},
{R(R(R(x))),R(R(x))}
etc.

then there always will be a multilineair connection.
That results in reducing number of safetybits you have.
I have no idea what you mean by { R(x), x }.
If you are not capable of understanding how one would construct a 64 bits random number out of 2 numbers of 32 bits then things stop right here indeed.
I certainly do understand the latter, but your notation simply does not make sense if you intend it to express the idea of forming a 64-bit random number from two 32-bit random numbers by concatenating the bits.

What is "x" supposed to be? A 32-bit random number?
Then what is "R(x)" supposed to be? A function of x? Is the result of function R applied to argument x the 2nd 32-bit random number to be concatenated to x?

There is no pseudo-random number generator worthy of that name that works by applying a function R to the previously generated random number. If the generated number "23" is always followed by (say) the generated number R(23) = 17 (say), we are not talking about (pseudo) random numbers.

So all I can do is ignore you notation, and reformulate the concepts in a clear manner. That is precisely what I already did in my previous post.
munchkin

Re: Zobrist Number Statistics and what to look for.

Post by munchkin »

Hello All,

I just thought I'd post a bit more info on what it was I was doing.

Here is the pseudo-random number generator that I used:
(from http://en.wikipedia.org/wiki/Random_Number_Generator)

Code: Select all

#define ZOBRIST_KEYS 1586

// 1586 long ints combine to make 793 64-bit ints

// The seeds for the Zobrist random number generator

unsigned int m_z = 0x12345;
unsigned int m_w = 0xE486B;


// multiply-with-carry method invented by George Marsaglia

unsigned int GetRand(void)
{
    m_z = 36969 * (m_z & 65535) + (m_z >> 16);
    m_w = 18000 * (m_w & 65535) + (m_w >> 16);
    return &#40;m_z << 16&#41; + m_w;
&#125;
Now all I did was set one of the random number seeds to a per-determined value, then performed a loop setting the other value, and testing the resulting set of random numbers it produced.

Code: Select all

void RandSeedGen&#40;void&#41;
&#123;
  long int zobrists&#91;ZOBRIST_KEYS+1&#93;;
  int trials = 1000000;
...


do&#123;
...
     m_z = 0x12345;
     m_w = 0x67890 + trials;
     for &#40;i = 1; i <= ZOBRIST_KEYS; i++)
        zobrists&#91;i&#93; = GetRand&#40;);

     ... perform statistical analysis of zobrists here ...
     ... if fabs&#40;mean - 0.5&#41; and fabs&#40;stddev - 0.288675&#41; <= 0.00001 
            then report seeds and write list of random numbers ...


   &#125;while&#40;trials--);

&#125;
This trial of 1 million produced only three seeds that provided me with a set of numbers that met my criteria, namely that both the mean and the standard deviation would be within 0.00001 of their ideal values.

I also performed a trial of 10 million to see if I could bring the values within the range of +/- 0.000001, but I didn't find any matches. (I'm sure there must be some, but you'd have to look a lot harder than what I did.)

I'm now working on a function that will test other aspects of the set of Zobrist random numbers, one of which is XORing together all the numbers in groups of 3, 4, etc. I'll write back with whatever results I get.

To Diep: I understand the point that you are making about simply combining two unsigned integers into one 64-bit number, that there is no guarantee that the resulting 64-bit numbers will still be random. (I suppose an extreme example of this would be a list of 32-bit integers where half the numbers were odd and occupied the odd-numbered cells of the array. If you then combined all these, you'd get 64-bit numbers that would all be either even or odd, and each of them would have at least two identical bits.) Still, I don't think the risk of that is very great, but I will nevertheless run my numbers against the new test function that I'm writing to see how they do. I may also write it to provide results for the numbers as 32-bit and 64-bits.

Thanks to everyone who replied, it was an interesting discussion and I learned a lot.
syzygy
Posts: 5554
Joined: Tue Feb 28, 2012 11:56 pm

Re: Zobrist Number Statistics and what to look for.

Post by syzygy »

munchkin wrote:This trial of 1 million produced only three seeds that provided me with a set of numbers that met my criteria, namely that both the mean and the standard deviation would be within 0.00001 of their ideal values.
I hope it is clear now that you should get rid of these criteria, since they really do not make sense. A true random generator will on 99% of its runs NOT give a mean within 0.00001 of its expected value (well, it might if you are generating millions of numbers in each run, but that's not the case here).

The statistical mean is an expectation that has almost no bearing on what you should get in a single run. It will only be achieved in the limit, when you generate N numbers, take their mean m_N, and then let N approach infinity. Infinity is a mathematical concept, it does not exist in practical computer implementations.
I also performed a trial of 10 million to see if I could bring the values within the range of +/- 0.000001, but I didn't find any matches. (I'm sure there must be some, but you'd have to look a lot harder than what I did.)
You can easily make a match. If you need N numbers then generate N-1 and set aN equal to N/2 - a1 - a2 - a3 - ... - a(N-1). Now the N numbers add up to N/2, so their mean is exactly 1/2. Again, doing this is not useful at all (and in fact reduces randomness).
User avatar
Ajedrecista
Posts: 1952
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Zobrist Number Statistics and what to look for.

Post by Ajedrecista »

Hello again:
munchkin wrote:Hello All,

I just thought I'd post a bit more info on what it was I was doing.

Here is the pseudo-random number generator that I used:
(from http://en.wikipedia.org/wiki/Random_Number_Generator)

Code: Select all

#define ZOBRIST_KEYS 1586

// 1586 long ints combine to make 793 64-bit ints

// The seeds for the Zobrist random number generator

unsigned int m_z = 0x12345;
unsigned int m_w = 0xE486B;


// multiply-with-carry method invented by George Marsaglia

unsigned int GetRand&#40;void&#41;
&#123;
    m_z = 36969 * &#40;m_z & 65535&#41; + &#40;m_z >> 16&#41;;
    m_w = 18000 * &#40;m_w & 65535&#41; + &#40;m_w >> 16&#41;;
    return &#40;m_z << 16&#41; + m_w;
&#125;
Now all I did was set one of the random number seeds to a per-determined value, then performed a loop setting the other value, and testing the resulting set of random numbers it produced.

Code: Select all

void RandSeedGen&#40;void&#41;
&#123;
  long int zobrists&#91;ZOBRIST_KEYS+1&#93;;
  int trials = 1000000;
...


do&#123;
...
     m_z = 0x12345;
     m_w = 0x67890 + trials;
     for &#40;i = 1; i <= ZOBRIST_KEYS; i++)
        zobrists&#91;i&#93; = GetRand&#40;);

     ... perform statistical analysis of zobrists here ...
     ... if fabs&#40;mean - 0.5&#41; and fabs&#40;stddev - 0.288675&#41; <= 0.00001 
            then report seeds and write list of random numbers ...


   &#125;while&#40;trials--);

&#125;
This trial of 1 million produced only three seeds that provided me with a set of numbers that met my criteria, namely that both the mean and the standard deviation would be within 0.00001 of their ideal values.

I also performed a trial of 10 million to see if I could bring the values within the range of +/- 0.000001, but I didn't find any matches. (I'm sure there must be some, but you'd have to look a lot harder than what I did.)

I'm now working on a function that will test other aspects of the set of Zobrist random numbers, one of which is XORing together all the numbers in groups of 3, 4, etc. I'll write back with whatever results I get.

To Diep: I understand the point that you are making about simply combining two unsigned integers into one 64-bit number, that there is no guarantee that the resulting 64-bit numbers will still be random. (I suppose an extreme example of this would be a list of 32-bit integers where half the numbers were odd and occupied the odd-numbered cells of the array. If you then combined all these, you'd get 64-bit numbers that would all be either even or odd, and each of them would have at least two identical bits.) Still, I don't think the risk of that is very great, but I will nevertheless run my numbers against the new test function that I'm writing to see how they do. I may also write it to provide results for the numbers as 32-bit and 64-bits.

Thanks to everyone who replied, it was an interesting discussion and I learned a lot.
I agree with Ronald about the relative importance of mean and standard deviation. Please imagine that you generate N numbers but not in a random way: you generate half of the numbers (N/2) as 0.5 - 1/sqrt(12) and the other half as 0.5 + 1/sqrt(12); then the mean is 0.5 and the standard deviation is 1/sqrt(12). You have the desired values but in a not random way.

As far as I understand in this case, mean and standard deviation are useful for detecting errors in the implementation but not for validate the whole PRNG: if you have {mean, standard deviation} = {0.3, 0.2887} or {0.5, 0.03} with millions of numbers, you can be sure that something is wrong; OTOH, the fact of having the expected values or very close does not imply the rightness of the PRNG in view of the example I wrote above, although the PRNG could be right (a good start is a good start, but nothing more). You can plot the frequency of each generated number (I think it is very visual) but of course I am not an expert on PRNGs and other people have better solutions: there are tests and other things that are already posted in this thread. Good luck!

Regards from Spain.

Ajedrecista.
jdart
Posts: 4361
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Zobrist Number Statistics and WHat to Look For

Post by jdart »

In general I think using a PRNG that is intended for 64-bit output is probably preferable. Otherwise you are assuming zero correlation between adjacent outputs, which is not necessarily true, although low correlation is one attribute of a good PRNG.

One 64-bit PRNG is http://www.math.sci.hiroshima-u.ac.jp/~ ... emt64.html.

--Jon
syzygy
Posts: 5554
Joined: Tue Feb 28, 2012 11:56 pm

Re: Zobrist Number Statistics and WHat to Look For

Post by syzygy »

jdart wrote:In general I think using a PRNG that is intended for 64-bit output is probably preferable. Otherwise you are assuming zero correlation between adjacent outputs, which is not necessarily true, although low correlation is one attribute of a good PRNG.
Binary correlations between consecutive outputs will be bad in any case.

What I did a long time ago was to generate random values in a large range (probably 32 btis), mod them by a small prime greater than 255, and discard the result if it is larger than 255. Then simply concatenate the byte values. This uses even eight consecutive outputs for each Zobrist key, but binary correlations seem pretty much impossible as long as the PRNG used does not have an extremely small period.

I have to say I like Alvaro's suggestion very much:
AlvaroBegue wrote:As others have said, what you want to use is random numbers. In order to generate a table of high-quality random numbers, what I have done in the past is generate several such tables by different methods, and then XOR them together. The resulting table will be at least as random as the most random of the ingredients.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Zobrist Number Statistics and WHat to Look For

Post by diep »

syzygy wrote:
jdart wrote:In general I think using a PRNG that is intended for 64-bit output is probably preferable. Otherwise you are assuming zero correlation between adjacent outputs, which is not necessarily true, although low correlation is one attribute of a good PRNG.
Binary correlations between consecutive outputs will be bad in any case.

What I did a long time ago was to generate random values in a large range (probably 32 btis), mod them by a small prime greater than 255, and discard the result if it is larger than 255. Then simply concatenate the byte values. This uses even eight consecutive outputs for each Zobrist key, but binary correlations seem pretty much impossible as long as the PRNG used does not have an extremely small period.

I have to say I like Alvaro's suggestion very much:
AlvaroBegue wrote:As others have said, what you want to use is random numbers. In order to generate a table of high-quality random numbers, what I have done in the past is generate several such tables by different methods, and then XOR them together. The resulting table will be at least as random as the most random of the ingredients.
Generically spoken, what Alvaro usually posts works.