Writing bugs

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Writing bugs

Post by chrisw »

Ras wrote: Fri Jan 11, 2019 9:09 pm
hgm wrote: Fri Jan 11, 2019 7:14 pmNo, that cannot happen. Caching is fully transparent and coherent.
OK, that's a point, but the load-modify-store cycle is interruptible, and memory ordering is not guaranteed across cores because the execution order can vary. Not sure whether this alone could already explain the observations.
That sound like a bug in the PRNG.
No, rand() is specified as not reentrant and not thread safe. So what actually happens with multithreaded use will always depend on the libc in question. It's even legal to have the state per thread and not global.
Well, the results do appear to show that python random.randint() does perform differently. I wonder why/how.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Writing bugs

Post by mar »

chrisw wrote: Fri Jan 11, 2019 9:42 pm Well, the results do appear to show that python random.randint() does perform differently. I wonder why/how.
Like I said, this is likely due to the fact that you have to use multiprocessing in Python
Basically the calling process is forked n times (in your case 8 = number of "cores").

How fork works wrt memory: copies virtual address space of parent process (pages point to the same mapped pages as in parent process), then marks pages to trap on write access and then creates a local copy on demand in the handler, so basically "copy on write".

What happens: rand() has a global state in C, so n forks happen, "copying" the parent global state.
In C one would use threads so they would happily race on the global state and still produce pseudo-random numbers.

I'm not quite sure how Python's random is implemented, but several possibilities come to mind:
- state is stored in shared memory
- reseed after fork
- grab from system (dev/urandom etc.)
- ...

As for rand(), I would recommend to stay away from it. It's fine when you use a good standard library (gcc, ...) but Microsoft stdlib implements this as a dumb LCG (which is anything but pseudo-random) PLUS the return value is masked so that only 15 lsbits are kept => disastrous.
Martin Sedlak
Fulvio
Posts: 395
Joined: Fri Aug 12, 2016 8:43 pm

Re: Writing bugs

Post by Fulvio »

chrisw wrote: Fri Jan 11, 2019 9:42 pm Well, the results do appear to show that python random.randint() does perform differently. I wonder why/how.
Does

Code: Select all

Parallel(n_jobs=workers, verbose=1, backend="threading")
works?
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Writing bugs

Post by chrisw »

Fulvio wrote: Sat Jan 12, 2019 10:58 am
chrisw wrote: Fri Jan 11, 2019 9:42 pm Well, the results do appear to show that python random.randint() does perform differently. I wonder why/how.
Does

Code: Select all

Parallel(n_jobs=workers, verbose=1, backend="threading")
works?
No, makes no difference. See results below.
xorshift() with a seed for each thread is good though.

Code: Select all

serial ....
0 12 68 7
1 1 28 2
2 35 3 4
3 51 100 67
4 84 60 67
5 82 42 95
6 82 63 63
7 17 73 13
parallel python ....
1 66 52 67
2 17 54 13
3 81 44 69
4 2 6 6
7 45 19 53
0 77 10 30
6 24 8 82
5 10 0 3
parallel C rand() ....
0 41 67 34
1 41 67 34
2 41 67 34
3 41 67 34
4 41 67 34
5 41 67 34
6 41 67 34
7 41 67 34
parallel backend="threading" C rand() ....
0 41 67 34
1 41 67 34
3 41 67 34
5 41 67 34
7 41 67 34
4 41 67 34
2 41 67 34
6 41 67 34
parallel C xorshift() ....
0 11 14 44
1 68 83 32
2 70 86 69
3 38 0 61
4 29 39 85
5 49 50 97
6 65 72 97
7 48 6 11

Code: Select all


// lazy seeds, no reason why rand() can't be used to initialise seeds whenever, whenever
static u64 rand_state[MAX_THREADS] = { 11185,245,386,40,57584,6152544,745575, 6584845 };

u64 xorshift64star(int thread_id) 
{
	u64 x = rand_state[thread_id];	/* The state must be seeded with a nonzero value. */
	x ^= x >> 12; // a
	x ^= x << 25; // b
	x ^= x >> 27; // c
	rand_state[thread_id] = x;
	return x * 0x2545F4914F6CDD1D;
}


// test random function, designed to be thread safe, requires thread_id
PyObject* safe_rand64_impl(PyObject *dummy, PyObject* p)
{
	int thread_id = PyLong_AsLong(PyTuple_GetItem(p, ((Py_ssize_t)0)));
	u64 x = PyLong_AsUnsignedLongLong(PyTuple_GetItem(p, ((Py_ssize_t)1)));
	u64 v = xorshift64star(thread_id);
	v = v % x;
	v = (int)v;
	return PyLong_FromLong(v);
}

// rand() is NOT thread safe
PyObject* rand64_impl(PyObject *dummy, PyObject* o)
{
	int x = PyLong_AsLong(o);
	int y = rand();
	int z = y % x;
	return PyLong_FromLong(z);
}

Code: Select all


def print_something_random(i, j):
    if (j==0):
        x = random.randint(0, 100)
        y = random.randint(0, 100)
        z = random.randint(0, 100)
    else:
        if (j==1):
            x = BB.rand64(100)
            y = BB.rand64(100)
            z = BB.rand64(100)
        else:
            x = BB.safe_rand64(i, 100)
            y = BB.safe_rand64(i, 100)
            z = BB.safe_rand64(i, 100)
    print(i, x, y, z)
    return

def parallel_test():
    workers = 8
    
    print('serial ....')
    for i in range(workers):
        print_something_random(i, 0)
    time.sleep(1)

    print('parallel python ....')
    Parallel(n_jobs=workers)(delayed(print_something_random)(thread_id, 0) for thread_id in range(workers))
    time.sleep(1)

    print('parallel C rand() ....')
    Parallel(n_jobs=workers)(delayed(print_something_random)(thread_id, 1) for thread_id in range(workers))
    time.sleep(1)

    print('parallel backend="threading" C rand() ....')
    Parallel(n_jobs=workers, backend="threading")(delayed(print_something_random)(thread_id, 1) for thread_id in range(workers))
    time.sleep(1)

    print('parallel C xorshift() ....')
    Parallel(n_jobs=workers)(delayed(print_something_random)(thread_id, 2) for thread_id in range(workers))
    
    return
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Writing bugs

Post by Ras »

mar wrote: Sat Jan 12, 2019 5:57 ambut Microsoft stdlib implements this as a dumb LCG (which is anything but pseudo-random) PLUS the return value is masked so that only 15 lsbits are kept => disastrous.
Here is what ISO/IEC 9899 suggests as portable example implementation:

Code: Select all

static unsigned long int next = 1;

int rand(void) // RAND_MAX assumed to be 32767
{
    next = next * 1103515245 + 12345;
    return (unsigned int)(next/65536) % 32768;
}

void srand(unsigned int seed)
{
    next = seed;
}
The reason for the masking is that otherwise, no numbers will repeat throughout the whole period of (in this example) 2^31. Therefore, anything somehow related to the birthday paradoxon would fail. Also, any application that just wants a random yes/no and tests that for checking the LSB of the output would yield an alternating 0/1 sequence. The TYPE_0 implementation in glibc has these problems.

An LGC has the main advantage of being fast to compute, and I don't see what would be "disastrous" unless used for some cryptographic purpose where rand() should never be used at all. For initialising some engine weights, it's perfectly fine.
Rasmus Althoff
https://www.ct800.net
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Writing bugs

Post by mar »

Ras wrote: Sat Jan 12, 2019 3:04 pm The reason for the masking is that otherwise, no numbers will repeat throughout the whole period of (in this example) 2^31. Therefore, anything somehow related to the birthday paradoxon would fail. Also, any application that just wants a random yes/no and tests that for checking the LSB of the output would yield an alternating 0/1 sequence. The TYPE_0 implementation in glibc has these problems.

An LGC has the main advantage of being fast to compute, and I don't see what would be "disastrous" unless used for some cryptographic purpose where rand() should never be used at all. For initialising some engine weights, it's perfectly fine.
This is no excuse for using a bad PRNG.

Here's my old public domain PRNG that passes all dieharder tests, is blazingly fast on a 64-bit machine (providing the compiler can fold rotations I get ~ 1 billion 64-bit HQ PR integers per second), burst-free, doesn't break on zero IV, not crypto-secure of course.
Internal state is 128 bits.

LCG is still faster but is anything even remotely close to pseudo-random. It's actually a pile of crap.
I still don't follow how masking is useful? You can always mask yourself.
Also by requiring more samples from a bad PRNG to glue larger PR numbers, the quality decreases even more. LCG is a no-go for anyone who cares.

Code: Select all

struct PRNG 
{ 
   static const ULong C0 = 0xc5462216u | ((ULong)0xcf14f4ebu<<32); 
   static const ULong C1 = 0x75ecfc58u | ((ULong)0x9576080cu<<32); 

   ULong keys[2];      // IV 

   static ULong Rotate( ULong v, Byte s ) { 
      return (v >> s) | (v << (64-s)); 
   } 

   ULong Next() 
   { 
      ULong tmp = keys[0]; 
      keys[0] += Rotate( keys[1] ^ C0, 1 ); 
      return keys[1] += Rotate( tmp ^ C1, 9 ); 
   } 

   void Seed( ULong val ) 
   { 
      keys[0] = keys[1] = val; 
      for ( int i=0; i<64; i++ ) { 
         Next(); 
      } 
   } 
};
Martin Sedlak
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Writing bugs

Post by Ras »

mar wrote: Sat Jan 12, 2019 3:50 pmThis is no excuse for using a bad PRNG.
It isn't "bad", it's at worst unsuited for certain applications. The one in question here simply doesn't belong to them. The issues that Chris was experiencing had nothing to do with the LGC, but with how the state is managed.
I still don't follow how masking is useful? You can always mask yourself.
Now that's bad practice because then you have to fumble out the implementation, which is not how a library is supposed to be used.
Rasmus Althoff
https://www.ct800.net
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Writing bugs

Post by mar »

Ras wrote: Sat Jan 12, 2019 5:22 pm It isn't "bad", it's at worst unsuited for certain applications. The one in question here simply doesn't belong to them. The issues that Chris was experiencing had nothing to do with the LGC, but with how the state is managed.
Yes, I already explained the issue he was having.
As for wordplay, if failing most statistical tests isn't bad then I wonder what is :)
Martin Sedlak
grahamj
Posts: 43
Joined: Thu Oct 11, 2018 2:26 pm
Full name: Graham Jones

Re: Writing bugs

Post by grahamj »

Ras wrote: Sat Jan 12, 2019 5:22 pm
mar wrote: Sat Jan 12, 2019 3:50 pmThis is no excuse for using a bad PRNG.
It isn't "bad", it's at worst unsuited for certain applications.
A very long time ago, probably last century, I wanted to make a random texture for artistic purposes. I knew that Microsoft's rand() was bad, but I thought it would be good enough just to make a random texture. The result was disastrous. Microsoft's rand() is also 'unsuited' for any serious statistcal application. Possibly Chris has a situation where it is OK.
Graham Jones, www.indriid.com
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Writing bugs

Post by mar »

grahamj wrote: Sat Jan 12, 2019 5:59 pm A very long time ago, probably last century, I wanted to make a random texture for artistic purposes. I knew that Microsoft's rand() was bad, but I thought it would be good enough just to make a random texture. The result was disastrous.
You mean something like this? :)

LCG (microsoft), 256x256 8-bit composed by generating each bit using a separate call (aka "not bad"):
Image

simple KISS does slightly better:
Image

note that grabbing 8 lsbits of LCG with 32-bit accumulator directly starts to break at 1024x1024
glibc doesn't use LCG by default (which is a good thing!) and doesn't suffer from ugly patterns

source code: http://www.crabaware.com/Test/rand_test_small.cpp
Martin Sedlak