Writing bugs

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
Ras
Posts: 1161
Joined: Tue Aug 30, 2016 6:19 pm
Contact:

Re: Writing bugs

Post by Ras » Sat Jan 12, 2019 11:30 pm

mar wrote:
Sat Jan 12, 2019 7:46 pm
(aka "not bad")
Completely irrelevant for this application, and also bear in mind that it's mod'ed down to have the desired small variation. For image and audio processing, that would be another issue, of course, especially where you don't mod down.
Rasmus Althoff
https://www.ct800.net

User avatar
lucasart
Posts: 3046
Joined: Mon May 31, 2010 11:29 am
Full name: lucasart
Contact:

Re: Writing bugs

Post by lucasart » Sun Jan 13, 2019 12:37 am

Coming back to the thread-safety problem, I think the real design mistake of C was to define rand().

Instead, a better way, is to make it re-entrant: rand(&seed). That way the ownership of the seed is pushed to the caller, and if the caller wants to make seed thread local, or chooses to use a global and have racy code, that's the caller's problem. Or maybe the caller wants to have a global seed with a lock?

For example:

Code: Select all

// SplitMix64 PRNG, by Sebastiano Vigna: http://xoroshiro.di.unimi.it/splitmix64.c
static uint64_t prng(uint64_t *state)
{
    uint64_t s = (*state += 0x9E3779B97F4A7C15ULL);
    s = (s ^ (s >> 30)) * 0xBF58476D1CE4E5B9ULL;
    s = (s ^ (s >> 27)) * 0x94D049BB133111EBULL;
    s ^= s >> 31;
    assert(s);  // We cannot have a zero key for zobrist hashing. If it happens, change the seed.
    return s;
}
Same goes for C's strtok(). Making it non re-entrant is not only a thread safety disaster, but even single threaded code with nested function calls will break.

Some of C's stdlib is just crap and should never be used in any serious program. Fortunately, these functions are trivial to replace by one's own code.

State is the root of many evils in programming. So write stateless code.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

grahamj
Posts: 38
Joined: Thu Oct 11, 2018 12:26 pm
Full name: Graham Jones

Re: Writing bugs

Post by grahamj » Sun Jan 13, 2019 8:07 am

mar wrote:
Sat Jan 12, 2019 7:46 pm
You mean something like this? :)
:)
Graham Jones, www.indriid.com

Michel
Posts: 2052
Joined: Sun Sep 28, 2008 11:50 pm

Re: Writing bugs

Post by Michel » Sun Jan 13, 2019 8:57 am

State is fine but it should be encapsulated in an object. Like so:

Code: Select all

rg=random_generator(seed)
r=rg.rand()
The state maintained by the rg object could be much more complicated than a single number (this would likely be the case for a cryptographically secure random generator). The rg object completely hides this state from the user.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.

mar
Posts: 2011
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Writing bugs

Post by mar » Sun Jan 13, 2019 9:03 am

Ras wrote:
Sat Jan 12, 2019 11:30 pm
Completely irrelevant for this application, and also bear in mind that it's mod'ed down to have the desired small variation. For image and audio processing, that would be another issue, of course, especially where you don't mod down.
There's no variation whatsoever, the modulo simply reduces the output to 15 bits (not useful, wasteful in fact).
Anything sized 2^n is automatically uniform distribution (if your PRNG is any good of course).

I simply generated 8-bit white noise bit by bit for visualization purposes.
Slicing the output of a bad PRNG pronounces the problem even more which is clearly shown in the images.

I still don't understand why you defend LCG so vehemently. It's bad whether you like it or not.
There're much better options (pick any KISS you want) that would satisfy 99.9% cases minus crypto
and this is what I'd prefer anytime over a bad PRNG.

The original problem was neither rng nor thread safety but simply multiprocessing instead of multithreading.
LCG/KISS don't care about data races on x86 apart from slowdown due to false sharing.
You'd still get the expected pseudo-random results with multithreading though.
Martin Sedlak

User avatar
lucasart
Posts: 3046
Joined: Mon May 31, 2010 11:29 am
Full name: lucasart
Contact:

Re: Writing bugs

Post by lucasart » Sun Jan 13, 2019 9:49 am

Michel wrote:
Sun Jan 13, 2019 8:57 am
State is fine but it should be encapsulated in an object. Like so:

Code: Select all

rg=random_generator(seed)
r=rg.rand()
The state maintained by the rg object could be much more complicated than a single number (this would likely be the case for a cryptographically secure random generator). The rg object completely hides this state from the user.
you do realise that rg.rand() is C++ syntactic sugar for C's rand(&rg) ?

In other words re-entrant function in C = method in C++

sure the state can be more than 64-bit but that does not require object orientation (which does not exist in C, and C's rand() was the topic of debate). just rand(&state), and state is of type 'struct State' which contains whatever it needs to contain.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

Michel
Posts: 2052
Joined: Sun Sep 28, 2008 11:50 pm

Re: Writing bugs

Post by Michel » Sun Jan 13, 2019 10:09 am

Of course it is obvious how to emulate the correct object oriented design in C in this simple case. That was left as an exercise to the reader.

You were claiming "state is evil" (as usual a statement devoid of any nuance). I am claiming instead: "there is no problem with state as long as it is properly encapsulated".
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.

Ras
Posts: 1161
Joined: Tue Aug 30, 2016 6:19 pm
Contact:

Re: Writing bugs

Post by Ras » Sun Jan 13, 2019 2:28 pm

lucasart wrote:
Sun Jan 13, 2019 12:37 am
Instead, a better way, is to make it re-entrant: rand(&seed).
POSIX-2001 has rand_r for this, but it was declared as obsolete in POSIX-2008 - any idea why?

mar wrote:
Sun Jan 13, 2019 9:03 am
There's no variation whatsoever, the modulo simply reduces the output to 15 bits
I meant in actual usage. When you want a small variation, say +/- 50 centipawns, then the usual idiom is to mod the rand() result down like in

Code: Select all

noise = rand()%101 - 50;
(not useful, wasteful in fact).
The mod down inside rand is useful in fact, and the C ISO standard suggests this for good reason. It would be nonsense to hand out the LCG 32 bit value where the LSB is regularly alternating. Also, RAND_MAX is only specified as at least 32767 anyway in the C standard because an int is only guaranteed to be at least 16 bits.
I still don't understand why you defend LCG so vehemently. It's bad whether you like it or not.
Because the notion that "it's bad" is just nonsense. It's unsuited for certain applications, and the one here just is not one of them.

Besides, the moiré image issue with MS' rand() isn't even the LGC, it's that they didn't just copy the suggestion from the ISO standard and used 214013 instead of 1103515245 as multiplier, which is too small. No surprise that MS invents something of its own and botches it up, they do that all the time. Use the one from the ISO standard, and you don't get a moiré image, see attachment.
Attachments
lgc.png
lgc.png (95.37 KiB) Viewed 1901 times
Rasmus Althoff
https://www.ct800.net

mar
Posts: 2011
Joined: Fri Nov 26, 2010 1:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Writing bugs

Post by mar » Sun Jan 13, 2019 3:08 pm

Ras wrote:
Sun Jan 13, 2019 2:28 pm
I meant in actual usage. When you want a small variation, say +/- 50 centipawns, then the usual idiom is to mod the rand() result down like in

Code: Select all

noise = rand()%101 - 50;
Ok, not to mention that getting a random in interval by using modulo is not a good idea, you don't get uniform distribution this way unless the value you modulate with is a power of two. It gets worse the smaller the output range is, especially bad if RAND_MAX is only 32k.
The mod down inside rand is useful in fact, and the C ISO standard suggests this for good reason. It would be nonsense to hand out the LCG 32 bit value where the LSB is regularly alternating. Also, RAND_MAX is only specified as at least 32767 anyway in the C standard because an int is only guaranteed to be at least 16 bits.
Yes, legacy limits.
If it behaves this way (alternating LSBit), then it's simply a bad PRNG, sorry.
Anyone who suggests LCG as a standard way to generate PR sequences probably has no clue about PRNGs in general.
Because the notion that "it's bad" is just nonsense. It's unsuited for certain applications, and the one here just is not one of them.

Besides, the moiré image issue with MS' rand() isn't even the LGC, it's that they didn't just copy the suggestion from the ISO standard and used 214013 instead of 1103515245 as multiplier, which is too small. No surprise that MS invents something of its own and botches it up, they do that all the time. Use the one from the ISO standard, and you don't get a moiré image, see attachment.
How many times do I have to show you that LCG is bad :) You claim it's nonsense, that's your strongest argument I suppose.
"nonsense" + zero data to back it up.

I wonder how you got that image but you couldn't possibly have obtained it by composing the 8-bit final grayscale value bit-by-bit (hence "cheating" :) If you simply output the lowest 8 bits then I suggest you try 1024x1024 or 2048x2048 images, then you'll see for yourself.

This is what I get with the ISO multiplier, not much better (still 8 PRNG calls for each byte):
Image
(I updated the source code)

Or if you want to prove me wrong, just take your ISO LCG and run it through some statistical test suites, like diehard or dieharder. Then come back and tell me how good LCGs are :)

So let's agree to disagree. You don't care about the quality of PR sequences while I do, it's probably as simple as that.
Martin Sedlak

Sesse
Posts: 180
Joined: Mon Apr 30, 2018 9:51 pm
Contact:

Re: Writing bugs

Post by Sesse » Sun Jan 13, 2019 11:53 pm

Michel wrote:
Sun Jan 13, 2019 8:57 am
State is fine but it should be encapsulated in an object. Like so:

Code: Select all

rg=random_generator(seed)
r=rg.rand()
The state maintained by the rg object could be much more complicated than a single number (this would likely be the case for a cryptographically secure random generator). The rg object completely hides this state from the user.
https://en.cppreference.com/w/cpp/numeric/random

Post Reply