Many thanks in advance to anybody who can provide some guidance on this.
In a nutshell, the 32-bit and 64-bit versions of Myrddin have always given very different results in analysis mode -- not just move count but PV and sometimes even best move. It's only now that I've decided to devote some time to the problem.
I thought it might be the compile setup, since the two versions are compiled on two different machines (but with the same compiler, Visual Studio 2010, and the same compile/link settings). But since Jim Ablett's compiles show the same issue, I am starting to suspect the code itself, as I seem to recall that Jim does not exclusively use VS for his builds.
Jim pointed me to a thread from a couple of years ago about Stockfish exhibiting the same problem, but that was due to an MS library sort function, which Myrddin does not use. I've searched through all of Myrddin's code many times and cannot find any 64-bit specific code, and I'm just not familiar enough with the MS libraries to guess at which functions might be causing this problem.
Again, any help will be very much appreciated (you'll be mentioned in the release notes!)
jm
64-bit and 32-bit exes producing different results
-
- Posts: 1357
- Joined: Wed Mar 08, 2006 10:15 pm
- Location: San Francisco, California
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: 64-bit and 32-bit exes producing different results
Do you use any external libraries which might change the behavior of your program? For example random number generators.
Failing that, it means it's something internal to your code. In that case, you probably have a bug somewhere (accessing uninitialized memory or an invalid memory location for example).
-
- Posts: 1357
- Joined: Wed Mar 08, 2006 10:15 pm
- Location: San Francisco, California
Re: 64-bit and 32-bit exes producing different results
rbarreira wrote: "Do you use any external libraries which might change the behavior of your program? For example random number generators."
I do not. All of my zobrist hashing values are in a fixed table.
As I type this, Andrew Fan (Firefly) is looking at it. His debug builds show identical behavior, but the release builds do not. Which is even stranger because there is definitely no debug-specific code -- not even asserts.
The mystery deepens....
jm
-
- Posts: 879
- Joined: Mon Dec 15, 2008 11:45 am
Re: 64-bit and 32-bit exes producing different results
@Ricardo
I thought of this too, but the MS compiler will warn you if you run in debug mode and there is an access to uninitialized memory (variable).
I am not sure of the exact circumstances, though, so it might be a good point to start.
@John
Another idea is to check implicit casts, or more generally how the types behave.
Structure alignment can also differ, I think, which can cause all kinds of problems when using the sizeof operator.
And many other things of course....
So before thinking longer about the problem, I want to ask how to understand this statement: "...different results in analysis mode -- not just move count but ..."
How do you compare move count in analysis mode?
Or the other way around: do you get different node counts, or different results in any form, when you do fixed-depth searches?
Michael
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: 64-bit and 32-bit exes producing different results
If everything else fails, you can always try disabling parts of the program (for example extensions, quiescence search, etc.) and after disabling a lot of stuff, seeing if the node counts match then. If they do, you can start enabling parts one by one until the two versions start behaving differently.
The last part you enabled must contain/trigger the problem, which will give further clues or even make the problem obvious.
Last edited by rbarreira on Tue Jul 26, 2011 11:10 pm, edited 1 time in total.
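The enable-one-part-at-a-time approach described above can be sketched roughly as follows. Everything here is hypothetical: `node_count_fn` stands in for "run one build to a fixed depth with this set of features enabled and return its node count" (in practice a wrapper around each exe), and the two `fake_build_*` functions are stubs that only exist to make the sketch self-contained. It is not Myrddin code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: returns the node count of one build for a fixed-depth search,
   with bit i of feature_mask meaning "feature i is enabled". */
typedef uint64_t (*node_count_fn)(unsigned feature_mask);

/* Enable features one by one and report the first one that makes the two
   builds disagree. Returns -1 if they never disagree, and -2 if they already
   disagree with every feature switched off. */
static int first_diverging_feature(node_count_fn build_a, node_count_fn build_b,
                                   int feature_count)
{
    unsigned mask = 0;
    int i;

    assert(feature_count >= 0 && feature_count < 32);
    if (build_a(mask) != build_b(mask))
        return -2;
    for (i = 0; i < feature_count; ++i) {
        mask |= 1u << i;                 /* switch on feature i, keep earlier ones on */
        if (build_a(mask) != build_b(mask))
            return i;
    }
    return -1;
}

/* Stub "builds" for demonstration only: they agree unless feature 2 is on. */
static uint64_t fake_build_a(unsigned mask) { return 1000u + mask; }
static uint64_t fake_build_b(unsigned mask) { return 1000u + mask + ((mask & 4u) ? 1u : 0u); }
```

With the stubs, `first_diverging_feature(fake_build_a, fake_build_b, 5)` points at feature 2. The feature found this way is only where the difference first becomes visible; the actual bug may sit in code that the feature merely exercises.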
-
- Posts: 1357
- Joined: Wed Mar 08, 2006 10:15 pm
- Location: San Francisco, California
Re: 64-bit and 32-bit exes producing different results
Desperado wrote: "How do you compare movecount in analysis mode? Or the other way around, did you get different nodecounts, different results in any form when you doing fix depth searches?"
By node count difference, I mean that the node count shown for a specific PV output differs, even if all else is the same. Andrew is still poking, and he is now saying that:
1) Release and Debug builds do not show the same behavior as each other, for both 32-bit and 64-bit.
2) 32-bit and 64-bit Release builds also differ.
3) 32-bit and 64-bit Debug builds are identical.
Investigation continues....
jm
-
- Posts: 879
- Joined: Mon Dec 15, 2008 11:45 am
Re: 64-bit and 32-bit exes producing different results
Because it is not so easy for me to express myself very well in English,
I will just give some short examples of the error types I'm thinking of.
Code:
guess 1: uninitialized memory
====================================
void example(void)
{
    int value;   // never initialized: it starts with whatever was in memory
    value++; ...
}
guess 2: unintended use of memory
====================================
void example(void)
{
    //bishop
    while(tmp) {src=bsf64(tmp); value+=pst[bishop][src]; tmp&=tmp-1;} // tmp&=tmp-1 clears the bit just scanned
    //king
    value += pst[king][src]; // bug: reuses the stale src, should be pst[king][posKing]
}
guess 3: structure padding
=====================================
struct test_t
{
    ui08_t a;
    ui32_t b;
    ui16_t c;
    ui08_t d;
};
The sizeof operator will report 12 bytes (not 8)!
I know this is all about guessing (and there are many more possibilities), but at least the first two examples are
_typical_ for debug/release differences. Also for 32/64 bit differences.
So just some ideas to start with.
Now if there is a bug in this category, then I think the only way to get rid
of it is to enable/disable parts of the code step by step.
Michael
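To see the padding from guess 3 directly, one can ask the compiler where it placed each member. This is a minimal sketch, assuming the `ui08_t`/`ui16_t`/`ui32_t` names above correspond to the standard fixed-width types; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>
#include <stdio.h>

/* Same member order as the example above, using <stdint.h> types (assumption). */
struct test_t {
    uint8_t  a;
    uint32_t b;
    uint16_t c;
    uint8_t  d;
};

/* Sum of the member sizes, i.e. the size if there were no padding at all. */
static size_t test_t_unpadded_size(void)
{
    return sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint16_t) + sizeof(uint8_t);
}

/* Print where the compiler actually put each member, and the total size. */
static void print_test_t_layout(void)
{
    assert(sizeof(struct test_t) >= test_t_unpadded_size());
    printf("offsets: a=%u b=%u c=%u d=%u\n",
           (unsigned)offsetof(struct test_t, a),
           (unsigned)offsetof(struct test_t, b),
           (unsigned)offsetof(struct test_t, c),
           (unsigned)offsetof(struct test_t, d));
    printf("sizeof=%u, sum of members=%u\n",
           (unsigned)sizeof(struct test_t),
           (unsigned)test_t_unpadded_size());
}
```

On the usual x86 and x64 ABIs, `b` is aligned to 4 bytes, so three padding bytes follow `a` and one more follows `d`, which is where the 12 comes from. Reordering the members from largest to smallest would typically remove the padding.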
-
- Posts: 1357
- Joined: Wed Mar 08, 2006 10:15 pm
- Location: San Francisco, California
Re: 64-bit and 32-bit exes producing different results
I understood you perfectly, so no problem. I know that there is no problem with your example #1, and I'm aware of the issue with your example #3, but not entirely sure how it might cause problems.
Andrew determined that if you turn off my hash code completely, the problem goes away. So now I have a place to target my efforts, since my hash code is pretty simple.
Just to make sure, does this code look like it might cause a problem?
Code:
void SaveHash(CHESSMOVE *cmMove, int nDepth, int nEval, BYTE nFlags, int nPly, PosSignature dwSignature)
{
    PosSignature index = (dwSignature & (dwHashSize - 1));
    HASH_ENTRY *pentry = HashTable + index;
    ....
where PosSignature is defined as a DWORD in both 32-bit and 64-bit, and HASH_ENTRY is defined as:
Code:
typedef struct HASH_ENTRY
{
    PosSignature dwSignature;
    WORD nAge;
    short nEval;
    MoveFlagType moveflag;
    BYTE nFlags;
    BYTE nDepth;
    SquareType from;
    SquareType to;
} HASH_ENTRY;
and MoveFlagType is an unsigned short and SquareType is an unsigned char? So, without padding, one hash entry is 14 bytes, and is padded up to 16 bytes?
dwHashSize is the total number of entries in the hash table.
Many thanks,
jm
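One way to check the 14-versus-16-byte question is to let the compiler confirm it at build time. The sketch below assumes a C11 compiler (for `static_assert`; Visual Studio 2010 does not provide it) and uses `<stdint.h>` typedefs as stand-ins for the Windows `DWORD`, `WORD` and `BYTE` types; `hash_entry_unpadded_size` is a made-up helper.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the Windows typedefs (assumption, for illustration only). */
typedef uint32_t DWORD;
typedef uint16_t WORD;
typedef uint8_t  BYTE;
typedef DWORD    PosSignature;
typedef uint16_t MoveFlagType;   /* "unsigned short" in the post */
typedef uint8_t  SquareType;     /* "unsigned char" in the post */

typedef struct HASH_ENTRY
{
    PosSignature dwSignature;
    WORD         nAge;
    short        nEval;
    MoveFlagType moveflag;
    BYTE         nFlags;
    BYTE         nDepth;
    SquareType   from;
    SquareType   to;
} HASH_ENTRY;

/* Break the build if the entry is ever not the size the hash code assumes. */
static_assert(sizeof(HASH_ENTRY) == 16, "HASH_ENTRY is expected to be 16 bytes");

/* Sum of the member sizes, i.e. the size with no padding. */
static size_t hash_entry_unpadded_size(void)
{
    return sizeof(PosSignature) + sizeof(WORD) + sizeof(short) + sizeof(MoveFlagType)
         + sizeof(BYTE) + sizeof(BYTE) + sizeof(SquareType) + sizeof(SquareType);
}
```

Since the structure contains no pointers or other members whose size depends on the target, one would expect the same 16-byte layout in the 32-bit and 64-bit builds, which suggests looking at how `dwHashSize` is computed and at the rest of the hash code rather than at the structure itself.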
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: 64-bit and 32-bit exes producing different results
How do you calculate dwHashSize? Hopefully you're not assuming that sizeof (HASH_ENTRY) is a power of 2, or that it is the same for both versions.
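For reference, a minimal sketch of deriving an entry count that is always a power of two from a memory budget, whatever `sizeof (HASH_ENTRY)` turns out to be. The function names and the idea of a byte budget are assumptions for illustration; the thread does not show how Myrddin computes `dwHashSize`.

```c
#include <assert.h>
#include <stddef.h>

/* Largest power of two that is <= n, or 0 when n is 0. */
static size_t floor_pow2(size_t n)
{
    size_t p = 1;

    if (n == 0)
        return 0;
    while (p <= n / 2)
        p *= 2;
    return p;
}

/* Number of table entries for a byte budget: as many entries as fit,
   rounded down to a power of two so that (signature & (entries - 1)) can
   reach every slot. */
static size_t hash_entries_for_budget(size_t budget_bytes, size_t entry_size)
{
    assert(entry_size > 0);
    return floor_pow2(budget_bytes / entry_size);
}
```

With a 1 MB budget this gives 65536 entries for a 16-byte entry and also 65536 for a 12-byte entry (87381 would fit, rounded down), so the mask stays valid even if the entry size is not a power of two.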
-
- Posts: 879
- Joined: Mon Dec 15, 2008 11:45 am
Re: 64-bit and 32-bit exes producing different results
Just a quick idea before I go to bed.
The problem may be caused by the _&_ operation when
_dwHashSize_ is no longer a power of 2.
My example is padded to 12 bytes (not 8). If I used it without
knowing about the issue, my _dwHashSize_ would not be a power of 2
and the & operation would fail.
Michael
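A small sketch of what "the & operation would fail" means in practice: with `index = signature & (size - 1)`, a size that is not a power of two leaves some slots unreachable. The helper name `reachable_slots` is made up, and the table sizes are kept tiny so the result can be counted in a 64-bit bitmap.

```c
#include <assert.h>
#include <stdint.h>

/* Count how many distinct slots index = sig & (table_size - 1) can produce.
   Limited to table sizes from 1 to 64 so a 64-bit bitmap can record them. */
static unsigned reachable_slots(uint32_t table_size)
{
    uint64_t seen = 0;
    unsigned count = 0;
    uint32_t sig;

    assert(table_size >= 1 && table_size <= 64);
    /* 1024 consecutive signatures cover every combination of the low bits
       that a mask of at most 63 can select. */
    for (sig = 0; sig < 1024; ++sig)
        seen |= (uint64_t)1 << (sig & (table_size - 1));
    for (; seen != 0; seen &= seen - 1)   /* clear the lowest set bit each pass */
        ++count;
    return count;
}
```

For a 16-slot table all 16 slots are reachable, but for a 12-slot table the mask is 11 (binary 1011), so slots 4 to 7 are never used and only 8 slots remain. The index never runs past the end of the table in this case; the effect is a smaller effective table and different collisions, which would only explain a 32-bit/64-bit difference if the entry count itself differs between the two builds.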