Stockfish vs. VS

Moderator: Ras

adamh

Stockfish vs. VS

Post by adamh »

Sorry, I just could not resist the pun :oops:

I downloaded SF 2.0.1 exe and source. Runs beautifully. Wanted to hunt the new "thread crash". I am using VS 2010 express and Win7.

Compiled: fine
Started in cmd prompt: fine
Started in arena: fine
Started in ChessBase: nothing. Loaded new games, cleared hash, changed params, etc.: still nothing

The exe works fine in CB but my compile does not. And unfortunately VS express does not allow "attach to running process" (my old comp died so my real environment went poof!) so I cannot see where it is hanging. I can see it has got 16 threads and is constantly consuming 4 % CPU.

Maybe it is just a compile time constant or something? Please help, thanks.
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Stockfish vs. VS

Post by jwes »

My copy of VS 2010 Express has "Attach to Process" under the Debug menu. Did you register your copy (free)?
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Stockfish vs. VS

Post by mcostalba »

adamh wrote: Maybe it is just a compile time constant or something? Please help, thanks.
I have VS 2008 Express under 32-bit Windows Vista and I don't experience any crash and, btw, I can connect the debugger to an external running process without problems: Debug -> connect to process...

Here are my settings for the compiler:

Code:

/Ox /Oi /Ot /GT /GL /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /FD /EHsc /MD /GS- /arch:SSE2 /Fo"Release\" /Fd"Release\vc90.pdb" /W4 /nologo /c /Wp64 /Zi /Gd /TP /errorReport:prompt
and for the linker:

Code:

/OUT:"Release\stockfish.exe" /NOLOGO /MANIFEST /MANIFESTFILE:"Release\stockfish.exe.intermediate.manifest" /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /DEBUG /PDB:"c:\Users\Marco\Documents\programmi\stockfish\Release\stockfish.pdb" /SUBSYSTEM:CONSOLE /OPT:REF /OPT:ICF /LTCG /FIXED:No /NXCOMPAT /ERRORREPORT:PROMPT kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib

Very possibly this is not the fastest compile I can get out of VS, but I don't care; I use it just for development....
adamh

Re: Stockfish vs. VS

Post by adamh »

Thanks to both of you!
Even MS thinks that this has been taken out of Express:
http://msdn.microsoft.com/en-us/library/c6wf8e4z.aspx

But further down the same page a community comment explains:
No, feature is not removed from express edition
Activate Tools/Settings/Expert Settings
and you will have access to Debug/Attach to process
:P

So my main thread is hanging in:

Code:

void TranspositionTable::set_size(size_t mbSize) {

// ....
  while ((2 * newSize) * sizeof(TTCluster) <= (mbSize << 20))
      newSize *= 2; // HANGS !!!
// ....
The problem is that the loop condition never becomes false, and newSize wraps to zero!

The expression (mbSize << 20)
evaluates to (mbSize << 20) = 0x99200000

sizeof(TTCluster) is 0x00000040
newSize is initially 0x00000040

I am not used to debugging 64-bit apps, but I would expect to see the size_t variables as 64-bit values. They are 32-bit values! And with these initial values the loop condition can never become false!

newSize will reach 0x80000000 and after the next doubling wrap to zero.

With 64 bits it would not wrap. So I will try to track the origin of the incoming values; I might also try a 32-bit compile.
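
The wrap described here can be reproduced without a 32-bit build by forcing the arithmetic into std::uint32_t. This is only a sketch: the helper name loopTerminates, the iteration cap and the starting value of 1024 clusters are assumptions made for illustration and are not taken from the Stockfish source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of Stockfish: replays the sizing loop with
// all arithmetic truncated to 32 bits, as in a build where size_t is 32 bits.
// Returns true if the loop exits, false if it is still spinning after
// maxIterations (which stands in for "hangs forever").
bool loopTerminates(std::uint32_t mbSize,
                    std::uint32_t clusterBytes = 64,     // sizeof(TTCluster) seen in the debugger
                    std::uint32_t startClusters = 1024,  // assumed initial newSize
                    int maxIterations = 1000)
{
    // mbSize << 20 must itself fit in 32 bits for this experiment.
    assert(clusterBytes > 0 && mbSize < 4096);

    const std::uint32_t limit = mbSize << 20;  // requested size in bytes
    std::uint32_t newSize = startClusters;

    for (int i = 0; i < maxIterations; ++i) {
        // Same expression as in set_size(), truncated to 32 bits.
        const std::uint32_t bytes =
            static_cast<std::uint32_t>(2u * newSize * clusterBytes);
        if (bytes > limit)
            return true;   // loop condition became false: normal exit
        newSize *= 2;      // may wrap to 0, after which bytes stays 0
    }
    return false;
}
```

Under these assumptions loopTerminates(2047) returns true and loopTerminates(2048) returns false: once the product reaches 2^32 it is truncated to 0, which is always <= the limit.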

But any help is warmly welcome.
adamh

BUG FOUND !!!

Post by adamh »

So I located the bug. Stockfish will hang here...

Code:

void TranspositionTable::set_size(size_t mbSize) {

// ....
  while ((2 * newSize) * sizeof(TTCluster) <= (mbSize << 20))
      newSize *= 2; // HANGS !!!
// .... 
...If I set the Hashtable size to exactly 2048 MB or higher, in ChessBase. 2047 works great!

However a mystery remains:
This only happens with my home-compiled SF 2.0.1. The exe download does not hang.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: BUG FOUND !!!

Post by bob »

adamh wrote:So I located the bug. Stockfish will hang here...

Code:

void TranspositionTable::set_size(size_t mbSize) {

// ....
  while ((2 * newSize) * sizeof(TTCluster) <= (mbSize << 20))
      newSize *= 2; // HANGS !!!
// .... 
...If I set the Hashtable size to exactly 2048 MB or higher, in ChessBase. 2047 works great!

However a mystery remains:
This only happens with my home-compiled SF 2.0.1. The exe download does not hang.
Either a bad declaration that is causing it to use 32 bits, or else you are running a 32 bit system at home (or at least a 32 bit compiler)...
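
A quick way to tell which of these cases applies is to have the build itself report the width of size_t. The helper names below are made up for illustration; only sizeof, CHAR_BIT and std::size_t come from the standard library.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Width of size_t in bits for the current build.
unsigned sizeTBits()
{
    return static_cast<unsigned>(sizeof(std::size_t) * CHAR_BIT);
}

// Largest hash size in MB whose byte count (mb << 20) still fits in size_t.
// Hypothetical helper; on a build with a 32-bit size_t this is 4095.
std::size_t maxHashMB()
{
    assert(sizeTBits() > 20);
    return static_cast<std::size_t>(-1) >> 20;  // SIZE_MAX / 2^20
}
```

If sizeTBits() reports 32 in a project that was meant to be a 64-bit build, the platform setting of the project is probably the first thing to check.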
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: BUG FOUND !!!

Post by mcostalba »

adamh wrote:So I located the bug.
Adam, could you please verify that the following patch fixes the issue for you?

Code:

--- a/src/tt.cpp
+++ b/src/tt.cpp
@@ -59,7 +59,7 @@ void TranspositionTable::set_size(size_t mbSize) {
   // each cluster consists of ClusterSize number of TTEntries.
   // Each non-empty entry contains information of exactly one position.
   // newSize is the number of clusters we are going to allocate.
-  while ((2 * newSize) * sizeof(TTCluster) <= (mbSize << 20))
+  while (2ULL * newSize * sizeof(TTCluster) <= (mbSize << 20))
       newSize *= 2;
 
   if (newSize != size)
Thanks.

Possibly this is what happens:

Code:

    If size_t is defined as a 32 bit quantity then we have an
    overflow in the left term of the while condition if mbSize
    is bigger than 2048.
    
    For instance if mbSize is 2049 then when newSize will reach
    0x80000000 (2048MB) comparison is still true, 'while' loops
    again and we have an overflow in the expression (2*newSize)
    so that result is 0 and at that point 'while' keeps looping
    forever hanging the application.
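
As a stand-alone illustration of the idea behind the patch, the sizing loop can be written with 64-bit arithmetic throughout. This is a sketch, not the Stockfish code: clusterCount, its parameters and the starting value of 1024 clusters are assumptions. Unlike the patch, it also widens the right-hand side; in the patch (mbSize << 20) is still computed in size_t, which is enough as long as that value itself fits (below 4096 MB when size_t is 32 bits).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone version of the sizing loop: both sides of the
// comparison are computed in 64 bits, so neither wraps for realistic
// hash sizes and sane starting values.
std::uint64_t clusterCount(std::uint64_t mbSize,
                           std::uint64_t clusterBytes = 64,     // assumed sizeof(TTCluster)
                           std::uint64_t startClusters = 1024)  // assumed initial newSize
{
    assert(clusterBytes > 0 && startClusters > 0);
    assert(mbSize < (1ULL << 40));  // keeps mbSize << 20 well inside 64 bits

    const std::uint64_t limit = mbSize << 20;  // requested bytes
    std::uint64_t newSize = startClusters;

    // Double while twice the current table would still fit in the request.
    while (2 * newSize * clusterBytes <= limit)
        newSize *= 2;

    return newSize;
}
```

With 64-byte clusters, clusterCount(2047) gives 2^24 clusters (a 1 GB table) and clusterCount(2048) gives 2^25 clusters (2 GB), so the loop now exits for the sizes that previously hung.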
adamh

Re: BUG FOUND !!!

Post by adamh »

1
Marco, unfortunately that does not fix it. Oh yes, I get past that line but instead I get a std::bad_alloc exception. The callstack is unclear in VS but the call appears to be coming from a few lines later:
entries = new TTCluster[size];
(via operator new[])

2
Before I tried this I was looking for the other crash, mentioned earlier. I have a prime candidate where I get a debug assertion in
search(Position& pos, SearchStack* ss, Value alpha, Value beta, Depth depth, int ply)
line 1014
Here:

Code:

assert(tte->static_value() != VALUE_NONE);


3
I think #2 could be a true bug. But regarding #1 I think Robert is right. I am trying to make a 64-bit build but somewhere along the road all VS preferences got out of sync with each other. My sizeof is returning 4, so I should start there. I downloaded the required SDK for Itanium builds, but probably something more than selecting that platform is needed...
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: BUG FOUND !!!

Post by mcostalba »

adamh wrote:1
Marco, unfortunately that does not fix it.
Actually it does.

The other error is partially unrelated. If we use operator new and the memory cannot be allocated, an exception of type bad_alloc is thrown. This simply means that you cannot allocate that amount of memory; it is no longer a Stockfish error but a limit of your platform / libraries.

Code:

Allocation size limits

The largest possible memory block malloc can allocate depends on the host system, particularly the size of physical memory and the operating system implementation. Theoretically, the largest number should be the maximum value that can be held in a size_t type, which is an implementation-dependent unsigned integer representing the size of an area of memory. The maximum value is 2^(CHAR_BIT*sizeof(size_t)) − 1, or the constant SIZE_MAX in the C99 standard.
The bug here is that operator new should not throw a bad alloc exception but return a NULL pointer instead...
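
To make the point concrete: plain new[] reports failure by throwing, so a null check placed after it is never reached on failure. The sketch below turns the exception back into a null pointer by hand; Cluster and allocateOrNull are placeholders invented for this example, not Stockfish names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Cluster { char data[64]; };  // stand-in for TTCluster (64 bytes)

// Hypothetical wrapper: returns nullptr instead of letting std::bad_alloc
// (or std::bad_array_new_length, which derives from it) escape.
Cluster* allocateOrNull(std::size_t count)
{
    assert(count > 0);
    try {
        return new Cluster[count];   // plain new[]: throws on failure
    } catch (const std::bad_alloc&) {
        return nullptr;              // only reached through the exception
    }
}
```

A caller would check the returned pointer before use and release it with delete[] when done.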


Regarding bug #2, that one is more interesting, and it would be very useful to find a way to reproduce it reliably....
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: BUG FOUND !!!

Post by mcostalba »

mcostalba wrote: The bug here is that operator new should not throw a bad alloc exception but return a NULL pointer instead...
Could you please also test this one:

Code:

       size = newSize;
       delete [] entries;
-      entries = new TTCluster[size];
+      entries = new (std::nothrow) TTCluster[size];
       if (!entries)
       {
           std::cerr << "Failed to allocate " << mbSize
Now it should fail gracefully....
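
For readers who want to try the nothrow form in isolation, here is a minimal self-contained sketch. Table, Cluster and resize are simplified stand-ins rather than the real Stockfish classes, and the wording of the error message and the bool return value are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <new>

struct Cluster { char data[64]; };  // stand-in for TTCluster

struct Table {                      // hypothetical, much-simplified table
    Cluster*    entries = nullptr;
    std::size_t size    = 0;

    Table() = default;
    Table(const Table&) = delete;             // owns raw memory: no copying
    Table& operator=(const Table&) = delete;
    ~Table() { delete[] entries; }

    // Returns false and leaves the table empty if the allocation fails.
    bool resize(std::size_t newSize, std::size_t mbSize)
    {
        assert(newSize > 0);
        delete[] entries;
        entries = new (std::nothrow) Cluster[newSize];  // nullptr on failure
        if (!entries) {
            size = 0;
            std::cerr << "Failed to allocate " << mbSize << " MB\n";
            return false;
        }
        size = newSize;
        return true;
    }
};
```

With std::nothrow the failure path goes through the if (!entries) branch instead of an exception, so the caller can decide whether to retry with a smaller size or exit.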