Crafty-22.2 is available

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Crafty-22.2 is available

Post by Dann Corbit »

bob wrote:
Dann Corbit wrote:Be aware that it's going to crash on Windows if you use more than one thread.
There is a patched version here, but it only contains a 64 bit binary:
http://cap.connx.com/chess-engines/new- ... ty22-3.zip

However, anyone can recompile the code.
Why would it crash? windows and linux both now use threads and the old exit(0) problem has been removed. I'm not aware of anything that breaks on windows with 22.2 and beyond. 22.1 had a problem for sure.
It is only a problem for more than one thread.


Here is the problem from file init.c:

Code: Select all

#if defined(_WIN32) || defined(_WIN64) 
  ThreadMalloc((int) 0); 
#else 


This allocates memory only for the first CPU, but not for subsequent CPUS.

It should be:

Code: Select all

#if defined(_WIN32) || defined(_WIN64) 
  { 
   size_t index; 
   for (index = 0; index < CPUS; index++) 
      ThreadMalloc(index); 
  } 
#else 
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty-22.2 is available

Post by bob »

Dann Corbit wrote:Access to null pointers and/or unintialized memory.
I sent you a patch via email last night.
The question is, how is that happening?

BTW the email did not arrive. Can you post it here? Probably a SPAM filter somewhere doing this. Tracy kept trying to send me a log file right before the ACCA tournament and it kept going into the toilet as well for reasons unknown...
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Crafty-22.2 is available

Post by Dann Corbit »

bob wrote:
Dann Corbit wrote:Access to null pointers and/or unintialized memory.
I sent you a patch via email last night.
The question is, how is that happening?

BTW the email did not arrive. Can you post it here? Probably a SPAM filter somewhere doing this. Tracy kept trying to send me a log file right before the ACCA tournament and it kept going into the toilet as well for reasons unknown...
Here is a zip file with three source files changed.
http://cap.connx.com/chess-engines/new- ... ty22-3.zip

There are three patched files.
The first (and only really significant patch) is in init.c

This code is not correct:
#if defined(_WIN32) || defined(_WIN64)
ThreadMalloc((int)0);

It should be this:
#if defined(_WIN32) || defined(_WIN64)
{
size_t index;
for (index = 0; index < CPUS; index++)
ThreadMalloc(index);
}

Obviously, if you use more than one thread the behavior of the Windows version would be undefined in the original.

Epd.c was patched to initialize a variable that the compiler thought might conceivably be unitialized. (Probably not, but it won't hurt anything).

Crafty.c was patched to either include windows.h or unistd.h as the first include file depending on operating system.

If windows.h is not included first for Windows, the combined file crafty.c will not compile correctly.
It may be that some non-windows OS is not POSIX, so the file crafty.c might not be fully correct as-is for those non-windows non-posix systems.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty-22.2 is available

Post by bob »

Dann Corbit wrote:
bob wrote:
Dann Corbit wrote:Be aware that it's going to crash on Windows if you use more than one thread.
There is a patched version here, but it only contains a 64 bit binary:
http://cap.connx.com/chess-engines/new- ... ty22-3.zip

However, anyone can recompile the code.
Why would it crash? windows and linux both now use threads and the old exit(0) problem has been removed. I'm not aware of anything that breaks on windows with 22.2 and beyond. 22.1 had a problem for sure.
It is only a problem for more than one thread.


Here is the problem from file init.c:

Code: Select all

#if defined(_WIN32) || defined(_WIN64) 
  ThreadMalloc((int) 0); 
#else 


This allocates memory only for the first CPU, but not for subsequent CPUS.

It should be:

Code: Select all

#if defined(_WIN32) || defined(_WIN64) 
  { 
   size_t index; 
   for (index = 0; index < CPUS; index++) 
      ThreadMalloc(index); 
  } 
#else 
Something else is wrong then. ThreadInit() which is called when each thread is created, calls ThreadMalloc() the very first thing.

Code: Select all

void *STDCALL ThreadInit(void *tid)
{
  int i;
  long j;

#if defined(_WIN32) || defined(_WIN64)
  ThreadMalloc((int) tid);
#endif
So that will get everything Malloc'ed. And when threads are created in iterate.c, this is used:

Code: Select all

#  if defined(_WIN32) || defined(_WIN64)
          NumaStartThread(ThreadInit, (void *) proc);
#  else
          pthread_create(&pt, NULL, ThreadInit, (void *) proc);
#  endif
So when a thread is started, either in linux or windows, it is sent directly to ThreadInit(), and ThreadInit() does the ThreadMalloc() upon entry. I don't see how this could not work, and it it hasn't been changed since way back. I diffed with 20.1and 22.1, for example and the changes were not in this code at all...

It also should be blowing up under linux since that has to be done for linux as well or it will die with missing blocks causing null-pointer failures...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty-22.2 is available

Post by bob »

Dann Corbit wrote:
bob wrote:
Dann Corbit wrote:Access to null pointers and/or unintialized memory.
I sent you a patch via email last night.
The question is, how is that happening?

BTW the email did not arrive. Can you post it here? Probably a SPAM filter somewhere doing this. Tracy kept trying to send me a log file right before the ACCA tournament and it kept going into the toilet as well for reasons unknown...
Here is a zip file with three source files changed.
http://cap.connx.com/chess-engines/new- ... ty22-3.zip

There are three patched files.
The first (and only really significant patch) is in init.c

This code is not correct:
#if defined(_WIN32) || defined(_WIN64)
ThreadMalloc((int)0);

It should be this:
#if defined(_WIN32) || defined(_WIN64)
{
size_t index;
for (index = 0; index < CPUS; index++)
ThreadMalloc(index);
}

Obviously, if you use more than one thread the behavior of the Windows version would be undefined in the original.

Epd.c was patched to initialize a variable that the compiler thought might conceivably be unitialized. (Probably not, but it won't hurt anything).

Crafty.c was patched to either include windows.h or unistd.h as the first include file depending on operating system.

If windows.h is not included first for Windows, the combined file crafty.c will not compile correctly.
It may be that some non-windows OS is not POSIX, so the file crafty.c might not be fully correct as-is for those non-windows non-posix systems.
That is not going to work correctly, as mentioned in a previous post. There is something else wrong with how it is being compiled if this is failing, as everyone else running windows is not having a problem. ThreadInit is called as each thread is created. The first thing ThreadInit() does is call ThreadMalloc(). No search can be done until everyone has exited ThreadInit() since the blocks need to be available to anyone, but I want them to be Malloc()'ed and faulted in to the physical RAM on the node where they are running (this is for NUMA primarily but it works the same way for non-NUMA since it doesn't hurt and avoids specialized NUMA code...
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Crafty-22.2 is available

Post by Dann Corbit »

bob wrote:
Dann Corbit wrote:
bob wrote:
Dann Corbit wrote:Access to null pointers and/or unintialized memory.
I sent you a patch via email last night.
The question is, how is that happening?

BTW the email did not arrive. Can you post it here? Probably a SPAM filter somewhere doing this. Tracy kept trying to send me a log file right before the ACCA tournament and it kept going into the toilet as well for reasons unknown...
Here is a zip file with three source files changed.
http://cap.connx.com/chess-engines/new- ... ty22-3.zip

There are three patched files.
The first (and only really significant patch) is in init.c

This code is not correct:
#if defined(_WIN32) || defined(_WIN64)
ThreadMalloc((int)0);

It should be this:
#if defined(_WIN32) || defined(_WIN64)
{
size_t index;
for (index = 0; index < CPUS; index++)
ThreadMalloc(index);
}

Obviously, if you use more than one thread the behavior of the Windows version would be undefined in the original.

Epd.c was patched to initialize a variable that the compiler thought might conceivably be unitialized. (Probably not, but it won't hurt anything).

Crafty.c was patched to either include windows.h or unistd.h as the first include file depending on operating system.

If windows.h is not included first for Windows, the combined file crafty.c will not compile correctly.
It may be that some non-windows OS is not POSIX, so the file crafty.c might not be fully correct as-is for those non-windows non-posix systems.
That is not going to work correctly, as mentioned in a previous post. There is something else wrong with how it is being compiled if this is failing, as everyone else running windows is not having a problem. ThreadInit is called as each thread is created. The first thing ThreadInit() does is call ThreadMalloc(). No search can be done until everyone has exited ThreadInit() since the blocks need to be available to anyone, but I want them to be Malloc()'ed and faulted in to the physical RAM on the node where they are running (this is for NUMA primarily but it works the same way for non-NUMA since it doesn't hurt and avoids specialized NUMA code...
I don't know what the right fix is then, but the code crashes before the change and seems to run correctly after it.

Without my patch, this code gives an access violation (with mt=2 or more):
On line 71 of nextr.c, when i is == 65:

Code: Select all

 
if (block[i]->used) // here block[i] is a null pointer. 


block[i-1] is a valid pointer.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty-22.2 is available

Post by bob »

Might be a different bug. Are you running a SMP version but using 1 CPU? I think I screwed up that loop you pointed out in nextr.c which assumes _all_ blocks have been allocated. If you only run one thread, then only the first N blocks are allocated which is an issue. I'll have to think about this one. The loop in nextr.c is the problem. I added this so I could display an accurate NPS during a search, since each different thread has a node counter for each different split block. Probably should just change the loop to check for block == NULL and not test them for "used" if they have never been created. WIll have a new version ready shortly that should fix this glitch...
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Crafty-22.2 is available

Post by Dann Corbit »

bob wrote:Might be a different bug. Are you running a SMP version but using 1 CPU?
4 CPUs

Code: Select all

       CPU Name : Intel Core 2 Extreme (Yorkfield)
  Vendor String : GenuineIntel
    Name String : Intel(R) Core(TM)2 Extreme CPU X9650  @ 3.00GHz
   Architecture : x64
   Process Rule : 45 nm
       Platform : LGA775 [4]
       CPU Type : Original OEM processor [0]
 Number (Total) : 4
  Physical Core : 4
         Family : 6h
    Ext. Family : 6h
          Model : 7h
     Ext. Model : 17h
       Stepping : 6h
  Ext. Stepping : 6h
   Microcode ID : 60Bh
        Feature : MMX SSE SSE2 SSE3 SSSE3 SSE4.1 XD VT Intel 64
PowerManagement : SpeedStep

                    Current        Original
          Clock : 2992.45 MHz     3000.00 MHz
   System Clock :  332.49 MHz      333.33 MHz
     System Bus : 1329.98 MHz     1333.33 MHz
     Multiplier :    9.00            9.00
      Data Rate :    QDR
     Over Clock :   -0.25 %

     L1 I-Cache :   32 KB
     L1 D-Cache :   32 KB
       L2 Cache : 6144 KB [Full:2992.45 MHz]
         Memory : 4029 MB
I think I screwed up that loop you pointed out in nextr.c which assumes _all_ blocks have been allocated. If you only run one thread, then only the first N blocks are allocated which is an issue. I'll have to think about this one. The loop in nextr.c is the problem. I added this so I could display an accurate NPS during a search, since each different thread has a node counter for each different split block. Probably should just change the loop to check for block == NULL and not test them for "used" if they have never been created. WIll have a new version ready shortly that should fix this glitch...


If it is 22.4, I found it and am trying it out.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty-22.2 is available

Post by bob »

Dann Corbit wrote:
bob wrote:Might be a different bug. Are you running a SMP version but using 1 CPU?
4 CPUs

Code: Select all

       CPU Name : Intel Core 2 Extreme (Yorkfield)
  Vendor String : GenuineIntel
    Name String : Intel(R) Core(TM)2 Extreme CPU X9650  @ 3.00GHz
   Architecture : x64
   Process Rule : 45 nm
       Platform : LGA775 [4]
       CPU Type : Original OEM processor [0]
 Number (Total) : 4
  Physical Core : 4
         Family : 6h
    Ext. Family : 6h
          Model : 7h
     Ext. Model : 17h
       Stepping : 6h
  Ext. Stepping : 6h
   Microcode ID : 60Bh
        Feature : MMX SSE SSE2 SSE3 SSSE3 SSE4.1 XD VT Intel 64
PowerManagement : SpeedStep

                    Current        Original
          Clock : 2992.45 MHz     3000.00 MHz
   System Clock :  332.49 MHz      333.33 MHz
     System Bus : 1329.98 MHz     1333.33 MHz
     Multiplier :    9.00            9.00
      Data Rate :    QDR
     Over Clock :   -0.25 %

     L1 I-Cache :   32 KB
     L1 D-Cache :   32 KB
       L2 Cache : 6144 KB [Full:2992.45 MHz]
         Memory : 4029 MB
I think I screwed up that loop you pointed out in nextr.c which assumes _all_ blocks have been allocated. If you only run one thread, then only the first N blocks are allocated which is an issue. I'll have to think about this one. The loop in nextr.c is the problem. I added this so I could display an accurate NPS during a search, since each different thread has a node counter for each different split block. Probably should just change the loop to check for block == NULL and not test them for "used" if they have never been created. WIll have a new version ready shortly that should fix this glitch...


If it is 22.4, I found it and am trying it out.


This bug should only have popped up if you compile for N, and run with M cpus, where M < N. It has been fixed in 22.3/22.4