Crafty 22.5 leaking memory ?

Discussion of chess software programming and technical issues.

Moderator: Ras

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty 22.5 leaking memory ?

Post by bob »

Guetti wrote:
bob wrote:
kiroje wrote:First of all a bug thanks to Robert Hyatt for Crafty 22.4

But there may be a memory leak of some sort :(

Im using SCID and Peter Skinners Win32 port of crafty for analysing my own games, but it seems that it steadily consumes more and more memory until Windows runs out :shock:

Im going back and forth in the games while crafty analyses, so maybe someone could do the same and see if they get the same problem ?
OK. I have now released the source for version 22.5.

The problem was not actually in Crafty, although I originally thought it might be. the new posix threads now defaults to "joinable" for the threads, where Crafty does not every join. Last time I used posix threads (at least a couple of years ago) the default was "detached" which means when a thread terminates the stack is not kept until the join is completed. This resulted in N "stacks" being kept for each search done, and the pthreads stacks are quite large. I have fixed this although as far as I am aware, this only applies to UNIX systems. This version sets the thread attribute correctly to solve this...
22.5 seems hopelessly broken for me. You shortened rook_open_file from a three dimensional array to a one dimensional array which contradicts data.c. Could you please update data.c?

In file included from crafty.c:13:
evaluate.c: In function ‘EvaluateRooks’:
evaluate.c:1760: warning: unused variable ‘open_files’
In file included from crafty.c:27:
data.c: At top level:
data.c:905: error: conflicting types for ‘rook_open_file’
data.h:398: error: previous declaration of ‘rook_open_file’ was here
In file included from crafty.c:33:
init.c: In function ‘InitializePawnMasks’:
init.c:877: warning: unused variable ‘file’
init.c:877: warning: unused variable ‘k’
In file included from crafty.c:39:
option.c: In function ‘Option’:
option.c:3062: warning: initialization makes integer from pointer without a cast
option.c:3063: error: invalid operands to binary *
option.c:3064: error: invalid operands to binary -
make[2]: *** [crafty.o] Error 1
Do you have an old data.c? In data.c, data.h and evaluate.c I have them all as a 2-element array in my source files here. I just re-copied the source files to the ftp box, but I do not believe that will fix the issue since it wasn't an issue here. Could you perhaps have a non-writable data.c on your box so the unzip did not replace it? It certainly compiles cleanly here. I've been running for a couple of days with the score change (really not much of a change when you look at the numbers) with no problems at all. If I mix old/new data.c/data.h it produces the same error you are getting....
Guetti

Re: Crafty 22.5 leaking memory ?

Post by Guetti »

I redownloaded it again and now it works.
I always put stuff in a clean temporary directory before unzipping, so it was not that. But now it's fine.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty 22.5 leaking memory ?

Post by bob »

Guetti wrote:I redownloaded it again and now it works.
I always put stuff in a clean temporary directory before unzipping, so it was not that. But now it's fine.
I suppose that somehow the zip operation had some sort of glitch, or somehow I did not copy matching .c and .h files... Stranger things have happened to me at 1am in the morning, which is when I was working on this problem. :)
CThinker
Posts: 388
Joined: Wed Mar 08, 2006 10:08 pm

22.6 still leaks memory

Post by CThinker »

This is from the 64-bit build by Peter. It is really easy to repro. The leak is 3MB on each new game!

You don't even have to run real games. Just type in the following.

Code: Select all

EPD Kit revision date: 1996.04.21
unable to open book file [./book.bin].
book is disabled
unable to open book file [./books.bin].

Initializing multiple threads.
System is SMP, not NUMA.

Crafty v22.6 (1 cpus)

White(1): xboard

tellicsnoalias set 1 Crafty v22.6 (1 cpus)
tellicsnoalias kibitz Hello from Crafty v22.6! (1 cpus)

smpmt=2
new
tellicsnoalias set 1 Crafty v22.6 (2 cpus)
time 1000
easy
go
move Nf3

smpmt=2
new
tellicsnoalias set 1 Crafty v22.6 (2 cpus)
time 1000
easy
go
move Nf3

smpmt=2
new
tellicsnoalias set 1 Crafty v22.6 (2 cpus)
time 1000
easy
go
move Nf3
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: 22.6 still leaks memory

Post by bob »

I assume this is for windows? I'll try to walk through the code when I get home, I can't test it directly since I don't run windows anywhere...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: 22.6 still leaks memory

Post by bob »

CThinker wrote:This is from the 64-bit build by Peter. It is really easy to repro. The leak is 3MB on each new game!

You don't even have to run real games. Just type in the following.

Code: Select all

EPD Kit revision date: 1996.04.21
unable to open book file [./book.bin].
book is disabled
unable to open book file [./books.bin].

Initializing multiple threads.
System is SMP, not NUMA.

Crafty v22.6 (1 cpus)

White(1): xboard

tellicsnoalias set 1 Crafty v22.6 (1 cpus)
tellicsnoalias kibitz Hello from Crafty v22.6! (1 cpus)

smpmt=2
new
tellicsnoalias set 1 Crafty v22.6 (2 cpus)
time 1000
easy
go
move Nf3

smpmt=2
new
tellicsnoalias set 1 Crafty v22.6 (2 cpus)
time 1000
easy
go
move Nf3

smpmt=2
new
tellicsnoalias set 1 Crafty v22.6 (2 cpus)
time 1000
easy
go
move Nf3
One simple work-around is to not reuse the engine for each new game, and start a clean copy, which is certainly safer since we then know everything is initialized properly. I always use -xreuse for that reason... The leaks I were looking at were not for running multiple games in the same process, but for the smpnice=1 terminating processes and then re-allocating memory without freeing the original copy up.
CThinker
Posts: 388
Joined: Wed Mar 08, 2006 10:08 pm

Re: 22.6 still leaks memory

Post by CThinker »

bob wrote:I assume this is for windows? I'll try to walk through the code when I get home, I can't test it directly since I don't run windows anywhere...
Yes, that is a Windows build, which I got from Peter Skinner's site.

I could try to debug it myself, but I can't access your FTP site. Firewalls these days don't open FTP ports anymore. If you can find me another way to get the source (http), I will debug it.
Dann Corbit
Posts: 12814
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: 22.6 still leaks memory

Post by Dann Corbit »

CThinker wrote:
bob wrote:I assume this is for windows? I'll try to walk through the code when I get home, I can't test it directly since I don't run windows anywhere...
Yes, that is a Windows build, which I got from Peter Skinner's site.

I could try to debug it myself, but I can't access your FTP site. Firewalls these days don't open FTP ports anymore. If you can find me another way to get the source (http), I will debug it.
Here's a web link:
http://cap.connx.com/chess-engines/new- ... y-22.6.zip
CThinker
Posts: 388
Joined: Wed Mar 08, 2006 10:08 pm

Re: 22.6 still leaks memory

Post by CThinker »

Dann Corbit wrote:
CThinker wrote:
bob wrote:I assume this is for windows? I'll try to walk through the code when I get home, I can't test it directly since I don't run windows anywhere...
Yes, that is a Windows build, which I got from Peter Skinner's site.

I could try to debug it myself, but I can't access your FTP site. Firewalls these days don't open FTP ports anymore. If you can find me another way to get the source (http), I will debug it.
Here's a web link:
http://cap.connx.com/chess-engines/new- ... y-22.6.zip
Thanks Dann.

The culprit is a combination the following:

The code that handles the "new" command terminates all the threads.

Code: Select all

/*
 ************************************************************
 *                                                          *
 *   "new" command initializes for a new game.              *
 *                                                          *
 ************************************************************
 */
  else if (OptionMatch("new", *args)) {
    new_game = 1;
    if (thinking || pondering)
      return (3);
    if (max_threads) {
      int proc;

      Print(128, "parallel threads terminated.\n");
      for (proc = 1; proc < CPUS; proc++)
        thread[proc] = (TREE *) - 1;
    }
    NewGame(0);
    return (3);
  }
When the threads terminated, they decrement the smp_idle 'global' variable.

Code: Select all

/*
 *******************************************************************************
 *                                                                             *
 *   ThreadWait() is the idle loop for the N threads that are created at the   *
 *   beginning when Crafty searches.  threads are "parked" here waiting on a   *
 *   pointer to something they should search (a parameter block built in the   *
 *   function Thread() in this case.  when this pointer becomes non-zero, each *
 *   thread "parked" here will immediately call SearchParallel() and begin the *
 *   parallel search as directed.                                              *
 *                                                                             *
 *******************************************************************************
 */
int ThreadWait(int tid, TREE * RESTRICT waiting)
{
  int value;

/*
 ************************************************************
 *                                                          *
 *   the N initial threads enter here and are kept penned   *
 *   here forever.  however, once a thread leaves here it   *
 *   may well re-enter ThreadWait() from the top while it   *
 *   waits for a parallel search to complete.  while it     *
 *   waits here it can also join in to help other busy      *
 *   threads search their subtrees as well.                 *
 *                                                          *
 ************************************************************
 */
  while (1) {
    Lock(lock_smp);
    smp_idle++;
    Unlock(lock_smp);
#if defined(DEBUGSMP)
    Lock(lock_io);
    Print(128, "thread %d now idle (%d procs, %d idle).\n", tid, max_threads,
        smp_idle);
    if (FindBlockID(waiting) >= 0)
      Print(128,
          "thread %d  waiting on block %d, still %d threads busy there\n", tid,
          FindBlockID(waiting), waiting->nprocs);
    Unlock(lock_io);
#endif
/*
 ************************************************************
 *                                                          *
 *   we can exit if our thread[i] is non-zero, or if we are *
 *   waiting on others to finish a block that *we* have to  *
 *   return through.  when the busy count on such a block   *
 *   hits zero, we return immediately which unwinds the     *
 *   search as it should be.                                *
 *                                                          *
 ************************************************************
 */
    while (!thread[tid] && !quit && (!waiting || waiting->nprocs));
    if (quit) {
      Lock(lock_smp);
      smp_threads--;
      Unlock(lock_smp);
      return (0);
    }
    Lock(lock_smp);
    if (!thread[tid])
      thread[tid] = waiting;
/*
 ************************************************************
 *                                                          *
 *   we either have work to do, or threads we were waiting  *
 *   on have finished their work.                           *
 *                                                          *
 ************************************************************
 */
#if defined(DEBUGSMP)
    Lock(lock_io);
    Print(128, "thread %d now has work at block %d.\n", tid,
        FindBlockID(thread[tid]));
    Unlock(lock_io);
#endif
    smp_idle--;
In "Iterate", when it notices that there idle threads, it initializes them.

Code: Select all

/*
 ************************************************************
 *                                                          *
 *   if we are using multiple threads, and they have not    *
 *   been started yet, then start them now as the search    *
 *   is ready to begin.                                     *
 *                                                          *
 ************************************************************
 */
#if (CPUS > 1)
      if (max_threads > smp_idle + 1) {
        long proc;

        initialized_threads = 1;
        for (proc = smp_threads + 1; proc < max_threads; proc++) {
          Print(128, "starting thread %d\n", proc);
          thread[proc] = 0;
#  if defined(_WIN32) || defined(_WIN64)
          NumaStartThread(ThreadInit, (void *) proc);
#  else
          pthread_create(&pt, &attributes, ThreadInit, (void *) proc);
#  endif
          Lock(lock_smp);
          smp_threads++;
          Unlock(lock_smp);
        }
      }
      WaitForAllThreadsInitialized();
#endif
Each time a thread is initialized, it allocates memory.

Code: Select all

/*
 ************************************************************
 *                                                          *
 *   now for some NUMA stuff.  we need to allocate the      *
 *   local memory for each processor, but we can't touch it *
 *   here or it will be faulted in and be allocated on the  *
 *   curret CPU, which is not where it should be located    *
 *   for optimal NUMA performance.  ThreadInit() will do    *
 *   the actual initialization after each new process is    *
 *   created, so that the pages of local memory will be     *
 *   faulted in on the correct processor and use local      *
 *   node memory for optimal performance.                   *
 *                                                          *
 ************************************************************
 */
#if defined(_WIN32) || defined(_WIN64)
  ThreadMalloc((int) 0);
#else
  for (i = 0; i < CPUS; i++) {
    mem = malloc(MAX_BLOCKS_PER_CPU * ((sizeof(TREE) + 2047)) & ~2047);
    for (j = 0; j < MAX_BLOCKS_PER_CPU; j++) {
      block[i * MAX_BLOCKS_PER_CPU + j + 1] =
          (TREE *) ((long) mem + j * ((sizeof(TREE) + 2047) & ~2047));
    }
  }
  for (i = 0; i < MAX_BLOCKS_PER_CPU; i++) {
    memset((void *) block[i + 1], 0, sizeof(TREE));
    block[i + 1]->used = 0;
    block[i + 1]->parent = (TREE *) - 1;
    LockInit(block[i + 1]->lock);
  }
#endif
  initialized_threads++;
  InitializeHashTables();
  hash_mask = (1 << log_hash) - 1;
  pawn_hash_mask = (1 << (log_pawn_hash)) - 1;
  InitializeKingSafety();
Notice that memory is not released in this cycle.

One should be able to repro this even on linux builds.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: 22.6 still leaks memory

Post by bob »

CThinker wrote:This is from the 64-bit build by Peter. It is really easy to repro. The leak is 3MB on each new game!

You don't even have to run real games. Just type in the following.

Code: Select all

EPD Kit revision date: 1996.04.21
unable to open book file [./book.bin].
book is disabled
unable to open book file [./books.bin].

Initializing multiple threads.
System is SMP, not NUMA.

Crafty v22.6 (1 cpus)

White(1): xboard

tellicsnoalias set 1 Crafty v22.6 (1 cpus)
tellicsnoalias kibitz Hello from Crafty v22.6! (1 cpus)

smpmt=2
new
tellicsnoalias set 1 Crafty v22.6 (2 cpus)
time 1000
easy
go
move Nf3

smpmt=2
new
tellicsnoalias set 1 Crafty v22.6 (2 cpus)
time 1000
easy
go
move Nf3

smpmt=2
new
tellicsnoalias set 1 Crafty v22.6 (2 cpus)
time 1000
easy
go
move Nf3
OK, I am violating my last rule which was to not release a new version until we are +20 over the last one. This one is only +10 better. :) But it has what I hope is the last memory-leak issue resolved. I apparently missed this one in the NUMA support code that Eugene wrote a couple of years back. It is called for every "new" command (which starts a new game) and re-allocates the split blocks which causes the VM to grow slowly, game by game.

Consider this version (22.7) as a "test" version. If someone can compile it and check it under windows to see if the leak is gone, it would help. If there is another problem, I will probably fix it and re-release 22.7, otherwise I will move on to 22.8 for testing...