Starting a new project - Rommie

Discussion of chess software programming and technical issues.

Moderator: Ras

Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Starting a new project - Rommie

Post by Mike Sherwin »

Mergi wrote: Fri Mar 04, 2022 2:02 am
Mike Sherwin wrote: Fri Mar 04, 2022 1:59 am I just came up with this before seeing your reply.

Code:

thread = new Thread*[maxThreads];

  for (s32 i = 0; i < maxThreads; i++) {
	thread[i] = new Thread;
  }

  t = thread[0];
This looks logically correct to me. :?:
Yea, this seems correct. But i'd still suggest getting rid of the extra indirection ... why do you need it? It will only make deallocation and memory management harder later on.
Now I am confused again. I know new returns a pointer to some object like a simple int or an array or a struct or class, whatever. But does it also create a dynamic array of pointers and at the same time set aside a block of n structs? Does new guarantee all the structs will be contiguous in memory? Okay that seems like a silly question. So thread[1] is thread + sizeof Thread and thread[2] is thread + sizeof Thread * 2 etc. My way would be thread[2] = thread + sizeof pointer * 2. Thread can be extremely large and allocating a complete block of maxThreads might more easily fail. Allocating one Thread at a time might be less likely to fail. If someone wants 128 Threads and they do not have enough memory then one at a time allocation can be stopped at a lesser number of threads and reported to the user. Am I finally getting a handle on things??? Let me say a little prayer before you answer. :lol:
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Starting a new project - Rommie

Post by mvanthoor »

Mike Sherwin wrote: Fri Mar 04, 2022 1:33 am Thanks that looks familiar. I kind of like new but using malloc makes perfect sense with no guessing on my part needed. Only one question though. I see that malloc creates a dynamic array of pointers but does it also set aside memory for nr_of_threads or does that need another step. And does new also need another step? My brain is shutting down so I should take a break.
It's been a long time since I used C; I forgot some of the syntax. The correct allocation would be:

Code:

Thread *thread = (Thread *) malloc(nr_of_threads * sizeof(Thread));
calloc() would also work:

Code:

Thread *thread = (Thread *) calloc(nr_of_threads, sizeof(Thread));
Let's say you want 4 threads and sizeof(Thread) gives 24 bytes.

malloc will then create a block of memory that is 4 * 24 bytes.
calloc will then allocate 4 blocks of 24 bytes, one after another.

Both will give you a pointer to the first element in the memory block. Because the first element is a Thread, and you get a void pointer from these functions (which has no type), you'll have to cast it to a Thread pointer with (Thread *).

So in the end you have a memory block that holds Thread types, not a memory block that holds pointers to Thread types. You could do the latter too, but I don't see why:

Code:

Thread **thread = (Thread **) malloc(nr_of_threads * sizeof(Thread *));
Now malloc is allocating a block of memory the size of nr_of_threads times the size of "pointer to Thread", and it gives a void pointer to the first element. This is thus a pointer to a pointer to a Thread, and it must thus be cast to Thread **.

You're just creating an extra layer of indirection.

With regard to new(), I'd have to read up; I never used C++ a lot (it's just much less used than C in embedded software), and I haven't touched C in many years. Maybe, at some point, I should just port Rustic to C or C++ just to get some practice in again.

Oh, and yes: Mergi is right. The first version creates a memory block (which can be accessed like an array) of "Thread" structs. The second version only creates the pointers to Thread structs, so you'll have to run through that list to create the actual Thread structs. When de-allocating, you can just "free" the first version; in the second version, you'll have to run through the list to "free" every Thread struct, and then "free" the list itself.

That second version with the list of pointers to Threads is much harder to maintain.
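
To make the difference between the two versions concrete, here is a minimal sketch of both layouts and their cleanup. It is written as C-style C++ so it matches the rest of the thread. The two-field Thread struct and the helper names (makeBlock, makeTable, freeBlock, freeTable) are made up for illustration; they are not from anyone's engine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the engine's Thread struct (plain data only).
struct Thread {
    int id;
    int ply;
};

// Version 1: one contiguous block that holds n Thread structs.
Thread* makeBlock(int n) {
    assert(n > 0);
    Thread* block = static_cast<Thread*>(
        std::malloc(static_cast<std::size_t>(n) * sizeof(Thread)));
    if (block == nullptr) return nullptr;
    for (int i = 0; i < n; ++i) {  // malloc leaves garbage, so initialize
        block[i].id = i;
        block[i].ply = 0;
    }
    return block;
}

// Version 1 cleanup: a single free releases everything.
void freeBlock(Thread* block) { std::free(block); }

// Version 2: a block of n pointers, each pointing to its own Thread.
Thread** makeTable(int n) {
    assert(n > 0);
    Thread** table = static_cast<Thread**>(
        std::malloc(static_cast<std::size_t>(n) * sizeof(Thread*)));
    if (table == nullptr) return nullptr;
    for (int i = 0; i < n; ++i) {
        table[i] = static_cast<Thread*>(std::malloc(sizeof(Thread)));
        if (table[i] == nullptr) {
            // Roll back the Threads allocated so far, then the table itself.
            while (i-- > 0) std::free(table[i]);
            std::free(table);
            return nullptr;
        }
        table[i]->id = i;
        table[i]->ply = 0;
    }
    return table;
}

// Version 2 cleanup: free every Thread first, then the pointer table.
void freeTable(Thread** table, int n) {
    for (int i = 0; i < n; ++i) std::free(table[i]);
    std::free(table);
}
```

With version 1, block[i] and block + i refer to neighbouring structs in the same allocation. With version 2, only the pointers are neighbours; the Threads themselves can end up anywhere on the heap.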
Last edited by mvanthoor on Fri Mar 04, 2022 10:12 am, edited 1 time in total.
Author of Rustic, an engine written in Rust.
Mergi
Posts: 127
Joined: Sat Aug 21, 2021 9:55 pm
Full name: Jen

Re: Starting a new project - Rommie

Post by Mergi »

Mike Sherwin wrote: Fri Mar 04, 2022 2:56 am Now I am confused again. I know new returns a pointer to some object like a simple int or an array or a struct or class, whatever. But does it also create a dynamic array of pointers and at the same time set aside a block of n structs? Does new guarantee all the structs will be contiguous in memory? Okay that seems like a silly question. So thread[1] is thread + sizeof Thread and thread[2] is thread + sizeof Thread * 2 etc. My way would be thread[2] = thread + sizeof pointer * 2. Thread can be extremely large and allocating a complete block of maxThreads might more easily fail. Allocating one Thread at a time might be less likely to fail. If someone wants 128 Threads and they do not have enough memory then one at a time allocation can be stopped at a lesser number of threads and reported to the user. Am I finally getting a handle on things??? Let me say a little prayer before you answer. :lol:
Ah, so that's why you want pointers ... Unless by extremely large you mean dozens of megabytes (or you are planning to run the engine on some extremely memory-constrained system), I still don't see the point of this on modern computers. But you are correct in what you say. The way I used new would create the threads immediately, all at once, in a contiguous block of memory (and call the default constructor for each of them), and return a pointer to the first thread in the array.

One important distinction to note between malloc/calloc and new is that new will call the constructor of the object whereas malloc will not. Malloc will just set aside enough memory but leave all the garbage values there, so you then have to initialize each thread yourself.
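
A small sketch of that distinction, under the assumption of a toy Thread whose default constructor just bumps a global counter (the counter and the helper names are hypothetical, only there to make the constructor calls visible). It also shows the indexing question from the quoted post: thread[i] is the same as *(thread + i), and the compiler scales the offset by sizeof(Thread) for you, so you never multiply by the size yourself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

int g_constructed = 0;  // counts how many times the constructor ran

// Hypothetical Thread with a default constructor.
struct Thread {
    int ply;
    Thread() : ply(0) { ++g_constructed; }
};

// new[]: one contiguous block, and the default constructor runs n times.
Thread* makeWithNew(int n) {
    assert(n > 0);
    return new Thread[n];
}

// malloc: reserves raw bytes only; no constructor runs.
void* rawStorage(int n) {
    assert(n > 0);
    return std::malloc(static_cast<std::size_t>(n) * sizeof(Thread));
}

// Placement new constructs Thread objects inside storage obtained elsewhere.
Thread* constructIn(void* storage, int n) {
    assert(storage != nullptr && n > 0);
    unsigned char* bytes = static_cast<unsigned char*>(storage);
    Thread* first = nullptr;
    for (int i = 0; i < n; ++i) {
        Thread* p =
            new (bytes + static_cast<std::size_t>(i) * sizeof(Thread)) Thread();
        if (i == 0) first = p;
    }
    return first;
}
```

Memory from new[] is released with delete[]; memory from malloc is released with free (after running destructors by hand if the type has a non-trivial one, which this toy Thread does not).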
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Starting a new project - Rommie

Post by mvanthoor »

Oh, and I agree with Mergi:

If you can, either write the engine in C, or C++; not in a mix of both languages. C is a fairly small language, but C++ can be a monster.

So my advice would be:
1. Use only C (giving your files the .c extension), and avoid everything that is C++.
2. Give your files the "cpp" extension, but write C, only consciously cherry-picking C++ features such as the Thread or Vector object. (And use it with new() and delete() so the object is properly constructed and destroyed.)
3. Give your files the "cpp" extension and write C++, including classes and avoid anything C.

Don't mix C and C++ style haphazardly.
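
As a rough illustration of the "properly constructed and destroyed" part, here is a sketch that pairs every new with its matching delete for the pointer-table layout discussed earlier in the thread. The toy Thread, the g_alive counter, and the function names are assumptions made for this example.

```cpp
#include <cassert>

int g_alive = 0;  // number of Thread objects currently alive

// Hypothetical Thread with a constructor and destructor that track lifetime.
struct Thread {
    int ply;
    Thread() : ply(0) { ++g_alive; }
    ~Thread() { --g_alive; }
};

// Allocates the pointer table and then one Thread per slot.
// Note: if a later `new Thread` throws, the earlier ones leak in this sketch.
Thread** createThreads(int n) {
    assert(n > 0);
    Thread** table = new Thread*[n];
    for (int i = 0; i < n; ++i) {
        table[i] = new Thread;  // runs Thread()
    }
    return table;
}

// Mirrors createThreads: scalar delete for each Thread, delete[] for the table.
void destroyThreads(Thread** table, int n) {
    for (int i = 0; i < n; ++i) {
        delete table[i];  // runs ~Thread()
    }
    delete[] table;  // array form matches new Thread*[n]
}
```

The rule of thumb is that new pairs with delete, new[] pairs with delete[], and malloc pairs with free; mixing them is undefined behaviour.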
Author of Rustic, an engine written in Rust.
tcusr
Posts: 325
Joined: Tue Aug 31, 2021 10:32 pm
Full name: tcusr

Re: Starting a new project - Rommie

Post by tcusr »

mvanthoor wrote: Fri Mar 04, 2022 10:01 am malloc will then create a block of memory that is 4 * 24 bytes.
calloc will then allocate 4 blocks of 24 bytes, one after another.
memory is allocated the same, the only difference is that calloc will set the memory to zero after allocating
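
A quick sketch of that point, assuming a hypothetical plain-data Thread: calloc gives you the same kind of block as malloc, just already zeroed, so calloc behaves like malloc followed by memset. The helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the engine's Thread struct (plain data only).
struct Thread {
    int id;
    int ply;
};

// Returns true if every byte in the range is zero.
bool allZero(const void* p, std::size_t bytes) {
    const unsigned char* b = static_cast<const unsigned char*>(p);
    for (std::size_t i = 0; i < bytes; ++i) {
        if (b[i] != 0) return false;
    }
    return true;
}

// calloc: one block of n * sizeof(Thread) bytes, already set to zero.
Thread* zeroedWithCalloc(std::size_t n) {
    assert(n > 0);
    return static_cast<Thread*>(std::calloc(n, sizeof(Thread)));
}

// malloc: same size of block, but it must be zeroed by hand.
Thread* zeroedWithMalloc(std::size_t n) {
    assert(n > 0);
    void* p = std::malloc(n * sizeof(Thread));
    if (p != nullptr) std::memset(p, 0, n * sizeof(Thread));
    return static_cast<Thread*>(p);
}
```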
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Starting a new project - Rommie

Post by Mike Sherwin »

Mergi wrote: Fri Mar 04, 2022 10:12 am
Mike Sherwin wrote: Fri Mar 04, 2022 2:56 am Now I am confused again. I know new returns a pointer to some object like a simple int or an array or a struct or class, whatever. But does it also create a dynamic array of pointers and at the same time set aside a block of n structs? Does new guarantee all the structs will be contiguous in memory? Okay that seems like a silly question. So thread[1] is thread + sizeof Thread and thread[2] is thread + sizeof Thread * 2 etc. My way would be thread[2] = thread + sizeof pointer * 2. Thread can be extremely large and allocating a complete block of maxThreads might more easily fail. Allocating one Thread at a time might be less likely to fail. If someone wants 128 Threads and they do not have enough memory then one at a time allocation can be stopped at a lesser number of threads and reported to the user. Am I finally getting a handle on things??? Let me say a little prayer before you answer. :lol:
Ah, so that's why you want pointers ... Unless by extremely large you mean dozens of megabytes (or you are planning to run the engine on some extremely memory-constrained system), I still don't see the point of this on modern computers. But you are correct in what you say. The way I used new would create the threads immediately, all at once, in a contiguous block of memory (and call the default constructor for each of them), and return a pointer to the first thread in the array.

One important distinction to note between malloc/calloc and new is that new will call the constructor of the object whereas malloc will not. Malloc will just set aside enough memory but leave all the garbage values there, so you then have to initialize each thread yourself.
Thanks, that answers all my questions. :D Yes, by large I mean kind of humongous: besides the global TT, each thread will have its own TT with about a million entries. Rommie will be able to run in several search modes: as a traditional multithreaded engine; in RT RL mode, playing maxThreads different games using a shallower max depth; or in a special UI mode where maxThreads different games are played in RL mode starting from user-supplied positions. Imagine a chess player preparing for a tournament using a 64-core Threadripper, searching 128 positions of interest from various openings, all playing game after game in RL mode for as long as they want, then being able to save any position to a file with all of its learning intact, so it can be stopped and restarted where it left off anytime they want. That is what I want to create! :D
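
The "allocate one Thread at a time and stop at a lesser number" idea from the quoted post could be sketched roughly like this. It uses new (std::nothrow), which returns a null pointer on failure instead of throwing. The tiny Thread struct and the function names are placeholders, not Rommie's real code.

```cpp
#include <cassert>
#include <new>

// Hypothetical Thread; the real one would carry its own large tables.
struct Thread {
    int ply;
    Thread() : ply(0) {}
};

// Tries to create `wanted` Threads one at a time and stores them in `table`,
// which must have room for `wanted` pointers.
// Returns how many were actually created; stops at the first failure.
int allocateThreads(Thread** table, int wanted) {
    assert(table != nullptr && wanted >= 0);
    int created = 0;
    while (created < wanted) {
        Thread* t = new (std::nothrow) Thread;  // nullptr instead of an exception
        if (t == nullptr) break;                // out of memory: keep what we have
        table[created++] = t;
    }
    return created;
}

// Deletes the first `count` Threads and clears their slots.
void releaseThreads(Thread** table, int count) {
    for (int i = 0; i < count; ++i) {
        delete table[i];
        table[i] = nullptr;
    }
}
```

The caller can compare the returned count with the requested one and report the lower number to the user. One caveat: on systems that overcommit memory, an allocation may appear to succeed and only fail later when the memory is first touched, so this check is not a complete guarantee.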
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Starting a new project - Rommie

Post by Mike Sherwin »

mvanthoor wrote: Fri Mar 04, 2022 10:25 am Oh, and I agree with Mergi:

If you can, either write the engine in C, or C++; not in a mix of both languages. C is a fairly small language, but C++ can be a monster.

So my advice would be:
1. Use only C (giving your files the .c extension), and avoid everything that is C++.
2. Give your files the "cpp" extension, but write C, only consciously cherry-picking C++ features such as the Thread or Vector object. (And use it with new() and delete() so the object is properly constructed and destroyed.)
3. Give your files the "cpp" extension and write C++, including classes and avoid anything C.

Don't mix C and C++ style haphazardly.
Thanks! I got it now!!! :D :D :D

I like option 2.

Now I must get busy and try to become Super-programmer-man if I am to get anything done.
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Starting a new project - Rommie

Post by Mike Sherwin »

Code:

// Final Thread pointer setup example.

// declarations:
s32 maxThreads;
Thread** thread;
Thread* t;

// initialization:
  // must figure out how to get the number of available threads
  // and then reduce to the number of threads the user requests
  maxThreads = 32;

  thread = new Thread*[maxThreads];

  for (s32 i = 0; i < maxThreads; i++) {
	thread[i] = new Thread;
  }

  t = thread[0];
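
For the "number of available threads" comment in the code above, one possible approach is std::thread::hardware_concurrency() from the standard library, clamped against what the user asked for. This is only a sketch; the function names and the policy (a request of 0 means "use everything") are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Asks the standard library how many hardware threads exist.
// The value is only a hint and may be 0 when it cannot be determined.
unsigned detectHardwareThreads() {
    return std::thread::hardware_concurrency();
}

// Picks the thread count: the user's request, limited to what is available.
// A request of 0 is treated here as "use all available threads".
unsigned chooseThreadCount(unsigned requested, unsigned available) {
    if (available == 0) available = 1;  // unknown hardware: fall back to one
    unsigned result =
        (requested == 0) ? available : std::min(requested, available);
    assert(result >= 1);
    return result;
}
```

Usage would be along the lines of maxThreads = chooseThreadCount(userRequest, detectHardwareThreads()); before the allocation loop.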
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Starting a new project - Rommie

Post by Mike Sherwin »

The GenMoves() design is now better defined, and the name has changed to GenPiecesBB(). Only the pieces' bb's are generated and stored in the Thread structure by ply and square. So no bb's have been spun into a move list yet, saving a substantial amount of time. Only WP2 has been added because pawns need to be handled a little differently. That is because attacks on squares are also accumulated by ply in the Thread structure. At the end of GenPiecesBB() the accumulated attacks are used for legality checking and later to help with move ordering. So here is a little update to show this. The next update will be when GenPiecesBB() is finished.

Code:

struct Thread {
  u64 genbb[100][64];
  u64 atkbb[100];
  u64 pieceSquareBits[2];
  s32 board[64];
  s32 stm;
  s32 ply;
};

#define genbb t->genbb
#define atkbb t->atkbb
#define pieceSquareBits t->pieceSquareBits
#define board t->board
#define stm t->stm
#define ply t->ply

s32 GenPiecesBB(Thread* t) {
  u64 fSquares, at, enemy, notMe, occ;
  u08 fs, sq;
  s32 ft;

  fSquares = pieceSquareBits[stm];
  notMe = 0xffffffffffffffff ^ fSquares;
  enemy = pieceSquareBits[1 - stm];
  occ = fSquares | enemy;
  atkbb[ply] = 0;

  do {
	fs = std::countr_zero(fSquares);
	fSquares ^= 1ull << fs;
	ft = board[fs];
	switch (ft) {
	case EMPTY:
	  // can't get here
	  break;
	case WP2:
	  sq = std::countr_zero(wPawnMoves[fs] & occ);
	  genbb[ply][fs] = wPawnMoves[fs] & below[sq];
	  at = wPawnCapts[fs];
	  genbb[ply][fs] |= at & enemy;
	  atkbb[ply] |= at;
	  break;
	case WP3:

	  break;
	case WP4:

	  break;
	case WP5:

	  break;
	case WP6:

	  break;
	case WP7:

	  break;
	case WN:

	  break;
	case WB:

	  break;
	case WRC:
	case WR:

	  break;
	case WQ:
	  occ |= 0x8000000000000001;
	  genbb[ply][fs] = (ray[std::countr_zero(ray[fs].rwsNW & occ)].rayNW
		| ray[std::countr_zero(ray[fs].rwsNN & occ)].rayNN
		| ray[std::countr_zero(ray[fs].rwsNE & occ)].rayNE
		| ray[std::countr_zero(ray[fs].rwsEE & occ)].rayEE
		| ray[63 - std::countl_zero(ray[fs].rwsSE & occ)].raySE
		| ray[63 - std::countl_zero(ray[fs].rwsSS & occ)].raySS
		| ray[63 - std::countl_zero(ray[fs].rwsSW & occ)].raySW
		| ray[63 - std::countl_zero(ray[fs].rwsWW & occ)].rayWW)
		^ qob[fs]; // parenthesized: ^ binds tighter than | in C++
	  atkbb[ply] |= genbb[ply][fs];
	  genbb[ply][fs] &= notMe;
	  break;
	case WKC:

	case WK:

	  break;
	case WCS:
	case WCL:
	  // can't get here
	  break;
	case BP7:

	  break;
	case BP6:

	  break;
	case BP5:

	  break;
	case BP4:

	  break;
	case BP3:

	  break;
	case BP2:

	  break;
	case BN:

	  break;
	case BB:

	  break;
	case BRC:
	case BR:

	  break;
	case BQ:
	  occ |= 0x8000000000000001;
	  genbb[ply][fs] = (ray[std::countr_zero(ray[fs].rwsNW & occ)].rayNW
		| ray[std::countr_zero(ray[fs].rwsNN & occ)].rayNN
		| ray[std::countr_zero(ray[fs].rwsNE & occ)].rayNE
		| ray[std::countr_zero(ray[fs].rwsEE & occ)].rayEE
		| ray[63 - std::countl_zero(ray[fs].rwsSE & occ)].raySE
		| ray[63 - std::countl_zero(ray[fs].rwsSS & occ)].raySS
		| ray[63 - std::countl_zero(ray[fs].rwsSW & occ)].raySW
		| ray[63 - std::countl_zero(ray[fs].rwsWW & occ)].rayWW)
		^ qob[fs]; // parenthesized: ^ binds tighter than | in C++
	  atkbb[ply] |= genbb[ply][fs];
	  genbb[ply][fs] &= notMe;
	  break;
	case BKC:

	case BK:

	  break;
	}

  } while (fSquares);

  return true;
}
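
The bit-scan loop at the heart of the function above (find the lowest set bit, handle that square, clear the bit) can be tried out in isolation. This is a stripped-down sketch with made-up names (lowestSetBit, squaresOf), not part of Rommie. It uses std::countr_zero when compiled as C++20 and a slow hand-written fallback otherwise, purely so the example builds on older compilers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>
#if __cplusplus >= 202002L
#include <bit>
#endif

using u64 = std::uint64_t;

// Index (0..63) of the lowest set bit; the bitboard must not be empty.
int lowestSetBit(u64 bb) {
    assert(bb != 0);
#if __cplusplus >= 202002L
    return std::countr_zero(bb);
#else
    // Pre-C++20 fallback, for illustration only.
    int n = 0;
    while ((bb & 1u) == 0) {
        bb >>= 1;
        ++n;
    }
    return n;
#endif
}

// Returns the squares of all set bits, lowest square first.
std::vector<int> squaresOf(u64 bb) {
    std::vector<int> squares;
    while (bb) {                     // a while loop also copes with an empty board
        squares.push_back(lowestSetBit(bb));
        bb &= bb - 1;                // clears the lowest set bit
    }
    return squares;
}
```

The sketch uses while rather than do/while because std::countr_zero(0) returns 64 for a 64-bit value, and shifting 1ull by 64 is undefined. In the engine that probably cannot happen, since each side always has a king on the board, but the while form costs nothing extra.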
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: Starting a new project - Rommie

Post by Mike Sherwin »

There are a couple of errors in the code above that I have since fixed. I'll wait until the gen function is complete and I'm sure it is correct before an update is made.