pedrojdm2021 wrote: ↑Thu Jul 01, 2021 6:01 am
a) memory allocations your engine does in search
i don't know if i am doing that... each node i get a new list<int> instance from the moveGenerator,
Which is it?
You are in serious trouble if you don't know when or where you're allocating memory. If you can't say where your code allocates on the heap (reference types), you're unprepared to troubleshoot performance issues. If you "new up" class instances inside tight loops (like a chess engine's search method), then let those instances fall out of scope (or be overwritten by newer instances), you put extreme pressure on the garbage collector to sweep and compact memory.
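If you're not sure whether a code path allocates, you can measure it directly. A minimal sketch (requires .NET Core 3.0 or later for GC.GetAllocatedBytesForCurrentThread):

```csharp
using System;
using System.Collections.Generic;

class AllocCheck
{
    static void Main()
    {
        // Snapshot heap allocation before and after the suspect code path.
        long before = GC.GetAllocatedBytesForCurrentThread();
        var moves = new List<int>(64);   // the pattern from the quoted search code
        long after = GC.GetAllocatedBytesForCurrentThread();
        // True: the List object and its backing array both hit the heap.
        Console.WriteLine(after - before > 0);
    }
}
```

Run a check like this around your movegen call inside search; if the byte count grows on every node, you've found your GC pressure.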
Code: Select all
TTable.Store(new TTEntry(TTFlag.BETA, board.hash, _depth , _beta), ply);
moves = new List<int>(64);
Ugh. Memory allocation inside search.
Let's break it down: Assume a bitboard chess engine can generate moves at 50 million NPS (incremented at Board.PlayMove), and assume the average middlegame position contains 30 legal moves. Searching the tree of potential continuations then involves generating 50 million / 30 ≈ 1.67 million move lists per second. It is far more efficient to write into a pre-allocated array of moves (preferably encoded as a primitive type like uint or ulong) than to allocate and then garbage-collect (GC) 1.67 million List<ulong> instances every second.

The situation is worse than that: Chess engines are selective searchers. They don't search all 30 moves of the average middlegame position. The alpha / beta algorithm causes beta cutoffs much earlier than move 30, ideally by move 2 or less on average. So an engine generates (read: "pays a cost in decreased performance for") many moves it never searches.

Even worse: The .NET implementation of List<T> starts with an empty backing array and allocates a small one (capacity 4 in current .NET) on the first Add. Each time you add an item beyond the last index, it allocates a new array of 2x the size and copies the existing items into it. The GC will determine each abandoned array is no longer referenced and will eventually collect it and compact the heap.
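A tiny console sketch makes the doubling behavior visible (assuming .NET's current growth policy: empty backing array, capacity 4 on the first Add, doubling thereafter):

```csharp
using System;
using System.Collections.Generic;

class ListGrowthDemo
{
    static void Main()
    {
        var list = new List<int>();
        int lastCapacity = -1;
        for (int i = 0; i < 40; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                // On current .NET this reports capacities 4, 8, 16, 32, 64.
                Console.WriteLine($"Count={list.Count,2}  Capacity={list.Capacity}");
                lastCapacity = list.Capacity;
            }
        }
        // Every capacity change above abandoned the old backing array,
        // leaving garbage for the collector to sweep. Multiply that by
        // millions of move lists per second inside search.
    }
}
```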
In a business application with no hot path that repeats operations millions of times, this has little practical consequence. Business developers are more focused on flexible software designs and application lifecycle management (DI, SOA, unit testing, dynamic config, DevOps, and the like) than on performance.
In a chess engine, it's disastrous. All of this memory churn (allocate and GC) is happening inside a recursive, tight loop (the search method). Done inefficiently, it will kill performance. Search will always be slower than perft (counting legal moves) because of the "bookkeeping" it must do to order, reduce, and prune moves, plus the cost of statically evaluating inner and leaf positions. However, if search drops your move generation speed from 50 million NPS to 100,000 NPS, something is seriously wrong with your implementation. My guess is too much "new", causing too many expensive GC sweeps.
Pre-allocate at engine startup everything used by search. Banish the "new" keyword from search or any method called by search.
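Here's one way that can look in practice: a sketch (all names here are illustrative, not your engine's) where the engine owns a single flat ulong move buffer, sliced per ply, and movegen writes into it instead of returning a new List:

```csharp
using System;

// Illustrative stub; your Board holds the bitboards.
sealed class Board { }

sealed class Engine
{
    const int MaxPly = 64;
    const int MaxMovesPerPly = 256;

    // Allocated ONCE, at engine startup. Search indexes into these by ply;
    // no allocation ever happens on the hot path.
    readonly ulong[] _moves = new ulong[MaxPly * MaxMovesPerPly];
    readonly int[] _moveCounts = new int[MaxPly];

    // Movegen writes encoded moves into this ply's slice and returns the count.
    int GenerateMoves(Board board, int ply)
    {
        int offset = ply * MaxMovesPerPly;
        int count = 0;
        // Stand-in for real generation: each legal move encoded as a ulong.
        _moves[offset + count++] = 0xE2E4UL;
        _moves[offset + count++] = 0xD2D4UL;
        _moveCounts[ply] = count;
        return count;
    }

    public int Search(Board board, int ply)
    {
        if (ply >= 2) return 0; // toy depth limit for the demo

        int count = GenerateMoves(board, ply);
        int offset = ply * MaxMovesPerPly;
        for (int i = 0; i < count; i++)
        {
            ulong move = _moves[offset + i];
            // Real engine: PlayMove(move); score = -Search(...); UndoMove(move);
            Search(board, ply + 1);
        }
        return count;
    }

    static void Main()
    {
        var engine = new Engine();
        Console.WriteLine(engine.Search(new Board(), 0)); // no per-node allocation
    }
}
```

The per-ply slicing matters: each recursion depth gets its own region of the buffer, so a child node's move generation never overwrites the parent's move list.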
i don't know clearly how to avoid declaration of that stuff inside the methods that they are in use
It sounds like you need to break your habit of allocating locally and letting class instances fall out of scope: the "let the garbage collector handle it" mentality. To unlock improved performance, you need to solve the problem of how to pre-allocate everything at engine startup and pass references (and index positions) to the search and eval methods.
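The same idea applies to your TTable.Store line: instead of "new TTEntry(...)" per store, make the entry a struct and keep the whole table as one array of structs allocated at startup. A sketch, reusing the TTEntry / TTFlag names from your post (field names and signatures here are assumptions):

```csharp
using System;

enum TTFlag : byte { Exact, Alpha, Beta }

// A struct lives inline inside the array: zero heap allocation per entry.
struct TTEntry
{
    public ulong Hash;
    public int Depth;
    public int Score;
    public TTFlag Flag;
}

sealed class TTable
{
    readonly TTEntry[] _entries;
    readonly ulong _mask;

    public TTable(int sizePowerOfTwo)
    {
        _entries = new TTEntry[sizePowerOfTwo]; // one allocation, at startup
        _mask = (ulong)sizePowerOfTwo - 1;
    }

    // Overwrites the slot in place. No "new", no GC pressure, ever.
    public void Store(ulong hash, int depth, int score, TTFlag flag)
    {
        ref TTEntry slot = ref _entries[hash & _mask];
        slot.Hash = hash;
        slot.Depth = depth;
        slot.Score = score;
        slot.Flag = flag;
    }

    public bool TryProbe(ulong hash, out TTEntry entry)
    {
        entry = _entries[hash & _mask];
        return entry.Hash == hash;
    }

    static void Main()
    {
        var tt = new TTable(1 << 20);
        tt.Store(0xABCDUL, depth: 8, score: 25, flag: TTFlag.Beta);
        Console.WriteLine(tt.TryProbe(0xABCDUL, out var e) && e.Score == 25); // True
    }
}
```

Note the `ref` local: it writes straight into the array slot rather than copying the struct out and back. That's the pattern to aim for everywhere in search: fixed storage allocated once, indexed access on the hot path.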