Question regarding perft capture results

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Ras
Posts: 2488
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Question regarding perft capture results

Post by Ras »

diegopaiva wrote: Fri Jun 04, 2021 11:10 pmI implemented magic bitboards and created a function to test whether a move is legal, instead of making and unmaking moves. However, my perft(6) on the starting position still runs is something like 10 seconds, which is upsetting me.
That's odd unless you have very slow hardware. On my 3400G (Zen+ Ryzen), perft 6 takes 5.8 seconds. I don't have bitboards, no TT or bulk counting in perft, I first do a move and then do the in-check verification to see whether it was legal, and my move generator isn't really efficient. However, I still get around 20M NPS in perft on the initial position, which is more than ten times the node rate in real search, so the move generator isn't a bottleneck.
Rasmus Althoff
https://www.ct800.net
diegopaiva
Posts: 5
Joined: Fri Jun 04, 2021 9:29 pm
Full name: Diego Paiva

Re: Question regarding perft capture results

Post by diegopaiva »

mvanthoor wrote: Fri Jun 04, 2021 11:50 pm
diegopaiva wrote: Fri Jun 04, 2021 11:10 pm Regarding your item (2), I did not even think about this. You said it makes perft much faster. Could this be the reason why my perft(6), e.g., is not very fast? Of course there are a lot of implementation details I am not taking in account here, but is there any rough estimate on how much speedup a TT provides in that case?
Hi,

You don't need a TT to run perft(6) in a resonable time. In my case it takes a bit over 3 seconds without a TT. (With no special perft tricks such as bulk counting.)

First try to make your perft correct; compare against the perft results in the chess programming wiki. You can also create a function that runs through a perft testing suite. Rustic has one, and has a feature to compile the suite into the engine:

https://github.com/mvanthoor/rustic/blo ... ra/epds.rs

First, get perft to run right, then optimize.

But, off the bat... make/unmake (where unmake can also be done if a move turns out to be illegal) is much faster than copying the board, making a move, en then throwing the board away. Is your code online somewhere?
My perft results are perfectly correct, I have a long test suite based on the results from the Chess Programing Wiki. I do am making/unmaking moves inside the perft function, and testing them for legality inside the move generation function.

Now I am completely focused on optimizing it. However I do not have any more ideas to increase the performance... I am using the compiler flag O3 and trying to catch some implementation details, but still unable to run perft(6) in less than 9 seconds. My code is on github but a little messy for now, maybe I can share it sometime.
diegopaiva
Posts: 5
Joined: Fri Jun 04, 2021 9:29 pm
Full name: Diego Paiva

Re: Question regarding perft capture results

Post by diegopaiva »

Ras wrote: Sat Jun 05, 2021 2:00 am
diegopaiva wrote: Fri Jun 04, 2021 11:10 pmI implemented magic bitboards and created a function to test whether a move is legal, instead of making and unmaking moves. However, my perft(6) on the starting position still runs is something like 10 seconds, which is upsetting me.
That's odd unless you have very slow hardware. On my 3400G (Zen+ Ryzen), perft 6 takes 5.8 seconds. I don't have bitboards, no TT or bulk counting in perft, I first do a move and then do the in-check verification to see whether it was legal, and my move generator isn't really efficient. However, I still get around 20M NPS in perft on the initial position, which is more than ten times the node rate in real search, so the move generator isn't a bottleneck.
I don't believe my hardware is that slow... I am running it on my laptop i7-8565U, 1.80GHz. For the sake of comparison I tested Stockfish's perft(6) and it runs in less than a second, so I expected my results to be something between 1 and 7 seconds, at most. I don't really know what is the catch here.
diegopaiva
Posts: 5
Joined: Fri Jun 04, 2021 9:29 pm
Full name: Diego Paiva

Re: Question regarding perft capture results

Post by diegopaiva »

Ok, I just found out my perft function was screwed. I had written it a long time ago (while studying Stockfish's code) and it had a subtle mistake. After visiting https://www.chessprogramming.org/Perft again I realized it. This was it before:

Code: Select all

unsigned long long move_generator::perft(int depth, bool root)
{
  // This is slowing down the code
  if (depth == 0)
    return 1;

  unsigned long long nodes = 0, count = 1;
  double total_elapsed_seconds = 0;

  for (Move move : generate_moves()) {
    auto start = high_resolution_clock::now();

    if (root && depth <= 1) {
      nodes++;
    }
    else {
      POS.make_move(move);
      count = perft(depth - 1, false);
      nodes += count;
      POS.unmake_move(move);
    }

    auto finish = high_resolution_clock::now();
    auto elapsed_seconds = duration_cast<duration<double, std::ratio<1>>>(finish - start).count();

    if (root) {
      total_elapsed_seconds += elapsed_seconds;
      fmt::print("{}: {:>15L} ({:.2Lf}s)\n", move.to_string(), count, elapsed_seconds);
    }
  }

  if (root) {
    fmt::print("\nTotal time (s) : {:>15.2Lf}\n", total_elapsed_seconds);
    fmt::print("Nodes          : {:>15L}\n", nodes);
    fmt::print("Nodes/second   : {:>15L}\n\n", (int) (nodes/total_elapsed_seconds));
  }

  return nodes;
}
I rewrote it as follows:

Code: Select all

unsigned long long move_generator::perft(int depth, bool root)
{
  std::vector<Move> move_list = depth <= 1 ? generate_moves(true)
                                           : generate_moves(false);

  unsigned long long nodes = 0, decremented_perft = 0;
  double total_elapsed_seconds = 0;

  for (Move move : move_list) {
    auto start = high_resolution_clock::now();

    if (depth <= 1) {
      decremented_perft++;
      nodes++;
    }
    else {
      POS.make_move(move);

      if (!POS.in_check(POS.inactive_side)) {
        decremented_perft = perft(depth - 1, false);
        nodes += decremented_perft;
      }

      POS.unmake_move(move);
    }

    auto finish = high_resolution_clock::now();
    auto elapsed_seconds = duration_cast<duration<double, std::ratio<1>>>(finish - start).count();

    if (root) {
      total_elapsed_seconds += elapsed_seconds;
      fmt::print("{}: {:>15L} ({:.2Lf}s)\n", move.to_string(), decremented_perft, elapsed_seconds);
    }
  }

  if (root) {
    fmt::print("\nTotal time (s) : {:>15.2Lf}\n", total_elapsed_seconds);
    fmt::print("Nodes          : {:>15L}\n", nodes);
    fmt::print("Nodes/second   : {:>15L}\n\n", (int) (nodes/total_elapsed_seconds));
  }

  return nodes;
}

Where generate_moves(true) generates all legal moves and generate_moves(false) generates all pseudolegal moves.
Now it is running in something like 7 seconds. But I tested other versions without looping when I am in depth 1 and making nodes equal to move_list.size() and I made it to ~3.5 seconds. There is room for improvement, but I'll do it tomorrow since it is almost 4am here :lol:
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Question regarding perft capture results

Post by mvanthoor »

diegopaiva wrote: Sat Jun 05, 2021 8:14 am Ok, I just found out my perft function was screwed...
I just noticed you're not the topic-starter. It would have been better if you had opened your own topic with this question.
Where generate_moves(true) generates all legal moves and generate_moves(false) generates all pseudolegal moves.
Personally, I'd advise against this type of code.

generate_legal_moves(true) => generates all legal moves
generate_legal_moves(false) => generates all non-legal moves (what use is that?)

I mean: that's how the code literally reads because from the outside, you don't know what "true" and "false" are for. If you do things such as this across your entire engine, other people won't understand the code when it gets a bit more complex. Introduce some constants:

generate_legal_moves(ALL)
generate_legal_moves(QUIET)
generate_legal_moves(CAPTURES)

Now it's perfectly clear what is generated.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
diegopaiva
Posts: 5
Joined: Fri Jun 04, 2021 9:29 pm
Full name: Diego Paiva

Re: Question regarding perft capture results

Post by diegopaiva »

mvanthoor wrote: Sat Jun 05, 2021 12:45 pm I just noticed you're not the topic-starter. It would have been better if you had opened your own topic with this question.
Yes... I am sorry for that. I did not believe this could go too long.
mvanthoor wrote: Sat Jun 05, 2021 12:45 pm Personally, I'd advise against this type of code.

generate_legal_moves(true) => generates all legal moves
generate_legal_moves(false) => generates all non-legal moves (what use is that?)

I mean: that's how the code literally reads because from the outside, you don't know what "true" and "false" are for. If you do things such as this across your entire engine, other people won't understand the code when it gets a bit more complex. Introduce some constants:

generate_legal_moves(ALL)
generate_legal_moves(QUIET)
generate_legal_moves(CAPTURES)

Now it's perfectly clear what is generated.
I know passing booleans to functions is not really a good practice, as stated by Uncle Bob. I am using a keyword argument. The function signature is generate_moves(legal = true). Unfortunately, unlike Python, C++ does not allow the keyword in function call. But thanks for the catch, I might change that.