lithander wrote: ↑Mon Feb 22, 2021 5:39 pm
Marcel, sorry for hijacking your thread. I was just typing out the question 'Any ideas why there's an order of magnitude difference in speed?' when I realized that this is really offtopic, right? This should be about Rustic! Sorry, I'm going to shut up now.
It's alright; you could (unintentionally) post something that's useful to me, even if it's not about Rustic or Rust itself.
With regard to perft speed comparison, you have to be careful.
- If a move generator uses bulk-counting, that will speed up perft extremely.
- Speed will be even higher if there's even a 1 MB hash table available.
- A move generator can use "tricks" to be fully legal, such as not generating moves for pinned pieces, etc... this will speed up the move generator, and thus perft. (It also has a small impact on the speed of the search, obviously, but not as much as it has on perft.)
At this point, Rustic does not do bulk counting, the has can be disabled, and it doesn't do any special tricks in the move generator.
36M leaves/sec seem to be right for Rustic's popcnt / bmi2 versions, for the starting position, at least on a CPU such as a i7-6700K.
20 Mlps feels slow, even for the generic 64-bit code, but that can be because of the CPU. I even don't know for sure (can't find any definitive information) if "target-cpu" in Rust actually enables ALL of the available instruction sets, or only the default ones, and then generates code for the actual micro-architecture.
For Alpha 2, I'm going to look into Stockfish's makefile, to see how it compiles for Clang. I can then do the same for Rust, because both Clang and Rust use LLVM as a backend (and thus have the same options).
With regard to other engines, the problem with speed can be manifold. Using bitboards is not a silver bullet.
- It could be that they use massive data structures such as: Move {int ..., int ..., int ...., int ....} instead of single int for the entire move (working by bits), and then copy those structures around.
- It could be that they allocate memory where no allocation is necessary. For example, use a vector of size 1 for the move list, and then it needs to be reallocated for each move that is added (better start with something like size 256, or better yet, use an array and keep the move list on the stack)
- It could be that they are using objects for everything, such as a King object, Queen object, even with each Square being an object, all packed within a Board object, everything with its own interface. That can be very neat if you tend to have 5 board implementations and several different piece implementations, but as a piece can also just be a number in an integer array, the object-oriented approach is probably slower.
- it could be that code is just written inefficiently. Instead of assigning a sort order score to moves and then just picking the fastest one, it could be that they actually sort the list with something very slow like bubble sort (and then move entire Move { } objects around instead of pointers, or integers)
- ... and so on.
One would need to profile an engine to see where the problems are, and then start refactoring at the point of the worst one. (Which could actually mean that starting from scratch can be faster.)
===
I just (like five minutes ago) nuked my entire hash table. It worked, but I ended up with a generic object (the HashData struct) in a dynamic object (the Hash bucket), in a dynamic object (the Hash table), which were all stuck together using traits, some of which were generic themselves.
Even though it worked, and speed didn't seem to suffer (at least not measurably), the code was... somewhat... much... more convoluted than intended.
I'm going to redesign the hash table from scratch, because I probably made some massive mistakes a year ago by not doing things by the Rust philosophy. I wouldn't be surprised if this caused those +/- 100 lines of massively complicated and unreadable code. Ah, well. I'll just start over. It's not a long term project for nothing. Writing this chess engine is as much about learning about Rust than it is about writing a chess engine. (Some people wrote 3-4 engines before they actually started on their main one, so I think I can rewrite a hash table if need be
)
I want the hash table to be generic, so I can use it for perft, search, pawns, and later, for evaluation stuff. However, I don't want to use a HashMap (which would probably be much easier), because I want to be able to control the details such as replacement schemes and such myself, without being dependent on HashMap's implementation.
See you later, when I've learned to design generic hash table in Rust that looks more like this:
and not like this: