My Perft Results

dangi12012 · Post by **dangi12012** » Fri Nov 04, 2022 4:15 pm

Joost Buijs wrote: ↑Fri Nov 04, 2022 3:04 pm
dangi12012 wrote: ↑Fri Nov 04, 2022 2:32 pm Optimized Bitboard approach:

Perft 6: 119060324 88ms
Perft 7: 3195901860 2225ms
Ankan's GPU Perft is still faster than this, and that is on a mere GTX-780.

http://www.talkchess.com/forum3/viewtopic.php?t=48387

I wonder what it would do on a RTX-4090Ti.

I tried compiling this for some time now for rtx 3080 with cuda 11.x - is there a version that works with current CUDA toolkit and drivers? If not then a 4090 build is impossible.
Or is there a win64 build somewhere? (should work - not optimal performance)

JVMerlino · Post by **JVMerlino** » Fri Nov 04, 2022 4:29 pm

Mike Sherwin wrote: ↑Thu Nov 03, 2022 10:56 pm Perft measurement itself has its mysteries. Many authors report n/s without specifying which speedup tricks were used or not used.

Indeed, and therefore it is a bit surprising that nobody seems to have mentioned, when reporting their times, whether or not they are using bulk counting.

My engine, Myrddin, is pretty slow when it comes to perft times. With bulk counting, it does perft 6 from the initial position in 2.3 seconds.

Joost Buijs · Post by **Joost Buijs** » Fri Nov 04, 2022 4:47 pm

JVMerlino wrote: ↑Fri Nov 04, 2022 4:29 pm
Mike Sherwin wrote: ↑Thu Nov 03, 2022 10:56 pm Perft measurement itself has its mysteries. Many authors report n/s without specifying which speedup tricks were used or not used.
Indeed, and therefore it is a bit surprising that nobody seems to have mentioned, when reporting their times, whether or not they are using bulk counting.

My engine, Myrddin, is pretty slow when it comes to perft times. With bulk counting, it does perft 6 from the initial position in 2.3 seconds.

The 0.79 sec. reported for Nightmare is with bulk counting. Without bulk counting the time it takes does not seem to increase. Maybe the compiler is so smart that it sees that the extra move-make and move-unmake don't do anything useful and that it removes the call to both functions, otherwise I can't explain this.

Joost Buijs · Post by **Joost Buijs** » Fri Nov 04, 2022 5:08 pm

dangi12012 wrote: ↑Fri Nov 04, 2022 4:15 pm
Joost Buijs wrote: ↑Fri Nov 04, 2022 3:04 pm
dangi12012 wrote: ↑Fri Nov 04, 2022 2:32 pm Optimized Bitboard approach:

Perft 6: 119060324 88ms
Perft 7: 3195901860 2225ms
Ankan's GPU Perft is still faster than this, and that is on a mere GTX-780.

http://www.talkchess.com/forum3/viewtopic.php?t=48387

I wonder what it would do on a RTX-4090Ti.
I tried compiling this for some time now for rtx 3080 with cuda 11.x - is there a version that works with current CUDA toolkit and drivers? If not then a 4090 build is impossible.
Or is there a win64 build somewhere? (should work - not optimal performance)

I don't have an executable for it, maybe somebody else does. CUDA changed quite a lot over the years, probably it will be a lot of work to make it compatible with current development tools.

Mike Sherwin · Post by **Mike Sherwin** » Fri Nov 04, 2022 10:00 pm

hgm wrote: ↑Fri Nov 04, 2022 9:19 am
Mike Sherwin wrote: ↑Thu Nov 03, 2022 10:56 pmMy perft 6 takes about 3.5 seconds. It is written in C. And it does not use the fastest move generator, magic. Mine uses no tricks and another's that uses no tricks runs 50% faster than mine, but he uses magic. Crafty also uses magic and does perft 6 in 2 seconds. I don't think Bob uses any tricks because his perft has always cared about accuracy and not speed.

Qperft uses mailbox, and it calculates perft(6) in 0.7 sec on my 3.2GHz i7.

Is that not with all tricks active? What does it do with no tricks. With all permutations of tricks? As I have said many times before mine makes and unmakes all moves. So maybe my 30 million n/s is not so bad. And I need to test after adding magic. And it would be nice if there were well defined examples to compare results with.

Ras · Post by **Ras** » Fri Nov 04, 2022 10:25 pm

JVMerlino wrote: ↑Fri Nov 04, 2022 4:29 pmIndeed, and therefore it is a bit surprising that nobody seems to have mentioned, when reporting their times, whether or not they are using bulk counting.

I have a pseudo legal move generator. If the moving piece is not a king, not en passant, and not a check evasion, then I use the alignment of the moving piece and possibly verify that there is no pinning enemy piece in the direction of the possible alignment. I'm doing the same in actual engine search where that spares the in-check verification for most moves.

That means most of the final moves in the perft tree don't need make / check verification / unmake. However, that's not quite bulk counting, either.

hgm · Post by **hgm** » Fri Nov 04, 2022 10:31 pm

Mike Sherwin wrote: ↑Fri Nov 04, 2022 10:00 pm
hgm wrote: ↑Fri Nov 04, 2022 9:19 am
Mike Sherwin wrote: ↑Thu Nov 03, 2022 10:56 pmMy perft 6 takes about 3.5 seconds. It is written in C. And it does not use the fastest move generator, magic. Mine uses no tricks and another's that uses no tricks runs 50% faster than mine, but he uses magic. Crafty also uses magic and does perft 6 in 2 seconds. I don't think Bob uses any tricks because his perft has always cared about accuracy and not speed.

Qperft uses mailbox, and it calculates perft(6) in 0.7 sec on my 3.2GHz i7.
Is that not with all tricks active? What does it do with no tricks. With all permutations of tricks? As I have said many times before mine makes and unmakes all moves. So maybe my 30 million n/s is not so bad. And I need to test after adding magic. And it would be nice if there were well defined examples to compare results with.

Not making the moves is not a 'trick'. Making and UnMaking them without doing anything in between is just stupid. An engine would not make any moves in the leaves of the search tree. So why should perft do that. The purpose is to count them, not to make them.

Ras · Post by **Ras** » Fri Nov 04, 2022 11:41 pm

hgm wrote: ↑Fri Nov 04, 2022 10:31 pmAn engine would not make any moves in the leaves of the search tree. So why should perft do that. The purpose is to count them, not to make them.

An engine would not count the moves in the leaves, either. It would call eval instead. I think the purpose of perft is to be as close to the engine move generator as possible as to allow checking its performance and correctness. So, if the engine design relies on make / check verification / unmake, then it's perfectly reasonable to use that same way also in perft. If however the engine uses some way to avoid this overhead, partially or completely, then it should do the same thing in perft.

syzygy · Post by **syzygy** » Sat Nov 05, 2022 12:39 am

JoAnnP38 wrote: ↑Thu Nov 03, 2022 7:44 pmFor simplicity's sake, I am currently dynamically allocating a list to hold all of the generated moves, and this happens on every call to GenerateMoves. I suspect this needs to be changed to use a single array that is shared by all moves to eliminate the heap allocations. But I don't have concrete evidence of that yet.

Dynamic allocation on the heap like with malloc() in C and new in C++ should definitely be avoided.

However, if you can get C# to allocate the move list on the stack, it should be fine. I suppose that means allocating a local array in search() and forwarding a reference to it in GenerateMoves(). But I don't know C#.

A single large array is not a necessity. (It might be slightly faster since you can get a somewhat better utilisation of cache lines.)

While the node counts are all "correct" for these tests, I have to admit I am surprised by the exponential explosion in timings, especially when going from 4 to 5 or 6 ply. All constructive comments welcome.

Node counts grow exponentially, so total time will grow exponentially, too. To have a better idea of the speed, calculate nodes per second.

To compare your perft with another engine, you have to know whether you are counting the same thing. If your engine checks for legality by making the move, checking whether the position is legal (enemy king not in check), and unmaking the move, then your perft will be very much slower than if your engine checks legality without actually making the move (or only generates legal moves).

syzygy · Post by **syzygy** » Sat Nov 05, 2022 12:59 am

JoAnnP38 wrote: ↑Fri Nov 04, 2022 1:42 amI hear ya. I knew that when I started, but at that time I was favoring correctness over performance. But now seeing how far I'm off from where I want to be it looks like I made the wrong choice. Live and learn as they say.

Best way to learn is from your own mistakes! (Or let's call them suboptimal design choices.)

My Perft Results

Re: My Perft Results

Re: My Perft Results

Re: My Perft Results

Re: My Perft Results

Re: My Perft Results

Re: My Perft Results

Re: My Perft Results

Re: My Perft Results

Re: My Perft Results

Re: My Perft Results