help validating perft numbers

Discussion of chess software programming and technical issues.

Moderator: Ras

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: help validating perft numbers

Post by bob »

hgm wrote:But qperft is not an engine at all. Who wants an engine? It is evaluation that is completely useless. It does not help you at all to count move paths...

As usual you ar arguing from the tunnel vision of your own limited interests.
I am arguing from the perspective of the person that _created_ this idea, and why it is useful. I wanted to check correctness when I started the Crafty project, because the basic move generator and make/unmake changed dramatically month to month. I also wanted a way to compare approaches in terms of speed.

The hashing idea is nonsensical. Why add something to perft that requires _more_ debugging (the hashing stuff) when the primary purpose is to evaluate correctness and compare performance between two versions of some function.

I released this code, and others adopted the idea for testing, debugging and development. Then it turned into a "speed race" for a few. Best speed test is to compare time to depth for two programs, not compare some silly perft times where there is absolutely no incentive to make that efficient since it has little to do wtth playing chess, other than for the speed/correctness issues I mentioned. Comparing perft numbers between two programs is useful if you compare the node counts to detect errors. Comparing the speeds is about as pointless as comparing lines of code, or number of procedures, or whatever.

No tunnel-vision on my end at all. Simply a practical observation. Your comment, on the other hand.....
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: help validating perft numbers

Post by bob »

stegemma wrote:
bob wrote:...
First question is, what is your board representation? Bitboards have advantages and disadvantages. One significant advantage is in the evaluation where you can ask multiple questions by doing boolean operations on "sets" of information in one instruction. Make/Unmake is not particularly fast compared to a 0x88-like implementation because of redundant work being done (occupied squares, etc). The first place bitboards shows its versatility is when you choose to generate just captures, as here it is simply faster than 0x88. But perft doesn't do a capture-only search so it plays no part in perft.

If I were trying to write the fastest-possible perft, I would use something else. I use perft to check things after making a significant change to the move generator, or make/unmake, or the check detector.
I use a 16x10 board. , that means:

Code: Select all


I=Illegal
V=Vuota (empty)

I,I,I,I,I,I,I,I,I,I
I,I,I,I,I,I,I,I,I,I
I,I,I,I,I,I,I,I,I,I
I,I,I,I,I,I,I,I,I,I
V,V,V,V,V,V,V,V,I,I
V,V,V,V,V,V,V,V,I,I
V,V,V,V,V,V,V,V,I,I
V,V,V,V,V,V,V,V,I,I
V,V,V,V,V,V,V,V,I,I
V,V,V,V,V,V,V,V,I,I
V,V,V,V,V,V,V,V,I,I
V,V,V,V,V,V,V,V,I,I
I,I,I,I,I,I,I,I,I,I
I,I,I,I,I,I,I,I,I,I
The first 2 illegal squares lines are more than needed, one extra illegal square is enough to handle knight on a8 illegal moves (on the left top square). I don't use board indexes but directly square pointers. Its easiest to handle, in assembly, from my point of view. Despite from that, it is a common rapresentation, i think.

Maybe transposition tables used in standard search and applyed to perft can help to debug the transposition tables itself or to collect statistics? In fact, as you said, there are no usefull raison to speed up perft but it becomes interesting if the speed up one is trying can then be applyed to normal search. Using some standard search speed-up in perft can give an information about that speed-up correctness.

Coming back to check test, i test the squares to know if there are an enemy piece of the required types. I test many piece types for any square tested, depending if they are ne suqare near or more to the king. Supposing to have to test for check to a white king in e4, i test:

d5,f5 for K/Q/B/P
e5 for K/Q/R
d3,f3 for K/Q/B
e3 for K/Q

then others "e" column squares for Q/R and "5" line for Q/R and obviously the 2 diagonals for Q/B.

When i find a friend piece or an enemy piece not of desired type i stop testing on that line (column or diagonal).

At end, i test for 8 knights squares.

The test costs "only" 3 assembly instructions per square (the test is in something like an unrolled loop, in assembly - in fact i use macros to write readable code). It is very expensive because it requires testing for 16/17 squares for a just castled king with all pawns unmoved. It means about 50 instructions. If the king is in the middle of the board, it takes more inst. Generating a single moves takes me about 7/8 inst. so the cost of testing for check is about as to generate almost 7 more moves per any move played!!! Obviously making the moves takes more than generate it... so the effective cost, compared with move generation, is less important but still higher than other approach, i think.

In previous programs (Drago/Raffaela) i tested for check only at the next ply (that means not to check at all) but with Freccia i want to try to leave the check as soon as i have played the move (or a little before, if i want further optimize).
There you go. One of the advantages of bitboards is the ability to generate captures without having to "skip" over the non-captures. Perft doesn't measure this because there is no captures-only component. Mailbox is pretty efficient when you want to generate _all_ moves as your comparisons show. I should add a perft option to just generate captures and promotions, which would show just how fast bitboards are doing those things compared to a traditional mailbox program.

That's why I have said that comparing perft speeds really is meaningless unless you are comparing two versions of your own program to see which is fastest.
User avatar
smrf
Posts: 484
Joined: Mon Mar 13, 2006 11:08 am
Location: Klein-Gerau, Germany

Re: help validating perft numbers

Post by smrf »

Inside of a Parallels window on a three year old iMac i asked my old SMIRF for some related statistics:

Code: Select all

FEN: 2K5/3P4/2k1q3/8/3R4/8/8/8 b - - 14 20

=>+-a--b--c--d--e--f--g--h-+ MS Vis.Studio C++ 32-Bit-Vers. 14.00
8 |   :::<K>:::   :::   :::| (Compilation: Oct 15 2009)
7 |:::   :::<P>:::   :::   |
6 |   :::[k]:::[q]:::   :::| Perft Testseries
5 |:::   :::   :::   :::   | (with TT caching +256.0 MB / 4-fold)
4 |   :::   <R>   :::   :::| TT Access Success +92.0%
3 |:::   :::   :::   :::   |
2 |   :::   :::   :::   :::| Smirf Test No.:  0
1 |:::   :::   :::   :::   |
  +-a--b--c--d--e--f--g--h-+ Break Time: +10.00 Sec.

Ply       Nodes      all (x) (ep)     all (+)        (#)      Prom.Cstl.   Sec.
-------------------------------------------------------------------------------
1            24            1    0           3          0          0    0      0
2           360           10    0          60          0         72    0      0
3          7554          322    0         972         49          0    0      0
4        123671         2483    0       19354         11      19236    0      0
5       2611766        95421    0      356160      18606          0    0  0.047
6      45554896       829562    0     6867336      19882    5347056    0  0.219
7     969124580     33605077    0   140936330    6369053          0    0  1.219
8   17883914386    320359510    0  2700920229   11900712 1518589344    0  5.078
9  380865893753  12901002659    0 59088937263 2269772263          0    0  53.20
-------------------------------------------------------------------------------
Jan Brouwer
Posts: 201
Joined: Thu Mar 22, 2007 7:12 pm
Location: Netherlands

Re: help validating perft numbers

Post by Jan Brouwer »

bob wrote:There you go. One of the advantages of bitboards is the ability to generate captures without having to "skip" over the non-captures. Perft doesn't measure this because there is no captures-only component. Mailbox is pretty efficient when you want to generate _all_ moves as your comparisons show. I should add a perft option to just generate captures and promotions, which would show just how fast bitboards are doing those things compared to a traditional mailbox program.

That's why I have said that comparing perft speeds really is meaningless unless you are comparing two versions of your own program to see which is fastest.
So let's try to define a such perft-like command which does measure move generation / execution speed in a manner which is more relevant to a chess program.

Some characteristics I can think of:
- goal is to measure move generation / execution speed, so no caching or skipping the last ply moves tricks
- it should produce an exactly defined number of nodes so different implementations can be compared just like perft
- a considerable fraction of nodes should consist of the best MVV/LVA move available without siblings

Then I would be willing to pit my super-duper mailbox lazy updating attack table based incremental MVV/LVA capture move generator to any bitboard based one :-)

But seriously, it could inspire efforts by those who don't have the time or patience to develop full-blown chess programs.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: help validating perft numbers

Post by bob »

Jan Brouwer wrote:
bob wrote:There you go. One of the advantages of bitboards is the ability to generate captures without having to "skip" over the non-captures. Perft doesn't measure this because there is no captures-only component. Mailbox is pretty efficient when you want to generate _all_ moves as your comparisons show. I should add a perft option to just generate captures and promotions, which would show just how fast bitboards are doing those things compared to a traditional mailbox program.

That's why I have said that comparing perft speeds really is meaningless unless you are comparing two versions of your own program to see which is fastest.
So let's try to define a such perft-like command which does measure move generation / execution speed in a manner which is more relevant to a chess program.

Some characteristics I can think of:
- goal is to measure move generation / execution speed, so no caching or skipping the last ply moves tricks
- it should produce an exactly defined number of nodes so different implementations can be compared just like perft
- a considerable fraction of nodes should consist of the best MVV/LVA move available without siblings

Then I would be willing to pit my super-duper mailbox lazy updating attack table based incremental MVV/LVA capture move generator to any bitboard based one :-)

But seriously, it could inspire efforts by those who don't have the time or patience to develop full-blown chess programs.
Once you get to a "perft" point, you are _close_ to having a real engine. You can code up a basic alpha/beta search in 20 lines of C or so. A basic evaluation in a hundred lines or so. And you are ready to play.

While doing whatever you want with perft is fine, it is not going to excite me when someone reports "My perft is 3x faster than Crafty's." I've not spent any time trying to make it fast... just accurate. So that I can catch bugs. I used it regularly when I did 22.0 and renumbered the bits to match Intel. There were plenty of hard-coded masks and shifts in move generation that got broken. And perft helped me find them one by one.

I'm not sure how to compare speed on anything. So many programs today are outright lying (if that is a term applicable to computers) about their NPS speed, Rybka obviously being one, but I am certain there are many others because it is better to be considered "slow and knowledgeable as opposed to being a 'bean counter'". Most of the claims are nonsense. So if you can't trust NPS numbers, there is not much left to compare.
User avatar
stegemma
Posts: 859
Joined: Mon Aug 10, 2009 10:05 pm
Location: Italy
Full name: Stefano Gemma

Re: help validating perft numbers

Post by stegemma »

[quote="bob]...I'm not sure how to compare speed on anything. So many programs today are outright lying (if that is a term applicable to computers) about their NPS speed, Rybka obviously being one, but I am certain there are many others because it is better to be considered "slow and knowledgeable as opposed to being a 'bean counter'". Most of the claims are nonsense. So if you can't trust NPS numbers, there is not much left to compare.[/quote]

I think that the problem is the diffferent approach to chess programming that anybody have. There are ones that wants to compete by doing the best playing program, who wants to compete with the fastest one, who wants just to explore in a "scientifical" way chess programming... who do all just for fun and are not interested at all in any kind of competition or in any fast/goodness stuff. Last but not least, somebody wants to earn money from theyr chess-programming work. Whe should accept any point of view but still we can provide our point of view and maybe try to change the other ones idea or mind, if we think that it is worst/bad.

Personally i follow any discussion about speed, because i like the "competition on speedness". This could be usefull or useless... who knows? For me is usefull because that's what i like and get fun from that. Rybka is a commercial program and some possible lies are to be accepted... as the ones that a car seller can tell us (if we cannot demonstrate that it is obviously false). What's important, imho, is that nobody should not say some crazyness as "i've done a full statistical research in my university and i can prove that bitboards is the worst way to generate moves", because is like to say "my approach to chess programming is scientific" and then provide a completely false information.

So, perft is a function born to verify correctness of changes in Crafty. In that context it is not important speedness but it is very important to have no bugs at all. That's true and nobody could say the opposite. After that everybody can use that idea to get some different results or just to have some funny things to do in friday nights. :) Counting nodes per second is another funny thing to do... could be done in a scientifical way, providing standard rules on wich nodes to count, just to compare the speedness of different programs... or can be done for fun, as fishers does with length or theyr captured fishes.......
Uri Blass
Posts: 10895
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: help validating perft numbers

Post by Uri Blass »

bob wrote:
hgm wrote:But qperft is not an engine at all. Who wants an engine? It is evaluation that is completely useless. It does not help you at all to count move paths...

As usual you ar arguing from the tunnel vision of your own limited interests.
I am arguing from the perspective of the person that _created_ this idea, and why it is useful. I wanted to check correctness when I started the Crafty project, because the basic move generator and make/unmake changed dramatically month to month. I also wanted a way to compare approaches in terms of speed.

The hashing idea is nonsensical. Why add something to perft that requires _more_ debugging (the hashing stuff) when the primary purpose is to evaluate correctness and compare performance between two versions of some function.

I released this code, and others adopted the idea for testing, debugging and development. Then it turned into a "speed race" for a few. Best speed test is to compare time to depth for two programs, not compare some silly perft times where there is absolutely no incentive to make that efficient since it has little to do wtth playing chess, other than for the speed/correctness issues I mentioned. Comparing perft numbers between two programs is useful if you compare the node counts to detect errors. Comparing the speeds is about as pointless as comparing lines of code, or number of procedures, or whatever.

No tunnel-vision on my end at all. Simply a practical observation. Your comment, on the other hand.....
The target of a fast reliable perft function is to help other people to test if there is a bug in their program faster.

You can simply take epd file of a lot of position and calculate perft(5) in all the position and compare results with another program.


If you want this process to be as fast as possible then you will prefer faster program so speed is clearly relevant.

Uri
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: help validating perft numbers

Post by wgarvin »

Uri Blass wrote:
bob wrote:
hgm wrote:But qperft is not an engine at all. Who wants an engine? It is evaluation that is completely useless. It does not help you at all to count move paths...

As usual you ar arguing from the tunnel vision of your own limited interests.
I am arguing from the perspective of the person that _created_ this idea, and why it is useful. I wanted to check correctness when I started the Crafty project, because the basic move generator and make/unmake changed dramatically month to month. I also wanted a way to compare approaches in terms of speed.

The hashing idea is nonsensical. Why add something to perft that requires _more_ debugging (the hashing stuff) when the primary purpose is to evaluate correctness and compare performance between two versions of some function.

I released this code, and others adopted the idea for testing, debugging and development. Then it turned into a "speed race" for a few. Best speed test is to compare time to depth for two programs, not compare some silly perft times where there is absolutely no incentive to make that efficient since it has little to do wtth playing chess, other than for the speed/correctness issues I mentioned. Comparing perft numbers between two programs is useful if you compare the node counts to detect errors. Comparing the speeds is about as pointless as comparing lines of code, or number of procedures, or whatever.

No tunnel-vision on my end at all. Simply a practical observation. Your comment, on the other hand.....
The target of a fast reliable perft function is to help other people to test if there is a bug in their program faster.

You can simply take epd file of a lot of position and calculate perft(5) in all the position and compare results with another program.


If you want this process to be as fast as possible then you will prefer faster program so speed is clearly relevant.

Uri
Even dumb implementations of perft(5) are fast enough to run on a whole list of positions. perft(8) however, would take a long time to run without hashing.

Making the fastest perft in the west might be a fun goal in its own right, for some people. But for those who want to make a fast chess engine, the usefulness of a perft is in its correctness, not its speed. Adding hashing etc. to your perft may make it faster, but it also might introduce bugs.
Uri Blass
Posts: 10895
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: help validating perft numbers

Post by Uri Blass »

wgarvin wrote:
Uri Blass wrote:
bob wrote:
hgm wrote:But qperft is not an engine at all. Who wants an engine? It is evaluation that is completely useless. It does not help you at all to count move paths...

As usual you ar arguing from the tunnel vision of your own limited interests.
I am arguing from the perspective of the person that _created_ this idea, and why it is useful. I wanted to check correctness when I started the Crafty project, because the basic move generator and make/unmake changed dramatically month to month. I also wanted a way to compare approaches in terms of speed.

The hashing idea is nonsensical. Why add something to perft that requires _more_ debugging (the hashing stuff) when the primary purpose is to evaluate correctness and compare performance between two versions of some function.

I released this code, and others adopted the idea for testing, debugging and development. Then it turned into a "speed race" for a few. Best speed test is to compare time to depth for two programs, not compare some silly perft times where there is absolutely no incentive to make that efficient since it has little to do wtth playing chess, other than for the speed/correctness issues I mentioned. Comparing perft numbers between two programs is useful if you compare the node counts to detect errors. Comparing the speeds is about as pointless as comparing lines of code, or number of procedures, or whatever.

No tunnel-vision on my end at all. Simply a practical observation. Your comment, on the other hand.....
The target of a fast reliable perft function is to help other people to test if there is a bug in their program faster.

You can simply take epd file of a lot of position and calculate perft(5) in all the position and compare results with another program.


If you want this process to be as fast as possible then you will prefer faster program so speed is clearly relevant.

Uri
Even dumb implementations of perft(5) are fast enough to run on a whole list of positions. perft(8) however, would take a long time to run without hashing.

Making the fastest perft in the west might be a fun goal in its own right, for some people. But for those who want to make a fast chess engine, the usefulness of a perft is in its correctness, not its speed. Adding hashing etc. to your perft may make it faster, but it also might introduce bugs.
If the list is long enough then dumb implementation of perft(5) are not fast enough
Movei has perftpgn command that is clearly useful to debug the perft function

If you save a file with the name a.pgn at the same folder as the movei.exe folder you can run movei and type perftpgn a.pgn 5 and movei is going to calculate the sum of perft 5 in all the positions in the pgn file.
Other programmers can compare their result with movei to see if their result is correct

calculating the sum of perft 5 on a big pgn file is not something that dumb implementations of perft are fast enough.
When I worked on movei I used the perftpgn command to test that I did not break something in my move generator in case of doing changes.

I took a big pgn file and calculated the sum of perft 3 for all positions in that file and I knew the number that I need to get to have a correct result
(it took movei 15-20 minutes to calculate the sum of perft 3).

I do not claim my implementation of perft to be the fastest that is possible and together with faster computers and with more time it is possible to have test of the sum of perft 5 on the same pgn file that may take many hours so having faster perft for debugging is clearly useful to detect bugs

I did not work on movei in the last years but
note that I now used movei to calculate sum of perft(5) on 8 ssdf games
The result was 37,965,712,019 and it took many minutes
You can see the 8 ssdf games in the first post of the following thread
http://www.talkchess.com/forum/viewtopic.php?t=30029

Edit:I think that it is better if every serious programmer implements perftpgn command in order to debug his move generator and compare with other programs and considering the fact that the perft of movei is not the fastest it is better to use qperft of H.G.Muller and not movei for comparison purpose but people need to change qperft to calculate sum of perft in pgn files.

Uri
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: help validating perft numbers

Post by bob »

Uri Blass wrote:
bob wrote:
hgm wrote:But qperft is not an engine at all. Who wants an engine? It is evaluation that is completely useless. It does not help you at all to count move paths...

As usual you ar arguing from the tunnel vision of your own limited interests.
I am arguing from the perspective of the person that _created_ this idea, and why it is useful. I wanted to check correctness when I started the Crafty project, because the basic move generator and make/unmake changed dramatically month to month. I also wanted a way to compare approaches in terms of speed.

The hashing idea is nonsensical. Why add something to perft that requires _more_ debugging (the hashing stuff) when the primary purpose is to evaluate correctness and compare performance between two versions of some function.

I released this code, and others adopted the idea for testing, debugging and development. Then it turned into a "speed race" for a few. Best speed test is to compare time to depth for two programs, not compare some silly perft times where there is absolutely no incentive to make that efficient since it has little to do wtth playing chess, other than for the speed/correctness issues I mentioned. Comparing perft numbers between two programs is useful if you compare the node counts to detect errors. Comparing the speeds is about as pointless as comparing lines of code, or number of procedures, or whatever.

No tunnel-vision on my end at all. Simply a practical observation. Your comment, on the other hand.....
The target of a fast reliable perft function is to help other people to test if there is a bug in their program faster.

You can simply take epd file of a lot of position and calculate perft(5) in all the position and compare results with another program.


If you want this process to be as fast as possible then you will prefer faster program so speed is clearly relevant.

Uri
This argument makes no sense. How often does this happen? How many new programs are developed in a single year? How many of those even know about perft? And how many of those need deep perft's? I've developed two major programs, completely different in their architecture, I used perft during the development of both, and back in the 70's I did not find it necessary to have a perft 8, 9 or 10. I haven't done any of those for my own use even in Crafty. 3-4-5 has been more than enough to find any errors I have had, and those run so quickly that optimization is a complete waste of time.

This is clearly a violation of the principle behind Amdahl's law, which guides where we spend our time optimizing a program. Optimizing perft is simply an idea that makes no sense. Once your engine is done, you won't use it again for years. While writing your engine, you will barely use it. Most of the time is not spent on the move generator code. So why optimize something that will represent less than .0001% of your total compute cycles used to develop a program?

Now if you want to take the "anal approach" and simply say "but I just wanted to develop the fastest implementation of perft possible..." then that's up to you or anyone else that wants to use this as a speed comparison. But when a program can win a perft comparison and then get crushed on the chess board, it makes that effort to optimize perft seem somewhat foolishly spent.