Perft(14)

Don · Post by **Don** » Mon Oct 08, 2012 6:39 am

sje wrote:
Don wrote:I don't think I can get depth 7 down to your 3.2 number without doing bulk counting.
Note that "000.00:03:21.540" means "0 days 0 hours 3 minutes 21 seconds 540 milliseconds" or about 202 seconds to do perft(7) bulk mode which gives frequency of about 15.9 MHz.

This is for a single thread on a 2.67 GHz Intel Xeon from six years ago.

What? I thought this was 3 seconds and that I was only twice as slow!

What times do you get on a fast modern machine for perft(7) from the opening position?

I'm doing perft(8) in 100 seconds but I am on an i5, substantially faster I am sure than your 2.67 Xeon from 6 years ago.

sje · Post by **sje** » Mon Oct 08, 2012 7:20 am

Running on that 2.67 GHz quad core Xeon with four threads sharing a big transposition table, perft(7) takes 3.2 seconds. This does not count the set-up time of about 14 seconds to allocate and clear the 12 GB transposition table.

Perft(8) with the above takes 40.7 seconds, again not counting set-up time.

Perft(9) with the above takes 430.3 seconds, again not counting set-up time.

The above are all with Symbolic, not CookieCat.

Don · Post by **Don** » Mon Oct 08, 2012 4:30 pm

sje wrote:Maybe I shouldn't say too much about perft(14), because I once said that I wasn't interested in doing perft(13) and look at what happened.

It is possible that perft(14) has already been calculated; any of a number of government parallel supercomputers could grind out the result in an afternoon or less. And it would be a good test of distributed integer performance, but it would not be made public.

With consumer level hardware, a distributed approach to perft(14) seems best at this time. But it is a non-trivial task. Consider any number of well known distributed computing projects where the end-user application has had dozens of revisions to squeeze out bugs. Perft(n) cannot admit a single off-by-one error; any bug would mean a restart.

For perft(14), I'd consider generating the 96,400,068 ply seven positions (each with a usage count) and deal these to worker clients with each position getting a draft 7 calculation.

Checking my own logic, I get a different number: 95,631,046

Some complete lines terminate before depth 7, since your number is larger I wonder if you are just adding them to the total and that would explain the difference? That would make sense to get a proper count.

Otherwise, I suspect that your number is the correct one. This was based on producing all the fen strings, sorting and then uniq'ing them using unix tools:

sort -m fens* | uniq > ufens

each fen was already sorted.

So I suspect my routine to convert a position to a fen. There is some trickiness there that has to be properly covered - the en-passant target square should be set only if an en-passant is possible. But my code covers that properly. So I will rerun this using the hash keys instead of the fens to see if it comes out differently. The perft numbers up through 9 are always coming out correct.

[/quote]
Averaging three per second would allow the run to complete in about one year, but without any redundancy checking. Tripling the time (or the processors) would support 2-from-3 majority voting for flagging bad intermediate results.[/quote]

Ajedrecista · Post by **Ajedrecista** » Mon Oct 08, 2012 4:44 pm

Hello Don:

Don wrote:
sje wrote:Maybe I shouldn't say too much about perft(14), because I once said that I wasn't interested in doing perft(13) and look at what happened.

It is possible that perft(14) has already been calculated; any of a number of government parallel supercomputers could grind out the result in an afternoon or less. And it would be a good test of distributed integer performance, but it would not be made public.

With consumer level hardware, a distributed approach to perft(14) seems best at this time. But it is a non-trivial task. Consider any number of well known distributed computing projects where the end-user application has had dozens of revisions to squeeze out bugs. Perft(n) cannot admit a single off-by-one error; any bug would mean a restart.

For perft(14), I'd consider generating the 96,400,068 ply seven positions (each with a usage count) and deal these to worker clients with each position getting a draft 7 calculation.

Checking my own logic, I get a different number: 95,631,046

Some complete lines terminate before depth 7, since your number is larger I wonder if you are just adding them to the total and that would explain the difference? That would make sense to get a proper count.

Otherwise, I suspect that your number is the correct one. This was based on producing all the fen strings, sorting and then uniq'ing them using unix tools:

sort -m fens* | uniq > ufens

each fen was already sorted.

So I suspect my routine to convert a position to a fen. There is some trickiness there that has to be properly covered - the en-passant target square should be set only if an en-passant is possible. But my code covers that properly. So I will rerun this using the hash keys instead of the fens to see if it comes out differently. The perft numbers up through 9 are always coming out correct.

Averaging three per second would allow the run to complete in about one year, but without any redundancy checking. Tripling the time (or the processors) would support 2-from-3 majority voting for flagging bad intermediate results.[/quote][/quote]

Indeed Steven's number is the correct one, also verified by JetChess (the third column of the first point) and OEIS A083276 integer sequence:

OEIS wrote:A083276

Number of distinct chess positions after n plies including differences due to availability and possibility of castling and en passant captures.

1, 20, 400, 5362, 72078, 822518, 9417681, 96400068, 988187354, 9183421888, 85375278064

[...]

AUTHOR
Richard Bean (rwb(AT)eskimo.com), Jun 02 2003

EXTENSIONS
a(9) from Paul Byrne (ibid(AT)carolina.rr.com), Jan 26 2004
a(10) from Arkadiusz Wesolowski, Jan 04 2012

STATUS
approved

Thank you very much for your efforts, Don. I hope to see Komodo MP soon in Graham's tournaments and CCRL, CEGT... good luck!

Regards from Spain.

Ajedrecista.

Don · Post by **Don** » Mon Oct 08, 2012 4:53 pm

OEIS wrote:A083276

Number of distinct chess positions after n plies including differences due to availability and possibility of castling and en passant captures.

1, 20, 400, 5362, 72078, 822518, 9417681, 96400068, 988187354, 9183421888, 85375278064

I think this does explain the discrepancy. I was stupidly including only position seen on the 7th ply - not "uniqe lines."

So this is A bug, maybe not the only one.

Don · Post by **Don** » Tue Oct 09, 2012 10:40 pm

I continue to get a very tiny discrepancy and I am quite sure my code is correct - also perf is working to high depths.

My 6 ply and 7 ply numbers are only barely off, but they are off. 6 ply is only off by 2 positions!

So I think they are doing something wrong at this point. Were those values also verified by others as perft is?

The only thing I can imagine that might make the difference is that with some analysis you could "collapse" some en-passant positions into one. It's not really en-passant if en-passant were illegal due to check - and I think that would slightly reduce the count of "unique" positions but that seems like a stretch, I don't think anyone would go to the trouble to make their hash function take that into consideration as it would represent a major slowdown for pitiful small gain.

So I am at a loss here - I am quite sure my code is now correct.

Don wrote:
OEIS wrote:A083276

Number of distinct chess positions after n plies including differences due to availability and possibility of castling and en passant captures.

1, 20, 400, 5362, 72078, 822518, 9417681, 96400068, 988187354, 9183421888, 85375278064

I think this does explain the discrepancy. I was stupidly including only position seen on the 7th ply - not "uniqe lines."

So this is A bug, maybe not the only one.

sje · Post by **sje** » Wed Oct 10, 2012 2:45 am

From A083276:

This differs from A057745 at 6 ply and above because where an en passant capture would be illegal, the position is essentially the same as where an en passant capture is not available. It is two less than A057745 at 6 ply because the positions after 1. f4 e6/e5 2. Kf2 Qf6 3. f5 g5 are considered to be the same as after 1. f4 g5 2. Kf2 e6/e5 3. f5 Qf6.

Don · Post by **Don** » Wed Oct 10, 2012 4:26 am

sje wrote:From A083276:
This differs from A057745 at 6 ply and above because where an en passant capture would be illegal, the position is essentially the same as where an en passant capture is not available. It is two less than A057745 at 6 ply because the positions after 1. f4 e6/e5 2. Kf2 Qf6 3. f5 g5 are considered to be the same as after 1. f4 g5 2. Kf2 e6/e5 3. f5 Qf6.

Incredible! My numbers match perfectly. What is funny is that dismissed this exact idea as being a long-shot.

Thanks,

Don

sje · Post by **sje** » Wed Oct 10, 2012 11:20 am

Some changes I could make to CookieCat:

1) Change the mtperft (multi-threaded perft) command to use transposition assistance for a big speed increase.

2) Change the transposition table to use an additional 64 bits of hash code for safety and a slight slowdown.

3) Add a command line argument cracker to activate a multi-threaded perft operation without a need for a console interface.

4) For a command line perft, have the program output only a single line which contains the ASCII decimal result count.

Any other support needed for a distributed perft client can be done by separate wrapper software.

A manual invocation example:

Code: Select all

./CookieCat -dp "r1bqkbnr/1ppp1ppp/p1n5/4p3/B3P3/5N2/PPPP1PPP/RNBQK2R b KQkq - 1 4" 7
22668836317

Do the above some 90+ million times with different ply 7 FEN strings, apply some multiplication and addition, and presto! you get perft(14).

Don · Post by **Don** » Wed Oct 10, 2012 3:05 pm

sje wrote:Some changes I could make to CookieCat:

1) Change the mtperft (multi-threaded perft) command to use transposition assistance for a big speed increase.

2) Change the transposition table to use an additional 64 bits of hash code for safety and a slight slowdown.

3) Add a command line argument cracker to activate a multi-threaded perft operation without a need for a console interface.

4) For a command line perft, have the program output only a single line which contains the ASCII decimal result count.

Any other support needed for a distributed perft client can be done by separate wrapper software.

A manual invocation example:
Code: Select all
./CookieCat -dp "r1bqkbnr/1ppp1ppp/p1n5/4p3/B3P3/5N2/PPPP1PPP/RNBQK2R b KQkq - 1 4" 7
22668836317
Do the above some 90+ million times with different ply 7 FEN strings, apply some multiplication and addition, and presto! you get perft(14).

Here is what I have in mind - very similar to what you describe:

1. A server that parcels out work from the 9,417,683 unique 6 ply positions.
2. Clients that get this assignment and return the perft(8) answer to the server.

I considered using 7 ply positions instead - but that would require sending out almost 100 million work assignments - and I fear we lose some efficiency because perft(7) goes much faster - perhaps making this more I/O bound. So 6 has the effect of taking pressure off the server and constantly opening and closing connections.

I did some analysis and perft(8) based on the komodo implementation takes about 90 seconds in the opening, but could take a couple of hours in a complex position. I have only sampled a few positions. That seems like a reasonable chunk of work to parcel out to the clients. Also, 9 mlliion positions is easier to work with than 90 million and most hosting companies are stingy about bandwidth and disk space. If I put these positions in a sql database it's going to use a lot of disk space, just a raw text file with the fen's for the 9 million positions is about half a gig - not a lot but 90 million would already be several gig.

So all things considered I believe the 6 ply positions are the easiest to work with and the least resource intensive for the server.

Using the 90 million positions has some advantages too. The clients would be getting much smaller work packets (assignments that can be completed in seconds), and users running the clients could stop their clients without losing work. With this we would not need to consider having a restart mechanism. With the 9 million ply 6 positions we might want to consider giving the client the ability to stop and restart some other time - taking up where it left off. I don't want to do that as it complicates the programming, but it may be enough to "fwrite" the hash table to disk and restart from scratch as a simple restart mechanism.

The mechanism the server will use is to assign one of the N positions at random when a client makes a request for help. We won't track what has been assigned when making assignments, but we will track what has been completed. In other words when a client gets a work assignment we won't assume that it will complete. My experience with the DAT (Distributed Auto Tester) reveals that there are many stops and starts and trying to track what has been assigned under the assumption it will complete is huge amount of work requiring a lot of guesswork (how long to wait, etc.) In the end it's work for almost no gain.

The client will be graphical and run on windows or inux and will report in real time the progress. It will basically show the progress in terms of percentage of work to complete, project completion date based on current load and other statistics including the progress of the current test being run by the client.

I would like the perft engines to be modular, if someone produces some super fast client we should be able to just start using it. I don't want to complicate things, but the perft engine should be reported and tracked by the server - just in case it's determined that one is buggy. What I had planned is sending out each work assignment twice, and I would want to ensure that each one goes to a different perft client.

This is going to require huge CPU resources - I don't know if we can keep people interested for a year or two - however long it takes. If we get enough volunteers it should not take more than a few months - but it remains to be seen ....

Steven, please send me an email - if you are willing I need to pick your brain and get your help since you have already done this. I could have this up and running in a few minutes as I already have a client and server setup from the DAT project but it needs to be done right before it's started. My email is dailey.don@gmail.com - it's in my profile.

Perft(14)

Re: CookieCat reloaded

Re: CookieCat reloaded

Re: Perft(14)

Re: Perft(14): correct number of unique ply seven positions.

Re: Perft(14): correct number of unique ply seven positions.

Re: Perft(14): correct number of unique ply seven positions.

Re: Perft(14): correct number of unique ply seven positions.

Re: Perft(14): correct number of unique ply seven positions.

Re: CookieCat reloaded

Re: CookieCat reloaded