0 of 20 perft(12)
0 of 400 perft(11)
15 of 5,362 perft(10)
293 of 72,078 perft(9)
5,392 of 822,518 perft(8)
Estimated total time: 896 days
That's a long time. So I've put in an order for a new machine with quite a bit more throughput capacity. The specification includes:
Intel Desktop Main Board, Model DP67DE
Intel Core i7-2600 CPU (3.4 GHz, 4 cores, 8 hyperthreads, 8 MB L3)
16 GB RAM (4 x 4 GB DDR3-1333)
I'm getting the i7-2600 CPU instead of the i7-2600K version, as I don't use overclocking because of reliability concerns. And although the board can handle four 8 GB RAM modules, my budget can't, so the transposition table will be limited to 2^30 entries (ca. 12 GB RAM). The machine will have processing power equal to the current high-end iMac build-to-order model at a fraction of the cost. The OS will be Ubuntu Linux 10.04 LTS (64-bit desktop version), although it may be possible to run Mac OS X 10.6 without too much difficulty.
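As a sanity check on the table size: a 2^30-entry table at roughly 12 bytes per entry lands right at the quoted figure. The 12-byte layout is an assumption chosen to match the "ca. 12 GB" number; the post doesn't give the actual entry format.

```python
# Back-of-envelope check of the transposition table footprint.
# ENTRY_BYTES is assumed (e.g. a 64-bit signature plus a 32-bit
# field), picked to reproduce the "ca. 12 GB" figure in the post.
ENTRY_BYTES = 12
entries = 2 ** 30
gib = entries * ENTRY_BYTES / 2 ** 30
print(f"{entries:,} entries x {ENTRY_BYTES} B = {gib:.0f} GiB")
```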
How long will it take the new machine to do perft(13)? My guess is 180 days or so. That's six preventive maintenance sessions of cleaning cat fur from the ventilation screens.
0 of 20 perft(12)
0 of 400 perft(11)
24 of 5,362 perft(10)
469 of 72,078 perft(9)
8,287 of 822,518 perft(8) (Just over 1% of total; figure used for time estimation)
Estimated total time: 1,034 days
The time estimate is growing ominously. At least there have been no power outages so far.
In another ten days or so, the calculation will migrate to the new 3.4 GHz Core i7-2600 16 GB RAM machine, which will use the results from the run on the current 2.0 GHz Core 2 Duo box.
I'm thinking of a small database, a little send/recv service, and a client part to handle the engine.
Each client registers, and there is a viewer to see performance stats.
But maybe that is overdoing it.
Could this be the skeleton framework for a future distributed chess engine, though?
A distributed model can work, and indeed that's how I verified perft(12). But instead of an automated client/server environment, I did assignments and collections manually.
The big drawback of client/server in this context is not its complexity, but rather the loss of efficiency due to missing transpositions that are otherwise present when using a single transposition table. To adapt the current code, it would be necessary to:
1) Ensure that the different clients worked on different parts of the tree, and
2) Ensure that the clients dynamically share lots of transposition results.
This can be done, but I'm lazy and have more interesting tasks to handle.
========
After nearly eleven days, the perft(13) run has spit out its first perft(11) subtotal. For the position
[D]rnbqkbnr/1ppppppp/p7/8/8/N7/PPPPPPPP/R1BQKBNR w KQkq - 0 2
the perft(11) is 1,865,423,942,798,492. One down, 399 to go.
0 of 20 perft(12)
2 of 400 perft(11)
41 of 5,362 perft(10)
797 of 72,078 perft(9)
10,957 of 822,518 perft(8) (ca. 1.33% of total)
Current total time estimate: 983 days
Fortunately, the new machine should arrive next week. With perhaps twice the throughput per core, twice the number of cores, and four times the RAM, the new box just might finish the calculation before the end of this year.
0 of 20 perft(12)
2 of 400 perft(11)
49 of 5,362 perft(10)
963 of 72,078 perft(9)
14,062 of 822,518 perft(8) (ca. 1.71% of total)
Current total time estimate: 1,048 days
I'm working on a new parallel enumeration algorithm which is analogous to PVS/YBW. It should do no worse than the current root split method and may be much better by enhancing inter-thread transposition entry reuse.
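For contrast with the YBW-style idea, the current root split method can be sketched on an abstract game with a uniform branching factor. The move generator here is a stand-in, not the engine's; the point is the task structure, including why per-worker subtrees miss shared transposition entries.

```python
from concurrent.futures import ThreadPoolExecutor

BRANCH = 20  # stand-in move generator: every position has 20 successors

def moves(state):
    return [(state, i) for i in range(BRANCH)]

def perft(state, depth):
    # plain serial enumeration
    if depth == 0:
        return 1
    return sum(perft(m, depth - 1) for m in moves(state))

def perft_root_split(state, depth, workers=4):
    # Root split: one independent task per root move. With a table
    # per worker, entries found in one subtree never help another;
    # that is the reuse a YBW-style internal split tries to recover.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda m: perft(m, depth - 1), moves(state)))
```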
Even with the current algorithm and a mediocre machine, cranking out 49 perft(10) numbers in 18 days is not too bad. Back in 2002 it took 18 days just to generate a single perft(10) for the initial position. That earlier run was made using a single thread on a 1 GHz PowerPC G4 and did not use transposition tables. I don't think it had depth-one bulk counting, either.
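Depth-one bulk counting is the trick of returning the move count directly at the frontier instead of making and unmaking each leaf move. A minimal sketch on an abstract uniform-branching game (the move generator is a stand-in, not real chess):

```python
BRANCH = 20  # stand-in move generator: every position has 20 successors

def moves(state):
    return [(state, i) for i in range(BRANCH)]

def perft_plain(state, depth):
    # visits every leaf: each frontier move is "made" by recursing into it
    if depth == 0:
        return 1
    return sum(perft_plain(m, depth - 1) for m in moves(state))

def perft_bulk(state, depth):
    if depth == 0:
        return 1
    if depth == 1:
        # bulk count: the number of legal moves IS the answer,
        # so the frontier moves are never made/unmade
        return len(moves(state))
    return sum(perft_bulk(m, depth - 1) for m in moves(state))
```

Both give identical totals; the bulk version just does one generator call per pre-frontier node instead of BRANCH recursive visits.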
after 81 hours: 808 days
after 141 hours: 896 days
after 250 hours: 1034 days
after 314 hours: 983 days
after 430 hours: 1048 days
Since I would expect an almost constant NPS, I can only imagine that there are some differences between the expected and the actual number of nodes to visit. What is your formula for calculating the estimated total time?
Sven
By interpolating from the known perft(1) .. perft(12) numbers I would estimate about 1.98 * 10^18 nodes for perft(13). Does this roughly match your expected number?
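For reference, an estimate of this order can be reproduced from the published initial-position totals through depth 12. The sketch below fits a simple linear trend to the effective branching ratios and extends it one ply; the exact interpolation method used above isn't stated, so this is just one plausible reconstruction.

```python
# Published perft totals for the initial position, depths 8..12.
perft = {
    8: 84_998_978_956,
    9: 2_439_530_234_167,
    10: 69_352_859_712_417,
    11: 2_097_651_003_696_806,
    12: 62_854_969_236_701_747,
}
depths = sorted(perft)[1:]                       # 9..12
ratios = [perft[d] / perft[d - 1] for d in depths]

# least-squares line through (depth, ratio), extended to depth 13
n = len(depths)
mx, my = sum(depths) / n, sum(ratios) / n
slope = (sum((d - mx) * (r - my) for d, r in zip(depths, ratios))
         / sum((d - mx) ** 2 for d in depths))
r13 = my + slope * (13 - mx)

estimate = perft[12] * r13
print(f"estimated perft(13) ~ {estimate:.2e}")   # on the order of 2 * 10^18
```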
Sven Schüle wrote:What is your formula to calculate the estimated total time?
There are 822,518 distinct perft(8) calculations required. The time estimate is:
(822,518 / completed(8)) * elapsedtime
At present, completed(8) = 14,771 and elapsedtime = 455 hours.
So, (822,518 / 14,771) * 455 hours -> 25,337 hours -> 1,056 days.
Different perft(8) calculations take different times. Transposition usage also plays a part.
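The estimate above is just elapsed time scaled by the fraction of perft(8) subtrees finished, e.g.:

```python
TOTAL_PERFT8 = 822_518  # distinct perft(8) subtrees in the run

def estimated_total_days(completed8, elapsed_hours):
    # scale elapsed time by the fraction of perft(8) subtrees completed
    return (TOTAL_PERFT8 / completed8) * elapsed_hours / 24

print(round(estimated_total_days(14_771, 455)))  # -> 1056
```

As noted, the estimate drifts because the subtrees are not equally expensive and transposition hits vary over the run.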
Sven Schüle wrote:By interpolating from the known perft(1) .. perft(12) numbers I would estimate about 1.98 * 10^18 nodes for perft(13). Does this roughly match your expected number?