Perft(14)

Discussion of chess software programming and technical issues.

Moderator: Ras

User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Distributed perft

Post by sje »

Don wrote:By the way, just out of curiosity, how does cookie work with user interfaces if it cannot even understand long algebraic? I think both xboard and UCI expect moves in that format, no?
CookieCat uses an interactive console interface. The program doesn't have much code for xboard or UCI, only stubs.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

CookieCat reloaded

Post by sje »

Code: Select all

gail:~ sje$ cd tmp
gail:tmp sje$ ppcx64 -O3 CookieCat.pas
Free Pascal Compiler version 2.6.0 [2011/12/30] for x86_64
Copyright (c) 1993-2011 by Florian Klaempfl and others
Target OS: Darwin for x86_64
Compiling CookieCat.pas
Assembling (pipe) CookieCat.s
Linking CookieCat
ld: warning: -macosx_version_min not specified, assuming 10.7
15904 lines compiled, 1.6 sec 
gail:tmp sje$ ./CookieCat
CookieCat 2012.01.30   Copyright (c) 2012 by S. J. Edwards
CookieCat ready

[] help
Enter a command, or a sequence of one or more SAN chess moves 

  Commands:
    bench      Run the benchmark
    d          Display everything
    dao        Display active options
    db         Display board
    dbbdb      Display bitboard database
    dbcolor    Display board (ANSI color)
    dbmono     Display board (monochrome)
    desc       Display evaluation score components
    dfen       Display FEN
    dm         Display moves
    dmsig      Display material signature
    dobm       Display opening book moves
    dp         Display position
    dpgn       Display PGN
    dpi        Display program identification
    dps        Display pawn structure components
    dtbm       Display tablebase moves
    dtbs       Display tablebase score
    echo       Echo parameters
    efnormal   Normalize from an EPD <input-file> to an EPD <output-file>
    exit       Exit program
    ffmate     Scan FEN <input-file> data, find mate in <number> full moves
    ffnormal   Normalize from a FEN <input-file> to a FEN <output-file>
    ffperft    Summing over a FEN <input-file> data, run perft to <depth>
    g          Search for a move and then play it
    gg         Play both sides until the end of the game
    help       Show help
    loadfen    Load FEN from an <input-file>
    loadpgn    Load PGN from an <input-file>
    mate       Search for a checkmate in <number> full moves
    mtperft    Run multithreaded perft to <depth> with full node visits
    new        New game
    noop       No operation
    optreset   Reset <option> [<option>]*
    optset     Set <option> [<option>]*
    perftbulk  Run perft to <depth> with bulk counting
    perftfull  Run perft to <depth> with each node visited
    perfttran  Run perft to <depth> with transposition help
    pfnormal   Normalize from a PGN <input-file> to a PGN <output-file>
    rg         Generate and display a single random game
    rgdump     Dump to a PGN <output-file> <number> random games
    rgstat     Generate a report for <number> random game(s)
    rm         Retract move
    rmp        Retract move pair
    rmts       Retract move(s) to start
    s          Search for a move but do not play it
    savefen    Save FEN to an <output-file>
    savepgn    Save PGN to an <output-file>
    selftest   Run the self test
    sfen       Set FEN <mpd> <good> <cavs> <epsq> <hmvc> <fmvn>
    slevfd     Set level fixed depth <ply>
    slevfn     Set level fixed nodes <count>
    slevft     Set level fixed time <seconds>
    slevnt     Set level nominal time <seconds>
    slevut     Set level unlimited time
    stag       Set PGN <tagname> to <tagvalue>
    test       Run developer test
    ttreset    Transposition table data reset

  Options:
    adcc       Auto display: chess clock
    adcp       Auto display: chess position
    mono       Monochrome only output
    trcv       Trace: current variation
    trea       Trace: EPD analysis
    trfd       Trace: first time node at depth
    trfv       Trace: final (predicted) variation
    trif       Trace: input FEN
    trit       Trace: iteration start/finish
    trpv       Trace: predicted variation
    trst       Trace: search termination
    trts       Trace: timing statistics
[] perfttran 7
Na3 120142144
Nc3 148527161
Nf3 147678554
Nh3 120669525
a3 106743106
a4 137077337
b3 133233975
b4 134087476
c3 144074944
c4 157756443
d3 227598692
d4 269605599
e3 306138410
e4 309478263
f3 102021008
f4 119614841
g3 135987651
g4 130293018
h3 106678423
h4 138495290
[ts] Count: 3195901860   Time: 000.00:00:39.860   Frequency: 80178170
[] exit

CookieCat done
gail:tmp sje$ 
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: CookieCat reloaded

Post by sje »

Like perfttran, perftbulk is single threaded.

Code: Select all

[] perftbulk 7
Na3 120142144
Nc3 148527161
Nf3 147678554
Nh3 120669525
a3 106743106
a4 137077337
b3 133233975
b4 134087476
c3 144074944
c4 157756443
d3 227598692
d4 269605599
e3 306138410
e4 309478263
f3 102021008
f4 119614841
g3 135987651
g4 130293018
h3 106678423
h4 138495290
[ts] Count: 3195901860   Time: 000.00:03:21.540   Frequency: 15857407
User avatar
Ajedrecista
Posts: 2098
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: (Perft): CookieCat versus JetChess.

Post by Ajedrecista »

Hello Steven:
sje wrote:Like perfttran, perftbulk is single threaded.

Code: Select all

[] perftbulk 7
Na3 120142144
Nc3 148527161
Nf3 147678554
Nh3 120669525
a3 106743106
a4 137077337
b3 133233975
b4 134087476
c3 144074944
c4 157756443
d3 227598692
d4 269605599
e3 306138410
e4 309478263
f3 102021008
f4 119614841
g3 135987651
g4 130293018
h3 106678423
h4 138495290
[ts] Count: 3195901860   Time: 000.00:03:21.540   Frequency: 15857407
Good! Just for comparison, I ran Perft(1) to Perft(10) back in January using JetChess 1.0.0.0 perft counter. These counts were ran in an Intel i5-760 (2.8 GHz); JetChess is single core and 32-bit:

Code: Select all

Hash = 64 MB: 

Perft(1) ---> Time:    0 ms (0:00:00.000). 
Perft(2) ---> Time:    0 ms (0:00:00.000). 
Perft(3) ---> Time:    0 ms (0:00:00.000). 
Perft(4) ---> Time:    1 ms (0:00:00.001). 
Perft(5) ---> Time:   23 ms (0:00:00.023). 
Perft(6) ---> Time:  320 ms (0:00:00.320). 
Perft(7) ---> Time: 3.864 s (0:00:03.864).

Code: Select all

Hash = 256 MB: 

Perft(8) ---> Time: 56.260 s (0:00:56.260).

Code: Select all

Hash = 1 GB: 

 Perft(9) ---> Time:   764.290 s (0:12:44.290). 
Perft(10) ---> Time: 10607.827 s (2:56:47.827).
CookieCat is quite fast regarding perft! Congratulations. I want to see your multi-threaded perft with hash, bulk counting or whatever tricks to speed up the count (please note that I do not have idea about all this technical stuff).

I would like to take part in this distributed project of Perft(14) but I have slow hardware (a dual core Intel Pentium D930 at 3 GHz) that can not be used 24/7 (I need it for other tasks), and very limited access to an Intel i5-760 at 2.8 GHz (both PCs with Windows XP 32-bit), so I can not participate in Perft(14). I wish you good luck with the project.

Regards from Spain.

Ajedrecista.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: CookieCat reloaded

Post by Don »

Can you make a binary that will work on 64 bit Linux as well as Windows?

Don

sje wrote:

Code: Select all

gail:~ sje$ cd tmp
gail:tmp sje$ ppcx64 -O3 CookieCat.pas
Free Pascal Compiler version 2.6.0 [2011/12/30] for x86_64
Copyright (c) 1993-2011 by Florian Klaempfl and others
Target OS: Darwin for x86_64
Compiling CookieCat.pas
Assembling (pipe) CookieCat.s
Linking CookieCat
ld: warning: -macosx_version_min not specified, assuming 10.7
15904 lines compiled, 1.6 sec 
gail:tmp sje$ ./CookieCat
CookieCat 2012.01.30   Copyright (c) 2012 by S. J. Edwards
CookieCat ready

[] help
Enter a command, or a sequence of one or more SAN chess moves 

  Commands:
    bench      Run the benchmark
    d          Display everything
    dao        Display active options
    db         Display board
    dbbdb      Display bitboard database
    dbcolor    Display board (ANSI color)
    dbmono     Display board (monochrome)
    desc       Display evaluation score components
    dfen       Display FEN
    dm         Display moves
    dmsig      Display material signature
    dobm       Display opening book moves
    dp         Display position
    dpgn       Display PGN
    dpi        Display program identification
    dps        Display pawn structure components
    dtbm       Display tablebase moves
    dtbs       Display tablebase score
    echo       Echo parameters
    efnormal   Normalize from an EPD <input-file> to an EPD <output-file>
    exit       Exit program
    ffmate     Scan FEN <input-file> data, find mate in <number> full moves
    ffnormal   Normalize from a FEN <input-file> to a FEN <output-file>
    ffperft    Summing over a FEN <input-file> data, run perft to <depth>
    g          Search for a move and then play it
    gg         Play both sides until the end of the game
    help       Show help
    loadfen    Load FEN from an <input-file>
    loadpgn    Load PGN from an <input-file>
    mate       Search for a checkmate in <number> full moves
    mtperft    Run multithreaded perft to <depth> with full node visits
    new        New game
    noop       No operation
    optreset   Reset <option> [<option>]*
    optset     Set <option> [<option>]*
    perftbulk  Run perft to <depth> with bulk counting
    perftfull  Run perft to <depth> with each node visited
    perfttran  Run perft to <depth> with transposition help
    pfnormal   Normalize from a PGN <input-file> to a PGN <output-file>
    rg         Generate and display a single random game
    rgdump     Dump to a PGN <output-file> <number> random games
    rgstat     Generate a report for <number> random game(s)
    rm         Retract move
    rmp        Retract move pair
    rmts       Retract move(s) to start
    s          Search for a move but do not play it
    savefen    Save FEN to an <output-file>
    savepgn    Save PGN to an <output-file>
    selftest   Run the self test
    sfen       Set FEN <mpd> <good> <cavs> <epsq> <hmvc> <fmvn>
    slevfd     Set level fixed depth <ply>
    slevfn     Set level fixed nodes <count>
    slevft     Set level fixed time <seconds>
    slevnt     Set level nominal time <seconds>
    slevut     Set level unlimited time
    stag       Set PGN <tagname> to <tagvalue>
    test       Run developer test
    ttreset    Transposition table data reset

  Options:
    adcc       Auto display: chess clock
    adcp       Auto display: chess position
    mono       Monochrome only output
    trcv       Trace: current variation
    trea       Trace: EPD analysis
    trfd       Trace: first time node at depth
    trfv       Trace: final (predicted) variation
    trif       Trace: input FEN
    trit       Trace: iteration start/finish
    trpv       Trace: predicted variation
    trst       Trace: search termination
    trts       Trace: timing statistics
[] perfttran 7
Na3 120142144
Nc3 148527161
Nf3 147678554
Nh3 120669525
a3 106743106
a4 137077337
b3 133233975
b4 134087476
c3 144074944
c4 157756443
d3 227598692
d4 269605599
e3 306138410
e4 309478263
f3 102021008
f4 119614841
g3 135987651
g4 130293018
h3 106678423
h4 138495290
[ts] Count: 3195901860   Time: 000.00:00:39.860   Frequency: 80178170
[] exit

CookieCat done
gail:tmp sje$ 
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Distributed perft

Post by Don »

sje wrote:
Don wrote:By the way, just out of curiosity, how does cookie work with user interfaces if it cannot even understand long algebraic? I think both xboard and UCI expect moves in that format, no?
CookieCat uses an interactive console interface. The program doesn't have much code for xboard or UCI, only stubs.
Steven,

Komodo already has a perft calculation in it, so I'm building a binary for both platforms to make it a standalone perft calculator and I'm removing all the cruft from the chess programs. I changed the key type to __uint128_t also to avoid hash table collisions.

I also added memoization to speed up the perft calculation but I'm not trying to do anything particularly sophisticated. Also, the move generator was not designed for perft and neither was the move make routine so I'm positive this is far from optimal. I'm pretty sure I'm losing a lot at the leaf nodes because I have to generate and then test each move for legality. In the chess program there is no point in making this optimization because the first move is often a cutoff.

So we can use this as a reference point - I will set this up so that if you follow the interface you can use any perft you choose, but it will have to follow a simple protocol which I will publish shortly.

Bottom line is that my perft will probably be pretty slow - but it will get us started.

What is a good time for a 7 and 8 ply perft? I'm on an i5 modern desktop at 3.00 GHz and here is what I'm getting. It ok if it's way out of line with the better perft generators, as we can plug and play. The time is in seconds of course:

Code: Select all

....................  1         0.0  20  
....................  2         0.0  400  
....................  3         0.0  8902  
....................  4         0.0  197281  
....................  5         0.2  4865609  
....................  6         2.7  119060324  
....................  7        48.8  3195901860  
....................  8      1257.2  84998978956  

Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Distributed perft

Post by Don »

Sorry, I just noticed that you published perft numbers. Your perft is obviously very fast so we should try to use that if we can.

Are you using any special hash table magic? I'm simply overwriting any entry in the table if the depth is less than the depth of the new entry that I wish to place in the table.

Don

Don wrote:
sje wrote:
Don wrote:By the way, just out of curiosity, how does cookie work with user interfaces if it cannot even understand long algebraic? I think both xboard and UCI expect moves in that format, no?
CookieCat uses an interactive console interface. The program doesn't have much code for xboard or UCI, only stubs.
Steven,

Komodo already has a perft calculation in it, so I'm building a binary for both platforms to make it a standalone perft calculator and I'm removing all the cruft from the chess programs. I changed the key type to __uint128_t also to avoid hash table collisions.

I also added memoization to speed up the perft calculation but I'm not trying to do anything particularly sophisticated. Also, the move generator was not designed for perft and neither was the move make routine so I'm positive this is far from optimal. I'm pretty sure I'm losing a lot at the leaf nodes because I have to generate and then test each move for legality. In the chess program there is no point in making this optimization because the first move is often a cutoff.

So we can use this as a reference point - I will set this up so that if you follow the interface you can use any perft you choose, but it will have to follow a simple protocol which I will publish shortly.

Bottom line is that my perft will probably be pretty slow - but it will get us started.

What is a good time for a 7 and 8 ply perft? I'm on an i5 modern desktop at 3.00 GHz and here is what I'm getting. It ok if it's way out of line with the better perft generators, as we can plug and play. The time is in seconds of course:

Code: Select all

....................  1         0.0  20  
....................  2         0.0  400  
....................  3         0.0  8902  
....................  4         0.0  197281  
....................  5         0.2  4865609  
....................  6         2.7  119060324  
....................  7        48.8  3195901860  
....................  8      1257.2  84998978956  

Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: (Perft): CookieCat versus JetChess.

Post by Don »

Ajedrecista wrote:Hello Steven:
sje wrote:Like perfttran, perftbulk is single threaded.

Code: Select all

[] perftbulk 7
Na3 120142144
Nc3 148527161
Nf3 147678554
Nh3 120669525
a3 106743106
a4 137077337
b3 133233975
b4 134087476
c3 144074944
c4 157756443
d3 227598692
d4 269605599
e3 306138410
e4 309478263
f3 102021008
f4 119614841
g3 135987651
g4 130293018
h3 106678423
h4 138495290
[ts] Count: 3195901860   Time: 000.00:03:21.540   Frequency: 15857407
Good! Just for comparison, I ran Perft(1) to Perft(10) back in January using JetChess 1.0.0.0 perft counter. These counts were ran in an Intel i5-760 (2.8 GHz); JetChess is single core and 32-bit:

Code: Select all

Hash = 64 MB: 

Perft(1) ---> Time:    0 ms (0:00:00.000). 
Perft(2) ---> Time:    0 ms (0:00:00.000). 
Perft(3) ---> Time:    0 ms (0:00:00.000). 
Perft(4) ---> Time:    1 ms (0:00:00.001). 
Perft(5) ---> Time:   23 ms (0:00:00.023). 
Perft(6) ---> Time:  320 ms (0:00:00.320). 
Perft(7) ---> Time: 3.864 s (0:00:03.864).

Code: Select all

Hash = 256 MB: 

Perft(8) ---> Time: 56.260 s (0:00:56.260).

Code: Select all

Hash = 1 GB: 

 Perft(9) ---> Time:   764.290 s (0:12:44.290). 
Perft(10) ---> Time: 10607.827 s (2:56:47.827).
CookieCat is quite fast regarding perft! Congratulations. I want to see your multi-threaded perft with hash, bulk counting or whatever tricks to speed up the count (please note that I do not have idea about all this technical stuff).

I would like to take part in this distributed project of Perft(14) but I have slow hardware (a dual core Intel Pentium D930 at 3 GHz) that can not be used 24/7 (I need it for other tasks), and very limited access to an Intel i5-760 at 2.8 GHz (both PCs with Windows XP 32-bit), so I can not participate in Perft(14). I wish you good luck with the project.

Regards from Spain.

Ajedrecista.
I found a lot of performance bugs in my version and I made some general improvements. It is now within a factor of about 2.5 of Stevens and slightly slower than 2x yours. But this is without bulk counting. I think with bulk counting I could make this a lot faster:

.................... 1 0.0 20
.................... 2 0.0 400
.................... 3 0.0 8902
.................... 4 0.0 197281
.................... 5 0.0 4865609
.................... 6 0.6 119060324
.................... 7 8.2 3195901860
.................... 8 174.4 84998978956
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: CookieCat reloaded

Post by Don »

sje wrote:Like perfttran, perftbulk is single threaded.

Code: Select all

[] perftbulk 7
Na3 120142144
Nc3 148527161
Nf3 147678554
Nh3 120669525
a3 106743106
a4 137077337
b3 133233975
b4 134087476
c3 144074944
c4 157756443
d3 227598692
d4 269605599
e3 306138410
e4 309478263
f3 102021008
f4 119614841
g3 135987651
g4 130293018
h3 106678423
h4 138495290
[ts] Count: 3195901860   Time: 000.00:03:21.540   Frequency: 15857407
Ok, I have a version that does not do bulk counting but it's pretty fast considering that fact.

Code: Select all

....................  1         0.0  20  
....................  2         0.0  400  
....................  3         0.0  8902  
....................  4         0.0  197281  
....................  5         0.0  4865609  
....................  6         0.5  119060324  
....................  7         6.5  3195901860  
....................  8       100.6  84998978956  
I don't think I can get depth 7 down to your 3.2 number without doing bulk counting. Actually, the 6.5 is really 6 because I am accumulating the time.
So I am only a factor of about 1.88 slower. I think I could more than double the speed if I generated only the legal moves and didn't have to test them but I can tackle that later.

Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: CookieCat reloaded

Post by sje »

Don wrote:I don't think I can get depth 7 down to your 3.2 number without doing bulk counting.
Note that "000.00:03:21.540" means "0 days 0 hours 3 minutes 21 seconds 540 milliseconds" or about 202 seconds to do perft(7) bulk mode which gives frequency of about 15.9 MHz.

This is for a single thread on a 2.67 GHz Intel Xeon from six years ago.