Modular opening book SF analysed 87417 pos., beta-1


Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Modular opening book SF analysed 87417 pos., beta-1

Post by Frank Quisinsky »

Hi Ferdinand,

For the Komodo analysis I am using the "epd" file with the number of move transpositions.


Stockfish directory.

The number of positions differs from engine to engine; clearly the database ... the PGN file of games I have ... is smaller. There is no number of games here in beta-1_dublicates2.epd.

Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Modular opening book SF analysed 87417 pos., beta-1

Post by Ferdy »

Frank Quisinsky wrote:Ferdinand,


In the Stockfish analysis directory you can find the epd with all analysis.
It is included in the big download file.

Ahh ok, I was just checking whether I might have missed something.

I just want to test the real data before releasing the tool.

The tool is something like this.
You have files:

ref.epd, this is the epd file with epd lines having ce values.
src.pgn, this is your big pgn file.


The tool will ask the user for a max score and a min score. Games whose positions score inside that range go to good.pgn; all others are saved in bad.pgn.

The tests work so far.

If you like, I can make this tool remove duplicates in good.pgn. So it does not matter if your src.pgn has duplicates or not.
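
The score check described above can be sketched in Python. This is only an illustration, not the tool itself: the function names `parse_epd` and `in_window` are made up here, and the sketch assumes that no opcode value contains a semicolon. The sample line is one of the Komodo lines posted in this thread.

```python
def parse_epd(line):
    """Split an EPD line into its 4-field position key and a dict of opcodes."""
    fields = line.split()
    position = " ".join(fields[:4])      # board, side to move, castling, en passant
    operations = " ".join(fields[4:])
    opcodes = {}
    for op in operations.split(";"):
        op = op.strip()
        if not op:
            continue                     # tolerate empty operations such as "; ;"
        name, _, value = op.partition(" ")
        opcodes[name] = value.strip()
    return position, opcodes


def in_window(opcodes, min_cp, max_cp):
    """True only if the line carries a ce value inside [min_cp, max_cp]."""
    if "ce" not in opcodes:
        return False                     # lines without ce never qualify
    return min_cp <= int(opcodes["ce"]) <= max_cp


line = ("r1b1k2r/1pq1bppp/p2ppn2/n7/3NPP2/1BN1BQ2/PPP3PP/R3K2R w KQkq - "
        "c0 4; ; ce 63; acd 27; acs 30; acn 277790300; pv g4 Sc4 O-O-O ;")
position, opcodes = parse_epd(line)
print(position)
print(opcodes["ce"], in_window(opcodes, -30, 30))
```

With a window of -30 to 30 this line (ce 63) would be rejected; finding the matching game in src.pgn is a separate step that is not shown here.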
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Hm but ... Houston we've had a Problem!

Post by Frank Quisinsky »

Hi Ferdinand,

Great what you are doing!
But Houston (means Ferdinand) we've had a problem.

Your tool will work fine if I use the big PGN / EPD ... with move transpositions ... for each engine analysis.

But I do that only for the first engine, Stockfish!
For all other engines ... which come after ... I am using the EPD file without move transpositions.

So I have only
9 engines x around 26,000 positions = 234,000 positions / 2,880 per day = 81.25 days.

And not ...
9 engines x around 80,000 positions = 720,000 positions / 2,880 per day =
250 days ... too expensive and not necessary!
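
The same estimate as a small Python calculation. The 2,880 positions per day is read here as 30 seconds per position around the clock (86,400 / 30), which would match the "acs 30" in the analysed EPD lines; that reading is an assumption, and the function name is made up for this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_POSITION = 30    # assumption: one position every 30 seconds, non-stop
POSITIONS_PER_DAY = SECONDS_PER_DAY // SECONDS_PER_POSITION   # 2880


def analysis_days(engines, positions_per_engine):
    """Days needed to analyse every position once with every engine."""
    return engines * positions_per_engine / POSITIONS_PER_DAY


print(analysis_days(9, 26_000))   # without move transpositions
print(analysis_days(9, 80_000))   # with move transpositions
```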

In other words ...
Once Komodo has finished its analysis, your tool can reject only 1 of 4 games (if there are 4 move transpositions) from the current PGN database of 82,704 games left after the Stockfish analysis.

Have a look here ...
This is what Komodo found after the first 5,181 of 26,629 positions.
For the first line you can see ... 6 move transpositions ... for the second ... 4 move transpositions ...

Code:

r1bq1rk1/ppp1n1bp/3p1n2/2PPp1p1/4Pp2/2NN1P2/PP1BB1PP/R2Q1RK1 w - - c0 6; ; ce 83; acd 27; acs 30; acn 255512219; pv Tc1 c6 cxd6 Dxd6 dxc6 Sxc6 Sb5 Dd8 Db3+ Kh8 Lb4 Sxb4 Dxb4 Se8 Tfd1 Db6+ Dc5 Le6 Sc3 Tg8 b3 Lf7 Sd5 Lxd5 exd5 Sd6 Sf2 ;
r1b1k2r/1pq1bppp/p2ppn2/n7/3NPP2/1BN1BQ2/PPP3PP/R3K2R w KQkq - c0 4; ; ce 63; acd 27; acs 30; acn 277790300; pv g4 Sc4 O-O-O ;
r1bq1rk1/ppp2pbp/5np1/3P4/2QpP3/2N5/PP2BPPP/R1B1K2R w KQ - c0 7; ; ce 58; acd 28; acs 30; acn 278220074; pv Dxd4 c6 Dc4 cxd5 exd5 Te8 O-O a6 Db3 b5 Lf4 Lf5 Tfe1 Se4 Sxe4 Txe4 Le3 Dd6 a3 De5 Ld3 Th4 h3 Dxb2 Tab1 Dc3 Lxf5 ;
r1b1k2r/2q1bp1p/p2ppp2/1pn3P1/3NP3/2N2Q2/PPP4P/2KR1B1R w kq - c0 4; ; ce 57; acd 27; acs 30; acn 309426921; pv gxf6 Lf8 ;
r1b1r1k1/pp1nqpbp/2pp1np1/8/2PNP3/2N1B3/PPQ1BPPP/3R1RK1 w - - c0 10; ; ce 55; acd 27; acs 30; acn 258536111; pv f3 a5 Dd2 Sc5 Sb3 a4 Sxc5 dxc5 Lg5 Df8 Lh4 a3 b3 b6 Dc1 Sh5 Sa4 Tb8 g4 Lh6 Lg5 Lxg5 Dxg5 f6 Dd2 Sg7 Dd6 ;
r1b1k2r/pp2ppbp/6p1/q1pPn3/4P3/2P1B3/P2N1PPP/R2QKB1R w KQkq - c0 5; ; ce 55; acd 29; acs 30; acn 271897264; pv Tc1 Ld7 Sb3 ;
r1bq1rk1/pp3pbp/2np1np1/2p1p3/2P2B2/P1NP1NP1/1P2PPBP/1R1Q1RK1 w - - c0 5; ; ce 55; acd 30; acs 30; acn 286586107; pv Lg5 h6 Lxf6 Lxf6 b4 cxb4 axb4 Le6 Se1 Dd7 Sc2 Se7 b5 Tfb8 Se3 Lg7 Sed5 Sxd5 Sxd5 Dd8 e3 h5 h4 Lg4 Dd2 Le6 Ta1 ;
r1b2rk1/ppp1b1pp/n2pp3/1N3p1q/2PPnB2/5NP1/PPQ1PPBP/R3R1K1 w - - c0 6; ; ce 54; acd 27; acs 30; acn 253561531; pv Sc3 g5 Le3 ;
r1bq1rk1/p3bppp/1pn1p3/2p5/3PP3/2P2NP1/P4PBP/R1BQ1RK1 w - - c0 10; ; ce 53; acd 28; acs 30; acn 261060937; pv d5 Sa5 Dc2 ;
r2q1rk1/pp1bppbp/1n1p2p1/n7/3NP3/1BNQBP2/PPP3PP/2KR3R w - - c0 5; ; ce 53; acd 27; acs 30; acn 326095818; pv Kb1 Tc8 h4 ;
r1bq1rk1/1p2ppbp/5np1/p2p4/Pn2P3/1NN1BP2/1PPQB1PP/R3K2R w KQ - c0 4; ; ce 53; acd 26; acs 30; acn 291096275; pv O-O-O Dc7 exd5 ;
r2qkbnr/pp3ppp/4p3/3pPb2/3n4/2P5/PP2BPPP/RNBQ1RK1 w kq - c0 6; ; ce 52; acd 29; acs 30; acn 261686410; pv cxd4 Dd7 ;
r1bqk2r/pp2bppp/2n5/3p4/3p4/5NP1/PP2PPBP/R1BQ1RK1 w kq - c0 4; ; ce 52; acd 29; acs 30; acn 270206116; pv Sxd4 O-O Le3 ;
r1bqk2r/pp2bppp/2p2nn1/3p2B1/3P4/2NBPN2/PPQ2PPP/R3K2R w KQkq - c0 4; ; ce 52; acd 28; acs 30; acn 279102165; pv h4 Sf8 ;
1rb2rk1/p3ppbp/3p1np1/q7/2P5/2N3P1/PP2PPBP/R1BQ1RK1 w - - c0 20; ; ce 51; acd 29; acs 30; acn 280340232; pv Dd2 Lb7 Lxb7 ;
rn1qk2r/pp2ppbp/2p2np1/8/3Pp3/2N1BB1P/PPP2PP1/R2QK2R w KQkq - c0 13; ; ce 51; acd 28; acs 30; acn 252274150; pv Sxe4 Sxe4 Lxe4 Sd7 c3 Sf6 Lf3 Sd5 Ld2 e6 O-O O-O Db3 Tb8 Dc2 h5 Tfe1 b5 Tad1 Dd6 Lg5 Lf6 Lxf6 Sxf6 a4 bxa4 Ta1 ;
r1bqnrk1/pp3pb1/2pp2p1/4n2p/2PNP3/1PN3P1/P1QB1PBP/R4RK1 w - - c0 5; ; ce 51; acd 27; acs 30; acn 258144398; pv Sde2 h4 gxh4 Dxh4 f4 Sd7 Tad1 Tb8 h3 b6 Le3 Tb7 Dd2 Tc7 Ld4 Lf6 Lxf6 Sdxf6 f5 Sd7 Sd4 Se5 Sf3 Df6 Sxe5 dxe5 De3 ;
rnbq1rk1/3nppbp/p5p1/1pp1P3/3P4/1QN1BN2/PP3PPP/R3KB1R w KQ - c0 5; ; ce -31; acd 27; acs 30; acn 290414706; pv e6 cxd4 ;
rnb2rk1/pp2ppbp/6p1/q1P1P2n/2p2B2/2N2N2/PP3PPP/2RQKB1R w K - c0 7; ; ce -32; acd 29; acs 30; acn 290346022; pv Le3 Td8 Da4 Dxa4 Sxa4 Sc6 Lxc4 Sxe5 Sxe5 Lxe5 O-O Ld7 Sc3 Lc6 Tcd1 e6 Tfe1 Tac8 Txd8+ Txd8 Sb5 Lxb5 Lxb5 Sf6 c6 bxc6 Lxc6 ;
rn1q1rk1/p1pbppbp/5np1/1p6/2QPP3/2N2N2/PP3PPP/R1B1KB1R w KQ - c0 11; ; ce -34; acd 28; acs 30; acn 270904194; pv Dd3 b4 ;
r1bq1rk1/p3bppp/2pp1n2/4p3/4PP2/1BN1B3/PPP3PP/R2QK2R w KQ - c0 7; ; ce -34; acd 29; acs 30; acn 275241232; pv Df3 Sg4 Ld2 exf4 Lxf4 Lg5 O-O Lxf4 Dxf4 Se5 Tf2 Tb8 h3 h6 Dd2 Dc7 Sd1 a5 Se3 a4 Lxa4 Txb2 Dc3 Tb8 Lb3 Le6 Td1 ;
r1bqk2r/pp1pnpbp/2n1p1p1/2p5/2B1PP2/2N2N2/PPPP2PP/R1BQ1RK1 w kq - c0 5; ; ce -34; acd 29; acs 30; acn 251303075; pv Lb3 O-O d3 d5 e5 Ld7 Se2 Sa5 c3 Sxb3 axb3 Dc7 d4 b6 Le3 a5 c4 dxc4 bxc4 cxd4 Lxd4 Dxc4 Lxb6 Sd5 Tc1 Da6 Ld4 ;
1rbq1rk1/1p2ppb1/p1np1np1/2p3Bp/2P1P2P/2NP2P1/PP2NPB1/R2Q1RK1 w - - c0 4; ; ce -35; acd 27; acs 30; acn 299786566; pv Dd2 b5 ;
rnbq1rk1/ppp2pbp/5np1/3p4/1PPP4/5N2/PB2BPPP/RN1QK2R w KQ - c0 5; ; ce -36; acd 28; acs 30; acn 256083503; pv O-O dxc4 Lxc4 Sbd7 Lb3 Sb6 Te1 Sbd5 a3 Le6 Sc3 Sxc3 Lxc3 Lxb3 Dxb3 Dd5 Dc2 Tfe8 Se5 Tad8 Te3 Sd7 Tae1 Sf8 Sf3 Se6 T3e2 ;
r4rk1/2pqbppp/p1np1nb1/1p1Bp3/3PP1P1/2P2N1P/PP3P2/RNBQR1K1 w - - c0 7; ; ce -42; acd 27; acs 30; acn 293273771; pv dxe5 Sxd5 ;
rnbqkb1r/1p3ppp/p4n2/1N1pp1B1/2P5/2N5/PP2PPPP/R2QKB1R w KQkq - c0 5; ; ce -67; acd 32; acs 30; acn 354968335; pv Da4 Ld7 cxd5 ;
Forget it ...
I understand ...
ref.epd ... the EPD from all positions included in big pgn file.

So you are working with ... ref.epd and epd file with ce values.

Great ...
Now I understand ...
So I can see no problem here ... it should work!

Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Hm but ... Houston we've had a Problem!

Post by Ferdy »

Frank Quisinsky wrote: Forget it ...
I understand ...
ref.epd ... the EPD from all positions included in big pgn file.
This is not necessary. The ref.epd may contain only 100 positions (epd lines) while the big pgn file contains more games, or the ref.epd may contain more positions than there are games in the big pgn file. The tool looks in ref.epd for positions whose score (ce value) falls inside the range specified by the user. For each one found, I let pgn-extract find that game and output it to good.pgn. All games that are not in good.pgn are saved in bad.pgn.
Frank Quisinsky wrote: So you are working with ... ref.epd and epd file with ce values.
I am interested in the epd file with ce values because of the score. If an epd line in the ref.epd file has no ce value, it is simply ignored, and if that epd without a ce value has a game in the big pgn file, that game is saved in bad.pgn. The src.pgn (the big pgn file) stays as it is; the new files good.pgn and bad.pgn are the outputs.
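
The good/bad split described here can be sketched in Python as follows. The sketch assumes each game has already been reduced to an id and a position key; in the tool that matching is done by pgn-extract, which is not modelled. The function name `split_games` and the placeholder keys such as "posA" are made up for illustration.

```python
def split_games(games, ref_scores, min_cp, max_cp):
    """Split games into good and bad.

    games      -- iterable of (game_id, position_key) pairs
    ref_scores -- dict: position_key -> ce in centipawns, or None if the
                  epd line has no ce value
    """
    good, bad = [], []
    for game_id, position in games:
        ce = ref_scores.get(position)    # None if missing or without ce
        if ce is not None and min_cp <= ce <= max_cp:
            good.append(game_id)
        else:
            bad.append(game_id)          # everything not in good goes to bad
    return good, bad


# ref.epd may hold positions without a game (posZ) and lines without ce (posC);
# the pgn may hold games whose position is not in ref.epd at all (posD).
ref_scores = {"posA": 25, "posB": 83, "posC": None, "posZ": 5}
games = [("g1", "posA"), ("g2", "posB"), ("g3", "posC"),
         ("g4", "posD"), ("g5", "posA")]
good, bad = split_games(games, ref_scores, -30, 30)
print(good, bad)
```

Note that g1 and g5 share a position, so both land in good; removing such duplicates would be a separate step.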
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: New idea ...

Post by Frank Quisinsky »

OK ...

I have the big *.pgn file alpha.pgn with all 87,714 games and, to go with this alpha.pgn, the big *.epd file with the same 87,714 positions.

If I understand correctly ... all I need is the file with "ce" after analysis, 10 times (10 engines will analyse the positions).

Later ...
If I want to create a database with 0.30 / -0.20, for example ... I can run your tool 10 times ...

Always with alpha.pgn and alpha.epd and the *.epd of each of the ten engines.

So what we need ... if your tool is ready ... is a table!

Stockfish: 0.50 / -0.30
Komodo: 0.50 / -0.30
Houdini: 0.60 / -0.35
Fizbo: 1.00 / -0.50
10. ...

This means the evals are quite different!
So users have a table for selecting the games.

If 0.50 is too high for a user ...
he can enter 0.10 or 0.20 less for all of the 10!
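
Such a table could be kept as a small Python dictionary in centipawns. The four windows below are the ones listed above; reading "0.10 less" as moving both bounds 10 cp closer to zero is an assumption, and the names `WINDOWS` and `tighten` are made up for this sketch.

```python
# engine -> (max_cp, min_cp), taken from the table above
WINDOWS = {
    "Stockfish": (50, -30),
    "Komodo": (50, -30),
    "Houdini": (60, -35),
    "Fizbo": (100, -50),
}


def tighten(windows, step_cp):
    """Move both bounds step_cp closer to zero, never crossing zero."""
    return {
        engine: (max(max_cp - step_cp, 0), min(min_cp + step_cp, 0))
        for engine, (max_cp, min_cp) in windows.items()
    }


tighter = tighten(WINDOWS, 10)
for engine, (max_cp, min_cp) in tighter.items():
    print(f"{engine}: {max_cp / 100:.2f} / {min_cp / 100:.2f}")
```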

When searching positions for a test-set ...

Your tool could offer an extra feature!

Reject: 0.00 too!
The probability of a 3-fold repetition will fall.
And the database of positions has a very high contempt!!

But this one is more interesting if we select positions for a test-set ... maybe in combination with the tool PGN selection from Volker Annuss, available in my download area too.

What do you think?
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Genial ... n.t.

Post by Frank Quisinsky »

Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: New idea 2 & 3 ... from Stefan P. ... contempt!

Post by Frank Quisinsky »


I think the idea of rejecting 0.00 as an extra option will be fine.

A second idea as extra Option for your new tool ...
if you like to do that later:

I have to explain:
Idea by Stefan Pohl ... a short time ago Stefan generated a database for testing engines with queens on board ... for more tactical possibilities. I think the idea is good for testing engines.

And very good for creating test-sets!!
I think the idea is good because the probability of 3-fold repetition falls too. And engine-engine matches are nicer for users to watch.

Furthermore ... in theory, from positions with many pieces on board and a balanced starting eval, computer chess programs can find better new theoretical possibilities. Games we use later for new opening books have a very high quality ... if the basis is 3 moves after the ECO code is formed.

So the tool can reject all positions without Queens / Queen on board too!

And in addition ...
In combination with the idea from Stefan Pohl ...

Reject all positions with fewer than x pieces on board. In many openings we have end positions with fewer pieces on board ... often very boring for eng-eng matches!

Same here ... the probability of 3-fold is higher for eng-eng matches with fewer pieces on board.

Idea 1: 0.00 evals to reject
Idea 2: reject positions without queen on board
Idea 3: reject positions with too many exchanged pieces.

And the contempt of the position database will be higher and higher!

None of that is important for my event ... creating strong opening books for engine testing. But for creating test-sets I think the ideas are good.

Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: New idea ...

Post by Ferdy »

Frank Quisinsky wrote:OK ...

I have the big *.pgn file alpha.pgn with all 87,714 games and, to go with this alpha.pgn, the big *.epd file with the same 87,714 positions.

If I understand correctly ... all I need is the file with "ce" after analysis, 10 times (10 engines will analyse the positions).

Later ...
If I want to create a database with 0.30 / -0.20, for example ... I can run your tool 10 times ...

Always with alpha.pgn and alpha.epd and the *.epd of each of the ten engines.

So what we need ... if your tool is ready ... is a table!

Stockfish: 0.50 / -0.30
Komodo: 0.50 / -0.30
Houdini: 0.60 / -0.35
Fizbo: 1.00 / -0.50
10. ...
A user can extract games by analyzing engine if your epd carries the name of the analyzing engine in an Ae opcode, for example:

Code:

r2qkb1r/ppp1pppp/5n2/8/3P2b1/5N2/PPP2PPP/R1BQKB1R w KQkq - id "001"; ce 42; acd 28; acs 30; acn 298161609; pv h3 Lxf3; Ae "Stockfish 8";
I will support this Ae opcode in the tool by asking which engine has evaluated the position.

Code:

min score in cp? -30
max score in cp? 30
analyzing engine? Stockfish 8
If the user wants any engine then just enter 0

Code:

analyzing engine? 0
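
A minimal Python sketch of how such an engine filter might behave, assuming the opcodes of an epd line are already available as a dict of strings with the quotes of the Ae value still attached. The function name `engine_matches` is made up; "0" means any engine, as in the prompt above.

```python
def engine_matches(opcodes, wanted):
    """True if the line was analysed by the wanted engine ("0" = any engine)."""
    if wanted == "0":
        return True
    # Ae values are quoted in the epd line, e.g. Ae "Stockfish 8";
    return opcodes.get("Ae", "").strip('"') == wanted


opcodes = {"ce": "42", "acd": "28", "Ae": '"Stockfish 8"'}
print(engine_matches(opcodes, "Stockfish 8"))
print(engine_matches(opcodes, "0"))
```

A line without an Ae opcode only passes when the user asks for any engine.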
Frank Quisinsky wrote: This means the evals are quite different!
So users have a table for selecting the games.

If 0.50 is too high for a user ...
he can enter 0.10 or 0.20 less for all of the 10!

When searching positions for a test-set ...

Your tool could offer an extra feature!

Reject: 0.00 too!
The probability of a 3-fold repetition will fall.
And the database of positions has a very high contempt!!
The user can just run the filter twice. Suppose the user does not want drawish positions and is only interested in scores outside the window [-10, +10] but inside the window [-30, +30], in cp.

Code:

min score in cp? 11
max score in cp? 30
and another filter run

Code:

min score in cp? -30
max score in cp? -11
The user can then combine the 2 filtered pgn files.
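
For whole-centipawn scores, the two runs together select the same positions as one check on the absolute score. A small Python sketch, with the hypothetical function name `passes`:

```python
def passes(ce, inner=10, outer=30):
    """Keep scores inside [-outer, +outer] but outside [-inner, +inner]."""
    return inner < abs(ce) <= outer


scores = [0, 10, 11, -11, 30, -30, 31, -45]
kept = [s for s in scores if passes(s)]

# The same selection as two separate runs, as in the prompts above.
run_plus = [s for s in scores if 11 <= s <= 30]
run_minus = [s for s in scores if -30 <= s <= -11]
print(kept, sorted(run_plus + run_minus) == sorted(kept))
```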
Frank Quisinsky wrote: But this one is more interesting if we select positions for a test-set ... maybe in combination with the tool PGN selection from Volker Annuss, available in my download area too.

What do you think?
This is indeed a very useful one.
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: New idea 2 & 3 ... from Stefan P. ... contempt!

Post by Ferdy »

Frank Quisinsky wrote: Idea 1: 0.00 evals to reject
Idea 2: reject positions without queen on board
Idea 3: reject positions with too many exchanged pieces.
Idea 1: was already suggested in previous post.
Idea 2: I can add

Code:

number of queens [0, 1, 2]? 2
Idea 3: Total pieces (w/o pawns and kings) = 7+7 = 14, perhaps

Code:

minimum piece counts? 10
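
Both counts can be read straight from the board field of a FEN/EPD line. A minimal Python sketch with made-up function names; by default it counts pieces without pawns and kings (7+7 = 14 at the start), and the flags allow counting all 32 men instead.

```python
def count_queens(fen):
    """Number of queens of both colours on the board."""
    board = fen.split()[0]
    return board.count("Q") + board.count("q")


def count_pieces(fen, include_pawns=False, include_kings=False):
    """Count men on the board; by default without pawns and kings."""
    board = fen.split()[0]
    wanted = "QRBNqrbn"
    if include_pawns:
        wanted += "Pp"
    if include_kings:
        wanted += "Kk"
    return sum(1 for ch in board if ch in wanted)


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
posted = "r2qkb1r/ppp1pppp/5n2/8/3P2b1/5N2/PPP2PPP/R1BQKB1R w KQkq -"
print(count_queens(start), count_pieces(start))
print(count_pieces(start, include_pawns=True, include_kings=True))
print(count_queens(posted), count_pieces(posted))
```

A position would then pass when its queen count equals the requested number and its piece count is at least the requested minimum.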
We can get crazier by:

Code:

closed positions?
open positions?
semi-open positions?
with passer?
with king castled queen-side?
king's indian?
with doubled pawn?
with isolated pawn?
with backward pawn?
Until this tool becomes unusable because of too many questions :).

But I like the pawn structure weaknesses idea,

Code:

with doubled pawn?
with isolated pawn?
with backward pawn?
or just one

Code:

with pawn structure weakness?
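
Two of the three weaknesses can be detected from the pawn files alone. The Python sketch below covers doubled and isolated pawns only; backward pawns need rank information and a more careful definition, so they are left out. All function names are made up, and the example position is one of the EPD lines posted earlier in this thread.

```python
def pawn_files(fen, pawn="P"):
    """File index (0 = a-file) of every pawn of one colour ('P' or 'p')."""
    files = []
    for rank in fen.split()[0].split("/"):
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)           # skip empty squares
            else:
                if ch == pawn:
                    files.append(col)
                col += 1
    return files


def has_doubled_pawn(fen, pawn="P"):
    """True if two or more pawns of that colour share a file."""
    files = pawn_files(fen, pawn)
    return len(files) != len(set(files))


def has_isolated_pawn(fen, pawn="P"):
    """True if some pawn has no friendly pawn on either adjacent file."""
    occupied = set(pawn_files(fen, pawn))
    return any(f - 1 not in occupied and f + 1 not in occupied
               for f in occupied)


fen = "r1bqk2r/pp2bppp/2n5/3p4/3p4/5NP1/PP2PPBP/R1BQ1RK1 w kq -"
print(has_doubled_pawn(fen, "p"), has_isolated_pawn(fen, "p"))
print(has_doubled_pawn(fen, "P"), has_isolated_pawn(fen, "P"))
```

In this position Black has doubled, isolated pawns on the d-file, while White has neither weakness.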
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: New idea 2 & 3 ... from Stefan P. ... contempt!

Post by Frank Quisinsky »

1. number of Queens ... perfect
2. number of pieces ... perfect too ...

But I do not completely understand:
There are 32 pieces on board, and I think the interesting case is ... 3 moves after the ECO code is formed ... no fewer than 24 pieces on board ... so I would give the tool 24.

The user should enter his own number of pieces on board.
I think 24 is interesting for test-set games ...


Interesting can be ...

- closed positions ... tool gives me closed positions in PGN only
- open positions ... tool gives me open positions in PGN only
- semi-open positions ... tool gives me semi-open positions in PGN only

- different king castling ... example: White on the queen side, Black on the king side
and the tool gives me such positions in PGN only
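
The "kings on different sides" check could be approximated from the king squares in the FEN. This Python sketch is only a heuristic under a stated assumption: a king on files a-c counts as queen side and on files g-h as king side, which does not prove that castling actually happened. All function names are made up, and the example is one of the positions posted earlier in this thread.

```python
def king_file(fen, king):
    """File index (0 = a-file) of the given king letter, 'K' or 'k'."""
    for rank in fen.split()[0].split("/"):
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)           # skip empty squares
            else:
                if ch == king:
                    return col
                col += 1
    raise ValueError(f"no {king!r} in FEN")


def king_side(fen, king):
    """Classify the king's wing by its file (heuristic, see lead-in)."""
    col = king_file(fen, king)
    if col <= 2:
        return "queen side"
    if col >= 6:
        return "king side"
    return "centre"


def opposite_wings(fen):
    """True if one king sits on the queen side and the other on the king side."""
    return {king_side(fen, "K"), king_side(fen, "k")} == {"queen side", "king side"}


fen = "r2q1rk1/pp1bppbp/1n1p2p1/n7/3NP3/1BNQBP2/PPP3PP/2KR3R w - -"
print(king_side(fen, "K"), "/", king_side(fen, "k"), "/", opposite_wings(fen))
```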

Should be enough ...

So the program asks ...
- closed positions ... I can say yes or no
and so on, and I have the positions in a new PGN database


Indeed, pawn structures are most important for openings.
If possible ... not one "pawn structure weakness" question ... better is ...

with doubled pawn
with isolated pawn
with backward pawn

If I answer yes 3x, I have all the weak pawn structures.

All in all ...
The tool must be easy to handle.
In the foreground: rejecting games by eval.

In the background: the ideas for doing more with the database and creating test-sets. So perhaps it is better to create two new tools.

One for test-sets from the database of positions ...
One for opening books from the database of positions ...