Modular opening book SF analysed 87417 pos., beta-1
-
- Posts: 6808
- Joined: Wed Nov 18, 2009 7:16 pm
- Location: Gutweiler, Germany
- Full name: Frank Quisinsky
Re: Stefan will be happy ... first beta!
Hi Ferdinand,
OK, found my mistake too.
A short error check will make it a bit better.
Best
Frank
-
- Posts: 6808
- Joined: Wed Nov 18, 2009 7:16 pm
- Location: Gutweiler, Germany
- Full name: Frank Quisinsky
Re: But ... have a look here!
OK, again ...
You wrote in your readme:
"Duplicates in filtered_big.pgn file are also removed."
I did not see that at first, sorry!
---
I have:
87.417 games + 4.728 games in the update database = 92.145 games.
With pgn-extract and these parameters:
Code: Select all
pgn-extract --fuzzydepth 0 --duplicates dupes.pgn --output unique-alpha.pgn alpha.pgn
Result = 30.220 games without doubles
Or with the tools by Norm:
Code: Select all
pgnFin alpha.pgn
you need pgn-extract available
it creates outF.epd
I then trim it, and add id
epdtrim outF.epd
idopcode outT.epd
copy idlist inlist
epdInsert inlist outT.epd
creates outN.epd
then I rename outN.epd to whatever you want
Result = 30.238 games without doubles
After the SF analysis I rejected
9.391 with doubles and 50 by hand (very rare lines with unusual combinations)
Result = 9.441 with doubles or 4.924 without doubles
...
Now ...
If I copy the 30.220 games (file called test.pgn) and the *.epd with the 87.417 SF analyses into your directory ... I get the result:
24.173 positions in filtered_test.pgn
And this can't be right.
It must be 26.629 ...
Because:
87.417 in Alpha.pgn - 9.441 rejected = 77.976 with doubles, and
77.976 with doubles + 4.728 update with doubles = 82.704 final.
82.704 with doubles = 26.629 without doubles (after the tool by Norm).
and everything SF found is rejected.
But your tool gives not 26.629, your tool gives ... 24.173!!
Code: Select all
start
refepd, alpha.epd
refpgn, test.pgn
minscorecp, -30
maxscorecp, +50
mincntqueen, 0
maxcntqueen, 2
end
Best
Frank
So it is better at first not to reject the doubles (max. with parameter in criteria.txt), and we can check where the mistake is if we have the 4x output with *.epd. I think so ... hope I am right.
I must drive to work now ...
I can answer in the evening!
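The counts in the post above can be rechecked with a few lines of Python. This is only an arithmetic sketch; the variable names are made up for illustration, and the "." in the post is read as a thousands separator. Note that the 82.704 total is reached with an update of 4.728 games.

```python
# Counts quoted in the post (variable names are illustrative only).
alpha = 87_417          # games in Alpha.pgn
update = 4_728          # games in the update database
rejected_sf = 9_391     # rejected after the Stockfish analysis
rejected_hand = 50      # rejected by hand

total = alpha + update                      # alpha plus update, with doubles
rejected = rejected_sf + rejected_hand      # all rejected games, with doubles
kept_alpha = alpha - rejected               # what remains of Alpha.pgn
final_with_doubles = kept_alpha + update    # beta-1 size, with doubles

print(total, rejected, kept_alpha, final_with_doubles)
```

The counts without doubles (30.220, 30.238, 26.629, 24.173) depend on the de-duplication tool and cannot be derived from these sums.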
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: But ... have a look here!
Frank Quisinsky wrote:
OK, again ...
You wrote in your readme:
"Duplicates in filtered_big.pgn file are also removed."
Not seen at first, sorry!
---
I have:
87.417 games + 4.728 games in update database = 92.145 games.
pgn-extract
with parameter
Code: Select all
pgn-extract --fuzzydepth 0 --duplicates dupes.pgn --output unique-alpha.pgn alpha.pgn
Result = 30.220 games without doubles
From my calculation:
Code: Select all
alpha.pgn = 87417
upd_a00-e99.pgn = 4728
alpha-1.pgn = 87417 + 4728 = 92145
Code: Select all
pgn-extract --fuzzydepth 0 --duplicates dupes.pgn --output unique-alpha-1.pgn alpha-1.pgn
Yours : 30220
Mine : 30236
Could you verify your result?
-
- Posts: 1056
- Joined: Thu Mar 09, 2006 4:15 pm
- Location: Long Island, NY, USA
Re: But ... have a look here!
Faux en passant in end positions could be causing the difference. Try using the --nofaux option with pgn-extract.
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: But ... have a look here!
I looked at the EPD file called _00001-87417-analysis.epd, which has 87417 EPD lines; we call this alpha.epd. This file contains duplicates; here is one.
Code: Select all
rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq -
Code: Select all
rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq - id "10299"; ce -30; acd 29; acs 30; acn 304539901; pv Lxe4 ;
Code: Select all
rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq - id "10311"; ce -15; acd 29; acs 30; acn 306083854; pv Lxe4 Dh4 De2 Sf6 Lf3 O-O dxe6 Lc6 Ld2 Te8 g3 Lxf3 Sxf3 Dg4 h3 Dxe6 Dxe6+ Txe6+ Kf1 Lxc3 Lxc3 Sc6 Sd4 Sxd4 Lxd4 Kf7 Td1 ;
Code: Select all
rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq - id "10320"; ce -13; acd 27; acs 30; acn 303890551; pv Lxe4 Dh4 De2 Sf6 Lf3 O-O g3 Dd4 Kf1 La6 Sb5 Lxb5 cxb5 Sxd5 Sh3 Df6 Kg2 c6 Lg5 Df7 Sf4 a6 a4 axb5 axb5 Txa1 Txa1 ;
Code: Select all
rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq - id "10566"; ce -31; acd 27; acs 30; acn 299206348; pv Lxe4 Dh4 Ld3 exd5 Sf3 De7+ Le3 dxc4 Lxc4 Lxc3+ bxc3 Sc6 O-O O-O-O Te1 Df8 Sd4 Sf6 Sxc6 dxc6 Dc2 Kb8 f3 Da3 Lb3 The8 Lf2 ;
[...] more
These duplicates make the calculation complicated.
My tool searches for ce values inside a given window, say -30/+50, then lets pgn-extract find the games and remove duplicates by end-position matching. In this case my tool will accept this EPD as good, because there is an entry with ce within -30/+50, even though another entry has ce -31.
It seems to me that the reference EPD should be unique. If there is more than one entry for a position with different ce values, the entries should be distinguished by the analyzing engine, for example:
Code: Select all
rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq - id "10299"; ce -30; acd 29; acs 30; acn 304539901; pv Lxe4 ; Ae "Sf8";
rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq - id "10311"; ce -15; acd 29; acs 30; acn 306083854; pv Lxe4 Dh4; Ae "K10";
But not as below: same engine, same EPD, different ce values.
Code: Select all
rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq - id "10299"; ce -30; acd 29; acs 30; acn 304539901; pv Lxe4 ; Ae "Sf8";
rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq - id "10299"; ce -36; acd 29; acs 30; acn 314539901; pv Lxe4 ; Ae "Sf8";
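One way to spot such positions before filtering is to group the reference EPD lines by position and list every position that carries more than one distinct ce value. The following is a minimal sketch, not the actual tool; the function names are hypothetical, and it assumes each line starts with the four standard EPD position fields followed by ";"-separated opcodes as in the examples above (pv shortened here).

```python
from collections import defaultdict

def parse_epd_line(line):
    """Return (position key, ce value or None) for one EPD line."""
    fields = line.split()
    key = " ".join(fields[:4])  # board, side to move, castling, ep square
    ce = None
    for op in " ".join(fields[4:]).split(";"):
        parts = op.split()
        if len(parts) == 2 and parts[0] == "ce":
            ce = int(parts[1])
    return key, ce

def conflicting_positions(lines):
    """Map each position with more than one distinct ce to its ce list."""
    by_pos = defaultdict(list)
    for line in lines:
        if line.strip():
            key, ce = parse_epd_line(line)
            if ce is not None:
                by_pos[key].append(ce)
    return {k: v for k, v in by_pos.items() if len(set(v)) > 1}

fen = "rn1qk1nr/pbpp2pp/1p2p3/3P4/1bP1p3/2NB4/PP3PPP/R1BQK1NR w KQkq -"
sample = [
    fen + ' id "10299"; ce -30; acd 29; pv Lxe4 ;',
    fen + ' id "10311"; ce -15; acd 29; pv Lxe4 Dh4 ;',
    fen + ' id "10320"; ce -13; acd 27; pv Lxe4 Dh4 ;',
    fen + ' id "10566"; ce -31; acd 27; pv Lxe4 Dh4 ;',
]
conflicts = conflicting_positions(sample)
for key, values in conflicts.items():
    print(key, values)
```

On a real file the lines would be read with open(...) instead of the inline sample list.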
-
- Posts: 6808
- Joined: Wed Nov 18, 2009 7:16 pm
- Location: Gutweiler, Germany
- Full name: Frank Quisinsky
Re: But ... have a look here!
Hi Ferdinand,
same EPD ... different values!
That should be clear ...
I am using 4 cores with 4x Hyperthreading.
The final results will never be the same.
Indeed bad ... I have been thinking about the problem for a long time!
But with more cores and hyperthreading I get a clearly better result in 30 seconds per move.
Best
Frank
-
- Posts: 6808
- Joined: Wed Nov 18, 2009 7:16 pm
- Location: Gutweiler, Germany
- Full name: Frank Quisinsky
Re: But ... have a look here!
Hi Ferdinand,
Example:
If I have the same EPD 4 times with 4 different values ...
reject window: +0.50 / -0.30
1. Value = -0.20
2. Value = -0.25
3. Value = -0.30
4. Value = -0.31
After the Stockfish analysis I reject only 1 of the 4, the one with value = -0.31, because I do it by hand using the game numbers in the ChessBase GUI. I created CBH database files from the 87.417-game PGN file. PGN and EPD have the same game numbers! With epdOrder by Norm I can sort the EPD file by ce and then delete those game numbers by hand in the CBH file.
The reject information can be found in my database v1.03 file, in the Stockfish subdirectory: reject
In other words ... during this work I can't see that the position is in the database 4 times.
After rejecting what Stockfish found and adding the update database of 4.728 games, I built the beta-1.pgn file.
beta-1.pgn (82.704 games) now still contains the game three times, because only one copy was removed.
That is indeed a problem, yes!
Because it is better to reject 4/4.
Maybe this is possible with your program?
If an EPD occurs more than once in the database, delete all copies if one of them is outside the values in criteria.txt.
---
Komodo is now analysing the database without doubles, so I will not have the problem again. After Komodo, all other engines will also analyse the database without doubles.
You wrote:
These duplicates make the calculation complicated.
I know that ... I did not think about it at first.
Best
Frank
It is much easier to do this:
Forget the 87.417 alpha database.
The new main database is beta-1.pgn, the database after the Stockfish analysis, with 26.619 games without doubles or 82.704 with doubles. When Komodo is ready we will also have 26.619 positions in EPD with ce, because Komodo does not analyse everything ... only the smaller database without doubles.
Maybe it makes more sense to test and compare the results of your tool against the beta-1 database and not against the alpha.pgn database, with or without the update I created.
I don't know!
In around 4 days Komodo will be done and I can create the beta-2 file; Houdini will be next.
Best
Frank
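The difference between the two rules discussed here (keep a position if any of its ce values lies in the window, versus reject it as soon as one value falls outside) can be shown with the four example values from the post, converted to centipawns. This is a sketch only; the function names are hypothetical and the inclusive window bounds are an assumption based on -0.30 being kept and -0.31 being rejected.

```python
def in_window(ce, lo=-30, hi=50):
    """True if a centipawn score lies inside the window (bounds inclusive)."""
    return lo <= ce <= hi

def keep_if_any(ces, lo=-30, hi=50):
    """Lenient rule: one good score is enough to keep the position."""
    return any(in_window(c, lo, hi) for c in ces)

def keep_if_all(ces, lo=-30, hi=50):
    """Strict rule ("reject 4/4"): one bad score rejects every copy."""
    return all(in_window(c, lo, hi) for c in ces)

values = [-20, -25, -30, -31]   # the four example scores, in centipawns
print(keep_if_any(values))      # True: the position is kept
print(keep_if_all(values))      # False: all copies are rejected
```

With the strict rule, all games ending in that position would be dropped together, which is the "reject 4/4" behaviour asked for above.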
-
- Posts: 6808
- Joined: Wed Nov 18, 2009 7:16 pm
- Location: Gutweiler, Germany
- Full name: Frank Quisinsky
Re: But ... have a look here!
Hi Norm,
--nofaux
I think you mean --nofauxep?
But how can I use that?
pgn-extract --nofauxep ... and then?
I tried different combinations!
At the moment there are differences in the final results!
30.238 with your tools is right
30.220 with pgn-extract is wrong
18 games are missing!
How can I find the 18 games, and how can I get the right result with pgn-extract?
The hint is great!!
I think that if I work with --nofauxep I must do it in two steps, and not in one step in combination with ...
pgn-extract --fuzzydepth 0 --duplicates dupes.pgn --output unique-beta-1.pgn beta-1.pgn
Best
Frank
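To locate the games that only one of the two tools keeps, one possible approach is to export the end positions of both unique sets as EPD and compare the position keys. The sketch below is an assumption about how that could be done, not a feature of pgn-extract or of Norm's tools; the function names are hypothetical. The ignore_ep switch replaces the en passant field with "-" so that positions differing only in a faux ep square compare as equal.

```python
def position_key(epd_line, ignore_ep=False):
    """Build a comparison key from the four EPD position fields."""
    board, side, castling, ep = epd_line.split()[:4]
    if ignore_ep:
        ep = "-"  # treat every ep square as undefined
    return " ".join((board, side, castling, ep))

def only_in_first(first_lines, second_lines, ignore_ep=False):
    """Return the lines of the first set whose position is not in the second."""
    second = {position_key(l, ignore_ep) for l in second_lines if l.strip()}
    return [l for l in first_lines
            if l.strip() and position_key(l, ignore_ep) not in second]

# Same position after 1.e4, once with and once without the ep square.
a = ["rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3"]
b = ["rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq -"]

strict = only_in_first(a, b)                   # ep square is compared
relaxed = only_in_first(a, b, ignore_ep=True)  # ep square is ignored
print(len(strict), len(relaxed))
```

If the list shrinks to nothing once the ep field is ignored, the faux en passant squares explain the difference; if not, the remaining lines point at the games to inspect by hand.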
-
- Posts: 6808
- Joined: Wed Nov 18, 2009 7:16 pm
- Location: Gutweiler, Germany
- Full name: Frank Quisinsky
Re: But ... have a look here!
Hi Ferdinand,
again ... I reject by hand in the ChessBase GUI.
It is possible that I made mistakes here!
Example:
Instead of game number 60212, I reject game number 60112.
If so, I have all the games I rejected in another database (I can check that later).
Komodo will find such mistakes and will again give me the same "bad lines" Stockfish found.
After all ...
This is possible, but I think I am working without many mistakes here. I can't do it another way because no tools are available for it.
In reality, after the Komodo analysis, the first good database will be available.
Maybe you should test your new tool with beta-1.pgn or beta-2.pgn, which will be available shortly ... in 3-4 days.
Best
Frank
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: But ... have a look here!
Norm Pollock wrote:
Faux en passant in end positions could be causing the difference. Try using the --nofaux option with pgn-extract.
I tried to run it with the option --nofauxep but I get the same numbers. Note that the FEN I use comes from the analyzed EPD file, which already exists. I don't know whether the tool Frank used to generate the analyzed EPD from the PGN file takes the undefined ep square into account.