Introducing the *.EBF project

Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Introducing the *.EBF project

Post by Rebel »

The (E)xtended (B)ook (F)ormat Project.

http://rebel13.nl/misc/ebf.html

Snippets:

The goal of this project is to create an extended opening book (or multiple books) for all chess engines that will roughly ensure:

Book hits for the first 10 plies with 90-95% certainty
Book hits for the first 15 plies with 75-80% certainty
Book hits for the first 20 plies with 50-55% certainty
Book hits for the first 25 plies with 25-30% certainty
Book hits for the first 30 plies with 10-15% certainty.

Whether these numbers are feasible remains to be seen; we are going to try anyway.

All moves in an *.EBF book are analysed by at least one top engine at 30 seconds per move. This means that:

• For the vast majority of chess engines, the quality of the moves is good enough to play them as book moves, gain some strength and save time on the clock. A move that Stockfish 7 has analysed for 30 seconds will likely be better than what 98% of other engines would play.

• For variety, a position can be analysed by more than one engine. The system allows up to 127 engines, which you can freely define yourself.

Once your engine is out of its own opening book, consult an *.EBF opening book (just one call, with an EPD string of the current position as input) and it will return, for instance:

[d] rn1qkb1r/p1pp1ppp/bp2pn2/8/2PP4/5NP1/PP2PP1P/RNBQKB1R w KQkq

Move Score Engine
d1a4    4  0 = Stockfish 7
b2b3   12  1 = Komodo 7
b1c3    4  2 = Gull 3
.........
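
A minimal sketch of how an engine might consume such a reply, assuming the text layout shown above (move, score, engine index, `=`, engine name). The names `BookEntry`, `parse_ebf_reply` and `pick_move` are hypothetical and not taken from the EBF source code, and treating the score as centipawns is an assumption.

```python
from typing import List, NamedTuple, Optional, Set


class BookEntry(NamedTuple):
    move: str       # coordinate notation, e.g. "d1a4"
    score: int      # assumed to be centipawns
    engine_id: int  # index into the user-defined engine list
    engine: str     # engine name as printed in the reply


def parse_ebf_reply(text: str) -> List[BookEntry]:
    """Parse lines such as 'd1a4    4  0 = Stockfish 7'; skip everything else."""
    entries = []
    for line in text.splitlines():
        left, sep, name = line.partition("=")
        fields = left.split()
        if not sep or len(fields) != 3:
            continue  # header, filler dots, or blank line
        move, score, engine_id = fields
        try:
            entries.append(BookEntry(move, int(score), int(engine_id), name.strip()))
        except ValueError:
            continue  # non-numeric score or engine index
    return entries


def pick_move(entries: List[BookEntry], trusted: Optional[Set[int]] = None) -> Optional[str]:
    """Return the highest-scoring move, optionally restricted to trusted engine ids."""
    candidates = [e for e in entries if trusted is None or e.engine_id in trusted]
    if not candidates:
        return None
    return max(candidates, key=lambda e: e.score).move


REPLY = """Move Score Engine
d1a4    4  0 = Stockfish 7
b2b3   12  1 = Komodo 7
b1c3    4  2 = Gull 3
"""

print(pick_move(parse_ebf_reply(REPLY)))
```

Picking the highest score is only one possible policy; an engine could just as well choose at random among the listed moves for variety.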

This is a typical project (like PGN & EPD or Winboard & UCI) that stands or falls with the number of chess programmers who are going to support the EBF Extended Book Format. The more engines support it, the more users will be active in building stronger and stronger EBF books.

Getting things to work takes no longer than an hour. You might even want to rewrite the source code yourself for your own purposes; the data structure of the EBF format (as described in the source code) is quite simple. I would like to maintain a list of engines that support the format, so drop me a note if you want to be named.
flok

Re: Introducing the *.EBF project

Post by flok »

I created an opening book of the first 1350516 unique positions, which is roughly the first 6 plies. Most of them I searched to a depth of 22, using a mix of Stockfish and Komodo.

+----------+-----------------+-------+
| count(*) | sum(occurences) | plies |
+----------+-----------------+-------+
|       10 |              28 |     0 |
|        2 |               4 |     1 |
|  1295713 |         4698726 |    22 |
|      280 |             285 |    23 |
|       25 |              25 |    24 |
|    45470 |          284514 |    25 |
|     9016 |           82947 |    32 |
+----------+-----------------+-------+
Would that be of any use to you?

It is stored in an SQL database:

+------------+--------------+------+-----+---------+-------+
| Field      | Type         | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+-------+
| fen        | varchar(255) | NO   | PRI | NULL    |       |
| plies      | tinyint(4)   | YES  |     | NULL    |       |
| eval       | smallint(6)  | YES  |     | NULL    |       |
| occurences | smallint(6)  | YES  |     | NULL    |       |
| pv         | text         | NO   |     | NULL    |       |
+------------+--------------+------+-----+---------+-------+
Rebel

Re: Introducing the *.EBF project

Post by Rebel »

flok wrote: Would that be of any use to you?
If these positions can be stored like this:

rnbqkb1r/ppp1pp1p/6p1/3n4/3P4/2N5/PP2PPPP/R1BQKBNR w KQkq - bm e2-e4; ce +0,43;
rnbqkb1r/pp2pppp/2p2n2/8/2pP4/2N2N2/PP2PPPP/R1BQKB1R w KQkq - bm a2-a4; ce +0,22;
rnbqkb1r/pppn1ppp/4p3/3pP3/3P4/8/PPPN1PPP/R1BQKBNR w KQkq - bm c2-c3; ce +0,27;
rn1qkb1r/p1pp1ppp/bp2pn2/8/2PP4/5NP1/PP2PP1P/RNBQKB1R w KQkq - bm d1-a4; ce +0,05;
rnbqkb1r/pp2pppp/3p1n2/8/3NP3/8/PPP2PPP/RNBQKB1R w KQkq - bm b1-c3; ce +0,17;
r1bqkbnr/pp1p1ppp/2n1p3/8/3pP3/2N2N2/PPP2PPP/R1BQKB1R w KQkq - bm f3xd4; ce +0,23;

Then that would be a gift from heaven as they can be stored immediately into an *.EBF book.
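
A rough sketch of how rows from a table like flok's could be turned into lines of the shape shown above. It assumes that `pv` holds space-separated coordinate moves with the best move first and that `eval` is in centipawns; neither is stated in the thread. The function names are hypothetical, and the comma-decimal `ce` value simply copies the sample lines.

```python
def piece_at(placement: str, square: str) -> str:
    """Return the piece letter on `square` (e.g. "d4"), or "" if it is empty."""
    file_idx = ord(square[0]) - ord("a")
    rank = placement.split("/")[8 - int(square[1])]  # FEN lists rank 8 first
    col = 0
    for ch in rank:
        if ch.isdigit():
            col += int(ch)          # run of empty squares
            if col > file_idx:
                return ""
        else:
            if col == file_idx:
                return ch
            col += 1
    return ""


def to_epd_line(fen: str, pv: str, eval_cp: int) -> str:
    """Format one row as '<4-field FEN> bm <from>-<to>; ce <pawns,cents>;'."""
    fields = fen.split()
    placement = fields[0]
    move = pv.split()[0]            # assumption: first PV move is the best move
    src, dst = move[:2], move[2:4]
    mover = piece_at(placement, src)
    target = piece_at(placement, dst)
    # A pawn changing file onto an empty square can only be an en passant capture.
    en_passant = mover.lower() == "p" and src[0] != dst[0] and not target
    sep = "x" if target or en_passant else "-"
    pawns, cents = divmod(abs(eval_cp), 100)
    sign = "-" if eval_cp < 0 else "+"
    position = " ".join(fields[:4])  # drop move counters if present
    return f"{position} bm {src}{sep}{dst}{move[4:]}; ce {sign}{pawns},{cents:02d};"


print(to_epd_line(
    "r1bqkbnr/pp1p1ppp/2n1p3/8/3pP3/2N2N2/PPP2PPP/R1BQKB1R w KQkq - 2 5",
    "f3d4 c6d4",
    23,
))
```

Any promotion suffix in the coordinate move is passed through unchanged, and castling is written as a plain king move; both are choices made for this sketch only.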
flok

Re: Introducing the *.EBF project

Post by flok »

Ok.
I'll let you know where to download it from. Takes a bit to dump it so please hold on.
Please include my name + website if you're going to use it.
Thanks
Rebel

Re: Introducing the *.EBF project

Post by Rebel »

Thanks, I will.
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Introducing the *.EBF project

Post by hgm »

This looks more like a database than a book. What use is it to an engine using the book to know which engine evaluated a move, and by how much? If I want to play a book move, I just want the book to tell me what move to play. Having to compare scores and the trustworthiness of sources needlessly complicates my probing code. In the end the only thing that matters is the probability with which I should play each move in the current position. And all information that could help me make that decision was already known when the book was compiled.
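
For illustration, the kind of probe described here (a book that only says which move to play, and with which probability) could look roughly like the sketch below. The dictionary layout, the weights and the name `probe` are all made up for this example; the weights stand for whatever was decided when the book was compiled.

```python
import random

# Hypothetical book: position key -> list of (move, weight).
# The weights are invented; a real book would fix them at compile time.
BOOK = {
    "rn1qkb1r/p1pp1ppp/bp2pn2/8/2PP4/5NP1/PP2PP1P/RNBQKB1R w KQkq -": [
        ("d1a4", 60),
        ("b2b3", 30),
        ("b1c3", 10),
    ],
}


def probe(book, key, rng=random):
    """Return one book move drawn in proportion to its weight, or None on a miss."""
    entries = book.get(key)
    if not entries:
        return None
    moves, weights = zip(*entries)
    return rng.choices(moves, weights=weights, k=1)[0]


print(probe(BOOK, "rn1qkb1r/p1pp1ppp/bp2pn2/8/2PP4/5NP1/PP2PP1P/RNBQKB1R w KQkq -"))
```

The probing code stays this small because scores and sources never reach the engine; they have already been folded into the weights.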
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Introducing the *.EBF project

Post by Dann Corbit »

hgm wrote: This looks more like a database than a book. What use is it to an engine using the book to know which engine evaluated a move, and by how much? If I want to play a book move, I just want the book to tell me what move to play. Having to compare scores and the trustworthiness of sources needlessly complicates my probing code. In the end the only thing that matters is the probability with which I should play each move in the current position. And all information that could help me make that decision was already known when the book was compiled.
Win/loss/draw statistics, analysis score, performance above Elo, I can think of dozens of things that can be useful in a book.

If you only want a move to make, then a random move generator will give you that.

If you want to make the best possible move, then the more information you have the better your move choice can be.

You can perform your own experiments that weight (for instance) actual game outcome scoring versus ply depth of analysis multiplied by engine Elo or whatever other clever things you can imagine.

Data is knowledge and the deeper and more powerfully analyzed the data, the better the data is. Maybe you find a correlation between how often a node wins in top correspondence games that trumps all other measures. Then this is information you would want to store in your book.

Personally, I think it wise to store every pv node an engine ever finds in a file. Load the file on startup. Whenever a new pv node is found or any pv node is improved, update the file.

Let's call it permanent pv hash.
That is another kind of book information.
Pv nodes are rare, and therefore valuable.
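
A small sketch of what such a permanent pv hash could look like: load on startup, write back whenever an entry is new or improved. It assumes that "improved" means "searched to a greater depth" and that a JSON file is an acceptable store; the class name and field names are hypothetical.

```python
import json
import os


class PermanentPVHash:
    """File-backed table of PV nodes: position key -> move, score, depth."""

    def __init__(self, path):
        self.path = path
        self.table = {}
        if os.path.exists(path):            # load the file on startup
            with open(path, encoding="utf-8") as fh:
                self.table = json.load(fh)

    def probe(self, key):
        """Return the stored entry for `key`, or None if the position is unknown."""
        return self.table.get(key)

    def store(self, key, move, score, depth):
        """Store a PV node if it is new or deeper than the existing entry."""
        old = self.table.get(key)
        if old is not None and old["depth"] >= depth:
            return False                     # assumption: deeper means improved
        self.table[key] = {"move": move, "score": score, "depth": depth}
        self._save()
        return True

    def _save(self):
        # Write to a temporary file first so a crash cannot truncate the book.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(self.table, fh)
        os.replace(tmp, self.path)
```

Rewriting the whole file on every update is only reasonable because pv nodes are rare; a larger table would call for batching the writes or for one of the database systems mentioned below.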

There are many memory backed database systems like MonetDB and FastDB that operate at memory speed. You can think of them as extremely elaborate hash tables. If the access is almost free, and it gives a winning move, then why not try it?

I think that an opening book should always evolve with new information so that eventually it becomes perfect.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
hgm

Re: Introducing the *.EBF project

Post by hgm »

Dann Corbit wrote: Win/loss/draw statistics, analysis score, performance above Elo, I can think of dozens of things that can be useful in a book.

If you only want a move to make, then a random move generator will give you that.
It would not be a very good move, though...

Dann Corbit wrote: If you want to make the best possible move, then the more information you have the better your move choice can be.
Not really. Information can be sub-divided into useful and irrelevant information, and irrelevant information is just noise that obscures the relevant information.

Dann Corbit wrote: You can perform your own experiments that weight (for instance) actual game outcome scoring versus ply depth of analysis multiplied by engine Elo or whatever other clever things you can imagine.
Normally you would do such things while creating a book from a game database. There is no gain in doing it 'on the fly'.

Dann Corbit wrote: Personally, I think it wise to store every pv node an engine ever finds in a file. Load the file on startup. Whenever a new pv node is found or any pv node is improved, update the file.

Let's call it permanent pv hash.
That is another kind of book information.
Pv nodes are rare, and therefore valuable.
Funny you bring that up. I was just considering implementing my mini-Shogi book in Shokidoki that way.

Dann Corbit wrote: There are many memory backed database systems like MonetDB and FastDB that operate at memory speed. You can think of them as extremely elaborate hash tables. If the access is almost free, and it gives a winning move, then why not try it?
Because opening books that just give you the moves that you would play through this method, but then without you having to do any complex judgement on the fly, work just as well.
Dann Corbit

Re: Introducing the *.EBF project

Post by Dann Corbit »

Dann Corbit wrote: Win/loss/draw statistics, analysis score, performance above Elo, I can think of dozens of things that can be useful in a book.

If you only want a move to make, then a random move generator will give you that.
hgm wrote: It would not be a very good move, though...

Dann Corbit wrote: If you want to make the best possible move, then the more information you have the better your move choice can be.
hgm wrote: Not really. Information can be sub-divided into useful and irrelevant information, and irrelevant information is just noise that obscures the relevant information.
The way to find out is not to guess. If you have the data available then you can run experiments to find the answer.
Suppose, for instance, that you have a move. Stockfish at 40 plies says +100 centipawns and the next best is +80 centipawns. Which will you choose? Now suppose that in actual game play the 100 centipawn move won 1000 games, drew 1300 and lost 1400. And the 80 centipawn move won 1400 games, drew 1350 and lost 900. Now which will you choose?
We cannot say to simply go with the actual won/loss/draw percentage either. What if there were only 8 games? Only 24? What if there are 40,000 games?
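
To put numbers on the example above: counting a draw as half a point, the +100 centipawn move scored 1650 out of 3700 games (about 44.6%) and the +80 centipawn move 2075 out of 3650 (about 56.8%). The sketch below also computes a standard error of the mean score, which is one simple way (not the only one) to express how much less 8 games say than 40,000. The function name is hypothetical.

```python
import math


def score_with_error(wins, draws, losses):
    """Return (mean score per game, standard error of that mean)."""
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    # Variance of a single game result, which is 1, 0.5 or 0.
    variance = (
        wins * (1.0 - mean) ** 2
        + draws * (0.5 - mean) ** 2
        + losses * (0.0 - mean) ** 2
    ) / games
    return mean, math.sqrt(variance / games)


for label, record in [("+100 cp move", (1000, 1300, 1400)),
                      ("+80 cp move", (1400, 1350, 900))]:
    mean, err = score_with_error(*record)
    print(f"{label}: score {mean:.3f} +/- {err:.3f}")
```

With the same win/draw/loss proportions the standard error shrinks with the square root of the number of games, so a record of 8 games is about 70 times less precise than one of 40,000.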

We can also tag the book with *our* engine's results.

Dann Corbit wrote: You can perform your own experiments that weight (for instance) actual game outcome scoring versus ply depth of analysis multiplied by engine Elo or whatever other clever things you can imagine.
hgm wrote: Normally you would do such things while creating a book from a game database. There is no gain in doing it 'on the fly'.
You state there is no gain in doing it on the fly. I state that if the data in the book is constantly updated on the fly, then doing it on the fly is the only sensible way to do it.

Dann Corbit wrote: Personally, I think it wise to store every pv node an engine ever finds in a file. Load the file on startup. Whenever a new pv node is found or any pv node is improved, update the file.

Let's call it permanent pv hash.
That is another kind of book information.
Pv nodes are rare, and therefore valuable.
hgm wrote: Funny you bring that up. I was just considering implementing my mini-Shogi book in Shokidoki that way.

Dann Corbit wrote: There are many memory backed database systems like MonetDB and FastDB that operate at memory speed. You can think of them as extremely elaborate hash tables. If the access is almost free, and it gives a winning move, then why not try it?
hgm wrote: Because opening books that just give you the moves that you would play through this method, but then without you having to do any complex judgement on the fly, work just as well.
Complex judgement on the fly will take a microsecond for the computer.
Doing the search will take longer.
Now, which is better is a judgement call.

There is nothing wrong with a book that delivers nothing more than "the move".
But I suggest that we can have a book that is both smart (contains many dimensions of data) and active (gets mini-maxed on the fly, even asks the engine for advice if needed).
In order to know which method is better we need to measure.
Of course, both methods can work. If superior methods are used to compute the simple "this is the answer" book, then it can deliver superior moves.
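
As a rough illustration of a book that "gets mini-maxed on the fly", the sketch below backs up leaf scores through a small book tree with negamax. The data layout (position key -> move -> child key, plus a table of leaf scores from the side to move's point of view) and both function names are assumptions made for this example only.

```python
def backed_up_score(book, leaf_scores, key, path=frozenset()):
    """Negamax value of `key` from the point of view of the side to move there."""
    children = book.get(key)
    if not children or key in path:      # leaf of the book, or a repeated position
        return leaf_scores.get(key, 0)
    path = path | {key}
    return max(-backed_up_score(book, leaf_scores, child, path)
               for child in children.values())


def best_book_move(book, leaf_scores, key):
    """Return the move whose subtree backs up to the best score, or None."""
    children = book.get(key)
    if not children:
        return None
    return max(
        children,
        key=lambda m: -backed_up_score(book, leaf_scores, children[m], frozenset({key})),
    )


# Tiny hypothetical book: after move "a" the opponent has a strong reply "a2".
BOOK = {"root": {"a": "A", "b": "B"}, "A": {"a1": "A1", "a2": "A2"}}
LEAF_SCORES = {"A1": 30, "A2": -50, "B": -10}   # centipawns, side to move

print(best_book_move(BOOK, LEAF_SCORES, "root"))
```

Because the scores are backed up at probe time, a change to any leaf score (a new game result, a deeper analysis) affects the move choice immediately, with no need to recompile the book.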

I do not know that the method I describe is in any way better. I only know that I like this idea much better than the oracle method.
Rebel

Re: Introducing the *.EBF project

Post by Rebel »

hgm wrote: This looks more like a database than a book. What use is it to an engine using the book to know which engine evaluated a move, and by how much? If I want to play a book move, I just want the book to tell me what move to play. Having to compare scores and the trustworthiness of sources needlessly complicates my probing code. In the end the only thing that matters is the probability with which I should play each move in the current position. And all information that could help me make that decision was already known when the book was compiled.
There is Elo in that kind of information, it's that simple.