Opening Book from pgn games set

Matt Thomas

Post by Matt Thomas »

Hello,

I am working on creating an opening book for my chess engine.

I have seen a method that stores a position together with weighted moves from that position, and then repeats this for every further position added to the book.

I would like to convert a multi-game PGN file into an opening book, but I cannot picture a good method for doing so.

Has anyone tried to do this?

I would like to create an opening book that gives the engine a style of play, say Lasker's. If I can only picture a method, then I can write some code to accomplish it.

My problem in visualizing a solution is that Lasker may have played several different lines from a given opening position. I need to combine his choices into book positions so that the engine plays his moves exactly as he would whenever the opponent follows one of his games. I can do this manually, but I was hoping for a swifter approach.

Thanks for all replies, Matt
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Post by Edmund »

There are several tools for converting pgn files to Opening Books.

I designed an opening book format for the Glass Chess Engine and made it open source, so anyone is free to use it. It comes with its own opening book manager, which you can download at http://www.koziol.home.pl/marittima/glass/download.htm

When parsing the PGN file you can set certain parameters, such as how to weigh moves (number of occurrences, game outcomes, etc.).
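
To make the weighting idea concrete, here is a rough Python sketch. It is not taken from GOB or any other tool. It assumes each game has already been parsed into a list of SAN moves plus a result, and it keys a position by the move sequence that reached it (a real book would key on a position hash so that transpositions merge). The names `build_book`, `max_ply` and `hero_is_white`, and the rule "1 per occurrence, 2 if the modelled player won", are hypothetical choices for illustration only.

```python
from collections import defaultdict

def build_book(games, max_ply=20):
    """Map each position (keyed by the moves leading to it) to weighted replies.

    games: iterable of (moves, result, hero_is_white), where moves is a list
    of SAN strings, result is "1-0", "0-1" or "1/2-1/2", and hero_is_white
    says which side the modelled player (e.g. Lasker) had.
    Only the modelled player's own moves are recorded.
    """
    book = defaultdict(lambda: defaultdict(int))
    for moves, result, hero_is_white in games:
        # None for a draw or an unknown result
        winner_is_white = {"1-0": True, "0-1": False}.get(result)
        hero_won = winner_is_white is not None and winner_is_white == hero_is_white
        for ply, move in enumerate(moves[:max_ply]):
            white_to_move = (ply % 2 == 0)
            if white_to_move == hero_is_white:
                key = tuple(moves[:ply])  # assumption: move sequence as position key
                # hypothetical weighting: every occurrence counts 1, a win counts 2
                book[key][move] += 2 if hero_won else 1
    return book

games = [
    (["e4", "e5", "Nf3"], "1-0", True),
    (["d4", "d5"], "1/2-1/2", True),
    (["e4", "c5"], "0-1", True),
]
book = build_book(games)
```

With these three toy games, the start position `()` ends up with `e4` weighted above `d4`, because `e4` was played twice and won once.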
brianr
Posts: 540
Joined: Thu Mar 09, 2006 3:01 pm
Full name: Brian Richardson

Post by brianr »

In addition to the GOB option, I would also suggest starting with the Polyglot book format.

http://alpha.uhasselt.be/Research/Algeb ... ormat.html

See the Utilities link near the bottom for working code.
Polyglot is a very commonly used utility (perhaps second only to Winboard).

Newer versions of SCID can edit Polyglot books (3.7.1 for instance).

Also, PGNExtract is useful to clean up PGN files.
This is usually necessary before converting .pgn files to .bin books with Polyglot itself, using the make-book command.
Matt Thomas

Post by Matt Thomas »

Thanks, looking at the docs now.
Bill Rogers
Posts: 3562
Joined: Thu Mar 09, 2006 3:54 am
Location: San Jose, California

Post by Bill Rogers »

Please excuse my ignorance, as I have never understood the format in which most opening books are stored. I store books in almost exactly the same way as TSCP does. Would any of the above programs create that type of book? I am in the dark on how most books are created and stored. Oh, I forgot to mention I don't program in "C" but can read it a little, very little. I still program in Basic.
Thanks for any replies.
Bill
Edmund

Post by Edmund »

Glass Opening Book Manager can import TSCP opening books easily. Export doesn't work, for the simple reason that GOB stores a lot of information per move (weight, value, etc.) which would be lost in a simple move-line format like TSCP's.

Hash table based opening books offer certain advantages over move-line based ones. Especially for large books the hash table form is much smaller, as each move takes only 12 bytes in GOB and 16 bytes in Polyglot, while most lines in TSCP take more bytes than that. Furthermore, you can store additional information, such as how often to play each move and which lines to avoid entirely. You can also take advantage of transpositions.

I have got a Visual Basic implementation of the GOB source code, so I am convinced it is working.

regards,
Edmund
Bill Rogers

Post by Bill Rogers »

Thanks for the reply. I don't store any of the things that you mentioned when I create opening books. Although I store the data just like TSCP, I examine only winning games and take the best opening lines for both Black and White. Therefore I don't need any scores: the best moves have already been determined, and in my opinion that is one of the reasons why those colors won in the first place. This is not to say that their midgame and endgame logic doesn't come into play, but the opening lines that they chose led to their winning positions.
Now I can expect a bunch of people to find exceptions to my way of thinking, but it's my story and I am sticking to it, as the song goes.
I wouldn't mind seeing your Basic code, if you don't mind, as I might be able to find a way to make it work for me. It would not be released or made public in any way or form.
Thanks again for all the information.
Bill
Edmund

Post by Edmund »

For each move you can adjust the weight, so you can set how often to play certain openings. The idea is to keep some variety of play: not always the same (best) line, but still favoring the better moves.
The other advantage, which I have already pointed out, is the ability to use transpositions.
Finally, if you want to edit large opening books manually, it is very hard to do so in the plain text format.

Concerning the Visual Basic code: I programmed my "Opening Book Manager", which is available for download, in VB6. The code has no comments at all and is very messy, so I will only post the main ideas. Please let me know if you have any specific questions. I would be happy to help if I can.

Each entry consists of the following data:
a 6-byte hash of the position, a 2-byte hash of the move, 2 bytes for the value of the position, 1 byte for certain flags, and 1 byte for the weight = 12 bytes.
Each move in the opening book is represented by one such entry, so the whole book file consists of many 12-byte blocks. Note that you cannot read these in a text editor.

Code:

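Type tentry
    hash1(0 To 5) As Byte   ' 6-byte hash of the position
    hash2(0 To 1) As Byte   ' 2-byte hash of the move
    value As Integer        ' 2-byte value for the position
    user As Byte            ' flags
    weight As Byte          ' weight of the move
End Type                    ' 12 bytes in total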
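
For readers who do not use VB6, the same 12-byte record can be sketched with Python's `struct` module. This is only an illustration under assumptions: little-endian byte order, no padding, and `value` as a signed 16-bit number (which is what a VB6 Integer is). The helper names `pack_entry` and `unpack_entry` are made up here and are not part of GOB.

```python
import struct

# Assumed layout, in the order of the VB6 Type:
#   6s  hash1  (6 raw bytes)
#   2s  hash2  (2 raw bytes)
#   h   value  (signed 16-bit)
#   B   user   (1 byte of flags)
#   B   weight (1 byte)
# "<" means little-endian with no padding (an assumption about the file).
ENTRY = struct.Struct("<6s2shBB")

def pack_entry(hash1, hash2, value, user, weight):
    """Build one 12-byte record from its five fields."""
    return ENTRY.pack(hash1, hash2, value, user, weight)

def unpack_entry(raw):
    """Split one 12-byte record back into its five fields."""
    return ENTRY.unpack(raw)

raw = pack_entry(b"\x01\x02\x03\x04\x05\x06", b"\xaa\xbb", -20, 0, 50)
fields = unpack_entry(raw)
```

The size check `ENTRY.size == 12` is a quick way to confirm that the format string matches the 12 bytes described above.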

The next step is to load the opening book:

Code:

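Public entry() As tentry
Public centry As Long

Public Sub LoadData(path As String)

    Dim rec As tentry   ' only used to get the record length (12 bytes)
    Dim i As Long       ' Long: a 16-bit Integer cannot count a large book

    centry = 0

    Open path For Random As #1 Len = Len(rec)
        centry = LOF(1) \ Len(rec)      ' number of complete entries in the file
        If centry > 0 Then
            ReDim entry(centry - 1)     ' size the array to the file, not a fixed 10000000
            For i = 0 To centry - 1
                Get #1, i + 1, entry(i) ' record numbers are 1-based
            Next i
        End If
    Close #1

End Sub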
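
A comparable loader in Python might look like the sketch below. It again assumes the little-endian, unpadded 12-byte layout, and the function names `parse_book` and `load_book` are hypothetical. A trailing partial record, if any, is simply ignored.

```python
import struct

# Assumed record layout: hash1 (6 bytes), hash2 (2 bytes),
# value (signed 16-bit), user (1 byte), weight (1 byte).
ENTRY = struct.Struct("<6s2shBB")

def parse_book(data):
    """Split raw book bytes into a list of (hash1, hash2, value, user, weight)."""
    usable = len(data) - len(data) % ENTRY.size  # drop an incomplete last record
    return list(ENTRY.iter_unpack(data[:usable]))

def load_book(path):
    """Read a whole book file and parse it into entries."""
    with open(path, "rb") as f:
        return parse_book(f.read())
```

Reading the file in one go and slicing it into records avoids a separate read call per entry.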

Finally, when you want to check whether there is a book move, you generate all moves for the position and compute the hash for each move. Then you search the entries entry(i) for a matching hash. In GOB all entries are sorted, so a binary search finds the position in about log2(n) comparisons, where n is the number of entries. Then, for each move that is in the book, you calculate its probability of being chosen from the stored information for all moves. A simple choice would be weight / sum_of_weights_from_all_moves. You then generate a random number and use it to select the move to play.
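
The lookup and selection steps can be sketched in Python as follows. This is a minimal illustration, not GOB's code. It assumes the book has been reduced to a sorted list `keys` with a parallel list `weights`, and that the caller supplies, for every legal move, the key that move would have in the book. How that key is computed from the position and move hashes is not shown in this thread, so it is left to the caller. The names `find_entry` and `pick_book_move` are made up.

```python
import bisect
import random

def find_entry(keys, key):
    """Binary search in a sorted list of keys; returns the index or -1."""
    i = bisect.bisect_left(keys, key)
    return i if i < len(keys) and keys[i] == key else -1

def pick_book_move(keys, weights, candidates, rng=random):
    """Pick a book move with probability weight / sum_of_weights.

    candidates: list of (move, key) pairs, one per legal move, where key is
    computed by the caller (assumption). Returns None if no move is in the book.
    """
    found = []
    for move, key in candidates:
        i = find_entry(keys, key)
        if i >= 0 and weights[i] > 0:  # weight 0 is treated as "never play"
            found.append((move, weights[i]))
    if not found:
        return None
    moves, move_weights = zip(*found)
    return rng.choices(moves, weights=move_weights, k=1)[0]

keys = [b"aa", b"bb", b"cc"]      # toy keys, already sorted
weights = [5, 0, 3]
choice = pick_book_move(keys, weights, [("e4", b"aa"), ("d4", b"bb"), ("h4", b"xx")])
```

In the toy data only `e4` is both in the book and has a non-zero weight, so it is always the move selected.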