*First release* Giraffe, a new engine based on deep learning

matthewlai · Post by **matthewlai** » Wed Jul 08, 2015 7:23 pm

Hello!

I have been working on a new engine for the past few months as my Master's thesis, and I think it's about time for the first release, since while it's not very strong, it sure is entertaining!

The goal is to create a chess engine that learns how to play chess using temporal-difference reinforcement learning, with as little hand-coded chess knowledge as possible.

The first stage of the project is to replace the evaluation function with a deep neural network, and use TDLeaf[1] to train it. That has been done, and is where the project is at right now.

The current evaluation architecture is a neural network with 3 hidden layers and about 270,000 parameters. It is bootstrapped using a material-only eval, and trained on millions of random positions from the CCRL dump. For each position, it first does a search and then plays against itself for a few more moves. It then adjusts the model so that the eval of the original position (more precisely, the leaf of the search from the original position) gets closer to the evaluations of subsequent positions.
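As a rough illustration of the update described above, here is a minimal Python sketch of a TD-style error for the first position of a short self-play sequence. It is not Giraffe's code: the decay parameter `lam`, the learning rate, the function names, and the linear model standing in for the neural network are all assumptions made for the example.

```python
from typing import List, Sequence


def td_error_for_first_position(leaf_evals: Sequence[float], lam: float) -> float:
    """Discounted sum of temporal differences, seen from the first position.

    leaf_evals[0] is the eval of the search leaf from the original position;
    leaf_evals[1:] are the leaf evals of the positions reached in self-play.
    lam (an assumed parameter) controls how much later differences count.
    """
    error = 0.0
    discount = 1.0
    for current, following in zip(leaf_evals, leaf_evals[1:]):
        error += discount * (following - current)
        discount *= lam
    return error


def update_linear_eval(weights: List[float], features: List[float],
                       error: float, learning_rate: float) -> List[float]:
    """One gradient step for a linear eval (a stand-in for the neural network).

    For a linear model the gradient of the eval with respect to each weight
    is the matching feature, so this moves the eval in the direction of error.
    """
    return [w + learning_rate * error * f for w, f in zip(weights, features)]


# Example: the original leaf scored 0.1, later positions scored 0.3 and 0.2.
example_error = td_error_for_first_position([0.1, 0.3, 0.2], lam=0.5)
new_weights = update_linear_eval([0.0, 1.0], [1.0, 2.0], example_error, 0.5)
```

With `lam=1.0` the error reduces to the difference between the last and the first eval, and with `lam=0.0` only the immediately following position counts.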

I am using the Strategic Test Suite to test its positional knowledge. With the initial neural net (material-only) it scores about 4500/15000 at 0.3s per move. After about 8 hours of training (160 CPU-hours), it now scores about 6100/15000. I have played a few games against it, and it's clear that it has learned a thing or two about positions!

The main problem with evaluation right now is that I am only using 1 evaluation function for all phases of the game. That's why it tends to over-extend pawns and get the queen out too early in the opening, and it probably doesn't know how to play endgames (that hasn't been tested). Statistically speaking, most training positions are from the middle game, so the current evaluation function is pretty heavily tuned for the middle game.

I have some ideas on how to fix that, and that's what I am working on right now.

A few things that may be of interest:

Eval scores are not in centipawns. They are probability-based, and nowhere near linear in material. 10000 means it thinks white will win for sure, -10000 means it thinks black will win for sure, etc. If search actually finds a mate it will output (30000 - moves) or (-30000 + moves), like normal engines. Score from the start position is about 2000. Don't be alarmed. It means the engine thinks white has roughly a 60% chance of winning (whether that actually makes sense or not is another matter).
Searches are slow (about 5 seconds to depth 7), because eval is slow. This is intentional. The search is also very basic, with only alpha-beta, PVS, null move, killer moves, and a transposition table. I am intentionally not including any forward pruning (besides null move) or move-count based stuff (LMR), because I am planning to use neural nets to make those kinds of decisions later on. Profiling shows that it spends 97% of the time in eval, which is pretty nice because that means I don't have to bother optimizing anything else at all...
There is no support for opening books or EGTBs. This is intentional. I want it to learn how to play openings and end games.
No support for pondering or SMP yet. This is... not intentional. Just lack of time.
Sorry, the xboard protocol part hasn't been tested very much, since I do all testing/training using my own tools. I just tried it with xboard and Arena, though, and it does seem to work.
I have absolutely no idea how strong it is! It's much stronger than me (human) for sure, but that's not saying much.
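
To make the score scale concrete, here is a minimal Python sketch of how these scores could be read. It assumes a linear mapping between score and white's win probability, which matches the examples above (10000 for a certain white win, about 2000 for roughly 60%) but is only an assumption. It also assumes that any score above 10000 in magnitude is a mate score. The function names are hypothetical and not part of the engine.

```python
from typing import Optional, Tuple

WIN_SCORE = 10000   # engine is certain white wins
MATE_SCORE = 30000  # base value used for mate scores


def score_to_white_win_probability(score: int) -> float:
    """Map a non-mate score to white's win probability, assuming linearity."""
    clamped = max(-WIN_SCORE, min(WIN_SCORE, score))
    return (clamped / WIN_SCORE + 1.0) / 2.0


def decode_mate_score(score: int) -> Optional[Tuple[str, int]]:
    """Return (winner, moves) for a mate score, or None for a normal eval.

    Assumes anything beyond +/-WIN_SCORE is a mate score of the form
    (30000 - moves) for white or (-30000 + moves) for black.
    """
    if abs(score) <= WIN_SCORE:
        return None
    winner = "white" if score > 0 else "black"
    return winner, MATE_SCORE - abs(score)


start_position_probability = score_to_white_win_probability(2000)  # about 0.6
```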

It would be highly appreciated if someone wants to give it a try and let me know what you think!

Download:
https://bitbucket.org/waterreaction/gir ... 150708.zip

It is closed source for now while the thesis is ongoing, but will be released under the GPL (or another open source license) at the conclusion of the thesis in October, if not earlier.

It is Windows-only for now even though I do all my testing and training on Linux, because releasing binaries for Linux is not very straightforward (glibc versions, etc), and I don't think many people here use Linux as their primary OS anyway.

Acknowledgements:

Professor Duncan Gillies, thesis advisor
Imperial College High Performance Computing Service, for providing all the computational power required for this project

Libraries/Code Borrowed:

Eigen linear algebra library (http://eigen.tuxfamily.org/index.php?title=Main_Page)
Pradyumna Kannan's magic move generator

[1] http://arxiv.org/abs/cs/9901001

fern · Post by **fern** » Wed Jul 08, 2015 8:25 pm

How can it be played?
Uci, Winboard, any gui?

Fern

matthewlai · Post by **matthewlai** » Wed Jul 08, 2015 8:26 pm

fern wrote:How can it be played?
Uci, Winboard, any gui?

Fern

It uses winboard/xboard protocol. Any GUI supporting the protocol should work, though I have only tested xboard and Arena.

And make sure eval.net (also in the archive) is in the working directory of the engine!

cdani · Post by **cdani** » Wed Jul 08, 2015 8:28 pm

Congratulations! Sure it has a big future!!

matthewlai · Post by **matthewlai** » Wed Jul 08, 2015 8:29 pm

cdani wrote:Congratulations! Sure it has a big future!!

Thanks!! Not sure about big future yet, but I certainly am having a heck of a lot of fun building it!

op12no2 · Post by **op12no2** » Wed Jul 08, 2015 8:43 pm

Fab to see novel approaches; good luck.

I've added it to a gauntlet that I'm running (2300-2400) 'cos a couple of the engines (mine) don't have any end game knowledge. I'll point you at the PGN file.

It's weird interpreting the score...

matthewlai · Post by **matthewlai** » Wed Jul 08, 2015 8:50 pm

op12no2 wrote:Fab to see novel approaches; good luck.

I've added it to a gauntlet that I'm running (2300-2400) 'cos a couple of the engines (mine) don't have any end game knowledge. I'll point you at the PGN file.

It's weird interpreting the score...

Awesome, thanks! My guess is it's quite a bit below 2300, since it is very weak tactically. Even I can out-tactic it sometimes... which is unusual for human-computer matches nowadays.

fern · Post by **fern** » Wed Jul 08, 2015 8:54 pm

No way. It freezes.
I will wait for your next effort.

truly yours
Fern

matthewlai · Post by **matthewlai** » Wed Jul 08, 2015 8:55 pm

fern wrote:No way. It freezes.
I will wait for your next effort.

truly yours
Fern

Which GUI did you use and do you have the log?

op12no2 · Post by **op12no2** » Wed Jul 08, 2015 8:57 pm

Working fine in Arena using WB2 protocol. Forgot to say - TC for my gauntlet is 5'+5".
