*First release* Giraffe, a new engine based on deep learning

matthewlai · Post by **matthewlai** » Sun Aug 09, 2015 2:45 am

clumma wrote: Very interesting project. I have two questions:

1. Why did you bootstrap from material-only eval? Doesn't the CCRL dump contain evals by the best engines? Why not train on those?

2. Are you familiar with the idea of model compression -- training a small network to mimic a larger one? E.g.
http://arxiv.org/abs/1312.6184
http://arxiv.org/abs/1503.02531
and do you think this could be used to speed up Giraffe's eval?

Cheers,

-Carl

Thanks!

1. That was the original plan. I originally modified Stockfish to label positions for bootstrapping. I ended up not doing that for a more philosophical reason than a technical one - I want to see what it can do with as little bootstrapping knowledge as possible. From my experiments, I have already found that even the exact material bootstrap weights are not really important, as long as the relative order of the values is correct.
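For readers who want to see what a material-only bootstrap label looks like, here is a minimal sketch. It is not Giraffe's code: the function name `material_eval` and the piece values (the conventional 1/3/3/5/9) are assumptions for illustration. Only the relative order of the values matters for the point made above.

```python
# Conventional piece values in pawns. These are assumed for illustration
# and are not taken from Giraffe.
PIECE_VALUES = {"p": 1.0, "n": 3.0, "b": 3.0, "r": 5.0, "q": 9.0, "k": 0.0}


def material_eval(fen: str) -> float:
    """Return the material balance from White's point of view, in pawns."""
    placement = fen.split()[0]  # the first FEN field is the piece placement
    score = 0.0
    for ch in placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits (empty squares) and '/' rank separators
        # Uppercase letters are White pieces, lowercase are Black pieces.
        score += value if ch.isupper() else -value
    return score
```

Calling `material_eval` on the FEN of each training position would give the kind of coarse target that a network could be fitted to before any self-play training starts.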

Eventually I want to switch to random initialization. It would probably take much longer to train, but I wouldn't be surprised if that actually works. I just don't have the spare CPU cycles right now. I have access to about 600 CPU cores across about 150 quad-core machines, but for parallel back-propagation I need large shared-memory systems, and I only have access to 2-3 20-core machines. And they are all doing more useful stuff right now.

2. I am aware of model compression. Really cool stuff, isn't it?

I was just talking to my supervisor about it a while ago. I probably won't have time to do it for the thesis, since there are still a million things I want to try, and not much time left.

It's definitely on my list of things to investigate later, though.
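As a rough illustration of the idea in question 2, the sketch below trains a small network to reproduce the outputs of a larger one by plain regression on the larger network's outputs. Everything here is hypothetical: the network sizes, the helper names (`make_net`, `forward`, `distill`), and the random stand-in "teacher" have nothing to do with Giraffe's actual evaluator.

```python
import math
import random


def make_net(n_in, n_hidden, rng):
    """One-hidden-layer tanh network with a linear scalar output."""
    return {
        "w1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return (output, hidden activations) for input vector x."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    out = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return out, hidden


def mean_sq_error(student, teacher, inputs):
    return sum((forward(student, x)[0] - forward(teacher, x)[0]) ** 2
               for x in inputs) / len(inputs)


def distill(student, teacher, inputs, epochs=200, lr=0.05):
    """Plain SGD on the squared error between student and teacher outputs."""
    for _ in range(epochs):
        for x in inputs:
            target, _ = forward(teacher, x)  # the teacher is never updated
            out, hidden = forward(student, x)
            err = out - target
            for j, h in enumerate(hidden):
                # Gradient w.r.t. the hidden pre-activation (back through tanh),
                # computed before w2[j] is changed.
                grad_pre = err * student["w2"][j] * (1.0 - h * h)
                student["w2"][j] -= lr * err * h
                student["b1"][j] -= lr * grad_pre
                for i, xi in enumerate(x):
                    student["w1"][j][i] -= lr * grad_pre * xi
            student["b2"] -= lr * err


rng = random.Random(0)
teacher = make_net(n_in=4, n_hidden=16, rng=rng)  # stand-in for the large model
student = make_net(n_in=4, n_hidden=3, rng=rng)   # smaller, cheaper model
inputs = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(64)]

loss_before = mean_sq_error(student, teacher, inputs)
distill(student, teacher, inputs)
loss_after = mean_sq_error(student, teacher, inputs)
```

The point for an engine is that the student needs far fewer multiply-adds per position than the teacher, so if it mimics the teacher closely enough, the eval gets cheaper at little cost in accuracy.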

matthewlai · Post by **matthewlai** » Sun Aug 09, 2015 2:48 am

melajara wrote:
One big problem with Giraffe right now is that she always wants to get the king out early. That's probably because positions in endgames are trained faster (because they are close to the final reward), so the eval is biased for endgames, where the king should get out.

Could you not compensate for this training bias by providing more context parameters, actually modelling what we, as humans, understand as the 3 game phases in chess (opening, middlegame, endgame)?

That way, development moves (but not king moves besides castling, lol) would be favored in the opening, and king centralization (a strong point in Giraffe's endgame play) reserved for the endgame?

That is indeed what I did, more or less. The new release (http://talkchess.com/forum/viewtopic.php?t=57142) fixed most of the game-phase-confusion problems. It doesn't usually do a king hike in the opening anymore, and it does castle reasonably now.
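To make the "context parameter" idea concrete, here is one common way to turn remaining material into a single game-phase number that could be fed to an evaluator as an extra input. This is only a sketch of the general technique, not what the new Giraffe release does: the weights, the constant `MAX_PHASE`, and the function name `game_phase` are assumptions borrowed from the usual tapered-eval convention.

```python
# Phase weights per non-pawn piece type (assumed, tapered-eval style).
PHASE_WEIGHTS = {"n": 1, "b": 1, "r": 2, "q": 4}

# Total for the starting position:
# 4 knights * 1 + 4 bishops * 1 + 4 rooks * 2 + 2 queens * 4 = 24
MAX_PHASE = 24


def game_phase(fen: str) -> float:
    """Return 1.0 with all pieces on the board, falling to 0.0 with none."""
    placement = fen.split()[0]  # piece placement only, not side to move
    total = sum(PHASE_WEIGHTS.get(ch.lower(), 0) for ch in placement)
    # Promotions can push the total above the starting value, so clamp it.
    return min(total, MAX_PHASE) / MAX_PHASE
```

With a feature like this available, a learned eval has what it needs to treat an early king walk and an endgame king walk as different situations.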
