Since I develop in JavaScript, I train the neural network in NodeJS, with my own hand-written neural network code; I don't use a machine learning framework. This is very slow and by far not the best choice, which forced me to experiment with alternative solutions.
I wanted to get the best results as quickly as possible.
My rules:
1.) the smallest possible network -> for speed, and *
2.) fewer training examples -> faster learning, lower memory requirements
3.) shorter training time -> faster development
* more info for rule 1:
Using too many neurons in the hidden layers can result in several problems. First, too many neurons in the hidden layers may result in overfitting. Overfitting occurs when the neural network has so much information processing capacity that the limited amount of information contained in the training set is not enough to train all of the neurons in the hidden layers. The second problem can occur even when the training data is sufficient. An inordinately large number of neurons in the hidden layers can increase the time it takes to train the network.
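To make the scale concrete, here is a minimal sketch of what a forward pass through a small fully connected network like the 768x32x1 shape discussed in this thread could look like in plain JavaScript. The ReLU activation, the flat weight layout, and all names here are my own illustrative assumptions, not the actual tomitankChess code:

```javascript
// Minimal sketch of a forward pass through a 768x32x1 network.
// ReLU activation and row-major weight layout are assumptions.
const INPUT = 768; // e.g. 12 piece types x 64 squares, one-hot encoded
const HIDDEN = 32;

function forward(input, w1, b1, w2, b2) {
  // input: Float32Array(INPUT); w1: HIDDEN*INPUT (row-major); b1: HIDDEN
  // w2: HIDDEN; b2: 1
  const hidden = new Float32Array(HIDDEN);
  for (let h = 0; h < HIDDEN; h++) {
    let sum = b1[h];
    for (let i = 0; i < INPUT; i++) {
      sum += w1[h * INPUT + i] * input[i];
    }
    hidden[h] = Math.max(0, sum); // ReLU
  }
  let out = b2[0];
  for (let h = 0; h < HIDDEN; h++) {
    out += w2[h] * hidden[h];
  }
  return out; // single scalar evaluation score
}
```

With only 32 hidden neurons, one forward pass is roughly 768x32 multiply-adds, which is why such a small network stays fast even in interpreted JavaScript.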
Current results:
around +60 elo against tomitankChess 4.2 with a 768x32x1 network.
This network only augmented the HCE; it did not replace it.
This network - with a little exaggeration - only corrects bad positions. Because of this, far fewer epochs are needed (~6 epochs).
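The hybrid idea can be sketched like this: the network output is added to the hand-crafted evaluation (HCE) as a correction term rather than replacing it. The function names and stub bodies below are illustrative assumptions, not actual tomitankChess internals:

```javascript
// Sketch of a hybrid HCE + NN evaluation. Names and stubs are
// illustrative assumptions, not tomitankChess code.

function handCraftedEval(position) {
  // stub: a real engine computes the classic evaluation in centipawns here
  return position.material;
}

function nnCorrection(position) {
  // stub: a real engine runs the small 768x32x1 forward pass here
  return position.nnScore;
}

function evaluate(position) {
  // the small network only nudges the HCE score, mostly fixing
  // positions the hand-crafted terms misjudge
  return handCraftedEval(position) + nnCorrection(position);
}
```

Because the network only has to learn the HCE's residual error rather than the full evaluation function, it can converge in far fewer epochs.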
I trained with only 2.7M examples!
I am convinced that better results can be achieved with 10-30 million or even more examples, without violating rules 1 and 3. Unfortunately, that is not an option for me. Not in NodeJs! It's pretty boring anyway..
I'll probably try removing the HCE and training with more epochs, but I don't think I'll get better results with a small network and 2.7M examples.
The authors of Halogen have probably already tried this.
I think this approach can be successful even with more powerful engines, even Stockfish. Network training, and thus development, could be accelerated.
For Andy:
And most importantly, the network could not be used in any other engine, since it would depend on the HCE.
-Tamás
NN faster and energy efficient training.
-
- Posts: 276
- Joined: Sat Mar 04, 2017 12:24 pm
- Location: Hungary
Re: NN faster and energy efficient training.
tomitank wrote: ↑Thu Dec 31, 2020 10:53 am
* more info for rule 1:
Using too many neurons in the hidden layers can result in several problems. First, too many neurons in the hidden layers may result in overfitting. Overfitting occurs when the neural network has so much information processing capacity that the limited amount of information contained in the training set is not enough to train all of the neurons in the hidden layers. The second problem can occur even when the training data is sufficient. An inordinately large number of neurons in the hidden layers can increase the time it takes to train the network.

This applies to the neurons in the hidden layer.
-
- Posts: 70
- Joined: Tue Dec 31, 2019 2:52 am
- Full name: Kieren Pearson
Re: NN faster and energy efficient training.
Hi, Halogen author here. 768x32x1 was the shape of the first network that was able to completely replace my old HCE. Gaining +60 elo already with a hybrid approach is very impressive for so few training positions. I was able to replace my HCE completely and since then have gained hundreds of elo points using just a NN for evaluation. This happened when Halogen was rated around 2500. I see Tomitank is around 2800 which will make replacing the HCE completely a more daunting task.
People like to discuss if a hybrid approach or just using only a NN is better, but it really doesn't matter. If you're gaining elo, you're heading in the right direction.
Keep up the impressive work.
-
- Posts: 276
- Joined: Sat Mar 04, 2017 12:24 pm
- Location: Hungary
Re: NN faster and energy efficient training.
Hi!
Kieren Pearson wrote: ↑Fri Jan 01, 2021 3:19 am
Hi, Halogen author here. 768x32x1 was the shape of the first network that was able to completely replace my old HCE.

How strong was this network? How many examples did you use for training?

Kieren Pearson wrote: ↑Fri Jan 01, 2021 3:19 am
Gaining +60 elo already with a hybrid approach is very impressive for so few training positions.

Thanks, and that would be nearly 100 elo in C. I'm very happy!
-
- Posts: 70
- Joined: Tue Dec 31, 2019 2:52 am
- Full name: Kieren Pearson
Re: NN faster and energy efficient training.
tomitank wrote: ↑Fri Jan 01, 2021 11:11 am
Hi!
How strong was this network? How many examples did you use for training?

Well, my HCE was pretty awful, so I'm guessing my 768x32x1 network wasn't very strong. I don't know exactly how many positions it was trained with, but it was likely in the low millions or possibly tens of millions.