Understanding Training against Q as Knowledge Distillation.

Post by AdminX »

Interesting write-up on the LC0 blog:

Knowledge Distillation (KD) is a technique where there are two neural networks at play: a teacher network and a student network. The teacher network is usually a fixed, fully trained network, perhaps bigger than the student network. Through KD, the goal is usually to produce a student network smaller than the teacher -- which allows for faster inference -- while still encoding the same "knowledge" within the network; the teacher teaches its knowledge to the student. When training the student network, instead of training with the dataset labels as targets (in our case the policy distribution and the value output), the student is trained to match the outputs of the teacher.


http://blog.lczero.org/2018/10/understa ... .html#more
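
For anyone curious about the mechanics, here is a minimal PyTorch sketch of a distillation-style loss in the spirit of the quoted passage. All names, shapes, and the value_weight knob are my own illustration, not LC0's actual training code; it just shows the student being trained against the teacher's policy and value outputs instead of the dataset labels.

```python
# Hypothetical sketch of a knowledge-distillation loss, assuming LC0-style
# network heads: a policy distribution over moves and a scalar value output.
# In a real setup the teacher outputs would come from teacher(inputs) under
# torch.no_grad(); random tensors stand in for them here.
import torch
import torch.nn.functional as F

def distillation_loss(student_policy_logits, student_value,
                      teacher_policy, teacher_value, value_weight=1.0):
    """Train the student to match the teacher's outputs instead of dataset labels.

    student_policy_logits: (batch, num_moves) raw logits from the student.
    student_value:         (batch,) scalar value prediction from the student.
    teacher_policy:        (batch, num_moves) probability distribution from the teacher.
    teacher_value:         (batch,) scalar value output from the teacher.
    """
    # Cross-entropy between the teacher's soft policy targets and the
    # student's predicted distribution (the usual KD policy term).
    policy_loss = -(teacher_policy
                    * F.log_softmax(student_policy_logits, dim=1)).sum(dim=1).mean()
    # Mean-squared error on the value head: the student's value chases the teacher's.
    value_loss = F.mse_loss(student_value, teacher_value)
    return policy_loss + value_weight * value_loss

# Toy usage with random tensors standing in for real network outputs.
if __name__ == "__main__":
    batch, num_moves = 4, 1858  # 1858 is the size of LC0's policy head
    teacher_policy = F.softmax(torch.randn(batch, num_moves), dim=1)
    teacher_value = torch.rand(batch) * 2 - 1          # values in [-1, 1]
    student_logits = torch.randn(batch, num_moves, requires_grad=True)
    student_value = torch.zeros(batch, requires_grad=True)
    loss = distillation_loss(student_logits, student_value,
                             teacher_policy, teacher_value)
    loss.backward()
    print(float(loss))
```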
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers