What is Leela doing wrong?
https://arxiv.org/abs/1902.04522
-Carl
ELF OpenGo: An Open Reimplementation of AlphaZero
-
- Location: Berkeley, CA
-
- Full name: Sven Steppenwolf
Re: ELF OpenGo: An Open Reimplementation of AlphaZero
Great, there is already a binary available: https://facebook.ai/developers/tools/elf-opengo
Waiting for porting this ELF to chess...
-
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: ELF OpenGo: An Open Reimplementation of AlphaZero
Steppenwolf wrote: ↑Wed Feb 13, 2019 7:19 pm
Great, there is already a binary available: https://facebook.ai/developers/tools/elf-opengo
Waiting for porting this ELF to chess...
This is even more interesting for the programmers section:
https://github.com/pytorch/ELF
https://rwbc-chess.de
trollwatch:
Talkchess nowadays is a joke - it is full of trolls/idiots/people stuck in the pleistocene > 80% of the posts fall into this category...
-
- Location: Ethiopia
Re: ELF OpenGo: An Open Reimplementation of AlphaZero
Notes I took from glancing at the paper
a) CPUCT = 1.5
b) Virtual loss = 1
c) Ladders (tactics in Go) are hard to learn
d) Batch normalization moment staleness. A technical issue I don't fully understand, but for which they have provided a plugin in torch
e) Value head only, which is something I used to do, gives a weak engine. They accidentally found this out when they fixed the policy weight to 1/362 by mistake.
f) Game resignation during selfplay training is important. It will focus the net on learning the opening/middlegame (the most important parts of the game) faster.

The full quote:

Dominating value gradients: We performed an unintentional ablation study in which we set the cross entropy coefficient to 1/362 during backpropagation. This change will train the value network much faster than the policy network. We observe that ELF OpenGo can still achieve a strength of around amateur dan level. Further progress is extremely slow, likely due to the minimal gradient from policy network. This suggests that any MCTS augmented with only a value heuristic has a relatively low skill ceiling in Go.
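To make (a) and (b) concrete, here is a minimal Python sketch of AlphaZero-style PUCT selection with a virtual loss, under the usual Q + U formula. All names (Node, select, etc.) are illustrative, not ELF OpenGo's actual code; this is just how those two parameters typically plug into the search:

```python
import math

CPUCT = 1.5        # exploration constant, as noted from the paper
VIRTUAL_LOSS = 1   # discourages parallel threads from picking the same path

class Node:
    def __init__(self, prior):
        self.prior = prior       # P(s, a) from the policy head
        self.visits = 0          # N(s, a)
        self.value_sum = 0.0     # W(s, a)
        self.virtual = 0         # virtual losses pending backup

    def q(self):
        n = self.visits + self.virtual
        if n == 0:
            return 0.0
        # each pending virtual loss counts as a lost playout (-1)
        return (self.value_sum - VIRTUAL_LOSS * self.virtual) / n

def select(children):
    """Pick the child maximizing Q + U, then apply a virtual loss to it."""
    total = sum(c.visits + c.virtual for c in children)
    def score(c):
        u = CPUCT * c.prior * math.sqrt(total) / (1 + c.visits + c.virtual)
        return c.q() + u
    best = max(children, key=score)
    best.virtual += 1   # undone, and replaced by the real result, at backup
    return best
```

With VIRTUAL_LOSS = 1, a node that one search thread has already descended into looks one lost playout worse to the other threads, so they spread out over the tree instead of piling onto the same line.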