
Re: What I want to know is can this be ported to a chess eng

Posted: Thu Apr 07, 2016 1:13 am
by jhellis3
GPUs are constantly changing....

Not enough is known about Pascal yet to say whether it will be a useful target for something like chess.

But we are certainly getting closer... I would guess that sometime before 2022 GPUs will become not only viable but potentially necessary targets for a top-tier chess engine.

Re: What I want to know is can this be ported to a chess eng

Posted: Thu Apr 07, 2016 12:43 pm
by yurikvelo
Chess relies heavily on UNIFORM memory access.

Even NUMA memory access on a multi-socket CPU incurs a penalty.

A GPU is heavily NUMA by nature. GPU speed comes not from more advanced technology but from heavy parallelism, with hundreds or thousands of threads each working on an isolated subset of slow memory.

A brute-force minimax tree cannot be split into 1000 subsets with independent memory.

No matter whether it is 2022 or 2042: if GPUs have an advantage over CPUs, it will only be thanks to their heavily NUMA architecture.

Distributed computing over a PC cluster is much easier and more promising.

http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf
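
One way to read the point about splitting the tree is that alpha-beta pruning depends on bounds found in earlier siblings. The sketch below is only an illustration on a made-up toy tree (`TREE`, `alphabeta` and the leaf counters are hypothetical names, not taken from any engine in this thread): it searches the same tree once with a shared window and once with every root move searched in isolation, and counts the leaves visited.

```python
import math

def alphabeta(node, alpha, beta, maximizing, counter):
    """Plain alpha-beta over a nested-list tree; leaves are numbers."""
    if not isinstance(node, list):
        counter[0] += 1          # count every leaf we actually evaluate
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:    # cutoff: remaining siblings cannot matter
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best

# Toy tree (assumption for illustration): root is a max node, its children are min nodes.
TREE = [[3, 5], [2, 9], [1, 8]]

# Shared window: the bound found in the first root move prunes the later ones.
shared = [0]
value_shared = alphabeta(TREE, -math.inf, math.inf, True, shared)

# Isolated workers: each root move is searched with a full window, no shared bound.
independent = [0]
value_independent = max(
    alphabeta(child, -math.inf, math.inf, False, independent) for child in TREE
)

print(value_shared, shared[0])
print(value_independent, independent[0])
```

Both searches return the same value, but the isolated version evaluates more leaves because no subtree can use a bound found in another. Whether that overhead outweighs the extra throughput of a GPU is exactly what the thread is arguing about.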

Re: What I want to know is can this be ported to a chess eng

Posted: Thu Apr 07, 2016 2:55 pm
by Werewolf
yurikvelo wrote: Chess relies heavily on UNIFORM memory access.

A GPU is heavily NUMA by nature. GPU speed comes not from more advanced technology but from heavy parallelism, with hundreds or thousands of threads each working on an isolated subset of slow memory.

A brute-force minimax tree cannot be split into 1000 subsets with independent memory.
What about Monte Carlo?

Re: What I want to know is can this be ported to a chess eng

Posted: Thu Apr 07, 2016 3:12 pm
by jhellis3
yurikvelo wrote: No matter whether it is 2022 or 2042: if GPUs have an advantage over CPUs, it will only be thanks to their heavily NUMA architecture.

Distributed computing over a PC cluster is much easier and more promising.
HINT: This is wrong.

Re: What I want to know is can this be ported to a chess eng

Posted: Fri Apr 08, 2016 11:55 am
by WuShock
What about this, for neural network chess?

http://www.extremetech.com/extreme/2257 ... onic-brain

Re: What I want to know is can this be ported to a chess eng

Posted: Fri Apr 08, 2016 1:04 pm
by smatovic
What caught my eye is the "double precision" calculations.
What caught my eye is the support for native "half precision" calculations.
Mixed-Precision in GPUs

It looks like AMD, Intel and Nvidia are adding native 16-bit float and integer support to their devices:

AMD with GCN 1.2 (GCN Gen3)
http://www.anandtech.com/show/8460/amd- ... 5-review/2

Intel with Skylake IGP
https://software.intel.com/sites/defaul ... 9-v1d0.pdf

Nvidia with the upcoming Pascal architecture
http://blogs.nvidia.com/blog/2015/03/17/pascal/
src:
http://web.archive.org/web/201602131457 ... s.app26.de
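
For readers unfamiliar with "half precision": it is the 16-bit IEEE 754 floating-point format. The snippet below is only a CPU-side illustration in Python (the helper name `to_half` is made up here, and nothing in it touches a GPU); it round-trips values through the 16-bit format using the standard `struct` module to show how coarse the format is.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

half_size_bytes = len(struct.pack("<e", 1.0))  # storage size of one half float

print(half_size_bytes)   # bytes per value
print(to_half(0.5))      # powers of two are exactly representable
print(to_half(0.1))      # 0.1 is rounded to its nearest half-precision neighbour
print(to_half(2049.0))   # above 2048 only even integers are representable
```

The trade-off is half the storage and bandwidth per value in exchange for roughly three decimal digits of precision.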

--
Srdja

Re: What I want to know is can this be ported to a chess eng

Posted: Fri Apr 08, 2016 1:11 pm
by smatovic
5 million nps? That's good for those old cards.
I guess with some further tuning it could be 50+ Mnps;
my implementation relies on unoptimized memory access patterns.

--
Srdja

Re: What I want to know is can this be ported to a chess eng

Posted: Fri Apr 08, 2016 1:17 pm
by smatovic
Isn't there ANYTHING that can be done on the GPU?
just my 2 cents...
YBWC vs. RBFMS vs. MCTS vs. MCAB

Porting a classic chess engine approach with a parallel alpha-beta algorithm like YBWC to a GPU architecture would take a significant amount of time, if it is even possible to port all the well-known computer chess techniques in a straightforward way. And it is an open question whether the Elo gain from more computed nodes per second would be eaten up again by a higher branching factor due to a simpler implementation.

Zeta 098 and 097 use a Randomized Best-First MiniMax Search, but my implementation makes excessive use of global memory and scales poorly.

At the very beginning of the project it was clear that a Monte Carlo Tree Search would fit GPUs best. But so far there is no known engine that makes MCTS work well for chess.
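
A rough illustration of why Monte Carlo methods are attractive for massively parallel hardware: random playouts do not need to share any state while they run. The sketch below is not MCTS and not chess. It is flat Monte Carlo on a made-up stone-taking game, with hypothetical names (`random_playout`, `worker`, `estimate_win_rate`), and the "workers" run one after another on the CPU purely to show that each one only needs its own seed and counter.

```python
import random

def random_playout(stones, rng):
    """Toy game: players alternately take 1-3 stones; taking the last stone wins.
    Returns 1 if the side to move at the start wins the random game, else 0."""
    player = 0
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return 1 if player == 0 else 0
        player ^= 1

def worker(stones, playouts, seed):
    """One independent worker: private RNG, private win counter, no shared state."""
    rng = random.Random(seed)
    return sum(random_playout(stones, rng) for _ in range(playouts))

def estimate_win_rate(stones, workers=8, playouts_per_worker=500):
    """Combine the per-worker results only at the very end."""
    wins = sum(worker(stones, playouts_per_worker, seed) for seed in range(workers))
    return wins / (workers * playouts_per_worker)

print(estimate_win_rate(10))
```

The only communication is the final sum, which is the property that maps well onto thousands of GPU threads. Making the playouts informative enough for chess is the unsolved part mentioned above.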

What is left, except trying to port a classic approach?

I could improve the performance of the Best-First search significantly by switching from global memory to local memory, and I could remove the randomness... Another alternative would be to switch to MCAB, Monte Carlo Alpha-Beta...
src:
http://web.archive.org/web/201602131457 ... s.app26.de

--
Srdja