What I want to know is can this be ported to a chess engine?

Discussion of anything and everything relating to chess playing software and machines.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
jhellis3
Posts: 399
Joined: Fri Aug 16, 2013 10:36 pm

Re: What I want to know is can this be ported to a chess eng

Post by jhellis3 » Thu Apr 07, 2016 1:13 am

GPUs are constantly changing....

There is still not enough known about pascal to say if it will be a potentially useful target for something like chess.

But certainly we are getting closer... I would guess that sometime before 2022 GPUs will become not only viable but potentially necessary targets for a top tier chess engine.

User avatar
yurikvelo
Posts: 470
Joined: Sat Dec 06, 2014 12:53 pm

Re: What I want to know is can this be ported to a chess eng

Post by yurikvelo » Thu Apr 07, 2016 12:43 pm

Chess rely heavily on UNIFORM memory access.

Even NUMA memory access on multi-socket CPU gives penalty.

GPU is heavily-NUMA by nature. GPU speed is not due to advance in technology, but because heavy parrallelism with hundreds and thousands of threads, each working with isolated subset of slow memory.

Mini-Max bruteforce tree cannot be split in 1000 subsets of independent memory.

No matter 2022 or 2042 - if GPU will have advantages over CPU - only thanks to heavy-NUMA architecture.

Much easier and more promising is distributed computing over a PC cluster

http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf

Werewolf
Posts: 1193
Joined: Thu Sep 18, 2008 8:24 pm

Re: What I want to know is can this be ported to a chess eng

Post by Werewolf » Thu Apr 07, 2016 2:55 pm

yurikvelo wrote:Chess rely heavily on UNIFORM memory access.

GPU is heavily-NUMA by nature. GPU speed is not due to advance in technology, but because heavy parrallelism with hundreds and thousands of threads, each working with isolated subset of slow memory.

Mini-Max bruteforce tree cannot be split in 1000 subsets of independent memory.
What about Monte Carlo?

jhellis3
Posts: 399
Joined: Fri Aug 16, 2013 10:36 pm

Re: What I want to know is can this be ported to a chess eng

Post by jhellis3 » Thu Apr 07, 2016 3:12 pm

No matter 2022 or 2042 - if GPU will have advantages over CPU - only thanks to heavy-NUMA architecture.

Much easier and more promising is distributed computing over a PC cluster
HINT: This is wrong.

WuShock
Posts: 167
Joined: Thu Jul 19, 2007 1:13 am

Re: What I want to know is can this be ported to a chess eng

Post by WuShock » Fri Apr 08, 2016 11:55 am

What about this , for neural network chess ??

http://www.extremetech.com/extreme/2257 ... onic-brain

smatovic
Posts: 907
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: What I want to know is can this be ported to a chess eng

Post by smatovic » Fri Apr 08, 2016 1:04 pm

What caught my eye is the "double precision" calculations.
What caught my is the support for native "half precision" calculations.
Mixed-Precision in GPUs

Looks like AMD, Intel and Nvidia add native 16 bit float and integer support to their devices:

AMD with GCN 1.2 (GCN Gen3)
http://www.anandtech.com/show/8460/amd- ... 5-review/2

Intel with Skylake IGP
https://software.intel.com/sites/defaul ... 9-v1d0.pdf

Nvidia with the upcoming Pascal architecture
http://blogs.nvidia.com/blog/2015/03/17/pascal/
src:
http://web.archive.org/web/201602131457 ... s.app26.de

--
Srdja

smatovic
Posts: 907
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: What I want to know is can this be ported to a chess eng

Post by smatovic » Fri Apr 08, 2016 1:11 pm

5 million nps? That's good for those old cards.
I guess with some further tuning it could be 50+ mnps,
my implementation relies on not optimized memory patterns.

--
Srdja

smatovic
Posts: 907
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic
Contact:

Re: What I want to know is can this be ported to a chess eng

Post by smatovic » Fri Apr 08, 2016 1:17 pm

Isn't there ANYTHING that can be done on the GPU?
just my 2 cents...
YBWC vs. RBFMS vs. MCTS vs. MCAB

To port an classic chess engine approach with an parallel Alphabeta algorithm like YBWC to an GPU architecutre would take a significant bunch of time, if it is even possible to port all well known computer chess techniques straight forward. And it is questionable if an Elo gain, by more computed nodes per second, is eaten up again by an higher branchingfactor due to an simpler implementation.

Zeta 098 and 097 make use of an Randomized Best First MiniMax Search, but my implementation makes excessive use of Global Memory and scales poorly.

At the very beginning of the project it was clear, that an Monte Carlo Tree Search would fit best for gpus. But until now there is no known engine that could make MCTS work well for Chess.

What is left, except to try to port an classic approach?

I could improve the performance of the BestFist search significantly by switching from GlobalMemory to LocalMemory and i could remove the randomness...another alternative would be to switch to MCAB, Monte Carlo Alphabeta...
src:
http://web.archive.org/web/201602131457 ... s.app26.de

--
Srdja

Post Reply