ChessUSA.com TalkChess.com
Hosted by Your Move Chess & Games
 
uct on gpu
Daniel Shawul



Joined: 14 Mar 2006
Posts: 3060
Location: Ethiopia

Post subject: uct on gpu    Posted: Fri Feb 24, 2012 5:52 am

I now have a partially working version of UCT on the GPU. The tree is generated and stored on the device, so there is no overhead from CPU-GPU data transfer. Also, the way I implemented it, there doesn't seem to be a significant slowdown going from pure Monte Carlo to UCT.

a) The tree is stored in global memory, so it is shared across multiprocessors.
b) The minimum number of simulations you can ask for is right now 8192, as compared to 1 in the conventional CPU method. For my current GPU device with 14 multiprocessors, I launch 56 blocks with 64 threads each, so all of the blocks are active. Each thread does 128 simulations, for a total of 64x128=8192 simulations per block, and then the block consults the tree stored in global memory. Then a UCT_select is done and a different node could be selected for simulation.
c) I use a spin lock for each node, which can be taken from any of the 56 active blocks. I do not want to allow each thread to work independently on generating and grabbing nodes because that will increase the idle time on the spin locks. Later on I may allow a warp to be the smallest unit of work on the tree, but for now it is a block.
Another reason is that the game I am using right now has simple move generation, which allows a game to be run on registers alone. So I want to avoid global memory accesses as much as possible. Chess or Go will probably require some tables that won't fit in shared memory, but for now I want to keep it simple.
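To make the block-simulation scheme in (b) concrete, here is a minimal sketch in Python (not the actual GPU code) of a UCT step where every visit to a node adds a whole batch of playouts instead of one. The UCB1 formula, the exploration constant `c`, and the names `uct_select` and `backup_batch` are assumptions for illustration; the post does not say which selection formula is used. Only the batch size of 64x128 comes from the description above.

```python
import math

# Batch size from the post: 64 threads per block x 128 simulations per thread.
BATCH = 64 * 128

def uct_select(children, parent_visits, c=1.4):
    """Pick the child with the highest UCB1 score (assumed formula).

    children is a list of (wins, visits) pairs; unvisited children go first.
    """
    best, best_score = -1, -math.inf
    for i, (wins, visits) in enumerate(children):
        if visits == 0:
            return i
        score = wins / visits + c * math.sqrt(math.log(parent_visits) / visits)
        if score > best_score:
            best, best_score = i, score
    return best

def backup_batch(children, i, batch_wins, batch_size=BATCH):
    """Add the result of one block's batch of playouts to child i."""
    wins, visits = children[i]
    children[i] = (wins + batch_wins, visits + batch_size)

# Two children; each selection is followed by a full batch of playouts.
children = [(0, 0), (0, 0)]
first = uct_select(children, 0)
backup_batch(children, first, 5000)
second = uct_select(children, BATCH)
backup_batch(children, second, 3000)
third = uct_select(children, 2 * BATCH)
```

With visit counts jumping by 8192 at a time, the exploration term shrinks quickly, so under this assumed formula the selection is driven almost entirely by the win rate after the first pass over the children.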

What side effects (if any) do you think the block simulation will have on UCT? For the perft approximations we did before, I used 2 simulations per call, but the results were not added up.

Any ideas and suggestions are welcome.
Daniel

Edit: I tested 32 threads per block (1 warp!) and doubled the number of blocks, i.e. 112 x 32, and it performed only slightly worse: 13 sec vs 11 sec to complete 90 million simulations. The maximum number of active blocks per MP is 8, so this new setting is the best I can do: 8 x 14 = 112. In the test the tree was expanded up to depth 3 and 4600 nodes were added. The game is 8x8 Hex, so the BF goes like 64, 63, 62... The rate of tree growth is definitely slower than what would be possible if UCT_select were done after 1 simulation. Each thread does 128 cycles before consulting the shared tree, so I will try to lower that and see how it affects tree growth and speed. Maybe I am being paranoid about the global memory access; a few cycles could perform better, who knows.
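The numbers in this edit can be checked with a few lines of arithmetic. The sketch below uses only figures quoted above (14 MPs, 8 active blocks per MP, an 8x8 board); the helper names are made up for illustration, and the node count is for a full tree without merging transpositions, which is an assumption.

```python
MULTIPROCESSORS = 14
MAX_BLOCKS_PER_MP = 8

def total_threads(blocks, threads_per_block):
    """Threads in flight for a given launch configuration."""
    return blocks * threads_per_block

def nodes_up_to_depth(board_cells, depth):
    """Tree nodes below the root when the BF drops by one each ply."""
    total, level = 0, 1
    for d in range(depth):
        level *= board_cells - d
        total += level
    return total

max_blocks = MULTIPROCESSORS * MAX_BLOCKS_PER_MP   # 112
old_threads = total_threads(56, 64)                # 3584
new_threads = total_threads(112, 32)               # 3584
full_tree = nodes_up_to_depth(64, 3)               # 64 + 64*63 + 64*63*62
```

Both configurations run the same 3584 threads, so the only change is how many independent workers visit the tree. The 4600 nodes added are under 2% of the 254080 nodes a full depth-3 tree would hold.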

Edit2: Indeed I was paranoid! I lowered the number of cycles gradually from 128 down to 16 and it finished in the same time but with a much bigger tree. At 16 cycles it in fact used up all of the 2 MB of memory I reserved (about 65536 nodes). Each block now does 512 simulations before checking the tree. I think the CUDA warp execution model hides the latency so well ...
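For a rough sense of what lowering the cycle count changes, the sketch below works through the figures quoted above. It assumes the 112 x 32 configuration from the first edit (32 threads x 16 cycles matches the 512 simulations mentioned) and that 2 MB means 2 x 1024 x 1024 bytes; the function names are hypothetical.

```python
def sims_per_tree_visit(threads_per_block, cycles):
    """Playouts a block runs between two consultations of the tree."""
    return threads_per_block * cycles

def tree_visits(total_sims, threads_per_block, cycles):
    """Approximate number of times blocks consult the tree in one run."""
    return total_sims // sims_per_tree_visit(threads_per_block, cycles)

def bytes_per_node(pool_bytes, node_count):
    """Node size implied by a fixed memory pool and its node capacity."""
    return pool_bytes // node_count

TOTAL_SIMS = 90_000_000
visits_at_128 = tree_visits(TOTAL_SIMS, 32, 128)   # 4096 playouts per visit
visits_at_16 = tree_visits(TOTAL_SIMS, 32, 16)     # 512 playouts per visit
node_size = bytes_per_node(2 * 1024 * 1024, 65536)
```

Under these assumptions, going from 128 to 16 cycles means the blocks consult the tree about eight times as often over the same 90 million simulations, and the 2 MB pool works out to 32 bytes per node.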
_________________
https://sites.google.com/site/dshawul/
https://github.com/dshawul
Subject Author Date/Time
uct on gpu Daniel Shawul Fri Feb 24, 2012 5:52 am
      Re: uct on gpu Srdja Matovic Fri Feb 24, 2012 8:17 am
      Re: uct on gpu Srdja Matovic Fri Feb 24, 2012 8:45 am
            Re: uct on gpu Daniel Shawul Fri Feb 24, 2012 1:00 pm
                  Re: uct on gpu Srdja Matovic Fri Feb 24, 2012 1:44 pm
                        Re: uct on gpu Daniel Shawul Fri Feb 24, 2012 2:28 pm
                              Re: uct on gpu Srdja Matovic Fri Feb 24, 2012 3:04 pm
                                    Re: uct on gpu Daniel Shawul Fri Feb 24, 2012 3:53 pm
                  Re: uct on gpu david nash Sun Feb 26, 2012 12:42 am
                        Re: uct on gpu Daniel Shawul Thu Mar 08, 2012 1:26 pm
      Re: uct on gpu Daniel Shawul Sat Feb 25, 2012 8:30 pm
      100x speed up Daniel Shawul Mon Feb 27, 2012 8:02 pm
            Re: 100x speed up Robert Hyatt Thu Mar 15, 2012 2:13 pm
                  Re: 100x speed up Daniel Shawul Thu Mar 15, 2012 3:24 pm
                        Re: 100x speed up Robert Hyatt Thu Mar 15, 2012 4:35 pm
                              Re: 100x speed up Daniel Shawul Thu Mar 15, 2012 5:11 pm
                                    Table Daniel Shawul Thu Mar 15, 2012 5:51 pm
                                    Re: 100x speed up Robert Hyatt Thu Mar 15, 2012 7:36 pm
                                          Re: 100x speed up Daniel Shawul Thu Mar 15, 2012 8:21 pm
      Re: uct on gpu Daniel Shawul Thu Mar 08, 2012 1:08 pm
      uct for chess Daniel Shawul Mon Mar 12, 2012 10:30 pm
            Re: uct for chess Karlo Bala Jr. Mon Mar 12, 2012 11:14 pm
                  Re: uct for chess Daniel Shawul Tue Mar 13, 2012 12:13 am
                        Re: uct for chess Karlo Bala Jr. Tue Mar 13, 2012 12:52 pm
            Re: uct for chess Srdja Matovic Tue Mar 13, 2012 8:08 pm
                  Re: uct for chess Daniel Shawul Tue Mar 13, 2012 9:43 pm
                        Re: uct for chess Daniel Shawul Wed Mar 14, 2012 2:21 am
                        Re: uct for chess Srdja Matovic Wed Mar 14, 2012 11:56 am
                              Re: uct for chess Daniel Shawul Wed Mar 14, 2012 12:46 pm
                                    Re: uct for chess Srdja Matovic Wed Mar 14, 2012 1:00 pm
                        Re: uct for chess - move gen speedup by vector datatypes Srdja Matovic Mon Mar 19, 2012 3:04 pm
                              Re: uct for chess - move gen speedup by vector datatypes Daniel Shawul Mon Mar 19, 2012 8:01 pm
                                    Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Mon Mar 19, 2012 8:43 pm
                                          Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Mon Mar 19, 2012 9:01 pm
                                                Re: uct for chess - move gen speedup by vector datatypes Daniel Shawul Mon Mar 19, 2012 10:01 pm
                                                      Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Tue Mar 20, 2012 12:59 am
                                                            Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Tue Mar 20, 2012 1:04 am
                                                            Re: uct for chess - move gen speedup by vector datatypes Daniel Shawul Tue Mar 20, 2012 2:40 am
                                                                  Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Tue Mar 20, 2012 1:07 pm
                                                                        Re: uct for chess - MCS, YBW and 32 bit move gen Srdja Matovic Tue Mar 20, 2012 2:37 pm
                                                                              Re: uct for chess - MCS, YBW and 32 bit move gen Vincent Diepeveen Wed Mar 21, 2012 4:39 pm
                                                                                    Re: uct for chess - MCS, YBW and 32 bit move gen Srdja Matovic Wed Mar 21, 2012 5:53 pm
                                                                        Re: uct for chess - move gen speedup by vector datatypes Daniel Shawul Tue Mar 20, 2012 3:18 pm
                                                                              Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Wed Mar 21, 2012 2:13 pm
                                                                                    Re: uct for chess - move gen speedup by vector datatypes Daniel Shawul Wed Mar 21, 2012 4:00 pm
                              Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Mon Mar 19, 2012 8:33 pm
                                    Re: uct for chess - move gen speedup by vector datatypes Srdja Matovic Mon Mar 19, 2012 9:30 pm
                                          Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Tue Mar 20, 2012 12:54 am
                                          Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Tue Mar 20, 2012 12:45 pm
                              Re: uct for chess - move gen speedup by vector datatypes Srdja Matovic Tue Mar 20, 2012 2:38 am
                                    Re: uct for chess - move gen speedup by vector datatypes Vincent Diepeveen Tue Mar 20, 2012 1:13 pm
                                          Re: uct for chess - move gen speedup by vector datatypes Srdja Matovic Tue Mar 20, 2012 1:43 pm
                                                Re: uct for chess - move gen performance killers Srdja Matovic Tue Mar 20, 2012 4:45 pm
            intrinsic popcnt Daniel Shawul Wed Mar 14, 2012 5:21 am
                  Re: intrinsic popcnt Daniel Shawul Wed Mar 14, 2012 5:50 am
                        Re: intrinsic popcnt Robert Hyatt Thu Mar 15, 2012 5:12 pm
      Re: uct on gpu Vincent Diepeveen Thu Mar 15, 2012 8:14 pm
            Re: uct on gpu Daniel Shawul Thu Mar 15, 2012 8:27 pm
                  Re: uct on gpu Vincent Diepeveen Sat Mar 17, 2012 1:17 pm