Help with Best-First Select-Formula

smatovic · Post by **smatovic** » Sat Jun 23, 2012 5:00 pm

Heyho,

i am looking for improvements for the Select-Formula of Zetas Best-First Search.

Current formula looks like this:

Code: Select all

child.score + child.parent.visits / child.visits

Any suggestions what could be added?

--
Srdja

Antonio Torrecillas · Post by **Antonio Torrecillas** » Sat Jun 23, 2012 6:11 pm

Hi Srdja,
I recently came to a small improvement at UCB formula.
the trick is to get the UCB formula dependent of the branch factor of the position.
The idea behind is : this constant controls the balance between exploration and exploitation. When there is a lot of moves available we need to explore more (high value for the constant). In end games with few moves, we can be more agressive and increase exploitation with a low constant.

Code: Select all

int C_MV[] = {
1,1,1,2,4,8,8,16,16,16,
16,32,32,32,32,64,64,64,64,64,
64,64,128,128,128,128,128,128,128,128,
128,128,256,256,256,256,256,256,256,256,

...
Area[i] =  (child->score)+ sqrt(log(C_MV[parent->MoveCount] * parent->visits)/(child->visits));

the C_MV value is fixed with some simulations. Supposing each move is one centipawn ahead the next move. I fixed C_MV so that each move get visited at least once in 7 times the number of moves available.
let us know if it works for you.
best regards, Antonio.

smatovic · Post by **smatovic** » Sat Jun 23, 2012 9:59 pm

Muchas Gracias Antonio,

i am playing a bit with your formula....

i try to remove the log inside the sqrt to get more expanded nodes...will play some test games....looks good.

Code: Select all

Area[i] =  (child->score)+ sqrt( (C_MV[parent->MoveCount] * parent->visits)/(child->visits));

--
Srda

smatovic · Post by **smatovic** » Sun Jun 24, 2012 5:15 pm

Heyho Antonio,

with your posted formula i am not able to measure any better play, but with an modified version it <looks a bit> better for me, will run some more games with different time controls....

I compute the value "nodes to explore" per node which is the sum off all unexplored child of the current node.

Code: Select all

child.score + sqrt(  log(child.explorable+10) * parent.visits / child.visits )

--
Srdja

ZirconiumX · Post by **ZirconiumX** » Sun Jun 24, 2012 5:39 pm

Stupid thought, but if this is an unexplored node, you have a divide by zero bug there.

It is just a stupid thought though.

Matthew:out

smatovic · Post by **smatovic** » Sun Jun 24, 2012 5:47 pm

alright, for those with stupid thoughts

Code: Select all

child.score + sqrt(  log(child.explorable+10) * parent.visits / (child.visits+1) )

--
Srdja

ZirconiumX · Post by **ZirconiumX** » Sun Jun 24, 2012 6:44 pm

Sorry - I am a really picky person.

Matthew:out

Antonio Torrecillas · Sun Jun 24, 2012 10:20 pm

Hi Srdja,
my actual formula is a bit more complex since I take in consideration the gap between the best and worse move.
I try to improve tactically so I need to consider a minimal amount of visits for all moves, at least in root.
Which engines do you use for your testing?
if you use self test there is probably an important bias, since what perform well against a best first engine can be desastrous against an conventional engine.
I'm running a verification test but everything I see up to now confirm my first idea. Well tomorrow the verdict. I'm not that quick to test my engine.

A+,
Antonio.

smatovic · Post by **smatovic** » Sun Jun 24, 2012 11:58 pm

Which engines do you use for your testing?

I tested different versions of the formula in self play and than against "Zeta Simple", a modified version of Zeta Dva(~2000 ELO) without special moves and with the same, simple evaluation function like my GPU-Engine.

I am still running test games against Zeta Simple, but by now it looks like a big improvement with that "child.explorable"-factor.

I try to improve tactically so I need to consider a minimal amount of visits for all moves, at least in root.

I am still not sure how to handle root moves...at first i visited them all equally often, now i try the same select-formula for all, maybe i should define some special root formula....

### edit ###

In my select algorithm i have to check further things too....because i am running the Engine on an GPU with 1024 threads...so for example i check if the locked nodes against the explorable counters.

Do you run your Engine multi-threaded??

--
Srdja

Antonio Torrecillas · Post by **Antonio Torrecillas** » Mon Jun 25, 2012 9:37 pm

Do you run your Engine multi-threaded?

Not often, I do everything in an core-2 duo laptop.

OK, verification done. +50 elo for 1000 games against 6 engines.
So my little formula has addressed a true weakness in my engine.
Now it's time to have a thought about your idea. I'll try to bias the formula with the number of unexplored tactical moves in the subtree in relation with the parent "tactical counter".

Do you have tested if there is a difference in counting unexplored moves versus unexplored tactical moves ?

Tactically obsessed regards. Antonio.

Help with Best-First Select-Formula

Help with Best-First Select-Formula

Re: Help with Best-First Select-Formula

Re: Help with Best-First Select-Formula

Re: Help with Best-First Select-Formula

Re: Help with Best-First Select-Formula

Re: Help with Best-First Select-Formula

Re: Help with Best-First Select-Formula

Re: Help with Best-First Select-Formula

Re: Help with Best-First Select-Formula

Re: Help with Best-First Select-Formula