MCTS without random playout?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: MCTS without random playout?

Post by diep »

smatovic wrote:
Additional another thing i find not so interesting is the 2 ply fullwidth searches introduced somewhere. If you simply don't know how to parallellize your software, then it won't help to 'rape' your CNS algorithm by doing 2 ply searches everywhere near the leaves.
I agree.
Basically CNS is most interesting, yet leave out all the crap that's needing too much of an overhead.
But how do you handle a 16 bit Score value with CNS?

I could store the C-Numbers for 8 bit Scores at each expanded node, but i am not sure if this implementation is too rough-grained.

--
Srdja
In diep i score using 20 bits. Mate is something like +500k .
Using diep's normal evaluation function for this wasn't a problem.

So i don't really understand the question.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: MCTS without random playout?

Post by diep »

Oh wait maybe i should tell something what i did do.

I stored the entire search in RAM. Total lossless.

You know, just buy a 4 socket box and throw in 32 DIMMS of 8 GB.
Not gonna be extremely cheap yet it's 256GB ram.

What i did do could search for a limited time.

So the mainline nonstop changed. Now you can see this as a problem or not; the only big problem i saw was that it wasn't finding relative simple tactics for Diep where the normal Diep, using same evaluation function found that already at a ply or 6-10.

Sure it could find some deep shots which is nice, but i want also a 'worst case analysis' to go ok on an algorithm :)

Note that i hadn't really optimized the datastructure bigtime, did do things pretty naive, as obviously i wanted to experiment with CNS, not directly use it to play slow time control games.

As my machiens always historically had relative little RAM, i was very constrained there. I used between a 150 and 400MB ram for that storage back then. Was around 2001.
smatovic
Posts: 2645
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: MCTS without random playout?

Post by smatovic »

correct me if i am wrong, but afaik in CNS you have to store at every expanded node the C-Numbers for every Score....so 16 Bit means you need to store 65536 C Numbers at every node...this could work with 10k expanded nodes, but not with 100k.

##edit###
Oh wait maybe i should tell something what i did do.
ah, ok.

--
Srdja
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: MCTS without random playout?

Post by diep »

by the way i hope you realize the strong point of monte carlo;
if we compare it to search algorithms like CNS and B* and similar
alternatives for depthlimited alfabeta, then we see that monte carlo
is the only algorithm out of the 'alternatives' that's gonna search the
mobility optimized mainline, if i may call it like that, with some sort of
chance.

Todays top engines of course search it fully, yet also in a selective manner
around the mainline.

So to speak what can happen is that a direct positional refutation of a line gets missed, as the refutation starts with an unexpected move as the first reply, and it happens to be a quiet position.

Programs like Stockfish then do some sort of supernullmove, with a reduction factor where the original inventors of probcut would even have problems with, then every next move it'll reduce, and soon it will use fuility and razor away the entire line; you'd be very lucky if the total 'refutation'
is more than 50% of it's published search depth for that line.

Where with Stockfish this happens nonstop, realize it's nominal search depth is 30+ plies.

So half of that still is a formidable 15 plies that it searches lines which its evaluation function doesn't favour. That's a depth where the CNS searchers still can dream of as a worst case depth to checkout the mainline.
Antonio Torrecillas
Posts: 90
Joined: Sun Nov 02, 2008 4:43 pm
Location: Barcelona

Re: MCTS without random playout?

Post by Antonio Torrecillas »

Hi Vincent,
How strong is that stuff?
Rocinante 1.01 PB* with ufo evaluation (material+pst) 1650 elo
Rocinante 2.0 PB* with ufo eval 1750-1800 elo
Rocinante 2.0 PB* with new eval not tested.
Rocinante 2.0 MCTS AB with ufo eval 1850-1900 elo
Rocinante 2.0 MCTS AB with new eval 2000 elo.

Why MCTS AB?

The main drawback of PB* is a double evaluation for leaf nodes (real + pessimistic). The second is a kind of horizont effect, and a last chance effect. If you expand a node and find it quiet you never come back to this node to see deeper.

I tried to overcome the first drawback, doing simply a single leaf evaluation. To overcome the second I was trying to change the way PB* trace down the next node to expand.

Once there I realized that I do reinvent something that already exist. So I applied Upper confidence to tree and a single eval.

So that's a MCTS with the random playout replaced by a fixed depth alpha-beta.

I agree that MCTS with random playout is really weak, but a UCT on top an alpha-beta is another beast.

This is a newborn, let see how it grows.

regard Antonio.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: MCTS without random playout?

Post by diep »

smatovic wrote:correct me if i am wrong, but afaik in CNS you have to store at every expanded node the C-Numbers for every Score....so 16 Bit means you need to store 65536 C Numbers at every node...this could work with 10k expanded nodes, but not with 100k.

##edit###
Oh wait maybe i should tell something what i did do.
ah, ok.

--
Srdja
ok that's what you mean - well use an integer if you like to.

However the size of the cns number depends also upon your expansion strategy. In its purest form this number shouldn't grow real huge of course as you stop expansion above a certain value.

That said i just used simply an integer myself of 32 bits everywhere.

you realize cost of a 4 socket AMD box is pretty cheap?

on ebay a cheapo mainboard is maybe $400 max. cpu's $70 each,
then you got 16 x 2.5Ghz or so.

Just the RAM is expensive. You can store in 256GB really a lot you know.

Even a handful of gigabytes it can have easily a searchtree
of a billion+ nodes.

If you checkout what most of the dudes post about here, it's about search trees with modern chessprograms of a few hundreds of k's of nodes in their superbullet tests up to a few dozens of millions max which is exceptionnel.

So whether you store things in 16 bits or 32 bits or 24 bits, that doesn't really matter. Also hardly eats more system time of course; obviously
storing every node in RAM is expensive, something i do in diep anyway as
well. Each time updating the entire search tree is expensive.

Yet you can easily parallellize this algorithm. Just some tricks needed
to avoid updating simultaneously in the same entry in RAM.

The thing is to be busy outside the normal the normal zone and be creative.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: MCTS without random playout?

Post by diep »

Antonio Torrecillas wrote:Hi Vincent,
How strong is that stuff?
Rocinante 1.01 PB* with ufo evaluation (material+pst) 1650 elo
Rocinante 2.0 PB* with ufo eval 1750-1800 elo
Rocinante 2.0 PB* with new eval not tested.
Rocinante 2.0 MCTS AB with ufo eval 1850-1900 elo
Rocinante 2.0 MCTS AB with new eval 2000 elo.

Why MCTS AB?

The main drawback of PB* is a double evaluation for leaf nodes (real + pessimistic). The second is a kind of horizont effect, and a last chance effect. If you expand a node and find it quiet you never come back to this node to see deeper.

I tried to overcome the first drawback, doing simply a single leaf evaluation. To overcome the second I was trying to change the way PB* trace down the next node to expand.

Once there I realized that I do reinvent something that already exist. So I applied Upper confidence to tree and a single eval.

So that's a MCTS with the random playout replaced by a fixed depth alpha-beta.

I agree that MCTS with random playout is really weak, but a UCT on top an alpha-beta is another beast.

This is a newborn, let see how it grows.

regard Antonio.
No no i'm not a fan of Monte Carlo at all for computerchess. CNS is far stronger than Monte Carlo of course.

Realize the elo depends upon a lot of factors. I don't think the initial focus should be elo, as that's dependant upon things like what book you use,
what time control you use - and you don't want quick time controls with all these algorithms - and the overhead it costs.

I feel such comparisions aren't fair for experimental search strategies. It requires years of optimization just to get the same amount of nps that the normal chess engines get quite easily. Don't waste time in those optimizations now.

Then furthermore bunches of enhancements and algorithms added to depth limited alfabeta, they won't work in the same manner usually for alternative forms of search. Optimization of such search with such enhancements is a special field of research simply.

You won't manage to do that within a year. It requires nonstop testing what works better for the specific algorithm.

It's about the progress you can make.

Monte Carlo + UCT in itself doesn't allow much of a progress, as basically you're busy adding moves to the search tree in a manner that in advance has too much of an overhead.

You can of course already modify depthlimited alfabeta a tad to do that more efficient than MC + UCT do that.

The interesting thing is to improve CNS and initially compare it in a correct manner. Just against fullwidth alfabeta + hashtable, for the same program.

When you add nullmove, add it for both engines at the same time and then compare again.

Check extensions you also add for both at the same time, and so on.

Then compare that.

This is all big work, but more promising than any monte carlo derivate of course.

Ignore the absolute elorating, that's so so dependant as well upon the evaluation functions quality, it's about comparing the same program with the same program using a different search and measuring the progress possible of the alternative approach.

Todays engines have bunches of search optimizations. Dozens. You won't be able to implement all that so quickly into an alternative approach.

Right now of course a simple fullwidth alfabeta with hashtables will handsdown beat any Monte Carlo chessprogram of course.

The overhead of MC is too huge, in all its incarnations.

Now another thing to consider is that many engines aren't very efficient searchers WITHOUT their forward pruning, but that's for later concern.