Using Negamax for the minimalist game of Go

maksimKorzh · Post by **maksimKorzh** » Sat Jul 06, 2024 10:29 am

Hello everyone!

Please excuse me upfront if this would be considered off topic - since I would like to share
some experience I had during developing a tiny Go/Weiqi/Baduk program. Yes, this is not chess
but I used the approach that we had in Toledo Nano Chess or MicroMax which was treated to be
a bad idea if applied for Go, but since the results I eventually got are quite surprising and exciting,
yet it's a technical discussion, I hope moderators would keep this post.

So, long story short - I was looking for the smallest (code size wise) implementation of the game of Go.
I was hoping to find some MicroMax/ToledoChess equivalent but surprisingly there doesn't seem to be any.
So I decided to make my own, just for fun. I was wondering if the traditional minimalist chess approach would work -
using recursive search for tactics and a sort of a PST equivalent to give a program an idea of where to put stones.
Initial attempt failed because I was trying to search for all moves, not only forcing moves but once I limited search
to searching forcing moves only it started working much better - 6 plies depth with bare brute force search, no alpha beta
(I believe with Go beta cutoffs might lead to a trouble because local win does not always mean a global advantage +
I wanted to have as small code as possible). Now which moves did I consider to be forcing? In chess if the piece is attacked
the capture move is a forcing move but using this heuristic in Go would lose because usually groups with 1 liberty left
(those being in atari) would eventually be captured, so I'm searching for groups that have 1 or 2 liberties left. These liberties
are stored to a move list and get searched recursively. Again, search would generate only these forcing moves. So the
way it works is like: we have a group in trouble? Try to save it. We're chasing opponent's group? Try to kill it.
As a logic for "calm" moves I used 2 heuristics - take corners/sides/center if possible, if not find opponent's group
with the least number of liberties and start surrounding it.

Surprisingly the naive approach above results in my engine beating GnuGo1.2 (first public release of GnuGo) at least 60%
winrate. I didn't do much testing, but it clearly wins more than looses. Interestingly, GnuGo is based on exactly the same
program my engine has been inspired by - Wally By J.K. Millen, a Go program from 80s running on unexpanded KIM-1 (~1K RAM).
Wally didn't have recursive search but static eval logic to attack/save/apply pattern/play random move, GnuGo took this to the
next level - although it knows nothing about 2 eyes (core concept in Go - a group with 2 eyes is alive, with 1 eye it's dead)
still GnuGo feels more intelligent then my program. Nevertheless, a simple negamax search effectively replaces attack/deffence
logic and tremendously minimizes the amount of source code to be involved to achieve a similar behavior. In the cooperation
with a simple take territory + surround weak groups logic my program can beat GnuGo.

Here's the source code of MicroGo(my tiny Go program): https://github.com/maksimKorzh/microgo
You won't understand much above because of single variable names though, so
here's a fairly similar implementation but with meaningful variable names in JS + it has own GUI
if you want to quickly experience it: https://github.com/maksimKorzh/bmgp

Finally, here's a video where MicroGo beats GnuGo1.2 three times and looses once:

For those not willing to navigate through the links, here's the source code:

Code: Select all

#include <stdio.h>
#include <string.h>
#define F "                    \n"
#define O " ...................\n"
#define G F O O O O O O O O O O O O O O O O O O O F
int d[]={88,340,352,100,69,129,363,297,311,
371,143,77,213,227,73,367,215,225,115,325,220};                                   // Opening moves
int I=441,S=21,E=0,B=1,W=2,M=4,L=8;char b[]=G;
int s=1,k,m,r,e;int g[360],l[720];int n[]={-1,-21,21,1};
void C(int q,int c){int t=b[q]-'`';if(b[q]<=' ')return;                           // Count liberties
if(t>0&&(t&c)&&!(t&M)){b[q]|=M;g[r++]=q;for(int i=0;i<4;i++)
C(q+n[i],c);}else if(b[q]=='.'){b[q]=(t+50)|L+'`';l[e++]=q;}}
void R(){memset(g,0,360*sizeof(int));memset(l,0,720*sizeof(int));                 // Restore board after
r=0;e=0;for(int i=0;i<I;i++){if(b[i]>'.'){char t=(b[i]-'`')&3;
t+=t?'`':'.';b[i]=t;}}}
int Y(int q){int c=-1,o=-1;for(int i=0;i<4;i++){if(b[q+n[i]]<=' ')                // Is square in diamond shape?
continue;if(b[q+n[i]]=='.')return 0;if(c==-1){c=(b[q+n[i]]-'`')&3;
o=3-c;}else if(((b[q+n[i]]-'`')&3)==o)return 0;}return c;}
int P(int q,int c){if(b[q]!='.'||q==k)return 0;int _k=k;k=E; b[q]=c;              // Put stone on board, handle captures
for(int i=0;i<I;i++){if(b[i]<='.')continue;if((b[i]-'`')&(3-(c-'`'))){
C(i,3-(c-'`'));if(!e){if(r==1&&Y(q)==3-s)k=g[0];for(int j=0;j<r;j++)
b[g[j]]='.';}R();}}C(q,c-'`');int suicide=e?0:1;R();if(suicide){b[q]='.';
k=_k;return 0;}s=3-s;return 1;}
int X(int x){if(!x){int v=0,a=0,w=0;for(int q=0;q<I;q++){                         // Recursively search fighting moves at depth 6
if(b[q]<=' '||b[q]=='.')continue;if(b[q]=='a')a+=1;if(b[q]=='b')
w+=1;}v+=(a-w);return(s==B)?v:-v;};int h=-10000;int u[100];
memset(u,0,100*sizeof(int));int y=0;for(int q=0;q<I;q++){
if(b[q]<=' '||b[q]=='.')continue;C(q,b[q]-'`');if(e<3){for(int j=0;j<e;j++)
{int f=0;for(int z=0;z<y;z++)if(u[z]==l[j])f=1;if(!f)u[y++]=l[j];}}R();}
for(int q=0;q<(y<7?y:6);q++){if(u[q]==k)continue;char _b[]=G;strcpy(_b,b);
int _s=s;int _k=k;if(!P(u[q],s+'`'))continue;int p=-X(x-1);if(p>h){h=p;
if(x==6)m=u[q];}strcpy(b,_b);s=_s;k=_k;}return h;}
void T(int z){for(int q=0;q<21;q++){if(b[d[q]]=='.'&&!(Y(d[q]))){m=d[q];          // Play away if no forcing moves
return;}}int i=100;for(int q=0;q<I;q++){if((b[q]-'`')==3-s){int a=0;
C(q,b[q]-'`');if(e<i){i=e;a=l[0];}else if(e)a=l[z?(e-1):0];R();int c=0;
for(int j=0;j<4;j++)if(b[a+n[j]]=='.')c++;if(a&&c&&a!=k)m=a;}}}
void D(){setbuf(stdin,NULL);setbuf(stdout,NULL);char u[10000];while(1){           // GTP communication
memset(u,0,sizeof(u));fflush(stdout);if(!fgets(u,10000,stdin))continue;
if(u[0]=='\n')continue;if(strncmp(u,"name",4)==0)printf("= Micro Go\n");
if(strncmp(u,"version",7)==0)printf("= by Code Monkey King\n\n");
else if(strncmp(u,"protocol_version",16)==0)printf("= 1\n\n");
else if(strncmp(u,"showboard",9)==0)printf("= %s %d %d\n\n",b,s,k);
else if(strncmp(u,"clear_board",11)==0){strcpy(b,G);s=B,k=m=E;printf("=\n\n");}
else if(strncmp(u,"genmove",7)==0){s=(u[8]=='B')?B:W;m=0;int p=X(6);
if(m)P(m,s+'`');else{T(1);if(!m)T(0);}char c[20];int y=(m/21)-1,x=(m%21)-1;
c[0]='A'+x+(x>=8);sprintf(c+1,"%d",19-y);c[3]='\0';c[0]='A'+x+(x>=8);
if(!m)printf("= pass\n\n");else {P(m,s+'`');printf("= %s\n\n",c);}}
else if(strncmp(u,"play",4)==0){int c=u[5]=='B'?'a':'b';int x=u[7]-'A'+1-(u[7]>'I'?1:0);
int y;sscanf(u,"play %*c %*c%d",&y);y=S-1-y;P(y*S+x,c);printf("=\n\n");}
else if(strncmp(u,"quit",4)==0)break;else printf("=\n\n");}}int main(){D();}

Summary:
Seems like I'm the first guy to implement minimalist Go program.
If not - please kindly point me to those existing already.

Thanks for reading. Cheers!

hgm · Post by **hgm** » Sun Jul 07, 2024 10:06 am

I thought that the main problem with Go was evaluation. Search is one thing, but if you don't know what to search for, even a very powerful search algorithm cannot do anything useful.

maksimKorzh · Post by **maksimKorzh** » Sun Jul 07, 2024 1:16 pm

hgm wrote: ↑Sun Jul 07, 2024 10:06 am I thought that the main problem with Go was evaluation. Search is one thing, but if you don't know what to search for, even a very powerful search algorithm cannot do anything useful.

Hello, Mr. Muller

This is indeed true, but interestingly evaluation in Go is usually treated differently compared to chess.
In different eras of Go programming many approaches has been used and eventually they have evolved
into something kind of binded all together, e.g.:
- they have started with suggesting moves to attack/save a group + pattern recognition + semi-random moves to take territory (80s)
- then minimax search was added to solve simple life & death problems, e.g. if group can be captured (90s)
- then MCTS came onto the stage where candidate moves from above heuristics were played out to find a best one (00s)
- NNs driven by MCTS (2015, AlphaGo breakthrough)

So the core difference is that a given move does not have a static evalution score like in chess,
static evaluation may only be used for counting material (who captured more stones) when a local
group can be captured but for the rest of moves - they can only be ordered due to "urgency".

This is probably a noob's way to name it properly but I would've called Go evaluation to be a relative move urgency centric
instead of plain static scoring, which practically means that the same move can have different value depending on how urgently
it has been played. And here comes the MCTS with its ability to playout moves without knowing which one is better and as a result
urgency gets revealed and hence the most urgent move scores best. Later in NNs what I call urgency evolved into a policy, hence
NNs there have two heads - policy and corresponding value, but policy is prior.

Please excuse my vague understanding, I'm learning this stuff myself now. But the most important thing I've learnt so far
is that the right timing is the most valuable evaluation measure, not the absolute static score because static score can only
handle local material values while within a bigger picture. A "calm" move can cut an entire group and sacrificing say 5 stones
may lead to killing a group of 100 stones. But since capturing those 100 cannot be calculated to the end they operate with
winrate probabilities instead of static evaluation scores.

What I did in my program was replacing the vast amount of code responsible for 1 ply depth move policy (something very similar to how your N.E.G. works) with a negamax search counting captured stones as the only static eval measure. This allows obtaining a slightly better
result compared to no search engine but tremendously reduces the size of a source code. Also my program uses no patterns while
GnuGo1.2 had a database of simple patterns but despite a lack of knowledge, just relying on 6 ply depth search my program beats
a bigger source wise and more knowledgeable program. This can be compared to how your MicroMax beats TSCP. Very similar.

JacquesRW · Post by **JacquesRW** » Sun Jul 07, 2024 5:21 pm

maksimKorzh wrote: ↑Sun Jul 07, 2024 1:16 pm So the core difference is that a given move does not have a static evalution score like in chess,

Both moves and positions can be prescribed static evaluations (albeit they may be of poor quality), and that is exactly what is done in modern approaches, with policy and value heads of a NN respectively.

maksimKorzh wrote: ↑Sun Jul 07, 2024 1:16 pm But since capturing those 100 cannot be calculated to the end they operate with
winrate probabilities instead of static evaluation scores.

A heuristic (e.g. a NN) that produces a winrate probability without explicitly searching is exactly what we would all a static evaluation, just in a different form to what you might normally see in minimax programs.

Using Negamax for the minimalist game of Go

Using Negamax for the minimalist game of Go

Re: Using Negamax for the minimalist game of Go

Re: Using Negamax for the minimalist game of Go

Re: Using Negamax for the minimalist game of Go