Tuning again

Rebel
Posts: 7299
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Tuning again

Post by Rebel »

Joona's post [ http://74.220.23.57/forum/viewtopic.php?t=40662 ] brings back sweet memories.

For self-play testing I wrote a small utility at the time (see below) that emulates a match between 2 equal engines, in order to find out how many games it takes before each try (round) gives a reliable result. I consider a result reliable when it falls in the range 49.9 - 50.1%.

After all, 1% is 6-7 Elo points.

Running the utility shows that 10,000 games may every now and then still produce a 49-51% result, so one is still left with a 6-7 Elo error margin.

Only after 100,000 games do things become stable.

Since I don't have the hardware to play 100,000 games, I limit myself to 4000. When that shows an improvement, I run it again with a different database, as a kind of verification process. Then I make a decision.

Thoughts?

The C code follows, with apologies for the "goto" use; I was raised on that.

Ed

------------------------------------------------------------------

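#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)         // emulate matches between 2 equal engines

{       int r,x,max,c; float win,loss,draw,f1,f2,f3,f4; char w[200]; int d,e;

        srand((unsigned)time(NULL));            // seed from the clock (the old seed variable was never set)

again:  printf("Number of Games "); if (!fgets(w,sizeof w,stdin)) return 0; max=atoi(w);

loop:   x=0; win=0; loss=0; draw=0; printf("\n");

next:   if (x>=max) goto einde;                 // >= also stops on zero or negative input

        r=rand(); r=r&3; if (r==0) goto next;   // drop 0 so win, loss and draw are each 1/3
        if (r==1) win++;
        if (r==2) loss++;
        if (r==3) draw++;
        x++; if (x==(max/4)) goto disp;         // report at 25%, 50%, 75% and 100% of the games
             if (x==(max/2)) goto disp;
             if (x==(max/4)+(max/2)) goto disp;
             if (x==max) goto disp;
        goto next;


disp:   f1=win+(draw/2); f2=loss+(draw/2); f4=x; f3=(f1*100)/f4; d=(int)f1; e=(int)f2;
        printf("%d-%d (%.1f%%)  ",d,e,f3);
        goto next;

        // q + Enter = quit, a + Enter = new game count, Enter = run again
        // (a line is read with fgets because getch() is not standard C)
einde:  if (!fgets(w,sizeof w,stdin)) return 0; c=w[0];
        if (c=='q') return 0;
        if (c=='a') { printf("\n\n"); goto again; }
        goto loop;

}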

Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Tuning again

Post by Edmund »

Going for a predefined LOS margin is much more accurate. Why are you tackling the problem from the other side?
Rebel
Posts: 7299
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Tuning again

Post by Rebel »

What's LOS?
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Tuning again

Post by zamar »

Rebel wrote:Since I don't have the hardware to play 100,000 games I limit myself to 4000.
Maybe time to buy a new machine? Let's say we use 5 seconds for one match.

100000 * 5s / (24 * 60 * 60) = 5.78 days.

This is approximately the time it took to optimize one parameter set :-)
Joona Kiiski
Rebel
Posts: 7299
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Tuning again

Post by Rebel »

5 secs for a whole game? Never tried that :shock: This generation surely has a whole new elo explanation.

With Arena running on a quad I can do 4 matches simultaneously at 8 ply, producing 6000 games a day.

But I will try your 5 secs idea.
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Tuning again

Post by zamar »

Rebel wrote:5 secs for a whole game? Never tried that :shock:
Well, to be more exact, we used 5s+0.1s/move time controls but ran 4 matches in parallel, so it resulted in approximately 4 games/20s.
Joona Kiiski
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Tuning again

Post by bob »

Rebel wrote:What's LOS?
Likelihood Of Superiority

BayesElo will provide this.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Tuning again

Post by mcostalba »

Rebel wrote:5 secs for a whole game? Never tried that :shock: This generation surely has a whole new elo explanation.

With Arena running on a quad I can do 4 matches simultaneously at 8 ply, producing 6000 games a day.

But I will try your 5 secs idea.
Running games at fixed depth (especially one as low as 8 plies) has some drawbacks, and running in a GUI like Arena has even more. I'd suggest a command-line tournament manager like cutechess-cli, running on time.

BTW your C is very assemblish, lovely stuff, really, no joking: it has a kind of vintage fashion.
Rebel
Posts: 7299
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Tuning again

Post by Rebel »

mcostalba wrote:Running games at fixed depth (especially so low like 8 plies) has some drawback,
Eval tuning I strictly do at fixed depth. I don't want external factors like time control or permanent brain to interfere. Enough volume will flatten all the horizon effects eventually, both sides.
running in a GUI like Arena has even more drawbacks, I'd suggest a command line tournament manager like cutechess-cli and run on time.
Downloaded...

I like Arena because it supports nodes-matches. IMO a better way to test search related changes than on time.
BTW your C is very assemblish, lovely stuff, really, no joking: it has a kind of vintage fashion.
All my engines were in assembler. I just can't get used to these brackets.

{  
   { 
       { 
           {
           }
       }
   }
}
Things like that drive me crazy :wink:
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Tuning again

Post by mcostalba »

Rebel wrote: Eval tuning I strictly do at fixed depth. I don't want external factors like time control or permanent brain to interfere. Enough volume will flatten all the horizon effects eventually, both sides.
IMHO the main drawbacks are that it is impossible to test depth-sensitive stuff like king safety, and that the same depth is artificially used for midgame and endgame. But I agree that for some evaluation parameters it could be good; actually, I will give it a try.
All my engines were in assembler. I just can't get used to these brackets.

{  
   { 
       { 
           {
           }
       }
   }
}
Things like that drive me crazy :wink:
Me too :wink: The problem here is the excessive indentation level more than the brackets themselves.