Speeding Up The Tuner

D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: Speeding Up The Tuner

Post by D Sceviour »

Joost Buijs wrote: Sun Sep 06, 2020 9:29 am I just read all the fens (about 4 million) at the start of the tuning process, convert them to binary positions and keep them in memory, this whole process takes a few seconds at most.
That is interesting. I send each fen to a setboard() routine where it also converts to a binary position. Buffers are cleared, legality tests are done, pawn hash is cleared and re-evaluated. 4 million positions would take about 20 hours to do this for my engine, so I am not sure what might be taking so long. Pre-processing has cured all this for me.
To determine the error I call quiescence instead of the evaluation function, this is a bit time consuming though, tuning all the weights from zero including the psqt's takes about 30 mins on my 32 core machine, only re-tuning them goes a lot faster of course.
The "quiet-labeled.epd" was created by the Zurichess author. The purpose was to create a file that had already been tested with quiescence. Thus, only a single evaluation needs to be done.
When I tune from zero I keep the pawn value for the opening phase fixed at 70 CP as an anchor to keep the weights in line with what I would like to see.
I intend to fix all my piece values as per the Xiphos method which I posted here:

http://talkchess.com/forum3/viewtopic.p ... os#p778360

Most of my recent elo gains have not been from tuning, but from improved move ordering, mobility scoring and king safety. The purpose of re-opening the tuner was to create a larger PST[piece][sq][enemy king] table, and to see if it might give a strength gain. This PST method is being used in the NN approaches for evaluation.
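To make the table shape concrete, here is a minimal sketch of how a PST[piece][sq][enemy king] table could be laid out and indexed. It assumes five non-king piece types and squares numbered 0..63; the names `king_pst`, `king_pst_index` and `king_pst_lookup` are hypothetical and not taken from any engine mentioned in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: pawn, knight, bishop, rook, queen; the king itself has no entry. */
enum { PIECE_TYPES = 5, NUM_SQUARES = 64 };

/* One weight per (piece type, piece square, enemy king square). */
static int16_t king_pst[PIECE_TYPES][NUM_SQUARES][NUM_SQUARES];

/* Flat index of an entry, handy when the tuner treats all weights as one vector. */
static int king_pst_index(int piece, int sq, int enemy_king_sq) {
    assert(piece >= 0 && piece < PIECE_TYPES);
    assert(sq >= 0 && sq < NUM_SQUARES);
    assert(enemy_king_sq >= 0 && enemy_king_sq < NUM_SQUARES);
    return (piece * NUM_SQUARES + sq) * NUM_SQUARES + enemy_king_sq;
}

/* Read the weight for one piece given where the enemy king stands. */
static int king_pst_lookup(int piece, int sq, int enemy_king_sq) {
    return king_pst[piece][sq][enemy_king_sq];
}
```

The flat index runs from 0 to 5*64*64 - 1, so the same numbering can address a gradient array of that length.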
Kieren Pearson
Posts: 70
Joined: Tue Dec 31, 2019 2:52 am
Full name: Kieren Pearson

Re: Speeding Up The Tuner

Post by Kieren Pearson »

chrisw wrote: Sun Sep 06, 2020 1:21 pm
D Sceviour wrote: Sun Sep 06, 2020 2:56 am I noticed that 99% of the calculation time for stochastic gradient tuning was spent parsing fen strings, and building positional structures. Only 1% of the time was spent on evaluation. To speed up the tuner I saved a binary file of positional structures to disk. Then, the binary was recalled for tuning. Instead of 5 hours spent tuning a variable, it now takes less than one second.

Most engines have a unique position structure, and a "position_t" is often all that is necessary to evaluate a position. It takes several hours to create the binary file, but it only needs to be done once. Instead of:

if (fgets(line, 256, fin) == NULL)

that programs like Ethereal use to read an epd file, the tuner now reads the binary file with:

while (fread(pos, sizeof(position_t), 1 , f)) {

The binary file could be re-opened into another array for tuning, but disk read is so fast that it is not necessary. Besides, if threading is used then the disk binary file can be used to save memory. The "quiet-labeled.epd" was an old common file used for tuning. It creates a binary of 113,282 KB for the Schooner position_t structure.

A link from someone would be appreciated if there are newer and better tuning epd's.
+1

If there's a public repository to keep them in, I can contribute
42M evaluated EPDs from Ed Schroeder's Human database (distilled from LiChess, I think)
8.6M evaluated EPDs distilled from CCRL 40-40
13M evaluated EPDs distilled from CCRL-blitz

evals are at 25 ms, SF11, one thread

The game result is not normally saved in a position structure, so an unused variable is used to store the game result. Here is a sample 'C' code of the technique to convert an epd file and build the binary file:

void CreateBinaryFile() {
    FILE *f;
    FILE *f2;
    int n = 0;        // lines read from the epd file
    int writes = 0;   // records written to the binary file
    double result;
    position_t *pos = rsd->pos;

    strcpy(Tfilename, "quiet-labeled.epd");
    strcpy(Wfilename, "tune_array.bin");

    // the epd file is only read, so "r" is enough
    f = fopen(Tfilename, "r");
    if (f == NULL) {
        printf("file not opened\n");
        return;
    }

    f2 = fopen(Wfilename, "wb");
    if (f2 == NULL) {
        printf("file f2 not opened\n");
        fclose(f);   // do not leak the input file
        return;
    }

    while (fgets(fen, 128, f)) {
        n += 1;

        // the parsing method here was taken from Crafty
        nargs = ReadParse(fen, " ;=");

        // this loads the fen and builds the position structure in my program
        Command(rsd);

        if (!strcmp(args[5], "\"0-1\"")) {
            result = 0.0;
        } else if (!strcmp(args[5], "\"1-0\"")) {
            result = 1.0;
        } else if (!strcmp(args[5], "\"1/2-1/2\"")) {
            result = 0.5;
        } else {
            printf("%d bad result\n", n);
            printf("args[5] %s\n", args[5]);
            printf("%s\n", fen);
            break;
        }

        // save the game result in an unused variable (stored as 0, 1 or 2)
        pos->static_score = (uint16_t) (result * 2);

        // and save the converted fen to disk with a sequential write
        if (fwrite(pos, sizeof(position_t), 1, f2) != 1) {
            printf("%d write failed\n", n);
            break;
        }
        writes += 1;
    }

    fclose(f);
    fclose(f2);
    printf("%d writes\n", writes);
}
I'm very interested in using these datasets. Currently I am using the 725K epd quiet-labelled dataset from zurichess. This has worked well, but I am looking for a similar dataset which is larger if possible.
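For anyone wanting to try the binary-file idea quoted above, here is a small self-contained sketch of the write and read sides. The `record_t` struct is a hypothetical stand-in for an engine's position_t, and `write_records` and `sum_results` are illustrative names, not code from Schooner or Ethereal.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for position_t; a real structure is engine specific. */
typedef struct {
    uint64_t occupied[2];   /* one bitboard per side, purely illustrative */
    uint8_t  side_to_move;
    uint8_t  result_x2;     /* game result stored as 0, 1 or 2 (loss, draw, win) */
} record_t;

/* Write records to a new binary file; returns the number actually written. */
static size_t write_records(const char *path, const record_t *recs, size_t count) {
    assert(path != NULL);
    FILE *f = fopen(path, "wb");
    if (f == NULL) return 0;
    size_t written = fwrite(recs, sizeof(record_t), count, f);
    fclose(f);
    return written;
}

/* Stream the file one record at a time and sum the game results. */
static double sum_results(const char *path, size_t *count_out) {
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    double sum = 0.0;
    size_t n = 0;
    record_t rec;
    if (f == NULL) { *count_out = 0; return 0.0; }
    while (fread(&rec, sizeof(record_t), 1, f) == 1) {
        sum += rec.result_x2 / 2.0;   /* undo the *2 scaling used when storing */
        n++;
    }
    fclose(f);
    *count_out = n;
    return sum;
}
```

A file written this way depends on the struct layout of the program that produced it, so it has to be regenerated whenever the structure changes.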
chrisw
Posts: 4319
Joined: Tue Apr 03, 2012 4:28 pm

Re: Speeding Up The Tuner

Post by chrisw »

Kieren Pearson wrote: Sun Sep 06, 2020 1:39 pm
I'm very interested in using these datasets. Currently I am using the 725K epd quiet-labelled dataset from zurichess. This has worked well, but I am looking for a similar dataset which is larger if possible.
Well, if we can find a public repository for swapping/exchanging/downloading you’re welcome to them.
Joost Buijs
Posts: 1564
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Speeding Up The Tuner

Post by Joost Buijs »

D Sceviour wrote: Sun Sep 06, 2020 1:35 pm That is interesting. I send each fen to a setboard() routine where it also converts to a binary position. Buffers are cleared, legality tests are done, pawn hash is cleared and re-evaluated. 4 million positions would take about 20 hours to do this for my engine, so I am not sure what might be taking so long. Pre-processing has cured all this for me.
It's really weird that it would take 20 hours to convert 4 million positions to binary with setboard. I use my setboard too, but I have to admit that it does very little error checking.
The "quiet-labeled.epd" was created by the Zurichess author. The purpose was to create a file that had already been tested with quiescence. Thus, only a single evaluation needs to be done.
I use positions that have not been checked for quiescence, that's why I use the quiescence search instead of the evaluation function. Of course this is slow, but still fast enough.
I intend to fix all my piece values as per the Xiphos method which I posted here:

http://talkchess.com/forum3/viewtopic.p ... os#p778360

Most of my recent elo gains have not been from tuning, but from improved move ordering, mobility scoring and king safety. The purpose of re-opening the tuner was to create a larger PST[piece][sq][enemy king] table, and to see if it might give a strength gain. This PST method is being used in the NN approaches for evaluation.
I don't know what the Xiphos method is to fix piece values, I will take a look at it.

Tuning with logistic regression gave me an additional 120 Elo compared to my old hand-tuned evaluation function. My evaluation function is rather old; it dates back to 2013 and I have to rework it at some point.

PST[piece][sq][enemy king] looks interesting, it is a bit like the NNUE approach, I suppose it will take a long time to tune.
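For reference, the error that logistic-regression (Texel-style) tuning usually minimises can be written as below. This is the commonly published form and not necessarily the exact one used by any engine in this thread. The symbols are assumptions for illustration: N positions, R_i the game result (0, 0.5 or 1), q_i the score in centipawns from White's point of view, and K a scaling constant fitted to the data.

```latex
E = \frac{1}{N} \sum_{i=1}^{N} \bigl( R_i - \sigma(q_i) \bigr)^2,
\qquad
\sigma(q) = \frac{1}{1 + 10^{-K q / 400}}
```

With positions that have not been filtered for quietness, q_i would come from a quiescence search; with a pre-filtered file such as quiet-labeled.epd it can be the static evaluation.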
Joost Buijs
Posts: 1564
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Speeding Up The Tuner

Post by Joost Buijs »

D Sceviour wrote: Sun Sep 06, 2020 1:35 pm I intend to fix all my piece values as per the Xiphos method which I posted here:

http://talkchess.com/forum3/viewtopic.p ... os#p778360
In fact you don't need piece values at all because they are always added to the PSQ tables. I keep the piece values separate only because I'm used to it, I adjust the PSQ tables in such a way that they have a sum of zero and add the offset from zero to the piece values, of course this is nonsense, maybe I have to throw this out.
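A minimal sketch of that normalisation, assuming a plain 64-entry integer table per piece; `normalize_psq` is a hypothetical name. It subtracts the table's integer mean from every entry and adds it to the piece value, so the combined score for each square stays the same.

```c
#include <assert.h>

enum { PSQ_SQUARES = 64 };

/* Shift a piece-square table so its entries sum to roughly zero and fold
   the removed average into the separate piece value. The total
   *piece_value + table[sq] is unchanged for every square. */
static void normalize_psq(int table[PSQ_SQUARES], int *piece_value) {
    assert(table != 0 && piece_value != 0);
    long sum = 0;
    for (int sq = 0; sq < PSQ_SQUARES; sq++) sum += table[sq];
    int offset = (int)(sum / PSQ_SQUARES);   /* integer mean, truncated toward zero */
    for (int sq = 0; sq < PSQ_SQUARES; sq++) table[sq] -= offset;
    *piece_value += offset;
}
```

Because the mean is truncated to an integer, the table sum ends up within 63 of zero rather than exactly zero.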
chrisw
Posts: 4319
Joined: Tue Apr 03, 2012 4:28 pm

Re: Speeding Up The Tuner

Post by chrisw »

Joost Buijs wrote: Sun Sep 06, 2020 5:12 pm
D Sceviour wrote: Sun Sep 06, 2020 1:35 pm I intend to fix all my piece values as per the Xiphos method which I posted here:

http://talkchess.com/forum3/viewtopic.p ... os#p778360
In fact you don't need piece values at all because they are always added to the PSQ tables. I keep the piece values separate only because I'm used to it, I adjust the PSQ tables in such a way that they have a sum of zero and add the offset from zero to the piece values, of course this is nonsense, maybe I have to throw this out.
After tuning, I force pawn-value-endgame to 256 and scale everything to that. I think the effect of trying to force all piece values to a set range is that the other terms will just adjust themselves to compensate, e.g. you might find all mobility values for the Queen raised, or the PSTs all with an offset.

What I found is that making some sort of change in the eval, a new term or whatever, immediately (during tuning) knocks the piece value terms, I guess because they’re the “easiest” values for the tuner to change. My countermeasure is to enforce piece values to stay what they were before tuning started for N epochs (5 as I guessed it) and then allow the values to float. That seemed to work in that instabilities in tuning were put a stop to, and the rationale is that some new eval term (say some king safety thing) really ought not to have a dramatic effect on relative piece values.
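Roughly, the two steps could look like the sketch below. It assumes the weights live in one flat array of doubles and that a flag array marks which entries are piece values; `rescale_weights`, `apply_step` and their parameters are hypothetical names, not the actual tuner code.

```c
#include <assert.h>
#include <stddef.h>

/* Rescale all weights so that the weight at pawn_eg_index becomes target (e.g. 256). */
static void rescale_weights(double *w, size_t n, size_t pawn_eg_index, double target) {
    assert(pawn_eg_index < n && w[pawn_eg_index] != 0.0);
    double factor = target / w[pawn_eg_index];
    for (size_t i = 0; i < n; i++) w[i] *= factor;
}

/* Apply one gradient step, leaving piece-value weights untouched
   while epoch < freeze_epochs; afterwards they float like any other weight. */
static void apply_step(double *w, const double *grad, const int *is_piece_value,
                       size_t n, int epoch, int freeze_epochs, double lr) {
    for (size_t i = 0; i < n; i++) {
        if (is_piece_value[i] && epoch < freeze_epochs) continue;
        w[i] -= lr * grad[i];
    }
}
```

With freeze_epochs set to 5, the piece values stay put for epochs 0 to 4 and are updated from epoch 5 onwards.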
D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: Speeding Up The Tuner

Post by D Sceviour »

Joost Buijs wrote: Sun Sep 06, 2020 5:12 pm
D Sceviour wrote: Sun Sep 06, 2020 1:35 pm I intend to fix all my piece values as per the Xiphos method which I posted here:

http://talkchess.com/forum3/viewtopic.p ... os#p778360
In fact you don't need piece values at all because they are always added to the PSQ tables. I keep the piece values separate only because I'm used to it, I adjust the PSQ tables in such a way that they have a sum of zero and add the offset from zero to the piece values, of course this is nonsense, maybe I have to throw this out.
The point of the Xiphos method is that there are no calculations for material imbalance (except for the Bishop pair).
D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: Speeding Up The Tuner

Post by D Sceviour »

chrisw wrote: Sun Sep 06, 2020 5:47 pm After tuning, I force pawn-value-endgame to 256 and scale everything to that. I think the effect of trying to force all piece values to a set range is that the other terms will just adjust themselves to compensate, e.g. you might find all mobility values for the Queen raised, or the PSTs all with an offset.

What I found is that making some sort of change in the eval, a new term or whatever, immediately (during tuning) knocks the piece value terms, I guess because they’re the “easiest” values for the tuner to change. My countermeasure is to enforce piece values to stay what they were before tuning started for N epochs (5 as I guessed it) and then allow the values to float. That seemed to work in that instabilities in tuning were put a stop to, and the rationale is that some new eval term (say some king safety thing) really ought not to have a dramatic effect on relative piece values.
I think this is what Andrew Grant refers to as non-linear tuning. I am only beginning to look at his latest research so I am not sure.
Joost Buijs
Posts: 1564
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Speeding Up The Tuner

Post by Joost Buijs »

D Sceviour wrote: Sun Sep 06, 2020 5:50 pm
Joost Buijs wrote: Sun Sep 06, 2020 5:12 pm
D Sceviour wrote: Sun Sep 06, 2020 1:35 pm I intend to fix all my piece values as per the Xiphos method which I posted here:

http://talkchess.com/forum3/viewtopic.p ... os#p778360
In fact you don't need piece values at all because they are always added to the PSQ tables. I keep the piece values separate only because I'm used to it, I adjust the PSQ tables in such a way that they have a sum of zero and add the offset from zero to the piece values, of course this is nonsense, maybe I have to throw this out.
The point of the Xiphos method is that there are no calculations for material imbalance (except for the Bishop pair).
My guess is that you always need some information about material imbalance. I use a very large table indexed by the (perfect) material hash which contains a scaling factor, significant combinations are tuned by logistic regression, most others remain at 100%. I could also put the whole material evaluation value into that table, but this would make it twice as large. Since I incrementally update the material evaluation it would not gain much if anything at all.

The bishop pair and opposite bishops are treated separately because I don't distinguish between light and dark square bishops.
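As an illustration of a perfect material index, here is a small sketch. The caps per side (8 pawns, 2 knights, 2 bishops, 2 rooks, 1 queen) are an assumption made to keep the example short, and the names `material_t`, `side_key`, `material_key` and `scale_percent` are hypothetical; a real table may well allow for promoted pieces and be much larger.

```c
#include <assert.h>

/* Assumed caps per side: 8 pawns, 2 knights, 2 bishops, 2 rooks, 1 queen.
   Positions with extra promoted pieces fall outside this small table. */
enum { SIDE_KEYS = 9 * 3 * 3 * 3 * 2, MATERIAL_KEYS = SIDE_KEYS * SIDE_KEYS };

typedef struct { int pawns, knights, bishops, rooks, queens; } material_t;

/* Scaling factor in percent for each material combination. */
static unsigned char scale_percent[MATERIAL_KEYS];

/* Untuned combinations stay at 100%. */
static void init_scale(void) {
    for (int i = 0; i < MATERIAL_KEYS; i++) scale_percent[i] = 100;
}

/* Mixed-radix index for one side, or -1 if a count is outside its cap. */
static int side_key(material_t m) {
    if (m.pawns < 0 || m.pawns > 8 || m.knights < 0 || m.knights > 2 ||
        m.bishops < 0 || m.bishops > 2 || m.rooks < 0 || m.rooks > 2 ||
        m.queens < 0 || m.queens > 1)
        return -1;
    return (((m.pawns * 3 + m.knights) * 3 + m.bishops) * 3 + m.rooks) * 2 + m.queens;
}

/* Collision-free index for both sides, or -1 if either side is out of range. */
static int material_key(material_t white, material_t black) {
    int w = side_key(white), b = side_key(black);
    if (w < 0 || b < 0) return -1;
    assert(w < SIDE_KEYS && b < SIDE_KEYS);
    return w * SIDE_KEYS + b;
}
```

Every in-range material combination maps to its own slot, so no verification key is needed on lookup; out-of-range positions would simply use the default 100%.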
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: Speeding Up The Tuner

Post by Alayan »

Joost Buijs wrote: Sun Sep 06, 2020 4:43 pm
D Sceviour wrote: Sun Sep 06, 2020 1:35 pm Most of my recent elo gains have not been from tuning, but from improved move ordering, mobility scoring and king safety. The purpose of re-opening the tuner was to create a larger PST[piece][sq][enemy king] table, and to see if it might give a strength gain. This PST method is being used in the NN approaches for evaluation.
I don't know what the Xiphos method is to fix piece values, I will take a look at it.

Tuning with logistic regression gave me an additional 120 Elo compared to my old hand-tuned evaluation function. My evaluation function is rather old; it dates back to 2013 and I have to rework it at some point.

PST[piece][sq][enemy king] looks interesting, it is a bit like the NNUE approach, I suppose it will take a long time to tune.
With 5x64x64 values, you get over 20K parameters. Some of these combinations are very frequent, but many others will occur in fewer than 1 in a million positions. This makes overfitting a massive issue. You need not just quality but also quantity in the dataset for it to work out: tens of millions of positions is a lower bound, and hundreds of millions may be where diminishing returns on dataset size kick in.

So, you can't use a dataset generation method that requires a lot of hardware time, it's unfeasible. You need fast dataset generation even at the cost of accuracy.
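One cheap way to see the coverage problem before tuning is to count how many training positions touch each parameter. The sketch below assumes a hit counter per parameter has already been filled while scanning the dataset; `count_sparse` and `KING_PST_PARAMS` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

enum { KING_PST_PARAMS = 5 * 64 * 64 };   /* 20480 weights */

/* Count the parameters that were touched by fewer than min_hits training positions. */
static size_t count_sparse(const unsigned *hits, size_t n, unsigned min_hits) {
    assert(hits != NULL);
    size_t sparse = 0;
    for (size_t i = 0; i < n; i++)
        if (hits[i] < min_hits) sparse++;
    return sparse;
}
```

A large sparse count for a given dataset suggests those entries will be fitted to a handful of games rather than to a general pattern.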