Speeding Up The Tuner

D Sceviour · Post by **D Sceviour** » Sun Sep 06, 2020 2:56 am

I noticed that 99% of the calculation time for stochastic gradient tuning was spent parsing fen strings, and building positional structures. Only 1% of the time was spent on evaluation. To speed up the tuner I saved a binary file of positional structures to disk. Then, the binary was recalled for tuning. Instead of 5 hours spent tuning a variable, it now takes less than one second.

Most engines have a unique position structure, and a "position_t" is often all that is necessary to evaluate a position. It takes several hours to create the binary file, but it only needs to be done once. Instead of:

if (fgets(line, 256, fin) == NULL)

which programs like Ethereal use to read an epd file, the tuner now reads the binary file with:

while (fread(pos, sizeof(position_t), 1 , f)) {

The binary file could be re-opened into another array for tuning, but disk read is so fast that it is not necessary. Besides, if threading is used then the disk binary file can be used to save memory. The "quiet-labeled.epd" was an old common file used for tuning. It creates a binary of 113,282 KB for the Schooner position_t structure. A link from someone would be appreciated if there are newer and better tuning epd's.
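The read-back side of this idea can be sketched as follows. This is a minimal sketch, not Schooner's code: the `position_t` layout, `write_positions`, `for_each_position` and `add_result` are all hypothetical names made up for illustration. Note that a raw struct dump like this is only valid for the same struct layout, compiler settings and byte order that wrote it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for an engine's position structure. */
typedef struct {
    uint64_t pieces[12];    /* one bitboard per piece type and colour */
    uint8_t  side_to_move;
    uint8_t  castling;
    uint16_t static_score;  /* reused to carry the game result: 0, 1 or 2 */
} position_t;

/* Write `count` records to `path`; returns the number of records written. */
size_t write_positions(const char *path, const position_t *src, size_t count) {
    assert(path != NULL && src != NULL);
    FILE *f = fopen(path, "wb");
    if (f == NULL) return 0;
    size_t written = fwrite(src, sizeof(position_t), count, f);
    fclose(f);
    return written;
}

/* Stream the records back one at a time and call `visit` on each.
   Returns the number of complete records read. */
size_t for_each_position(const char *path,
                         void (*visit)(const position_t *, void *),
                         void *ctx) {
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL) return 0;
    position_t pos;
    size_t n = 0;
    /* fread returns the number of complete items read, so compare with 1. */
    while (fread(&pos, sizeof(position_t), 1, f) == 1) {
        if (visit != NULL) visit(&pos, ctx);
        n++;
    }
    fclose(f);
    return n;
}

/* Example visitor: add the decoded game result to a double behind ctx. */
void add_result(const position_t *pos, void *ctx) {
    *(double *)ctx += pos->static_score / 2.0;
}
```

A tuner would replace `add_result` with a visitor that evaluates the position and accumulates the error.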

The game result is not normally saved in a position structure, so an otherwise unused variable is used to store it. Here is sample C code showing how to convert an epd file and build the binary file:

void CreateBinaryFile() {
FILE *f;
FILE *f2;
int n = 0;         // lines read from the epd file
int written = 0;   // records actually written to the binary file
double result;
position_t *pos = rsd->pos;

   strcpy(Tfilename,"quiet-labeled.epd");
   strcpy(Wfilename,"tune_array.bin");

// the epd file is only read, so plain read mode is enough
   f = fopen(Tfilename,"r");
   if (f == NULL) {
      printf("file not opened\n");
      return;
   }

// the binary file is only written here
   f2 = fopen(Wfilename,"wb");
   if (f2 == NULL) {
      printf("file f2 not opened\n");
      fclose(f);   // do not leak the input file
      return;
   }

   while (fgets(fen,128,f)) {
      n+=1;

// the parsing method here was taken from Crafty
      nargs = ReadParse(fen," ;=");

// args[5] holds the result, so a shorter line cannot be used
      if (nargs < 6) {
         printf("%d too few fields\n",n);
         break;
      }

// this loads the fen and builds the position structure in my program
      Command(rsd);

      if (!strcmp(args[5], "\"0-1\"")) {
         result = 0.0;
      } else if (!strcmp(args[5], "\"1-0\"")) {
         result = 1.0;
      } else if (!strcmp(args[5], "\"1/2-1/2\"")) {
         result = 0.5;
      } else {
         printf("%d bad result\n",n);
         printf("args[5] %s\n",args[5]);
         printf("%s\n", fen);
         break;
      }

// save the game result in an unused variable (0 = loss, 1 = draw, 2 = win)
      pos->static_score = (uint16_t) (result * 2);

// and save the converted fen to disk with a sequential write
      if (fwrite(pos, sizeof(position_t), 1, f2) != 1) {
         printf("%d write failed\n",n);
         break;
      }
      written+=1;
   }

   fclose(f); fclose(f2);
   printf("%d writes\n",written);
}
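The listing above stores `result * 2` in a `uint16_t`, so the three outcomes become 0, 1 and 2. A small sketch of that encoding and of the decoding a tuner would need when it reads the record back is shown here. The helper names `encode_result` and `decode_result` are made up for illustration and are not part of Schooner.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map a quoted EPD result token to 0 (Black wins), 1 (draw) or 2 (White wins).
   Returns -1 for any other token. */
int encode_result(const char *token) {
    assert(token != NULL);
    if (strcmp(token, "\"0-1\"") == 0) return 0;
    if (strcmp(token, "\"1/2-1/2\"") == 0) return 1;
    if (strcmp(token, "\"1-0\"") == 0) return 2;
    return -1;
}

/* Turn the stored integer back into the 0.0 / 0.5 / 1.0 tuning target. */
double decode_result(uint16_t stored) {
    return stored / 2.0;
}
```

Storing an integer avoids any rounding question when the value is cast into the 16-bit field.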

AndrewGrant · Post by **AndrewGrant** » Sun Sep 06, 2020 3:12 am

D Sceviour wrote: ↑Sun Sep 06, 2020 2:56 am I noticed that 99% of the calculation time for stochastic gradient tuning was spent parsing fen strings, and building positional structures. Only 1% of the time was spent on evaluation.

That strikes me as ... very wrong? Setup for tuning every param using ~34 million positions only takes me a few seconds. If you go for a few thousand epochs, that startup time pales in comparison. What exactly are you doing in that 99% block of time? Or are you just doing like 5 epochs and calling it a day?

D Sceviour · Post by **D Sceviour** » Sun Sep 06, 2020 3:17 am

AndrewGrant wrote: ↑Sun Sep 06, 2020 3:12 am That strikes me as ... very wrong? Setup for tuning every param using ~34 million positions only takes me a few seconds. If you go for a few thousand epochs, that startup time pales in comparison. What exactly are you doing in that 99% block of time? Or are you just doing like 5 epochs and calling it a day?

I am not sure I follow your "epoch" allusions. Perhaps Schooner's string functions are slow, but most programmers have commented that it takes "hours" to tune.

AndrewGrant · Post by **AndrewGrant** » Sun Sep 06, 2020 3:41 am

D Sceviour wrote: ↑Sun Sep 06, 2020 3:17 am I am not sure I follow your "epoch" allusions. Perhaps Schooner's string functions are slow, but most programmers have commented that it takes "hours" to tune.

You said "stochastic gradient tuning"; that implies epochs or iterations.
How many times do you iterate over the entire dataset and adjust the parameters based on the gradients?

D Sceviour · Post by **D Sceviour** » Sun Sep 06, 2020 3:57 am

AndrewGrant wrote: ↑Sun Sep 06, 2020 3:41 am You said "stochastic gradient tuning", that implies epochs or iterations.
How many times do you iterate over the entire dataset and adjust the parameters based on the gradients?

The question is a good one. Currently, the iterations continue for as long as the new Sigmoid error is less than the best known Sigmoid error, and stop once the situation is no longer "improved". With a grain size increment of one for example, it can take 20 iterations to move a single parameter from 75 to 95. I guess that is correct.

AndrewGrant · Post by **AndrewGrant** » Sun Sep 06, 2020 4:46 am

D Sceviour wrote: ↑Sun Sep 06, 2020 3:57 am With a grain size increment of one for example

So... not Gradient Descent, then? You are doing naive Texel tuning and hand adjusting values by +1 -1 searching for optima?

D Sceviour · Post by **D Sceviour** » Sun Sep 06, 2020 5:54 am

AndrewGrant wrote: ↑Sun Sep 06, 2020 4:46 am So... not Gradient Descent, then? You are doing naive Texel tuning and hand adjusting values by +1 -1 searching for optima?

The short answer is no, but your question is getting off topic. I have only recently resurrected tuning, having not looked at it for some time. If I recall, the method I am using is a hybrid between Texel's tuning and a method we discussed some time ago. The whole thing is being rewritten. The point I am trying to make here is that the tuning process can be sped up by pre-processing the fens.

AndrewGrant · Post by **AndrewGrant** » Sun Sep 06, 2020 6:13 am

D Sceviour wrote: ↑Sun Sep 06, 2020 5:54 am The short answer is no, but your question is getting off topic. I have only recently resurrected tuning, having not looked at it for some time. If I recall, the method I am using is a hybrid between Texel's tuning and a method we discussed some time ago. The whole thing is being rewritten. The point I am trying to make here is that the tuning process can be sped up by pre-processing the fens.

My point is that if you are doing actual Gradient Descent, you don't need to speed anything up.

Joost Buijs · Post by **Joost Buijs** » Sun Sep 06, 2020 9:29 am

I just read all the fens (about 4 million) at the start of the tuning process, convert them to binary positions and keep them in memory. This whole process takes a few seconds at most.
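As a rough idea of why the conversion itself is cheap, the piece-placement field of a FEN can be turned into a compact board in a single pass over the string. The sketch below is not Joost's code. The function name `parse_fen_board` and the 64-character board layout are assumptions for illustration, and a real engine would fill its own position structure and also handle the remaining FEN fields.

```c
#include <assert.h>
#include <string.h>

/* Parse the piece-placement field of a FEN into board[64], indexed
   a1 = 0 ... h8 = 63. Empty squares become '.', pieces keep their FEN
   letter. Returns 1 on success and 0 on malformed input. */
int parse_fen_board(const char *fen, char board[64]) {
    assert(fen != NULL && board != NULL);
    int rank = 7, file = 0;          /* FEN starts at a8 */
    memset(board, '.', 64);
    for (const char *p = fen; *p != '\0' && *p != ' '; p++) {
        if (*p == '/') {
            if (file != 8 || rank == 0) return 0;   /* short rank or too many ranks */
            rank--;
            file = 0;
        } else if (*p >= '1' && *p <= '8') {
            file += *p - '0';                        /* run of empty squares */
            if (file > 8) return 0;
        } else if (strchr("pnbrqkPNBRQK", *p) != NULL) {
            if (file >= 8) return 0;
            board[rank * 8 + file] = *p;
            file++;
        } else {
            return 0;                                /* unknown character */
        }
    }
    return rank == 0 && file == 8;                   /* all eight ranks complete */
}
```

An array of such boards, filled once at startup, can then be reused on every tuning pass.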

To determine the error I call quiescence instead of the evaluation function, although this is a bit time consuming. Tuning all the weights from zero, including the PSQTs, takes about 30 minutes on my 32-core machine. Only re-tuning them goes a lot faster, of course.

When I tune from zero I keep the pawn value for the opening phase fixed at 70 CP as an anchor to keep the weights in line with what I would like to see.

chrisw · Post by **chrisw** » Sun Sep 06, 2020 1:21 pm

D Sceviour wrote: ↑Sun Sep 06, 2020 2:56 am A link from someone would be appreciated if there are newer and better tuning epd's.

+1

If there's a public repository to keep them in, I can contribute:
42M evaluated EPDs from Ed Schroeder's Human database (distilled from LiChess, I think)
8.6M evaluated EPDs distilled from CCRL 40-40
13M evaluated EPDs distilled from CCRL-blitz

Evals are at 25 ms, SF11, one thread.

