Rounding

Gerd Isenberg · Post by **Gerd Isenberg** » Wed Jan 18, 2012 11:24 pm

lkaufman wrote:Any opinions?

A sum of rounded subterms is worse than a rounded sum of subterms, because the rounding errors may accumulate. I guess future eval will use some 16-bit and/or 32-bit float with heavy usage of SIMD dot-products.
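A minimal sketch of that accumulation effect, assuming a hypothetical eval made of ten subterms of 4 millipawns each (the values and the helper name `round_to_cp` are illustrative, not taken from any engine):

```python
# Hypothetical subterm values in millipawns (1/1000 pawn), chosen so that
# every subterm rounds the same way; not taken from any real engine.
subterms_mp = [4] * 10


def round_to_cp(mp):
    """Round millipawns to the nearest centipawn, halves away from zero."""
    sign = 1 if mp >= 0 else -1
    return sign * ((abs(mp) + 5) // 10)


# Rounding every subterm first: each 4 mp term becomes 0 cp.
sum_of_rounded = sum(round_to_cp(t) for t in subterms_mp)

# Rounding once at the end: 40 mp becomes 4 cp.
rounded_sum = round_to_cp(sum(subterms_mp))

print(sum_of_rounded, rounded_sum)
```

Here the per-term errors all point the same way, so rounding early loses the full 4 cp, while rounding once at the end loses nothing. With errors of mixed sign the gap is usually smaller.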

lkaufman · Post by **lkaufman** » Wed Jan 18, 2012 11:25 pm

rvida wrote:
lkaufman wrote:
rvida wrote:
lkaufman wrote:Others do not, namely (I believe) Ivanhoe, Critter, and (probably) Houdini.
Houdini uses 1/3200 pawn unit in eval, 1/200 in search.
Thanks, so that is presumably one of the changes between Houdini 1.0 and 1.5, since 1.0 was basically Ivanhoe. Since we all know that this was a big jump for Houdini, this strongly suggests that rounding is a good idea. Of course it's possible that this was one bad idea among other good ones, but this seems unlikely.
No, that change was between 1.5 and 2.0.
Houdini 1.5 used 1/200 pawn unit in both eval and search.

Well, this is interesting. It's pretty clear from the data that Houdini 2 is much stronger than 1.5 at bullet, slightly stronger at blitz, and not at all stronger (probably weaker) at longer games. But this change should help at long games if at all (more accurate eval should be more important with depth). Can you think of any other change between 1.5 and 2.0 that might have helped fast results while hurting slow results?

rvida · Post by **rvida** » Wed Jan 18, 2012 11:26 pm

lkaufman wrote:My opinion is that it does, but since both Critter and Ivanhoe don't use more refined eval than the search uses, their authors presumably hold a different opinion. If so I would like to know why they do it the way that they do. I guess there is no one here who can answer for Ivanhoe though.

I think the 1/256 granularity used in Critter is good enough. The 1/100 used in most engines might be a bit coarse for some terms (e.g. mobility, where you add N * number of squares, and N=1 is too small while N=2 is too big).
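To put numbers on the mobility case, here is a small sketch that assumes the "ideal" weight is 1.5 cp per square (an assumed value for illustration) and finds the closest integer weight at each granularity. The function names are hypothetical.

```python
from fractions import Fraction


def nearest_weight(target_pawns, units_per_pawn):
    """Integer weight closest to target_pawns at the given granularity."""
    return round(target_pawns * units_per_pawn)


def error_cp(target_pawns, units_per_pawn):
    """Per-square error, in centipawns, of the closest representable weight."""
    w = nearest_weight(target_pawns, units_per_pawn)
    return abs(Fraction(w, units_per_pawn) - target_pawns) * 100


# Assumed ideal mobility weight: 1.5 cp (0.015 pawn) per square.
target = Fraction(15, 1000)

for units in (100, 256):
    print(units, nearest_weight(target, units), float(error_cp(target, units)))
```

Under this assumption the 1/100 unit is off by half a centipawn per square whichever way it rounds, while the 1/256 unit is off by 1/16 of a centipawn per square.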

lkaufman · Post by **lkaufman** » Wed Jan 18, 2012 11:33 pm

Gerd Isenberg wrote:
lkaufman wrote:Any opinions?
A sum of rounded subterms is worse than a rounded sum of subterms, because the rounding errors may accumulate. I guess future eval will use some 16-bit and/or 32-bit float with heavy usage of SIMD dot-products.

That was also our thinking. I'm just wondering if we are missing some downside to finer-resolution eval, holding search resolution constant? It doesn't seem so, but I have to wonder why Ippo went to a scoring resolution as coarse as 1 cp when Rybka used 1 mp. Since Ippo was stronger than R3, I doubt that they made too many regressive steps, though to me this seems like one. I'm not trying to reopen the Rybka-Ippo debate, but I think there is at least consensus that the Ippo author(s) knew what was in R3.

marcelk · Post by **marcelk** » Wed Jan 18, 2012 11:33 pm

lkaufman wrote:
marcelk wrote:
lkaufman wrote: It is not at all simple to test. You have to write two distinct evaluation functions, one using millipawns (to be rounded to centipawns), the other using centipawns directly. That's a huge project.

Not sure what you mean there. The second is simple in case you have the first already. Rounding just adds discretization errors to eval (=bad) in return for fewer nodes spent on meaningless re-searches (=good).
You are talking about rounding a given eval function. My question is whether having more accurate, refined eval does any good if you round it off to the same ultimate resolution (before search) anyway.

Yes, if you do rounding, doing it once afterwards instead of at all subterms definitely is 'better', as you accumulate less discretization error. And the search benefit remains unchanged. Also, that should be easy to measure (or at least approximate), for example by rounding the weights that go into a higher-resolution eval. With more effort one can retune the weights and allow only multiples of a value. Whether 'better' translates into measurable Elo is another question and could depend on the program structure. E.g., one program might evaluate each square individually and need higher internal precision. Another program might first count squares with a certain property and then apply one weight. It can get away with less precision.
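The "round the weights" experiment could be sketched roughly as follows. This assumes a purely linear eval of the count-then-weight kind; the feature names, weights, and counts are hypothetical and only illustrate the mechanics.

```python
def coarsen(weights_mp, step_mp=10):
    """Round each weight to the nearest multiple of step_mp (halves away from zero)."""
    out = {}
    for name, w in weights_mp.items():
        sign = 1 if w >= 0 else -1
        out[name] = sign * ((abs(w) + step_mp // 2) // step_mp) * step_mp
    return out


def evaluate(features, weights_mp):
    """Linear eval in millipawns: sum of feature count times weight."""
    return sum(count * weights_mp[name] for name, count in features.items())


# Hypothetical millipawn weights and feature counts, for illustration only.
fine = {"mobility": 14, "passed_pawn": 253, "king_shelter": -36}
coarse = coarsen(fine)  # same weights restricted to whole centipawns
position = {"mobility": 9, "passed_pawn": 1, "king_shelter": 2}

print(evaluate(position, fine), evaluate(position, coarse))
```

Playing the `fine` build against the `coarse` build, with the search resolution held fixed, would then give a rough measure of what the extra eval precision is worth. Note that the error of a coarsened weight is multiplied by the feature count, which is why a count-then-weight term with a large count is the sensitive case.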

lkaufman · Post by **lkaufman** » Wed Jan 18, 2012 11:39 pm

marcelk wrote:
lkaufman wrote:
marcelk wrote:
lkaufman wrote: It is not at all simple to test. You have to write two distinct evaluation functions, one using millipawns (to be rounded to centipawns), the other using centipawns directly. That's a huge project.

Not sure what you mean there. The second is simple in case you have the first already. Rounding just adds discretization errors to eval (=bad) in return for fewer nodes spent on meaningless re-searches (=good).
You are talking about rounding a given eval function. My question is whether having more accurate, refined eval does any good if you round it off to the same ultimate resolution (before search) anyway.
Yes, if you do rounding, doing it once afterwards instead of at all subterms definitely is 'better', as you accumulate less discretization error. And the search benefit remains unchanged. Also, that should be easy to measure (or at least approximate), for example by rounding the weights that go into a higher-resolution eval. With more effort one can retune the weights and allow only multiples of a value. Whether 'better' translates into measurable Elo is another question and could depend on the program structure. E.g., one program might evaluate each square individually and need higher internal precision. Another program might first count squares with a certain property and then apply one weight. It can get away with less precision.

Okay, but why would a program like Ippo use an eval resolution as coarse as one cp? Surely there might be some terms that could use half-centipawn resolution, for example. So why not provide for this possibility? In other words, is there some benefit to avoiding rounding as Ippo does, or was this just foolish? Surely they could not know that they would never want to try eval terms that were not multiples of 1 cp?

lkaufman · Post by **lkaufman** » Wed Jan 18, 2012 11:43 pm

rvida wrote:
lkaufman wrote:My opinion is that it does, but since both Critter and Ivanhoe don't use more refined eval than the search uses, their authors presumably hold a different opinion. If so I would like to know why they do it the way that they do. I guess there is no one here who can answer for Ivanhoe though.
I think the 1/256 granularity used in Critter is good enough. The 1/100 used in most engines might be a bit coarse for some terms (e.g. mobility, where you add N * number of squares, and N=1 is too small while N=2 is too big).

Exactly! So would you care to hazard a guess as to why the Ippo author(s) used such coarse eval resolution? Also, you say "1/100 used in most engines" but as far as I know only the Ippos use such coarse resolution for eval. Who else did you have in mind?

rvida · Post by **rvida** » Wed Jan 18, 2012 11:59 pm

lkaufman wrote:
rvida wrote:
lkaufman wrote:My opinion is that it does, but since both Critter and Ivanhoe don't use more refined eval than the search uses, their authors presumably hold a different opinion. If so I would like to know why they do it the way that they do. I guess there is no one here who can answer for Ivanhoe though.
I think the 1/256 granularity used in Critter is good enough. The 1/100 used in most engines might be a bit coarse for some terms (e.g. mobility, where you add N * number of squares, and N=1 is too small while N=2 is too big).
Exactly! So would you care to hazard a guess as to why the Ippo author(s) used such coarse eval resolution?

Simplicity. Also because 1cp is the most natural unit to us humans.

lkaufman wrote: Also, you say "1/100 used in most engines" but as far as I know only the Ippos use such coarse resolution for eval. Who else did you have in mind?

Boot, Crafty, Delphi, Fruit, Gull just to name a few...

lkaufman · Post by **lkaufman** » Thu Jan 19, 2012 12:08 am

rvida wrote:
lkaufman wrote:
rvida wrote:
lkaufman wrote:My opinion is that it does, but since both Critter and Ivanhoe don't use more refined eval than the search uses, their authors presumably hold a different opinion. If so I would like to know why they do it the way that they do. I guess there is no one here who can answer for Ivanhoe though.
I think the 1/256 granularity used in Critter is good enough. The 1/100 used in most engines might be a bit coarse for some terms (e.g. mobility, where you add N * number of squares, and N=1 is too small while N=2 is too big).
Exactly! So would you care to hazard a guess as to why the Ippo author(s) used such coarse eval resolution?
Simplicity. Also because 1cp is the most natural unit to us humans.

lkaufman wrote: Also, you say "1/100 used in most engines" but as far as I know only the Ippos use such coarse resolution for eval. Who else did you have in mind?
Boot, Crafty, Delphi, Fruit, Gull just to name a few...

I'm surprised to see Crafty and Fruit in the list. Fruit, because Rybka is supposed to have used the Fruit eval; when I started work on Rybka (2.2) it used millipawn eval (or 1/1024 perhaps). Do you happen to know about Rybka 1? As for Crafty, I thought it does fairly fancy stuff with mobility, which would seem to be impossible with 1 cp resolution. Any comment?

kbhearn · Post by **kbhearn** » Thu Jan 19, 2012 1:58 am

They could probably get around such a term though. In the case of mobility, the reason finer resolution might make sense is that it's a large multiplication, but you could instead use a lookup array. That lets you tune mobility in a nonlinear fashion and still return an overall value that's not far off, even if a linear 1.5 cp ramp is the 'correct' value for the term. The possibility that you have enough terms off by half a centipawn in the same direction to accumulate enough total difference to change the root move seems like it should be rather rare.
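A rough sketch of the lookup-array idea, assuming the 'correct' term really is a linear 1.5 cp ramp and that a piece reaches at most 14 squares (both are assumptions for illustration; the names are hypothetical):

```python
MAX_SQUARES = 14  # hypothetical upper bound on reachable squares

# Entry n holds the assumed ideal 1.5 cp-per-square ramp rounded to a whole
# centipawn (halves rounded up), so rounding happens once per entry.
MOBILITY_CP = [(3 * n + 1) // 2 for n in range(MAX_SQUARES + 1)]


def mobility_table(n):
    """Table-based mobility score in whole centipawns."""
    return MOBILITY_CP[n]


def mobility_linear(n, weight_cp):
    """Linear mobility score with an integer centipawn weight."""
    return weight_cp * n


def worst_error(score):
    """Largest deviation from the assumed ideal ramp, in centipawns."""
    return max(abs(score(n) - 1.5 * n) for n in range(MAX_SQUARES + 1))


print(worst_error(mobility_table))
print(worst_error(lambda n: mobility_linear(n, 1)))
print(worst_error(lambda n: mobility_linear(n, 2)))
```

With the table the error never grows with the square count, whereas an integer weight of 1 or 2 drifts further from the ramp with every extra square. The table entries can also be tuned individually, which is the nonlinear option mentioned above.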
