Yes. Lots of games, but at 560 games at a time (70 nodes x 8 cores per node) it goes quickly. Very quickly...

Tom Likens wrote:
Yes, I agree. Even if the value eventually converges, as Evert saw, it still seems suspicious. At the very least it's a term that is producing conflicting game outcomes over a wide, disparate range of values, which means it's likely not defined well enough to be correct (or it could be an outright bug, as you mentioned). I think that information is useful as an indication that something "stinks" in the engine.

bob wrote:
You are into what I call a "red flag" area of tuning. It should raise a red flag anytime you have a term that doesn't converge to one value. That can mean lots of things, from an outright bug, where the term is really meaningless, to a term that is not written very well and produces the same sort of problem.
What you are seeing is EXACTLY what I saw when trying to tune Fruit's "history pruning threshold". I couldn't make heads or tails of this in Crafty and no matter what type of threshold in my code I used, it sort of produced random Elo changes. I tested Fruit and found the same, which convinced me it was a bogus way of triggering a reduction.
I more commonly see three types of results. When I test/tune, I always try to cover both sides of the optimal value so that I see a neat curve with a clear peak at some optimal V, which drops off on either side. That's a good one. But I also see some terms that drop off on the small-value side and rise to some point, after which the Elo remains stable no matter how much further the value is increased. Ditto for the other direction, where a term produces the same Elo from zero to N, then starts to drop off beyond N. Those are a bit harder to choose.
But whenever there is no clear peak, it deserves a LOT of attention to figure out why.
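To make the three shapes (and the red-flag case) concrete, here is a rough sketch of how one might sort a tuning run into them. This is only an illustration, not anything from Crafty: the function name, the labels, and the `tol` parameter (the Elo difference treated as noise, standing in for real error bars) are all assumptions.

```python
def classify_curve(points, tol=3.0):
    """Classify a tuning curve given (value, elo) pairs sorted by value.

    tol is a hypothetical noise threshold in Elo; real runs would use
    the error bars reported by the rating tool instead.
    """
    elos = [e for _, e in points]
    best = max(elos)
    near_best = [best - e <= tol for e in elos]
    if all(near_best):
        return "flat"  # the term makes no measurable difference

    first = near_best.index(True)
    last = len(near_best) - 1 - near_best[::-1].index(True)

    # The near-optimal points must form one contiguous block.
    if not all(near_best[first:last + 1]):
        return "no_clear_peak"

    # Elo should rise (within noise) up to the block and fall after it.
    left, right = elos[:first + 1], elos[last:]
    rising = all(b >= a - tol for a, b in zip(left, left[1:]))
    falling = all(b <= a + tol for a, b in zip(right, right[1:]))
    if not (rising and falling):
        return "no_clear_peak"

    if first == 0:
        return "flat_then_drops"   # same Elo from zero to N, then worse
    if last == len(elos) - 1:
        return "rises_then_flat"   # improves, then stays stable
    return "peak"                  # the neat curve with one optimum


print(classify_curve([(0, -20), (10, -5), (20, 0), (30, -6), (40, -18)]))
print(classify_curve([(0, 5), (10, -12), (20, 6), (30, -9), (40, 4)]))
```

The second call is the red-flag case: near-optimal results scattered among clearly worse ones, so no single value stands out.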
Just a quick question on what you wrote above: when you test a new parameter, is each of the points on the curves you describe a single 30k+ run on the cluster? I'm assuming "yes", but that's a *lot* of games. Also, what makes the last case so difficult (the zero to N case)? Is it because this term could be completely unnecessary (i.e. a value of 0 effectively removes it from the equation)?
regards,
--tom
The problem with either of the two odd cases is "what is the best value to use?" The largest value that gives the optimal Elo, or the smallest value that gives the optimal Elo? I prefer to err on the side of too-small values, as too-large values can have odd effects that are hard to predict (tossing pawns for positional "gain" and such...).
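As a rough sketch of that preference: among all tested values whose Elo is within noise of the best result, take the smallest. The function name and the `tol` threshold are assumptions for illustration, and the sketch assumes non-negative parameter values, so "smallest" also means "weakest effect".

```python
def pick_value(points, tol=3.0):
    """Return the smallest tested value whose Elo is within tol of the best.

    points: (value, elo) pairs. tol is a hypothetical noise threshold;
    in practice it would come from the error bars of the test runs.
    """
    best = max(e for _, e in points)
    candidates = [v for v, e in points if best - e <= tol]
    return min(candidates)


# Rises, then stays stable: 20, 30 and 40 are all about equally good.
plateau_high = [(0, -20), (10, -8), (20, 0), (30, 1), (40, -1)]
print(pick_value(plateau_high))  # 20
```

This only makes sense once the curve has one of the plateau shapes. If there is no clear peak at all, picking any value this way just hides the problem.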