bob wrote: A simple question here:  "what is wrong with you?"
There is effort involved in making the changes.  There is effort involved in verifying that the changes don't have bugs.  There is effort involved in analyzing how all of this affects the rest of the search.  So I am not arguing anything different at all.  I just added explanation as to several reasons why I don't want to do things to my program that won't be used in real games...
I can't imagine anyone missing the connection that "more bugs" equates to "more effort".  In addition to the effort to make the initial changes...
So developing and properly testing an engine is an effort. Therefore one should not do it. Strange logic... 
 
 
I don't now, have never, and will never care about trying to make the search deterministic so that when something strange happens in a game, I can reproduce that result exactly.  First, it isn't possible due to pondering and SMP issues, both of which affect hashing issues, and other search issues that rely on data that carries from move to move.  So why add code to make parts of the thing deterministic, when it is completely impossible to make the most important parts deterministic????
Talk about "surreal" this really is it...
Well, not all engines are SMP. I would even be so bold as to claim that virtually no engines are SMP.

But even if one had an SMP engine, what you say here merely tells you that it is a bad idea to test evaluation changes under SMP conditions, just as it is a bad idea to test eval or search changes under keep-hash conditions. Of course SMP has to be tested when you develop it, and retested now and then to make sure that you haven't inadvertently broken anything. But there is no need whatsoever to retest it for every tiny evaluation change you make.
Ever wonder where that idea came from?  You will find it in Crafty.  
Well, I never look at source code of other engines, so I cannot exclude that others do it too. It just seemed the obvious way to do it. Quibbling about priority does not seem of much relevance. The point was that it is a totally trivial change, one that I would not even bother to test separately.
Except I don't use it to "invalidate" anything whatsoever, which is a lousy idea.  I use it to prevent an old position from an earlier search from living in the hash table too long (because of its deep draft) wasting that entry with unnecessary data.  I've _never_ "cleared" the hash table in any program I have since Cray Blitz days.  The "id" flag was something I used prior to 1980 to avoid stomping thru the large hash tables I could create back then.  So I have never heard such a silly statement in my life since I already do that.  I am talking about code to _clear_ entries so that they won't be used to influence the next search or whatever.  And no I don't do that.  And no it would not be a couple of minutes worth of work.  And yes, it could introduce errors.  And yes the code would be worthless because the very idea is worthless...
Well, uMax uses replace-always, so there an entry cannot affect the search in any way. More advanced replacement algorithms typically use some aging field in the hash to prevent the accumulation of deep results that you describe above. Often they store something like Draft+SearchNr there, and use that quantity as the basis for replacement decisions. In such a design, it would be a trivial change to switch to using the SearchNr itself instead, as Uri does. The claim that it is an "effort" to implement and test this cannot be taken seriously. Those who consider such an option useful will happily make this 0.000001% effort.
The rest is merely your opinion, which has been discussed so much already that I won't comment on it.
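To make the Draft+SearchNr idea concrete, here is a minimal sketch in Python. All names in it (Entry, HashTable, search_nr, use_age_only) are hypothetical and are not taken from Crafty, uMax, or any other engine; it assumes a one-slot-per-index table and a simple "replace if priority is at least as high" rule, which is only one of several possible designs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    key: int        # full hash key of the position
    draft: int      # remaining depth when the entry was stored
    score: int
    search_nr: int  # number of the root search that stored it


class HashTable:
    """Hypothetical one-slot-per-index table with an aging field."""

    def __init__(self, size: int, use_age_only: bool = False) -> None:
        self.slots: List[Optional[Entry]] = [None] * size
        self.search_nr = 0
        self.use_age_only = use_age_only

    def new_search(self) -> None:
        # Call once per root search, i.e. once per move played.
        self.search_nr += 1

    def _priority(self, e: Entry) -> int:
        # Draft + SearchNr: each new search gives fresh entries a
        # one-ply head start over stale ones. With use_age_only,
        # only the search number counts.
        if self.use_age_only:
            return e.search_nr
        return e.draft + e.search_nr

    def store(self, key: int, draft: int, score: int) -> bool:
        new = Entry(key, draft, score, self.search_nr)
        i = key % len(self.slots)
        old = self.slots[i]
        if (old is None or old.key == key
                or self._priority(new) >= self._priority(old)):
            self.slots[i] = new
            return True
        return False  # the old entry was kept

    def probe(self, key: int) -> Optional[Entry]:
        e = self.slots[key % len(self.slots)]
        return e if e is not None and e.key == key else None
```

In this sketch, switching between the two policies is a one-line difference in `_priority`. With `use_age_only=True`, any entry stored in an earlier search loses the replacement contest against an entry from the current search, whatever its draft.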
Sorry, but that is _not_ why we build labs.  We build labs so that we can control everything that is controllable.  But physics labs don't control atomic motion any more than I do.  
Sure we do. This is why physicists measure cross sections of chemical or nuclear reactions with atomic or molecular beams of well-defined velocity, intersecting each other at a well-defined angle.
And if it is my goal to design a process that is going to work 10K meters beneath the surface of the ocean, I am not going to be doing my testing in a vacuum tank.  Yes I'd like to eliminate any outside influence that I can ignore, but no I don't want to eliminate any component that is going to be an inherent part of the environment that the thing has to work in...
You are putting way too much into what you _think_ I am saying, rather than addressing what I really _am_ saying.
Simply:
I am not interested in designing, writing or testing something that I am not going to use in real games.  It wastes too much time.  There are exceptions.  If I see a need for a debugging tool in my program, I add it.  Using conditional compilation.  And I deal with the testing and debugging because I feel that the resulting code will be worth the effort.  But for what has been discussed here, it would change the basic search environment significantly, so that I could no longer be sure I am testing the same algorithm that is used in real games.  Carrying or not carrying hash table entries from search to search can be a big deal.  When you ponder and start at previous depth -1 or you start at ply 1 can be a big deal.  And I certainly want to include those factors in my testing since they are an integral part of how crafty searches...  and they influence the branching factor, time to move, etc...
I don't think I missed any of that. It just doesn't sound convincing to me. Sure, the hash table has an impact on performance. But there should not be any interaction between the hashing and evaluation changes, so I would prefer to test them in isolation. If you empirically optimize the total package, you run the risk of compensating one wrong with another wrong, adding a flawed evaluation term or adopting a bad search strategy just because it makes your search more robust against hash errors.
I call your approach more alchemy than science, because of the testing requirements needed to make sure your "test probe" is very high-impedance.  
Well, if you want to make an electronics metaphor... That is only what I would do to check whether a given piece of equipment was working according to given specifications. When I was developing something new, or repairing something that was faulty, I always started by breaking all feedback loops and isolating all logical blocks from each other, and then made sure that each of them performed as I thought it should.
I can say with no reservation that good programs of the future are going to be non-deterministic from the get-go, because they _will_ be parallel.  Given that, this entire thread is moot...
Except, of course, that you won't have to test their eval with parallel search. There is absolutely nothing to be gained in doing that. If you have 8 cores, it is better to play 8 independent games, one on each of them, at 8 times slower time control. Or just 6 times slower, to reach the same depth, as SMP speedup will probably not be 100%. So you can play more games of the same quality with the same resources.
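The arithmetic behind that last claim can be sketched as follows. This is only a back-of-the-envelope model: the function name games_ratio is hypothetical, the speedup of 6 on 8 cores is the illustrative figure used above, and effects such as memory bandwidth contention between simultaneous games are ignored.

```python
def games_ratio(cores: int, smp_speedup: float) -> float:
    """Games finished by single-threaded testing per SMP game.

    One SMP game at time control T occupies all cores for T.
    A single-threaded game needs smp_speedup * T to reach the same
    depth, but `cores` such games can run side by side. In a wall
    time of smp_speedup * T the machine therefore finishes `cores`
    single-threaded games versus smp_speedup SMP games.
    """
    if cores < 1 or not 0 < smp_speedup <= cores:
        raise ValueError("expected 0 < smp_speedup <= cores")
    return cores / smp_speedup


if __name__ == "__main__":
    # 8 cores with an assumed SMP speedup of 6
    print(games_ratio(8, 6.0))
    # perfect speedup: no advantage either way
    print(games_ratio(8, 8.0))
```

Under these assumptions, 8 cores with a speedup of 6 yield 8/6 of the games, about a third more, at the same search depth. Only with a perfect speedup of 8 would the two approaches break even.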