An objective test process for the rest of us?

Discussion of chess software programming and technical issues.

Moderator: Ras

hgm
Posts: 28354
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: An objective test process for the rest of us?

Post by hgm »

bob wrote:You overlook _one_ important detail. Lose X elo due to clearing the hash, lose Y elo to do a deterministic timing algorithm. Lose Z elo by clearing killer moves. Before long you have lost a _bunch_. A 10 elo improvement is not exactly something one can throw out if he hopes to make progress...
And after finding and removing 10 bugs, you flip a switch, and tataa...!

Suddenly all the lost Elo points are back!
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: An objective test process for the rest of us?

Post by bob »

hgm wrote:
bob wrote:You overlook _one_ important detail. Lose X elo due to clearing the hash, lose Y elo to do a deterministic timing algorithm. Lose Z elo by clearing killer moves. Before long you have lost a _bunch_. A 10 elo improvement is not exactly something one can throw out if he hopes to make progress...
And after finding and removing 10 bugs, you flip a switch, and tataa...!

Suddenly all the lost Elo points are back!
Not if you don't have the code to do any of those things effectively...
hgm
Posts: 28354
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: An objective test process for the rest of us?

Post by hgm »

#ifdef and #endif can work wonders...
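For example (an illustrative sketch, not any actual engine's code; the name `time_is_up` and the node budget and time budget values are invented), a single compile-time switch can swap the clock-based stop decision for a fixed node budget, which makes every run of the same position search exactly the same tree:

```c
#define NODE_BUDGET 1000000L  /* hypothetical fixed budget per move */

#ifdef DETERMINISTIC
/* Deterministic build: the decision depends only on the node count. */
int time_is_up(long nodes, double elapsed) {
    (void)elapsed;                 /* wall clock deliberately ignored */
    return nodes >= NODE_BUDGET;
}
#else
/* Normal build: the decision depends on the wall clock. */
int time_is_up(long nodes, double elapsed) {
    (void)nodes;
    return elapsed >= 2.5;         /* seconds allotted for this move */
}
#endif
```

The search itself calls `time_is_up()` in one place and never needs to know which build it is running in.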
hristo

Re: An objective test process for the rest of us?

Post by hristo »

hgm wrote:#ifdef and #endif can work wonders...
...and nightmares ... ;-)

It is all the same in the end, since we rewrite the software from scratch.

Regards,
Hristo
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: An objective test process for the rest of us?

Post by bob »

hgm wrote:#ifdef and #endif can work wonders...
With introducing bugs, I would agree. It is not a simple #ifdef test in my code to make the time decisions deterministic... it would require changes in several places. And that kind of testing would hide one type of error I could have, because that code would never get executed during testing, only in games...
hgm
Posts: 28354
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: An objective test process for the rest of us?

Post by hgm »

This discussion is getting really surrealistic. What point do you want to make? You want to convince the world that it is too difficult for you to implement a simple compiler switch between two types of time management without errors?

OK, you convinced me. You haven't a hope in hell to get this ever working. :lol: :lol: :lol:

In Joker, however, before thinking on a move, I calculate 3 time limits, based on the time left on the clock and the number of moves to do in that time. These limits give the time up to which I can still start new iterations, the time after which I will not start the search of new moves in the root if the score is not much worse than that of the previous iteration, and the time at which I abort the search unconditionally. And indeed, in many places in the code there are tests on these time limits.

So what do I do to switch to the 'finish-iterations' time management? I simply set the second and third time limit to such a large value that they cannot be exceeded. Like:

Code:

// Calculate time limits
double TimePerMove = TimeLeft/(MovesLeft+4); // or whatever
double TimeLimit1 = 0.6*TimePerMove; // no new iterations start after this
double TimeLimit2 = 2.0*TimePerMove; // no new root moves can be searched after this
double TimeLimit3 = 4.0*TimePerMove; // search in progress will be aborted after this
#ifdef DETERMINISTIC
TimeLimit2 = TimeLimit3 = 1e9; // effectively infinite: every iteration finishes
#endif
Only one #ifdef in a bug-free place.

You see, if something is difficult or not depends not so much on what you want to do, but more on who the programmer is...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: An objective test process for the rest of us?

Post by bob »

hgm wrote:This discussion is getting really surrealistic. What point do you want to make? You want to convince the world that it is too difficult for you to implement a simple compiler switch between two types of time management without errors?
No. I want to convince the world that adding code, which only makes my engine behave _differently_ than it would in normal games, and which would only be used in a specific type of test methodology, has the potential to introduce errors that also have to be debugged. And it causes me to test the program in a different mode than in the way it normally runs.

I find questionable timing decisions every now and then. I don't want to remove those or I won't see them. I don't want to test program A and then use program B in a real game. That means I don't want to stop carrying trans/ref entries from iteration to iteration and search to search, I don't want to clear killers and such after each search. I don't want to complete iterations before I stop. The list goes on and on. It isn't just _one_ change, if you read Uri's post. It represents a pretty significant departure. And it turns off things that I think belong in the test cycle. And adds stuff that should not.

So it isn't about "difficulty". I've already written the 50,000+ line program, it is about thorough testing rather than testing things in isolation in a mode that I would never use in real games...

What's so surreal about that?


OK, you convinced me. You haven't a hope in hell to get this ever working. :lol: :lol: :lol:

In Joker, however, before thinking on a move, I calculate 3 time limits, based on the time left on the clock and the number of moves to do in that time. These limits give the time up to which I can still start new iterations, the time after which I will not start the search of new moves in the root if the score is not much worse than that of the previous iteration, and the time at which I abort the search unconditionally. And indeed, in many places in the code there are tests on these time limits.
So what? Do it in Joker. Mine's a bit more complex than that. And we were not _just_ talking about timing, but about clearing the hash, killers, history counters, and I assume pawn hash, since that is "hash"... I did timing like that a _long_ time ago. Everyone did. But by the late 70's we had "gotten over it". The branching factor is not constant from iteration to iteration, so you occasionally make mistakes there that waste time. And sometimes you miss going to the next iteration and failing low quickly... Nothing wrong with doing it the simple and easy way. I just don't do it like that, and consider it enough of a "step back" that I don't plan on adding it for testing only, along with all the other things Uri mentioned... I believe that the successive positions played in a chess game are not independent at all, that there should be some sort of "flow" between them. I use that approach to make my search more efficient/accurate by maintaining all information between searches until it slowly fades away due to disuse...



So what do I do to switch to the 'finish-iterations' time management? I simply set the second and third time limit to such a large value that they cannot be exceeded. Like:

Code:

// Calculate time limits
double TimePerMove = TimeLeft/(MovesLeft+4); // or whatever
double TimeLimit1 = 0.6*TimePerMove; // no new iterations start after this
double TimeLimit2 = 2.0*TimePerMove; // no new root moves can be searched after this
double TimeLimit3 = 4.0*TimePerMove; // search in progress will be aborted after this
#ifdef DETERMINISTIC
TimeLimit2 = TimeLimit3 = 1e9; // effectively infinite: every iteration finishes
#endif
Only one #ifdef in a bug-free place.

You see, if something is difficult or not depends not so much on what you want to do, but more on who the programmer is...
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: An objective test process for the rest of us?

Post by jwes »

bob wrote:
hgm wrote:
bob wrote:You overlook _one_ important detail. Lose X elo due to clearing the hash, lose Y elo to do a deterministic timing algorithm. Lose Z elo by clearing killer moves. Before long you have lost a _bunch_. A 10 elo improvement is not exactly something one can throw out if he hopes to make progress...
And after finding and removing 10 bugs, you flip a switch, and tataa...!

Suddenly all the lost Elo points are back!
Not if you don't have the code to do any of those things effectively...
The point is that by deferring the implementation of full hash tables, he makes all his bugs deterministic, which means that any time his program plays a bad move in a test, he can easily recreate it while debugging.
hgm
Posts: 28354
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: An objective test process for the rest of us?

Post by hgm »

bob wrote: So it isn't about "difficulty". I've already written the 50,000+ line program, it is about thorough testing rather than testing things in isolation in a mode that I would never use in real games...
So you are singing quite a different song now. Well, it wasn't me who introduced fear of bugs into this discussion. But now it is simply too much work...

Your other remarks sound rather strange, as we concluded before that Joker's and Crafty's time management were nearly identical. So "I can't make any chocolate of this" (that is, I can't make sense of it), as we would say in Dutch.

Incrementing the HashKey(s) before each search to invalidate the hash also seems neither bug-prone nor likely to take more than 2 minutes to code... To be frank, I have hardly ever heard such silly excuses in my life.
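The idea can be sketched like this (illustrative code only, not Joker's or Crafty's actual implementation; the table size, entry layout, and function names are invented). Each entry carries the ID of the search that stored it; bumping the ID before a new search makes every old entry miss, which is logically equivalent to clearing the table at essentially zero cost:

```c
#include <stdint.h>

#define TABLE_SIZE 4096          /* hypothetical; real tables are far larger */

typedef struct {
    uint64_t key;       /* Zobrist key of the position */
    int      score;
    int      searchId;  /* which search stored this entry */
} HashEntry;

static HashEntry table[TABLE_SIZE];
static int currentSearch = 0;

/* Bumping the ID "invalidates" every existing entry in O(1). */
void new_search(void) { currentSearch++; }

void store(uint64_t key, int score) {
    HashEntry *e = &table[key % TABLE_SIZE];
    e->key = key; e->score = score; e->searchId = currentSearch;
}

/* Returns 1 and fills *score only on a hit from the current search. */
int probe(uint64_t key, int *score) {
    HashEntry *e = &table[key % TABLE_SIZE];
    if (e->key == key && e->searchId == currentSearch) {
        *score = e->score;
        return 1;
    }
    return 0;
}
```

Entries from earlier searches are never touched; they simply fail the ID comparison and get overwritten on the next store to their slot.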

That you want to test your engine under "real life" conditions, rather than under controlled conditions, is... Well, let me say it is something that scientists abandoned about 300 years ago. It is the reason that we build laboratories: to shield our objects of investigation as much as we can from "real life", take them apart if we can, and test the individual components in isolation, to avoid having to study the more complex behavior of the total system. And when we know the components and their behavior under all conceivable circumstances thoroughly, we put them together in a way that we can then be sure will produce the desired effect when our constructions leave the lab. As the correct behaviour would be a consequence of understanding the system and the interaction of the components, we can predict how it would behave under _any_ circumstances, rather than just the ones we happened to try out.

That is science. The other thing used to be known as alchemy. But if you think it is a good idea to turn back the clock 300 years, well, suit yourself. I wouldn't advise it to anyone, that's all.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: An objective test process for the rest of us?

Post by bob »

hgm wrote:
bob wrote: So it isn't about "difficulty". I've already written the 50,000+ line program, it is about thorough testing rather than testing things in isolation in a mode that I would never use in real games...
So you are singing quite a different song now. Well, it wasn't me who introduced fear of bugs into this discussion. But now it is simply too much work...
A simple question here: "what is wrong with you?"

There is effort involved in making the changes. There is effort involved in verifying that the changes don't have bugs. There is effort involved in analyzing how all of this affects the rest of the search. So I am not arguing anything different at all. I just added explanation as to several reasons why I don't want to do things to my program that won't be used in real games...

I can't imagine anyone missing the connection that "more bugs" equates to "more effort". In addition to the effort to make the initial changes...

I don't now, never have, and never will care about trying to make the search deterministic so that when something strange happens in a game, I can reproduce that result exactly. First, it isn't possible, due to pondering and SMP issues, both of which affect hashing and other search behavior that relies on data carried from move to move. So why add code to make parts of the thing deterministic, when it is completely impossible to make the most important parts deterministic?

Talk about "surreal" this really is it...


Your other remarks sound rather strange, as we concluded before that Joker's and Crafty's time management were nearly identical. So "I can't make any chocolate of this", as we would say in Dutch.
Sorry, but we have jumped back and forth enough between your two programs that I don't pretend to keep up with who does what... I've explained what I do, which is non-deterministic with the PC's inaccurate real-time clock. Not to mention the effect my opponent has when I ponder, and then the effect caused by SMP searching.



Incrementing the HashKey(s) before each search to invalidate the hash also seems neither bug-prone nor likely to take more than 2 minutes to code... To be frank, I have hardly ever heard such silly excuses in my life.
Ever wonder where that idea came from? You will find it in Crafty. Except I don't use it to "invalidate" anything whatsoever, which is a lousy idea. I use it to prevent an old position from an earlier search from living in the hash table too long (because of its deep draft), wasting that entry with unnecessary data. I've _never_ "cleared" the hash table in any program since Cray Blitz days. The "id" flag was something I used prior to 1980 to avoid stomping through the large hash tables I could create back then. So I have never heard such a silly statement in my life, since I already do that. I am talking about code to _clear_ entries so that they won't be used to influence the next search, or whatever. And no, I don't do that. And no, it would not be a couple of minutes' worth of work. And yes, it could introduce errors. And yes, the code would be worthless, because the very idea is worthless...
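A sketch of the distinction being drawn here (hypothetical code, not Crafty's actual table; names and sizes are invented): old entries remain probeable, so information carries over between searches, but the age field marks them as preferred victims for replacement, so a deep draft from an old search cannot occupy a slot forever:

```c
#include <stdint.h>

#define TABLE_SIZE 4096          /* hypothetical; real tables are far larger */

typedef struct {
    uint64_t key;
    int      score;
    int      depth;   /* draft of the stored search result */
    int      age;     /* search number that stored or last used it */
} HashEntry;

static HashEntry table[TABLE_SIZE];
static int searchAge = 0;

/* Entries are NOT cleared between searches; only the age counter moves. */
void begin_search(void) { searchAge++; }

void store(uint64_t key, int score, int depth) {
    HashEntry *e = &table[key % TABLE_SIZE];
    /* Stale entries (old age) are always replaceable; fresh ones only
       yield to an equal or deeper draft. */
    if (e->age != searchAge || depth >= e->depth) {
        e->key = key; e->score = score; e->depth = depth; e->age = searchAge;
    }
}

int probe(uint64_t key, int *score) {
    HashEntry *e = &table[key % TABLE_SIZE];
    if (e->key == key) {          /* entries from earlier searches still hit */
        e->age = searchAge;       /* a hit refreshes the age: info in use stays */
        *score = e->score;
        return 1;
    }
    return 0;
}
```

The effect is the "slow fade" described above: carried-over information keeps helping as long as it is used, and quietly gets recycled once it isn't.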



That you want to test your engine under "real life" conditions, rather than under controlled conditions is... Well, let me say it is something that scientists abandoned about 300 years ago. It is the reason that we build laboratories, to shield our objects of investigation as much as we can from "real life", take them apart if we can, and test the individual components in isolation to avoid having to study the more complex behavior of the total system.
Sorry, but that is _not_ why we build labs. We build labs so that we can control everything that is controllable. But physics labs don't control atomic motion any more than I do. And if my goal is to design a process that is going to work 10K meters beneath the surface of the ocean, I am not going to do my testing in a vacuum tank. Yes, I'd like to eliminate any outside influence that I can ignore, but no, I don't want to eliminate any component that is going to be an inherent part of the environment the thing has to work in...

You are putting way too much into what you _think_ I am saying, rather than addressing what I really _am_ saying.

Simply:

I am not interested in designing, writing or testing something that I am not going to use in real games. It wastes too much time. There are exceptions: if I see a need for a debugging tool in my program, I add it, using conditional compilation, and I deal with the testing and debugging because I feel the resulting code will be worth the effort. But what has been discussed here would change the basic search environment significantly, so that I could no longer be sure I am testing the same algorithm that is used in real games. Carrying or not carrying hash table entries from search to search can be a big deal. Whether, when pondering, you start at the previous depth minus one or back at ply 1 can be a big deal. And I certainly want to include those factors in my testing, since they are an integral part of how Crafty searches... and they influence the branching factor, time to move, etc...


And when we know the components and their behavior under all conceivable circumstances thoroughly, we put them together in a way that we can then be sure will produce the desired effect when our constructions leave the lab. As the correct behaviour would be a consequence of understanding the system and the interaction of the components, we can predict how it would behave under _any_ circumstances, rather than just the ones we happened to try out.

That is science. The other thing used to be known as alchemy. But if you think it is a good idea to turn back the clock 300 years, well, suit yourself. I wouldn't advise it to anyone, that's all.
I call your approach more alchemy than science, because of the testing requirements needed to make sure your "test probe" has a very high impedance. I can say without reservation that the good programs of the future are going to be non-deterministic from the get-go, because they _will_ be parallel. Given that, this entire thread is moot...