A Debugging Strategy

Jan Brouwer · Post by **Jan Brouwer** » Thu Sep 17, 2009 7:24 pm

Fguy64 wrote: OK, one thing my program does not have yet is proper diagnostics output, PV or perft or whatever you want to call it. But I do know what level-1 node is being searched when something weird happens. So, if my alpha-beta is working properly, the following should work (I think).

As I see it, if I know what level-1 move is being searched when the program craps out, then I should be able to make that move manually, reduce the ply by 1, restart the engine, and the engine should crap out in the exact same position.

Agreed?

A trick that I use, which typically works quite quickly, is to temporarily add a global counter to the piece of code where the problem is.
The first run I use to determine the counter value when the problem occurs (e.g. an assert that fails).
I then add an if (counter == ...) statement with a value slightly lower than the target counter value, and break there in the debugger.
That way you can go "backward in time" and step through the code just before the problem occurs.
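
A minimal sketch of the two runs described above, in C++. Everything here is made up for illustration: `suspect_code`, `run_until_failure`, the `g_` variables and the bug on call 1000 are hypothetical stand-ins, not code from any engine. The first run only reports the counter value at the failure; the second run stops a few counts earlier.

```cpp
#include <cassert>
#include <cstdio>

static long g_counter = 0;    // temporary global counter; remove after the fix
static long g_break_at = 0;   // 0 = first run (no stop); otherwise the count to stop at
static long g_hook_hits = 0;  // how many times the breakpoint line was reached

// Hypothetical stand-in for the suspect code: the state goes bad on call 1000.
static bool suspect_code(int& state) {
    ++g_counter;
    if (g_counter == g_break_at) {
        ++g_hook_hits;                                    // place the debugger breakpoint here
        std::printf("stopping at counter %ld\n", g_counter);
    }
    state = (g_counter == 1000) ? -1 : state + 1;         // simulated bug
    return state >= 0;                                    // plays the role of the assert
}

// Runs the workload from scratch and returns the counter value at the first
// failure, or -1 if nothing failed within max_calls.
long run_until_failure(long break_at, long max_calls) {
    g_counter = 0;
    g_break_at = break_at;
    g_hook_hits = 0;
    int state = 0;
    for (long i = 0; i < max_calls; ++i) {
        if (!suspect_code(state)) return g_counter;
    }
    return -1;
}
```

Calling `run_until_failure(0, 5000)` plays the first run; feeding its result minus a small margin back in as `break_at` plays the second. This only works if both runs are deterministic, so that the counter reaches the same value at the same point each time.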

Btw, an intriguing concept is "reversible debugging" where after hitting a break-point, the debugger is able to step backwards in time.
I have never had the opportunity to try it, but it sounds promising.

Fguy64 · Post by **Fguy64** » Thu Sep 17, 2009 7:44 pm

That's very interesting. A global counter. I'll watch out for opportunities to try that.

I know often I find myself stepping forward in time to find the manifestation of a bug, then stepping backwards to find the source. I think that is what you are talking about, and yeah intuitively it makes sense to have some kind of a tracking variable to add some rigor to the process.

xsadar · Post by **xsadar** » Thu Sep 17, 2009 8:09 pm

Hey, that counter trick sounds great! For most purposes the node counter would work. Seems so obvious that it kind of makes me wonder why I never thought of it. But I guess most bugs I get don't lead to crashes, only to wrong values.

michiguel · Post by **michiguel** » Thu Sep 17, 2009 8:20 pm

I use the node counter and it has been very helpful. The other thing I do is to dump the tree, limited by nodes. For instance, if I catch the crash at node 1,000,000, I dump the tree as long as the node counter is > 950,000.
Then I see exactly what branches have been traversed. Other flags trigger print statements that tell me where I am in the different functions.
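
A rough sketch of a node-limited tree dump, assuming a toy search over a uniform tree rather than a real engine. The names `g_nodes`, `g_dump_from`, `g_dump` and `search` are hypothetical, and a real program would more likely write to a log file than collect strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

static long g_nodes = 0;                  // node counter, incremented once per visited node
static long g_dump_from = 0;              // dump only nodes whose number exceeds this
static std::vector<std::string> g_dump;   // collected dump lines

// Toy search over a uniform tree with `branching` children per node; the
// "move" is just the child index (-1 at the root).
void search(int depth, int ply, int move, int branching) {
    ++g_nodes;
    if (g_nodes > g_dump_from) {
        // Indent by ply so the traversed branches are visible in the dump.
        g_dump.push_back(std::string(2 * ply, ' ') + "node " + std::to_string(g_nodes) +
                         " ply " + std::to_string(ply) + " move " + std::to_string(move));
    }
    if (depth == 0) return;
    for (int m = 0; m < branching; ++m) {
        search(depth - 1, ply + 1, m, branching);
    }
}
```

With `g_dump_from` set just below the node count at which the crash was caught, the dump stays small but still shows the last branches visited before the failure.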

Miguel

michiguel · Post by **michiguel** » Thu Sep 17, 2009 8:26 pm

Fguy64 wrote:
hgm wrote:
Fguy64 wrote: been there. Anyways I don't use an IDE either, so I suppose I am doubly handicapping myself. I'm no genius, nor am I efficient. It's been a long time coming. But I learned long ago to keep changes small. Write a few lines, recompile, review, test. Then repeat. And make lots of backups.
For me putting in print statements in strategic places is still the most powerful way to debug. In a recursive program like the search of a chess engine you must be able to limit the printout to nodes that you are really interested in, though, but with the if(PATH) method that works quite well. It is a bit of a pain when an error occurs 20 ply deep in the tree, to zoom in on it. But especially once you have implemented hashing it is the only reliable way I found to make errors reproducible.
Cool, well I know there is room for major improvements in my diagnostics, and your idea seems good to me.

Even in my "zeroing in" strategy, which often tells me more or less "where" in the code the problem manifests itself, I will very often add a print statement that tells me "what" the problem is.

As a hobbyist, I tend not to bother with a lot of tools like sophisticated debugging and performance evaluation programs, or even IDEs, honestly to me they seem more trouble than they're worth, but I suspect that if I were to seriously consider programming as a profession, then I'd probably have to "get with the program".

I use printf statements, grep, and the simplest editor you find in GNOME, which is gedit. I use gdb to find where the heck the program crashed, but the print statements I have all over the code are more useful. In Windows I used to use an MS IDE and later Dev-C++. I do not think I was more productive with those.
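
The if(PATH) method mentioned in the quote is not spelled out in this thread. One possible reading, purely as a sketch: keep the moves of the current line on a stack and print only when that line begins with a chosen sequence. The names `g_line`, `g_target`, `on_path` and `path_search` are hypothetical, and the tree is a toy one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

static std::vector<int> g_line;                    // moves from the root to the current node
static const std::vector<int> g_target = {1, 0};   // hypothetical line of interest
static int g_prints = 0;                           // number of diagnostic lines printed

// True when the current line begins with the target line.
static bool on_path() {
    if (g_line.size() < g_target.size()) return false;
    for (std::size_t i = 0; i < g_target.size(); ++i) {
        if (g_line[i] != g_target[i]) return false;
    }
    return true;
}
#define PATH on_path()

// Toy search over a uniform tree; the "move" is just the child index.
void path_search(int depth, int branching) {
    if (PATH) {
        std::printf("ply %zu, depth left %d\n", g_line.size(), depth);
        ++g_prints;
    }
    if (depth == 0) return;
    for (int m = 0; m < branching; ++m) {
        g_line.push_back(m);
        path_search(depth - 1, branching);
        g_line.pop_back();
    }
}
```

Lengthening `g_target` one move at a time is one way to zoom in on an error deep in the tree without drowning in output from unrelated branches.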

Miguel

Jan Brouwer · Post by **Jan Brouwer** » Thu Sep 17, 2009 10:03 pm

xsadar wrote:
Hey, that counter trick sounds great! For most purposes the node counter would work. Seems so obvious that it kind of makes me wonder why I never thought of it. But I guess most bugs I get don't lead to crashes, only to wrong values.

Sure, you could use an existing counter.
But note that in C, it is very easy to add a temporary counter, which you remove again when the problem is fixed:

Code:

    assert(board.is_valid()); // ok

    // REMOVE
    {
      static int xxx; // don't forget the "static"
      if (++xxx == 123) {
        board.print(); // place breakpoint here
      }
    }
    board.undo();
    assert(board.is_valid()); // fails, something wrong with undo()?

xsadar · Post by **xsadar** » Fri Sep 18, 2009 12:14 am

Yes, I understood that from your original suggestion. I only referred to the node counter because that's what should have made this idea obvious to me.
