inconsistent performance

Discussion of chess software programming and technical issues.

Moderator: Ras

Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: inconsistent performance

Post by Desperado »

Sorry, but I am missing something in this thread (perhaps I overlooked it...).

What kind of speed test are you doing: a "perft", or a search with all its features?

If you don't have the same problem on a simple perft (no TT or other "advanced" stuff, just move, move back, count nodes), then go bug hunting.
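For reference, the kind of bare perft described here (move, move back, count nodes) might look like the sketch below. It is only an illustration: `ToyPosition`, `make_move`, `unmake_move` and the nine-cell board are hypothetical stand-ins, not anyone's engine code, and the toy game ignores wins so that the block stays self-contained. A real perft would call the engine's own move generator instead.

```c
#include <stdint.h>

/* Toy stand-in for a chess position: nine cells, one bit per occupied cell.
   Wins are deliberately ignored; only the make/unmake/count shape matters. */
typedef struct {
    uint16_t occupied;
} ToyPosition;

static void make_move(ToyPosition *pos, int cell)
{
    pos->occupied |= (uint16_t)(1u << cell);
}

static void unmake_move(ToyPosition *pos, int cell)
{
    pos->occupied &= (uint16_t)~(1u << cell);
}

/* Count the leaf nodes exactly `depth` plies below `pos`:
   make each legal move, recurse, then take the move back. */
uint64_t perft(ToyPosition *pos, int depth)
{
    if (depth == 0)
        return 1;

    uint64_t nodes = 0;
    for (int cell = 0; cell < 9; cell++) {
        if (pos->occupied & (1u << cell))
            continue;                 /* cell already taken: not a legal move */
        make_move(pos, cell);
        nodes += perft(pos, depth - 1);
        unmake_move(pos, cell);
    }
    return nodes;
}
```

From the empty toy board, `perft` returns 9 at depth 1 and 504 at depth 3. Timing a fixed depth several times gives a baseline with no TT, evaluation or move ordering involved.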

I don't think there are hardware problems or OS problems (just try other engines, or other software whose performance depends on the hardware, and see whether they have the problem too...).

good luck :-)
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: inconsistent performance

Post by Don »

Desperado wrote: Sorry, but I am missing something in this thread (perhaps I overlooked it...).

What kind of speed test are you doing: a "perft", or a search with all its features?

If you don't have the same problem on a simple perft (no TT or other "advanced" stuff, just move, move back, count nodes), then go bug hunting.

I don't think there are hardware problems or OS problems (just try other engines, or other software whose performance depends on the hardware, and see whether they have the problem too...).

good luck :-)
I'm running 20 positions to depth 13 using the chess program. The nodes match, the scores match and the PVs match. Everything that is not time-related matches perfectly, so I suspect no bugs. I have a table I use for evaluation that is similar to a pawn structure hash table; I looked for hashing bugs and found none. If there is a hashing bug, it is of a kind that doesn't produce wrong results.
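A test of this shape (same work every run, only the time differs) can be sketched as a small harness that repeats a workload, checks that the node count is identical each time, and records the fastest and slowest run. Everything here is an assumption for illustration: `Workload`, `TimingReport`, `time_runs` and `dummy_workload` are hypothetical names, the dummy loop merely stands in for "20 positions to depth 13", and `clock_gettime(CLOCK_MONOTONIC)` assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

typedef uint64_t (*Workload)(void);   /* runs the test, returns a node count */

typedef struct {
    double   min_seconds;     /* fastest run */
    double   max_seconds;     /* slowest run */
    int      deterministic;   /* 1 if every run returned the same node count */
    uint64_t nodes;           /* node count of the first run */
} TimingReport;

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);   /* wall clock, unaffected by date changes */
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Run the workload `repeats` times and summarise nodes and elapsed time. */
TimingReport time_runs(Workload run, int repeats)
{
    TimingReport r = {0.0, 0.0, 1, 0};
    for (int i = 0; i < repeats; i++) {
        double start = now_seconds();
        uint64_t nodes = run();
        double elapsed = now_seconds() - start;

        if (i == 0) {
            r.nodes = nodes;
            r.min_seconds = r.max_seconds = elapsed;
        } else {
            if (nodes != r.nodes)        r.deterministic = 0;
            if (elapsed < r.min_seconds) r.min_seconds = elapsed;
            if (elapsed > r.max_seconds) r.max_seconds = elapsed;
        }
    }
    return r;
}

/* Hypothetical stand-in for the real search: a fixed amount of busy work. */
uint64_t dummy_workload(void)
{
    volatile uint64_t sink = 0;
    uint64_t nodes = 0;
    for (uint64_t i = 0; i < 1000000; i++) {
        sink += i;
        nodes++;
    }
    return nodes;
}
```

Calling `time_runs(dummy_workload, 5)` and comparing `max_seconds` with `min_seconds` shows the spread while `deterministic` confirms the work itself did not change.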
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: inconsistent performance

Post by Desperado »

Oh, I think there can be a lot of bugs, even if all the values you mentioned match...
1.
I once had the same (or maybe a similar) problem: everything matched...
Then I watched the Task Manager, where memory usage got higher and higher and :-) higher. I hadn't noticed the failure because I always used short
time controls for testing my stuff. In the end, I wasn't freeing dynamically allocated memory (so swapping became the problem)... and in that case all the numbers still matched.

2.
Imagine a loop over, let us say, an attack bitboard where 40/56 bits are set,
with a control structure like:
while(tmp)
{
    sq = bsf64(tmp);
    ...
    clb64(tmp);
}

If something like this does not work properly, all the numbers may be equal, but
maybe there shouldn't be 40 bits set but perhaps 12...
There can be thousands of bugs that don't change the "search numbers", for example if this loop is not part of a search-sensitive function.
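The loop in point 2 can be written out as a compilable sketch. The bodies of `bsf64` and `clb64` are assumptions, since the post only names them: here `bsf64` wraps the GCC/Clang builtin `__builtin_ctzll`, and `clb64` is a macro that clears the lowest set bit in place. `collect_squares` and `loop_visits_every_bit` are hypothetical helpers added to show one cheap way to catch a loop that visits the wrong number of bits.

```c
#include <stdint.h>

/* Index of the lowest set bit. The caller must guarantee bb != 0.
   __builtin_ctzll is a GCC/Clang builtin (assumed available). */
static inline int bsf64(uint64_t bb)
{
    return __builtin_ctzll(bb);
}

/* Clear the lowest set bit of bb in place. */
#define clb64(bb) ((bb) &= (bb) - 1)

/* Visit every set square of `attacks`, lowest first.
   Writes the squares to out[] and returns how many were visited. */
int collect_squares(uint64_t attacks, int out[64])
{
    int count = 0;
    uint64_t tmp = attacks;
    while (tmp) {
        int sq = bsf64(tmp);
        out[count++] = sq;
        clb64(tmp);
    }
    return count;
}

/* Sanity check: the loop must visit exactly as many squares as there are set bits. */
int loop_visits_every_bit(uint64_t attacks)
{
    int squares[64];
    return collect_squares(attacks, squares) == __builtin_popcountll(attacks);
}
```

A check like `loop_visits_every_bit` costs almost nothing in a debug build and would flag the "12 instead of 40" case even when node counts and scores are unaffected.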

(Sorry for my bad English... I hope you understand what I mean.)

In my opinion, the most likely outcome is that you find a bug. Or are there "significant indicators" of a hardware problem, which would also occur with other software (chess engines)?

So did you try a simple perft? (It takes 5 minutes to write such a function and you will know more than you do now, right? Nothing to lose! :-) ) If this works well, why should the hardware have a problem?
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: inconsistent performance

Post by Don »

Desperado wrote: Oh, I think there can be a lot of bugs, even if all the values you mentioned match...
1.
I once had the same (or maybe a similar) problem: everything matched...
Then I watched the Task Manager, where memory usage got higher and higher and :-) higher. I hadn't noticed the failure because I always used short
time controls for testing my stuff. In the end, I wasn't freeing dynamically allocated memory (so swapping became the problem)... and in that case all the numbers still matched.

2.
Imagine a loop over, let us say, an attack bitboard where 40/56 bits are set,
with a control structure like:
while(tmp)
{
    sq = bsf64(tmp);
    ...
    clb64(tmp);
}

If something like this does not work properly, all the numbers may be equal, but
maybe there shouldn't be 40 bits set but perhaps 12...
There can be thousands of bugs that don't change the "search numbers", for example if this loop is not part of a search-sensitive function.

(Sorry for my bad English... I hope you understand what I mean.)

In my opinion, the most likely outcome is that you find a bug. Or are there "significant indicators" of a hardware problem, which would also occur with other software (chess engines)?

So did you try a simple perft? (It takes 5 minutes to write such a function and you will know more than you do now, right? Nothing to lose! :-) ) If this works well, why should the hardware have a problem?
There are certainly bugs in the program and in all chess programs, but I'm looking for a performance bug that varies the search time and nothing else. You said there could be thousands of bugs that do not change the deterministic behavior of the program and I agree, but 99% of these bugs do not randomly make the program vary in time spent this much.

Running perft is pretty far-fetched for finding this bug. I'm not looking for ANY bug in the program; I'm looking for a performance bug that sometimes makes the program run fast and sometimes slow, without changing the search itself by even 1 node.

I do not allocate memory dynamically, other than for setting the transposition table size. This other big hash table I use is declared with a fixed size. I'm quite sure there is no malloc/free type of performance bug.

I've pretty much convinced myself that this is an artifact of modern processors: as their performance improves, they become less and less predictable. With SMP it is even worse. However, I'm still looking for ways to minimize this.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: inconsistent performance

Post by bob »

Don wrote:
Desperado wrote: Oh, I think there can be a lot of bugs, even if all the values you mentioned match...
1.
I once had the same (or maybe a similar) problem: everything matched...
Then I watched the Task Manager, where memory usage got higher and higher and :-) higher. I hadn't noticed the failure because I always used short
time controls for testing my stuff. In the end, I wasn't freeing dynamically allocated memory (so swapping became the problem)... and in that case all the numbers still matched.

2.
Imagine a loop over, let us say, an attack bitboard where 40/56 bits are set,
with a control structure like:
while(tmp)
{
    sq = bsf64(tmp);
    ...
    clb64(tmp);
}

If something like this does not work properly, all the numbers may be equal, but
maybe there shouldn't be 40 bits set but perhaps 12...
There can be thousands of bugs that don't change the "search numbers", for example if this loop is not part of a search-sensitive function.

(Sorry for my bad English... I hope you understand what I mean.)

In my opinion, the most likely outcome is that you find a bug. Or are there "significant indicators" of a hardware problem, which would also occur with other software (chess engines)?

So did you try a simple perft? (It takes 5 minutes to write such a function and you will know more than you do now, right? Nothing to lose! :-) ) If this works well, why should the hardware have a problem?
There are certainly bugs in the program and in all chess programs, but I'm looking for a performance bug that varies the search time and nothing else. You said there could be thousands of bugs that do not change the deterministic behavior of the program and I agree, but 99% of these bugs do not randomly make the program vary in time spent this much.

Running perft is pretty far-fetched for finding this bug. I'm not looking for ANY bug in the program; I'm looking for a performance bug that sometimes makes the program run fast and sometimes slow, without changing the search itself by even 1 node.

I do not allocate memory dynamically, other than for setting the transposition table size. This other big hash table I use is declared with a fixed size. I'm quite sure there is no malloc/free type of performance bug.

I've pretty much convinced myself that this is an artifact of modern processors: as their performance improves, they become less and less predictable. With SMP it is even worse. However, I'm still looking for ways to minimize this.
I have two observations:

(1) I don't think it is a bug. A bug is not going to greatly alter CPU time while not changing node count, evaluation numbers or anything to do with the size/shape of the tree.

(2) I also don't think this is "typical of modern processors", since I am running on all kinds of machines. If you were concerned about a 0.1-0.5 second time variance, I would not be very surprised. But if you are seeing something larger, then "something is up".
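That rule of thumb can be turned into a tiny check over a set of measured run times. This is only a sketch: `timing_spread` and `spread_is_suspicious` are hypothetical helper names, and the tolerated variance is a parameter, with 0.5 seconds being the upper end of the range mentioned above.

```c
#include <stddef.h>

/* Largest minus smallest sample, in seconds; 0.0 for an empty array. */
double timing_spread(const double *seconds, size_t n)
{
    if (n == 0)
        return 0.0;

    double lo = seconds[0];
    double hi = seconds[0];
    for (size_t i = 1; i < n; i++) {
        if (seconds[i] < lo) lo = seconds[i];
        if (seconds[i] > hi) hi = seconds[i];
    }
    return hi - lo;
}

/* Returns 1 if the spread exceeds the tolerated variance, else 0. */
int spread_is_suspicious(const double *seconds, size_t n, double tolerated_seconds)
{
    return timing_spread(seconds, n) > tolerated_seconds;
}
```

Feeding it the wall-clock times of repeated identical runs, with `tolerated_seconds` set to 0.5, separates ordinary jitter from the larger swings that deserve a closer look.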

If you have something you can send me, I can run it on 3-4 different types of machines (PIV 2.8 GHz, Core 2 Duo 2.0 GHz, Core 2 Xeon 2.33 GHz (2x4 cores)) for starters. If the problem doesn't show up here, you will know you have something to look at on the machine you are using...