inconsistent performance

Discussion of chess software programming and technical issues.

Moderator: Ras

Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: inconsistent performance

Post by Don »

hgm wrote:It will be a bit difficult to make sensible comments on this if you don't tell us what hardware you are running on, and what your engine tries to use of this (e.g. is it SMP, how many cores, what hash setting, does it use huge tables).
I already commented on some of this, but my processor is one of the older Core 2 Duos. Here is what /proc/cpuinfo yields:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
stepping : 6
cpu MHz : 1600.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 5319.99
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
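
One detail in this output: the model name gives a nominal 2.66GHz, while "cpu MHz" reads 1600.000, which may mean the core was clocked down by frequency scaling at the moment the file was read. A minimal Python sketch of that comparison, assuming the model name carries an "@ x.xxGHz" suffix as it does here; parse_cpuinfo and clock_ratio are hypothetical helper names, and the sample text is trimmed from the listing above.

```python
import re


def parse_cpuinfo(text):
    """Parse one processor stanza of /proc/cpuinfo into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def clock_ratio(info):
    """Return current MHz divided by the nominal MHz named in 'model name'.

    Returns None if either value is missing or the model name has no
    '@ x.xxGHz' suffix (an assumption about the string format).
    """
    match = re.search(r"@\s*([\d.]+)\s*GHz", info.get("model name", ""))
    if not match or "cpu MHz" not in info:
        return None
    nominal_mhz = float(match.group(1)) * 1000.0
    return float(info["cpu MHz"]) / nominal_mhz


# Two lines taken from the listing above.
SAMPLE = """\
model name : Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
cpu MHz : 1600.000
"""

if __name__ == "__main__":
    ratio = clock_ratio(parse_cpuinfo(SAMPLE))
    print(f"running at {ratio:.0%} of the nominal clock")
```

A ratio well below 1.0 on an idle machine is normal when scaling is active; it only matters for benchmarking if the clock fails to ramp up under load.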
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: inconsistent performance

Post by Don »

bob wrote:
Don wrote:I'm having a serious problem with performance benchmarks with my chess program. I cannot get a consistent set of times from one run to the next. The difference isn't minor, it's major. For instance, one position went from 167.3 seconds to 104.5 to do a 15 ply search. Another went from 43.7 to 27.1. I can run this many times and get a different result each time, some problems running faster and some running slower, and there seems to be no pattern to it.

I have the ability to test this by CPU time but it makes no difference, same basic behavior. I run these on an unloaded core 2 duo machine in Linux and top, ps or uptime shows the load to be very close to zero when I start the test.

The program is deterministic, so I get the same score, number of nodes searched, and principal variation - identical in every way. It's really making life difficult for performance benchmarking!

Has anyone else run into this? I don't remember this being a problem before. I strongly suspect some kind of caching issue. As far as I can see, however, the program is deterministic in every way when I want it to be.

- Don
I can usually see a 1% max deviation in time, caused by a program being loaded into a different set of real pages each time it is run, which can affect memory-to-cache mapping. But I've never seen anything like that with one exception...

You didn't mention hardware. Is it possible you have two different kinds of memory? I had one of the original Toshiba pentium laptops, with 8 megs of ram. Which ran on the usual 64 bit bus. But I added another 16mb of ram, and it turns out their expansion slot was 32 bits and not 64 bits. Which meant I could run once, get lucky, and get loaded into the first 8mb and run like the blazes. The next time, I could get loaded into the upper 16mb and run dog slow. And other times I would get a combination with the speed varying all over the place.

I can run the same position N times on my laptop and see almost zero variance using linux...
I have nothing like that, the memory is 2 Gig and is composed of 2 equal simms that are identical. It's a cheap motherboard that cannot be upgraded beyond 2 Gig of memory.

It does appear, however, to be strongly related to processor affinity as I can make this problem go away when I force it to run on a single physical processor.

When I first got the machine I did tests and found that if I ran 2 copies of any chess program, one of them would run a little bit slower than one alone, but not by much. I also discovered that if you ran 3 copies (I have 2 cores) one of them suffered much more than the other two. I think this effect averages out, as I have read that the Linux kernel tries to ensure fairness and catches this situation, but evidently only over time and not in a very smooth way. I have never tested this explicitly. My tester forks off several programs and I instrument the run times of each program (average time per game), and after enough games have been played I get very consistent results. Of course for 1 instance of the tester only 2 programs are actually computing, and I usually run tests without pondering as it's more resource friendly when you only have one machine.


- Don
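
The pinned versus unpinned comparison described above can be sketched roughly as follows. This is a Python illustration, not the actual test setup: workload is a hypothetical stand-in for a fixed-depth search, timed_runs and spread are made-up helper names, and pinning relies on os.sched_setaffinity, which Python only provides on Linux.

```python
import os
import time


def workload(n=200_000):
    """Deterministic stand-in for a fixed-depth search (hypothetical)."""
    total = 0
    for i in range(n):
        total = (total * 31 + i) % 1_000_003
    return total


def timed_runs(runs=5, cpu=None):
    """Run the workload several times; return a list of (wall, cpu) seconds.

    If cpu is given and the platform supports it, the process is pinned to
    that logical CPU for the duration and the old mask is restored afterwards.
    """
    can_pin = cpu is not None and hasattr(os, "sched_setaffinity")
    if can_pin:
        previous = os.sched_getaffinity(0)
        os.sched_setaffinity(0, {cpu})
    try:
        samples = []
        for _ in range(runs):
            wall0, cpu0 = time.perf_counter(), time.process_time()
            workload()
            samples.append((time.perf_counter() - wall0,
                            time.process_time() - cpu0))
        return samples
    finally:
        if can_pin:
            os.sched_setaffinity(0, previous)


def spread(values):
    """Relative spread of a set of timings: (max - min) / min."""
    return (max(values) - min(values)) / min(values)


if __name__ == "__main__":
    # Pick a CPU the process is actually allowed to run on.
    first_cpu = (min(os.sched_getaffinity(0))
                 if hasattr(os, "sched_getaffinity") else None)
    for label, cpu in (("unpinned", None), ("pinned", first_cpu)):
        walls = [w for w, _ in timed_runs(cpu=cpu)]
        print(f"{label}: wall-clock spread {spread(walls):.1%}")
```

Recording both wall-clock and CPU time matters here because, as noted earlier in the thread, the variation showed up in both.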
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: inconsistent performance

Post by bob »

Aleks Peshkov wrote:I have little experience with Linux, but on Windows systems power management is a common reason. It is possible to observe the CPU running sometimes at its designed speed and sometimes at half the clock frequency. Overheating can also make an Intel CPU drop speed without any software control.
Yep. I take care of this up front so I never see it as an issue, and therefore never think to suggest this. On my linux laptop I always have it set to run with power management disabled when running on A/C, to avoid this, and I only do performance testing while on A/C rather than on battery power.

I also make certain nothing else is running. Windows likes to frequently poll "mother microsoft" looking for updates and the like which can skew performance analysis runs.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: inconsistent performance

Post by bob »

I apparently missed this if you gave it, but exactly what kind of hardware? For example, a dual chip AMD box can easily do this, as dual chips (not a single dual core machine) are NUMA, and you could get your memory allocated on one node, and then run on the other and see 2x memory slowdown. If you are just running a single process, linux won't bounce the process from core to core, its processor affinity is quite good by default. Don't know about windows...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: inconsistent performance

Post by bob »

Bill Rogers wrote:Don
I have experienced almost the same thing. I was timing a certain move in a chess game and thought the result was fine until I did it a second time. It was then that I noticed that even though I was not using my modem the computer was keeping it activated, so I turned it off. To my surprise my timing improved drastically. So it appears that some things running in the background can affect your timing on many things.
Bill
The "el-cheapo" modems (AKA "winmodem") are horrible. The CPU has to provide the timing to send individual bits, rather than using a small processor with a buffer, UART, etc. The operating system device controller is responsible for too much work, including error detection / recovery, compression / decompression, in addition to actually stuffing bytes into the thing one by one as they are sent...

ugly and bad for performance. I don't own any, except in my laptop where it is turned off and never used anyway.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: inconsistent performance

Post by bob »

Since this appears to be Linux (/proc/cpuinfo output you gave suggests this) is this a relatively new kernel (last year or so)???
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: inconsistent performance

Post by Don »

Here is the kernel: Linux version 2.6.24-23-generic

I'm running Ubuntu and I'm one version behind at 8.04.2. I have not had perfectly smooth results with upgrades and prefer to install from scratch, and every 6 months is too much. I think we may be close to the next release, however, so I will probably re-install soon.

I wonder if there have been any improvements in this regard? Maybe I will poke around and see.

You claimed in another post to this forum that Linux was pretty good at dealing with processor affinity, but my empirical evidence here shows that if I force it to run on a single processor, I am getting the very best times. The 3 best runs out of 10 all happened with me forcing it to use a specific processor. Some runs were much worse, and some were almost as good. I'm not happy with that because I feel that I should be able to run a single processor application on an unloaded machine and expect to get full performance.

There are system calls where the program can make its own decision, but I don't feel that this is a good decision to take away from the OS, especially since in my testing I may have many different programs running. I may provide a command line option to make it possible to experiment with this.
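
A command line option of the kind mentioned above might look something like this. It is a Python sketch under assumptions: the --cpu flag and the helper names parse_args and apply_affinity are hypothetical, and os.sched_setaffinity is only available on Linux. When the option is omitted nothing is changed and scheduling stays with the OS.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the (hypothetical) --cpu option; default is no pinning."""
    parser = argparse.ArgumentParser(description="engine launcher sketch")
    parser.add_argument(
        "--cpu", type=int, default=None,
        help="pin to this logical CPU; omit to leave scheduling to the OS")
    return parser.parse_args(argv)


def apply_affinity(cpu):
    """Pin the current process to one CPU if requested.

    Returns True if the affinity mask was changed, False if cpu is None.
    """
    if cpu is None:
        return False
    if not hasattr(os, "sched_setaffinity"):
        raise RuntimeError("CPU affinity is not supported on this platform")
    os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process
    return True
```

A program would call apply_affinity(parse_args().cpu) once at startup, before any search begins, so that a tester running several engines can pass a different --cpu value to each or none at all.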
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: inconsistent performance

Post by bob »

The processor affinity stuff hasn't changed significantly in 2+ years now, so you are probably OK there. I wonder if the CPU frequency scaling process you have is not so clever and both CPUs are not running at 100%? GNOME has a good CPU frequency scaling monitor that will help, although it is not a constant monitor or it would intrude on performance...

First cut would be to disable power management completely. I think there is a kernel boot option something like noacpi that will turn this stuff off.
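
Before rebooting with different options, the scaling state can also be inspected directly. A minimal Python sketch, assuming the kernel exposes the cpufreq sysfs interface under /sys/devices/system/cpu (a machine with no scaling driver loaded simply yields an empty result); read_cpufreq and throttled are hypothetical helper names.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")
FIELDS = ("scaling_governor", "scaling_cur_freq", "cpuinfo_max_freq")


def read_cpufreq(root=CPU_ROOT):
    """Return {cpu_name: {field: value}} for each CPU exposing cpufreq."""
    result = {}
    for cpu_dir in sorted(Path(root).glob("cpu[0-9]*")):
        freq_dir = cpu_dir / "cpufreq"
        if not freq_dir.is_dir():
            continue
        entry = {}
        for name in FIELDS:
            try:
                entry[name] = (freq_dir / name).read_text().strip()
            except OSError:
                pass  # field missing or unreadable; skip it
        result[cpu_dir.name] = entry
    return result


def throttled(entry):
    """True if the current frequency is below the maximum (both in kHz)."""
    try:
        return int(entry["scaling_cur_freq"]) < int(entry["cpuinfo_max_freq"])
    except (KeyError, ValueError):
        return False


if __name__ == "__main__":
    for cpu, entry in read_cpufreq().items():
        print(cpu, entry, "below max" if throttled(entry) else "at max")
```

Sampling this once before and once after a benchmark run avoids the intrusion a constant monitor would cause.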
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: inconsistent performance

Post by Don »

Ok, I will experiment a bit with this stuff. The power management stuff is not of much use to me, especially on a desktop machine and in view of the fact that my machine is rarely idling - I am constantly running some kind of test.

Isn't this also a setting in the bios? I think I remember turning off anything in the bios having to do with power management.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: inconsistent performance

Post by bob »

Yes, but that is "advisory". The operating system can ignore any bios settings, particularly since linux doesn't use any part of the bios except the bare initial boot code. Once the first boot stage is loaded, the bios is out of the loop...