bob wrote: My take here is that there is little use in worrying about something you can't fix.
But of course you can fix it when the scheduler, in its wisdom, makes bad decisions (as shown in the real-life examples above) by modifying the affinities of the running programs in the right way. Call it bug-fixing.
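On Windows this can even be done while the engine is running, either by hand via Task Manager (Set affinity) or scripted. A minimal one-line sketch for a batch file, assuming a single running engine process hypothetically named ProDeo that we want pinned to cores 0 and 1 (mask 0x3):

:: Re-pin an already-running engine without restarting it.
:: Assumes exactly one process named "ProDeo"; adjust the name and mask.
powershell -NoProfile -Command "(Get-Process ProDeo).ProcessorAffinity = 0x3"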
Consider that most of the changes we try are worth what, a maximum expected gain of 2-3 elo? Then if the scheduler doesn't function as it should, the results are misleading. One can accept the change as an improvement (and never look back) while in reality it is a regression, and we are back in the dark ages of the 70s, 80s and 90s.
If you look at the statistics given in the OP: I know from experience that the 240 version is hurt by at least 5 elo due to the scheduler having a bad day, made visible by Coretemp and the loading percentages of the cores.
Either you test and do it right, or don't test at all.
I was really thinking of testing done by OTHERS when I was writing that. In my case, with my own testing, I very carefully set up a testbed that introduced as little noise as possible.
But when the testing goes outside of my reach, such as SSDF, TCEC, and all the other events and tests run by others, there's not much that can be done. It is VERY difficult to fix many of these problems given the different versions of Windows, Linux and MacOS that are out there, where by "fix" I mean that the program itself does whatever is needed to avoid thread bouncing and the like. And there's nothing to do when a malicious GUI or opponent wants to cause problems, except to refuse to run when the program notices competition for resources. Programs can directly interfere with the GUI as well, as we have seen in past nonsense: flood the GUI with messages, which can make it slow to respond to the opponent's legitimate messages and interfere with time usage.
I would NEVER consider testing under windows. I want a tightly controlled environment that will work consistently each and every time.
bob wrote: Sorry, but not even close. And if you are talking Windows 10, not at all.
I use Windows 7. I never tested on Linux, so I cannot compare.
But I can say that the test results I get are consistent enough when I retest groups of patches, and I do that several times between versions. Sometimes I get surprises, of course, but Andscacs advances well in general.
bob wrote: I would NEVER consider testing under windows. I want a tightly controlled environment that will work consistently each and every time.
I understand your concerns, but believe me, you can test on Windows and it works more than reasonably well.
Agreed, just strip everything from a separate PC. When my PC is idle, Coretemp shows 0% load on all 8 cores; now and then it steals a few percent itself, only to disappear at the next refresh of the display. I guess all OS's have that.
Here is a simple solution that fixes the capriciousness of the Windows scheduler via a batch file. Take, for example, a match between ProDeo and Crafty (hi Bob) using 4 cores. Start Coretemp and then the match, watch the Coretemp loading percentages and notice how they fluctuate; then run the batch file below and notice that the fluctuation is much smaller, since each of the 4 concurrent games is pinned to a fixed core with affinity. In Task Manager, check the affinity of each executable and see how things are arranged.
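A minimal sketch of such a batch file, assuming a command-line match runner (here hypothetically called match.exe) that plays one ProDeo vs Crafty game per instance; the /affinity argument is a hexadecimal bitmask (0x1 = core 0, 0x2 = core 1, 0x4 = core 2, 0x8 = core 3):

:: Pin four concurrent ProDeo vs Crafty games, one to each of cores 0-3.
:: match.exe is a hypothetical runner; substitute whatever starts one game.
start /affinity 0x1 match.exe prodeo.exe crafty.exe
start /affinity 0x2 match.exe prodeo.exe crafty.exe
start /affinity 0x4 match.exe prodeo.exe crafty.exe
start /affinity 0x8 match.exe prodeo.exe crafty.exe

Because start returns immediately, the four games run in parallel, each locked to its own core for the lifetime of the process.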
Here is the batch file I use on my 2-node NUMA machine (2 x 4 cores, 16 threads in total) running a self-play match between ProDeo 2.2 (PD22x) and ProDeo 2.4 (PD24x).
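A minimal sketch of what such a file could look like, assuming eight concurrent games (one per physical core, two hyperthreads each, with logical siblings adjacent) and the same hypothetical runner.exe to start one PD22x vs PD24x game:

:: 8 concurrent self-play games: 4 pinned to NUMA node 0, 4 to node 1.
:: With /node, the /affinity mask is relative to that node's processors.
start /node 0 /affinity 0x3 runner.exe PD22x.exe PD24x.exe
start /node 0 /affinity 0xC runner.exe PD22x.exe PD24x.exe
start /node 0 /affinity 0x30 runner.exe PD22x.exe PD24x.exe
start /node 0 /affinity 0xC0 runner.exe PD22x.exe PD24x.exe
start /node 1 /affinity 0x3 runner.exe PD22x.exe PD24x.exe
start /node 1 /affinity 0xC runner.exe PD22x.exe PD24x.exe
start /node 1 /affinity 0x30 runner.exe PD22x.exe PD24x.exe
start /node 1 /affinity 0xC0 runner.exe PD22x.exe PD24x.exe

Since child processes inherit the parent's affinity on Windows, both engine processes of a game stay on the one physical core of the one node they were given, which avoids both thread bouncing and cross-node memory traffic.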