Re: 4 x Intel Xeon good idea?

Posted: Wed Nov 18, 2020 5:04 pm
by smatovic
If you run 24/7 the cents per kilowatt-hour can make a difference and a CPU with smaller fab process might pay off, imho, sitting in Europe with ~30 Eurocents/kWh.

--
Srdja
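
A rough sketch of the arithmetic behind this point. Only the ~30 Eurocents/kWh price comes from the post; the 150 W average draw is a hypothetical figure chosen for illustration, not a measurement of any CPU mentioned in the thread.

```python
HOURS_PER_YEAR = 24 * 365  # 24/7 operation, ignoring leap years


def annual_cost_eur(avg_power_watts: float, price_eurocents_per_kwh: float) -> float:
    """Yearly electricity cost in EUR for a machine running around the clock."""
    kwh_per_year = avg_power_watts / 1000.0 * HOURS_PER_YEAR
    return kwh_per_year * price_eurocents_per_kwh / 100.0


# Hypothetical 150 W average draw at the ~30 Eurocents/kWh quoted above.
print(round(annual_cost_eur(150, 30), 2))
```

Swapping in a measured wall-socket wattage and the local tariff gives a figure that can be weighed against the price difference between two machines.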

Re: 4 x Intel Xeon good idea?

Posted: Wed Nov 18, 2020 7:44 pm
by Milos
smatovic wrote: Wed Nov 18, 2020 5:04 pm If you run 24/7 the cents per kilowatt-hour can make a difference and a CPU with smaller fab process might pay off, imho, sitting in Europe with ~30 Eurocents/kWh.

--
Srdja
These are your German prices. If you ran your server in Serbia for example you'd pay 5.8 eurocents/kWh during the day and only 1.4 during the night which is peanuts ;).

Re: 4 x Intel Xeon good idea?

Posted: Wed Nov 18, 2020 7:52 pm
by smatovic
Milos wrote: Wed Nov 18, 2020 7:44 pm
smatovic wrote: Wed Nov 18, 2020 5:04 pm If you run 24/7 the cents per kilowatt-hour can make a difference and a CPU with smaller fab process might pay off, imho, sitting in Europe with ~30 Eurocents/kWh.

--
Srdja
These are your German prices. If you ran your server in Serbia for example you'd pay 5.8 eurocents/kWh during the day and only 1.4 during the night which is peanuts ;).
Hehe, better run during night, then you have enough current ;)

--
Srdja

Re: 4 x Intel Xeon good idea?

Posted: Wed Nov 18, 2020 7:55 pm
by Milos
smatovic wrote: Wed Nov 18, 2020 7:52 pm
Milos wrote: Wed Nov 18, 2020 7:44 pm
smatovic wrote: Wed Nov 18, 2020 5:04 pm If you run 24/7 the cents per kilowatt-hour can make a difference and a CPU with smaller fab process might pay off, imho, sitting in Europe with ~30 Eurocents/kWh.

--
Srdja
These are your German prices. If you ran your server in Serbia for example you'd pay 5.8 eurocents/kWh during the day and only 1.4 during the night which is peanuts ;).
Hehe, better run during night, then you have enough current ;)

--
Srdja
Running 3950X (even stock) 24/7 in Germany costs more than running 24/7 4xXeonV2 in Serbia ;).
And you better hope Russians don't turn off gas otherwise you might not have electricity even for 30+ Eurocents/kWh.

Re: 4 x Intel Xeon good idea?

Posted: Wed Nov 18, 2020 8:06 pm
by smatovic
Milos wrote: Wed Nov 18, 2020 7:55 pm
smatovic wrote: Wed Nov 18, 2020 7:52 pm
Milos wrote: Wed Nov 18, 2020 7:44 pm
smatovic wrote: Wed Nov 18, 2020 5:04 pm If you run 24/7 the cents per kilowatt-hour can make a difference and a CPU with smaller fab process might pay off, imho, sitting in Europe with ~30 Eurocents/kWh.

--
Srdja
These are your German prices. If you ran your server in Serbia for example you'd pay 5.8 eurocents/kWh during the day and only 1.4 during the night which is peanuts ;).
Hehe, better run during night, then you have enough current ;)

--
Srdja
Running 3950X (even stock) 24/7 in Germany costs more than running 24/7 4xXeonV2 in Serbia ;).
And you better hope Russians don't turn off gas otherwise you might not have electricity even for 30+ Eurocents/kWh.
Okay, I'm not seriously interested in getting into (greeny) politics in here, but we pay a lot of taxes here, for infrastructure and for things like the EEG, the switch to renewable energy funded with our Eurocents. Maybe this pays off for .de in the long run, dunno...time will tell.

--
Srdja

Re: 4 x Intel Xeon good idea?

Posted: Wed Nov 18, 2020 9:28 pm
by Laskos
Milos wrote: Wed Nov 18, 2020 3:28 am
Alayan wrote: Wed Nov 18, 2020 2:46 am If running parallel single-core games, the total nps is a good indication of the performance, but multi-threading is lossy.

At equal overall nps, more threads plays weaker.

The scaling efficiency from 12 (24) to 48 (96) threads is really poor, so matching or moderately beating the modern 12 cores in total nps means losing in chess strength.
I don't agree with that in this particular case, even with SF NNUE, where LazySMP is practically broken beyond repair (I wonder why you SF developers haven't yet reverted to the old SF SMP, which scales much better with NNUE).
48 threads (no HT) with 40Mnps (4xXeonV2) is probably stronger than (or at least on par with) 24 threads (with HT) and 28Mnps (3950X).
With classical SF the difference would be even more pronounced.
Are you sure LazySMP has problems with SF NNUE? What would be the reason? Worse randomization of children threads?
If reverting to YBWC is the solution, then it should be very bad news for strong machines, as beyond 32 threads the scaling sucked. Hyperthreading should be again switched off with more than 8 cores and so on.

Re: 4 x Intel Xeon good idea?

Posted: Wed Nov 18, 2020 9:40 pm
by Alayan
His claim is unsupported so unless proper tests demonstrate that the scaling efficiency actually got worse, I'd just assume it was rubbish.

The proper measure of scaling efficiency of N threads is finding the time factor X needed so that 1-core with X*T time scores 50% against N-cores with T time. The resulting efficiency is X/N.

Re: 4 x Intel Xeon good idea?

Posted: Wed Nov 18, 2020 10:54 pm
by Laskos
Alayan wrote: Wed Nov 18, 2020 9:40 pm His claim is unsupported so unless proper tests demonstrate that the scaling efficiency actually got worse, I'd just assume it was rubbish.

The proper measure of scaling efficiency of N threads is finding the time factor X needed so that 1-core with X*T time scores 50% against N-cores with T time. The resulting efficiency is X/N.
It was shown, up to at least 64 threads, that classical SF with improved Lazy SMP scaled amazingly well compared to YBWC. For 64 cores, the effective speedup was about 40, compared to something like 18 with YBWC. Maybe I will find that plot among the data from Andreas (the author of the excellent FGRL rating list), if it was him showing this.
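
To make the two posts above concrete, here is a minimal sketch of the X/N efficiency measure applied to the approximate speedups quoted for 64 cores. The function name is made up for illustration, and the inputs are the "about 40" and "something like 18" figures recalled in the post, not measured data.

```python
def scaling_efficiency(time_factor_x: float, n_threads: int) -> float:
    """Efficiency X/N, where X is the time factor at which 1 core with X*T
    time scores 50% against N threads with T time."""
    if n_threads <= 0:
        raise ValueError("n_threads must be positive")
    return time_factor_x / n_threads


# Approximate effective speedups recalled in the post for 64 cores.
for label, speedup in (("Lazy SMP", 40), ("YBWC", 18)):
    print(f"{label}: {scaling_efficiency(speedup, 64):.1%}")
```

An efficiency of 1.0 would mean perfect scaling; anything below it is the share of the added hardware that actually turns into playing strength.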

Re: 4 x Intel Xeon good idea?

Posted: Wed Nov 18, 2020 11:56 pm
by Milos
Laskos wrote: Wed Nov 18, 2020 10:54 pm
Alayan wrote: Wed Nov 18, 2020 9:40 pm His claim is unsupported so unless proper tests demonstrate that the scaling efficiency actually got worse, I'd just assume it was rubbish.

The proper measure of scaling efficiency of N threads is finding the time factor X needed so that 1-core with X*T time scores 50% against N-cores with T time. The resulting efficiency is X/N.
It was shown, up to at least 64 threads, that classical SF with improved Lazy SMP scaled amazingly well compared to YBWC. For 64 cores, the effective speedup was about 40, compared to something like 18 with YBWC. Maybe I will find that plot among the data from Andreas (the author of the excellent FGRL rating list), if it was him showing this.
Well Alayan goes and calls BS on any result that he doesn't like.
There are quite a few convincing results in the thread that he started by basically calling CCRL results BS.
http://talkchess.com/forum3/viewtopic.p ... 3&start=50
Some of those results from mwyoung looked quite strong (you know my opinion about his testing methods). That and the other result (http://talkchess.com/forum3/viewtopic.php?f=2&t=75363) looked suspicious enough that I ran a scaling test of SF-NNUEdev vs Lc0 myself.
Ofc, it's nowhere near as comprehensive as what Andreas did (http://talkchess.com/forum3/viewtopic.php?f=2&t=74188) but for me it's pretty indicative.
I played SF_1c vs Lc0, SF_2c vs Lc0, SF_4c vs Lc0, SF_8c vs Lc0, SF_16c vs Lc0. No HT, fixed multiplier for all cores, bullet match (1'+0.6''), 500 custom openings with reversed colors.
And I got, respectively, the following results: 58.5%, 62.85%, 66.1%, 67.85%, 70.45%.
Then I repeated it with SF11 vs a weaker Lc0 net so that winning and draw percentages stay similar. And here I got: 51.55% 58.9% 64.65% 68.3% 72.15%.
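
One way to read these percentages is to convert each score into an Elo difference with the standard logistic formula and look at the gain from 1 to 16 cores. This is only a sketch: the variable and function names are made up here, the conversion ignores error margins, and the two series were played against different Lc0 nets, so the comparison is indicative at best.

```python
import math


def score_to_elo(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


CORES = [1, 2, 4, 8, 16]
# Score fractions copied from the post above.
SF_NNUE_DEV = [0.585, 0.6285, 0.661, 0.6785, 0.7045]
SF11 = [0.5155, 0.589, 0.6465, 0.683, 0.7215]


def gain_1_to_16(scores: list[float]) -> float:
    """Elo gained between the 1-core and the 16-core result."""
    return score_to_elo(scores[-1]) - score_to_elo(scores[0])


for name, scores in (("SF-NNUEdev", SF_NNUE_DEV), ("SF11", SF11)):
    per_core = ", ".join(
        f"{c}c: {score_to_elo(s):+.0f}" for c, s in zip(CORES, scores)
    )
    print(f"{name}: {per_core} | 1c->16c gain: {gain_1_to_16(scores):.0f} Elo")
```

Under these assumptions the SF11 series gains more Elo from 1 to 16 cores than the SF-NNUEdev series, which is the pattern the post points to.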

Re: 4 x Intel Xeon good idea?

Posted: Thu Nov 19, 2020 12:30 am
by Laskos
Milos wrote: Wed Nov 18, 2020 11:56 pm
Laskos wrote: Wed Nov 18, 2020 10:54 pm
Alayan wrote: Wed Nov 18, 2020 9:40 pm His claim is unsupported so unless proper tests demonstrate that the scaling efficiency actually got worse, I'd just assume it was rubbish.

The proper measure of scaling efficiency of N threads is finding the time factor X needed so that 1-core with X*T time scores 50% against N-cores with T time. The resulting efficiency is X/N.
It was shown, up to at least 64 threads, that classical SF with improved Lazy SMP scaled amazingly well compared to YBWC. For 64 cores, the effective speedup was about 40, compared to something like 18 with YBWC. Maybe I will find that plot among the data from Andreas (the author of the excellent FGRL rating list), if it was him showing this.
Well Alayan goes and calls BS on any result that he doesn't like.
There are quite a few convincing results in the thread that he started by basically calling CCRL results BS.
http://talkchess.com/forum3/viewtopic.p ... 3&start=50
Some of those results from mwyoung looked quite strong (you know my opinion about his testing methods). That and the other result (http://talkchess.com/forum3/viewtopic.php?f=2&t=75363) looked suspicious enough that I ran a scaling test of SF-NNUEdev vs Lc0 myself.
Ofc, it's nowhere near as comprehensive as what Andreas did (http://talkchess.com/forum3/viewtopic.php?f=2&t=74188) but for me it's pretty indicative.
I played SF_1c vs Lc0, SF_2c vs Lc0, SF_4c vs Lc0, SF_8c vs Lc0, SF_16c vs Lc0. No HT, fixed multiplier for all cores, bullet match (1'+0.6''), 500 custom openings with reversed colors.
And I got, respectively, the following results: 58.5%, 62.85%, 66.1%, 67.85%, 70.45%.
Then I repeated it with SF11 vs a weaker Lc0 net so that winning and draw percentages stay similar. And here I got: 51.55% 58.9% 64.65% 68.3% 72.15%.
Wow! And what's the matter with LazySMP and NNUE? It isn't very obvious that this should happen at all.