Questions on SMP search

Discussion of chess software programming and technical issues.

Moderator: Ras

Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Questions on SMP search

Post by Houdini »

Gian-Carlo Pascutto wrote:I'm not clear what exactly you save here. There is one push per function call, but in your case each called function still has to load the address from somewhere (even if it's the code bytes). The pop has to happen at function exit anyway. So the only advantage would be that you turn an indirect load into a direct load.

I'd expect your method to cause severe code cache pollution on CPUs where threads share the same L1 instruction cache (hyperthreading, AMD Bulldozer, GPU).

So this doesn't look like a clear-cut gain to me.

Note that if you really want to avoid passing the pointer that much, you can get the same effect via TLS.
The measured speed gain is on the order of 3% to 5% on a wide range of hardware. It's not a lot, but it comes at zero cost.

I don't think one should worry too much about code cache pollution. The end result mimics what you get when running multiple copies of the engine in single-thread mode. In my experience this doesn't produce any slowdown: I can, for example, run 15 copies of Houdini with 1 thread each on a 16-core box (leaving 1 core for Windows), and they all run at nearly full speed.

Robert
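
For readers unfamiliar with the TLS alternative mentioned in the quoted post, here is a minimal sketch of what it could look like in C++11. It is only an illustration under stated assumptions: `Position`, `t_position`, `EvaluateTLS`, and `RunWorker` are hypothetical names, not code from Houdini or any other engine. There is a single copy of the evaluation code, and each thread reaches its own board through thread-local storage instead of a pointer parameter.

```cpp
#include <thread>

// Hypothetical, heavily simplified board state (assumption for this sketch).
struct Position {
    int material = 0;
};

// One instance of the board per thread; no pointer is passed around.
thread_local Position t_position;

// Single shared copy of the code; the board is found through TLS.
int EvaluateTLS(int a, int b) {
    return t_position.material + a - b;
}

// Starts one worker thread, sets up its private board, and returns
// the value that thread computed.
int RunWorker(int material) {
    int result = 0;
    std::thread worker([&result, material] {
        t_position.material = material;  // touches only this thread's copy
        result = EvaluateTLS(3, 1);
    });
    worker.join();
    return result;
}
```

Whether the TLS access is cheaper than an extra pointer argument depends on the compiler, platform, and TLS model, so this would have to be measured per target.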
Gian-Carlo Pascutto
Posts: 1260
Joined: Sat Dec 13, 2008 7:00 pm

Re: Questions on SMP search

Post by Gian-Carlo Pascutto »

Houdini wrote:In my experience this doesn't produce any slowdown: I can, for example, run 15 copies of Houdini with 1 thread each on a 16-core box (leaving 1 core for Windows), and they all run at nearly full speed.
If the CPU you are using has a separate L1 for each core, and at most one thread per core, you will not have any cache pollution, so obviously you will not observe it. But that was not the scenario I mentioned.

I'm quite willing to believe that you benchmarked this as a fraction faster on whatever CPU you use now. I am also quite sure it's a very brittle optimization.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on SMP search

Post by bob »

Houdini wrote:
Gian-Carlo Pascutto wrote:
Houdini wrote: Houdini's SMP code uses a special technique in which each thread runs its version of the code prewired to its own internal board representation. In other words, each thread uses its own code operating on its own memory segment.
I could see the point in splitting off R/W data to avoid contention. But cacheable read-only things like code? What is the design rationale behind that?

And what does "code prewired to its own internal board representation" mean?
The goal is to avoid passing the board representation structure as a parameter to each call. Each thread has its version of the code that uses a fixed board representation structure.
For example, instead of calling

  Evaluate(Position, A, B, ...)
I use

  Evaluate<ThreadID>(A, B, ...)
This generates slightly more efficient code.
(Obviously in the source code all this is done with C++ templates)

Robert
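
To make the two calling styles in the quoted post concrete, here is a rough C++ sketch. It is an assumption-laden illustration, not Houdini's or Crafty's actual code: `Position`, `g_positions`, `kMaxThreads`, `EvaluateByPointer`, and the function bodies are all made up for the example.

```cpp
// Hypothetical, heavily simplified board state (assumption for this sketch).
struct Position {
    int material = 0;
};

// Conventional style: one copy of the code, board passed as a parameter.
int EvaluateByPointer(const Position& pos, int a, int b) {
    return pos.material + a - b;
}

// "Prewired" style: one fixed board per thread slot.
constexpr int kMaxThreads = 4;      // assumed limit, chosen for the sketch
Position g_positions[kMaxThreads];  // slot i belongs to thread i

template <int ThreadID>
int Evaluate(int a, int b) {
    static_assert(ThreadID >= 0 && ThreadID < kMaxThreads, "bad thread slot");
    // The address of g_positions[ThreadID] is fixed at link time, so no
    // board argument is passed. The price is that the compiler emits a
    // separate copy of this function for every ThreadID that is used.
    return g_positions[ThreadID].material + a - b;
}
```

A thread running in slot 1 would call `Evaluate<1>(a, b)`, while the conventional version calls `EvaluateByPointer(g_positions[1], a, b)`. Both return the same value; the disagreement in this thread is only about which one runs faster once instruction-cache pressure is taken into account.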
Robert,

Hand-waving does _not_ cut it here. "slightly more efficient code" is a crock. Why? The search and evaluation represent a ton of instructions and read-only data, in addition to the usual read/write data. Many Intel processors have shared L2 or shared L3 cache. Why, on a 4/8 core chip, do you want to have N copies of essentially the same code in a shared cache, when one will do and stress cache much less?

Several of us have actually tested these kinds of ideas extensively. Passing a pointer to local data around has no significant cost on Intel. When I modified Crafty to use parallel search (version 15.0, almost 15 years ago) I expected a 10% slow-down in execution speed. I didn't see any appreciable loss. And avoiding duplicate code in a shared cache is significant.

If that was the only way you could figure out how to do a parallel search, that's fine. But please don't try to pawn it off as "the most efficient way." It isn't.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on SMP search

Post by bob »

bhlangonijr wrote:
Houdini wrote:
Gian-Carlo Pascutto wrote:
Houdini wrote: Houdini's SMP code uses a special technique in which each thread runs its version of the code prewired to its own internal board representation. In other words, each thread uses its own code operating on its own memory segment.
I could see the point in splitting off R/W data to avoid contention. But cacheable read-only things like code? What is the design rationale behind that?

And what does "code prewired to its own internal board representation" mean?
The goal is to avoid passing the board representation structure as a parameter to each call. Each thread has its version of the code that uses a fixed board representation structure.
For example, instead of calling

  Evaluate(Position, A, B, ...)
I use

  Evaluate<ThreadID>(A, B, ...)
This generates slightly more efficient code.
(Obviously in the source code all this is done with C++ templates)

Robert

I am not sure it helps (substantially), as we usually pass a pointer to the board data structure (or, in my case, a C++ object reference). The cost of having one more pointer in the call stack might be almost zero.
It _is_ almost zero...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on SMP search

Post by bob »

Houdini wrote:Mr van Kervinck, please stop polluting Ben-Hur Carlos' topic about SMP search.
It's very annoying that a random person pops up in the middle of a technical discussion to ask an off-topic question to which he/she already knows the answer, with the sole purpose of launching another futile discussion. That is called trolling.

Robert
Robert, you opened this door when you started to claim Houdini was a completely original program, after having claimed to have modified Robo*. It is clear as to its origin, as ample evidence has been introduced in this forum over the past 6 months. So waving the questions off won't make them go away. A little honesty goes a long way here...
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Questions on SMP search

Post by Houdini »

bob wrote:Hand-waving does _not_ cut it here. "slightly more efficient code" is a crock. Why? The search and evaluation represent a ton of instructions and read-only data, in addition to the usual read/write data. Many Intel processors have shared L2 or shared L3 cache. Why, on a 4/8 core chip, do you want to have N copies of essentially the same code in a shared cache, when one will do and stress cache much less?

Several of us have actually tested these kinds of ideas extensively. Passing a pointer to local data around has no significant cost on Intel. When I modified Crafty to use parallel search (version 15.0, almost 15 years ago) I expected a 10% slow-down in execution speed. I didn't see any appreciable loss. And avoiding duplicate code in a shared cache is significant.

If that was the only way you could figure out how to do a parallel search, that's fine. But please don't try to pawn it off as "the most efficient way." It isn't.
LOL, one has to love your condescending tone ("If that was the only way you could figure out how to do a parallel search").
As I wrote earlier, I can disable this feature with a simple compilation switch.
This means I can compare the two versions in a matter of seconds. And as I said before, it generates a 3% to 5% speed-up for Houdini.

Robert
Last edited by Houdini on Tue Apr 26, 2011 6:46 pm, edited 1 time in total.
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Questions on SMP search

Post by Houdini »

bob wrote:Robert, you opened this door when you started to claim Houdini was a completely original program, after having claimed to have modified Robo*. It is clear as to its origin, as ample evidence has been introduced in this forum over the past 6 months. So waving the questions off won't make them go away. A little honesty goes a long way here...
If in the middle of a technical discussion a random person pops up with an attempt at hijacking the thread, I expect forum moderators to prevent this, not encourage or even participate in it (as you're doing now).

The result is that a lot of otherwise potentially interesting threads are turned into garbage. This unfortunately seems to happen a lot on computer chess-related forums.

Robert
marcelk
Posts: 348
Joined: Sat Feb 27, 2010 12:21 am

Re: Questions on SMP search

Post by marcelk »

bob wrote:
Houdini wrote:Mr van Kervinck, please stop polluting Ben-Hur Carlos' topic about SMP search.
It's very annoying that a random person pops up in the middle of a technical discussion to ask an off-topic question to which he/she already knows the answer, with the sole purpose of launching another futile discussion. That is called trolling.

Robert
Robert, you opened this door when you started to claim Houdini was a completely original program, after having claimed to have modified Robo*. It is clear as to its origin, as ample evidence has been introduced in this forum over the past 6 months. So waving the questions off won't make them go away. A little honesty goes a long way here...
Indeed. Feel free to submit an abuse complaint to the moderators of the group. What is annoying is people claiming others' work as their own.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on SMP search

Post by bob »

Houdini wrote:
bob wrote:Robert, you opened this door when you started to claim Houdini was a completely original program, after having claimed to have modified Robo*. It is clear as to its origin, as ample evidence has been introduced in this forum over the past 6 months. So waving the questions off won't make them go away. A little honesty goes a long way here...
If in the middle of a technical discussion a random person pops up with an attempt at hijacking the thread, I expect forum moderators to prevent this, not encourage or even participate in it (as you're doing now).

The result is that a lot of otherwise potentially interesting threads are turned into garbage. This unfortunately seems to happen a lot on computer chess-related forums.

Robert
We elect to let discussions go wherever they go, without a bunch of "hall monitors" trying to direct traffic. Adults _often_ inject tangential comments in a live discussion, and nobody says "shut up, that should be a different discussion"...

You can expect forum moderators to eliminate personal attacks, and encourage threads posted in the wrong forum to be continued in the more appropriate forum. But beyond that, we are "hands off".
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Questions on SMP search

Post by Houdini »

marcelk wrote:Indeed. Feel free to submit an abuse complaint to the moderators of the group. What is annoying is people claiming others' work as their own.
Even if you are annoyed by anything I or anyone else did or said, there still is no reason for you to jump into Ben-Hur Carlos' technical thread and hijack it for the sole purpose of expressing your personal frustrations.

People like Bob call that a "live adult discussion"; I call it a lack of consideration for the original poster. It significantly reduces the quality of the exchanges on this forum. If that was your goal, you have succeeded.

Robert