Search found 196 matches

by Sesse
Tue Mar 31, 2020 8:50 am
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Removing Large Arrays
Replies: 36
Views: 3340

Re: Removing Large Arrays

No I/O goes through caches. The I/O hardware has no access to cache, it is purely device -> memory and the reverse.
Some Xeons allow devices to DMA into the L3 cache (Intel DDIO).
by Sesse
Fri Mar 20, 2020 11:29 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Strange sporadic speed limitation in engine running in Linux on Ryzen
Replies: 19
Views: 1833

Re: Strange sporadic speed limitation in engine running in Linux on Ryzen

syzygy's explanation makes sense, and it's easy to see if it holds true or not. Run the binary with perf stat -d, and observe the ratio of L1 dcache misses to total accesses. If it's much higher in the slow runs, it's likely that you're seeing an L1 cache aliasing effect (bank conflicts).
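A minimal sketch of the comparison suggested above, assuming you copy the `L1-dcache-loads` and `L1-dcache-load-misses` counts that `perf stat -d` typically prints (event names and availability vary by CPU and kernel). The helper names and the factor of 2 are hypothetical, chosen only for illustration:

```cpp
#include <cstdint>

// Ratio of L1 data-cache load misses to loads; returns 0 when no loads were counted.
inline double l1d_miss_ratio(std::uint64_t load_misses, std::uint64_t loads) {
    return loads == 0 ? 0.0
                      : static_cast<double>(load_misses) / static_cast<double>(loads);
}

// True when the slow run's miss ratio exceeds the fast run's by `factor`.
// The default factor is an arbitrary threshold, not a rule from the post.
inline bool looks_like_cache_problem(double fast_ratio, double slow_ratio,
                                     double factor = 2.0) {
    return slow_ratio > fast_ratio * factor;
}
```

Feed it the counters from one fast run and one slow run of the same binary; a clearly higher ratio in the slow run points toward the cache explanation, while similar ratios suggest looking elsewhere.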
by Sesse
Tue Mar 10, 2020 11:31 am
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Removing Large Arrays
Replies: 36
Views: 3340

Re: Removing Large Arrays

Deberger wrote:
Tue Mar 10, 2020 3:43 am
is replaced with a function call which must be more than 200 times slower, on any platform.
Why would it be? It really depends on whether the array is in L1d or not.
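The snippet does not say which array was removed, so the following is only a hypothetical illustration of the trade-off: a 64x64 square-distance table next to a function that computes the same value. The function costs a few ALU operations every time; the lookup is one load that is cheap when the 4 KiB table is resident in L1d and much more expensive when it is not.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>

// Chebyshev (king-move) distance between squares 0..63, computed on the fly.
inline int distance_computed(int a, int b) {
    int df = std::abs((a & 7) - (b & 7));    // file difference
    int dr = std::abs((a >> 3) - (b >> 3));  // rank difference
    return std::max(df, dr);
}

// The same values precomputed into a 4 KiB table (64 x 64 bytes).
struct DistanceTable {
    std::uint8_t d[64][64];
    DistanceTable() {
        for (int a = 0; a < 64; ++a)
            for (int b = 0; b < 64; ++b)
                d[a][b] = static_cast<std::uint8_t>(distance_computed(a, b));
    }
};

// One memory load; its cost depends on where the table currently lives
// in the cache hierarchy.
inline int distance_lookup(const DistanceTable& t, int a, int b) {
    return t.d[a][b];
}
```

Which version wins can only be settled by measuring inside the real engine, since a microbenchmark tends to keep the table hot in L1d and so flatters the lookup.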
by Sesse
Tue Mar 10, 2020 11:30 am
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Removing Large Arrays
Replies: 36
Views: 3340

Re: Removing Large Arrays

Loading a byte into a 32-bit register with zero or sign-extension is just as fast as loading an int into it.
Generally this is true, but not always. I remember having a case on older Opteron CPUs where using uint32_t instead of uint8_t would help, due to some port contention issue (the movzx could ...
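To make the byte-versus-int question concrete, here is a small sketch with the same values stored at both widths. The `Tables` struct, the fill pattern, and the function names are hypothetical. On x86-64, compilers typically emit a zero-extending byte load (`movzx`) for the narrow table and a plain 32-bit `mov` for the wide one, but that is an assumption to verify in the generated assembly.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 256-entry tables holding identical values at two widths.
struct Tables {
    std::uint8_t narrow[256];   // 256 bytes: four 64-byte cache lines
    std::uint32_t wide[256];    // 1 KiB: sixteen 64-byte cache lines
};

inline void fill(Tables& t) {
    for (int i = 0; i < 256; ++i) {
        std::uint8_t v = static_cast<std::uint8_t>((i * 7) & 0xFF);  // arbitrary pattern
        t.narrow[i] = v;
        t.wide[i] = v;
    }
}

// Each element is read with a byte load that is zero-extended to 32 bits.
inline std::uint32_t sum_narrow(const Tables& t, const std::uint8_t* idx, std::size_t n) {
    std::uint32_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += t.narrow[idx[i]];
    return s;
}

// Each element is read with a full 32-bit load.
inline std::uint32_t sum_wide(const Tables& t, const std::uint8_t* idx, std::size_t n) {
    std::uint32_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += t.wide[idx[i]];
    return s;
}
```

Both functions return the same sum; the narrow table uses a quarter of the cache footprint, while the wide table avoids the extending load. Timing the two on the target CPU is the only way to know which effect dominates.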
by Sesse
Thu Feb 20, 2020 11:21 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: SIMD methods in TT probing and replacement
Replies: 15
Views: 1339

Re: SIMD methods in TT probing and replacement

Yes, that was the initial idea. Originally I proposed the older MMX instructions for this; these also have an 8x8bit compare. Problem is that you have to use compiler intrinsics, and that it often takes extra instructions to shuttle the data into the SIMD registers.
You just do the load directly in...
by Sesse
Thu Feb 20, 2020 3:38 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: SIMD methods in TT probing and replacement
Replies: 15
Views: 1339

Re: SIMD methods in TT probing and replacement

SIMD trickery is getting increasingly common in hash tables in general; see e.g. Google's SwissTables. If you byte-align your “control word” (store 8 bits per hash entry), you can use SSE2 parallel comparison (PCMPEQB) and get out one bit per (potentially) matching entry, using PMOVMSKB, which also ...
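A minimal sketch of the PCMPEQB/PMOVMSKB idea, assuming a hypothetical group of 16 one-byte control words stored contiguously. The function name and the group size are illustrative and are not taken from SwissTables or from any engine. A scalar fallback is included so the code still builds where SSE2 is unavailable.

```cpp
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2_MATCH 1
#endif

// Returns a 16-bit mask; bit i is set when ctrl[i] == tag.
// A set bit only marks a candidate: the full key must still be compared.
inline std::uint32_t match_tags(const std::uint8_t ctrl[16], std::uint8_t tag) {
#ifdef HAVE_SSE2_MATCH
    __m128i group = _mm_loadu_si128(reinterpret_cast<const __m128i*>(ctrl));
    __m128i needle = _mm_set1_epi8(static_cast<char>(tag));  // broadcast tag to 16 lanes
    __m128i eq = _mm_cmpeq_epi8(group, needle);               // PCMPEQB: 0xFF where equal
    return static_cast<std::uint32_t>(_mm_movemask_epi8(eq)); // PMOVMSKB: one bit per lane
#else
    std::uint32_t mask = 0;
    for (int i = 0; i < 16; ++i)
        if (ctrl[i] == tag) mask |= 1u << i;
    return mask;
#endif
}
```

A caller would typically walk the set bits of the mask and compare the full key only for those slots, so most non-matching entries are rejected without touching their payload.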
by Sesse
Tue Jan 28, 2020 4:15 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Why is MultiPV so slow?
Replies: 25
Views: 2937

Re: Why is MultiPV so slow?

Heh, so it depends on the position?

I wonder; when we change positions, do we automatically get the 8 ply penalty for hash table replacement? Could turning that off help spv search win more of the time?
by Sesse
Mon Jan 27, 2020 11:41 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Why is MultiPV so slow?
Replies: 25
Views: 2937

Re: Why is MultiPV so slow?

For future reference: I tried integrating manual multi-PV in my scripts, and it was markedly slower than just regular multi-PV. I have no idea why it didn't work this time; perhaps it was the different position, perhaps these things pan out very differently with many threads, perhaps the hash size m...
by Sesse
Sun Jan 26, 2020 9:32 am
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Why is MultiPV so slow?
Replies: 25
Views: 2937

Re: Why is MultiPV so slow?

Note that I have a different need for MultiPV from what a typical chess player would; I want to communicate to the viewer (on analysis.sesse.net) the score for every possible move. Typical use cases are “why can't he just capture that piece?” and “is this a situation where the player needs to find a...
by Sesse
Sat Jan 25, 2020 7:05 pm
Forum: Computer Chess Club: Programming and Technical Discussions
Topic: Why is MultiPV so slow?
Replies: 25
Views: 2937

Re: Why is MultiPV so slow?

Later moves tend to be worse. Worse means more fail-lows. More fail-lows means more time spent to resolve them. So it's not unusual to have the later moves being more costly to search and absorbing a disproportionate part of the "nodes budget".
Hm. So a problem is that the engine spends a dispropo...