Re: buying a new computer

Posted: Mon Aug 19, 2019 5:47 pm
by Leo
What does I/O and memory channel mean? How does it benefit chess engines?

Re: buying a new computer

Posted: Mon Aug 19, 2019 5:50 pm
by Leo
jpqy wrote: Sun Aug 18, 2019 12:12 pm AMD 3rd Gen Ryzen Threadripper ‘Sharkstooth’ With 32 Zen 2 Cores Possibly Spotted in Geekbench – Up To 35% Faster Than Ryzen Threadripper 2990WX

https://wccftech.com/amd-ryzen-threadri ... u-spotted/

JP.
Very good news.

Re: buying a new computer

Posted: Mon Aug 19, 2019 6:00 pm
by zullil
Leo wrote: Mon Aug 19, 2019 5:47 pm What does I/O and memory channel mean? How does it benefit chess engines?
More memory channels means that more data can travel between CPU and RAM in a given amount of time. So, for chess engines, reading information from the hash table becomes faster, for example.

Re: buying a new computer

Posted: Mon Aug 19, 2019 6:07 pm
by dragontamer5788
Leo wrote: Mon Aug 19, 2019 5:47 pm What does I/O and memory channel mean? How does it benefit chess engines?
I/O here means the number of PCIe lanes connected to the CPU. Each PCIe 3.0 lane supports about 985 MB/s in each direction to... well... anything: Ethernet, GPUs, NVMe SSDs, etc. On modern computers, you use x2 or x4 connections to SSDs (roughly 1.9GB/s and 3.9GB/s respectively), and x8 or x16 connections to GPUs (roughly 7.8GB/s and 15.7GB/s respectively). The M.2 slots usually use x4 connections.

Note that some CPUs support PCIe 4.0, which doubles the bandwidth per lane. However, GPUs and SSDs which use PCIe 4.0 are extremely new and extremely expensive, so the typical PC builder will probably stick with PCIe 3.0 parts.

Your typical CPU (Ryzen 7 or Intel i7) will only support x16 lanes, plus x4 lanes to the "southbridge". The "southbridge" is where all of the miscellaneous features of your motherboard live (USB connections, SATA, maybe a legacy PCIe 2.0 or even PCI / ISA port). Some motherboards put the NVMe drives "behind" the southbridge, which is annoying... to say the least.

A high-end CPU like Threadripper supports x64 lanes of I/O, allowing stupefying amounts of I/O bandwidth: https://www.guru3d.com/news-story/eight ... 8-gbs.html
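
The lane figures above come from simple arithmetic. Here is a minimal sketch, assuming the usual signalling rates of 8 GT/s per lane for PCIe 3.0 and 16 GT/s for PCIe 4.0, both with 128b/130b encoding. The function name `pcie_bandwidth_gb_s` is made up for illustration.

```python
# Per-direction PCIe bandwidth from the lane count.
# Assumptions: 8 GT/s per lane for PCIe 3.0, 16 GT/s for PCIe 4.0,
# both using 128b/130b line encoding. Protocol overhead is ignored.
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(lanes: int, generation: int = 3) -> float:
    """Return the theoretical one-way bandwidth in GB/s (1 GB = 1e9 bytes)."""
    if generation not in TRANSFER_RATE_GT_S:
        raise ValueError("only PCIe 3.0 and 4.0 are modelled here")
    bits_per_second = TRANSFER_RATE_GT_S[generation] * 1e9 * ENCODING_EFFICIENCY
    bytes_per_second = bits_per_second / 8
    return lanes * bytes_per_second / 1e9


if __name__ == "__main__":
    for lanes in (1, 2, 4, 8, 16):
        print(f"x{lanes}: {pcie_bandwidth_gb_s(lanes):.2f} GB/s")
```

These are theoretical peaks. Real devices will come in somewhat lower because of packet and protocol overhead, which this sketch does not model.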

-------------

DDR4 memory channels are similar: it's the number of DDR4 channels that the CPU supports. DDR4 is clocked at different rates, but typical DDR4 is 2600 MT/s, or roughly 21GB/s per channel (each transfer moves 8 bytes over the 64-bit channel). Two channels of 2600 RAM will give you 42GB/s, while Threadripper supports 4x channels, or 84GB/s.

Note that these memory channels work in parallel, which means you only really get the full bandwidth if you use multiple threads. A single core typically only gets a fraction of the memory bandwidth. This depends on details, such as NUMA architecture and microarchitectural details (in particular: cache hierarchy, load/store units, etc.). It's very difficult to write a program that uses all of the available memory bandwidth: you pretty much have to go AVX2 and very carefully lay out your memory accesses to work with the prefetchers. But more bandwidth will (in general) help multi-threaded programs.

When you have 16x cores or 32x threads running on one CPU, you'll probably start to run out of memory-bandwidth.
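
To put rough numbers on the memory side, here is a small sketch of the arithmetic, assuming each DDR4 channel is 64 bits (8 bytes) wide. The helper names are hypothetical, and the even per-thread split is only a crude model of what happens when every thread streams from RAM at once.

```python
# Theoretical DDR4 bandwidth: transfers per second times bytes per transfer.
# Assumption: each DDR4 channel is 64 bits (8 bytes) wide.
BYTES_PER_TRANSFER = 8


def ddr4_bandwidth_gb_s(mega_transfers_per_s: float, channels: int = 1) -> float:
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes) across all channels."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


def per_thread_share_gb_s(total_gb_s: float, threads: int) -> float:
    """Even split of the peak bandwidth when every thread streams memory at once."""
    return total_gb_s / threads


if __name__ == "__main__":
    quad = ddr4_bandwidth_gb_s(2600, channels=4)
    print(f"4 channels: {quad:.1f} GB/s")
    print(f"share per thread at 32 threads: {per_thread_share_gb_s(quad, 32):.1f} GB/s")
```

With four channels shared by 32 threads, each thread is left with only a few GB/s of peak bandwidth, which is why heavily threaded engines can become bandwidth-bound.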

---------------

I/O could benefit a chess engine if you use large amounts of GPUs, SSD drives, or Network (Ethernet) connections. You could feasibly put 4x GPUs (in 8x lane configuration) + RAID0 4x NVMe SSDs + 10GbE on a Threadripper for instance, giving 4x GPUs worth of compute and maybe 10GB/s read/write speed to your Tablebase.

Memory bandwidth is far more difficult to figure out, because that's engine and data-structure specific. The easiest way to take advantage of memory bandwidth is to increase the number of threads your program runs.

EDIT: Actually, both I/O and Memory are very difficult to fully take advantage of. Although you have very high bandwidth available on modern machines, I/O has huge latency (and even DDR4 to an extent). So you need to learn how to write asynchronous programs, or maybe heavily multithreaded programs, to fully utilize the I/O and Memory bandwidths available today.
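
As a rough illustration of why asynchronous code helps with high-latency I/O, here is a minimal sketch using Python's `asyncio`. The `probe` coroutine and the 20 ms latency are invented stand-ins for something like a tablebase lookup on disk or over the network, not real engine code.

```python
import asyncio
import time

LATENCY_S = 0.02  # hypothetical per-request latency, standing in for a disk or network probe


async def probe(position_id: int) -> int:
    """Simulated high-latency lookup; returns a dummy value."""
    await asyncio.sleep(LATENCY_S)
    return position_id * 2


async def probe_sequential(ids):
    # Each request waits for the previous one: total time is about len(ids) * LATENCY_S.
    return [await probe(i) for i in ids]


async def probe_concurrent(ids):
    # All requests are in flight at once: total time is about one LATENCY_S.
    return list(await asyncio.gather(*(probe(i) for i in ids)))


def timed(coro_fn, ids):
    """Run one of the coroutines above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn(ids))
    return result, time.perf_counter() - start


if __name__ == "__main__":
    ids = list(range(8))
    _, t_seq = timed(probe_sequential, ids)
    _, t_con = timed(probe_concurrent, ids)
    print(f"sequential: {t_seq:.3f}s, concurrent: {t_con:.3f}s")
```

Both versions return the same results. The concurrent one just keeps many requests outstanding, so the latency is paid roughly once instead of once per request.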

Re: buying a new computer

Posted: Mon Aug 19, 2019 8:18 pm
by Raphexon
zullil wrote: Mon Aug 19, 2019 6:00 pm
Leo wrote: Mon Aug 19, 2019 5:47 pm What does I/O and memory channel mean? How does it benefit chess engines?
More memory channels means that more data can travel between CPU and RAM in a given amount of time. So, for chess engines, reading information from the hash table becomes faster, for example.
L3 caches of 1gb thanks to RLDRAM in future will be crazy.

Re: buying a new computer

Posted: Mon Aug 19, 2019 8:42 pm
by dragontamer5788
Raphexon wrote: Mon Aug 19, 2019 8:18 pm
zullil wrote: Mon Aug 19, 2019 6:00 pm
Leo wrote: Mon Aug 19, 2019 5:47 pm What does I/O and memory channel mean? How does it benefit chess engines?
More memory channels means that more data can travel between CPU and RAM in a given amount of time. So, for chess engines, reading information from the hash table becomes faster, for example.
L3 caches of 1gb thanks to RLDRAM in future will be crazy.
RLDRAM3 is the last version, and nothing else seems announced. I don't think there are many opportunities for latency to be further reduced in the future.

In contrast, memory bandwidth continues to grow, exponentially even. HBM, HBM2, HBM3, GDDR6, and DDR5 all have greatly improved bandwidth over their predecessors. As such, the question is not how to improve external-RAM latency, but how to take advantage of better-and-better "dumb" bandwidth.

CPUs have SMT, Out-of-order execution, reordering buffers, and prefetching as their primary tools to take advantage of bandwidth. Programmers can use AVX2 / AVX512 to have wider load/stores and maybe use the bandwidth more effectively.

GPUs are almost exclusively SMT-based, but to a ridiculous degree. While CPUs are SMT2 (Intel "Hyperthreading"), or maybe SMT4 / SMT8 (Power9), GPUs effectively have SMT40 (AMD Vega), SMT20 (AMD Navi), SMT32 (NVidia Turing), or SMT64 (NVidia Volta): running 20 to 64 wavefronts per compute unit / streaming multiprocessor.

--------

SMT seems to be the most effective tool of the future for taking advantage of this bandwidth, which means learning to write wider-and-wider multithreaded programs.

It also means that large linked lists / tree structures will get relatively weaker and weaker as time goes on. It's like they say: "Disk is the new tape, Memory is the new Disk, Cache is the new RAM". Pointer-based structures will only really make sense if they fit in low-latency L3 cache, while traditional disk-based structures like B-Trees are beginning to be deployed in RAM to good effect.
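
To show the shape of the idea, here is a toy two-level index in the spirit of a B-Tree: keys are grouped into fixed-size blocks, and a lookup touches one small separator array plus one block instead of chasing a pointer per comparison. The names and the block size of 16 are assumptions for illustration, and Python lists do not give the real cache behaviour. A real implementation would use contiguous arrays in C or C++.

```python
from bisect import bisect_left, bisect_right

BLOCK = 16  # keys per block; hypothetical, picked so a block would span only a few cache lines


def build_index(sorted_keys):
    """Split sorted keys into fixed-size blocks and record each block's first key."""
    blocks = [sorted_keys[i:i + BLOCK] for i in range(0, len(sorted_keys), BLOCK)]
    separators = [block[0] for block in blocks]
    return separators, blocks


def contains(index, key):
    """Two memory regions touched per lookup: the separator array, then one block."""
    separators, blocks = index
    i = bisect_right(separators, key) - 1  # last block whose first key is <= key
    if i < 0:
        return False
    block = blocks[i]
    j = bisect_left(block, key)
    return j < len(block) and block[j] == key


if __name__ == "__main__":
    index = build_index(list(range(0, 1000, 3)))
    print(contains(index, 999), contains(index, 1000))
```

The design point is that wide nodes trade a few extra comparisons for far fewer dependent memory fetches, which suits hardware where bandwidth keeps growing but latency does not.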

Re: buying a new computer

Posted: Tue Aug 20, 2019 3:14 am
by ouachita
Reading all these posts on new high-end (CPU) computer systems made me go out and buy one . . .

Re: buying a new computer

Posted: Tue Aug 20, 2019 10:23 am
by Zenmastur
Joost Buijs wrote: Mon Aug 19, 2019 4:10 pm
Zenmastur wrote: Mon Aug 19, 2019 12:03 pm
Joost Buijs wrote: Mon Aug 19, 2019 6:44 am
jpqy wrote: Sun Aug 18, 2019 8:04 pm Ipman has benches with the new AMD Ryzen R9 3900X

and BMI2 is still clearly slower than the pop version on AMD CPUs.

http://www.ipmanchess.yolasite.com/amd- ... -bench.php

JP.
Indeed, it looks like they didn't fix it. I don't need it for computer chess per se, but I'm also doing cryptography and in a new chess engine I'm working on I make heavy use of PEXT() in the evaluation function.

AMD also has slow AVX2, no AVX512 and no BF16 like the upcoming Intel 'Cooper Lake' chips. If you just use the machine to run Stockfish or any other chess engine it probably doesn't matter, but if you like to program and experiment with new algorithms it could be a drawback.
They have full AVX2 support and all data paths and EUs are now 256-bit. So, AVX2 shouldn't be "slow" on the new CPUs. AVX2 performance is on par with all Intel CPUs, even when executing highly optimized Intel code. No AVX-512 support that I'm aware of.

Regards,

Zenmastur

Benchmarks show the same AVX2 performance because they compare 56 Intel cores with 128 AMD cores, so that is not really on par. AMD gives you a lot of cores for the money, but Intel still has the higher performance when you do the comparison with an equal number of cores.
Maybe you should buy an Intel 8 CPU 8280M system or the "NEW AND IMPROVED" 8 CPU 9200 based system!

If AVX is that important you should easily be able to justify the cost, right?

Regards,

Zenmastur

Re: buying a new computer

Posted: Tue Aug 20, 2019 10:40 am
by Dann Corbit
operations / dollar.

Re: buying a new computer

Posted: Tue Aug 20, 2019 11:22 am
by Zenmastur
Dann Corbit wrote: Tue Aug 20, 2019 10:40 am operations / dollar.
AVX is WAY too important to fall under that umbrella!