buying a new computer

Leo
Posts: 1078
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: buying a new computer

Post by Leo »

What does I/O and memory channel mean? How does it benefit chess engines?
Advanced Micro Devices fan.
Leo
Posts: 1078
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: buying a new computer

Post by Leo »

jpqy wrote: Sun Aug 18, 2019 12:12 pm AMD 3rd Gen Ryzen Threadripper ‘Sharkstooth’ With 32 Zen 2 Cores Possibly Spotted in Geekbench – Up To 35% Faster Than Ryzen Threadripper 2990WX

https://wccftech.com/amd-ryzen-threadri ... u-spotted/

JP.
Very good news.
Advanced Micro Devices fan.
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: buying a new computer

Post by zullil »

Leo wrote: Mon Aug 19, 2019 5:47 pm What does I/O and memory channel mean? How does it benefit chess engines?
More memory channels means that more data can travel between CPU and RAM in a given amount of time. So, for chess engines, reading information from the hash table becomes faster, for example.
dragontamer5788
Posts: 201
Joined: Thu Jun 06, 2019 8:05 pm
Full name: Percival Tiglao

Re: buying a new computer

Post by dragontamer5788 »

Leo wrote: Mon Aug 19, 2019 5:47 pm What does I/O and memory channel mean? How does it benefit chess engines?
I/O is the number of PCIe connections to the CPU. Each PCIe 3.0 "lane" supports about 985 MB/s to... well... anything: Ethernet, GPUs, NVMe SSDs, etc. On modern computers, you use x2 or x4 connections to SSDs (1.9GB/s and 3.9GB/s respectively), and x8 or x16 connections to GPUs (7.9GB/s and 15.7GB/s respectively). M.2 slots usually use x4 connections.

Note that some CPUs support PCIe 4.0, which doubles the bandwidth per lane. However, GPUs and SSDs which use PCIe 4.0 are extremely new and extremely expensive, so the typical PC builder will probably stick with PCIe 3.0 parts.
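
The lane arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this example, the 985 MB/s per-lane figure is the one quoted in the post, and PCIe 4.0 is modelled as exactly double PCIe 3.0, which is a simplification.

```python
# Approximate usable bandwidth of one PCIe 3.0 lane, as quoted in the post.
PCIE3_MB_PER_LANE = 985


def pcie_bandwidth_gb(lanes: int, gen: int = 3) -> float:
    """Rough one-direction bandwidth in GB/s for a link with `lanes` lanes.

    Assumption: PCIe 4.0 is treated as exactly twice PCIe 3.0.
    """
    if gen not in (3, 4):
        raise ValueError("only PCIe 3.0 and 4.0 are modelled here")
    per_lane_mb = PCIE3_MB_PER_LANE * (2 if gen == 4 else 1)
    return lanes * per_lane_mb / 1000


# Print the common link widths for PCIe 3.0.
for lanes in (2, 4, 8, 16):
    print(f"x{lanes}: {pcie_bandwidth_gb(lanes):.2f} GB/s")
```

These are peak link rates; real devices usually deliver somewhat less.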

Your typical CPU (Ryzen 7 or Intel i7) will only support x16 lanes + x4 lanes to the "southbridge". The "southbridge" is where all of the miscellaneous features of your motherboard live (USB connections, SATA, maybe a legacy PCIe 2.0 or even PCI / ISA port). Some motherboards lay out the NVMe drives "behind" the southbridge, which is annoying... to say the least.

A high-end CPU like Threadripper supports x64 lanes of I/O, allowing stupefying amounts of I/O bandwidth: https://www.guru3d.com/news-story/eight ... 8-gbs.html

-------------

DDR4 memory channels are similar: it's the number of DDR4 channels that the CPU supports. DDR4 is clocked at different rates, but typical DDR4 is 2600 MT/s, or roughly 21GB/s per channel. Two channels of 2600 RAM will give you 42GB/s, while Threadripper supports 4x channels, or 84GB/s.
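
For anyone who wants to redo this with their own RAM speed, here is a minimal sketch of the calculation. It assumes the standard 64-bit (8-byte) DDR4 channel width; the function name is just for illustration. The exact products come out slightly below the rounded 21 / 42 / 84 figures in the post.

```python
def ddr4_bandwidth_gb(mt_per_s: int, channels: int = 1, bus_bytes: int = 8) -> float:
    """Peak theoretical bandwidth in GB/s.

    transfers per second x bytes per transfer x number of channels.
    `bus_bytes=8` assumes the usual 64-bit DDR4 channel.
    """
    # MT/s x bytes gives MB/s; divide by 1000 for GB/s.
    return mt_per_s * bus_bytes * channels / 1000


# Single, dual and quad channel at the post's 2600 MT/s.
for channels in (1, 2, 4):
    print(f"{channels} channel(s): {ddr4_bandwidth_gb(2600, channels):.1f} GB/s")
```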

Note that these memory channels work in parallel, which means you only really get the full bandwidth if you use multiple threads. A single core typically sees only a fraction of the memory bandwidth. How much depends on details such as NUMA architecture and the microarchitecture (in particular: cache hierarchy, load/store units, etc.). It's very difficult to write a program that uses all of the memory bandwidth available: you pretty much have to go AVX2 and very carefully lay out your memory accesses to work with the prefetchers. But more bandwidth will (in general) help multi-threaded programs.

When you have 16x cores or 32x threads running on one CPU, you'll probably start to run out of memory-bandwidth.
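
To put a rough number on that, here is a back-of-the-envelope sketch of how the quad-channel figure divides up if every worker pulls from RAM at once. It assumes an even split, which real hardware does not guarantee, and the function name is hypothetical.

```python
def fair_share_gb(total_gb: float, workers: int) -> float:
    """Bandwidth per worker if `workers` threads split `total_gb` evenly.

    Assumption: a perfectly even split, ignoring NUMA and cache effects.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    return total_gb / workers


QUAD_CHANNEL_GB = 84.0  # the post's quad-channel DDR4 figure

for workers in (1, 4, 16, 32):
    share = fair_share_gb(QUAD_CHANNEL_GB, workers)
    print(f"{workers:>2} workers: {share:.3f} GB/s each")
```

At 16 cores that is 5.25 GB/s each, and at 32 threads about 2.6 GB/s each.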

---------------

I/O could benefit a chess engine if you use large amounts of GPUs, SSD drives, or Network (Ethernet) connections. You could feasibly put 4x GPUs (in 8x lane configuration) + RAID0 4x NVMe SSDs + 10GbE on a Threadripper for instance, giving 4x GPUs worth of compute and maybe 10GB/s read/write speed to your Tablebase.
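
As a sanity check on that build, here is a small lane-budget sketch. The x4 link for the 10GbE card is an assumption (the post does not say), the 64-lane total is the Threadripper figure from the post, and a real platform reserves some of those lanes for the chipset.

```python
TOTAL_LANES = 64           # Threadripper figure from the post
PCIE3_GB_PER_LANE = 0.985  # per-lane PCIe 3.0 figure from the post

# (device, count, lanes per device); the NIC width is an assumption.
build = [
    ("GPU", 4, 8),
    ("NVMe SSD", 4, 4),
    ("10GbE NIC", 1, 4),
]


def lanes_used(devices) -> int:
    """Total PCIe lanes consumed by a list of (name, count, lanes) entries."""
    return sum(count * lanes for _name, count, lanes in devices)


def raid0_peak_gb(drives: int, lanes_per_drive: int) -> float:
    """Theoretical RAID0 ceiling: every drive streaming at full link rate."""
    return drives * lanes_per_drive * PCIE3_GB_PER_LANE


print(f"lanes used: {lanes_used(build)} of {TOTAL_LANES}")
print(f"RAID0 link ceiling: {raid0_peak_gb(4, 4):.2f} GB/s")
```

The link ceiling for the four SSDs is higher than the "maybe 10GB/s" estimate, which is expected: the drives themselves rarely sustain the full link rate.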

Memory bandwidth is far more difficult to figure out, because that's engine and data-structure specific. The easiest way to take advantage of memory bandwidth is to increase the number of threads your program runs.

EDIT: Actually, both I/O and Memory are very difficult to fully take advantage of. Although you have very high bandwidth available on modern machines, I/O has huge latency (and so does DDR4, to an extent). So you need to learn how to write asynchronous programs, or maybe heavily multithreaded programs, to fully utilize the I/O and Memory bandwidths available today.
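
One way to see why latency forces you into asynchronous or heavily threaded code is Little's law: requests in flight = bandwidth x latency / request size. The sketch below applies it with illustrative latencies (80 ns for DRAM, 100 µs for an NVMe read); those two numbers are assumptions for the example, not measurements.

```python
def requests_in_flight(bandwidth_mb_s: int, latency_ns: int, request_bytes: int) -> int:
    """Requests that must be outstanding to keep a link saturated.

    Little's law: bytes in flight = bandwidth x latency.
    MB/s x ns equals thousandths of a byte, hence the factor of 1000.
    Integer ceiling division avoids floating-point rounding surprises.
    """
    milli_bytes = bandwidth_mb_s * latency_ns
    per_request = 1000 * request_bytes
    return -(-milli_bytes // per_request)


# Quad-channel DDR4 (84 GB/s), assumed 80 ns latency, 64-byte cache lines.
print(requests_in_flight(84_000, 80, 64))

# One x4 NVMe SSD (3.94 GB/s), assumed 100 us latency, 4 KiB reads.
print(requests_in_flight(3_940, 100_000, 4096))
```

Under these assumptions you need on the order of a hundred cache-line loads, or a hundred disk reads, outstanding at all times. A single thread that waits for each result before issuing the next request cannot get anywhere near that.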
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: buying a new computer

Post by Raphexon »

zullil wrote: Mon Aug 19, 2019 6:00 pm
Leo wrote: Mon Aug 19, 2019 5:47 pm What does I/O and memory channel mean? How does it benefit chess engines?
More memory channels means that more data can travel between CPU and RAM in a given amount of time. So, for chess engines, reading information from the hash table becomes faster, for example.
L3 caches of 1gb thanks to RLDRAM in future will be crazy.
dragontamer5788
Posts: 201
Joined: Thu Jun 06, 2019 8:05 pm
Full name: Percival Tiglao

Re: buying a new computer

Post by dragontamer5788 »

Raphexon wrote: Mon Aug 19, 2019 8:18 pm
zullil wrote: Mon Aug 19, 2019 6:00 pm
Leo wrote: Mon Aug 19, 2019 5:47 pm What does I/O and memory channel mean? How does it benefit chess engines?
More memory channels means that more data can travel between CPU and RAM in a given amount of time. So, for chess engines, reading information from the hash table becomes faster, for example.
L3 caches of 1gb thanks to RLDRAM in future will be crazy.
RLDRAM3 is the last version, and nothing else seems announced. I don't think there are many opportunities for latency to be further reduced in the future.

In contrast, memory bandwidth continues to grow, exponentially even. HBM, HBM2, HBM3, GDDR6, and DDR5 all have grossly improved bandwidth over their predecessors. As such, the question is not how to improve external-RAM latency, but how to take advantage of better-and-better "dumb" bandwidth.

CPUs have SMT, Out-of-order execution, reordering buffers, and prefetching as their primary tools to take advantage of bandwidth. Programmers can use AVX2 / AVX512 to have wider load/stores and maybe use the bandwidth more effectively.

GPUs are almost exclusively SMT-based, but to a ridiculous degree. While CPUs are SMT2 (Intel "Hyperthreading"), or maybe SMT4 / SMT8 (Power9), GPUs effectively have SMT40 (AMD Vega), SMT20 (AMD Navi), or SMT32 (NVidia Turing), or SMT64 (NVidia Volta): running 20 to 64 wavefronts per compute unit / streaming multiprocessor.

--------

SMT seems to be the most effective tool of the future for taking advantage of this bandwidth, which means learning to write wider-and-wider multithreaded programs.

It also means that large linked lists / tree structures will be relatively weaker and weaker as time goes on. It's like they say: "Disk is the new tape, Memory is the new Disk, Cache is the new RAM". Pointer-based structures will only really make sense if they fit in low-latency L3 cache, while traditional disk-based structures like B-Trees are beginning to be deployed in RAM to good effect.
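
A rough way to see the B-Tree point: count how many dependent memory fetches a lookup needs. The sketch below assumes each level of the tree costs one fetch and that a tree with fanout f and L levels holds f**L - 1 keys; the fanout of 16 is picked purely for illustration.

```python
def levels_needed(num_keys: int, fanout: int) -> int:
    """Smallest number of levels L such that fanout**L - 1 >= num_keys.

    A binary search tree has fanout 2; a B-Tree node with 15 keys has
    fanout 16. Each level is assumed to cost one dependent memory fetch.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels, capacity = 0, 1  # capacity tracks fanout**levels
    while capacity - 1 < num_keys:
        capacity *= fanout
        levels += 1
    return levels


keys = 1_000_000
print("binary tree levels:", levels_needed(keys, 2))
print("fanout-16 B-Tree levels:", levels_needed(keys, 16))
```

For a million keys that is 20 dependent fetches against 5. The wide node costs more bytes per fetch, but bandwidth is the resource that keeps getting cheaper.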
ouachita
Posts: 454
Joined: Tue Jan 15, 2013 4:33 pm
Location: Ritz-Carlton, NYC
Full name: Bobby Johnson

Re: buying a new computer

Post by ouachita »

Reading all these posts on new high-end (CPU) computer systems made me go out and buy one . . .
SIM, PhD, MBA, PE
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: buying a new computer

Post by Zenmastur »

Joost Buijs wrote: Mon Aug 19, 2019 4:10 pm
Zenmastur wrote: Mon Aug 19, 2019 12:03 pm
Joost Buijs wrote: Mon Aug 19, 2019 6:44 am
jpqy wrote: Sun Aug 18, 2019 8:04 pm Ipman has benches with the new AMD Ryzen R9 3900X

and BMI2 is still clearly slower than the pop version on AMD CPUs.

http://www.ipmanchess.yolasite.com/amd- ... -bench.php

JP.
Indeed, it looks like they didn't fix it. I don't need it for computer chess per se, but I'm also doing cryptography and in a new chess engine I'm working on I make heavy use of PEXT() in the evaluation function.

AMD also has slow AVX2, no AVX512 and no BF16 like the upcoming Intel 'Cooper Lake' chips. If you just use the machine to run Stockfish or any other chess engine it probably doesn't matter, but if you like to program and experiment with new algorithms it could be a drawback.
They have full AVX2 support and all data paths and EU's are now 256-bit. So, AVX2 shouldn't be "slow" on the new CPU's. AVX2 performance is on par with all intel CPU's EVEN when executing highly optimized Intel code. No AVX-512 support that I'm aware of.

Regards,

Zenmastur

Benchmarks show the same AVX2 performance because they compare 56 Intel cores with 128 AMD cores, so that is not really on par. AMD gives you a lot of cores for the money, but Intel still has the higher performance when you do the comparison with an equal number of cores.
Maybe you should buy an Intel 8 CPU 8280M system or the "NEW AND IMPROVED" 8 CPU 9200 based system!

If AVX is that important you should easily be able to justify the cost, right?

Regards,

Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: buying a new computer

Post by Dann Corbit »

operations / dollar.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: buying a new computer

Post by Zenmastur »

Dann Corbit wrote: Tue Aug 20, 2019 10:40 am operations / dollar.
AVX is WAY too important to fall under that umbrella!
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.