Space/Time Tradeoff in year 2020+: What can the community do with 20TB+ Hard Drives?
Posted: Fri Nov 08, 2019 7:41 pm
https://www.pcmag.com/news/371784/seaga ... es-in-2020
While there has been a lot of talk about new CPUs, GPUs and maybe flash storage (or "Optane") technologies, hard drive progress continues to march forward. Seagate will release 18TB "conventional" hard drives next year, along with 20TB "shingled" (SMR) hard drives; SMR offers higher capacity but very poor write performance. WD has similarly announced 18TB and 20TB "MAMR" (microwave-assisted) drives. Seagate is expected to report more progress on its HAMR (heat-assisted) drives, while Toshiba is currently shipping 16TB drives. We can reasonably expect HDD progress to continue.
So my question to the community: what can the chess community do with such huge storage capacities?
* IIRC, the 7-man Syzygy tablebases weigh in at 18.4TB. Since Syzygy is typically used read-only, a 20TB SMR drive could feasibly hold them. Maybe 2x or 3x drives could be used for multithreading: different threads could get their own hard drives to help speed up the search. If you have a 1Gbps connection, you can expect to download 18.4TB within 3 days (about 1.7 days at full line rate). But SMR drives are pretty slow to write, 40MB/s or slower, so it'd take over 5 days to write everything to the SMR drive!
* It might be more practical to wait for 20TB HAMR or MAMR drives and stay away from SMR for now. Or just RAID0 2x 18TB helium drives together.
* The 7-man Lomonosov tablebases weigh in at over 100TiB, requiring at least 7x 18TB hard drives to access (6x 18TB is only 108TB, or about 98TiB). Larger hard drives are still needed!
* Larger opening books: I don't know much opening-book theory, but surely bigger is better? The wiki suggests that an entry can be anywhere from 16 bytes to 100 bytes per position (depending on the methodology). An 18TiB hard drive could therefore store somewhere between roughly 200 billion and 1.2 trillion entries.
* EDIT: If we assume 500 writes per second, each 512 bytes long (the typical sector size of a hard drive), it would take 800+ days (closer to 900) to fill an 18TiB drive. And 500 IOPS is on the high end of what HDDs can do. Realistically speaking, the only way to fill 18TiB of hard drive space is to write sequentially, where typical HDDs achieve ~200MB/s. Any pragmatic use of an HDD would almost certainly require a 1TiB+ NVMe SSD (~100,000 writes/second and 2GB/s read/write speed) to "cache + reorder" data into a form that optimizes the HDD's usage. Even in this ideal "sequential" scenario, it would take over a day before all 18TiB were written to disk.
* For inspiration, it should be noted that IBM's "quantum supremacy" counter-paper leveraged 64PiB of hard drives for its 53-qubit simulation, and would require 128PiB for a 54-qubit simulation. Big hard drives can still accelerate big computing problems; the question is "how"?
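For anyone who wants to re-run the back-of-envelope numbers in the bullets above, here is a rough Python sketch. The helper names (days_to_transfer, drives_needed) are hypothetical, and every rate in it (1Gbps download, 40MB/s SMR writes, 500 IOPS of 512-byte random writes, 200MB/s sequential writes) is simply the assumption stated in the post, not a measurement.

```python
import math

TB = 10**12    # decimal terabyte, the unit drive vendors advertise
TiB = 2**40    # binary tebibyte
SECONDS_PER_DAY = 86_400


def days_to_transfer(total_bytes, bytes_per_second):
    """Days needed to move total_bytes at a sustained rate."""
    return total_bytes / bytes_per_second / SECONDS_PER_DAY


def drives_needed(total_bytes, drive_bytes):
    """Whole drives required to hold total_bytes (no RAID overhead)."""
    return math.ceil(total_bytes / drive_bytes)


syzygy_bytes = 18.4 * TB   # 7-man Syzygy size quoted in the post
drive_bytes = 18 * TiB     # the "18 TiB" drive used in the post's estimates

# Assumed rates, all taken from the post.
download_days = days_to_transfer(syzygy_bytes, 1e9 / 8)       # 1 Gbps link
smr_write_days = days_to_transfer(syzygy_bytes, 40e6)         # 40 MB/s SMR
random_fill_days = days_to_transfer(drive_bytes, 500 * 512)   # 500 IOPS x 512 B
sequential_fill_days = days_to_transfer(drive_bytes, 200e6)   # 200 MB/s

# Opening-book capacity at 100 bytes and 16 bytes per entry.
entries_at_100_bytes = drive_bytes // 100
entries_at_16_bytes = drive_bytes // 16

# Drives needed for a 100 TiB tablebase on 18 TB (decimal) drives.
lomonosov_drives = drives_needed(100 * TiB, 18 * TB)

if __name__ == "__main__":
    print(f"download over 1 Gbps:    {download_days:.1f} days")
    print(f"write to SMR at 40 MB/s: {smr_write_days:.1f} days")
    print(f"random 512 B writes:     {random_fill_days:.0f} days")
    print(f"sequential at 200 MB/s:  {sequential_fill_days:.1f} days")
    print(f"book entries: {entries_at_100_bytes:,} to {entries_at_16_bytes:,}")
    print(f"18 TB drives for 100 TiB: {lomonosov_drives}")
```

Under those assumptions the sketch gives about 1.7 days to download, 5.3 days to write to SMR, roughly 895 days of random writes versus 1.1 days sequential, and 7 drives for a 100TiB tablebase. Mixing TB and TiB shifts the results by about 10%, which is why the units are kept as separate constants.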