GPU performance increases had started to slow down, observed AMD corporate vice president for data centre GPU and accelerated processing Brad McCredie, so the company created a new GPU design from scratch to allow the construction of systems that can perform in the exaflop range.
Using the new AMD CDNA architecture, the MI100 GPU delivers up to 11.5 Tflops (FP64) performance for HPC workloads and up to 46.1 Tflops (FP32 matrix) for AI and machine learning workloads, the company claims.
This is three times faster than the MI50 for FP32 matrix operations, and the MI100 delivers around twice the peak performance per dollar compared with Nvidia's A100, McCredie said. That means customers can do "twice the amount of science for the same amount of money."
AMD's approach is to use separate architectures for gaming (RDNA) and HPC (CDNA). This was "an easy choice to make," said McCredie, as customers never want to do both on a single system.
CDNA is a "tremendous accomplishment" as it allows compute density to double, and AI workloads to run nearly seven times faster, he said.
Version 4.0 of the AMD ROCm developer software has been optimised for the MI100.
"We've received early access to the MI100 accelerator, and the preliminary results are very encouraging. We've typically seen significant performance boosts, up to 2-3x compared to other GPUs," said Oak Ridge Leadership Computing Facility director of science Bronson Messer.
"What's also important to recognise is the impact software has on performance. The fact that the ROCm open software platform and HIP developer tool are open source and work on a variety of platforms, it is something that we have been absolutely almost obsessed with since we fielded the very first hybrid CPU/GPU system."
He added "The ability to run molecular simulations that aren't just a few million atoms, but a few billion atoms, provides a more realistic representation of the science, and to be able to do that as a matter of course and over and over again will lead to a significant amount of important discoveries."
The MI100 is only being sold as a PCIe card. Bridge cards can interconnect up to four MI100s using the second-generation Infinity architecture, which provides up to twice the peak peer-to-peer bandwidth of PCIe 4.0. In a server, MI100 GPUs can be configured with up to two fully-connected quad GPU hives, each providing up to 552 GBps of P2P I/O bandwidth for fast data sharing.
Systems using the MI100 are expected by the end of 2020 from vendors including Dell, Gigabyte, HPE and Supermicro.
"Dell EMC PowerEdge servers will support the new AMD Instinct MI100, which will enable faster insights from data. This would help our customers achieve more robust and efficient HPC and AI results rapidly," said Dell Technologies senior vice president for PowerEdge servers Ravi Pendekanti.
"AMD has been a valued partner in our support for advancing innovation in the data centre. The high-performance capabilities of AMD Instinct accelerators are a natural fit for our PowerEdge server AI and HPC portfolio."
HPE vice president and general manager of HPC Bill Mannel said "Customers use HPE Apollo systems for purpose-built capabilities and performance to tackle a range of complex, data-intensive workloads across high-performance computing (HPC), deep learning and analytics.
"With the introduction of the new HPE Apollo 6500 Gen10 Plus system, we are further advancing our portfolio to improve workload performance by supporting the new AMD Instinct MI100 accelerator, which enables greater connectivity and data processing, alongside the 2nd Gen AMD Epyc processor. We look forward to continuing our collaboration with AMD to expand our offerings with its latest CPUs and accelerators."
In related news, AMD said it is still on track to begin volume shipments of the 3rd Generation Epyc 'Milan' processors with Zen 3 core to certain HPC and cloud customers this quarter, with the public launch expected in Q1 2021.
Pawsey Supercomputing Centre's forthcoming supercomputer, which will be Australia's most powerful, will be built around future AMD Epyc CPUs and Instinct accelerators.