It's not just the very fastest supercomputers that use Nvidia GPUs: two-thirds of the Top500 systems include the company's products, Nvidia officials announced.
Two years ago, Nvidia or Mellanox components were in just over 40% of the systems then on the Top500 list.
Nvidia Selene (part of the company's internal research cluster) is the fastest industrial supercomputer in the US and the second fastest in the world, and it takes a creditable seventh place on the Top500 list.
Comprising 280 DGX A100 servers containing a total of 2,240 A100 GPUs, 494 Mellanox InfiniBand switches and 7PB of flash storage, it delivers more than 1 exaflops on an AI benchmark.
It delivers more than 20 gigaflops per watt (putting it close to the top of the Green500 list), and was built in less than a month following Nvidia's reference architecture.
Using just 16 of its DGX A100 servers, Selene set a TPCx-BB benchmark record, performing 19.5 times faster than the previous record.
According to Nvidia, achieving the same performance with CPUs rather than GPUs would require 16 times as many servers, more than five times as many racks, three times as much power, and would cost seven times more.
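To make the scale of that comparison concrete, the short sketch below applies Nvidia's quoted ratios to the 16 DGX A100 servers used in the benchmark run. It is only an illustration of the arithmetic: the function and constant names are invented here, the rack figure is a lower bound ("more than five times"), and power and cost are given as multipliers because the article gives no absolute figures.

```python
# Ratios as quoted by Nvidia in the article; all names here are illustrative.
GPU_SERVERS = 16          # DGX A100 servers used for the TPCx-BB record
SERVER_RATIO = 16         # CPU servers needed per GPU server
MIN_RACK_RATIO = 5        # "more than five times as many racks" (lower bound)
POWER_RATIO = 3           # three times as much power
COST_RATIO = 7            # seven times the cost


def cpu_equivalent(gpu_servers: int = GPU_SERVERS) -> dict:
    """Scale the GPU configuration by Nvidia's claimed CPU-only ratios."""
    return {
        "cpu_servers": gpu_servers * SERVER_RATIO,
        "min_rack_multiplier": MIN_RACK_RATIO,
        "power_multiplier": POWER_RATIO,
        "cost_multiplier": COST_RATIO,
    }


if __name__ == "__main__":
    for key, value in cpu_equivalent().items():
        print(f"{key}: {value}")
```

On Nvidia's figures, matching the 16-server GPU result with CPUs alone would therefore take 256 servers.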
Apache Spark 3.0 is now GPU accelerated for data preparation as well as model training, allowing a single pipeline and simpler infrastructure.
The company also announced the first product that aligns Nvidia's strengths with those of recently acquired networking vendor Mellanox.
UFM Cyber-AI reads and processes network telemetry from supercomputers, explained senior vice president of marketing Gilad Shainer, and applies machine learning to sense or predict failures, or to detect suspicious activity.
Predicting failures lets systems managers fix issues before a failure occurs, improving availability, he said. The recent cases where European supercomputers were illegitimately commandeered to mine cryptocurrency could also have been detected, along with other abnormal behaviour.
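Nvidia has not described how UFM Cyber-AI works internally, so the following is only a generic sketch of the idea of spotting abnormal behaviour in telemetry. It assumes a single numeric metric sampled at regular intervals and flags any sample that strays far from the mean of the preceding window. The function name, window size, threshold and sample data are all hypothetical, and a simple rolling z-score stands in for whatever machine learning the product actually uses.

```python
from statistics import mean, stdev


def flag_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the trailing-window
    mean by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history: treat any change as anomalous.
            if samples[i] != mu:
                flagged.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical per-interval utilisation readings with one sudden spike,
    # such as an unexpected workload appearing on a node.
    telemetry = [10, 11, 10, 12, 11, 10, 11, 50, 11, 10]
    print(flag_anomalies(telemetry))
```

With the sample data above, only the spike at index 7 is flagged. A production system would track many metrics at once and learn what normal looks like over a much longer period.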
In related news, Nvidia revealed a PCIe version of its A100 GPU card. It has the same peak performance as the existing A100 SXM board, but is designed for incorporation in mainstream servers and has a lower thermal design power.
Major server vendors including Cisco, Dell, Fujitsu, Hewlett Packard Enterprise and Lenovo have announced support for the card, and around 50 servers using the PCIe A100 are expected to be available by the end of the year.
The PCIe A100 is an OEM product and will not be sold separately.
"Adoption of Nvidia A100 GPUs into leading server manufacturers' offerings is outpacing anything we've previously seen," said Nvidia vice president and general manager of accelerated computing Ian Buck.
"The sheer breadth of Nvidia A100 servers coming from our partners ensures that customers can choose the very best options to accelerate their data centres for high utilisation and low total cost of ownership."