The first Volta-based processor is the Tesla V100 data centre GPU, designed for AI inferencing and training, as well as HPC and graphics acceleration.
Volta uses 21 billion transistors to deliver performance equivalent to 100 CPUs for deep learning workloads.
"Deep learning, a groundbreaking AI approach that creates computer software that learns, has insatiable demand for processing power. Thousands of Nvidia engineers spent over three years crafting Volta to help meet this need, enabling the industry to realise AI's life-changing potential," said Nvidia founder and CEO Jensen Huang.
The V100 includes 640 of Nvidia's new Tensor Cores, which deliver the 120 teraflops of deep learning performance behind that 100-CPU claim.
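The 120-teraflop figure is roughly consistent with the core count. As a sanity check (assuming, as Nvidia's Volta material describes, that each Tensor Core performs one 4×4×4 mixed-precision matrix multiply-accumulate per clock, and taking the V100's quoted boost clock of about 1.455GHz):

```python
# Back-of-the-envelope check of the 120-teraflop claim.
# Assumptions (not stated in this article): each Tensor Core does a
# 4x4x4 matrix FMA per clock = 64 multiply-adds = 128 floating-point
# ops, at a ~1.455GHz boost clock.
tensor_cores = 640
flops_per_core_per_clock = 4 * 4 * 4 * 2   # 64 FMAs -> 128 FLOPs
boost_clock_hz = 1.455e9

peak_tflops = tensor_cores * flops_per_core_per_clock * boost_clock_hz / 1e12
print(f"{peak_tflops:.0f} teraflops")      # close to the quoted 120
```

The result lands just under 120 teraflops, matching the headline number to within rounding of the clock speed.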
For more traditional high-performance computing workloads, Volta's combination of CUDA and Tensor cores means a single server with Tesla V100 GPUs can replace hundreds of commodity CPUs.
Other improvements in the Volta architecture include a doubling of the throughput of the NVLink interconnect, and the use of HBM2 DRAM delivering 900GBps, a 50% increase in memory bandwidth.
The CUDA, cuDNN and TensorRT software has been optimised for Volta, making it easy for frameworks and applications to take advantage of the performance improvements.
Amazon Web Services, Google, Microsoft and Tencent indicated they would be offering Volta GPUs in the cloud, while Baidu, Facebook and the Oak Ridge National Laboratory expressed intent to use the new architecture.