Friday, 11 October 2024 05:42

AMD announces major new products, upping the ante in its bid for AI chip dominance

By David M Williams

AMD chair, president, and CEO Lisa Su celebrated her 10th anniversary in the top job with major new product announcements during the company's Advancing AI event in San Francisco. The company wants to be the leader in AI chip manufacturing, from the desktop through to the data centre, and it's bringing the goods to the table.

Not quite twelve months ago, Dr Su announced the company's Instinct MI300X accelerator, saying "AI is the most transformational technology in 50 years. Maybe the only thing close is the introduction of the Internet, but with AI the adoption has been much, much quicker and we're only at the beginning of the AI era."

iTWire dubbed the MI300X "a hefty silicon sandwich" with huge memory capacity, high bandwidth transmission speeds, vast cache, and GPUs and APUs all combined into a solid piece of hardware that took less space and used less energy, yet delivered higher output, than previous-gen products. Unlike other events iTWire attends, where companies speak about their intentions to produce something, AMD launched the MI300X and listed the customers who had already placed orders: Azure, Oracle, Meta, and OpenAI, among others.

AMD's expectation that generative AI would be in huge demand was right, so much so that the MI300X became "our fastest-ramping product in AMD history," Dr Su said, with sales already well over a billion dollars. In fact, if anything, AMD's forecast underestimated how rapidly the world would embrace genAI. Originally, AMD said the data centre AI accelerator market would grow from $45B in 2023 to over $400B by 2027; today, Su said the revised expectation is more than $500B by 2028, a compound annual growth rate of more than 60%. "The AI demand has continued to take off and exceed expectations," she said. "The rate of investment is driven by new use cases and larger models."
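That 60% figure checks out against the endpoints of the forecast. As a quick back-of-the-envelope sketch (our arithmetic, not AMD's):

```python
# Sanity check of AMD's revised forecast: a market growing from
# $45B (2023) to $500B+ (2028) implies the CAGR below.
start, end, years = 45e9, 500e9, 5

cagr = (end / start) ** (1 / years) - 1  # CAGR = (end/start)^(1/years) - 1
print(f"Implied CAGR: {cagr:.1%}")       # ~61.9%, i.e. "more than 60%"
```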

At the same time, AMD upgraded its forecast to investors for 2024 data centre GPU revenue from the previously expected US$4B to US$4.5B.

AMD has positioned itself to be a major player, if not the major player, in this worldwide demand for AI hardware, with major product launches that bring more power, more efficiency, and more capability, yet with a smaller footprint, a smaller energy bill, a faster return on investment, and a lower total cost of ownership than before - and certainly than its competitors are offering.

"Computing is at the heart of modern life," Dr Su said, citing examples from cloud, to healthcare, to industrial and automotive, from connectivity to PCs and gaming, and, of course, artificial intelligence. "Over the next decade AI will enable so many new experiences that will make computing an even more essential part of our lives." These experiences, she suggested, will include medical discoveries, smarter cities, resilient supply chains, and enhanced productivity.

"Our goal at AMD is to make AMD the end-to-end AI leader," she said, explaining this would be achieved through four broad themes.

  1. delivering the best high-performance, energy-efficient portfolio of training and inference compute engines
  2. embracing and supporting an open, proven, developer-friendly software platform
  3. deeply co-innovating with partners on an AI ecosystem
  4. delivering cluster-level systems design, bringing customers all the pieces: not merely the chip, but the rack, the switch, and more

These themes, or pillars, shone through in the AMD announcements and product launches, specifically:

  • the new AMD Instinct MI325X accelerator, announced now, in production in Q4 2024, with widespread system availability expected in Q1 2025
  • improvements and enhancements to the AMD ROCm software suite, which now natively supports more than a million open source models without modification
  • new AMD Ryzen AI PRO 300 series processors with up to 55 TOPS of AI processing power for Windows Copilot+ PCs
  • new AMD EPYC 9005 series processors for the data centre
  • new AMD Pensando Salina DPU and AMD Pensando Pollara 400 AI NIC to optimise data pipelines and GPU communication

That's a lot of new product information, and with products either out or nearly out, and customers and OEMs already lined up, it's a strong play from AMD as it continues to gain market share.


AMD Instinct MI325X accelerator

AMD has launched its new Instinct MI325X accelerator, featuring market-leading HBM3E (high bandwidth memory) capacity of 256GB. It is powered by the AMD CDNA 3 GPU architecture and offers 1.3 PFLOPS of peak FP16 performance and 2.6 PFLOPS of peak FP8 performance. AMD says it delivers AI inference with up to 1.4x higher throughput, and significantly lower latency, than the Nvidia H200.
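To put that 256GB of HBM3E in context, here is a rough sketch of the capacity arithmetic buyers tend to do. The model sizes and precisions below are illustrative assumptions, not AMD figures:

```python
# Rough sketch: minimum accelerators needed just to hold model weights.
# Ignores KV cache, activations, and overheads; figures are illustrative.
import math

HBM_PER_GPU_GB = 256  # AMD Instinct MI325X HBM3E capacity

def gpus_for_weights(params_b: float, bytes_per_param: float) -> int:
    """Minimum GPUs needed to store the raw weights."""
    weights_gb = params_b * bytes_per_param  # 1B params x 1 byte = 1GB
    return math.ceil(weights_gb / HBM_PER_GPU_GB)

print(gpus_for_weights(70, 2))   # 70B model in FP16 (2 bytes/param) -> 1
print(gpus_for_weights(405, 2))  # 405B model in FP16 -> 4
```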

AMD said the Instinct MI300 series is already powering the most popular GenAI solutions from the likes of Azure, OpenAI, Meta, Databricks, FlexAI, and World Labs. Partners taking up AMD Instinct devices include Microsoft, Oracle, Dell, HPE, Lenovo, Supermicro, ASUS, and Cisco, among others.

The MI325X uses the same form factor as the MI300X, making it easier to upgrade. Production commences in Q4 2024, with widespread system availability from partners expected in Q1 2025.

AMD has committed to an annual release cadence, and previewed its upcoming Instinct MI355X, already under way. This beast will add FP6 and FP4 support and will be based on the upcoming AMD CDNA 4 GPU architecture. At the eight-GPU platform level it will feature 2.3TB of HBM3E memory and 64TB/s of memory bandwidth, and will significantly improve performance for floating point maths, with 37 PFLOPS of FP8 and 18.5 PFLOPS of FP16 compute.


OEM partners announced their own Instinct MI325X-based products, such as the HPE ProLiant Compute XD685 and the Lenovo ThinkSystem SR685a.


AMD ROCm software enhancements

Tied in with the AMD Instinct GPU accelerators is AMD's ROCm software, a suite of tools, compilers, libraries, and utilities for creating GPU-powered software. AMD announced ROCm now delivers twice the inference and training performance it did last year, and supports three times as many models out of the box - more than one million, in fact.

AMD has also deepened its partnerships across the AI ecosystem, and many open source tools and libraries now provide native AMD Instinct support, including PyTorch, Triton, Hugging Face, vLLM, JAX, and SGLang.
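In practice, that native support means PyTorch code written against the familiar CUDA-style device API typically runs unchanged on AMD hardware, because ROCm builds of PyTorch expose the same torch.cuda interface. A minimal sketch, assuming a ROCm build of PyTorch is installed:

```python
import torch

# On a ROCm build of PyTorch, torch.cuda.is_available() reports AMD GPUs
# and torch.version.hip identifies the HIP/ROCm runtime (None on CUDA).
if torch.cuda.is_available():
    backend = "ROCm/HIP" if torch.version.hip else "CUDA"
    device = torch.device("cuda")
    print(f"{torch.cuda.get_device_name(0)} via {backend}")
else:
    device = torch.device("cpu")

x = torch.randn(1024, 1024, device=device)
y = x @ x  # dispatched to rocBLAS on ROCm, cuBLAS on CUDA
print(y.shape)
```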

AMD said performance has been significantly improved across diverse large language models such as Llama 3.1, Mixtral, and Qwen, and general inference improvement will continue with each subsequent ROCm release.

"We've spent the last 10 months laser-focused in ensuring customers can get up and running," Dr Su said.

"ROCm is enabling open innovation at scale," said AMD SVP of AI Vamshi Boppana.


AMD Ryzen AI PRO 300 series processors

It's not all about the data centre. AMD also launched new Ryzen AI PRO 300 series processors for commercial PCs. These new CPUs feature an integrated AI engine, known as an NPU or neural processing unit, that can deliver up to 55 TOPS - trillions of operations per second. Yes, 55 trillion. This comfortably exceeds the Microsoft Copilot+ requirement of 40 TOPS.

The processors provide multi-day battery life and enhanced productivity features for business users, and AMD's OEM partners already have plans for more than a hundred Ryzen AI PRO PC designs coming to market.

iTWire previously checked out the Microsoft Surface Laptop 7th Edition, which was among the very first devices to meet the Windows Copilot+ spec. The spec provides for fast, efficient machines with an NPU and more than 40 built-in AI models on the device itself. This means you can generate text, create images, and translate text in real time without a subscription or credits for an online AI product, or even an Internet connection. That Surface Laptop uses an Arm processor, but new Copilot+ PCs rocking the AMD Ryzen AI PRO 300 promise to be faster and longer-lasting.

The AMD Ryzen AI PRO 300 series processors are available now, as are Copilot+ PCs featuring them.


AMD EPYC 9005 series processors

Back to the data centre: AMD announced the 5th gen AMD EPYC CPUs, dubbed "Turin", based on the Zen 5 and Zen 5c core architectures. The new CPUs deliver record-breaking performance and efficiency for diverse workloads, featuring:

  • Up to 192 cores and 500W TDP
  • Up to 384MB of L3 cache
  • SP5 socket compatibility with "Genoa" EPYC CPUs
  • Up to 12 channels of DDR5-6400 and 128 PCIe 5.0 lanes
  • Advanced security features such as Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV)
  • Up to 2.7x performance increase compared to Intel's top-of-stack Emerald Rapids CPU
  • 60% more performance at the same licensing cost compared to Intel Xeon, meaning companies can extract greater value from software licensed per core
  • Industry-leading SPEC CPU benchmarks, exceeding Intel Xeon in SPECrate 2017_int_base performance and energy efficiency
  • 1.17x average generational IPC uplift for enterprise and cloud server workloads
  • Up to 4x workload performance improvements across various applications compared to previous generations and Intel Xeon

The AMD EPYC "Turin" is "the world's best for cloud, enterprise, and AI," Dr Su said. It scales up and scales out, offers confidential compute with trusted I/O, and brings hugely increased performance. So much so, in fact, that it delivers 60% more performance than a top-spec'd 5th gen Intel Xeon 8592 CPU, she said. "For software licensed per core, that's 60% more performance for no additional licensing cost."

"It has 3.9 times the throughput performance on MySQL OLTP," she said, as well as being "the fastest CPU for the most challenging high-performance computing problems. With up to 3.9x improved time-to-insight, researchers working on the most complicated problems in the world will get to their answers faster if they are using AMD."

AMD calculated that if a data centre refreshes its hardware every four years, it could achieve a 7:1 consolidation of its current equipment, based on top-of-the-line Intel Xeon Platinum 8280 servers from 2019. "The 5th gen EPYC can do the same amount of work of 1,000 servers with just 131 modern AMD EPYC 9965 servers," Dr Su said.

That's 87% fewer servers, 68% less power, and 62% lower TCO - not to mention the further savings on software licensing and floor space. "Use those savings to grow your business," she said.
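The server figure follows directly from the counts Dr Su quoted; here is the arithmetic as a quick sketch (the power and TCO percentages are AMD's own modelling and can't be derived from the counts alone):

```python
# Consolidation claim: 1,000 legacy Xeon servers replaced by
# 131 AMD EPYC 9965 servers.
legacy, modern = 1000, 131

reduction = 1 - modern / legacy
print(f"Server reduction: {reduction:.0%}")  # ~87% fewer servers
# The 68% power and 62% TCO savings are AMD estimates that also
# depend on per-server power and cost assumptions not shown here.
```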

Customers embracing AMD EPYC CPUs include Oracle Cloud Infrastructure, with OCI SVP Karan Batta stating Oracle has deployed AMD everywhere - CPU, GPU, and DPU. OCI customers like Uber run on this AMD hardware in OCI's data centres, as does PayPal, which uses Oracle's Exadata product.

Meta has deployed no fewer than one-and-a-half million EPYC CPUs. Further, "the Meta Llama 405B model now runs exclusively on the AMD MI300X for all live traffic due to its large memory and TCO advantages."

The 5th gen AMD EPYC "Turin" CPU is shipping now.


AMD Pensando Salina and AMD Pensando Pollara

AMD announced two new networking solutions designed to boost the performance of AI systems: the AMD Pensando Salina DPU and the AMD Pensando Pollara 400 NIC.

Salina is the third iteration of AMD's programmable DPU, offering double the performance, bandwidth and scale compared to its predecessor. It supports 400G throughput and focuses on optimising the front-end network of AI clusters, which delivers data to the cluster. Salina will be critical for enhancing performance, efficiency, security, and scalability in data-intensive AI applications.

Pollara 400 is the industry's first UEC (Ultra Ethernet Consortium)-ready AI NIC. It is designed to manage data transfer between accelerators and clusters in the back-end network of AI systems. The Pollara 400 supports next-generation RDMA software and is backed by an open networking ecosystem. This AI NIC will be crucial for maximising performance, scalability, and efficiency in accelerator-to-accelerator communication.

Both Salina and Pollara 400 are currently being sampled by customers and are expected to be available in the first half of 2025. 

The background to these devices is that data centre processors, servers, and systems are all interconnected. An AI system that performs training and inference will potentially have thousands upon thousands of GPUs. Yet research from Meta found that "at an average, 30% of the training cycle time is elapsed in waiting for networking" and "communication accounts for 40% to 75% of time with training and distributed inference models."

That is, large-scale AI work is constrained by the networking between the many components. To combat this, hardware providers had already created the DPU - which only two years ago former VMware CEO Raghu Raghuram said would become ubiquitous within the next decade of computing. Like Su, Raghuram's crystal ball didn't anticipate how rapidly the world's technology thirst would demand more, more, more from hardware. In fact, ChatGPT had yet to arrive on the scene when Raghuram made that comment.

As it turns out, the DPU has aided with data ingestion by handling work the CPU would otherwise need to do, but on its own it's not enough. AI systems demand consistent, sustained data transfer between GPUs, and something more must be done.

Ethernet is the clear winner when it comes to networking over options such as InfiniBand - Ethernet can scale a network to connect more than a million GPUs, while InfiniBand tops out at 48,000 - but even so, when you have systems that complex and that large, you need still more to balance the load, manage congestion, and rapidly recover from failed or lost packets.

A new Ethernet is needed to meet the growing network demands of AI and HPC at scale, which is why AMD, along with 96 other organisations, has created the Ultra Ethernet Consortium (UEC). The UEC 1.0 specification is expected to be released between January and March 2025.

This is the background behind the Pollara 400, an AI network interface card with built-in intelligent multi-pathing tech that can spray packets across multiple optimal routes. It avoids congestion, quickly detects lost packets, and handles them with selective retransmission and fast loss recovery, routing around traffic problems. In essence, when you have a vast network, the Pollara 400 transforms it into an efficient, high-performance highway.
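AMD hasn't published the algorithm behind its multi-pathing tech, but the idea can be illustrated in toy form: spray packets round-robin across the available paths, then retransmit only the packets that were lost. The sketch below is purely conceptual - every name and number is invented for illustration, not AMD's implementation:

```python
# Toy illustration of packet spraying with selective retransmission.
import random

def spray_and_recover(num_packets: int, path_loss: list) -> int:
    """Spray packets round-robin over paths; resend only lost packets.

    path_loss holds a loss probability per path. Returns total sends.
    """
    pending = list(range(num_packets))
    sends = 0
    while pending:
        still_lost = []
        for i, pkt in enumerate(pending):
            sends += 1
            if random.random() < path_loss[i % len(path_loss)]:
                still_lost.append(pkt)  # only these are retransmitted
        pending = still_lost
    return sends

random.seed(0)
# Four paths, one congested: lost packets are resent individually,
# rather than stalling or restarting the whole flow.
print(spray_and_recover(10_000, [0.001, 0.001, 0.05, 0.001]))
```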

The Pollara 400 uses AMD's 3rd gen P4 engine, can handle 120 million packets per second at a 400Gb/s wire rate, and supports 5 million concurrent connections. It uses software-defined networking to ensure it's scalable, updatable, and future-proof.

The AMD Pensando Salina 400, meanwhile, is the "best DPU for evolving front-end networks, the number one DPU for hyperscalers" - a 400G, PCIe Gen 5 device with 2x 400GE ports, dual DDR5 channels delivering 102GB/s of memory bandwidth, up to 128GB of DDR5 memory, and 16 Arm N1 cores. The Salina DPU features a stateful firewall, encryption, load balancing, network address translation, and storage offload.

The devices were not built in isolation but in close collaboration with companies such as Azure, Cisco, TensorWave, Oracle Cloud Infrastructure, and IBM Cloud.

Microsoft announced it is making an Azure smart switch based on the new AMD DPU, projecting more than $100m in savings; it is currently in testing.

The AMD Pensando products will continue to evolve through software upgrades, forming a "network that can evolve with the growing needs and demands brought on by AI," Dr Su said.
