COMPANY NEWS: Having detailed AMD's new data centre processor technologies and spoken with partners, AMD CEO Lisa Su introduced everything the company is doing in AI and the partners it is working with. She introduced AMD President Victor Peng, who brought on PyTorch founder Soumith Chintala and Hugging Face CEO Clement Delangue. Lisa then closed the show with a big announcement...
Lisa Su
There's incredible interest in AI across the industry, and when we look at it, AI is really the defining technology that's shaping the next generation of computing. It's AMD's largest and most strategic long-term growth opportunity. Now in AI, we're focused on three key areas.
First, it's delivering a broad portfolio of high-performance GPUs, CPUs and adaptive computing solutions for AI training and inference, spanning data centre, edge and intelligent endpoints. Second, it's developing an open and proven software platform to enable our AI hardware to be deployed broadly and easily. And third, it's working with the industry and expanding the deep, collaborative partnerships we have established to really enable the ecosystem to accelerate at scale, because in this space it's all about the ecosystem.
Now when you look at where we are today, we're actually very uniquely positioned with a broad portfolio of AI platforms across data centre, edge and endpoints. That's powered by a number of engines: our Ryzen AI engine, our Versal and Alveo products, our EPYC processors and, of course, our Instinct accelerators. Now, looking at where we're deployed today, it's in many, many different places. At the edge, for example, NASA uses our leadership FPGAs on the Mars rovers to accelerate AI-based image detection.
When you look at automotive, Daimler, Via and Subaru are just some of the customers using our AI silicon and software to power their driver-assist and advanced safety features. In healthcare, leaders like Clarius are using AMD adaptive SoCs for faster AI-based imaging solutions that allow doctors to make quicker and more accurate diagnoses. In industrial, customers like ABB are using our technology for AI-assisted robots, and Taco Cloud is using our products for vision applications such as AI-based privacy systems.
And earlier this year we launched our Ryzen 7040 series CPUs, the industry's first x86 processors with a dedicated AI engine. These have been ramping nicely, and we expect more than 70 Windows PC designs from the top OEMs to launch later this year powered by Ryzen AI. So when you look at all of that, there's no question that AI will be the key driver of silicon consumption for the foreseeable future, but the largest opportunity is in the data centre. And over the last six months or so, the broad adoption of generative AI with large language models has really taken this growth to a different level. So people keep asking me, what is the opportunity? What I'd like to say is, look, we are still very, very early in the life cycle of AI; there is so much opportunity for us. But when we try to size it, we think about the data centre AI accelerator TAM growing from something like $30 billion this year, at over a 50% compound annual growth rate, to over $150 billion in 2027. It may be higher or it may be lower.
But what I can say for sure is it's going to be a lot, because there is just tremendous, tremendous demand. Now, AMD has been investing in the data centre accelerator market for many, many years, and today we power many of the fastest supercomputers in the world that are using AI to solve some of the world's biggest challenges. For example, Oak Ridge National Laboratory has the number one supercomputer in the world, Frontier, the industry's first exascale supercomputer, running large AI models on AMD Instinct GPUs to accelerate their cancer research. In Finland, the LUMI supercomputer uses
AMD Instinct GPUs to power the largest Finnish large language model, with 13 billion parameters. We're also collaborating with researchers at the Allen Institute, who are using LUMI to create a state-of-the-art, fully open LLM with 70 billion parameters that will be used by the global scientific community. Microsoft uses EPYC and Instinct processors and has built the 11th-fastest supercomputer on the recent Top500 list to run AI and HPC workloads. And we're also working with a number of other companies, like Korea Telecom, on their 11-billion-parameter large language model. So let's take a deeper look at how LUMI is using EPYC CPUs and Instinct accelerators for AI.
A video on how AI is helping cancer research is shown.
It's just one of the many stories of how people are using AI to really accelerate next-generation systems. To enable generative AI you need best-in-class hardware, but you also need a great software ecosystem, so let me now invite AMD President Victor Peng to the stage to share more about the growing software ecosystem for AI solutions.
Thank you, Lisa. Good morning. It's really great to be here. Many of you may have attended our financial analyst day just about a year ago, where I had the great pleasure of talking to you about our vision of pervasive AI. It's great to be here now, just about a year later, to share where we are and where we're heading in terms of software. And it's especially exciting for me, not only because
we've now brought all of the AI efforts of the company under one roof in our newly formed AI group, but because, in talking to customers and partners, we can really see how we can help them realise this tremendous opportunity everybody sees, and solve some of their most challenging problems.
So I'm really super excited about that. I can sum up our direction and the current state of our overall AI software development with three words: open, proven and ready. This is a journey, and we've made really great progress in building a powerful software stack that works with the open ecosystem of models, libraries, frameworks and tools. Our platforms are proven in deployments today, as you heard from Lisa, on Frontier and LUMI, massive scaled-out operations. And the demand for our platforms is gaining tremendous momentum, and we are ready to meet it.
So before I focus on the software and the ecosystem, I'd like to share with you some of the progress we've made on our platforms. At CES we announced the Ryzen 7040 and, as Lisa mentioned, it is the first x86 CPU with an integrated AI accelerator, our XDNA AI engine. The 7040 is now in production, with features like video collaboration through Windows Studio Effects and ONNX Runtime support, which was announced at the recent Microsoft Build. In the embedded space, we're sampling our Versal AI products with leading customers in multiple markets, including automotive and industrial. And for our EPYC platform, our latest ZenDNN 4.0 release is integrated with TensorFlow and delivering very significant improvements on AI workloads.
Now, moving to data centre GPUs: we're working with leaders like Microsoft and other leading cloud service providers, as well as many nimble, very innovative smaller companies, and we're seeing demand for our GPUs grow tremendously and quite broadly. Now, Lisa is going to talk about the MI300 in a little more detail later, so let's move on to software.
Realising leadership application performance really does require a leadership software stack optimised for the ecosystem. Let me first cover ROCm, which is the software stack we have for our Instinct data centre GPUs. ROCm is a complete set of libraries, runtimes, compilers and tools needed to develop and run AI models. A significant portion of the ROCm stack is actually open: our drivers, language runtimes, tools like our debugger and profiler, and our libraries are all open. ROCm also supports the open AI software ecosystem. ROCm is now in its fifth generation.
And it includes a very comprehensive suite of optimisations for AI as well as high-performance computing workloads. On the AI side, for example, we have optimised kernels for large language models, support for data types like FP8, and support for emerging technologies like OpenAI's Triton. The ROCm stack also includes tools that allow easy porting of existing software to AMD Instinct platforms. Now, to ensure the quality and robustness of the platform, we run hundreds of thousands of framework tests nightly, and we validate across
thousands of AI operators and end-to-end models. Now, this provides a very sound foundation for PyTorch and TensorFlow compliance, paired with a really rich out-of-the-box AI experience. So now let's move up the stack to frameworks, and specifically PyTorch, which is one of the most popular and fastest-growing frameworks. And what better person to speak to that than one of the founders of PyTorch, who can talk about the collaboration AMD and PyTorch are doing to advance AI. So I'd like to invite Soumith Chintala to the stage to talk about PyTorch.
Victor Peng
Why don't we start off with you giving us a high-level overview of PyTorch and the highlights of the latest iteration, PyTorch 2.0.
Soumith Chintala
Sure thing. PyTorch is one of the more popular AI frameworks in the industry. It's used by several companies you'd be familiar with: Meta, obviously, is one of its biggest users, and there's OpenAI, Tesla; almost everyone in the industry probably uses PyTorch in some form if they're using AI. PyTorch is the fundamental software through which most AI, like neural network training and inference, happens. And we recently released PyTorch 2.0, which builds in a compiler that gives you speedups of about 50% to 100% over out-of-the-box PyTorch 1.x, and it's powered by OpenAI's Triton, the technology we talked about earlier. We're pretty excited about it, and we're seeing a lot of customers being super happy about it. That's PyTorch.
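For readers unfamiliar with the PyTorch 2.0 compiler Soumith mentions, here is a minimal sketch of how it is invoked in practice. The model and input shapes below are illustrative placeholders, not anything shown at the event; torch.compile is the real PyTorch 2.0 entry point.

```python
# Minimal sketch of the PyTorch 2.0 compile workflow described above.
# The model and tensor shapes are illustrative, not AMD's demo code.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.GELU(),
    nn.Linear(4096, 1024),
)

# torch.compile (new in PyTorch 2.0) traces the model and generates fused
# kernels via TorchInductor/Triton; the eager-mode code itself is unchanged.
compiled_model = torch.compile(model)

x = torch.randn(8, 1024)
with torch.no_grad():
    out = compiled_model(x)  # first call compiles; later calls reuse the optimised kernels
print(out.shape)
```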
Victor Peng
You know, those innovations you just talked about, with the significant speedups, we're really excited about. And we're super excited about partnering together, working with the PyTorch Foundation as a founding member. So can you share some of your thoughts about the AMD and PyTorch collaboration?
Soumith Chintala
For sure. The PyTorch and AMD collaboration goes way back several years. As my colleague Alexis talked about earlier today, AMD and Meta have been collaborating on these platforms, and PyTorch mostly came out of Meta. It's a multi-year collaboration: we've been giving AMD a lot of feedback on many aspects of hardware and software for running AI workloads, and AMD and Meta have been partnering to build out the ROCm stack and a bunch of PyTorch operators and integrations, and to robustly test the whole stack. I'm pretty excited about the current support, especially on the Instinct accelerators that ROCm enables, and I'm pretty excited about MI300 as well. I think this is just the start; I'm looking forward to seeing how customers find the maturity of the stack, but we've spent a lot of time trying to make sure it comes out well.
Victor Peng
Yeah, we're really excited about MI300, as well as several other things. So how do you see the collaboration we're doing together benefiting the developer community?
Soumith Chintala
Sure. Generally in AI workloads, one of the problems we have is a single dominant vendor, and when you write your workloads and then want them supported by switching to different hardware, there's a lot of work that goes into doing that; there's a lot of software work developers have to do to move their neural network workloads from one platform to another. One of the things we've done with AMD, with the ROCm stack and the PyTorch integration, is that you don't actually have to do that much work, or almost no work in a lot of cases, to go from one platform to the other. You might do a little bit of tuning work once you're deployed onto your AMD GPU, but it's super seamless, and so developers, I think, are going to have a huge productivity boost as they try to switch to the AMD backend of PyTorch versus the TPU or NVIDIA backends.
So I'm pretty excited about the overall productivity developers will have when they're switching to the AMD backend, starting with the Instinct GPUs, and hoping, you know, you'll enable it for all of your other classes of GPUs.
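To make the "almost no work" point concrete: ROCm builds of PyTorch expose AMD GPUs through the same torch.cuda device API, so typical device-agnostic code runs unchanged. A minimal sketch, assuming either a ROCm or a CUDA build of PyTorch is installed; the model here is a placeholder, not anything discussed on stage.

```python
# Sketch of device-agnostic PyTorch code. On a ROCm build of PyTorch,
# torch.cuda.is_available() returns True for AMD GPUs and the "cuda" device
# maps to HIP, so the same script runs on Instinct or NVIDIA hardware.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Linear(512, 512).to(device)
x = torch.randn(16, 512, device=device)

with torch.no_grad():
    y = model(x)

print(f"Ran on {device}: output shape {tuple(y.shape)}")
```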
Victor Peng
Absolutely, the goal is to enable all the developers across all of our platforms. So thank you so much for the partnership; really appreciate it.
Victor Peng
PyTorch 2.0 provides an open, performant and productive option for developers to develop their latest AI innovations, and that option, that creativity and democratisation, if you will, is super important. We're one of only two GPU platforms that is integrated with PyTorch upstream. OK, so we've covered ROCm and integration with open frameworks, which means it's time to move up to the top of the stack with AI models and algorithms.
Hugging Face is the leading enabler of AI model innovation in the open-source community. They offer an extremely wide range of models, including the transformers that are at the heart of generative AI, but also vision models and models for all kinds of other applications and use cases, spanning from multi-billion, even trillion, parameters down to just millions and low single-digit billions. So here to talk to us more about the groundbreaking work and the partnership we have with Hugging Face is Clement Delangue, CEO of Hugging Face. Thanks for coming. Why don't we start off with you sharing your thoughts about why open source matters for the growth and proliferation of AI.
Clement Delangue
Thanks for having me. So first, it's important to remember that most progress in AI in the past five to ten years has been thanks to open science and open source. Maybe we would be 50 years away from where we are today without it. Now, when we look to the future, open science and open-source AI are not only a way to accelerate technology, but also a way to level the playing field and distribute to all companies, startups, nonprofits and regulators the tremendous power of AI, the tremendous value creation of AI and the tremendous productivity gains. In the future, we want every single company to be able to train and run their own ChatGPT on AMD hardware, right?
And all of that allows companies to build AI themselves, rather than just using AI APIs, for example. By doing so, most of the time with customised, specialised, smaller models, it makes AI faster, cheaper and better to run. It also makes it safer, for the companies but also for the field in general, because it creates more opportunities for transparency and accountability, which fosters more ethical AI.
Victor Peng
I'm personally thrilled that we just recently formalised our relationship. Can you share your thoughts on the Hugging Face and AMD partnership?
Clement Delangue
Yes, we're super excited about this new partnership that we're announcing today. So Hugging Face has become the most-used open platform for AI. Today we have 15,000 companies using our software, and they have shared over half a million open models, datasets and demos, including some you might have heard of, like Stable Diffusion, Falcon, BLOOM, StarCoder and MusicGen, which was just released by Meta a few days ago. Over 5,000 new models were added just last week, and I think it shows the crazy speed of the open-source AI community these days.
We will optimise all of that for AMD platforms, starting with Instinct GPUs, followed by Ryzen, EPYC, Radeon and embedded products like Versal that we've heard about. We will also include AMD hardware in our regression tests for some of our most popular libraries, like Transformers, and in our CI, to ensure that new models like the 5,000 added last week are natively optimised for AMD platforms.
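As a rough illustration of what this means for developers, the standard Hugging Face Transformers workflow stays the same; only the underlying accelerator changes. A minimal sketch, assuming a ROCm or CUDA build of PyTorch and the transformers package; the model name is a small example from the Hub, not one of the jointly optimised models listed above.

```python
# Sketch of the standard Hugging Face Transformers workflow; on a ROCm build
# of PyTorch the same code targets an AMD GPU via the device index.
# The model name is a small illustrative example from the Hub.
import torch
from transformers import pipeline

device = 0 if torch.cuda.is_available() else -1  # first GPU if present, else CPU

generator = pipeline("text-generation", model="distilgpt2", device=device)
result = generator("Open-source AI matters because", max_new_tokens=30)
print(result[0]["generated_text"])
```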
Victor Peng
All those optimised models across our entire portfolio, starting with the Instinct platform. That's amazing: 5,000 models in just a week, that's crazy. So what are your thoughts on the benefit our partnership can provide to the developer community and the industry as a whole?
Clement Delangue
Yes. So, you know, the goal is to really have the best combination of hardware and software, right? It's really important that hardware doesn't become some sort of bottleneck or gatekeeper for AI as it develops. So what we're trying to do is extend the range of options for AI builders, both for training and inference. I'm super excited, in particular, about the ability of AMD to power large language models in data centres thanks to the memory capacity and bandwidth advantage. Ultimately, AI is becoming the default way to build all tech for all industries, right, from the language models we're talking about, but also image, audio and video; we're seeing more and more time series, biology, chemistry and many, many more domains. Hopefully this collaboration will be one step, a great step, to democratise AI even further and improve everyone's life.
Victor Peng
A truly inspired vision, and one we share at AMD; we deeply believe in that. And we're so excited to be working together with Hugging Face to make that vision a reality. Thank you so much.
Victor Peng
OK, with so many models being optimised, as you've heard, and soon running out of the box starting with our Instinct platforms, we're bringing the power of the open-source community, that speed of innovation and that breadth of models, onto AMD platforms. I think we all know that the rate of innovation in AI is tremendous, and the open-source community is clearly a major driver of that rate and breadth of innovation. So our software, integrated with and optimised for the open ecosystem to deliver a productive, performant and open development stack, is vital. Our platforms are proven at scale in production deployments, and AMD is ready to help our customers and developers achieve the next breakthrough. Thank you so much, and now I'd like to invite Lisa back onto the stage.
Lisa Su
Thank you, Victor. And a very, very special thanks to Soumith and Clem for joining us today and talking about our partnerships. No question, the software ecosystem is so important for enabling our hardware to be deployed more broadly. I would say it's a journey, and we know it's a journey, but we've made tremendous progress over the past year with ROCm and the most important frameworks like PyTorch, TensorFlow and ONNX, and also by really expanding our collaborations with key model developers and distributors like Hugging Face and many of the other open-source models out there.
Now, we always like to save hardware for last, so turning to AI hardware, let me say that generative AI and large language models have changed the landscape. The need for more compute is growing exponentially, whether you're talking about training or, frankly, about inference. Larger models give you better accuracy, and there's a tremendous amount of experimentation and development going on across the industry; you've heard that from both Soumith and Clem. At the centre of this are GPUs.
GPUs are enabling generative AI. So now let's turn to our Instinct GPU roadmap. CDNA is the underlying architecture for Instinct accelerators, designed specifically for AI and HPC workloads. CDNA 3 is our brand new architecture: it uses a new compute engine, the latest data formats, 5- and 6-nanometre process technology and the most advanced chiplet packaging technologies. At CES earlier this year we previewed MI300A, the world's first data centre APU. What we have is our CDNA 3 GPU architecture with 24 high-performance Zen 4 CPU cores, the same Zen 4 cores
that are in the leadership Genoa processors I talked about earlier in the show. We also add 128 gigabytes of HBM3 memory, all in a single package, and what we get is unified memory across the CPU and GPU, which is frankly very effective, particularly for some HPC workloads. This results in eight times more AI performance and five times better efficiency compared to the MI250X accelerator that is in the largest supercomputers today. MI300A has also been designed into supercomputers already: it's slated for the two-plus-exaflop El Capitan system at Lawrence Livermore National Laboratory. It's the most complex chip we've ever built, with more than 146 billion transistors across 13 chiplets.
Now, you guys know we've led the industry with the use of chiplets in our products, and our use of chiplets in this product is actually very, very strategic: we created a family of products. So, in addition to the MI300A product, with its chiplet construction we can actually replace the three Zen 4 CPU chiplets with two additional CDNA 3 chiplets to create a GPU-only version of MI300 optimised for large language models and generative AI.
We call this MI300X. Now, to address the larger memory requirements of large language models, we actually added an additional 64 gigabytes of HBM3 memory. So with that, I am super excited to show you, for the very first time, MI300X. If you were paying attention, you might see it looks very, very similar to MI300A, because basically we took three chiplets off, put two chiplets on and stacked more HBM3 memory. But what you see with MI300X is that we truly designed this product for generative AI. It combines CDNA 3 with an industry-leading 192 gigabytes of HBM3 and 5.2 terabytes per second of memory bandwidth, and it has 153 billion transistors across twelve 5-nanometre and 6-nanometre chiplets.
So I love this chip. Look, in this world you need great compute engines, but you also need a lot of memory for everything that's going on. So when you compare MI300X to the competition, MI300X offers 2.4 times more memory and 1.6 times more memory bandwidth. And with all of that additional memory capacity, we actually have an advantage for large language models, because we can run larger models directly in memory. What that does for the largest models is significantly reduce the number of GPUs you need, speeding up performance, especially for inference, as well as reducing total cost of ownership.
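The memory arithmetic behind that claim is straightforward: a model's weights need roughly (parameter count) × (bytes per parameter) of accelerator memory, before activations and KV cache. A rough, illustrative back-of-the-envelope sketch; the figures are not AMD benchmarks.

```python
# Back-of-the-envelope weight-memory estimate (weights only; excludes
# activations and KV cache). Figures are illustrative, not AMD benchmarks.
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """FP16/BF16 uses 2 bytes per parameter; FP32 would use 4."""
    return params_billions * 1e9 * bytes_per_param / 1e9

for params in (40, 66, 80):  # Falcon-40B, OPT-66B, and the ~80B bound mentioned below
    print(f"{params}B params at FP16 ≈ {weight_memory_gb(params):.0f} GB of HBM")

# A 192 GB accelerator can therefore hold roughly 80B FP16 weights on a single
# device, where the same model would otherwise be split across several GPUs.
```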
So of course, I want to show you the chip in action, so let's see MI300X in action for the very first time. For this demo, we wanted to show you a large language model running real-time inference on a single GPU, and we have it here with us. We're actually going to run the recently released Falcon-40B foundational large language model, which is currently the most popular model on Hugging Face, featuring 40 billion parameters.
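For reference, the conventional way to run a model like Falcon-40B in half precision with Hugging Face Transformers looks roughly like the sketch below. This is a generic illustration of the usual workflow, not the code used in the on-stage demo; tiiuae/falcon-40b is the public Hub checkpoint, and device_map="auto" assumes the accelerate package is installed.

```python
# Generic sketch of FP16 inference with Hugging Face Transformers; not the
# demo's actual code. Assumes an accelerator with enough memory for the full
# FP16 weights and the `accelerate` package for device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-40b"  # Falcon-40B checkpoint on the Hugging Face Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~2 bytes per parameter, roughly 80 GB of weights
    device_map="auto",          # place the model on the available accelerator(s)
)

prompt = "Write a poem about San Francisco."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```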
So let's watch, for the first time ever, MI300X running Falcon on a single GPU accelerator. All right, let's start. We have to give the model a prompt, and since we're here in San Francisco, let's say: write a poem about San Francisco. Here we go, the poem's coming; you can see it's responding in real time. I'm not a great poet, I don't know about you guys: "the city of dreams that always keeps you yearning for more". See, that poem's pretty good, huh? Now look, you guys have all used generative AI already, and you've seen a number of generative AI demos in the past few months. But what I want to emphasise as special about this demo is that it's the first time a large language model of this size can be run entirely in memory on a single GPU. We've run a number of other larger models, including Meta's 66-billion-parameter OPT model as well as the 65-billion-parameter LLaMA model, and just using FP16 inferencing, a single MI300X can run models up to approximately 80 billion parameters. So what does this actually mean? If you look at the industry today, you often see, first of all, that model sizes are getting much larger, and you actually need multiple GPUs to run the latest large language models.
With MI300X you can reduce the number of GPUs needed, and as model sizes continue growing, this will become even more important. So with more memory, more memory bandwidth and fewer GPUs needed, what this means for cloud providers as well as enterprise users is that we can run more inference jobs per GPU than before, and that enables us to deploy MI300X at scale to power next-generation LLMs with a lower total cost of ownership, really making the technology much, much more accessible to the broader ecosystem.
And what that also means is that not only do we believe we have better total cost of ownership, but we've also been able to significantly reduce the amount of development time needed to deploy on MI300X. Our goal with MI300X is to make it as easy to deploy as possible, and that means the infrastructure is also incredibly important, which is why I'm also excited to announce the AMD Instinct Platform. With this platform we're all about open infrastructure: what we're putting together is eight MI300X accelerators in the industry-standard OCP infrastructure, and for customers, what that means is they can use all of this AI
compute capability and memory of MI300X in an industry-standard platform that drops right into their existing infrastructure with very minimal changes. And by leveraging the OCP platform specification, we're accelerating our customers' time to market and reducing overall development costs, while making it really easy to deploy MI300X into their existing AI rack and server infrastructure. So to sum it up, let me give you a few key takeaways. First of all, we're incredibly excited about AI; we see AI everywhere. With MI300X, what we offer is leadership TCO for AI workloads. We're really, really focused on making it easy for our customers and partners to deploy, so the Instinct Platform really lowers the barriers to adoption. And frankly, on the enterprise-ready software stack,
we know this is so important, and we've made tremendous progress through our work on the frameworks and models with our partners, and there's going to be a lot more going on in this area over the next many years. Now let me talk about availability. I'm happy to say that MI300A began sampling to our lead HPC and AI customers earlier this quarter, and we're on track to begin sampling MI300X and the eight-GPU Instinct Platform in the third quarter. In addition, we expect both of these products to ramp in production in the fourth quarter of this year.
So I really look forward to sharing more details on the MI300 family when we launch later this year. Let me wrap things up for today. We showed you a tremendous amount of new data centre and AI technology: from our expanded portfolio of leadership EPYC server processors with Genoa, Bergamo and Genoa-X, to our industry-leading DPUs and SmartNICs, to our expanding AI software ecosystem and our next-generation MI300X data centre AI accelerators. It's an incredible set of innovation, truly exciting. But more than that, we really appreciate the partnership, and a very, very special thank you to the partners who are here with us today: AWS, Meta, Microsoft, Citadel, PyTorch and Hugging Face. We truly believe in co-development, co-innovation and partnership. It's been a great day as we take another major step forward to make AMD the data centre and AI partner of choice. Thank you so much for joining us.