AMD Unveils Vision for an Open AI Ecosystem


At the Advancing AI 2025 event, AMD unveiled its comprehensive vision for an integrated AI platform, as well as an open, scalable rack-level AI infrastructure built on industry standards.

Key points presented by AMD and its partners

  • Building an open AI ecosystem: Demonstrated progress in building an open AI ecosystem with the new AMD Instinct MI350 series accelerators.
  • Growth of the AMD ROCm ecosystem: Showed the continued expansion of the AMD ROCm software ecosystem.
  • New open designs and roadmap: Unveiled powerful new open rack-level designs and a roadmap to deliver leading rack-level AI performance beyond 2027.

Dr. Lisa Su, chair and CEO of AMD, said: “AMD is driving AI innovation at an unprecedented rate, highlighted by the launch of our AMD Instinct MI350 series accelerators, advances in our next-generation rack-level AMD ‘Helios’ solutions, and growing momentum for our ROCm open software stack. We are entering the next phase of AI, driven by open standards, shared innovation, and AMD’s expanding leadership across a broad ecosystem of hardware and software partners working together to define the future of AI.”

Lisa Su

Advanced solutions to accelerate the open AI ecosystem

AMD has announced a broad portfolio of hardware, software, and solutions to enable the full spectrum of AI:

  • AMD Instinct MI350 GPU Series: The Instinct MI350 Series GPUs were introduced, setting a new benchmark for performance, efficiency, and scalability in generative AI and HPC. The MI350 series, consisting of the Instinct MI350X and MI355X GPUs and platforms, delivers a 4x generation-on-generation increase in AI compute and a 35x generational leap in inference performance, paving the way for transformative AI solutions across industries. The MI355X also delivers significant price-performance gains, generating up to 40% more tokens per dollar than competing solutions.
  • End-to-end, open rack-level AI infrastructure: AMD demonstrated an end-to-end, open, rack-level AI infrastructure that is already being deployed with AMD Instinct MI350 series accelerators, 5th Generation AMD EPYC processors, and AMD Pensando Pollara NICs in hyperscale deployments such as Oracle Cloud Infrastructure (OCI) and will be widely available in H2 2025.

Preview of the next-generation Helios AI rack

  • Preview of the next-generation Helios AI rack: AMD also unveiled its next-generation AI rack, called “Helios”. It will be built on next-generation AMD Instinct MI400 series GPUs, which are expected to deliver up to 10x higher performance than the previous generation when running inference on Mixture of Experts models, along with Zen 6-based AMD EPYC “Venice” processors and AMD Pensando “Vulcano” network cards.
  • The latest version of the ROCm 7 open source software stack: Designed to meet the growing demands of generative AI and high-performance computing workloads while significantly improving the developer experience. ROCm 7 includes improved support for industry-standard frameworks, enhanced hardware compatibility, and new development tools, drivers, APIs, and libraries to accelerate AI development and deployment.
  • Energy efficiency: The Instinct MI350 series exceeded AMD’s five-year goal of improving the power efficiency of AI training and HPC nodes by a factor of 30, ultimately delivering a 38x improvement. AMD also announced a new 2030 goal of a 20x increase in rack-level energy efficiency from a 2024 baseline, enabling a typical AI model that requires more than 275 racks today to be trained on fewer than one fully utilized rack by 2030, using 95% less power.
  • AMD Developer Cloud: AMD announced the wide availability of the AMD Developer Cloud for global developers and open source communities. Designed specifically for fast, high-performance AI development, it gives users access to a fully managed cloud environment with the tools and flexibility to start AI projects and scale without limits. Strategic collaborations with leaders such as Hugging Face, OpenAI, and Grok prove the power of collaboratively developed, open source solutions.
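As a quick arithmetic aside, the two headline numbers in the efficiency goal are consistent with each other: a 20x gain in energy efficiency implies roughly 1/20 of the energy for the same work, i.e. about 95% less power. A minimal sketch of that check (pairing the two figures this way is our inference, not an AMD statement):

```python
# Sanity check of the figures quoted above: a 20x efficiency gain means the
# same workload needs 1/20 of the energy, i.e. a 95% power reduction.
efficiency_gain = 20                       # 2030 rack-level efficiency target vs. 2024
power_reduction = 1 - 1 / efficiency_gain  # fraction of power saved at equal work
print(f"{power_reduction:.0%} less power")  # prints "95% less power"
```

The 275-racks-to-one claim additionally assumes per-rack performance growth, which this back-of-the-envelope check does not cover.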

AMD Instinct MI350 GPU Series

Extensive ecosystem of partners

Today, seven of the 10 largest model builders and AI companies run production workloads on Instinct accelerators. These companies include Meta, OpenAI, Microsoft, and xAI, who joined AMD and other partners at Advancing AI to discuss how they are working with AMD on AI solutions to train leading AI models, deliver insights at scale, and accelerate AI research and development:

  • Meta: Detailed how the Instinct MI300X has been extensively deployed for Llama 3 and Llama 4 inference. Meta expressed excitement about the MI350 and its processing power, TCO performance, and next-generation memory. Meta continues to work closely with AMD on AI roadmaps, including plans for the Instinct MI400 series platform.
  • OpenAI: CEO Sam Altman discussed the importance of holistically optimized hardware, software, and algorithms, as well as OpenAI’s close partnership with AMD in AI infrastructure, with GPT research and models on Azure in production on the MI300X, and deep design development on the MI400 series platforms.
  • Oracle Cloud Infrastructure (OCI): One of the first industry leaders to implement AMD’s open rack-level AI infrastructure with AMD Instinct MI355X GPUs. OCI leverages AMD processors and GPUs to deliver balanced, scalable performance for AI clusters, and announced that it will offer zettascale AI clusters accelerated by the latest AMD Instinct GPUs, with up to 131,072 MI355X GPUs, to enable customers to build, train, and infer AI at scale.
  • HUMAIN: Discussed its landmark agreement with AMD to build an open, scalable, resilient, and cost-effective AI infrastructure using the full range of computing platforms that only AMD can provide.
  • Microsoft: Announced that the Instinct MI300X now powers both proprietary and open source models in production on Azure.
  • Cohere: Shared that its high-performance, scalable Command models are deployed on Instinct MI300X, delivering enterprise-grade LLM inference with high throughput, efficiency, and data privacy.
  • Red Hat: Described how its expanded collaboration with AMD enables production-ready AI environments, with AMD Instinct GPUs on Red Hat OpenShift AI delivering powerful, efficient AI processing in hybrid cloud environments.
  • Astera Labs: Emphasized how the open UALink ecosystem accelerates innovation and delivers greater value to customers, and shared plans to offer a comprehensive portfolio of UALink products to support next-generation AI infrastructure.
  • Marvell: Joined AMD to highlight its collaboration as part of the UALink consortium, which develops open connectivity, bringing maximum flexibility to AI infrastructure.
