AMD announces Instinct MI350 Series GPUs

There’s no shortage of AI hardware out there. But when you’re deploying massive models, accelerating inference at scale, or pushing HPC workloads to the edge, raw performance alone doesn’t cut it. You need performance that shows up in the real world, works with the infrastructure you already have, and delivers tangible ROI.

That’s exactly what the brand-new AMD Instinct™ MI350 Series GPUs were built to do. Unveiled at Advancing AI in June 2025, MI350 Series GPUs represent the latest leap forward in AI and HPC acceleration from AMD. Built for today’s most demanding compute environments—from generative AI to scientific simulation—these GPUs offer next-level performance, efficiency, and deployment flexibility you can count on from day one.

Performance Built for Real-World AI

Powered by the latest 4th Gen AMD CDNA™ architecture, the AMD Instinct™ MI350X and MI355X GPUs bring serious upgrades where it counts—throughput, memory, efficiency, and compatibility—so you can move faster, train bigger, and get more done without reinventing your data center.

The flagship AMD Instinct™ MI355X platform delivers up to 4X peak theoretical performance over the previous generation MI300X platform [1], based on architectural improvements and supported precision formats.

MI350 Series GPUs Accelerate your GenAI Outcomes
In AMD real-world inference testing with the Llama 3.1 405B model, the MI355X platform demonstrated substantial throughput gains compared to the MI300X platform across key generative AI tasks:

Up to 4.2X better performance in AI agent and chatbot workloads [2]
Up to 2.9X better performance in content generation [2]
Up to 3.8X better performance in summarization [2]
Up to 2.6X better performance in conversational AI [2]
Over 3X Generational Inference Improvement Chart
When compared to today’s most powerful competitive GPUs, the AMD Instinct™ MI355X platform continues to lead the way in some of today’s most popular AI workloads.

In FP4 inference tests across large language models like Llama 3.1 405B and DeepSeek R1, the MI355X platform consistently delivers higher throughput than the latest NVIDIA B200 platforms—highlighting real performance gains in the environments that matter most.

Up to 1.3X better inference throughput vs B200 on Llama 3.1 405B using vLLM [3]
Up to 1.2X better inference throughput vs B200 on DeepSeek R1 using SGLang [4]
Comparable performance vs GB200 on Llama 3.1 405B, showing broad software parity [5]
MI355X Delivers The Highest Inference Throughput Chart
For training, the MI355X platform delivered up to 1.13X faster time-to-train when compared to the NVIDIA B200 platform [6], and up to 1.12X faster time-to-train than the more expensive and more complicated NVIDIA GB200 platform on Llama 2-70B-LoRA [6] AI training workloads running in FP8 datatypes.
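To put those time-to-train factors in wall-clock terms, here is a minimal sketch. It assumes "N times faster" means the baseline time divided by the new time; that reading, and the helper name `time_saved_fraction`, are assumptions for illustration and not something the benchmark footnotes define.

```python
def time_saved_fraction(speedup: float) -> float:
    """Fraction of wall-clock time saved for a given speedup factor.

    Assumes speedup = baseline_time / new_time (an interpretation,
    not a definition taken from the benchmark disclosures).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


# The two time-to-train factors quoted in the text above.
for factor in (1.13, 1.12):
    saved = time_saved_fraction(factor)
    print(f"{factor:.2f}X faster -> about {saved:.1%} less time to train")
```

Under that assumption, a 1.13X speedup corresponds to roughly 11.5% less training time, and 1.12X to roughly 10.7%.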

Competitive Training Performance Chart
These results demonstrate that the AMD Instinct MI355X platform not only delivers architectural and efficiency gains—it’s also a top-tier choice for customers demanding the highest throughput on today’s largest generative AI models. Whether you're deploying with vLLM, TensorRT-LLM, or SGLang, AMD Instinct MI350 Series GPUs deliver leadership results.

Better Throughput. Smarter Economics.

The MI355X GPU isn’t just about being faster; it’s about being more efficient. On Llama 3.1 405B inference in FP4, it delivers up to 40% better tokens-per-dollar than B200 [7]. This translates directly into lower operational costs and better returns on overall investment in AI infrastructure.

Up to 40% More Tokens
If you care about cost savings, and who doesn’t, that’s a difference worth paying attention to.
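For readers who want to run the same kind of comparison on their own numbers, here is a minimal sketch of how a tokens-per-dollar figure can be computed. The function names and every input value are hypothetical placeholders chosen for illustration. They are not AMD or NVIDIA measurements or prices, and the benchmark methodology behind the 40% figure may differ.

```python
def tokens_per_dollar(tokens_per_second: float, cost_per_hour: float) -> float:
    """Tokens generated per dollar of accelerator time."""
    if cost_per_hour <= 0:
        raise ValueError("cost_per_hour must be positive")
    return tokens_per_second * 3600.0 / cost_per_hour


def relative_gain(candidate: float, baseline: float) -> float:
    """Fractional improvement of candidate over baseline (0.40 means 40% better)."""
    return candidate / baseline - 1.0


# Hypothetical placeholder inputs, NOT measured results or real prices.
baseline = tokens_per_dollar(tokens_per_second=1000.0, cost_per_hour=10.0)
candidate = tokens_per_dollar(tokens_per_second=1120.0, cost_per_hour=8.0)

print(f"baseline:  {baseline:,.0f} tokens per dollar")
print(f"candidate: {candidate:,.0f} tokens per dollar")
print(f"gain:      {relative_gain(candidate, baseline):.0%}")
```

The point of the sketch is that tokens-per-dollar combines two levers: a modest throughput advantage and a lower hourly cost multiply together, so neither has to be large on its own.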

Plug In and Scale

AMD Instinct™ MI350 Series GPUs are designed to drop into existing AMD platforms built on the UBB (Universal Base Board) infrastructure used for MI300 Series—no forklift upgrades or system re-architecture required. That means you can unlock next-gen performance with minimal friction, using the existing server chassis, power, and cooling infrastructure.

And when it comes to deployment, you’ve got options. AMD Instinct™ MI350X and MI355X platforms are available in both air-cooled and direct liquid-cooled versions, so you can scale based on your thermal and density goals, not your hardware limitations.

AMD Instinct MI350 Series Platforms
288GB of HBM3E Memory. No Compromises.

Each AMD Instinct™ MI350 Series GPU comes with 288GB of high-bandwidth HBM3E memory, enabling you to run a 520B+ parameter model on a single GPU [8]. That’s huge!

No model splitting, no interconnect bottlenecks, no extra layers of complexity, just faster time to results and simplified scaling.
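A quick back-of-the-envelope check shows how a model of that size could fit in 288GB. This is a sketch under stated assumptions: the text above does not say which precision the 520B figure relies on, so 4-bit weights are assumed here, capacity is treated as decimal gigabytes, and KV cache, activations, and runtime overhead are ignored. The helper names are hypothetical.

```python
GB = 10**9  # decimal gigabytes; an assumption about how capacity is quoted


def weight_bytes(num_params: float, bits_per_param: int) -> float:
    """Memory needed for the model weights alone, in bytes."""
    return num_params * bits_per_param / 8


def max_params(capacity_gb: float, bits_per_param: int,
               reserve_fraction: float = 0.0) -> float:
    """Largest parameter count whose weights fit in the given capacity.

    reserve_fraction sets aside part of memory for KV cache and
    activations; the default of 0.0 is an optimistic upper bound.
    """
    usable_bytes = capacity_gb * GB * (1.0 - reserve_fraction)
    return usable_bytes * 8 / bits_per_param


HBM_GB = 288  # per-GPU capacity stated in the text above

for bits in (16, 8, 4):
    need_gb = weight_bytes(520e9, bits) / GB
    print(f"{bits:>2}-bit weights: 520B params need {need_gb:,.0f} GB "
          f"(fits in {HBM_GB} GB: {need_gb <= HBM_GB})")

print(f"4-bit upper bound: {max_params(HBM_GB, 4) / 1e9:,.0f}B params")
```

Under these assumptions, 520B parameters at 4 bits take about 260 GB for weights, which leaves some headroom within 288 GB. At 8 or 16 bits the same model would not fit on one GPU.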

AMD ROCm™ Software Keeps Getting Better

Software matters, and AMD ROCm™ continues to prove it. With optimized support for Flash Attention, Transformer Engine, and tuned GEMM operations, AMD ROCm™ 7 software is driving meaningful gains across the stack.

For example, when comparing AMD ROCm 6 to ROCm 7 software, the AMD Instinct™ MI300X GPU shows:

3.5X average uplift in inference performance across a suite of industry-standard AI models [9]
3X average uplift in training performance across commonly used AI training workloads [10]
Accelerating Inference Performance Chart
Accelerating Training Performance Chart
These improvements reflect the ongoing AMD investment in software optimization, enabling customers to unlock significantly more performance from the same hardware over time. As a result, AMD Instinct MI350 Series GPUs benefit not only from architectural advancements, but also from major software-driven gains, delivering faster inference throughput and reduced time-to-train across key AI workloads.

Trusted Performance That Scales Forward

The AMD Instinct MI350 Series is already driving AI at scale for some of the most influential names in AI, and it is designed to scale forward and power the next wave of breakthroughs.

“Oracle Cloud Infrastructure continues to benefit from its strategic collaboration with AMD. We will be one of the first to provide the MI355X rack-scale infrastructure using the combined power of EPYC, Instinct, and Pensando. We've seen impressive customer adoption for AMD-powered bare metal instances, underscoring how easily customers can adopt and scale their AI workloads with OCI AI infrastructure. In addition, Oracle relies extensively on AMD technology, both internally for its own workloads and externally for customer-facing applications. We plan to continue to have deep engagement across multiple AMD product generations, and we maintain strong confidence in the AMD roadmap and their consistent ability to deliver to expectations.” - Mahesh Thiagarajan, Executive Vice President, Oracle Cloud Infrastructure
"Building on nearly two decades of collaboration, Dell Technologies and AMD are helping organizations leverage the full potential of AI—while reimagining data centers to be more agile, sustainable and future-ready. Joint innovations like high-performance, dense rack solutions for AMD Instinct™ MI350 Series GPUs and optimized AI Scale-Out networking drive real-world breakthroughs for smarter, more efficient AI environments.” - Ihab Tarazi, SVP and CTO for ISG, Dell Technologies
“Hewlett Packard Enterprise delivers some of the world’s largest and highly-performant AI clusters with the HPE ProLiant Compute XD servers and we look forward to delivering even greater performance with the new AMD Instinct MI355X GPUs. Our latest collaboration with AMD expands our decades-long joint engineering efforts, from the edge to exascale, and continues to advance AI innovation.” - Trish Damkroger, Senior Vice President and General Manager, HPC & AI Infrastructure Solutions, Hewlett Packard Enterprise
“We are seeing per device best-in-class performance, the linear scaling characteristics are extremely exciting to scale our large training workloads.” - Ashish Vaswani, CEO, Essential AI
"AMD MI355X GPUs are designed to meet the diverse and complex demands of today’s AI workloads, delivering exceptional value and flexibility. As AI development continues to accelerate, the scalability, security, and efficiency these GPUs deliver are more essential than ever. We are proud to be among the first cloud providers worldwide to offer AMD MI355X GPUs, empowering our customers with next-generation AI infrastructure.” - J.J. Kardwell, CEO, Vultr
Built on a proven foundation, the MI350 Series delivers greater throughput, higher efficiency, and software-driven acceleration for today’s largest AI workloads—scaling seamlessly to tackle even more complex demands ahead.

The Bottom Line

AMD Instinct™ MI350 Series GPUs deliver what matters most:
Performance that shows up in the workloads you actually run
Massive memory that simplifies how you scale
Efficiency that delivers strong ROI
Compatibility that lets you deploy without delay
This is what modern AI infrastructure demands: streamlined deployment, open software, and performance that scales with your demands. With AMD Instinct™ MI350 Series GPUs and platforms, you can accelerate innovation, simplify growth, and deliver results—on your terms.