AMD Instinct MI355X (288GB HBM3E)

Overview

AMD Instinct MI355X is the 288GB HBM3E upgrade variant of the MI350 series, launching in H2 2025; the MI350X carries 192GB HBM3E. Built on the CDNA 4 architecture and TSMC's 3nm process, it offers 288GB of HBM3E memory (the largest HBM capacity among the accelerators compared here), 8 TB/s of memory bandwidth (on par with the NVIDIA B200), and 4.6 PFLOPS of dense FP8 compute. AMD positions it against the NVIDIA B200 and ahead of its own MI400, targeting large-model AI training and inference.

Key upgrades (vs MI350X 192GB):

HBM capacity: 192GB → 288GB (+50%)
Memory bandwidth: 6.4 TB/s → 8 TB/s (+25%)
FP8 dense: 3.6 PF → 4.6 PF (+28%)
Precision formats: adds FP4 and FP6 support (MI350X supports FP8 only)
Interconnect: UALoF (Ultra Accelerator Link Fabric) at 600 GB/s (MI350X is limited to 200 GB/s over PCIe)

Core Specifications

Item	Specification
Architecture	AMD CDNA 4 (same architecture as MI350)
Process Node	TSMC 3nm (N3)
GPU Cores	256 CDNA 4 Compute Units
HBM	288 GB HBM3E (largest in the industry)
HBM Channels	8 stacks × 36GB HBM3E
Memory Bandwidth	8 TB/s (on par with NVIDIA B200)
FP4 sparse	10.1 PFLOPS
FP6 sparse	10.1 PFLOPS
FP8 dense	5 PFLOPS
BF16 dense	2.5 PFLOPS
FP16 dense	2.5 PFLOPS
FP32	115 TFLOPS
TDP	~1400 W (typical board power)
Form Factor	OAM / PCIe Gen5 ×16
Interconnect	UALoF 600 GB/s (competes with NVLink 5)
Volume Production	H2 2025
Unit Price (OAM)	~$25,000 (estimated)

Comparison with MI350X 192GB

Metric	MI355X 288GB	MI350X 192GB	Improvement
Process Node	3nm	3nm	Same
HBM Capacity	288GB	192GB	+50%
HBM Bandwidth	8 TB/s	6.4 TB/s	+25%
FP8 dense	4.6 PF	3.6 PF	+28%
FP4 Support	Yes (9.2 PF sparse)	No	New
FP6 Support	Yes (6.9 PF sparse)	No	New
Interconnect	UALoF 600 GB/s	PCIe 5.0 200 GB/s	3×
TDP	~1400W	1000W	+40%
Price (estimated)	~$25K	~$20K	+25%
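
The Improvement column is a plain relative change between the two cards. A minimal sketch of that arithmetic, using the capacity, bandwidth, FP8, and price figures quoted in the table; `pct_change` is a hypothetical helper introduced here for illustration, and the prices are the document's estimates rather than list prices.

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# (MI350X value, MI355X value) as quoted in the comparison table.
ROWS = {
    "HBM capacity (GB)": (192, 288),
    "HBM bandwidth (TB/s)": (6.4, 8.0),
    "FP8 dense (PF)": (3.6, 4.6),
    "Price (USD, estimated)": (20_000, 25_000),
}

if __name__ == "__main__":
    for name, (old, new) in ROWS.items():
        print(f"{name}: {pct_change(old, new):+.0f}%")
```

The same helper works for any other row, provided both cells use the same unit and the same dense or sparse convention.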

Comparison with NVIDIA B200

Metric	AMD MI355X	NVIDIA B200	Difference
Memory	288GB HBM3E	192GB HBM3E	MI355X +50%
Bandwidth	8 TB/s	8 TB/s	Same
FP8	4.6 PF dense	4.5 PF sparse	MI355X ahead (dense vs sparse, not like-for-like)
FP4 sparse	9.2 PF	9 PF	Roughly equal (MI355X +2%)
BF16	2.3 PF dense	2.25 PF sparse	MI355X ahead (dense vs sparse, not like-for-like)
Interconnect	UALoF 600 GB/s	NVLink 5 1.8 TB/s	B200 3×
TDP	~1400W	1000W	MI355X +40%
Software	ROCm 7 + Open	CUDA + Proprietary	AMD open
Price	~$25K	$30-40K	MI355X roughly 17-38% lower

MI355X advantages: largest HBM capacity (288GB), an open interconnect (UALoF), and a lower estimated price, which together make it a strong hardware choice for large-model inference.

8 TB/s Memory Bandwidth Technology

Dimension	Implementation
HBM3E	8 stacks × 1024-bit wide
Pin data rate	~8 Gbps (8 TB/s across 8 × 1024 bits works out to about 7.8 Gbps per pin)
PHY	AMD custom Infinity Fabric memory controller
Prefetch	Adaptive prefetch algorithms
Error Correction	On-die ECC + Side-band ECC
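
Peak HBM bandwidth follows from the number of stacks, the interface width per stack, and the per-pin data rate. The sketch below shows that relationship in both directions, using the 8 stacks and 1024-bit width from the table; the function names are hypothetical, and decimal units (1 TB/s = 1000 GB/s) are assumed.

```python
def hbm_bandwidth_tbps(stacks: int, bits_per_stack: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth in TB/s for a given per-pin data rate in Gbit/s."""
    total_pins = stacks * bits_per_stack
    gb_per_s = total_pins * pin_rate_gbps / 8  # bits -> bytes
    return gb_per_s / 1000


def implied_pin_rate_gbps(stacks: int, bits_per_stack: int, bandwidth_tbps: float) -> float:
    """Per-pin data rate in Gbit/s needed to reach a target bandwidth in TB/s."""
    total_pins = stacks * bits_per_stack
    return bandwidth_tbps * 1000 * 8 / total_pins


if __name__ == "__main__":
    # Pin rate needed for 8 TB/s over 8 x 1024-bit stacks.
    print(implied_pin_rate_gbps(8, 1024, 8.0))
    # Bandwidth delivered by a round 8 Gbit/s pin rate.
    print(hbm_bandwidth_tbps(8, 1024, 8.0))
```

This is a peak figure only; sustained bandwidth depends on access patterns and controller efficiency, which the sketch does not model.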

UALoF (Ultra Accelerator Link Fabric)

Dimension	Specification
Bandwidth	600 GB/s bidirectional
Topology	Fully connected / Dragonfly+
Protocol	Custom (NVLink-like but open)
Latency	~1 μs
Support	MI300X / MI325X / MI350X / MI355X / MI400 full product line
Governance	UALink Consortium (formed in 2024; AMD / Intel / Meta / Microsoft / Google and others)
2025 Members	30+ companies
vs NVLink	One third of NVLink 5's 1.8 TB/s, but an open standard (NVLink is proprietary)

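Strategic significance of UALoF: it is meant to break NVIDIA's NVLink lock-in. The B200's 1.8 TB/s NVLink offers 3× the bandwidth of UALoF, but because UALoF is an open interconnect, accelerators from any vendor that adopts it (for example Groq, Habana, or Tenstorrent) could in principle share one fabric, which positions it as a candidate standard for future AI datacenter interconnects.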
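
To get a feel for what a 3× bandwidth gap means in practice, the sketch below estimates the time of a ring all-reduce, a common collective in multi-GPU training. It is a rough cost model under stated assumptions, not a measurement: the ring algorithm, the 10 GB message, the 8-GPU group, and the treatment of the quoted 600 GB/s and 1.8 TB/s as usable per-GPU link bandwidth are all assumptions made here, and `ring_allreduce_seconds` is a hypothetical helper.

```python
def ring_allreduce_seconds(
    message_gb: float,
    n_gpus: int,
    link_gb_per_s: float,
    hop_latency_s: float = 0.0,
) -> float:
    """Estimated ring all-reduce time.

    A ring all-reduce takes 2 * (N - 1) steps (reduce-scatter, then
    all-gather); each step moves one chunk of size message / N per GPU.
    """
    steps = 2 * (n_gpus - 1)
    chunk_gb = message_gb / n_gpus
    return steps * (chunk_gb / link_gb_per_s + hop_latency_s)


if __name__ == "__main__":
    for name, bw in (("600 GB/s link", 600.0), ("1.8 TB/s link", 1800.0)):
        t = ring_allreduce_seconds(message_gb=10.0, n_gpus=8, link_gb_per_s=bw)
        print(f"{name}: {t * 1000:.1f} ms")
```

With latency ignored, the estimate scales inversely with link bandwidth, so the 3× bandwidth ratio shows up directly as a 3× difference in communication time for large messages.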

Vendor Information

Item	Details
Company	Advanced Micro Devices (AMD)
Product Page	https://www.amd.com/en/products/accelerators/instinct-mi350.html
CEO	Lisa Su
Foundry	TSMC 3nm
H2 2025 Volume Production	Yes
2026 Roadmap	MI400 (3nm+, 432GB HBM4)
2025 Revenue (MI business)	~$8B (+80% YoY)
Key Customers	Microsoft Azure (MAI platform), Meta, Oracle, Anthropic, Tenstorrent, newly launched LaminiAI

AMD Instinct Product Line

Product	Launch	Memory	FP8 dense	Status
MI250X	Q4 2021	128GB HBM2E	Not supported (FP16: 383 TF)	EOL
MI300X	Q4 2023	192GB HBM3	1.3 PF	In production
MI325X	Q4 2024	256GB HBM3E	2.6 PF	In production
MI350X	Q3 2025	192GB HBM3E	3.6 PF	In production
MI355X	H2 2025	288GB HBM3E	4.6 PF	New
MI400	2026	432GB HBM4	40 PF FP4 dense	Roadmap

Key Features

288GB HBM3E: Industry's largest HBM capacity, exceeding NVIDIA B200's 192GB
8 TB/s bandwidth: Memory bandwidth on par with NVIDIA B200
FP4 / FP6 / FP8 multi-precision: New low-precision support (same era as NVIDIA Blackwell)
UALoF 600 GB/s: Open interconnect, competing with NVLink
Helios rack: 72× MI355X + 36× EPYC Venice + Pensando NIC (H2 2025)
Open ROCm software: vs proprietary CUDA
Drawback: the ROCm software stack is still less mature than CUDA, by an estimated 2-3 years

Helios Rack (72-GPU)

Item	Configuration
GPU Count	72× MI355X
CPU Count	36× EPYC Venice (256-core Zen 6)
NIC	Pensando Vulcano 800GbE
GPU Interconnect	UALoF fully connected
CPU-GPU	PCIe Gen5 x16 + Infinity Fabric
Total Memory	20.7 TB HBM3E
Total Compute	331 PF FP8 dense
Rack TDP	~80 kW
Launch	H2 2025 (alongside MI355X)
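
The rack totals above are the per-GPU figures multiplied by the GPU count. A minimal sketch of that calculation, using the 288 GB and 4.6 PF (FP8 dense) per-GPU values quoted in this document; `rack_totals` is a hypothetical helper, and decimal units (1 TB = 1000 GB) are assumed.

```python
def rack_totals(gpu_count: int, hbm_gb_per_gpu: float, fp8_pf_per_gpu: float) -> tuple[float, float]:
    """Return (total HBM in TB, total FP8 dense compute in PF) for a rack."""
    total_hbm_tb = gpu_count * hbm_gb_per_gpu / 1000
    total_fp8_pf = gpu_count * fp8_pf_per_gpu
    return total_hbm_tb, total_fp8_pf


if __name__ == "__main__":
    hbm_tb, fp8_pf = rack_totals(gpu_count=72, hbm_gb_per_gpu=288, fp8_pf_per_gpu=4.6)
    print(f"Total HBM: {hbm_tb:.1f} TB")
    print(f"Total FP8 dense: {fp8_pf:.0f} PF")
```

These are aggregate peak numbers; usable capacity and throughput across 72 GPUs also depend on how a workload is partitioned over the interconnect.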

Use Cases

✅ Large model training (288GB accommodates larger models, UALoF multi-card interconnect)
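✅ LLM inference (on one card, 288GB fits Llama 3 405B weights at FP4, about 203GB, plus a large KV cache; FP16 weights, about 810GB, need several cards)
✅ Multimodal AI (Stable Diffusion 3, Sora-class video model training)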
✅ HPC + AI convergence (ROCm + MPI compatible)
✅ Cloud providers (open ecosystem, multi-cloud deployment)
✅ Government and state-owned enterprise deployments (AMD is a US vendor)
❌ CUDA-only proprietary workloads
❌ NVLink tightly-coupled code
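
Whether a model fits in 288GB depends mostly on parameter count and weight precision. The sketch below gives a back-of-the-envelope estimate of weight footprint and the minimum number of cards; it counts weights only, so KV cache, activations, and framework overhead must be added via `reserve_gb`. The helper names and the bytes-per-parameter table are assumptions introduced here for illustration.

```python
import math

# Bytes per parameter for each weight precision (weights only).
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP6": 0.75, "FP4": 0.5}


def weight_footprint_gb(params_billion: float, precision: str) -> float:
    """Weight memory in GB (decimal): 1e9 params * bytes / 1e9 bytes per GB."""
    return params_billion * BYTES_PER_PARAM[precision]


def min_cards(
    params_billion: float,
    precision: str,
    hbm_gb: float = 288.0,
    reserve_gb: float = 0.0,
) -> int:
    """Minimum card count so that weights plus a reserved budget fit in HBM."""
    needed_gb = weight_footprint_gb(params_billion, precision) + reserve_gb
    return math.ceil(needed_gb / hbm_gb)


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        gb = weight_footprint_gb(405, precision)
        print(f"405B @ {precision}: {gb:.1f} GB of weights, {min_cards(405, precision)} card(s)")
```

For a 405B-parameter model this gives roughly 810 GB at FP16, 405 GB at FP8, and about 203 GB at FP4, so only the FP4 case leaves headroom for KV cache on a single 288GB card.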

MI355X vs MI400 (2026)

Metric	MI355X (H2 2025)	MI400 (2026)	Improvement
Memory	288GB HBM3E	432GB HBM4	+50%
Bandwidth	8 TB/s	19.6 TB/s	2.45×
Peak low-precision compute	4.6 PF FP8 dense	40 PF FP4 dense	~9× (FP8 vs FP4, not like-for-like)
Interconnect	UALoF 600 GB/s	UALoF 1.3 TB/s	2.2×
Process Node	3nm	3nm+ (N3P)	Slightly newer
TDP	750W	~1000W	+33%

Related Cards

AMD MI350X - 192GB sibling variant
AMD MI325X - 256GB previous generation
AMD MI300X - Previous main product
AMD MI400 - Next-gen HBM4
NVIDIA B200 - Competitor
NVIDIA B300 Ultra - Competitor
Huawei Ascend 920 - Chinese-market comparison
Intel Gaudi 4 - Open ecosystem comparison
