{{DISPLAYTITLE:TPU v7 vs MI450X vs Rubin}}

''This conversation was exported from ChatGPT''<ref>{{cite web|title=TPU v7 vs MI450X vs Rubin|url=https://chatgpt.com/share/6925f847-0600-800d-8a2e-57670d2a3feb|publisher=ChatGPT|access-date=2025-11-29}}</ref>

== gpt-5-1-thinking ==

=== User: Compare TPU v7 Ironwood to MI450X and Rubin in FP8, FP4, power consumption, memory bandwidth, memory capacity ===

Assume the TPU draws 1000 W, the MI450X 1400 W, and Rubin 2300 W.

Make a 10 GW comparison table.

=== Assistant: Here’s a clean way to look at it: first per-accelerator, then a 10 GW build-out. ===

==== 1. Per-accelerator assumptions (TPU v7 Ironwood vs MI450X vs Rubin) ====

Specs I’m using (from public info + one explicit assumption):

* '''TPU v7 Ironwood'''
** FP8: 4.6 PFLOP/s per chip<ref>{{cite web|title=The Register|url=https://www.theregister.com/2025/11/06/googles_ironwood_tpus_ai/|publisher=The Register|date=2025-11-06|access-date=2025-11-29}}</ref>
** HBM: 192 GB HBM3E, 7.4 TB/s bandwidth<ref>{{cite web|title=Google Cloud|url=https://cloud.google.com/blog/products/compute/inside-the-ironwood-tpu-codesigned-ai-stack|publisher=Google Cloud|access-date=2025-11-29}}</ref>
** Power: you told me to assume 1000 W, which lines up with 9,216 chips ≈ 10 MW in Google material (~1 kW/chip).<ref>{{cite web|title=XPU.pub|url=https://xpu.pub/2025/04/16/google-ironwood/|publisher=xpu.pub|date=2025-04-16|access-date=2025-11-29}}</ref>
** FP4: not published. I’ll assume 2× FP8 throughput ≈ 9.2 PFLOP/s (the same pattern as other tensor cores). This is a model assumption, not a spec.
* '''AMD Instinct MI450X'''
** From AMD Helios rack numbers (72 MI450 → 1.4 EF FP8, 2.9 EF FP4 per rack), per GPU: FP8 ≈ 19.4 PFLOP/s, FP4 ≈ 40.3 PFLOP/s<ref>{{cite web|title=AMD|url=https://www.amd.com/en/blogs/2025/amd-helios-ai-rack-built-on-metas-2025-ocp-design.html|publisher=amd.com|access-date=2025-11-29}}</ref>
** HBM4: 432 GB, 19.6 TB/s per GPU<ref>{{cite web|title=AMD|url=https://www.amd.com/en/blogs/2025/amd-helios-ai-rack-built-on-metas-2025-ocp-design.html|publisher=amd.com|access-date=2025-11-29}}</ref>
** Power (your assumption): 1400 W
* '''NVIDIA Vera Rubin (VR200 / Rubin GPU)'''
** FP4: 50 PFLOP/s per GPU (2 reticle-limited dies)<ref>{{cite web|title=Glenn K. Lockwood|url=https://www.glennklockwood.com/garden/processors/R200|publisher=Glenn K. Lockwood|access-date=2025-11-29}}</ref>
** FP8: from NVL72 system numbers (3.6 EF FP4 vs 1.2 EF FP8), per GPU ≈ 16.7 PF FP8<ref>{{cite web|title=Jon Peddie Research|url=https://www.jonpeddie.com/news/nvidia-prepares-vera-rubin-for-next-ai-wave/|publisher=Jon Peddie Research|access-date=2025-11-29}}</ref>
** HBM4: 288 GB, ~20 TB/s bandwidth<ref>{{cite web|title=Tom's Hardware|url=https://www.tomshardware.com/pc-components/gpus/nvidias-vera-rubin-platform-in-depth-inside-nvidias-most-complex-ai-and-hpc-platform-to-date|publisher=Tom's Hardware|access-date=2025-11-29}}</ref>
** Power (your assumption, consistent with leaks of 2.3–2.5 kW for Rubin Ultra): 2300 W<ref>{{cite web|title=TweakTown|url=https://www.tweaktown.com/news/108068/nvidia-could-change-cooling-solution-for-rubin-ultra-ai-gpus-for-huge-2300w-thermal-concerns/index.html|publisher=tweaktown.com|access-date=2025-11-29}}</ref>

===== Per-chip comparison =====

(All FLOPs are dense, no sparsity; TF/W is teraFLOP/s per watt.)

| Accelerator | Power (W) | FP8 (PF) | FP8 (TF/W) | FP4 (PF) | FP4 (TF/W) | HBM (GB) | HBM BW (TB/s) |
| --------------- | --------- | -------- | ---------- | -------- | ---------- | -------- | ------------- |
| TPU v7 Ironwood | 1000 | 4.6 | 4.6 | 9.2* | 9.2 | 192 | 7.4 |
| AMD MI450X | 1400 | 19.4 | 13.9 | 40.3 | 28.8 | 432 | 19.6 |
| NVIDIA Rubin | 2300 | 16.7 | 7.25 | 50 | 21.7 | 288 | ~20 |

*Ironwood FP4 is an assumption (2× FP8) because Google hasn’t published an FP4 peak.

Quick takeaways per device:

* '''Perf/W:''' MI450X is the clear winner in FP8 and FP4 TF/W under your power assumptions.
* '''Raw FP4:''' Rubin wins on absolute FP4 per chip (50 PF vs ~40 PF for MI450X).
* '''Memory capacity:''' MI450X has 1.5× Rubin’s HBM and 2.25× Ironwood’s (432 GB vs 288 GB vs 192 GB).<ref>{{cite web|title=GamingBench|url=https://gamingbench.net/amd-ceo-confirms-instinct-mi450-is-built-on-2nm-process-technology/|publisher=gamingbench.net|access-date=2025-11-29}}</ref>
* '''Memory bandwidth:''' Rubin and MI450X are effectively tied on per-chip HBM bandwidth (~20 vs 19.6 TB/s), both far above Ironwood.
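If you want to reproduce the per-chip arithmetic, here’s a minimal Python sketch (the figures are the cited specs plus your power assumptions; the dictionary layout is mine, and Ironwood’s FP4 entry is the 2× FP8 assumption, not a published spec):

<syntaxhighlight lang="python">
HELIOS_RACK_GPUS = 72  # AMD Helios rack: 1.4 EF FP8, 2.9 EF FP4 across 72 GPUs
NVL72_GPUS = 72        # Rubin NVL72: 1.2 EF FP8 across 72 GPUs

# name: (power_W, fp8_pflops, fp4_pflops, hbm_GB, hbm_bw_TBs)
chips = {
    "TPU v7 Ironwood": (1000, 4.6, 4.6 * 2, 192, 7.4),  # FP4 assumed = 2x FP8
    "AMD MI450X": (1400, 1400 / HELIOS_RACK_GPUS, 2900 / HELIOS_RACK_GPUS, 432, 19.6),
    "NVIDIA Rubin": (2300, 1200 / NVL72_GPUS, 50.0, 288, 20.0),
}

for name, (watts, fp8, fp4, hbm, bw) in chips.items():
    # 1 PFLOP/s = 1000 TFLOP/s, so TF/W = PFLOP/s * 1000 / W
    print(f"{name:16s} FP8 {fp8:5.1f} PF ({fp8 * 1000 / watts:5.1f} TF/W)  "
          f"FP4 {fp4:5.1f} PF ({fp4 * 1000 / watts:5.1f} TF/W)  "
          f"HBM {hbm} GB @ {bw} TB/s")
</syntaxhighlight>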
==== 2. 10 GW build-out (accelerator power only) ====

Now apply your power assumptions to a 10 GW accelerator budget (ignoring CPUs, networking, cooling overhead, etc.).

Total power budget: 10,000 MW = 10,000,000,000 W. Number of accelerators you can power:

* Ironwood: 10,000,000,000 W / 1,000 W ≈ 10.0 M chips
* MI450X: 10,000,000,000 W / 1,400 W ≈ 7.14 M GPUs
* Rubin: 10,000,000,000 W / 2,300 W ≈ 4.35 M GPUs

(Values rounded to three significant figures below.)

===== 10 GW cluster comparison =====

Units: total HBM in EB (exabytes, 1 EB = 10⁹ GB), total bandwidth in PB/s, compute in EFLOP/s (1 EF = 10³ PF).

| Metric @ 10 GW accel power | TPU v7 Ironwood | AMD MI450X | NVIDIA Rubin |
| -------------------------- | --------------- | ---------- | ------------ |
| Accelerators (count) | 1.00×10⁷ | 7.14×10⁶ | 4.35×10⁶ |
| Total HBM (EB) | 1.92 | 3.09 | 1.25 |
| Total HBM BW (PB/s) | 7.4×10⁴ | 1.4×10⁵ | 8.7×10⁴ |
| Total FP8 (EF) | 4.6×10⁴ | 1.39×10⁵ | 7.25×10⁴ |
| Total FP4 (EF) | 9.2×10⁴* | 2.88×10⁵ | 2.17×10⁵ |

*Again, Ironwood FP4 is an inferred 2× FP8, so its FP4 figures are less solid than those for MI450X and Rubin, which are tied to specific system or chip disclosures.<ref>{{cite web|title=Jon Peddie Research|url=https://www.jonpeddie.com/news/nvidia-prepares-vera-rubin-for-next-ai-wave/|publisher=Jon Peddie Research|access-date=2025-11-29}}</ref>
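The cluster table is just multiplication plus unit conversion; here’s the same arithmetic as a short Python sketch (same per-chip numbers as above, with Ironwood FP4 still the 2× assumption):

<syntaxhighlight lang="python">
BUDGET_W = 10e9  # 10 GW of accelerator power; no CPU/network/cooling overhead

# name: (power_W, fp8_pflops, fp4_pflops, hbm_GB, hbm_bw_TBs)
chips = {
    "TPU v7 Ironwood": (1000, 4.6, 9.2, 192, 7.4),
    "AMD MI450X": (1400, 19.4, 40.3, 432, 19.6),
    "NVIDIA Rubin": (2300, 16.7, 50.0, 288, 20.0),
}

for name, (watts, fp8, fp4, hbm, bw) in chips.items():
    n = BUDGET_W / watts                   # accelerators the budget can power
    print(f"{name:16s} n={n:.3g}  "
          f"HBM={n * hbm / 1e9:.3g} EB  "  # GB -> EB (1 EB = 1e9 GB)
          f"BW={n * bw / 1e3:.3g} PB/s  "  # TB/s -> PB/s
          f"FP8={n * fp8 / 1e3:.3g} EF  "  # PF -> EF
          f"FP4={n * fp4 / 1e3:.3g} EF")
</syntaxhighlight>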
==== 3. How they stack up in a 10 GW “AI factory” ====

If you hold total watts fixed:

* '''Throughput side:''' MI450X gives ~3× the FP8 and ~3× the FP4 of Ironwood at the same 10 GW, and about 2× the FP8 and ~1.3× the FP4 of Rubin.
* '''Memory side:''' the MI450X cluster has ~2.5× the HBM of a Rubin cluster at the same power and ~1.6× Ironwood’s, and its HBM bandwidth at 10 GW is also the highest (~140,000 PB/s).

Intuition:

* Ironwood wins on tight Google/Pathways integration and inference focus, but on raw FLOPs and HBM per watt it sits behind the next-gen HBM4 GPUs.
* Rubin is the “max FP4” monster with extreme power draw: at equal facility power it gives somewhat less FP8 and much less HBM than MI450X, but more FP4 per device than MI450X (just not per watt).
* MI450X is the “dense memory + efficiency” option: at a fixed 10 GW budget it gives you the most HBM, the most bandwidth, and the most FP8/FP4 of the three, assuming these roadmap numbers hold.

If you want, I can redo the 10 GW table under different power assumptions (e.g., Rubin at 2.5 kW, MI450X at 1.5 kW, Ironwood air-cooled at 800 W) or switch the metric to tokens/s per watt for some target model size.
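As a quick sanity check on those ratios, here they are recomputed from the 10 GW totals (a tiny sketch; values copied from the cluster table):

<syntaxhighlight lang="python">
# 10 GW cluster totals from the table: (FP8 EF, FP4 EF, HBM EB)
totals = {
    "Ironwood": (4.6e4, 9.2e4, 1.92),
    "MI450X": (1.39e5, 2.88e5, 3.09),
    "Rubin": (7.25e4, 2.17e5, 1.25),
}

mi450x = totals["MI450X"]
for rival in ("Ironwood", "Rubin"):
    fp8_r, fp4_r, hbm_r = (m / r for m, r in zip(mi450x, totals[rival]))
    print(f"MI450X vs {rival}: FP8 {fp8_r:.1f}x, FP4 {fp4_r:.1f}x, HBM {hbm_r:.1f}x")
# -> FP8 3.0x, FP4 3.1x, HBM 1.6x vs Ironwood; FP8 1.9x, FP4 1.3x, HBM 2.5x vs Rubin
</syntaxhighlight>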
== References ==
<references />

[[Category:ChatGPT Shares]]
[[Category:openai]]