Design Issues May Postpone Launch of NVIDIA’s Advanced Blackwell AI Chips
When approached for comment, an NVIDIA spokesperson did not address communications with customers regarding the delay but stated that “production is on track to ramp” later this year. The Information reports that Microsoft, Google, Amazon Web Services, and Meta declined to comment on the matter, while Taiwan Semiconductor Manufacturing Company (TSMC) did not respond to inquiries.
Update 1:
“The production issue was discovered by manufacturer TSMC, and involves the processor die that connects two Blackwell GPUs on a GB200.” (via Data Center Dynamics)
NVIDIA needs to redesign the chip, which requires a new production test at TSMC before mass production can begin. The company is rumored to be considering a single-GPU version to expedite delivery. In the meantime, the delay leaves TSMC production lines temporarily idle.
Update 2:
SemiAnalysis’s Dylan Patel reports in a message on Twitter (now known as X) that Blackwell supply will be considerably lower than anticipated in Q4 2024 and H1 2025. This shortage stems from TSMC’s transition from CoWoS-S to CoWoS-L packaging technology, which NVIDIA’s advanced Blackwell chips require. Currently, TSMC’s AP3 packaging facility is dedicated to CoWoS-S production, while initial CoWoS-L capacity is being installed in the AP6 facility.
Additionally, NVIDIA appears to be prioritizing production of GB200 NVL72 units over NVL36 units. The GB200 NVL36 configuration puts 36 GPUs in a single rack of 18 compute nodes, each holding one GB200 (two GPUs per GB200). The NVL72 design, by contrast, has 72 GPUs, either in a single rack of 18 compute nodes with two GB200s each or spread across two racks of 18 single-GB200 nodes each.