AMD Granite Ridge “Zen 5” Processor Annotated



High-resolution die-shots of the AMD “Zen 5” 8-core CCD were released and annotated by Nemez, Fitzchens Fitz, and HighYieldYT. These provide a detailed view of how the silicon and its various components appear, particularly the new “Zen 5” CPU core with its 512-bit FPU. The “Granite Ridge” package looks similar to “Raphael,” with up to two 8-core CPU complex dies (CCDs) depending on the processor model, and a centrally located client I/O die (cIOD). This cIOD is carried over from “Raphael,” which minimizes product development costs for AMD, at least for the uncore portion of the processor. The “Zen 5” CCD is built on the TSMC N4P (4 nm) foundry node.

The “Granite Ridge” package places its up to two “Zen 5” CCDs closer to each other than the “Zen 4” CCDs on Raphael. In the picture above, you can see the pad of the absent CCD behind the solder mask of the fiberglass substrate, close to the present CCD. The CCD contains 8 full-sized “Zen 5” CPU cores, each with 1 MB of L2 cache, and a centrally located 32 MB L3 cache that’s shared among all eight cores. The only other components are an SMU (system management unit) and the Infinity Fabric over Package (IFoP) PHY, which connects the CCD to the cIOD.

Each “Zen 5” CPU core is physically larger than the “Zen 4” core (built on the TSMC N5 process), due to its 512-bit floating point data-path. The core’s Vector Engine is pushed to the very edge of the core. On the CCD, these should be the edges of the die. FPUs tend to be the hottest components on a CPU core, so this makes sense. The innermost component (facing the shared L3 cache) is the 1 MB L2 cache. AMD has doubled the bandwidth and associativity of this 1 MB L2 cache compared to the one on the “Zen 4” core.
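To put the 512-bit data-path in perspective, the short sketch below counts how many elements of a given width fit into one full-width vector operation. It is plain arithmetic on the width quoted above; the helper name `simd_lanes` is made up for illustration, and the result says nothing about how AMD actually schedules or issues these operations.

```python
def simd_lanes(datapath_bits: int, element_bits: int) -> int:
    """Return how many elements of element_bits fit in one datapath_bits-wide vector."""
    if element_bits <= 0 or datapath_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the data-path width")
    return datapath_bits // element_bits


DATAPATH_BITS = 512  # width quoted in the article for the "Zen 5" FPU

if __name__ == "__main__":
    for name, bits in (("FP64", 64), ("FP32", 32), ("FP16", 16)):
        print(f"{name}: {simd_lanes(DATAPATH_BITS, bits)} lanes per 512-bit vector")
```

A 512-bit vector therefore holds 8 double-precision or 16 single-precision values.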

The central region of the “Zen 5” core has the 32 KB L1I cache, 48 KB L1D cache, the Integer Execution engine, and the all-important front-end of the processor, with its Instruction Fetch & Decode, the Branch Prediction unit, micro-op cache, and Scheduler.

The 32 MB on-die L3 cache has rows of TSVs (through-silicon vias) that act as provision for stacked 3D V-cache. The 64 MB L3D (L3 cache die) connects with the CCD’s ringbus using these TSVs, making the 64 MB 3D V-cache contiguous with the 32 MB on-die L3 cache.
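The cache figures quoted so far can be tallied per CCD with a few lines of arithmetic. The sketch below is only an illustration using the numbers from the article (8 cores, 1 MB L2 per core, 32 MB on-die L3, optional 64 MB stacked L3D); the function name `ccd_cache_totals` and its parameters are hypothetical.

```python
def ccd_cache_totals(cores: int = 8,
                     l2_per_core_mb: int = 1,
                     l3_on_die_mb: int = 32,
                     l3d_stacked_mb: int = 0) -> dict:
    """Sum L2 and L3 capacity for one CCD, in MB."""
    l2_total = cores * l2_per_core_mb          # private L2, one slice per core
    l3_total = l3_on_die_mb + l3d_stacked_mb   # stacked L3D extends the shared L3
    return {"l2_total_mb": l2_total, "l3_total_mb": l3_total}


if __name__ == "__main__":
    print("Standard CCD:", ccd_cache_totals())
    print("CCD with 3D V-cache:", ccd_cache_totals(l3d_stacked_mb=64))
```

With the stacked die present, the shared L3 seen by the eight cores comes to 96 MB per CCD.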

Finally, there’s the client I/O die (cIOD). There’s nothing new to report here; the chip is carried over from Raphael. It is built on the TSMC N6 (6 nm) node. Nearly one-third of the die area is taken up by the iGPU and its allied components, such as the media acceleration engine and display engine. The iGPU is based on the RDNA 2 graphics architecture, and has just one workgroup processor (WGP), for two compute units (CU), or 128 stream processors. Other key components on the cIOD are the 28-lane PCIe Gen 5 root complex, the two IFoP ports for the CCDs, a fairly large SoC I/O consisting of USB 3.x and legacy connectivity, and the all-important DDR5 memory controller with its dual-channel (four sub-channel) memory interface.
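The iGPU and memory figures follow from simple multiplication, sketched below. It assumes the usual RDNA 2 layout of two compute units per WGP and 64 stream processors per compute unit, and the DDR5 arrangement of two sub-channels per channel; the function names are hypothetical.

```python
CU_PER_WGP = 2             # RDNA 2: one workgroup processor holds two compute units
SP_PER_CU = 64             # RDNA 2: 64 stream processors per compute unit
SUBCHANNELS_PER_DDR5_CHANNEL = 2  # DDR5 splits each channel into two sub-channels


def stream_processors(wgps: int) -> int:
    """Total stream processors for a given number of workgroup processors."""
    return wgps * CU_PER_WGP * SP_PER_CU


def ddr5_subchannels(channels: int) -> int:
    """Total DDR5 sub-channels for a given number of memory channels."""
    return channels * SUBCHANNELS_PER_DDR5_CHANNEL


if __name__ == "__main__":
    print("iGPU stream processors:", stream_processors(wgps=1))
    print("DDR5 sub-channels:", ddr5_subchannels(channels=2))
```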