AMD Granite Ridge “Zen 5” Processor Annotated



High-resolution die-shots of the AMD “Zen 5” 8-core CCD were released and annotated by Nemez, Fritzchens Fritz, and HighYieldYT. These provide a detailed view of how the silicon and its various components appear, particularly the new “Zen 5” CPU core with its 512-bit FPU. The “Granite Ridge” package looks similar to “Raphael,” with up to two 8-core CPU complex dies (CCD) depending on the processor model, and a centrally located client I/O die (cIOD). This cIOD is carried over from “Raphael,” which minimizes product development costs for AMD, at least for the uncore portion of the processor. The “Zen 5” CCD is built on the TSMC N4P (4 nm) foundry node.

The “Granite Ridge” package sees the up to two “Zen 5” CCDs placed closer to each other than the “Zen 4” CCDs on “Raphael.” In the picture above, you can see the pad of the absent CCD behind the solder mask of the fiberglass substrate, close to the present CCD. The CCD contains 8 full-sized “Zen 5” CPU cores, each with 1 MB of L2 cache, and a centrally located 32 MB L3 cache that’s shared among all eight cores. The only other components are an SMU (system management unit), and the Infinity Fabric over Package (IFoP) PHYs, which connect the CCD to the cIOD.
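The cache figures above add up quickly, so here is a minimal Python sketch that tallies them for a one-CCD and a two-CCD package. It uses only the numbers quoted in the article (8 cores per CCD, 1 MB of L2 per core, 32 MB of L3 per CCD); the function name `ccd_cache_mb` is made up for illustration. Note that the L3 is shared only among the cores of its own CCD, so the two-CCD total is a sum of two separate 32 MB pools, not one 64 MB cache.

```python
# Figures taken from the article; nothing here is measured.
CORES_PER_CCD = 8      # full-sized "Zen 5" cores per CCD
L2_PER_CORE_MB = 1     # private L2 per core
L3_PER_CCD_MB = 32     # L3 shared by the eight cores of one CCD


def ccd_cache_mb(num_ccds: int) -> dict:
    """Return core count and L2/L3/combined cache capacity (MB) for a package."""
    if num_ccds not in (1, 2):
        raise ValueError("the article describes packages with one or two CCDs")
    l2 = num_ccds * CORES_PER_CCD * L2_PER_CORE_MB
    l3 = num_ccds * L3_PER_CCD_MB
    return {
        "cores": num_ccds * CORES_PER_CCD,
        "l2_mb": l2,
        "l3_mb": l3,
        "total_mb": l2 + l3,
    }


for n in (1, 2):
    print(n, "CCD:", ccd_cache_mb(n))
```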

Each “Zen 5” CPU core is physically larger than the “Zen 4” core (built on the TSMC N5 process), due to its 512-bit floating point data-path. The core’s Vector Engine is pushed to the very edge of the core. On the CCD, these should be the edges of the die. FPUs tend to be the hottest components on a CPU core, so this makes sense. The innermost component (facing the shared L3 cache) is the 1 MB L2 cache. AMD has doubled the bandwidth and associativity of this 1 MB L2 cache compared to the one on the “Zen 4” core.
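To put the 512-bit data-path in perspective, the short Python sketch below works out how many elements of a given width fit in one 512-bit vector. This is plain arithmetic on the width quoted in the article; it says nothing about how many vector pipes the core has or how many operations it retires per clock. The helper name `lanes` is hypothetical.

```python
DATAPATH_BITS = 512  # floating point data-path width quoted in the article


def lanes(element_bits: int) -> int:
    """Number of elements of the given width that fit in one 512-bit vector."""
    if element_bits <= 0 or DATAPATH_BITS % element_bits != 0:
        raise ValueError("element width must evenly divide the data-path width")
    return DATAPATH_BITS // element_bits


for name, bits in (("FP64", 64), ("FP32", 32), ("FP16", 16)):
    print(f"{name}: {lanes(bits)} elements per 512-bit vector")
```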

The central region of the “Zen 5” core has the 32 KB L1I cache, 48 KB L1D cache, the Integer Execution engine, and the all-important front-end of the processor, with its Instruction Fetch & Decode, the Branch Prediction unit, micro-op cache, and Scheduler.

The 32 MB on-die L3 cache has rows of TSVs (through-silicon vias) that act as provision for stacked 3D V-cache. The 64 MB L3D (L3 cache die) connects with the CCD’s ringbus using these TSVs, making the 64 MB 3D V-cache contiguous with the 32 MB on-die L3 cache.

Lastly, there’s the client I/O die (cIOD). There’s nothing new to report here; the chip is carried over from “Raphael.” It is built on the TSMC N6 (6 nm) node. Nearly 1/3rd of the die-area is taken up by the iGPU and its allied components, such as the media acceleration engine, and display engine. The iGPU is based on the RDNA 2 graphics architecture, and has just one workgroup processor (WGP), for two compute units (CU), or 128 stream processors. Other key components on the cIOD are the 28-lane PCIe Gen 5 interface, the two IFoP ports for the CCDs, a fairly large SoC I/O consisting of USB 3.x and legacy connectivity, and the all-important DDR5 memory controller with its dual-channel (four sub-channel) memory interface.
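The iGPU and memory figures can be cross-checked with a few lines of Python. The WGP, CU, stream processor, and sub-channel counts come from the article. The 32-bit data width per DDR5 sub-channel is an assumption based on standard non-ECC DDR5 and is not stated in the article. All variable names are illustrative.

```python
# iGPU figures from the article
WGPS = 1
CUS_PER_WGP = 2
STREAM_PROCESSORS = 128

compute_units = WGPS * CUS_PER_WGP
sp_per_cu = STREAM_PROCESSORS // compute_units

# Memory interface: four sub-channels per the article.
# ASSUMPTION: 32 data bits per sub-channel (standard non-ECC DDR5).
SUB_CHANNELS = 4
SUB_CHANNEL_DATA_BITS = 32

memory_bus_bits = SUB_CHANNELS * SUB_CHANNEL_DATA_BITS

print(f"{compute_units} CUs, {sp_per_cu} stream processors per CU")
print(f"{memory_bus_bits}-bit aggregate DDR5 data bus (under the stated assumption)")
```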