AMD Launches 4th Gen EPYC “Genoa” Zen 4 Server Processors: 100% Performance Uplift for 50% More Cores


AMD, at a special media event titled “together we advance_data centers,” formally launched its 4th generation EPYC “Genoa” server processors based on the “Zen 4” microarchitecture. These processors debut an all-new platform, with modern I/O connectivity that includes PCI-Express Gen 5, CXL, and DDR5 memory. The processors come in CPU core-counts of up to 96-core/192-thread. There are as many as 18 processor SKUs, differentiated not just by CPU core-count, but also by the way the cores are spread across the up to 12 “Zen 4” chiplets (CCDs). Each chiplet features up to 8 “Zen 4” CPU cores depending on the model, up to 32 MB of L3 cache, and is built on the 5 nm EUV process at TSMC. The CCDs talk to a centralized server I/O die (sIOD), which is built on the 6 nm process.

The processors AMD is launching today are the EPYC “Genoa” series, targeting general-purpose servers, although they can be deployed in large cloud data-centers, too. For large-scale cloud providers such as AWS, Azure, and Google Cloud, AMD is readying a different class of processor, codenamed “Bergamo,” which it plans to launch later. In 2023, the company will launch the “Genoa-X” line of processors for technical-compute and HPC applications that benefit from large on-die caches, as they feature the 3D Vertical Cache technology. There will also be “Siena,” a class of EPYC processors targeting the telecom and edge-computing markets, which could see an integration of more Xilinx IP.

The EPYC “Genoa” processor, as we mentioned, comes in core-counts of up to 96-core/192-thread, dominating the 40-core/80-thread counts of the 3rd Gen Xeon Scalable “Ice Lake-SP,” and also staying ahead of the 60-core/120-thread counts of the upcoming Xeon Scalable “Sapphire Rapids.” The new AMD processor also sees a significant buff to its I/O capabilities, featuring a 12-channel (24 sub-channel) DDR5 memory interface, a gargantuan 160-lane PCI-Express Gen 5 interface (that’s ten Gen 5 x16 slots running at full bandwidth), and platform support for CXL and 2P xGMI links by subtracting some of those multipurpose lanes.

The new 6 nm server I/O die (sIOD) has a significantly higher transistor count than the 12 nm one powering past-gen EPYC processors. The high transistor count is due to two large 80-lane configurable SERDES (serializer-deserializer) units, which can be made to put out PCIe Gen 5 lanes, CXL 1.1 lanes, SATA 6 Gbit/s ports, or even the inter-socket Infinity Fabric links enabling 2P platforms. The processor supports up to 64 CXL 1.1 lanes that can be used to connect to networked memory-pooling devices. 3rd generation Infinity Fabric connects the various components inside the sIOD, links the sIOD to the twelve “Zen 4” CCDs via IFOP, and serves as the inter-socket interconnect. The processor features a 12-channel (24 x 40-bit sub-channels) DDR5 memory interface, supporting up to 6 TB of ECC DDR5-4800 memory per socket. The latest-generation Secure Processor provides SEV-SNP (secure nested paging) and AES-256-XTS, for a larger number of secure VMs.

Each of the 5 nm CPU complex dies (CCDs) is physically identical to the ones you find in Ryzen 7000-series “Raphael” desktop processors. It packs 8 “Zen 4” CPU cores, each with 1 MB of dedicated L2 cache, and 32 MB of L3 cache shared among the 8 cores.
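The headline memory figures above are easy to sanity-check. The following sketch assumes the standard DDR5 arithmetic (64 data bits per channel, 8 bytes per transfer; the 40-bit sub-channels carry ECC on top) and assumes the 6 TB ceiling is reached with 2 DIMMs per channel at 256 GB each; the DIMM configuration is our assumption, not something AMD spelled out here.

```python
# Back-of-the-envelope check of the Genoa memory specs quoted above.
# Assumptions (ours, not AMD's): 8 data bytes per transfer per channel,
# and 2 DIMMs per channel at 256 GB each for the capacity figure.

def ddr5_peak_bandwidth_gbs(channels: int, mt_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s: each channel moves 8 bytes
    (64 data bits) per transfer; ECC bits ride in the extra sub-channel width."""
    return channels * mt_per_s * 8 / 1000

def max_capacity_tb(channels: int, dimms_per_channel: int, gb_per_dimm: int) -> float:
    """Maximum memory capacity in TB for a given DIMM population."""
    return channels * dimms_per_channel * gb_per_dimm / 1024

print(ddr5_peak_bandwidth_gbs(12, 4800))  # 460.8 GB/s per socket
print(max_capacity_tb(12, 2, 256))        # 6.0 TB per socket
```

Twelve channels of DDR5-4800 thus work out to roughly 460 GB/s of theoretical bandwidth per socket, and the 6 TB capacity claim is consistent with 256 GB modules at two per channel.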
Each “Zen 4” core provides a 14% generational performance uplift over “Zen 3” with clock speed kept constant. Much of this uplift comes from updates to the core’s front-end and load/store unit, while the branch predictor, larger L2 cache, and execution engine make smaller contributions. The biggest generational change is to the ISA, which sees the introduction of support for the AVX-512 instruction set, VNNI, and bfloat16. The new instructions should accelerate AVX-512 math workloads, as well as AI applications. AMD says that its AVX-512 implementation is more die-area-efficient than Intel’s, as it uses the existing 256-bit wide FPU in a double-pumped fashion to execute 512-bit operations.

AMD is launching a total of 18 processor SKUs today, all meant for the Socket SP5 platform. It follows the nomenclature described in the slide below. EPYC is the top-level brand, and “9” is the product series. The next digit indicates core-count, with “0” denoting 8 cores, “1” denoting 16, “2” denoting 24, “3” denoting 32, “4” denoting 48, “5” denoting 64, and “6” denoting 84-96. The next digit denotes performance on a 1-10 scale, and the “4” that follows denotes the “Zen 4” generation. The optional trailing character is either “P” or “F,” with “P” denoting single-socket (1P-only) SKUs, and “F” denoting special SKUs that run fewer cores per CCD to improve per-core performance. The configurable TDP of all SKUs is rated up to 400 W, which seems high, but one should take into account the CPU core-count and its impact on the number of server blades per rack. This is one of the reasons AMD isn’t scaling beyond 2 sockets per server. The company says its core density translates into 67% fewer servers and 52% less power.

In terms of performance, AMD only has Intel’s dated 3rd Gen Xeon Scalable “Ice Lake-SP” processors for comparison, since “Sapphire Rapids” is still unreleased. With core-counts equalized, the 16-core EPYC 9174F is shown being 47% faster than the Xeon Gold 6346; the 32-core EPYC 9374F is 55% faster than the Xeon Platinum 8362; and the 48-core EPYC 9474F is 51% faster than the 40-core Xeon Platinum 8380. The same test group also sees 58-96% floating-point performance leadership in favor of AMD.
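The model-number scheme lends itself to a small decoder. This sketch encodes only the mapping given in this article (core-count tier per second digit, a performance digit, the generation digit, and an optional “P”/“F” suffix); it does no validation beyond that.

```python
# Decode an EPYC 9004-series model number per the nomenclature above.
# Core-count tiers come straight from the article's mapping.

CORE_TIER = {"0": "8", "1": "16", "2": "24", "3": "32",
             "4": "48", "5": "64", "6": "84-96"}

def decode_epyc_sku(model: str) -> dict:
    """Split e.g. '9654' or '9174F' into its nomenclature fields."""
    series, cores, perf, gen = model[:4]
    return {
        "series": series,            # "9" = EPYC 9000 series
        "cores": CORE_TIER[cores],   # core-count tier
        "performance": perf,         # relative performance digit
        "generation": gen,           # "4" = Zen 4
        "suffix": model[4:] or None, # "P" (1P-only) or "F" (high per-core perf)
    }

print(decode_epyc_sku("9654"))   # flagship: 84-96 core tier, no suffix
print(decode_epyc_sku("9174F"))  # 16-core tier, frequency-optimized
```

Applied to the parts benchmarked above, “9174F” decodes to the 16-core tier with the frequency-optimized suffix, matching the 16-core EPYC 9174F in AMD's comparison slides.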

The complete slide deck follows.