Cerebras Systems Sets Record for Largest AI Models Ever Trained on a Single Device

Cerebras Systems, the pioneer in high performance artificial intelligence (AI) computing, today announced, for the first time, the ability to train models with up to 20 billion parameters on a single CS-2 system, a feat not possible on any other single device. By enabling a single CS-2 to train these models, Cerebras reduces the system engineering time necessary to run large natural language processing (NLP) models from months to minutes. It also eliminates one of the most painful aspects of NLP, namely the partitioning of the model across hundreds or thousands of small graphics processing units (GPUs).

“In NLP, bigger models are shown to be more accurate. But traditionally, only a very select few companies had the resources and expertise necessary to do the painstaking work of breaking up these large models and spreading them across hundreds or thousands of graphics processing units,” said Andrew Feldman, CEO and Co-Founder of Cerebras Systems. “As a result, only very few companies could train large NLP models – it was too expensive, time-consuming and inaccessible for the rest of the industry. Today we are proud to democratize access to GPT-3 1.3B, GPT-J 6B, GPT-3 13B and GPT-NeoX 20B, enabling the entire AI ecosystem to set up large models in minutes and train them on a single CS-2.”

“GSK generates extremely large datasets through its genomic and genetic research, and these datasets require new equipment to conduct machine learning,” said Kim Branson, SVP of Artificial Intelligence and Machine Learning at GSK. “The Cerebras CS-2 is a critical component that allows GSK to train language models using biological datasets at a scale and size previously unattainable. These foundational models form the basis of many of our AI systems and play a vital role in the discovery of transformational medicines.”

These world-first capabilities are made possible by a combination of the size and computational resources available in the Cerebras Wafer Scale Engine-2 (WSE-2) and the Weight Streaming software architecture extensions available in release R1.4 of the Cerebras Software Platform, CSoft.

When a model fits on a single processor, AI training is easy. But when a model has more parameters than can fit in memory, or a layer requires more compute than a single processor can handle, complexity explodes. The model must be broken up and spread across hundreds or thousands of GPUs. This process is painful, often taking months to complete. To make matters worse, the process is unique to each pairing of neural network and compute cluster, so the work is not portable to different compute clusters or across neural networks. It is entirely bespoke.
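As a rough illustration of why these models outgrow a single conventional processor, the sketch below estimates the training memory for the model sizes named in this announcement and the minimum number of fixed-memory devices that implies. The 16 bytes per parameter (a commonly cited figure for mixed-precision training with the Adam optimizer) and the 80 GB device capacity are illustrative assumptions, not figures from Cerebras. Activations are ignored, so real requirements would be higher.

```python
# Parameter counts of the models named in the announcement.
MODELS = {
    "GPT-3 1.3B": 1_300_000_000,
    "GPT-J 6B": 6_000_000_000,
    "GPT-3 13B": 13_000_000_000,
    "GPT-NeoX 20B": 20_000_000_000,
}

# Assumption: weights, gradients, and optimizer state together take
# about 16 bytes per parameter during mixed-precision Adam training.
BYTES_PER_PARAM = 16

# Assumption: a hypothetical accelerator with 80 GB of fixed memory.
DEVICE_MEMORY_BYTES = 80 * 10**9


def training_memory_bytes(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Memory for model state only (no activations), in bytes."""
    return num_params * bytes_per_param


def min_devices(num_params, device_memory_bytes=DEVICE_MEMORY_BYTES):
    """Lower bound on devices needed just to hold the model state."""
    needed = training_memory_bytes(num_params)
    return -(-needed // device_memory_bytes)  # integer ceiling division


for name, n in MODELS.items():
    gb = training_memory_bytes(n) / 10**9
    print(f"{name}: about {gb:.1f} GB of model state, "
          f"at least {min_devices(n)} device(s)")
```

Under these assumptions, only the smallest model fits on one such device. Every larger model has to be partitioned, which is the step the announcement describes as taking months.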

The Cerebras WSE-2 is the largest processor ever built. It is 56 times larger, has 2.55 trillion more transistors, and has 100 times as many compute cores as the largest GPU. The size and computational resources of the WSE-2 enable every layer of even the largest neural networks to fit. The Cerebras Weight Streaming architecture disaggregates memory and compute, allowing memory (which is used to store parameters) to grow separately from compute. Thus a single CS-2 can support models with hundreds of billions, even trillions, of parameters.

Graphics processing units, on the other hand, have a fixed amount of memory per GPU. If the model requires more parameters than fit in memory, one needs to buy more graphics processors and then spread the work over multiple GPUs. The result is an explosion of complexity. The Cerebras solution is far simpler and more elegant: by disaggregating compute from memory, the Weight Streaming architecture allows models with any number of parameters to run on a single CS-2.
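The idea of keeping parameters in a separate memory and bringing them to the compute one layer at a time can be shown with a toy sketch. This is not Cerebras's implementation: the `ExternalWeightStore` class, the `forward_streaming` function, and the layer shapes are all hypothetical, and the "external" store here is ordinary Python memory. The sketch only models the access pattern, in which the compute loop holds a single layer's weights at any moment, so peak resident parameters track the largest layer and not the whole model.

```python
import random


class ExternalWeightStore:
    """Hypothetical stand-in for parameter memory kept apart from compute."""

    def __init__(self, layer_shapes, seed=0):
        rng = random.Random(seed)
        # One weight matrix (outputs x inputs) per layer, as nested lists.
        self.layers = [
            [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
            for rows, cols in layer_shapes
        ]

    def total_params(self):
        return sum(len(w) * len(w[0]) for w in self.layers)

    def stream(self):
        """Yield one layer's weights at a time, in order."""
        for w in self.layers:
            yield w


def matvec(weights, x):
    """Multiply a weight matrix by an input vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in weights]


def forward_streaming(store, x):
    """Forward pass that holds only one layer's weights at a time."""
    peak_resident = 0
    for w in store.stream():
        peak_resident = max(peak_resident, len(w) * len(w[0]))
        x = [max(0.0, v) for v in matvec(w, x)]  # ReLU activation
    return x, peak_resident


shapes = [(8, 4), (16, 8), (8, 16), (2, 8)]  # (outputs, inputs) per layer
store = ExternalWeightStore(shapes)
out, peak = forward_streaming(store, [1.0, 0.5, -0.5, 2.0])
print(f"total parameters: {store.total_params()}, peak resident: {peak}")
# prints "total parameters: 304, peak resident: 128"
```

Adding more layers to `shapes` raises the total parameter count without changing the peak, which mirrors the claim that parameter memory can grow separately from compute.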

Powered by the computational capacity of the WSE-2 and the architectural elegance of the Weight Streaming architecture, Cerebras is able to support, on a single system, the largest NLP networks. By supporting these networks on a single CS-2, Cerebras reduces setup time to minutes and enables model portability. One can switch between GPT-J and GPT-Neo, for example, with a few keystrokes, a task that would take months of engineering time to achieve on a cluster of hundreds of GPUs.

With customers in North America, Asia, Europe and the Middle East, Cerebras is delivering industry-leading AI solutions to a growing roster of customers in the enterprise, government, and high performance computing (HPC) segments, including GlaxoSmithKline, AstraZeneca, TotalEnergies, nference, Argonne National Laboratory, Lawrence Livermore National Laboratory, Pittsburgh Supercomputing Center, Leibniz Supercomputing Centre, National Center for Supercomputing Applications, Edinburgh Parallel Computing Centre (EPCC), National Energy Technology Laboratory, and Tokyo Electron Devices.

For more information about the Cerebras Software Platform, please visit https://www.cerebras.net/product-software/.
