Intel Gaudi 2 AI Accelerator Powers Through Llama 2 Text Generation



Intel’s “AI Everywhere” hype campaign has generated the most noise in the mainstream and enterprise segments, while Team Blue’s Gaudi family of deep learning accelerators rarely hits the headlines. The current-generation model, Gaudi 2, is overshadowed by Team Green and Team Red alternatives, although Intel’s official marketing spiel states that “it performs competitively on deep learning training and inference, with up to 2.4x faster performance than NVIDIA A100.” Habana, an Intel subsidiary, has been working on optimizing Large Language Model (LLM) inference on Gaudi 1 and 2 for a while, and as of late February its cooperation with Hugging Face has produced impressive results. Siddhant Jagtap, an Intel Data Scientist, has demonstrated “how easy it is to generate text with the Llama 2 family of models (7b, 13b and 70b) using Optimum Habana and a custom pipeline class.”

Jagtap says that folks will be able to “run the models with just a few lines of code” on Gaudi 2 accelerators, and that the pipeline accepts both single and multiple prompts. The custom pipeline class “has been designed to offer great flexibility and ease of use. Moreover, it provides a high level of abstraction and performs end-to-end text-generation which involves pre-processing and post-processing.” His blog post outlines the prerequisites and methods for getting Llama 2 text generation up and running on Gaudi 2. Jagtap concluded that Habana/Intel has “presented a custom text-generation pipeline on Intel Gaudi 2 AI accelerator that accepts single or multiple prompts as input. This pipeline offers great flexibility in terms of model size as well as parameters affecting text-generation quality. Furthermore, it is also very easy to use and to plug into your scripts, and is compatible with LangChain.” Hugging Face reckons that Gaudi 2 delivers roughly twice the throughput of the NVIDIA A100 80 GB in both training and inference scenarios. Intel has teased third-generation Gaudi accelerators, and industry watchers believe that these next-gen solutions are designed to compete with Team Green’s H100 AI GPUs.
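
For readers wondering what a pipeline that “accepts single or multiple prompts” and handles pre-processing and post-processing looks like in practice, here is a minimal Python sketch of the general pattern. It is not the Optimum Habana pipeline class from Jagtap's post: the names ToyTextGenerationPipeline, echo_model and max_prompt_chars are hypothetical, and the "model" is a placeholder function so the example runs on any machine without a Gaudi accelerator.

```python
from typing import Callable, List, Union


class ToyTextGenerationPipeline:
    """Illustrative stand-in only; NOT the Optimum Habana pipeline class."""

    def __init__(
        self,
        generate_fn: Callable[[List[str]], List[str]],
        max_prompt_chars: int = 512,  # hypothetical limit, for illustration
    ) -> None:
        self.generate_fn = generate_fn
        self.max_prompt_chars = max_prompt_chars

    def _preprocess(self, prompts: Union[str, List[str]]) -> List[str]:
        # Normalize a single prompt into a one-element batch, then tidy each prompt.
        if isinstance(prompts, str):
            prompts = [prompts]
        return [p.strip()[: self.max_prompt_chars] for p in prompts]

    def _postprocess(self, outputs: List[str]) -> List[str]:
        # Remove stray whitespace left over from generation.
        return [o.strip() for o in outputs]

    def __call__(self, prompts: Union[str, List[str]]) -> Union[str, List[str]]:
        single = isinstance(prompts, str)
        batch = self._preprocess(prompts)
        outputs = self._postprocess(self.generate_fn(batch))
        # Return a string for a single prompt, a list for multiple prompts.
        return outputs[0] if single else outputs


def echo_model(batch: List[str]) -> List[str]:
    # Placeholder "model": a real setup would call an LLM here.
    return [f"{p} ... [generated text] " for p in batch]


pipe = ToyTextGenerationPipeline(echo_model)
print(pipe("Once upon a time"))
print(pipe(["Hello", "Gaudi 2 is"]))
```

The point of the design is that callers never deal with batching or clean-up themselves: one call takes either a string or a list of strings and returns output of the matching shape.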