OpenAI and Broadcom announce chip designed for LLM inference at scale

OpenAI and Broadcom announced the Jalapeño chip on May 15, 2024. It is a custom-designed processor for running large language model (LLM) inference at scale in data centers, and the first product of a multi-year partnership to develop specialized hardware for AI workloads. The collaboration aims to address the growing computational demands of advanced AI models.

Broadcom, a long-standing provider of semiconductor and infrastructure software, brings expertise in chip design and manufacturing. OpenAI, the developer of models such as GPT-4, contributes its understanding of AI inference requirements. The companies described this first-generation chip as part of a long-term strategy to keep refining hardware tailored for AI, which suggests that later iterations will target better performance and efficiency.

Jalapeño is aimed primarily at large-scale data centers, where the majority of LLM inference takes place. The move reflects a growing industry focus on bespoke silicon for AI, rather than reliance on general-purpose processors.

Original source: read the full reporting at Ars Technica.