Google Unveils Gemini 1.5 Pro With 1 Million Token Context Window
Google announced Gemini 1.5 Pro, a new multimodal large language model, on February 15, 2024, with a context window of up to 1 million tokens. The larger window lets the model process far more information in a single prompt: up to 1 hour of video, 11 hours of audio, or more than 30,000 lines of code. The model is available in a limited preview to developers and enterprise customers through Google AI Studio and Vertex AI.
The larger window substantially increases how much material a model can reason over at once. Previous models, including earlier versions of Gemini, typically had context windows of 32,000 to 128,000 tokens, so 1 million tokens is roughly 8 to 31 times larger. That capacity allows analysis of long documents, large codebases, and multimedia content in a single pass, which could change how work is done in fields such as legal research, software development, and content analysis.
Gemini 1.5 Pro also uses a Mixture-of-Experts (MoE) architecture, which Google says makes it more efficient than standard dense models. Rather than running the entire network on every input, an MoE model selectively activates only the most relevant parts of its neural network, which speeds up processing and reduces computational cost. Google reports strong performance across a range of benchmarks, with high accuracy maintained even at the expanded context length.
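To make the idea of selective activation concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not Google's implementation, whose internals are not described in this report. The names `moe_forward`, `make_expert`, and `gate_weights`, the toy "experts" that simply scale their input, and the choice of `top_k=2` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input value by `scale`."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input `x` to the `top_k` highest-scoring experts only.

    gate_weights holds one row per expert; an expert's score is the dot
    product of its row with x. Experts outside the top_k are never called,
    which is where the compute saving comes from.
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    ranked = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize the gate over the chosen experts so their weights sum to 1.
    weights = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, idx in zip(weights, chosen):
        expert_out = experts[idx](x)
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output, chosen


if __name__ == "__main__":
    experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
    gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
    output, chosen = moe_forward([2.0, 1.0], gate_weights, experts)
    print("experts used:", chosen)
    print("output:", output)
```

In this toy setup only two of the four experts run for a given input. A dense model would evaluate all of its parameters every time, which is the efficiency difference the MoE description points to.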
Google highlighted specific use cases, such as analyzing lengthy research papers, summarizing entire code repositories, and following complex video narratives. The company said the preview release includes safety filters and testing as part of its commitment to responsible AI development. Developers in the preview can use the model's long-context capability to build new applications.
Original source: The Economist