MarkTechPost | 3 min read

Mistral AI Releases Leanstral 1.5 for Lean 4 Proof Engineering

This week Mistral AI released Leanstral 1.5, a code agent model designed specifically for the Lean 4 proof assistant. The model targets automated theorem proving and proof engineering, and its weights are available under the permissive Apache 2.0 license. Mistral AI has also launched a free API endpoint, leanstral-1-5.

Leanstral 1.5 updates the earlier Leanstral-2603 model and belongs to Mistral AI's Mistral Small 4 family. It uses a mixture-of-experts (MoE) architecture, which routes each token through a small subset of specialized sub-networks so that compute per token stays low while overall capacity stays large. The model has 128 experts, 4 of which are active per token, and 119 billion total parameters, of which 6.5 billion are activated per token. It supports a context length of 256,000 tokens and accepts both text and image inputs, though it outputs text only.
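To make the expert counts concrete, here is a minimal sketch of generic top-k MoE routing in plain Python. It is not Mistral's implementation: the random router scores, the `route_token` helper, and the choice to apply softmax only over the selected experts are illustrative assumptions. Only the figures 128, 4, 119 billion, and 6.5 billion come from the article.

```python
import math
import random

NUM_EXPERTS = 128  # total experts, as reported in the article
TOP_K = 4          # experts active per token, as reported in the article


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k highest-scoring experts and return (index, weight) pairs.

    Hypothetical helper: weights are a softmax over the selected logits only,
    which is one common convention and an assumption here.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:top_k]
    peak = max(router_logits[i] for i in top)  # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Stand-in router scores for a single token (random, for illustration only).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)

# Share of experts and of parameters used per token, from the reported figures.
expert_fraction = TOP_K / NUM_EXPERTS  # 4 / 128
param_fraction = 6.5e9 / 119e9         # 6.5B active out of 119B total

print(f"experts used per token: {expert_fraction:.2%}")
print(f"parameters used per token: {param_fraction:.2%}")
```

With the reported figures, each token touches about 3% of the experts and roughly 5.5% of the parameters. The parameter share is higher than the expert share, which suggests that some parameters sit outside the routed experts and are used by every token. Attention layers and embeddings are typical examples in MoE models, though the article does not give the breakdown.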

Leanstral 1.5 was trained in three stages: mid-training, supervised fine-tuning, and reinforcement learning using CISPO. Two distinct reinforcement learning environments shaped the model's agentic capabilities. In the multiturn environment, the model was asked to prove or disprove theorems, refining each attempt based on feedback from the Lean compiler. In the code agent environment, the model worked inside a simulated filesystem, where it could interact with files, run bash commands, and query the Lean language server for real-time information on goals, errors, and types. This setup supported tasks such as completing partial proofs and creating auxiliary lemmas, with context compaction keeping long tasks within the context window.
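The multiturn environment described above amounts to a propose, check, refine loop. The sketch below shows that control flow only. Every name in it (`CheckResult`, `refine_proof`, `toy_propose`, `toy_check`) is hypothetical, and the toy checker is a stand-in for the Lean compiler rather than a real call into Lean or into Mistral's training setup.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CheckResult:
    ok: bool        # True if the candidate proof was accepted
    feedback: str   # checker messages passed back to the model on failure


def refine_proof(
    statement: str,
    propose: Callable[[str, str], str],
    check: Callable[[str, str], CheckResult],
    max_turns: int = 4,
) -> Tuple[Optional[str], int]:
    """Ask for a proof, check it, and feed errors back until accepted or out of turns."""
    feedback = ""
    for turn in range(1, max_turns + 1):
        candidate = propose(statement, feedback)
        result = check(statement, candidate)
        if result.ok:
            return candidate, turn
        feedback = result.feedback  # the next attempt sees the checker's messages
    return None, max_turns


# Toy stand-ins (assumptions): a "model" that fixes its proof once it sees
# feedback, and a "compiler" that rejects any proof containing `sorry`.
def toy_propose(statement: str, feedback: str) -> str:
    return "by simp" if feedback else "by sorry"


def toy_check(statement: str, proof: str) -> CheckResult:
    if "sorry" in proof:
        return CheckResult(False, "declaration uses 'sorry'")
    return CheckResult(True, "")


proof, turns = refine_proof("theorem t (n : Nat) : n + 0 = n", toy_propose, toy_check)
print(proof, turns)
```

In a real setup, `check` would invoke Lean on the candidate and return its diagnostics, and `propose` would be a model call. Passing both in as callables keeps the loop itself easy to test in isolation.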

Mistral AI reports that Leanstral 1.5 saturates the miniF2F benchmark, reaching 100% accuracy on its validation set. Proofs are verified against the target theorems using Mistral's fork of SafeVerify. On PutnamBench, the model solved 587 of 672 problems, or about 87%.
