Devstral: A new state-of-the-art open model for coding agents

Written by

Xingyao Wang, Graham Neubig

Published on

May 20, 2025

Today we introduce Devstral, an agentic LLM for software engineering tasks. Devstral is built under a collaboration between Mistral AI and All Hands AI, and outperforms all open-source models on SWE-Bench Verified by a large margin. We release Devstral under the Apache 2.0 license.

Devstral Performance

Agentic LLMs for software development

While typical LLMs are excellent at atomic coding tasks such as writing standalone functions or code completion, they currently struggle to solve real-world software engineering problems. Real-world development requires contextualising code within a large codebase, identifying relationships between disparate components, and identifying subtle bugs in intricate functions.

Devstral is an attempt at tackling this problem. Devstral is trained to solve real GitHub issues; it was trained to be compatible with the OpenHands, coding agent. Here, we show Devstral's performance on the popular SWE-Bench Verified benchmark, a dataset of 500 real-world GitHub issues which have been manually screened for correctness.

OpenHands+Devstral achieves a score of 46.8% on SWE-Bench Verified, outperforming prior open-source SoTA by more than 6% points. Compared to OpenHands with larger models such as Deepseek-V3-0324 (671B) and Qwen3 232B-A22B, Devstral achieves much more accurate results.

In the table below, we also compare Devstral to closed and open models evaluated within other agent frameworks (including ones custom for the model). Here, we find that Devstral achieves substantially better performance than a number of closed-source alternatives. For example, Devstral surpasses the recent GPT-4.1-mini by over 20%.

DevStral Performance Comparison

Getting Started

Devstral is light enough to run on a single RTX 4090 GPU or a Mac with 32GB RAM, making it an ideal choice for local deployment and on-device use. To learn more about how to deploy it, check out the local model instructions for OpenHands, and the following tutorial video.

Alternatively, if you would like to try out OpenHands without going through the local setup process, the OpenHands cloud allows you to get started right away.

Availability

We are releasing this model for free under an Apache 2.0 license for the community to build on, customize, and accelerate autonomous software development.

The model is also available on the Mistral API under the name devstral-small-2505 at the same price as Mistral Small 3.1: \$0.1/M input tokens and \$0.3/M output tokens.

For comparison, Claude 3.7 Sonnet is 30x more expensive per input token and 50x per output token.

You can download the model on Hugging Face, Ollama, Kaggle, Unsloth starting today.

The performance of the model also makes it a suitable choice for agentic coding on privacy-sensitive repositories in enterprises, especially ones subject to stringent security and compliance requirements. For enterprise deployments that require deployment in private settings, or higher-fidelity customization such as continued pre-training or distilling Devstral's capabilities into other models, please contact us.

Get in touch!

We'd love to hear what you think of this new model release. We're excited about it and we hope that you are too. If you'd like to join the community:

Join our Slack workspace - Here we talk about research, architecture, and future development.
Join our Discord server - This is a community-run server for general discussion, questions, and feedback.

Citation

Devstral: A new state-of-the-art open model for coding agents

Learning to Verify AI-Generated Code

OpenHands Product Update - March 2026

The OpenHands Vulnerability Fixer: Automated Security Remediation with AI Agents

Get useful insights in our blog

Insights and updates from the OpenHands team

Thank you for your submission!

Oops! Something went wrong while submitting the form.

Building the open standard for autonomous software development.

OpenHands is the foundation for secure, transparent, model-agnostic coding agents - empowering every software team to build faster with full control.