
mistralai/Mistral-Small-3.1-24B-Instruct-2503

Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and extends context capabilities up to 128K tokens while maintaining top-tier text performance. Its 24 billion parameters and instruction fine-tuning deliver fast, local deployment for both text and vision tasks.

  • Visibility: Public
  • Pricing: $0.05 / $0.10 per Mtoken (input / output)
  • Quantization: fp8
  • Context length: 128,000 tokens

Mistral-Small-3.1-24B-Instruct-2503

Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.
This model is an instruction-finetuned version of Mistral-Small-3.1-24B-Base-2503.

Mistral Small 3.1 can be deployed locally and is exceptionally "knowledge-dense," fitting within a single RTX 4090 or a 32GB RAM MacBook once quantized.
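As a rough back-of-envelope check (assuming 4-bit weight quantization, and ignoring KV-cache and activation overhead, which are extra), the quantized weights alone come to about 12 GB:

```python
# Illustrative memory estimate for the quantized weights only.
# Assumes 4-bit quantization; KV cache and activations add overhead on top.
params = 24e9          # 24 billion parameters
bits_per_param = 4     # hypothetical 4-bit quantization
weight_bytes = params * bits_per_param / 8
weight_gb = weight_bytes / 1e9

print(f"{weight_gb:.0f} GB")  # 12 GB, comfortably within a 24 GB RTX 4090
```

This is why the model fits on a single consumer GPU or a 32 GB MacBook once quantized, with headroom left for the runtime.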

It is ideal for:

  • Fast-response conversational agents.
  • Low-latency function calling.
  • Subject matter experts via fine-tuning.
  • Local inference for hobbyists and organizations handling sensitive data.
  • Programming and math reasoning.
  • Long document understanding.
  • Visual understanding.

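Use cases like the conversational agents above are typically served through an OpenAI-compatible chat completions API. The sketch below only constructs the request payload; the prompt contents are illustrative, no request is sent, and endpoint URL and authentication are deployment-specific:

```python
import json

# Build an OpenAI-style chat completions payload (nothing is sent here).
# The model ID matches this card; the messages are illustrative.
payload = {
    "model": "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize this clause in one sentence."},
    ],
    "max_tokens": 256,
    "temperature": 0.3,
}

body = json.dumps(payload)  # serialized request body for an HTTP POST
```

A low `temperature` is a common choice for the low-latency, deterministic agent use cases listed above.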
For enterprises requiring specialized capabilities (increased context, specific modalities, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.

Learn more about Mistral Small 3.1 in our blog post.

Key Features

  • Vision: The model can analyze images and provide insights based on visual content in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, Farsi.
  • Agent-Centric: Offers best-in-class agentic capabilities with native function calling and JSON output.
  • Advanced Reasoning: State-of-the-art conversational and reasoning capabilities.
  • Apache 2.0 License: Open license allowing usage and modification for both commercial and non-commercial purposes.
  • Context Window: A 128k context window.
  • System Prompt: Maintains strong adherence and support for system prompts.
  • Tokenizer: Utilizes a Tekken tokenizer with a 131k vocabulary size.
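The native function calling noted above follows the common JSON tool-schema convention. A minimal sketch of defining a tool and decoding the arguments of a resulting tool call (the `get_weather` tool and its fields are hypothetical):

```python
import json

# A tool definition in the common JSON-schema style used for function
# calling. The tool name and parameters here are hypothetical examples.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

# A model's tool call typically carries JSON-encoded arguments to decode:
tool_call_args = json.loads('{"city": "Paris"}')
print(tool_call_args["city"])  # Paris
```

The same structured-output machinery backs the JSON output mode: the model is steered to emit text that parses cleanly with `json.loads`.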

Benchmark Results

Where available, we report numbers previously published by other model providers; otherwise, we re-evaluate them using our own evaluation harness.

Pretrain Evals

| Model | MMLU (5-shot) | MMLU Pro (5-shot CoT) | TriviaQA | GPQA Main (5-shot CoT) | MMMU |
|---|---|---|---|---|---|
| Small 3.1 24B Base | 81.01% | 56.03% | 80.50% | 37.50% | 59.27% |
| Gemma 3 27B PT | 78.60% | 52.20% | 81.30% | 24.30% | 56.10% |

Instruction Evals

Text

| Model | MMLU | MMLU Pro (5-shot CoT) | MATH | GPQA Main (5-shot CoT) | GPQA Diamond (5-shot CoT) | MBPP | HumanEval | SimpleQA (TotalAcc) |
|---|---|---|---|---|---|---|---|---|
| Small 3.1 24B Instruct | 80.62% | 66.76% | 69.30% | 44.42% | 45.96% | 74.71% | 88.41% | 10.43% |
| Gemma 3 27B IT | 76.90% | 67.50% | 89.00% | 36.83% | 42.40% | 74.40% | 87.80% | 10.00% |
| GPT4o Mini | 82.00% | 61.70% | 70.20% | 40.20% | 39.39% | 84.82% | 87.20% | 9.50% |
| Claude 3.5 Haiku | 77.60% | 65.00% | 69.20% | 37.05% | 41.60% | 85.60% | 88.10% | 8.02% |
| Cohere Aya-Vision 32B | 72.14% | 47.16% | 41.98% | 34.38% | 33.84% | 70.43% | 62.20% | 7.65% |

Vision

| Model | MMMU | MMMU PRO | MathVista | ChartQA | DocVQA | AI2D | MM MT-Bench |
|---|---|---|---|---|---|---|---|
| Small 3.1 24B Instruct | 64.00% | 49.25% | 68.91% | 86.24% | 94.08% | 93.72% | 7.3 |
| Gemma 3 27B IT | 64.90% | 48.38% | 67.60% | 76.00% | 86.60% | 84.50% | 7.0 |
| GPT4o Mini | 59.40% | 37.60% | 56.70% | 76.80% | 86.70% | 88.10% | 6.6 |
| Claude 3.5 Haiku | 60.50% | 45.03% | 61.60% | 87.20% | 90.00% | 92.10% | 6.5 |
| Cohere Aya-Vision 32B | 48.20% | 31.50% | 50.10% | 63.04% | 72.40% | 82.57% | 4.1 |

Multilingual Evals

| Model | Average | European | East Asian | Middle Eastern |
|---|---|---|---|---|
| Small 3.1 24B Instruct | 71.18% | 75.30% | 69.17% | 69.08% |
| Gemma 3 27B IT | 70.19% | 74.14% | 65.65% | 70.76% |
| GPT4o Mini | 70.36% | 74.21% | 65.96% | 70.90% |
| Claude 3.5 Haiku | 70.16% | 73.45% | 67.05% | 70.00% |
| Cohere Aya-Vision 32B | 62.15% | 64.70% | 57.61% | 64.12% |

Long Context Evals

| Model | LongBench v2 | RULER 32K | RULER 128K |
|---|---|---|---|
| Small 3.1 24B Instruct | 37.18% | 93.96% | 81.20% |
| Gemma 3 27B IT | 34.59% | 91.10% | 66.00% |
| GPT4o Mini | 29.30% | 90.20% | 65.80% |
| Claude 3.5 Haiku | 35.19% | 92.60% | 91.90% |