
allenai/Olmo-3.1-32B-Instruct

Pricing: $0.20 per 1M input tokens, $0.60 per 1M output tokens.
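As a quick sanity check on the rates above, a small cost helper (the per-token rates are taken from this page; the token counts are illustrative):

```python
def request_cost(tokens_in: int, tokens_out: int) -> float:
    """Cost in USD at the listed rates: $0.20 per 1M input, $0.60 per 1M output."""
    return tokens_in / 1e6 * 0.20 + tokens_out / 1e6 * 0.60

# Example: 2,000 prompt tokens + 500 completion tokens
print(f"${request_cost(2_000, 500):.4f}")  # -> $0.0007
```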

Olmo is a series of open language models, developed by the Allen Institute for AI (Ai2), designed to enable the science of language models.

Context length: 65,536 tokens. Supports JSON output and function calling. Available as a public endpoint; private endpoint deployment is also offered.
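Assuming the hosted endpoint is OpenAI-compatible (an assumption based on the deployment options above; the base URL below is a placeholder, not a documented value), a minimal sketch with the official `openai` Python client:

```python
from openai import OpenAI

# Placeholder endpoint and key; substitute your provider's actual values.
client = OpenAI(base_url="https://your-provider.example/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="allenai/Olmo-3.1-32B-Instruct",
    messages=[{"role": "user", "content": "Summarize what RLVR training does."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```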

Model Information

  • license: apache-2.0
  • base_model: allenai/Olmo-3.1-32B-Instruct-DPO
  • language: en
  • library_name: transformers
  • datasets: allenai/Dolci-Instruct-RL

Model Details


Model Card for Olmo-3.1-32B-Instruct

We introduce Olmo 3, a new family of 7B and 32B models available in both Instruct and Think variants. Long chain-of-thought reasoning improves performance on tasks like math and coding.

Olmo is a series of open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets. We are releasing all code, checkpoints, logs (coming soon), and associated training details.

The core models released in this batch include the following:

| Stage | Olmo 3 7B Think | Olmo (3/3.1) 32B Think | Olmo 3 7B Instruct | Olmo 3.1 32B Instruct |
|---|---|---|---|---|
| Base Model | Olmo-3-7B | Olmo-3-32B | Olmo-3-7B | Olmo-3-32B |
| SFT | Olmo-3-7B-Think-SFT | Olmo-3-32B-Think-SFT | Olmo-3-7B-Instruct-SFT | Olmo-3.1-32B-Instruct-SFT |
| DPO | Olmo-3-7B-Think-DPO | Olmo-3-32B-Think-DPO | Olmo-3-7B-Instruct-DPO | Olmo-3.1-32B-Instruct-DPO |
| Final Models (RLVR) | Olmo-3-7B-Think | Olmo-3-32B-Think, Olmo-3.1-32B-Think | Olmo-3-7B-Instruct | Olmo-3.1-32B-Instruct |

Model Description

  • Developed by: Allen Institute for AI (Ai2)
  • Model type: a Transformer-style autoregressive language model.
  • Language(s) (NLP): English
  • License: This model is licensed under Apache 2.0. It is intended for research and educational use in accordance with Ai2's Responsible Use Guidelines.
  • Contact: Technical inquiries: olmo@allenai.org. Press: press@allenai.org
  • Date cutoff: Dec. 2024.
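For local use, the metadata above lists `library_name: transformers`, so the standard Hugging Face causal-LM interface should apply; a minimal inference sketch (this assumes the repo ships a chat template and that you have enough GPU memory for a 32B model):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3.1-32B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is chain-of-thought prompting?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```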

Model Sources

Evaluation

| Metric | Olmo 3.1 32B Instruct SFT | Olmo 3.1 32B Instruct DPO | Olmo 3.1 32B Instruct | Apertus 70B | Qwen 3 32B (No Think) | Qwen 3 VL 32B Instruct | Qwen 2.5 32B | Gemma 3 27B | Gemma 2 27B | OLMo 2 32B |
|---|---|---|---|---|---|---|---|---|---|---|
| **Math** | | | | | | | | | | |
| MATH | 74.4 | 86.6 | 93.4 | 36.2 | 84.3 | 95.1 | 80.2 | 87.4 | 51.5 | 49.2 |
| AIME 2024 | 12.7 | 35.2 | 67.8 | 0.31 | 27.9 | 75.4 | 15.7 | 28.9 | 4.7 | 4.6 |
| AIME 2025 | 8.2 | 23.3 | 57.9 | 0.1 | 21.3 | 64.2 | 13.4 | 22.9 | 0.9 | 0.9 |
| OMEGA | 15.5 | 33.3 | 42.2 | 5.6 | 23.4 | 44.0 | 19.2 | 24.0 | 9.1 | 9.8 |
| **Reasoning** | | | | | | | | | | |
| BigBenchHard | 69.0 | 82.1 | 84.0 | 57.0 | 80.4 | 89.0 | 80.9 | 82.4 | 66.0 | 65.6 |
| ZebraLogic | 30.6 | 51.1 | 61.7 | 9.0 | 28.4 | 86.7 | 24.1 | 24.8 | 17.2 | 13.3 |
| AGI Eval English | 71.7 | 79.4 | 79.5 | 61.6 | 82.4 | 89.4 | 78.9 | 76.9 | 70.9 | 68.4 |
| **Coding** | | | | | | | | | | |
| HumanEvalPlus | 80.8 | 85.7 | 86.7 | 42.9 | 83.9 | 89.3 | 82.6 | 79.2 | 67.5 | 44.4 |
| MBPP+ | 61.5 | 63.6 | 65.1 | 45.8 | 67.9 | 69.0 | 66.6 | 65.7 | 61.2 | 49.0 |
| LiveCodeBench v3 | 35.4 | 49.6 | 54.7 | 9.7 | 57.5 | 70.2 | 49.9 | 39.0 | 28.7 | 10.6 |
| **IF** | | | | | | | | | | |
| IFEval | 87.7 | 87.3 | 88.8 | 70.4 | 87.5 | 88.1 | 81.9 | 85.4 | 62.1 | 85.8 |
| IFBench | 29.7 | 36.3 | 39.7 | 26.0 | 31.3 | 37.2 | 36.7 | 31.3 | 27.8 | 36.4 |
| **Knowledge & QA** | | | | | | | | | | |
| MMLU | 79.0 | 81.9 | 80.9 | 70.2 | 85.8 | 88.7 | 84.6 | 74.6 | 76.1 | 77.1 |
| PopQA | 23.7 | 28.5 | 25.0 | 33.5 | 25.9 | 25.7 | 28.0 | 30.2 | 30.4 | 37.2 |
| GPQA | 41.3 | 47.9 | 48.6 | 27.9 | 54.4 | 61.4 | 44.6 | 45.0 | 39.9 | 36.4 |
| **Chat** | | | | | | | | | | |
| AlpacaEval 2 LC | 42.2 | 69.7 | 59.8 | 19.9 | 67.9 | 84.3 | 81.9 | 65.5 | 39.8 | 38.0 |
| Safety | 92.1 | 88.9 | 89.5 | 77.1 | 81.6 | 85.8 | 82.2 | 68.8 | 74.4 | 84.2 |

Model Details

Stage 1: SFT

  • Supervised fine-tuning on the Dolci-Think-SFT-7B dataset, which consists of math, code, chat, and general knowledge queries (a schematic of the objective follows this list).
  • Datasets: Dolci-Think-SFT-7B, Dolci-Instruct-SFT
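A schematic of the SFT objective (not Ai2's actual training code): next-token cross-entropy computed on response tokens only, with prompt positions masked out.

```python
import torch.nn.functional as F

def sft_loss(logits, input_ids, prompt_len):
    """logits: (batch, seq, vocab); input_ids: (batch, seq).
    Prompt tokens are masked with -100 so only the response is supervised."""
    labels = input_ids.clone()
    labels[:, :prompt_len] = -100
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),  # predictions for position t+1
        labels[:, 1:].reshape(-1),                    # shifted targets
        ignore_index=-100,
    )
```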

Stage 2: DPO

  • Direct preference optimization on the Dolci-Think-DPO-7B dataset, which consists of math, code, chat, and general knowledge queries (the loss is sketched after this list).
  • Datasets: Dolci-Think-DPO-7B, Dolci-Instruct-DPO
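For reference, the standard DPO loss (Rafailov et al., 2023) that this stage optimizes; `beta` and the log-prob inputs here are illustrative, not Ai2's actual hyperparameters.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Each argument is a tensor of sequence log-probs for a preference pair."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()
```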

Stage 3: RLVR

  • Reinforcement learning with verifiable rewards on the Dolci-Think-RL-7B dataset, which consists of math, code, instruction-following, and general chat queries (a toy verifier is sketched after this list).
  • Datasets: Dolci-Think-RL-7B, Dolci-Instruct-RL
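The "verifiable" in RLVR means rewards come from programmatic checkers rather than a learned reward model. A toy sketch of such a checker; real verifiers (e.g., for math answers or unit-tested code) are task-specific and more robust.

```python
def verifiable_reward(completion: str, gold_answer: str) -> float:
    """Toy verifier: reward 1.0 iff the completion's final line matches the gold answer."""
    lines = completion.strip().splitlines()
    predicted = lines[-1].strip() if lines else ""
    return 1.0 if predicted == gold_answer.strip() else 0.0
```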

Bias, Risks, and Limitations

Like any base or fine-tuned language model without safety filtering, these models can easily be prompted by users to generate harmful and sensitive content. Such content may also be produced unintentionally, especially in cases involving bias, so we recommend that users consider the risks when applying this technology. Additionally, statements from Olmo, as from any LLM, can be inaccurate, so facts should be verified.

License

This model is licensed under Apache 2.0. It is intended for research and educational use in accordance with Ai2's Responsible Use Guidelines.

Citation

A technical manuscript is forthcoming!

Model Card Contact

For errors in this model card, contact olmo@allenai.org.