stepfun-ai/
Pricing: $0.10 in, $0.30 out, $0.02 cached, per 1M tokens
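
To make the per-token pricing concrete, here is a minimal sketch of a cost estimate. The function name `estimate_cost` is hypothetical, and it assumes cached tokens are a subset of the input tokens and are billed at the cached rate instead of the input rate; the actual billing rules may differ.

```python
# Prices in USD per 1M tokens, copied from the listing above.
PRICE_PER_MILLION = {"input": 0.10, "output": 0.30, "cached": 0.02}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption (not stated in the listing): cached_tokens are part of
    input_tokens and are charged at the cached rate instead of the input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cached"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


# 200K prompt tokens (150K of them cached) and 4K generated tokens.
print(f"${estimate_cost(200_000, 4_000, cached_tokens=150_000):.4f}")  # prints $0.0092
```

Under these assumptions, a long prompt that is mostly served from cache costs far less than the same prompt sent uncached.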
Step 3.5 Flash is an open-source reasoning model by StepFun with 196B total parameters (11B active) using Mixture of Experts. It features a 256K context window, deep reasoning, tool calling, and agentic capabilities, achieving 97.3 on AIME 2025 and 74.4% on SWE-bench Verified.

Step 3.5 Flash is an open-source frontier reasoning model by StepFun. Built on a sparse Mixture of Experts (MoE) architecture, it activates only about 11B of its 196B total parameters per token, so each token costs far less compute than it would in a dense model of the same total size.
**\<think>** blocks, controllable via the `reasoning_effort` parameter (none, low, medium, high)

`response_format`

| Category | Benchmark | Score |
|---|---|---|
| Math | AIME 2025 | 97.3 |
| Math | HMMT 2025 (Feb.) | 98.4 |
| Coding | LiveCodeBench-V6 | 86.4 |
| Coding | SWE-bench Verified | 74.4% |
| Agentic | Terminal-Bench 2.0 | 51.0% |
| Agentic | GAIA (no file) | 84.5 |
| Agentic | BrowseComp | 51.6 |

| Specification | Value |
|---|---|
| Total Parameters | 196B |
| Active Parameters | ~11B per token |
| Context Window | 256K tokens |
| Experts | 288 routed + 1 shared per layer (Top-8 selection) |
| License | Apache 2.0 |
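
The "288 routed + 1 shared, Top-8" entry can be illustrated with a toy router. This is only a sketch of generic top-k expert selection, not StepFun's implementation: the function name `route_token`, the random logits, and the choice of a softmax over the selected logits are all assumptions, and the model's real gating function may differ.

```python
import math
import random

NUM_ROUTED_EXPERTS = 288  # from the table above
TOP_K = 8                 # from the table above


def route_token(router_logits: list[float], top_k: int = TOP_K) -> dict[int, float]:
    """Select the top_k routed experts for one token and normalize their weights.

    Assumption: weights are a softmax over the selected logits only.
    """
    ranked = sorted(range(len(router_logits)), key=router_logits.__getitem__, reverse=True)
    selected = ranked[:top_k]
    peak = max(router_logits[i] for i in selected)  # subtract for numerical stability
    exps = {i: math.exp(router_logits[i] - peak) for i in selected}
    total = sum(exps.values())
    return {i: value / total for i, value in exps.items()}


random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_ROUTED_EXPERTS)]  # stand-in router output
weights = route_token(logits)

# Only the selected routed experts plus the always-on shared expert run for this token.
experts_run = len(weights) + 1
```

In this sketch, 9 of the 289 experts in a layer run for a given token, which is why the active parameter count is a small fraction of the total.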
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",
    api_key="YOUR_DEEPINFRA_TOKEN",
)

# Basic chat with reasoning
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.5-Flash",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(response.choices[0].message.reasoning_content)  # thinking
print(response.choices[0].message.content)  # answer

# Disable reasoning for faster responses
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.5-Flash",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={"reasoning_effort": "none"},
)

# Tool calling
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.5-Flash",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
```
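
The tool-calling request above stops once the model has asked for a tool. A minimal sketch of the next step follows, assuming the endpoint returns tool calls in the standard OpenAI chat-completions shape (`message.tool_calls`, each with `id`, `function.name`, and a JSON string in `function.arguments`). The local `get_weather` implementation, `TOOL_REGISTRY`, and `run_tool_calls` are hypothetical names introduced for illustration, and the stand-in call object replaces a real API response so the block runs offline.

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> dict:
    """Hypothetical local tool; returns canned data instead of a real forecast."""
    return {"city": city, "forecast": "unknown"}


# Maps tool names declared in the request to local callables (illustrative).
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls) -> list[dict]:
    """Run each requested tool and build the matching `tool` role messages."""
    messages = []
    for call in tool_calls or []:
        func = TOOL_REGISTRY.get(call.function.name)
        if func is None:
            result = {"error": f"unknown tool: {call.function.name}"}
        else:
            args = json.loads(call.function.arguments or "{}")
            result = func(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })
    return messages


# Stand-in shaped like one entry of response.choices[0].message.tool_calls.
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Paris"}'),
)
tool_messages = run_tool_calls([fake_call])
```

With a live response, you would append the assistant message and then `tool_messages` to the conversation and call `client.chat.completions.create` again so the model can write its final answer from the tool output.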
Links
- https://huggingface.co/stepfun-ai/Step-3.5-Flash
- https://github.com/stepfun-ai/Step-3.5-Flash
- https://arxiv.org/abs/2602.10604
© 2026 Deep Infra. All rights reserved.