
stepfun-ai/
$0.20 in / $1.15 out / $0.04 cached, per 1M tokens
Step 3.7 Flash is an open-source multimodal reasoning model by StepFun with 198B total parameters (11B active) using Mixture of Experts. It accepts text and image inputs and features a 256K context window, selectable reasoning effort, tool calling, and agentic capabilities for coding and search workflows, scoring 80.9% on GPQA Diamond and 56.3% on SWE-bench Pro.

Step 3.7 Flash is an open-source frontier multimodal reasoning model by StepFun. Built on a sparse Mixture of Experts (MoE) architecture, it activates only ~11B of its 198B total parameters per token, and pairs its language backbone with a vision encoder for native image understanding — delivering state-of-the-art reasoning at a fraction of the cost of dense models.
- Image input uses the image_url content format.
- Reasoning is emitted in \<think> blocks, with selectable depth via reasoning_effort (low, medium, high). Reasoning is always on for this model.
- response_format parameter

| Category | Benchmark | Score |
|---|---|---|
| Reasoning | GPQA Diamond | 80.9% |
| Coding | SWE-bench Pro | 56.3% |
| Agentic | Terminal-Bench 2.1 | 59.6% |

| Specification | Value |
|---|---|
| Total Parameters | 198B |
| Active Parameters | ~11B per token |
| Context Window | 256K tokens |
| Modality | Text + Image |
| Reasoning | Always on; selectable effort (low / medium / high) |
| License | Apache 2.0 |
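
As a rough illustration of how the per-1M-token prices listed at the top of this page translate into a per-request cost, the sketch below multiplies token counts by those rates. The helper name `estimate_cost_usd` is hypothetical, and the assumption that cached input tokens are billed at the cached rate instead of the regular input rate is illustrative, not taken from DeepInfra's billing documentation.

```python
# Prices in USD per 1M tokens, copied from the pricing shown on this page.
PRICE_IN = 0.20
PRICE_OUT = 1.15
PRICE_CACHED = 0.04


def estimate_cost_usd(input_tokens, output_tokens, cached_tokens=0):
    """Hypothetical helper: estimate the cost of one request in USD.

    Assumes cached_tokens is the part of input_tokens that was served from
    cache and is billed at the cached rate (an assumption, not confirmed
    billing behavior).
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_IN
        + cached_tokens * PRICE_CACHED
        + output_tokens * PRICE_OUT
    )
    return total / 1_000_000


# Example: 100K uncached input tokens and 5K output tokens.
print(f"${estimate_cost_usd(100_000, 5_000):.5f}")
```

Because output tokens cost several times more than input tokens, and reasoning is always on, long reasoning traces are likely to dominate the bill for short prompts.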

from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",
    api_key="YOUR_DEEPINFRA_TOKEN",
)

# Chat with reasoning
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.7-Flash",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(response.choices[0].message.reasoning_content)  # thinking
print(response.choices[0].message.content)  # answer
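
The feature notes mention `<think>` blocks, while the snippet above reads a separate `reasoning_content` field. As a hedged sketch, the hypothetical helper `split_reasoning` below handles both cases: it prefers `reasoning_content` when that field is present and otherwise looks for an inline `<think>...</think>` block in the content. Whether a given response carries inline tags or a separate field is an assumption to verify against actual API output.

```python
import re
from types import SimpleNamespace

# Matches the first <think>...</think> block, including newlines inside it.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(message):
    """Return a (reasoning, answer) tuple for a chat message object."""
    content = getattr(message, "content", None) or ""
    reasoning = getattr(message, "reasoning_content", None)
    if reasoning:
        # A separate reasoning field is present; content is the answer.
        return reasoning.strip(), content.strip()
    match = _THINK_RE.search(content)
    if match:
        # Inline tags: cut the block out of the content to get the answer.
        answer = (content[:match.start()] + content[match.end():]).strip()
        return match.group(1).strip(), answer
    return "", content.strip()


# Stand-in objects that mimic the message shape; no API call is made here.
with_field = SimpleNamespace(
    reasoning_content="Assume p/q in lowest terms.",
    content="sqrt(2) is irrational.",
)
inline = SimpleNamespace(
    content="<think>Assume p/q.</think>\nsqrt(2) is irrational.",
)
print(split_reasoning(with_field))
print(split_reasoning(inline))
```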

# Image understanding (multimodal)
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.7-Flash",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
    ]}],
)
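
The snippet above passes a public image URL. For a local file, a common pattern with OpenAI-compatible chat APIs is to embed the image as a base64 `data:` URL in the same `image_url` field. The sketch below builds such a message; the helper names `image_part_from_bytes` and `build_image_message` are hypothetical, and whether this endpoint accepts data URLs for this model is an assumption to confirm before relying on it.

```python
import base64


def image_part_from_bytes(data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Hypothetical helper: wrap raw image bytes as an image_url content part."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def build_image_message(prompt: str, data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build a user message holding a text part followed by an image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            image_part_from_bytes(data, mime_type),
        ],
    }


# Three placeholder bytes stand in for a real file read with open(path, "rb").
message = build_image_message("What is in this image?", b"\xff\xd8\xff")
print(message["content"][1]["image_url"]["url"])
```

The resulting dictionary can be passed as the single entry of `messages` in the same `client.chat.completions.create` call shown above.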

# Control reasoning depth
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.7-Flash",
    messages=[{"role": "user", "content": "Plan a 3-day trip to Tokyo."}],
    extra_body={"reasoning_effort": "high"},
)

# Tool calling
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.7-Flash",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
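
The request above only declares the `get_weather` tool; when the model decides to use it, the response carries `tool_calls` rather than a final answer. A minimal sketch of the client-side step follows, assuming the standard OpenAI-compatible tool-call shape (`id`, `function.name`, and `function.arguments` as a JSON string). The `get_weather` implementation, the `TOOLS` registry, and the `run_tool_calls` helper are hypothetical stand-ins.

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> str:
    # Placeholder implementation; a real one would query a weather service.
    return f"No live data available for {city}."


# Registry mapping tool names declared in the request to local functions.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(message):
    """Execute each requested tool and return the resulting 'tool' messages."""
    results = []
    for call in getattr(message, "tool_calls", None) or []:
        func = TOOLS.get(call.function.name)
        if func is None:
            output = f"Unknown tool: {call.function.name}"
        else:
            # Arguments arrive as a JSON-encoded string.
            args = json.loads(call.function.arguments or "{}")
            output = func(**args)
        results.append(
            {"role": "tool", "tool_call_id": call.id, "content": output}
        )
    return results


# Stand-in for response.choices[0].message; no API call is made here.
fake_message = SimpleNamespace(
    tool_calls=[
        SimpleNamespace(
            id="call_1",
            function=SimpleNamespace(
                name="get_weather", arguments='{"city": "Paris"}'
            ),
        )
    ]
)
print(run_tool_calls(fake_message))
```

In a real exchange, the assistant message and these tool messages would be appended to `messages`, followed by a second `client.chat.completions.create` call to obtain the final answer.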
Links
- https://huggingface.co/stepfun-ai/Step-3.7-Flash
- https://github.com/stepfun-ai/Step-3.7-Flash
© 2026 DeepInfra. All rights reserved.