Qwen/
Pricing per 1M tokens: $2.50 input, $7.50 output, $0.50 cached
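To make the per-token pricing concrete, here is a minimal Python sketch that estimates the cost of a single request from the three listed rates. The function name `estimate_cost` is hypothetical, and the sketch assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the input rate; the page itself does not spell out that billing rule.

```python
# Listed prices in USD per 1M tokens (taken from the pricing line).
PRICE_PER_MILLION = {"input": 2.50, "output": 7.50, "cached": 0.50}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption (not confirmed by the page): cached tokens are part of the
    input and are charged at the cached rate rather than the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cached"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # A long agent turn: 200k prompt tokens, 150k of them cached, 10k generated.
    print(f"${estimate_cost(200_000, 10_000, cached_tokens=150_000):.4f}")
```

Under that assumption, a long-running agent session that reuses most of its prompt from cache is dominated by output-token cost rather than input-token cost.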

The Max model, the largest and most capable in the Qwen3.7 series, is currently offered with a text-only interface for public experimentation. Qwen3.7 is a next-generation flagship model designed for the agent-centric era. Its core strengths are the breadth and depth of its agent capabilities: it excels at programming, office and productivity tasks, and long-horizon autonomous execution.
What sets Qwen3.7-Max apart is the breadth and depth of its agent capabilities. It excels as a coding agent, from frontend prototyping to complex multi-file engineering. It serves as a reliable office and productivity assistant through MCP integrations and multi-agent orchestration. It sustains coherent reasoning across extremely long horizons — as demonstrated by a 35-hour, fully autonomous kernel optimization run comprising over 1,000 tool calls. It generalizes across agent scaffolds, performing consistently whether deployed through Claude Code, OpenClaw, Qwen Code, or other frameworks.
In coding agents, Qwen3.7-Max performs strongly on SWE-Pro (60.6), SWE-Multilingual (78.3), SciCode (53.5), and QwenSVG (1608). On Terminal Bench 2.0-Terminus (69.7), it outperforms DS-V4-Pro Max (67.9). On SWE-Verified (80.4), it is on par with Opus-4.6 Max (80.8) and DS-V4-Pro Max (80.6).
In general-purpose agents, improvements are even more pronounced. Qwen3.7-Max performs exceptionally well on MCP-Mark (60.8 vs. GLM-5.1’s 57.5), MCP-Atlas (76.4 vs. Opus-4.6’s 75.8), and Skillsbench (59.2 vs. K2.6’s 56.2), and demonstrates strong GPU kernel optimization capabilities on Kernel Bench L3 (1.98x median speedup, 96% win rate). It also scores highly on BFCL-V4 (75.0), Qwenclaw (64.3), and ClawEval (65.2), closely approaching Opus-4.6 Max. On the office automation benchmark SpreadSheetBench-v1, it achieves a top-tier score of 87.
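For readers unfamiliar with the two Kernel Bench figures quoted above, the following Python sketch shows one plausible way a median speedup and a win rate could be derived from per-kernel timings. It assumes speedup is baseline time divided by optimized time and that a "win" is any kernel with speedup above 1.0; the benchmark's actual definitions may differ, and the function name and timings are hypothetical illustration values, not benchmark data.

```python
from statistics import median


def speedup_stats(baseline_ms: list[float], optimized_ms: list[float]) -> tuple[float, float]:
    """Return (median speedup, win rate) over paired kernel timings.

    Assumptions: speedup = baseline / optimized, and a win is speedup > 1.0.
    """
    if not baseline_ms or len(baseline_ms) != len(optimized_ms):
        raise ValueError("need two non-empty timing lists of equal length")

    speedups = [b / o for b, o in zip(baseline_ms, optimized_ms)]
    wins = sum(1 for s in speedups if s > 1.0)
    return median(speedups), wins / len(speedups)


if __name__ == "__main__":
    # Hypothetical timings in milliseconds for four kernels.
    med, win_rate = speedup_stats([10.0, 20.0, 30.0, 8.0], [5.0, 10.0, 20.0, 10.0])
    print(f"median speedup: {med:.2f}x, win rate: {win_rate:.0%}")
```

A median is less sensitive to a few outlier kernels than a mean, which is likely why a median figure is reported alongside the win rate.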
In reasoning, Qwen3.7-Max achieves leading results on GPQA Diamond (92.4 vs. Opus-4.6’s 91.3), HLE (41.4 vs. Opus-4.6’s 40), HMMT 2026 Feb (97.1 vs. Opus-4.6’s 96.2), IMOAnswerBench (90 vs. DS-V4-Pro’s 89.8), and Apex (44.5 vs. DS-V4-Pro’s 38.3), demonstrating exceptional strength on the hardest reasoning benchmarks.
In general capabilities and multilingualism, Qwen3.7-Max stands out on IFBench (79.1 vs. DS-V4-Pro’s 77.0), demonstrating precise instruction following. It achieves leading scores on WMT24++ (85.8) and MAXIFE (89.2), confirming top-tier multilingual understanding and translation quality. It also delivers strong results on SuperGPQA (73.6) and QwenWorldBench (57.3).
Notably, these scores are drawn from a wide variety of agent scaffolds. Rather than optimizing for any single framework, Qwen3.7-Max delivers consistently across Claude Code, OpenClaw, Qwen Code, and custom tool-use frameworks, making it a reliable drop-in backbone for any agent system.
Cowork Productivity Assistant
Qwen3.7-Max serves as an advanced coworker for real-world productivity. Its agent capabilities streamline professional workflows (synthesizing complex information, performing in-depth data analysis and modeling, and generating publication-ready documents and visualizations) so that it can reliably handle high-complexity enterprise workloads.
Qwen3.7-Max features native compatibility with mainstream agent harnesses. For long-horizon tasks, it supports autonomous planning and continuous execution across multi-hour sessions. Through thousands of tool calls and dozens of refinement iterations, it steadily improves output quality. Complex projects that typically require one to two weeks of specialized team effort can now be completed end-to-end within hours, delivering measurable productivity gains.