Browse DeepInfra models:

All categories and models that you can try out and use directly on DeepInfra:

Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo
featured
fp4
256k
$0.30/$1.20 in/out Mtoken
  • text-generation

Qwen3-Coder-480B-A35B-Instruct is Qwen3's most agentic code model, with strong performance on agentic coding, agentic browser use, and other foundational coding tasks, achieving results comparable to Claude Sonnet.

Qwen/Qwen3-Coder-480B-A35B-Instruct
featured
fp8
256k
$0.40/$1.60 in/out Mtoken
  • text-generation

Qwen3-Coder-480B-A35B-Instruct is Qwen3's most agentic code model, with strong performance on agentic coding, agentic browser use, and other foundational coding tasks, achieving results comparable to Claude Sonnet.

moonshotai/Kimi-K2-Instruct
featured
mixed: fp8/fp4
128k
$0.55/$2.20 in/out Mtoken
  • text-generation

Kimi K2 is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. Kimi K2 excels across a broad range of benchmarks, particularly in coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) tasks.

Qwen/Qwen3-235B-A22B-Thinking-2507
featured
fp8
256k
$0.13/$0.60 in/out Mtoken
  • text-generation

Qwen3-235B-A22B-Thinking-2507 is a new Qwen3 model that scales up the thinking capability of Qwen3-235B-A22B, improving both the quality and depth of reasoning.

mistralai/Voxtral-Small-24B-2507
featured
bf16
$0.00300 / minute
  • automatic-speech-recognition

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding.

mistralai/Voxtral-Mini-3B-2507
featured
bf16
$0.00100 / minute
  • automatic-speech-recognition

Voxtral Mini is an enhancement of Ministral 3B, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding.

deepseek-ai/DeepSeek-R1-0528-Turbo
featured
fp4
32k
$1.00/$3.00 in/out Mtoken
  • text-generation

The DeepSeek R1 0528 Turbo model is a state-of-the-art reasoning model that can generate very quick responses.

Qwen/Qwen3-235B-A22B-Instruct-2507
featured
fp8
256k
$0.13/$0.60 in/out Mtoken
  • text-generation

Qwen3-235B-A22B-Instruct-2507 is the updated version of the Qwen3-235B-A22B non-thinking mode, featuring significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.

Qwen/Qwen3-30B-A3B
featured
fp8
40k
$0.08/$0.29 in/out Mtoken
  • text-generation

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

Qwen/Qwen3-32B
featured
fp8
40k
$0.10/$0.30 in/out Mtoken
  • text-generation

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

Qwen/Qwen3-14B
featured
fp8
40k
$0.06/$0.24 in/out Mtoken
  • text-generation

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

meta-llama/Llama-4-Maverick-17B-128E-Instruct-Turbo
featured
fp8
8k
$0.50 / Mtoken
  • text-generation

The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. Llama 4 Maverick is a 17 billion parameter model with 128 experts.

meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
featured
fp8
1024k
$0.15/$0.60 in/out Mtoken
  • text-generation

The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. Llama 4 Maverick is a 17 billion parameter model with 128 experts.

meta-llama/Llama-4-Scout-17B-16E-Instruct
featured
bf16
320k
$0.08/$0.30 in/out Mtoken
  • text-generation

The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. Llama 4 Scout is a 17 billion parameter model with 16 experts.

deepseek-ai/DeepSeek-R1-0528
featured
fp4
160k
$0.50/$2.15 in/out Mtoken
  • text-generation

The DeepSeek R1 model has undergone a minor version upgrade, with the current version being DeepSeek-R1-0528.

deepseek-ai/DeepSeek-V3-0324
featured
fp4
160k
$0.28/$0.88 in/out Mtoken
  • text-generation

DeepSeek-V3-0324 is a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. It is an improved iteration of DeepSeek-V3.

mistralai/Devstral-Small-2507
featured
fp8
125k
$0.07/$0.28 in/out Mtoken
  • text-generation

Devstral is an agentic LLM for software engineering tasks, making it a great choice for software engineering agents.
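
To compare the text-generation prices listed above, a rough cost estimate can be sketched in Python. This is a minimal sketch under two assumptions that the listing does not spell out: that "Mtoken" means one million tokens, and that the two figures in "in/out" are the input-token and output-token prices in USD. The function name `estimate_cost` and the token counts are hypothetical and chosen only for illustration.

```python
from decimal import Decimal

# Assumption: "Mtoken" in the listing means one million tokens.
TOKENS_PER_MTOKEN = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in: str, price_out: str) -> Decimal:
    """Estimate the USD cost of a workload.

    price_in and price_out are the listed per-Mtoken prices, passed as
    strings so that Decimal keeps them exact (no binary float rounding).
    """
    total = (Decimal(input_tokens) * Decimal(price_in)
             + Decimal(output_tokens) * Decimal(price_out))
    return total / TOKENS_PER_MTOKEN


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens,
# priced at the $0.30/$1.20 rate listed for
# Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo.
cost = estimate_cost(200_000, 50_000, "0.30", "1.20")
print(f"Estimated cost: ${cost}")
```

Entries that list a single price (for example "$0.50 / Mtoken") can be handled by passing the same value for both prices, assuming that price applies to input and output tokens alike.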

Unlock the most affordable AI hosting

Run models at scale with our fully managed GPU infrastructure, delivering enterprise-grade uptime at the industry's best rates.