
Browse DeepInfra models:

All categories and models you can try out and use directly on DeepInfra:

Category/text-generation

Text generation AI models can generate coherent and natural-sounding human language text, making them useful for a variety of applications from language translation to content creation.

There are several types of text generation AI models, including rule-based, statistical, and neural models. Neural models, and in particular transformer-based models like GPT, have achieved state-of-the-art results in text generation tasks. These models use artificial neural networks to analyze large text corpora and learn the patterns and structures of language.

While text generation AI models offer many exciting possibilities, they also present some challenges. For example, it's essential to ensure that the generated text is ethical, unbiased, and accurate, to avoid potential harm or negative consequences.

zai-org/GLM-4.5
featured
fp8
128k
$0.60/$2.20 in/out Mtoken
  • text-generation

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

zai-org/GLM-4.5-Air
featured
fp8
128k
$0.20/$1.10 in/out Mtoken
  • text-generation

The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.

Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo
featured
fp4
256k
$0.30/$1.20 in/out Mtoken
  • text-generation

Qwen3-Coder-480B-A35B-Instruct is Qwen3's most agentic code model, with strong performance on agentic coding, agentic browser use, and other foundational coding tasks, achieving results comparable to Claude Sonnet.

Qwen/Qwen3-Coder-480B-A35B-Instruct
featured
fp8
256k
$0.40/$1.60 in/out Mtoken
  • text-generation

Qwen3-Coder-480B-A35B-Instruct is Qwen3's most agentic code model, with strong performance on agentic coding, agentic browser use, and other foundational coding tasks, achieving results comparable to Claude Sonnet.

Qwen/Qwen3-235B-A22B-Thinking-2507
featured
fp8
256k
$0.13/$0.60 in/out Mtoken
  • text-generation

Qwen3-235B-A22B-Thinking-2507 is a new Qwen3 model that scales up the thinking capability of Qwen3-235B-A22B, improving both the quality and depth of reasoning.

moonshotai/Kimi-K2-Instruct
featured
mixed: fp8/fp4
128k
$0.50/$2.00 in/out Mtoken
  • text-generation

Kimi K2 is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. Kimi K2 excels across a broad range of benchmarks, particularly in coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) tasks.

deepseek-ai/DeepSeek-R1-0528-Turbo
featured
fp4
32k
$1.00/$3.00 in/out Mtoken
  • text-generation

The DeepSeek R1 0528 Turbo model is a state-of-the-art reasoning model that can generate very quick responses.

Qwen/Qwen3-235B-A22B-Instruct-2507
featured
fp8
256k
$0.13/$0.60 in/out Mtoken
  • text-generation

Qwen3-235B-A22B-Instruct-2507 is the updated version of the Qwen3-235B-A22B non-thinking mode, featuring significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.

Qwen/Qwen3-30B-A3B
featured
fp8
40k
$0.08/$0.29 in/out Mtoken
  • text-generation

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

Qwen/Qwen3-32B
featured
fp8
40k
$0.10/$0.30 in/out Mtoken
  • text-generation

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

Qwen/Qwen3-14B
featured
fp8
40k
$0.06/$0.24 in/out Mtoken
  • text-generation

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

meta-llama/Llama-4-Maverick-17B-128E-Instruct-Turbo
featured
fp8
8k
$0.50 / Mtoken
  • text-generation

The Llama 4 collection consists of natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. Llama 4 Maverick is a model with 17 billion active parameters and 128 experts.

meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
featured
fp8
1024k
$0.15/$0.60 in/out Mtoken
  • text-generation

The Llama 4 collection consists of natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. Llama 4 Maverick is a model with 17 billion active parameters and 128 experts.

meta-llama/Llama-4-Scout-17B-16E-Instruct
featured
bfloat16
320k
$0.08/$0.30 in/out Mtoken
  • text-generation

The Llama 4 collection consists of natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. Llama 4 Scout is a model with 17 billion active parameters and 16 experts.

deepseek-ai/DeepSeek-R1-0528
featured
fp4
160k
$0.50/$2.15 in/out Mtoken
  • text-generation

The DeepSeek R1 model has undergone a minor version upgrade, with the current version being DeepSeek-R1-0528.

deepseek-ai/DeepSeek-V3-0324
featured
fp4
160k
$0.28/$0.88 in/out Mtoken
  • text-generation

DeepSeek-V3-0324 is a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. It is an improved iteration of DeepSeek-V3.
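
Several of the descriptions above quote both total and active parameter counts for Mixture-of-Experts models. As a rough illustration, the Python sketch below computes what share of the parameters is active per token, using only figures quoted in this listing (Kimi K2's 1 trillion is written as 1000 billion). The names `MOE_MODELS` and `active_fraction` are hypothetical and introduced here for illustration only.

```python
# Parameter counts in billions, copied from the model descriptions in this listing:
# (total parameters, active parameters per token).
MOE_MODELS = {
    "zai-org/GLM-4.5": (355, 32),
    "zai-org/GLM-4.5-Air": (106, 12),
    "moonshotai/Kimi-K2-Instruct": (1000, 32),
    "deepseek-ai/DeepSeek-V3-0324": (671, 37),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that is active for each token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


if __name__ == "__main__":
    for name, (total_b, active_b) in MOE_MODELS.items():
        share = active_fraction(total_b, active_b)
        print(f"{name}: {active_b}B of {total_b}B active ({share:.1%})")
```

The active share is only a loose proxy for per-token compute; it says nothing about memory requirements, which depend on the total parameter count and the quantization used.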

mistralai/Devstral-Small-2507
featured
fp8
125k
$0.07/$0.28 in/out Mtoken
  • text-generation

Devstral is an agentic LLM for software engineering tasks, making it a great choice for software engineering agents.
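
Prices in this listing are quoted per Mtoken (one million tokens), with separate input and output rates. As a rough illustration of the arithmetic, the Python sketch below estimates the cost of a single request. The `Pricing` class, the `PRICES` table, and the `estimate_cost_usd` function are hypothetical names introduced here, not part of any DeepInfra API; the rates are copied from the listing and may change.

```python
from dataclasses import dataclass

TOKENS_PER_MTOKEN = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """USD per million tokens, as shown in the 'in/out Mtoken' lines."""
    input_usd_per_mtoken: float
    output_usd_per_mtoken: float


# A few rates copied from the listing (assumption: they are still current).
PRICES = {
    "zai-org/GLM-4.5": Pricing(0.60, 2.20),
    "zai-org/GLM-4.5-Air": Pricing(0.20, 1.10),
    "Qwen/Qwen3-14B": Pricing(0.06, 0.24),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    price = PRICES[model]
    total = (
        input_tokens * price.input_usd_per_mtoken
        + output_tokens * price.output_usd_per_mtoken
    )
    return total / TOKENS_PER_MTOKEN


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 generated tokens on GLM-4.5.
    cost = estimate_cost_usd("zai-org/GLM-4.5", 10_000, 2_000)
    print(f"${cost:.4f}")  # prints $0.0104
```

Because output tokens are priced higher than input tokens for every model with split rates here, long generations tend to dominate the bill even when prompts are large.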
