Browse DeepInfra models:

All categories and models you can try out and use directly on DeepInfra:

Category/text-generation

Text generation AI models can generate coherent and natural-sounding human language text, making them useful for a variety of applications from language translation to content creation.

There are several types of text generation AI models, including rule-based, statistical, and neural models. Neural models, and in particular transformer-based models like GPT, have achieved state-of-the-art results in text generation tasks. These models use artificial neural networks to analyze large text corpora and learn the patterns and structures of language.

While text generation AI models offer many exciting possibilities, they also present some challenges. For example, it's essential to ensure that the generated text is ethical, unbiased, and accurate, to avoid potential harm or negative consequences.

meta-llama/Meta-Llama-3.1-70B-Instruct
featured
bfloat16
65.08%
128k
$0.35/$0.40 in/out Mtoken
  • text-generation

Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes.
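As a rough illustration of how the per-Mtoken prices on these cards translate into a bill, the sketch below estimates the cost of one workload. It assumes that "Mtoken" means one million tokens and that the two figures in an "in/out" price apply to input and output tokens respectively. The function name and the token counts are hypothetical, and actual billing rules may differ.

```python
TOKENS_PER_MTOKEN = 1_000_000  # assumption: "Mtoken" = one million tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      in_price: float, out_price: float) -> float:
    """Estimate cost in USD from per-Mtoken input and output prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * in_price + output_tokens * out_price
    return total / TOKENS_PER_MTOKEN


# Hypothetical workload priced at the $0.35/$0.40 in/out rate listed above.
cost = estimate_cost_usd(200_000, 50_000, in_price=0.35, out_price=0.40)
print(f"Estimated cost: ${cost:.4f}")
```

For cards that list a single price, passing the same value for both rates presumably gives the equivalent estimate.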

meta-llama/Meta-Llama-3.1-8B-Instruct
featured
bfloat16
128k
$0.055 / Mtoken
  • text-generation

Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes.

meta-llama/Meta-Llama-3.1-405B-Instruct
featured
fp8
32k
$1.79 / Mtoken
  • text-generation

Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes.

meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
featured
fp8
64.69%
128k
$0.29/$0.40 in/out Mtoken
  • text-generation

Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes.

Qwen/Qwen2.5-Coder-32B-Instruct
featured
bfloat16
32k
$0.18 / Mtoken
  • text-generation

Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen). It brings significant improvements in code generation, code reasoning, and code fixing, and provides a more comprehensive foundation for real-world applications such as code agents. It enhances coding capabilities while maintaining its strengths in mathematics and general competencies.

nvidia/Llama-3.1-Nemotron-70B-Instruct
featured
128k
$0.35/$0.40 in/out Mtoken
  • text-generation

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses to user queries. It scores 85.0 on Arena Hard, 57.6 on AlpacaEval 2 LC, and 8.98 on GPT-4-Turbo MT-Bench, benchmarks known to be predictive of LMSys Chatbot Arena Elo. As of 16 Oct 2024, this model is #1 on all three automatic alignment benchmarks (verified tab for AlpacaEval 2 LC), edging out strong frontier models such as GPT-4o and Claude 3.5 Sonnet.

Qwen/Qwen2.5-72B-Instruct
featured
bfloat16
32k
$0.35/$0.40 in/out Mtoken
  • text-generation

Qwen2.5 is a model pretrained on a large-scale dataset of up to 18 trillion tokens, offering significant improvements in knowledge, coding, mathematics, and instruction following compared to its predecessor, Qwen2. The model also features enhanced capabilities in generating long texts, understanding structured data, and generating structured outputs, and it supports over 29 languages.

meta-llama/Llama-3.2-90B-Vision-Instruct
featured
32k
$0.35/$0.40 in/out Mtoken
  • text-generation

The Llama 90B Vision model is a top-tier, 90-billion-parameter multimodal model designed for the most challenging visual reasoning and language tasks. It offers unparalleled accuracy in image captioning, visual question answering, and advanced image-text comprehension. Pre-trained on vast multimodal datasets and fine-tuned with human feedback, the Llama 90B Vision is engineered to handle the most demanding image-based AI tasks. This model is perfect for industries requiring cutting-edge multimodal AI capabilities, particularly those dealing with complex, real-time visual and textual analysis.

meta-llama/Llama-3.2-11B-Vision-Instruct
featured
128k
$0.055 / Mtoken
  • text-generation

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and visual question answering, bridging the gap between language generation and visual reasoning. Pre-trained on a massive dataset of image-text pairs, it performs well in complex, high-accuracy image analysis. Its ability to integrate visual understanding with language processing makes it an ideal solution for industries requiring comprehensive visual-linguistic AI applications, such as content creation, AI-driven customer service, and research.

microsoft/WizardLM-2-8x22B
featured
bfloat16
64k
$0.50 / Mtoken
  • text-generation

WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models.

Austism/chronos-hermes-13b-v2
fp16
4k
Replaced
  • text-generation

This model offers the imaginative writing style of chronos while retaining coherency and capability. Outputs are long and use exceptional prose. It supports a maximum context length of 4096 tokens. The model follows the Alpaca prompt format.
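For readers unfamiliar with the Alpaca prompt format, here is a minimal sketch of how a prompt could be assembled. It uses the widely circulated no-input Alpaca template; the exact wording this model expects is an assumption, and the helper name is hypothetical.

```python
# Assumption: the standard no-input Alpaca template; check the model card
# for the exact wording before relying on it.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a single instruction in the Alpaca-style template."""
    cleaned = instruction.strip()
    if not cleaned:
        raise ValueError("instruction must not be empty")
    return ALPACA_TEMPLATE.format(instruction=cleaned)


prompt = build_alpaca_prompt("Write the opening paragraph of a sea adventure.")
print(prompt)
```

The prompt ends right after the response header so that the model's generated text continues from there.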

Gryphe/MythoMax-L2-13b-turbo
fp8
4k
Replaced
  • text-generation

A faster version of Gryphe/MythoMax-L2-13b, running on multiple H100 cards in fp8 precision at up to 160 tokens per second (tps).

HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
fp8
64k
Replaced
  • text-generation

Zephyr 141B-A35B is an instruction-tuned (assistant) version of Mixtral-8x22B. It was fine-tuned on a mix of publicly available, synthetic datasets. It achieves strong performance on chat benchmarks.

KoboldAI/LLaMA2-13B-Tiefighter
fp16
4k
$0.10 / Mtoken
  • text-generation

LLaMA2-13B-Tiefighter is a highly creative and versatile language model, fine-tuned for storytelling, adventure, and conversational dialogue. It combines the strengths of multiple models and datasets, including retro-rodeo and choose-your-own-adventure, to generate engaging and imaginative content. With its ability to improvise and adapt to different styles and formats, Tiefighter is perfect for writers, creators, and anyone looking to spark their imagination.

NousResearch/Hermes-3-Llama-3.1-405B
fp8
128k
$1.79/$2.49 in/out Mtoken
  • text-generation

Hermes 3 is a cutting-edge language model that offers advanced capabilities in roleplaying, reasoning, and conversation. It's a fine-tuned version of the Llama-3.1 405B foundation model, designed to align with user needs and provide powerful control. Key features include reliable function calling, structured output, generalist assistant capabilities, and improved code generation. Hermes 3 is competitive with Llama-3.1 Instruct models, with its own strengths and weaknesses.

Phind/Phind-CodeLlama-34B-v2
fp16
4k
Replaced
  • text-generation

Phind-CodeLlama-34B-v2 is an open-source language model that has been fine-tuned on 1.5B tokens of high-quality programming-related data and achieved a pass@1 rate of 73.8% on HumanEval. It is multilingual and proficient in Python, C/C++, TypeScript, Java, and more. It has been trained on a proprietary dataset of instruction-answer pairs instead of code completion examples. The model is instruction-tuned on the Alpaca/Vicuna format to be steerable and easy to use, and it generates one completion for each prompt.
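The description does not say how the pass@1 figure was computed. As background, the sketch below implements the unbiased pass@k estimator commonly used with HumanEval, where n completions are sampled per problem and c of them pass the unit tests. The function name and sample counts are hypothetical, and this is not claimed to be the procedure behind the 73.8% result.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: completions sampled, c: completions that passed, k: evaluation budget.
    Computes 1 - C(n - c, k) / C(n, k).
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical per-problem results: (samples drawn, samples that passed).
results = [(10, 3), (10, 0), (10, 10)]
scores = [pass_at_k(n, c, k=1) for n, c in results]
print(sum(scores) / len(scores))  # benchmark score = mean over problems
```

With k = 1 the estimator reduces to c / n, the fraction of sampled completions that pass.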