We use essential cookies to make our site work. With your consent, we may also use non-essential cookies to improve user experience and analyze website traffic…

Browse deepinfra models:

All categories and models you can try out and directly use in deepinfra:
Search

Category/text-generation

Text generation AI models can generate coherent and natural-sounding human language text, making them useful for a variety of applications from language translation to content creation.

There are several types of text generation AI models, including rule-based, statistical, and neural models. Neural models, and in particular transformer-based models like GPT, have achieved state-of-the-art results in text generation tasks. These models use artificial neural networks to analyze large text corpora and learn the patterns and structures of language.

While text generation AI models offer many exciting possibilities, they also present some challenges. For example, it's essential to ensure that the generated text is ethical, unbiased, and accurate, to avoid potential harm or negative consequences.

Qwen/QwQ-32B-Preview cover image
bfloat16
32k
Replaced
  • text-generation

QwQ is an experimental research model developed by the Qwen Team, designed to advance AI reasoning capabilities. This model embodies the spirit of philosophical inquiry, approaching problems with genuine wonder and doubt. QwQ demonstrates impressive analytical abilities, achieving scores of 65.2% on GPQA, 50.0% on AIME, 90.6% on MATH-500, and 50.0% on LiveCodeBench. With its contemplative approach and exceptional performance on complex problems.

Qwen/Qwen2-72B-Instruct cover image
bfloat16
32k
Replaced
  • text-generation

The 72 billion parameter Qwen2 excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning.

Qwen/Qwen2-7B-Instruct cover image
bfloat16
32k
Replaced
  • text-generation

The 7 billion parameter Qwen2 excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning.

Qwen/Qwen2.5-7B-Instruct cover image
bfloat16
32k
$0.05/$0.10 in/out Mtoken
  • text-generation

The 7 billion parameter Qwen2.5 excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning

Qwen/Qwen2.5-Coder-7B cover image
32k
Replaced
  • text-generation

Qwen2.5-Coder-7B is a powerful code-specific large language model with 7.61 billion parameters. It's designed for code generation, reasoning, and fixing tasks. The model covers 92 programming languages and has been trained on 5.5 trillion tokens of data, including source code, text-code grounding, and synthetic data.

Sao10K/L3-70B-Euryale-v2.1 cover image
fp8
8k
Replaced
  • text-generation

Euryale 70B v2.1 is a model focused on creative roleplay from Sao10k

Sao10K/L3-8B-Lunaris-v1 cover image
bfloat16
8k
Replaced
  • text-generation

A generalist / roleplaying model merge based on Llama 3. Sao10K has carefully selected the values based on extensive personal experimentation and has fine-tuned them to create a customized recipe.

Sao10K/L3.1-70B-Euryale-v2.2 cover image
fp8
128k
$0.70/$0.80 in/out Mtoken
  • text-generation

Euryale 3.1 - 70B v2.2 is a model focused on creative roleplay from Sao10k

Sao10K/L3.3-70B-Euryale-v2.3 cover image
fp8
128k
$0.70/$0.80 in/out Mtoken
  • text-generation

L3.3-70B-Euryale-v2.3 is a model focused on creative roleplay from Sao10k

bigcode/starcoder2-15b-instruct-v0.1 cover image
fp16
Replaced
  • text-generation

We introduce StarCoder2-15B-Instruct-v0.1, the very first entirely self-aligned code Large Language Model (LLM) trained with a fully permissive and transparent pipeline. Our open-source pipeline uses StarCoder2-15B to generate thousands of instruction-response pairs, which are then used to fine-tune StarCoder-15B itself without any human annotations or distilled data from huge and proprietary LLMs.

cognitivecomputations/dolphin-2.6-mixtral-8x7b cover image
bfloat16
32k
Replaced
  • text-generation

The Dolphin 2.6 Mixtral 8x7b model is a finetuned version of the Mixtral-8x7b model, trained on a variety of data including coding data, for 3 days on 4 A100 GPUs. It is uncensored and requires trust_remote_code. The model is very obedient and good at coding, but not DPO tuned. The dataset has been filtered for alignment and bias. The model is compliant with user requests and can be used for various purposes such as generating code or engaging in general chat.

cognitivecomputations/dolphin-2.9.1-llama-3-70b cover image
bfloat16
8k
Replaced
  • text-generation

Dolphin 2.9.1, a fine-tuned Llama-3-70b model. The new model, trained on filtered data, is more compliant but uncensored. It demonstrates improvements in instruction, conversation, coding, and function calling abilities.

deepinfra/airoboros-70b cover image
fp16
4k
Replaced
  • text-generation

Latest version of the Airoboros model fine-tunned version of llama-2-70b using the Airoboros dataset. This model is currently running jondurbin/airoboros-l2-70b-2.2.1

google/codegemma-7b-it cover image
fp16
8k
Replaced
  • text-generation

CodeGemma is a collection of lightweight open code models built on top of Gemma. CodeGemma models are text-to-text and text-to-code decoder-only models and are available as a 7 billion pretrained variant that specializes in code completion and code generation tasks, a 7 billion parameter instruction-tuned variant for code chat and instruction following and a 2 billion parameter pretrained variant for fast code completion.