Browse DeepInfra models:

All categories and models you can try out and use directly on DeepInfra:

featured

text-generation

text-to-image

automatic-speech-recognition

embeddings

token-classification

fill-mask

text-classification

question-answering

image-classification

object-detection

custom

zero-shot-image-classification

Category/text-generation

Text generation AI models can generate coherent and natural-sounding human language text, making them useful for a variety of applications from language translation to content creation.

There are several types of text generation AI models, including rule-based, statistical, and neural models. Neural models, and in particular transformer-based models like GPT, have achieved state-of-the-art results in text generation tasks. These models use artificial neural networks to analyze large text corpora and learn the patterns and structures of language.

While text generation AI models offer many exciting possibilities, they also present some challenges. For example, it's essential to ensure that the generated text is ethical, unbiased, and accurate, to avoid potential harm or negative consequences.

fp16

16k

Replaced

bigcode/starcoder2-15b

text-generation

The StarCoder2-15B model is a 15B-parameter model trained on 600+ programming languages. It specializes in code completion.

fp16

$0.15 / Mtoken*

bigcode/starcoder2-15b-instruct-v0.1

text-generation

We introduce StarCoder2-15B-Instruct-v0.1, the very first entirely self-aligned code Large Language Model (LLM) trained with a fully permissive and transparent pipeline. Our open-source pipeline uses StarCoder2-15B to generate thousands of instruction-response pairs, which are then used to fine-tune StarCoder2-15B itself without any human annotations or distilled data from huge and proprietary LLMs.
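Prices on this page are quoted per Mtoken, for example the `$0.15 / Mtoken` listed for this model. The following is a minimal sketch of turning such a price into a cost estimate. It assumes that Mtoken means one million tokens and that input and output tokens are billed at the same rate; the page does not state either. `estimate_cost_usd` is a hypothetical helper written for illustration, not part of any DeepInfra SDK.

```python
from decimal import Decimal

# Assumption: "Mtoken" on the pricing labels means one million tokens.
TOKENS_PER_MTOKEN = 1_000_000


def estimate_cost_usd(total_tokens: int, price_per_mtoken: str) -> Decimal:
    """Estimate the pre-tax charge in USD for a number of tokens.

    price_per_mtoken is passed as a string (e.g. "0.15") so that Decimal
    can represent it exactly, avoiding binary floating-point rounding.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return Decimal(price_per_mtoken) * total_tokens / TOKENS_PER_MTOKEN


if __name__ == "__main__":
    # 250,000 tokens at the $0.15 / Mtoken rate shown on this card.
    print(estimate_cost_usd(250_000, "0.15"))
```

The estimate excludes state and local taxes, which the page footnote says may apply.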

fp16

Replaced

codellama/CodeLlama-34b-Instruct-hf

text-generation

Code Llama is a state-of-the-art LLM capable of generating code, and natural language about code, from both code and natural language prompts. This particular instance is the 34B instruct variant.

fp16

Replaced

codellama/CodeLlama-70b-Instruct-hf

text-generation

CodeLlama-70b is the largest and latest code generation model in the Code Llama collection.

bfloat16

32k

Replaced

databricks/dbrx-instruct

text-generation

DBRX is an open source LLM created by Databricks. It uses mixture-of-experts (MoE) architecture with 132B total parameters of which 36B parameters are active on any input. It outperforms existing open source LLMs like Llama 2 70B and Mixtral-8x7B on standard industry benchmarks for language understanding, programming, math, and logic.

fp16

Deprecated

deepinfra/airoboros-70b

text-generation

The latest version of the Airoboros model, a fine-tuned version of llama-2-70b trained on the Airoboros dataset. This model is currently running jondurbin/airoboros-l2-70b-2.2.1.

fp16

$0.07 / Mtoken*

google/codegemma-7b-it

text-generation

CodeGemma is a collection of lightweight open code models built on top of Gemma. CodeGemma models are text-to-text and text-to-code decoder-only models, available in three variants: a 7 billion parameter pretrained variant that specializes in code completion and code generation, a 7 billion parameter instruction-tuned variant for code chat and instruction following, and a 2 billion parameter pretrained variant for fast code completion.

fp16

Deprecated

meta-llama/Llama-2-13b-chat-hf

text-generation

Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 13B fine-tuned model, optimized for dialogue use cases and converted to the Hugging Face Transformers format.

fp16

Deprecated

meta-llama/Llama-2-70b-chat-hf

text-generation

Llama 2 is a collection of LLMs trained by Meta. This is the 70B chat-optimized version. This endpoint has per-token pricing.

fp16

Deprecated

meta-llama/Llama-2-7b-chat-hf

text-generation

bfloat16

32k

$0.07 / Mtoken*

mistralai/Mistral-7B-Instruct-v0.1

text-generation

The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruction fine-tuned version of the Mistral-7B-v0.1 generative text model, trained on a variety of publicly available conversation datasets.

bfloat16

32k

$0.07 / Mtoken*

mistralai/Mistral-7B-Instruct-v0.3

text-generation

fp16

64k

Replaced

mistralai/Mixtral-8x22B-v0.1

text-generation

Mixtral-8x22B is the latest and largest mixture-of-experts (MoE) large language model (LLM) from Mistral AI. It is a state-of-the-art machine learning model that uses a mixture of 8 expert 22B models. During inference, 2 experts are selected. This architecture allows large models to be fast and cheap at inference. This model is not instruction tuned.
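The expert selection described above can be illustrated with a toy sketch. It assumes one common routing scheme, in which a router assigns a score (logit) to each of the 8 experts, the 2 highest-scoring experts are kept, and their outputs are combined with softmax-normalized weights. The names `top_k_route` and `moe_forward`, the scalar "experts", and the example logits are all hypothetical. A real model learns its router and applies it per token and per layer, which this sketch does not attempt to reproduce.

```python
import math


def top_k_route(router_logits, k=2):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    ranked = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    peak = max(router_logits[i] for i in ranked)  # subtract for stability
    exps = [math.exp(router_logits[i] - peak) for i in ranked]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(ranked, exps)]


def moe_forward(x, experts, router_logits, k=2):
    """Evaluate only the selected experts and mix their outputs."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy stand-ins for 8 experts: expert s simply multiplies its input by s.
experts = [lambda x, s=s: s * x for s in range(8)]
# Hypothetical router scores for a single input.
logits = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.3, 1.0]

if __name__ == "__main__":
    print(top_k_route(logits))
    print(moe_forward(3.0, experts, logits))
```

Because only 2 of the 8 experts are evaluated for a given input, the compute per input is a fraction of what the total parameter count would suggest, which is the speed and cost benefit the description refers to.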

bfloat16

$0.08 / Mtoken*

openchat/openchat-3.6-8b

text-generation

*State and local taxes may apply.